* [PATCH v5 15/16] IB: Add PVRDMA driver
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford, linux-rdma, pv-drivers
Cc: Adit Ranadive, netdev, linux-pci, jhansen, asarwade, georgezhang,
bryantan
In-Reply-To: <cover.1474759181.git.aditr@vmware.com>
This patch updates the InfiniBand subsystem to build the PVRDMA driver.
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
drivers/infiniband/Kconfig | 1 +
drivers/infiniband/hw/Makefile | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 19a418a..dff4bcf 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -88,5 +88,6 @@ source "drivers/infiniband/sw/rdmavt/Kconfig"
source "drivers/infiniband/sw/rxe/Kconfig"
source "drivers/infiniband/hw/hfi1/Kconfig"
+source "drivers/infiniband/hw/pvrdma/Kconfig"
endif # INFINIBAND
diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile
index 21fe401..c8a7a36 100644
--- a/drivers/infiniband/hw/Makefile
+++ b/drivers/infiniband/hw/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_INFINIBAND_OCRDMA) += ocrdma/
obj-$(CONFIG_INFINIBAND_USNIC) += usnic/
obj-$(CONFIG_INFINIBAND_HFI1) += hfi1/
obj-$(CONFIG_INFINIBAND_HNS) += hns/
+obj-$(CONFIG_INFINIBAND_PVRDMA) += pvrdma/
--
2.7.4
^ permalink raw reply related
* [PATCH v5 11/16] IB/pvrdma: Add support for memory regions
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford, linux-rdma, pv-drivers
Cc: Adit Ranadive, netdev, linux-pci, jhansen, asarwade, georgezhang,
bryantan
In-Reply-To: <cover.1474759181.git.aditr@vmware.com>
This patch adds support for creating and destroying memory regions. The
PVRDMA device supports User MRs, DMA MRs (no Remote Read/Write support),
Fast Register MRs.
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
Changes v4->v5:
- Check the access flags correctly for DMA MR.
- Update to pvrdma_cmd_post for creating/destroying MRs.
Changes v3->v4:
- Changed access flag check for DMA MR to using bit operation.
- Removed some local variables.
Changes v2->v3:
- Removed boolean in pvrdma_cmd_post.
---
drivers/infiniband/hw/pvrdma/pvrdma_mr.c | 334 +++++++++++++++++++++++++++++++
1 file changed, 334 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_mr.c
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_mr.c b/drivers/infiniband/hw/pvrdma/pvrdma_mr.c
new file mode 100644
index 0000000..8519f32
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_mr.c
@@ -0,0 +1,334 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/list.h>
+#include <linux/slab.h>
+
+#include "pvrdma.h"
+
+/**
+ * pvrdma_get_dma_mr - get a DMA memory region
+ * @pd: protection domain
+ * @acc: access flags
+ *
+ * @return: ib_mr pointer on success, otherwise returns an errno.
+ */
+struct ib_mr *pvrdma_get_dma_mr(struct ib_pd *pd, int acc)
+{
+ struct pvrdma_dev *dev = to_vdev(pd->device);
+ struct pvrdma_user_mr *mr;
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_create_mr *cmd = &req.create_mr;
+ struct pvrdma_cmd_create_mr_resp *resp = &rsp.create_mr_resp;
+ int ret;
+
+ /* Support only LOCAL_WRITE flag for DMA MRs */
+ if (acc & ~IB_ACCESS_LOCAL_WRITE) {
+ dev_warn(&dev->pdev->dev,
+ "unsupported dma mr access flags %#x\n", acc);
+ return ERR_PTR(-EOPNOTSUPP);
+ }
+
+ mr = kzalloc(sizeof(*mr), GFP_KERNEL);
+ if (!mr)
+ return ERR_PTR(-ENOMEM);
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_CREATE_MR;
+ cmd->pd_handle = to_vpd(pd)->pd_handle;
+ cmd->access_flags = acc;
+ cmd->flags = PVRDMA_MR_FLAG_DMA;
+
+ ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_CREATE_MR_RESP);
+ if (ret < 0) {
+ dev_warn(&dev->pdev->dev,
+ "could not get DMA mem region, error: %d\n", ret);
+ kfree(mr);
+ return ERR_PTR(ret);
+ }
+
+ mr->mmr.mr_handle = resp->mr_handle;
+ mr->ibmr.lkey = resp->lkey;
+ mr->ibmr.rkey = resp->rkey;
+
+ return &mr->ibmr;
+}
+
+/**
+ * pvrdma_reg_user_mr - register a userspace memory region
+ * @pd: protection domain
+ * @start: starting address
+ * @length: length of region
+ * @virt_addr: I/O virtual address
+ * @access_flags: access flags for memory region
+ * @udata: user data
+ *
+ * @return: ib_mr pointer on success, otherwise returns an errno.
+ */
+struct ib_mr *pvrdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
+ u64 virt_addr, int access_flags,
+ struct ib_udata *udata)
+{
+ struct pvrdma_dev *dev = to_vdev(pd->device);
+ struct pvrdma_user_mr *mr = NULL;
+ struct ib_umem *umem;
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_create_mr *cmd = &req.create_mr;
+ struct pvrdma_cmd_create_mr_resp *resp = &rsp.create_mr_resp;
+ int nchunks;
+ int ret;
+ int entry;
+ struct scatterlist *sg;
+
+ if (length == 0 || length > dev->dsr->caps.max_mr_size) {
+ dev_warn(&dev->pdev->dev, "invalid mem region length\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ umem = ib_umem_get(pd->uobject->context, start,
+ length, access_flags, 0);
+ if (IS_ERR(umem)) {
+ dev_warn(&dev->pdev->dev,
+ "could not get umem for mem region\n");
+ return ERR_CAST(umem);
+ }
+
+ nchunks = 0;
+ for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry)
+ nchunks += sg_dma_len(sg) >> PAGE_SHIFT;
+
+ if (nchunks < 0 || nchunks > PVRDMA_PAGE_DIR_MAX_PAGES) {
+ dev_warn(&dev->pdev->dev, "overflow %d pages in mem region\n",
+ nchunks);
+ ret = -EINVAL;
+ goto err_umem;
+ }
+
+ mr = kzalloc(sizeof(*mr), GFP_KERNEL);
+ if (!mr) {
+ ret = -ENOMEM;
+ goto err_umem;
+ }
+
+ mr->mmr.iova = virt_addr;
+ mr->mmr.size = length;
+ mr->umem = umem;
+
+ ret = pvrdma_page_dir_init(dev, &mr->pdir, nchunks, false);
+ if (ret) {
+ dev_warn(&dev->pdev->dev,
+ "could not allocate page directory\n");
+ goto err_umem;
+ }
+
+ ret = pvrdma_page_dir_insert_umem(&mr->pdir, mr->umem, 0);
+ if (ret)
+ goto err_pdir;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_CREATE_MR;
+ cmd->start = start;
+ cmd->length = length;
+ cmd->pd_handle = to_vpd(pd)->pd_handle;
+ cmd->access_flags = access_flags;
+ cmd->nchunks = nchunks;
+ cmd->pdir_dma = mr->pdir.dir_dma;
+
+ ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_CREATE_MR_RESP);
+ if (ret < 0) {
+ dev_warn(&dev->pdev->dev,
+ "could not register mem region, error: %d\n", ret);
+ goto err_pdir;
+ }
+
+ mr->mmr.mr_handle = resp->mr_handle;
+ mr->ibmr.lkey = resp->lkey;
+ mr->ibmr.rkey = resp->rkey;
+
+ return &mr->ibmr;
+
+err_pdir:
+ pvrdma_page_dir_cleanup(dev, &mr->pdir);
+err_umem:
+ ib_umem_release(umem);
+ kfree(mr);
+
+ return ERR_PTR(ret);
+}
+
+/**
+ * pvrdma_alloc_mr - allocate a memory region
+ * @pd: protection domain
+ * @mr_type: type of memory region
+ * @max_num_sg: maximum number of pages
+ *
+ * @return: ib_mr pointer on success, otherwise returns an errno.
+ */
+struct ib_mr *pvrdma_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
+ u32 max_num_sg)
+{
+ struct pvrdma_dev *dev = to_vdev(pd->device);
+ struct pvrdma_user_mr *mr;
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_create_mr *cmd = &req.create_mr;
+ struct pvrdma_cmd_create_mr_resp *resp = &rsp.create_mr_resp;
+ int size = max_num_sg * sizeof(u64);
+ int ret;
+
+ if (mr_type != IB_MR_TYPE_MEM_REG ||
+ max_num_sg > PVRDMA_MAX_FAST_REG_PAGES)
+ return ERR_PTR(-EINVAL);
+
+ mr = kzalloc(sizeof(*mr), GFP_KERNEL);
+ if (!mr)
+ return ERR_PTR(-ENOMEM);
+
+ mr->pages = kzalloc(size, GFP_KERNEL);
+ if (!mr->pages) {
+ ret = -ENOMEM;
+ goto freemr;
+ }
+
+ ret = pvrdma_page_dir_init(dev, &mr->pdir, max_num_sg, false);
+ if (ret) {
+ dev_warn(&dev->pdev->dev,
+ "failed to allocate page dir for mr\n");
+ ret = -ENOMEM;
+ goto freepages;
+ }
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_CREATE_MR;
+ cmd->pd_handle = to_vpd(pd)->pd_handle;
+ cmd->access_flags = 0;
+ cmd->flags = PVRDMA_MR_FLAG_FRMR;
+ cmd->nchunks = max_num_sg;
+
+ ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_CREATE_MR_RESP);
+ if (ret < 0) {
+ dev_warn(&dev->pdev->dev,
+ "could not create FR mem region, error: %d\n", ret);
+ goto freepdir;
+ }
+
+ mr->max_pages = max_num_sg;
+ mr->mmr.mr_handle = resp->mr_handle;
+ mr->ibmr.lkey = resp->lkey;
+ mr->ibmr.rkey = resp->rkey;
+ mr->page_shift = PAGE_SHIFT;
+ mr->umem = NULL;
+
+ return &mr->ibmr;
+
+freepdir:
+ pvrdma_page_dir_cleanup(dev, &mr->pdir);
+freepages:
+ kfree(mr->pages);
+freemr:
+ kfree(mr);
+ return ERR_PTR(ret);
+}
+
+/**
+ * pvrdma_dereg_mr - deregister a memory region
+ * @ibmr: memory region
+ *
+ * @return: 0 on success.
+ */
+int pvrdma_dereg_mr(struct ib_mr *ibmr)
+{
+ struct pvrdma_user_mr *mr = to_vmr(ibmr);
+ struct pvrdma_dev *dev = to_vdev(ibmr->device);
+ union pvrdma_cmd_req req;
+ struct pvrdma_cmd_destroy_mr *cmd = &req.destroy_mr;
+ int ret;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_DESTROY_MR;
+ cmd->mr_handle = mr->mmr.mr_handle;
+ ret = pvrdma_cmd_post(dev, &req, NULL, 0);
+ if (ret < 0)
+ dev_warn(&dev->pdev->dev,
+ "could not deregister mem region, error: %d\n", ret);
+
+ pvrdma_page_dir_cleanup(dev, &mr->pdir);
+ if (mr->umem)
+ ib_umem_release(mr->umem);
+
+ kfree(mr->pages);
+ kfree(mr);
+
+ return 0;
+}
+
+static int pvrdma_set_page(struct ib_mr *ibmr, u64 addr)
+{
+ struct pvrdma_user_mr *mr = to_vmr(ibmr);
+
+ if (mr->npages == mr->max_pages)
+ return -ENOMEM;
+
+ mr->pages[mr->npages++] = addr;
+ return 0;
+}
+
+int pvrdma_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents,
+ unsigned int *sg_offset)
+{
+ struct pvrdma_user_mr *mr = to_vmr(ibmr);
+ struct pvrdma_dev *dev = to_vdev(ibmr->device);
+ int ret;
+
+ mr->npages = 0;
+
+ ret = ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, pvrdma_set_page);
+ if (ret < 0)
+ dev_warn(&dev->pdev->dev, "could not map sg to pages\n");
+
+ return ret;
+}
--
2.7.4
^ permalink raw reply related
* [PATCH v5 10/16] IB/pvrdma: Add UAR support
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-pci-u79uwXL29TY76Z2rM5mHXA, jhansen-pghWNbHTmq7QT0dZR+AlfA,
asarwade-pghWNbHTmq7QT0dZR+AlfA,
georgezhang-pghWNbHTmq7QT0dZR+AlfA,
bryantan-pghWNbHTmq7QT0dZR+AlfA
In-Reply-To: <cover.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
This patch adds the UAR support for the paravirtual RDMA device. The UAR
pages are MMIO pages from the virtual PCI space. We define offsets within
this page to provide the fast data-path operations.
Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Jorgen Hansen <jhansen-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: George Zhang <georgezhang-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: Aditya Sarwade <asarwade-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: Bryan Tan <bryantan-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
Changes v3->v4:
- Removed an unnecessary comment.
Changes v2->v3:
- Used is_power_of_2 function.
- Simplify pvrdma_uar_alloc function.
---
drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c | 127 +++++++++++++++++++++++++
1 file changed, 127 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c b/drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c
new file mode 100644
index 0000000..bf51357
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c
@@ -0,0 +1,127 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/bitmap.h>
+#include <linux/errno.h>
+#include <linux/slab.h>
+
+#include "pvrdma.h"
+
+int pvrdma_uar_table_init(struct pvrdma_dev *dev)
+{
+ u32 num = dev->dsr->caps.max_uar;
+ u32 mask = num - 1;
+ struct pvrdma_id_table *tbl = &dev->uar_table.tbl;
+
+ if (!is_power_of_2(num))
+ return -EINVAL;
+
+ tbl->last = 0;
+ tbl->top = 0;
+ tbl->max = num;
+ tbl->mask = mask;
+ spin_lock_init(&tbl->lock);
+ tbl->table = kcalloc(BITS_TO_LONGS(num), sizeof(long), GFP_KERNEL);
+ if (!tbl->table)
+ return -ENOMEM;
+
+ /* 0th UAR is taken by the device. */
+ set_bit(0, tbl->table);
+
+ return 0;
+}
+
+void pvrdma_uar_table_cleanup(struct pvrdma_dev *dev)
+{
+ struct pvrdma_id_table *tbl = &dev->uar_table.tbl;
+
+ kfree(tbl->table);
+}
+
+int pvrdma_uar_alloc(struct pvrdma_dev *dev, struct pvrdma_uar_map *uar)
+{
+ struct pvrdma_id_table *tbl;
+ unsigned long flags;
+ u32 obj;
+
+ tbl = &dev->uar_table.tbl;
+
+ spin_lock_irqsave(&tbl->lock, flags);
+ obj = find_next_zero_bit(tbl->table, tbl->max, tbl->last);
+ if (obj >= tbl->max) {
+ tbl->top = (tbl->top + tbl->max) & tbl->mask;
+ obj = find_first_zero_bit(tbl->table, tbl->max);
+ }
+
+ if (obj >= tbl->max) {
+ spin_unlock_irqrestore(&tbl->lock, flags);
+ return -ENOMEM;
+ }
+
+ set_bit(obj, tbl->table);
+ obj |= tbl->top;
+
+ spin_unlock_irqrestore(&tbl->lock, flags);
+
+ uar->index = obj;
+ uar->pfn = (pci_resource_start(dev->pdev, PVRDMA_PCI_RESOURCE_UAR) >>
+ PAGE_SHIFT) + uar->index;
+
+ return 0;
+}
+
+void pvrdma_uar_free(struct pvrdma_dev *dev, struct pvrdma_uar_map *uar)
+{
+ struct pvrdma_id_table *tbl = &dev->uar_table.tbl;
+ unsigned long flags;
+ u32 obj;
+
+ obj = uar->index & (tbl->max - 1);
+ spin_lock_irqsave(&tbl->lock, flags);
+ clear_bit(obj, tbl->table);
+ tbl->last = min(tbl->last, obj);
+ tbl->top = (tbl->top + tbl->max) & tbl->mask;
+ spin_unlock_irqrestore(&tbl->lock, flags);
+}
--
2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 09/16] IB/pvrdma: Add support for Completion Queues
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-pci-u79uwXL29TY76Z2rM5mHXA, jhansen-pghWNbHTmq7QT0dZR+AlfA,
asarwade-pghWNbHTmq7QT0dZR+AlfA,
georgezhang-pghWNbHTmq7QT0dZR+AlfA,
bryantan-pghWNbHTmq7QT0dZR+AlfA
In-Reply-To: <cover.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
This patch adds the support for creating and destroying completion queues
on the paravirtual RDMA device.
Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Jorgen Hansen <jhansen-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: George Zhang <georgezhang-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: Aditya Sarwade <asarwade-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: Bryan Tan <bryantan-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
Changes v4->v5:
- Updated include for headers in UAPI folder.
- Changed from EINVAL to ENOMEM if atomic add fails.
- Added error code if destroy cq command failed.
- Update to pvrdma_cmd_post for creating/destroying CQ.
Changes v3->v4:
- Added a pvrdma_destroy_cq in the error path.
- Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we need a lock to
be held while calling this.
- Updated to use wrapper for UAR write for CQ.
- Ensure that poll_cq does not return error values.
Changes v2->v3:
- Removed boolean from pvrdma_cmd_post.
- Return -EAGAIN if qp retrieved from CQE is bogus.
- Check for invalid index of ring.
---
drivers/infiniband/hw/pvrdma/pvrdma_cq.c | 426 +++++++++++++++++++++++++++++++
1 file changed, 426 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_cq.c
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/pvrdma/pvrdma_cq.c
new file mode 100644
index 0000000..f26d4bc
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_cq.c
@@ -0,0 +1,426 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <asm/page.h>
+#include <linux/io.h>
+#include <linux/wait.h>
+#include <rdma/ib_addr.h>
+#include <rdma/ib_smi.h>
+#include <rdma/ib_user_verbs.h>
+#include <rdma/pvrdma-abi.h>
+
+#include "pvrdma.h"
+
+/**
+ * pvrdma_req_notify_cq - request notification for a completion queue
+ * @ibcq: the completion queue
+ * @notify_flags: notification flags
+ *
+ * @return: 0 for success.
+ */
+int pvrdma_req_notify_cq(struct ib_cq *ibcq,
+ enum ib_cq_notify_flags notify_flags)
+{
+ struct pvrdma_dev *dev = to_vdev(ibcq->device);
+ struct pvrdma_cq *cq = to_vcq(ibcq);
+ u32 val = cq->cq_handle;
+
+ val |= (notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
+ PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM;
+
+ pvrdma_write_uar_cq(dev, val);
+
+ return 0;
+}
+
+/**
+ * pvrdma_create_cq - create completion queue
+ * @ibdev: the device
+ * @attr: completion queue attributes
+ * @context: user context
+ * @udata: user data
+ *
+ * @return: ib_cq completion queue pointer on success,
+ * otherwise returns negative errno.
+ */
+struct ib_cq *pvrdma_create_cq(struct ib_device *ibdev,
+ const struct ib_cq_init_attr *attr,
+ struct ib_ucontext *context,
+ struct ib_udata *udata)
+{
+ int entries = attr->cqe;
+ struct pvrdma_dev *dev = to_vdev(ibdev);
+ struct pvrdma_cq *cq;
+ int ret;
+ int npages;
+ unsigned long flags;
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_create_cq *cmd = &req.create_cq;
+ struct pvrdma_cmd_create_cq_resp *resp = &rsp.create_cq_resp;
+ struct pvrdma_create_cq ucmd;
+
+ BUILD_BUG_ON(sizeof(struct pvrdma_cqe) != 64);
+
+ entries = roundup_pow_of_two(entries);
+ if (entries < 1 || entries > dev->dsr->caps.max_cqe)
+ return ERR_PTR(-EINVAL);
+
+ if (!atomic_add_unless(&dev->num_cqs, 1, dev->dsr->caps.max_cq))
+ return ERR_PTR(-ENOMEM);
+
+ cq = kzalloc(sizeof(*cq), GFP_KERNEL);
+ if (!cq) {
+ atomic_dec(&dev->num_cqs);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ cq->ibcq.cqe = entries;
+
+ if (context) {
+ if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
+ ret = -EFAULT;
+ goto err_cq;
+ }
+
+ cq->umem = ib_umem_get(context, ucmd.buf_addr, ucmd.buf_size,
+ IB_ACCESS_LOCAL_WRITE, 1);
+ if (IS_ERR(cq->umem)) {
+ ret = PTR_ERR(cq->umem);
+ goto err_cq;
+ }
+
+ npages = ib_umem_page_count(cq->umem);
+ } else {
+ cq->is_kernel = true;
+
+ /* One extra page for shared ring state */
+ npages = 1 + (entries * sizeof(struct pvrdma_cqe) +
+ PAGE_SIZE - 1) / PAGE_SIZE;
+
+ /* Skip header page. */
+ cq->offset = PAGE_SIZE;
+ }
+
+ if (npages < 0 || npages > PVRDMA_PAGE_DIR_MAX_PAGES) {
+ dev_warn(&dev->pdev->dev,
+ "overflow pages in completion queue\n");
+ ret = -EINVAL;
+ goto err_umem;
+ }
+
+ ret = pvrdma_page_dir_init(dev, &cq->pdir, npages, cq->is_kernel);
+ if (ret) {
+ dev_warn(&dev->pdev->dev,
+ "could not allocate page directory\n");
+ goto err_umem;
+ }
+
+ /* Ring state is always the first page. Set in library for user cq. */
+ if (cq->is_kernel)
+ cq->ring_state = cq->pdir.pages[0];
+ else
+ pvrdma_page_dir_insert_umem(&cq->pdir, cq->umem, 0);
+
+ atomic_set(&cq->refcnt, 1);
+ init_waitqueue_head(&cq->wait);
+ spin_lock_init(&cq->cq_lock);
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_CREATE_CQ;
+ cmd->nchunks = npages;
+ cmd->ctx_handle = (context) ?
+ (u64)to_vucontext(context)->ctx_handle : 0;
+ cmd->cqe = entries;
+ cmd->pdir_dma = cq->pdir.dir_dma;
+ ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_CREATE_CQ_RESP);
+ if (ret < 0) {
+ dev_warn(&dev->pdev->dev,
+ "could not create completion queue, error: %d\n", ret);
+ goto err_page_dir;
+ }
+
+ cq->ibcq.cqe = resp->cqe;
+ cq->cq_handle = resp->cq_handle;
+ spin_lock_irqsave(&dev->cq_tbl_lock, flags);
+ dev->cq_tbl[cq->cq_handle % dev->dsr->caps.max_cq] = cq;
+ spin_unlock_irqrestore(&dev->cq_tbl_lock, flags);
+
+ if (context) {
+ cq->uar = &(to_vucontext(context)->uar);
+
+ /* Copy udata back. */
+ if (ib_copy_to_udata(udata, &cq->cq_handle, sizeof(__u32))) {
+ dev_warn(&dev->pdev->dev,
+ "failed to copy back udata\n");
+ pvrdma_destroy_cq(&cq->ibcq);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ return &cq->ibcq;
+
+err_page_dir:
+ pvrdma_page_dir_cleanup(dev, &cq->pdir);
+err_umem:
+ if (context)
+ ib_umem_release(cq->umem);
+err_cq:
+ atomic_dec(&dev->num_cqs);
+ kfree(cq);
+
+ return ERR_PTR(ret);
+}
+
+static void pvrdma_free_cq(struct pvrdma_dev *dev, struct pvrdma_cq *cq)
+{
+ atomic_dec(&cq->refcnt);
+ wait_event(cq->wait, !atomic_read(&cq->refcnt));
+
+ if (!cq->is_kernel)
+ ib_umem_release(cq->umem);
+
+ pvrdma_page_dir_cleanup(dev, &cq->pdir);
+ kfree(cq);
+}
+
+/**
+ * pvrdma_destroy_cq - destroy completion queue
+ * @cq: the completion queue to destroy.
+ *
+ * @return: 0 for success.
+ */
+int pvrdma_destroy_cq(struct ib_cq *cq)
+{
+ struct pvrdma_cq *vcq = to_vcq(cq);
+ union pvrdma_cmd_req req;
+ struct pvrdma_cmd_destroy_cq *cmd = &req.destroy_cq;
+ struct pvrdma_dev *dev = to_vdev(cq->device);
+ unsigned long flags;
+ int ret;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_DESTROY_CQ;
+ cmd->cq_handle = vcq->cq_handle;
+
+ ret = pvrdma_cmd_post(dev, &req, NULL, 0);
+ if (ret < 0)
+ dev_warn(&dev->pdev->dev,
+ "could not destroy completion queue, error: %d\n",
+ ret);
+
+ /* free cq's resources */
+ spin_lock_irqsave(&dev->cq_tbl_lock, flags);
+ dev->cq_tbl[vcq->cq_handle] = NULL;
+ spin_unlock_irqrestore(&dev->cq_tbl_lock, flags);
+
+ pvrdma_free_cq(dev, vcq);
+ atomic_dec(&dev->num_cqs);
+
+ return ret;
+}
+
+/**
+ * pvrdma_modify_cq - modify the CQ moderation parameters
+ * @ibcq: the CQ to modify
+ * @cq_count: number of CQEs that will trigger an event
+ * @cq_period: max period of time in usec before triggering an event
+ *
+ * @return: -EOPNOTSUPP as CQ resize is not supported.
+ */
+int pvrdma_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline struct pvrdma_cqe *get_cqe(struct pvrdma_cq *cq, int i)
+{
+ return (struct pvrdma_cqe *)pvrdma_page_dir_get_ptr(
+ &cq->pdir,
+ cq->offset +
+ sizeof(struct pvrdma_cqe) * i);
+}
+
+void _pvrdma_flush_cqe(struct pvrdma_qp *qp, struct pvrdma_cq *cq)
+{
+ int head;
+ int has_data;
+
+ if (!cq->is_kernel)
+ return;
+
+ /* Lock held */
+ has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
+ cq->ibcq.cqe, &head);
+ if (unlikely(has_data > 0)) {
+ int items;
+ int curr;
+ int tail = pvrdma_idx(&cq->ring_state->rx.prod_tail,
+ cq->ibcq.cqe);
+ struct pvrdma_cqe *cqe;
+ struct pvrdma_cqe *curr_cqe;
+
+ items = (tail > head) ? (tail - head) :
+ (cq->ibcq.cqe - head + tail);
+ curr = --tail;
+ while (items-- > 0) {
+ if (curr < 0)
+ curr = cq->ibcq.cqe - 1;
+ if (tail < 0)
+ tail = cq->ibcq.cqe - 1;
+ curr_cqe = get_cqe(cq, curr);
+ if ((curr_cqe->qp & 0xFFFF) != qp->qp_handle) {
+ if (curr != tail) {
+ cqe = get_cqe(cq, tail);
+ *cqe = *curr_cqe;
+ }
+ tail--;
+ } else {
+ pvrdma_idx_ring_inc(
+ &cq->ring_state->rx.cons_head,
+ cq->ibcq.cqe);
+ }
+ curr--;
+ }
+ }
+}
+
+static int pvrdma_poll_one(struct pvrdma_cq *cq, struct pvrdma_qp **cur_qp,
+ struct ib_wc *wc)
+{
+ struct pvrdma_dev *dev = to_vdev(cq->ibcq.device);
+ int has_data;
+ unsigned int head;
+ bool tried = false;
+ struct pvrdma_cqe *cqe;
+
+retry:
+ has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
+ cq->ibcq.cqe, &head);
+ if (has_data == 0) {
+ if (tried)
+ return -EAGAIN;
+
+ pvrdma_write_uar_cq(dev, cq->cq_handle | PVRDMA_UAR_CQ_POLL);
+
+ tried = true;
+ goto retry;
+ } else if (has_data == PVRDMA_INVALID_IDX) {
+ dev_err(&dev->pdev->dev, "CQ ring state invalid\n");
+ return -EAGAIN;
+ }
+
+ cqe = get_cqe(cq, head);
+
+ /* Ensure cqe is valid. */
+ rmb();
+ if (dev->qp_tbl[cqe->qp & 0xffff])
+ *cur_qp = (struct pvrdma_qp *)dev->qp_tbl[cqe->qp & 0xffff];
+ else
+ return -EAGAIN;
+
+ wc->opcode = pvrdma_wc_opcode_to_ib(cqe->opcode);
+ wc->status = pvrdma_wc_status_to_ib(cqe->status);
+ wc->wr_id = cqe->wr_id;
+ wc->qp = &(*cur_qp)->ibqp;
+ wc->byte_len = cqe->byte_len;
+ wc->ex.imm_data = cqe->imm_data;
+ wc->src_qp = cqe->src_qp;
+ wc->wc_flags = pvrdma_wc_flags_to_ib(cqe->wc_flags);
+ wc->pkey_index = cqe->pkey_index;
+ wc->slid = cqe->slid;
+ wc->sl = cqe->sl;
+ wc->dlid_path_bits = cqe->dlid_path_bits;
+ wc->port_num = cqe->port_num;
+ wc->vendor_err = 0;
+
+ /* Update shared ring state */
+ pvrdma_idx_ring_inc(&cq->ring_state->rx.cons_head, cq->ibcq.cqe);
+
+ return 0;
+}
+
+/**
+ * pvrdma_poll_cq - poll for work completion queue entries
+ * @ibcq: completion queue
+ * @num_entries: the maximum number of entries
+ * @entry: pointer to work completion array
+ *
+ * @return: number of polled completion entries
+ */
+int pvrdma_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
+{
+ struct pvrdma_cq *cq = to_vcq(ibcq);
+ struct pvrdma_qp *cur_qp = NULL;
+ unsigned long flags;
+ int npolled;
+
+ if (num_entries < 1 || wc == NULL)
+ return 0;
+
+ spin_lock_irqsave(&cq->cq_lock, flags);
+ for (npolled = 0; npolled < num_entries; ++npolled) {
+ if (pvrdma_poll_one(cq, &cur_qp, wc + npolled))
+ break;
+ }
+
+ spin_unlock_irqrestore(&cq->cq_lock, flags);
+
+ /* Ensure we do not return errors from poll_cq */
+ return npolled;
+}
+
+/**
+ * pvrdma_resize_cq - resize CQ
+ * @ibcq: the completion queue
+ * @entries: CQ entries
+ * @udata: user data
+ *
+ * @return: -EOPNOTSUPP as CQ resize is not supported.
+ */
+int pvrdma_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata)
+{
+ return -EOPNOTSUPP;
+}
--
2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 08/16] IB/pvrdma: Add device command support
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford, linux-rdma, pv-drivers
Cc: Adit Ranadive, netdev, linux-pci, jhansen, asarwade, georgezhang,
bryantan
In-Reply-To: <cover.1474759181.git.aditr@vmware.com>
This patch enables posting Verb requests and receiving responses to/from
the backend PVRDMA emulation layer.
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
Changes v4->v5:
- Moved the timeout to pvrdma_cmd_recv.
- Added additional response code parameter to pvrdma_cmd_post.
Changes v3->v4:
- Removed the min check and added a BUILD_BUG_ON check for size.
Changes v2->v3:
- Converted pvrdma_cmd_recv to inline.
- Added a min check in the memcpy to cmd_slot.
- Removed the boolean from pvrdma_cmd_post.
---
drivers/infiniband/hw/pvrdma/pvrdma_cmd.c | 117 ++++++++++++++++++++++++++++++
1 file changed, 117 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_cmd.c
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_cmd.c b/drivers/infiniband/hw/pvrdma/pvrdma_cmd.c
new file mode 100644
index 0000000..21f1af8
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_cmd.c
@@ -0,0 +1,117 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/list.h>
+
+#include "pvrdma.h"
+
+#define PVRDMA_CMD_TIMEOUT 10000 /* ms */
+
+static inline int pvrdma_cmd_recv(struct pvrdma_dev *dev,
+ union pvrdma_cmd_resp *resp,
+ unsigned resp_code)
+{
+ int err;
+
+ dev_dbg(&dev->pdev->dev, "receive response from device\n");
+
+ err = wait_for_completion_interruptible_timeout(&dev->cmd_done,
+ msecs_to_jiffies(PVRDMA_CMD_TIMEOUT));
+ if (err == 0 || err == -ERESTARTSYS) {
+ dev_warn(&dev->pdev->dev,
+ "completion timeout or interrupted\n");
+ return -ETIMEDOUT;
+ }
+
+ spin_lock(&dev->cmd_lock);
+ memcpy(resp, dev->resp_slot, sizeof(*resp));
+ spin_unlock(&dev->cmd_lock);
+
+ if (resp->hdr.ack != resp_code) {
+ dev_warn(&dev->pdev->dev,
+ "unknown response %#x expected %#x\n",
+ resp->hdr.ack, resp_code);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int
+pvrdma_cmd_post(struct pvrdma_dev *dev, union pvrdma_cmd_req *req,
+ union pvrdma_cmd_resp *resp, unsigned resp_code)
+{
+ int err;
+
+ dev_dbg(&dev->pdev->dev, "post request to device\n");
+
+ /* Serializiation */
+ down(&dev->cmd_sema);
+
+ BUILD_BUG_ON(sizeof(union pvrdma_cmd_req) !=
+ sizeof(struct pvrdma_cmd_modify_qp));
+
+ spin_lock(&dev->cmd_lock);
+ memcpy(dev->cmd_slot, req, sizeof(*req));
+ spin_unlock(&dev->cmd_lock);
+
+ init_completion(&dev->cmd_done);
+ pvrdma_write_reg(dev, PVRDMA_REG_REQUEST, 0);
+
+ /* Make sure the request is written before reading status. */
+ mb();
+
+ err = pvrdma_read_reg(dev, PVRDMA_REG_ERR);
+ if (err == 0) {
+ if (resp != NULL)
+ err = pvrdma_cmd_recv(dev, resp, resp_code);
+ } else {
+ dev_warn(&dev->pdev->dev, "failed to write request %d\n", err);
+ }
+
+ up(&dev->cmd_sema);
+
+ return err;
+}
--
2.7.4
^ permalink raw reply related
* [PATCH v5 07/16] IB/pvrdma: Add helper functions
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-pci-u79uwXL29TY76Z2rM5mHXA, jhansen-pghWNbHTmq7QT0dZR+AlfA,
asarwade-pghWNbHTmq7QT0dZR+AlfA,
georgezhang-pghWNbHTmq7QT0dZR+AlfA,
bryantan-pghWNbHTmq7QT0dZR+AlfA
In-Reply-To: <cover.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
This patch adds helper functions to store guest page addresses in a page
directory structure. The page directory pointer is passed down to the
backend which then maps the entire memory for the RDMA object by
traversing the directory. We add some more helper functions for converting
to/from RDMA stack address handles from/to PVRDMA ones.
Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Jorgen Hansen <jhansen-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: George Zhang <georgezhang-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: Aditya Sarwade <asarwade-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Reviewed-by: Bryan Tan <bryantan-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
Changes v4->v5:
- Correct var type passed to dma_alloc_coherent.
Changes v3->v4:
- Updated conversion functions to func_name(dst, src) format.
- Removed unneeded local variables.
---
drivers/infiniband/hw/pvrdma/pvrdma_misc.c | 304 +++++++++++++++++++++++++++++
1 file changed, 304 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_misc.c
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_misc.c b/drivers/infiniband/hw/pvrdma/pvrdma_misc.c
new file mode 100644
index 0000000..948b5cc
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_misc.c
@@ -0,0 +1,304 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/errno.h>
+#include <linux/slab.h>
+#include <linux/bitmap.h>
+
+#include "pvrdma.h"
+
+int pvrdma_page_dir_init(struct pvrdma_dev *dev, struct pvrdma_page_dir *pdir,
+ u64 npages, bool alloc_pages)
+{
+ u64 i;
+
+ if (npages > PVRDMA_PAGE_DIR_MAX_PAGES)
+ return -EINVAL;
+
+ memset(pdir, 0, sizeof(*pdir));
+
+ pdir->dir = dma_alloc_coherent(&dev->pdev->dev, PAGE_SIZE,
+ &pdir->dir_dma, GFP_KERNEL);
+ if (!pdir->dir)
+ goto err;
+
+ pdir->ntables = PVRDMA_PAGE_DIR_TABLE(npages - 1) + 1;
+ pdir->tables = kcalloc(pdir->ntables, sizeof(*pdir->tables),
+ GFP_KERNEL);
+ if (!pdir->tables)
+ goto err;
+
+ for (i = 0; i < pdir->ntables; i++) {
+ pdir->tables[i] = dma_alloc_coherent(&dev->pdev->dev, PAGE_SIZE,
+ (dma_addr_t *)&pdir->dir[i],
+ GFP_KERNEL);
+ if (!pdir->tables[i])
+ goto err;
+ }
+
+ pdir->npages = npages;
+
+ if (alloc_pages) {
+ pdir->pages = kcalloc(npages, sizeof(*pdir->pages),
+ GFP_KERNEL);
+ if (!pdir->pages)
+ goto err;
+
+ for (i = 0; i < pdir->npages; i++) {
+ dma_addr_t page_dma;
+
+ pdir->pages[i] = dma_alloc_coherent(&dev->pdev->dev,
+ PAGE_SIZE,
+ &page_dma,
+ GFP_KERNEL);
+ if (!pdir->pages[i])
+ goto err;
+
+ pvrdma_page_dir_insert_dma(pdir, i, page_dma);
+ }
+ }
+
+ return 0;
+
+err:
+ pvrdma_page_dir_cleanup(dev, pdir);
+
+ return -ENOMEM;
+}
+
+static u64 *pvrdma_page_dir_table(struct pvrdma_page_dir *pdir, u64 idx)
+{
+ return pdir->tables[PVRDMA_PAGE_DIR_TABLE(idx)];
+}
+
+dma_addr_t pvrdma_page_dir_get_dma(struct pvrdma_page_dir *pdir, u64 idx)
+{
+ return pvrdma_page_dir_table(pdir, idx)[PVRDMA_PAGE_DIR_PAGE(idx)];
+}
+
+static void pvrdma_page_dir_cleanup_pages(struct pvrdma_dev *dev,
+ struct pvrdma_page_dir *pdir)
+{
+ if (pdir->pages) {
+ u64 i;
+
+ for (i = 0; i < pdir->npages && pdir->pages[i]; i++) {
+ dma_addr_t page_dma = pvrdma_page_dir_get_dma(pdir, i);
+
+ dma_free_coherent(&dev->pdev->dev, PAGE_SIZE,
+ pdir->pages[i], page_dma);
+ }
+
+ kfree(pdir->pages);
+ }
+}
+
+static void pvrdma_page_dir_cleanup_tables(struct pvrdma_dev *dev,
+ struct pvrdma_page_dir *pdir)
+{
+ if (pdir->tables) {
+ int i;
+
+ pvrdma_page_dir_cleanup_pages(dev, pdir);
+
+ for (i = 0; i < pdir->ntables; i++) {
+ u64 *table = pdir->tables[i];
+
+ if (table)
+ dma_free_coherent(&dev->pdev->dev, PAGE_SIZE,
+ table, pdir->dir[i]);
+ }
+
+ kfree(pdir->tables);
+ }
+}
+
+void pvrdma_page_dir_cleanup(struct pvrdma_dev *dev,
+ struct pvrdma_page_dir *pdir)
+{
+ if (pdir->dir) {
+ pvrdma_page_dir_cleanup_tables(dev, pdir);
+ dma_free_coherent(&dev->pdev->dev, PAGE_SIZE,
+ pdir->dir, pdir->dir_dma);
+ }
+}
+
+int pvrdma_page_dir_insert_dma(struct pvrdma_page_dir *pdir, u64 idx,
+ dma_addr_t daddr)
+{
+ u64 *table;
+
+ if (idx >= pdir->npages)
+ return -EINVAL;
+
+ table = pvrdma_page_dir_table(pdir, idx);
+ table[PVRDMA_PAGE_DIR_PAGE(idx)] = daddr;
+
+ return 0;
+}
+
+int pvrdma_page_dir_insert_umem(struct pvrdma_page_dir *pdir,
+ struct ib_umem *umem, u64 offset)
+{
+ u64 i = offset;
+ int j, entry;
+ int ret = 0, len = 0;
+ struct scatterlist *sg;
+
+ if (offset >= pdir->npages)
+ return -EINVAL;
+
+ for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
+ len = sg_dma_len(sg) >> PAGE_SHIFT;
+ for (j = 0; j < len; j++) {
+ dma_addr_t addr = sg_dma_address(sg) +
+ umem->page_size * j;
+
+ ret = pvrdma_page_dir_insert_dma(pdir, i, addr);
+ if (ret)
+ goto exit;
+
+ i++;
+ }
+ }
+
+exit:
+ return ret;
+}
+
+int pvrdma_page_dir_insert_page_list(struct pvrdma_page_dir *pdir,
+ u64 *page_list,
+ int num_pages)
+{
+ int i;
+ int ret;
+
+ if (num_pages > pdir->npages)
+ return -EINVAL;
+
+ for (i = 0; i < num_pages; i++) {
+ ret = pvrdma_page_dir_insert_dma(pdir, i, page_list[i]);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+void pvrdma_qp_cap_to_ib(struct ib_qp_cap *dst, const struct pvrdma_qp_cap *src)
+{
+ dst->max_send_wr = src->max_send_wr;
+ dst->max_recv_wr = src->max_recv_wr;
+ dst->max_send_sge = src->max_send_sge;
+ dst->max_recv_sge = src->max_recv_sge;
+ dst->max_inline_data = src->max_inline_data;
+}
+
+void ib_qp_cap_to_pvrdma(struct pvrdma_qp_cap *dst, const struct ib_qp_cap *src)
+{
+ dst->max_send_wr = src->max_send_wr;
+ dst->max_recv_wr = src->max_recv_wr;
+ dst->max_send_sge = src->max_send_sge;
+ dst->max_recv_sge = src->max_recv_sge;
+ dst->max_inline_data = src->max_inline_data;
+}
+
+void pvrdma_gid_to_ib(union ib_gid *dst, const union pvrdma_gid *src)
+{
+ BUILD_BUG_ON(sizeof(union pvrdma_gid) != sizeof(union ib_gid));
+ memcpy(dst, src, sizeof(*src));
+}
+
+void ib_gid_to_pvrdma(union pvrdma_gid *dst, const union ib_gid *src)
+{
+ BUILD_BUG_ON(sizeof(union pvrdma_gid) != sizeof(union ib_gid));
+ memcpy(dst, src, sizeof(*src));
+}
+
+void pvrdma_global_route_to_ib(struct ib_global_route *dst,
+ const struct pvrdma_global_route *src)
+{
+ pvrdma_gid_to_ib(&dst->dgid, &src->dgid);
+ dst->flow_label = src->flow_label;
+ dst->sgid_index = src->sgid_index;
+ dst->hop_limit = src->hop_limit;
+ dst->traffic_class = src->traffic_class;
+}
+
+void ib_global_route_to_pvrdma(struct pvrdma_global_route *dst,
+ const struct ib_global_route *src)
+{
+ ib_gid_to_pvrdma(&dst->dgid, &src->dgid);
+ dst->flow_label = src->flow_label;
+ dst->sgid_index = src->sgid_index;
+ dst->hop_limit = src->hop_limit;
+ dst->traffic_class = src->traffic_class;
+}
+
+void pvrdma_ah_attr_to_ib(struct ib_ah_attr *dst,
+ const struct pvrdma_ah_attr *src)
+{
+ pvrdma_global_route_to_ib(&dst->grh, &src->grh);
+ dst->dlid = src->dlid;
+ dst->sl = src->sl;
+ dst->src_path_bits = src->src_path_bits;
+ dst->static_rate = src->static_rate;
+ dst->ah_flags = src->ah_flags;
+ dst->port_num = src->port_num;
+ memcpy(&dst->dmac, &src->dmac, sizeof(dst->dmac));
+}
+
+void ib_ah_attr_to_pvrdma(struct pvrdma_ah_attr *dst,
+ const struct ib_ah_attr *src)
+{
+ ib_global_route_to_pvrdma(&dst->grh, &src->grh);
+ dst->dlid = src->dlid;
+ dst->sl = src->sl;
+ dst->src_path_bits = src->src_path_bits;
+ dst->static_rate = src->static_rate;
+ dst->ah_flags = src->ah_flags;
+ dst->port_num = src->port_num;
+ memcpy(&dst->dmac, &src->dmac, sizeof(dst->dmac));
+}
--
2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 06/16] IB/pvrdma: Add paravirtual rdma device
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford, linux-rdma, pv-drivers
Cc: Adit Ranadive, netdev, linux-pci, jhansen, asarwade, georgezhang,
bryantan
In-Reply-To: <cover.1474759181.git.aditr@vmware.com>
This patch adds the main device-level structures and functions to be used
to provide RDMA functionality. Also, we define conversion functions from
the IB core stack structures to the device-specific ones.
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
Changes v4->v5:
- pvrdma_cmd_post takes the response code.
Changes v3->v4:
- Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we hold a lock
to call it.
- Added wrapper functions for writing to UARs for CQ/QP.
- The conversion functions are updated as func_name(dst, src) format.
- Renamed max_gs to max_sg.
- Added work struct for net device events.
- priviledged -> privileged.
Changes v2->v3:
- Removed VMware vendor id redefinition.
- Removed the boolean in pvrdma_cmd_post.
---
drivers/infiniband/hw/pvrdma/pvrdma.h | 473 ++++++++++++++++++++++++++++++++++
1 file changed, 473 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma.h
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma.h b/drivers/infiniband/hw/pvrdma/pvrdma.h
new file mode 100644
index 0000000..8073ffc
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma.h
@@ -0,0 +1,473 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_H__
+#define __PVRDMA_H__
+
+#include <linux/compiler.h>
+#include <linux/interrupt.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/semaphore.h>
+#include <linux/workqueue.h>
+#include <rdma/ib_umem.h>
+#include <rdma/ib_verbs.h>
+
+#include "pvrdma_defs.h"
+#include "pvrdma_dev_api.h"
+#include "pvrdma_verbs.h"
+
+/* NOT the same as BIT_MASK(). */
+#define PVRDMA_MASK(n) ((n << 1) - 1)
+
+/*
+ * VMware PVRDMA PCI device id.
+ */
+#define PCI_DEVICE_ID_VMWARE_PVRDMA 0x0820
+
+struct pvrdma_dev;
+
+struct pvrdma_page_dir {
+ dma_addr_t dir_dma;
+ u64 *dir;
+ int ntables;
+ u64 **tables;
+ u64 npages;
+ void **pages;
+};
+
+struct pvrdma_cq {
+ struct ib_cq ibcq;
+ int offset;
+ spinlock_t cq_lock; /* Poll lock. */
+ struct pvrdma_uar_map *uar;
+ struct ib_umem *umem;
+ struct pvrdma_ring_state *ring_state;
+ struct pvrdma_page_dir pdir;
+ u32 cq_handle;
+ bool is_kernel;
+ atomic_t refcnt;
+ wait_queue_head_t wait;
+};
+
+struct pvrdma_id_table {
+ u32 last;
+ u32 top;
+ u32 max;
+ u32 mask;
+ spinlock_t lock; /* Table lock. */
+ unsigned long *table;
+};
+
+struct pvrdma_uar_map {
+ unsigned long pfn;
+ void __iomem *map;
+ int index;
+};
+
+struct pvrdma_uar_table {
+ struct pvrdma_id_table tbl;
+ int size;
+};
+
+struct pvrdma_ucontext {
+ struct ib_ucontext ibucontext;
+ struct pvrdma_dev *dev;
+ struct pvrdma_uar_map uar;
+ u64 ctx_handle;
+};
+
+struct pvrdma_pd {
+ struct ib_pd ibpd;
+ u32 pdn;
+ u32 pd_handle;
+ int privileged;
+};
+
+struct pvrdma_mr {
+ u32 mr_handle;
+ u64 iova;
+ u64 size;
+};
+
+struct pvrdma_user_mr {
+ struct ib_mr ibmr;
+ struct ib_umem *umem;
+ struct pvrdma_mr mmr;
+ struct pvrdma_page_dir pdir;
+ u64 *pages;
+ u32 npages;
+ u32 max_pages;
+ u32 page_shift;
+};
+
+struct pvrdma_wq {
+ struct pvrdma_ring *ring;
+ spinlock_t lock; /* Work queue lock. */
+ int wqe_cnt;
+ int wqe_size;
+ int max_sg;
+ int offset;
+};
+
+struct pvrdma_ah {
+ struct ib_ah ibah;
+ struct pvrdma_av av;
+};
+
+struct pvrdma_qp {
+ struct ib_qp ibqp;
+ u32 qp_handle;
+ u32 qkey;
+ struct pvrdma_wq sq;
+ struct pvrdma_wq rq;
+ struct ib_umem *rumem;
+ struct ib_umem *sumem;
+ struct pvrdma_page_dir pdir;
+ int npages;
+ int npages_send;
+ int npages_recv;
+ u32 flags;
+ u8 port;
+ u8 state;
+ bool is_kernel;
+ struct mutex mutex; /* QP state mutex. */
+ atomic_t refcnt;
+ wait_queue_head_t wait;
+};
+
+struct pvrdma_dev {
+ /* PCI device-related information. */
+ struct ib_device ib_dev;
+ struct pci_dev *pdev;
+ void __iomem *regs;
+ struct pvrdma_device_shared_region *dsr; /* Shared region pointer */
+ dma_addr_t dsrbase; /* Shared region base address */
+ void *cmd_slot;
+ void *resp_slot;
+ unsigned long flags;
+ struct list_head device_link;
+
+ /* Locking and interrupt information. */
+ spinlock_t cmd_lock; /* Command lock. */
+ struct semaphore cmd_sema;
+ struct completion cmd_done;
+ struct {
+ enum pvrdma_intr_type type; /* Intr type */
+ struct msix_entry msix_entry[PVRDMA_MAX_INTERRUPTS];
+ irq_handler_t handler[PVRDMA_MAX_INTERRUPTS];
+ u8 enabled[PVRDMA_MAX_INTERRUPTS];
+ u8 size;
+ } intr;
+
+ /* RDMA-related device information. */
+ union ib_gid *sgid_tbl;
+ struct pvrdma_ring_state *async_ring_state;
+ struct pvrdma_page_dir async_pdir;
+ struct pvrdma_ring_state *cq_ring_state;
+ struct pvrdma_page_dir cq_pdir;
+ struct pvrdma_cq **cq_tbl;
+ spinlock_t cq_tbl_lock;
+ struct pvrdma_qp **qp_tbl;
+ spinlock_t qp_tbl_lock;
+ struct pvrdma_uar_table uar_table;
+ struct pvrdma_uar_map driver_uar;
+ __be64 sys_image_guid;
+ spinlock_t desc_lock; /* Device modification lock. */
+ u32 port_cap_mask;
+ struct mutex port_mutex; /* Port modification mutex. */
+ bool ib_active;
+ atomic_t num_qps;
+ atomic_t num_cqs;
+ atomic_t num_pds;
+ atomic_t num_ahs;
+
+ /* Network device information. */
+ struct net_device *netdev;
+ struct notifier_block nb_netdev;
+};
+
+struct pvrdma_netdevice_work {
+ struct work_struct work;
+ struct net_device *event_netdev;
+ unsigned long event;
+};
+
+static inline struct pvrdma_dev *to_vdev(struct ib_device *ibdev)
+{
+ return container_of(ibdev, struct pvrdma_dev, ib_dev);
+}
+
+static inline struct
+pvrdma_ucontext *to_vucontext(struct ib_ucontext *ibucontext)
+{
+ return container_of(ibucontext, struct pvrdma_ucontext, ibucontext);
+}
+
+static inline struct pvrdma_pd *to_vpd(struct ib_pd *ibpd)
+{
+ return container_of(ibpd, struct pvrdma_pd, ibpd);
+}
+
+static inline struct pvrdma_cq *to_vcq(struct ib_cq *ibcq)
+{
+ return container_of(ibcq, struct pvrdma_cq, ibcq);
+}
+
+static inline struct pvrdma_user_mr *to_vmr(struct ib_mr *ibmr)
+{
+ return container_of(ibmr, struct pvrdma_user_mr, ibmr);
+}
+
+static inline struct pvrdma_qp *to_vqp(struct ib_qp *ibqp)
+{
+ return container_of(ibqp, struct pvrdma_qp, ibqp);
+}
+
+static inline struct pvrdma_ah *to_vah(struct ib_ah *ibah)
+{
+ return container_of(ibah, struct pvrdma_ah, ibah);
+}
+
+static inline void pvrdma_write_reg(struct pvrdma_dev *dev, u32 reg, u32 val)
+{
+ writel(cpu_to_le32(val), dev->regs + reg);
+}
+
+static inline u32 pvrdma_read_reg(struct pvrdma_dev *dev, u32 reg)
+{
+ return le32_to_cpu(readl(dev->regs + reg));
+}
+
+static inline void pvrdma_write_uar_cq(struct pvrdma_dev *dev, u32 val)
+{
+ writel(cpu_to_le32(val), dev->driver_uar.map + PVRDMA_UAR_CQ_OFFSET);
+}
+
+static inline void pvrdma_write_uar_qp(struct pvrdma_dev *dev, u32 val)
+{
+ writel(cpu_to_le32(val), dev->driver_uar.map + PVRDMA_UAR_QP_OFFSET);
+}
+
+static inline void *pvrdma_page_dir_get_ptr(struct pvrdma_page_dir *pdir,
+ u64 offset)
+{
+ return pdir->pages[offset / PAGE_SIZE] + (offset % PAGE_SIZE);
+}
+
+static inline enum pvrdma_mtu ib_mtu_to_pvrdma(enum ib_mtu mtu)
+{
+ return (enum pvrdma_mtu)mtu;
+}
+
+static inline enum ib_mtu pvrdma_mtu_to_ib(enum pvrdma_mtu mtu)
+{
+ return (enum ib_mtu)mtu;
+}
+
+static inline enum pvrdma_port_state ib_port_state_to_pvrdma(
+ enum ib_port_state state)
+{
+ return (enum pvrdma_port_state)state;
+}
+
+static inline enum ib_port_state pvrdma_port_state_to_ib(
+ enum pvrdma_port_state state)
+{
+ return (enum ib_port_state)state;
+}
+
+static inline int ib_port_cap_flags_to_pvrdma(int flags)
+{
+ return flags & PVRDMA_MASK(PVRDMA_PORT_CAP_FLAGS_MAX);
+}
+
+static inline int pvrdma_port_cap_flags_to_ib(int flags)
+{
+ return flags;
+}
+
+static inline enum pvrdma_port_width ib_port_width_to_pvrdma(
+ enum ib_port_width width)
+{
+ return (enum pvrdma_port_width)width;
+}
+
+static inline enum ib_port_width pvrdma_port_width_to_ib(
+ enum pvrdma_port_width width)
+{
+ return (enum ib_port_width)width;
+}
+
+static inline enum pvrdma_port_speed ib_port_speed_to_pvrdma(
+ enum ib_port_speed speed)
+{
+ return (enum pvrdma_port_speed)speed;
+}
+
+static inline enum ib_port_speed pvrdma_port_speed_to_ib(
+ enum pvrdma_port_speed speed)
+{
+ return (enum ib_port_speed)speed;
+}
+
+static inline int pvrdma_qp_attr_mask_to_ib(int attr_mask)
+{
+ return attr_mask;
+}
+
+static inline int ib_qp_attr_mask_to_pvrdma(int attr_mask)
+{
+ return attr_mask & PVRDMA_MASK(PVRDMA_QP_ATTR_MASK_MAX);
+}
+
+static inline enum pvrdma_mig_state ib_mig_state_to_pvrdma(
+ enum ib_mig_state state)
+{
+ return (enum pvrdma_mig_state)state;
+}
+
+static inline enum ib_mig_state pvrdma_mig_state_to_ib(
+ enum pvrdma_mig_state state)
+{
+ return (enum ib_mig_state)state;
+}
+
+static inline int ib_access_flags_to_pvrdma(int flags)
+{
+ return flags;
+}
+
+static inline int pvrdma_access_flags_to_ib(int flags)
+{
+ return flags & PVRDMA_MASK(PVRDMA_ACCESS_FLAGS_MAX);
+}
+
+static inline enum pvrdma_qp_type ib_qp_type_to_pvrdma(enum ib_qp_type type)
+{
+ return (enum pvrdma_qp_type)type;
+}
+
+static inline enum ib_qp_type pvrdma_qp_type_to_ib(enum pvrdma_qp_type type)
+{
+ return (enum ib_qp_type)type;
+}
+
+static inline enum pvrdma_qp_state ib_qp_state_to_pvrdma(enum ib_qp_state state)
+{
+ return (enum pvrdma_qp_state)state;
+}
+
+static inline enum ib_qp_state pvrdma_qp_state_to_ib(enum pvrdma_qp_state state)
+{
+ return (enum ib_qp_state)state;
+}
+
+static inline enum pvrdma_wr_opcode ib_wr_opcode_to_pvrdma(enum ib_wr_opcode op)
+{
+ return (enum pvrdma_wr_opcode)op;
+}
+
+static inline enum ib_wc_status pvrdma_wc_status_to_ib(
+ enum pvrdma_wc_status status)
+{
+ return (enum ib_wc_status)status;
+}
+
+static inline int pvrdma_wc_opcode_to_ib(int opcode)
+{
+ return opcode;
+}
+
+static inline int pvrdma_wc_flags_to_ib(int flags)
+{
+ return flags;
+}
+
+static inline int ib_send_flags_to_pvrdma(int flags)
+{
+ return flags & PVRDMA_MASK(PVRDMA_SEND_FLAGS_MAX);
+}
+
+void pvrdma_qp_cap_to_ib(struct ib_qp_cap *dst,
+ const struct pvrdma_qp_cap *src);
+void ib_qp_cap_to_pvrdma(struct pvrdma_qp_cap *dst,
+ const struct ib_qp_cap *src);
+void pvrdma_gid_to_ib(union ib_gid *dst, const union pvrdma_gid *src);
+void ib_gid_to_pvrdma(union pvrdma_gid *dst, const union ib_gid *src);
+void pvrdma_global_route_to_ib(struct ib_global_route *dst,
+ const struct pvrdma_global_route *src);
+void ib_global_route_to_pvrdma(struct pvrdma_global_route *dst,
+ const struct ib_global_route *src);
+void pvrdma_ah_attr_to_ib(struct ib_ah_attr *dst,
+ const struct pvrdma_ah_attr *src);
+void ib_ah_attr_to_pvrdma(struct pvrdma_ah_attr *dst,
+ const struct ib_ah_attr *src);
+
+int pvrdma_uar_table_init(struct pvrdma_dev *dev);
+void pvrdma_uar_table_cleanup(struct pvrdma_dev *dev);
+
+int pvrdma_uar_alloc(struct pvrdma_dev *dev, struct pvrdma_uar_map *uar);
+void pvrdma_uar_free(struct pvrdma_dev *dev, struct pvrdma_uar_map *uar);
+
+void _pvrdma_flush_cqe(struct pvrdma_qp *qp, struct pvrdma_cq *cq);
+
+int pvrdma_page_dir_init(struct pvrdma_dev *dev, struct pvrdma_page_dir *pdir,
+ u64 npages, bool alloc_pages);
+void pvrdma_page_dir_cleanup(struct pvrdma_dev *dev,
+ struct pvrdma_page_dir *pdir);
+int pvrdma_page_dir_insert_dma(struct pvrdma_page_dir *pdir, u64 idx,
+ dma_addr_t daddr);
+int pvrdma_page_dir_insert_umem(struct pvrdma_page_dir *pdir,
+ struct ib_umem *umem, u64 offset);
+dma_addr_t pvrdma_page_dir_get_dma(struct pvrdma_page_dir *pdir, u64 idx);
+int pvrdma_page_dir_insert_page_list(struct pvrdma_page_dir *pdir,
+ u64 *page_list, int num_pages);
+
+int pvrdma_cmd_post(struct pvrdma_dev *dev, union pvrdma_cmd_req *req,
+ union pvrdma_cmd_resp *rsp, unsigned resp_code);
+
+#endif /* __PVRDMA_H__ */
--
2.7.4
^ permalink raw reply related
* [PATCH v5 05/16] IB/pvrdma: Add functions for Verbs support
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford, linux-rdma, pv-drivers
Cc: Adit Ranadive, netdev, linux-pci, jhansen, asarwade, georgezhang,
bryantan
In-Reply-To: <cover.1474759181.git.aditr@vmware.com>
This patch implements the remaining Verbs functions registered with the
core RDMA stack.
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
Changes v4->v5:
- Update include for headers in UAPI folder.
- Removed setting any properties that are reported by device as 0.
- Simplified modify_port.
- PD should be allocated first in kernel then in device.
- Update to pvrdma_cmd_post for creating/destroying PD, Query port/device.
Changes v3->v4:
- Renamed priviledged -> privileged.
- Added error numbers for command errors.
- Removed unnecessary goto in modify_device.
- Moved pd allocation to after command execution.
- Removed an incorrect atomic_dec.
---
drivers/infiniband/hw/pvrdma/pvrdma_verbs.c | 577 ++++++++++++++++++++++++++++
drivers/infiniband/hw/pvrdma/pvrdma_verbs.h | 108 ++++++
2 files changed, 685 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_verbs.c
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_verbs.h
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_verbs.c b/drivers/infiniband/hw/pvrdma/pvrdma_verbs.c
new file mode 100644
index 0000000..a7aef93d
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_verbs.c
@@ -0,0 +1,577 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <asm/page.h>
+#include <linux/inet.h>
+#include <linux/io.h>
+#include <rdma/ib_addr.h>
+#include <rdma/ib_smi.h>
+#include <rdma/ib_user_verbs.h>
+#include <rdma/pvrdma-abi.h>
+
+#include "pvrdma.h"
+
+/**
+ * pvrdma_query_device - query device
+ * @ibdev: the device to query
+ * @props: the device properties
+ * @uhw: user data
+ *
+ * @return: 0 on success, otherwise negative errno
+ */
+int pvrdma_query_device(struct ib_device *ibdev,
+ struct ib_device_attr *props,
+ struct ib_udata *uhw)
+{
+ struct pvrdma_dev *dev = to_vdev(ibdev);
+
+ if (uhw->inlen || uhw->outlen)
+ return -EINVAL;
+
+ memset(props, 0, sizeof(*props));
+
+ props->fw_ver = dev->dsr->caps.fw_ver;
+ props->sys_image_guid = dev->dsr->caps.sys_image_guid;
+ props->max_mr_size = dev->dsr->caps.max_mr_size;
+ props->page_size_cap = dev->dsr->caps.page_size_cap;
+ props->vendor_id = dev->dsr->caps.vendor_id;
+ props->vendor_part_id = dev->pdev->device;
+ props->hw_ver = dev->dsr->caps.hw_ver;
+ props->max_qp = dev->dsr->caps.max_qp;
+ props->max_qp_wr = dev->dsr->caps.max_qp_wr;
+ props->device_cap_flags = dev->dsr->caps.device_cap_flags;
+ props->max_sge = dev->dsr->caps.max_sge;
+ props->max_cq = dev->dsr->caps.max_cq;
+ props->max_cqe = dev->dsr->caps.max_cqe;
+ props->max_mr = dev->dsr->caps.max_mr;
+ props->max_pd = dev->dsr->caps.max_pd;
+ props->max_qp_rd_atom = dev->dsr->caps.max_qp_rd_atom;
+ props->max_qp_init_rd_atom = dev->dsr->caps.max_qp_init_rd_atom;
+ props->atomic_cap =
+ dev->dsr->caps.atomic_ops &
+ (PVRDMA_ATOMIC_OP_COMP_SWAP | PVRDMA_ATOMIC_OP_FETCH_ADD) ?
+ IB_ATOMIC_HCA : IB_ATOMIC_NONE;
+ props->masked_atomic_cap = props->atomic_cap;
+ props->max_ah = dev->dsr->caps.max_ah;
+ props->max_pkeys = dev->dsr->caps.max_pkeys;
+ props->local_ca_ack_delay = dev->dsr->caps.local_ca_ack_delay;
+ if ((dev->dsr->caps.bmme_flags & PVRDMA_BMME_FLAG_LOCAL_INV) &&
+ (dev->dsr->caps.bmme_flags & PVRDMA_BMME_FLAG_REMOTE_INV) &&
+ (dev->dsr->caps.bmme_flags & PVRDMA_BMME_FLAG_FAST_REG_WR)) {
+ props->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS;
+ }
+
+ return 0;
+}
+
+/**
+ * pvrdma_query_port - query device port attributes
+ * @ibdev: the device to query
+ * @port: the port number
+ * @props: the device properties
+ *
+ * @return: 0 on success, otherwise negative errno
+ */
+int pvrdma_query_port(struct ib_device *ibdev, u8 port,
+ struct ib_port_attr *props)
+{
+ struct pvrdma_dev *dev = to_vdev(ibdev);
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_query_port *cmd = &req.query_port;
+ struct pvrdma_cmd_query_port_resp *resp = &rsp.query_port_resp;
+ int err;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_QUERY_PORT;
+ cmd->port_num = port;
+
+ err = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_QUERY_PORT_RESP);
+ if (err < 0) {
+ dev_warn(&dev->pdev->dev,
+ "could not query port, error: %d\n", err);
+ return err;
+ }
+
+ memset(props, 0, sizeof(*props));
+
+ props->state = pvrdma_port_state_to_ib(resp->attrs.state);
+ props->max_mtu = pvrdma_mtu_to_ib(resp->attrs.max_mtu);
+ props->active_mtu = pvrdma_mtu_to_ib(resp->attrs.active_mtu);
+ props->gid_tbl_len = resp->attrs.gid_tbl_len;
+ props->port_cap_flags =
+ pvrdma_port_cap_flags_to_ib(resp->attrs.port_cap_flags);
+ props->max_msg_sz = resp->attrs.max_msg_sz;
+ props->bad_pkey_cntr = resp->attrs.bad_pkey_cntr;
+ props->qkey_viol_cntr = resp->attrs.qkey_viol_cntr;
+ props->pkey_tbl_len = resp->attrs.pkey_tbl_len;
+ props->lid = resp->attrs.lid;
+ props->sm_lid = resp->attrs.sm_lid;
+ props->lmc = resp->attrs.lmc;
+ props->max_vl_num = resp->attrs.max_vl_num;
+ props->sm_sl = resp->attrs.sm_sl;
+ props->subnet_timeout = resp->attrs.subnet_timeout;
+ props->init_type_reply = resp->attrs.init_type_reply;
+ props->active_width = pvrdma_port_width_to_ib(resp->attrs.active_width);
+ props->active_speed = pvrdma_port_speed_to_ib(resp->attrs.active_speed);
+ props->phys_state = resp->attrs.phys_state;
+
+ return 0;
+}
+
+/**
+ * pvrdma_query_gid - query device gid
+ * @ibdev: the device to query
+ * @port: the port number
+ * @index: the index
+ * @gid: the device gid value
+ *
+ * @return: 0 on success, otherwise negative errno
+ */
+int pvrdma_query_gid(struct ib_device *ibdev, u8 port, int index,
+ union ib_gid *gid)
+{
+ struct pvrdma_dev *dev = to_vdev(ibdev);
+
+ if (index >= dev->dsr->caps.gid_tbl_len)
+ return -EINVAL;
+
+ memcpy(gid, &dev->sgid_tbl[index], sizeof(union ib_gid));
+
+ return 0;
+}
+
+/**
+ * pvrdma_query_pkey - query device port's P_Key table
+ * @ibdev: the device to query
+ * @port: the port number
+ * @index: the index
+ * @pkey: the device P_Key value
+ *
+ * @return: 0 on success, otherwise negative errno
+ */
+int pvrdma_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
+ u16 *pkey)
+{
+ int err = 0;
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_query_pkey *cmd = &req.query_pkey;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_QUERY_PKEY;
+ cmd->port_num = port;
+ cmd->index = index;
+
+ err = pvrdma_cmd_post(to_vdev(ibdev), &req, &rsp,
+ PVRDMA_CMD_QUERY_PKEY_RESP);
+ if (err < 0) {
+ dev_warn(&to_vdev(ibdev)->pdev->dev,
+ "could not query pkey, error: %d\n", err);
+ return err;
+ }
+
+ *pkey = rsp.query_pkey_resp.pkey;
+
+ return 0;
+}
+
+enum rdma_link_layer pvrdma_port_link_layer(struct ib_device *ibdev,
+ u8 port)
+{
+ return IB_LINK_LAYER_ETHERNET;
+}
+
+int pvrdma_modify_device(struct ib_device *ibdev, int mask,
+ struct ib_device_modify *props)
+{
+ unsigned long flags;
+
+ if (mask & ~(IB_DEVICE_MODIFY_SYS_IMAGE_GUID |
+ IB_DEVICE_MODIFY_NODE_DESC)) {
+ dev_warn(&to_vdev(ibdev)->pdev->dev,
+ "unsupported device modify mask %#x\n", mask);
+ return -EOPNOTSUPP;
+ }
+
+ if (mask & IB_DEVICE_MODIFY_NODE_DESC) {
+ spin_lock_irqsave(&to_vdev(ibdev)->desc_lock, flags);
+ memcpy(ibdev->node_desc, props->node_desc, 64);
+ spin_unlock_irqrestore(&to_vdev(ibdev)->desc_lock, flags);
+ }
+
+ if (mask & IB_DEVICE_MODIFY_SYS_IMAGE_GUID) {
+ mutex_lock(&to_vdev(ibdev)->port_mutex);
+ to_vdev(ibdev)->sys_image_guid =
+ cpu_to_be64(props->sys_image_guid);
+ mutex_unlock(&to_vdev(ibdev)->port_mutex);
+ }
+
+ return 0;
+}
+
+/**
+ * pvrdma_modify_port - modify device port attributes
+ * @ibdev: the device to modify
+ * @port: the port number
+ * @mask: attributes to modify
+ * @props: the device properties
+ *
+ * @return: 0 on success, otherwise negative errno
+ */
+int pvrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
+ struct ib_port_modify *props)
+{
+ struct ib_port_attr attr;
+ struct pvrdma_dev *vdev = to_vdev(ibdev);
+ int ret;
+
+ if (mask & ~IB_PORT_SHUTDOWN) {
+ dev_warn(&vdev->pdev->dev,
+ "unsupported port modify mask %#x\n", mask);
+ return -EOPNOTSUPP;
+ }
+
+ mutex_lock(&vdev->port_mutex);
+ ret = pvrdma_query_port(ibdev, port, &attr);
+ if (ret)
+ goto out;
+
+ vdev->port_cap_mask |= props->set_port_cap_mask;
+ vdev->port_cap_mask &= ~props->clr_port_cap_mask;
+
+ if (mask & IB_PORT_SHUTDOWN)
+ vdev->ib_active = false;
+
+out:
+ mutex_unlock(&vdev->port_mutex);
+ return ret;
+}
+
+/**
+ * pvrdma_alloc_ucontext - allocate ucontext
+ * @ibdev: the IB device
+ * @udata: user data
+ *
+ * @return: the ib_ucontext pointer on success, otherwise errno.
+ */
+struct ib_ucontext *pvrdma_alloc_ucontext(struct ib_device *ibdev,
+ struct ib_udata *udata)
+{
+ struct pvrdma_dev *vdev = to_vdev(ibdev);
+ struct pvrdma_ucontext *context;
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_create_uc *cmd = &req.create_uc;
+ struct pvrdma_cmd_create_uc_resp *resp = &rsp.create_uc_resp;
+ struct pvrdma_alloc_ucontext_resp uresp;
+ int ret;
+ void *ptr;
+
+ if (!vdev->ib_active)
+ return ERR_PTR(-EAGAIN);
+
+ context = kmalloc(sizeof(*context), GFP_KERNEL);
+ if (!context)
+ return ERR_PTR(-ENOMEM);
+
+ context->dev = vdev;
+ ret = pvrdma_uar_alloc(vdev, &context->uar);
+ if (ret) {
+ kfree(context);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ /* get ctx_handle from host */
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->pfn = context->uar.pfn;
+ cmd->hdr.cmd = PVRDMA_CMD_CREATE_UC;
+ ret = pvrdma_cmd_post(vdev, &req, &rsp, PVRDMA_CMD_CREATE_UC_RESP);
+ if (ret < 0) {
+ dev_warn(&vdev->pdev->dev,
+ "could not create ucontext, error: %d\n", ret);
+ ptr = ERR_PTR(ret);
+ goto err;
+ }
+
+ context->ctx_handle = resp->ctx_handle;
+
+ /* copy back to user */
+ uresp.qp_tab_size = vdev->dsr->caps.max_qp;
+ ret = ib_copy_to_udata(udata, &uresp, sizeof(uresp));
+ if (ret) {
+ pvrdma_uar_free(vdev, &context->uar);
+ context->ibucontext.device = ibdev;
+ pvrdma_dealloc_ucontext(&context->ibucontext);
+ return ERR_PTR(-EFAULT);
+ }
+
+ return &context->ibucontext;
+
+err:
+ pvrdma_uar_free(vdev, &context->uar);
+ kfree(context);
+ return ptr;
+}
+
+/**
+ * pvrdma_dealloc_ucontext - deallocate ucontext
+ * @ibcontext: the ucontext
+ *
+ * @return: 0 on success, otherwise errno.
+ */
+int pvrdma_dealloc_ucontext(struct ib_ucontext *ibcontext)
+{
+ struct pvrdma_ucontext *context = to_vucontext(ibcontext);
+ union pvrdma_cmd_req req;
+ struct pvrdma_cmd_destroy_uc *cmd = &req.destroy_uc;
+ int ret;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_DESTROY_UC;
+ cmd->ctx_handle = context->ctx_handle;
+
+ ret = pvrdma_cmd_post(context->dev, &req, NULL, 0);
+ if (ret < 0)
+ dev_warn(&context->dev->pdev->dev,
+ "destroy ucontext failed, error: %d\n", ret);
+
+ /* Free the UAR even if the device command failed */
+ pvrdma_uar_free(to_vdev(ibcontext->device), &context->uar);
+ kfree(context);
+
+ return ret;
+}
+
+/**
+ * pvrdma_mmap - create mmap region
+ * @ibcontext: the user context
+ * @vma: the VMA
+ *
+ * @return: 0 on success, otherwise errno.
+ */
+int pvrdma_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vma)
+{
+ struct pvrdma_ucontext *context = to_vucontext(ibcontext);
+ unsigned long start = vma->vm_start;
+ unsigned long size = vma->vm_end - vma->vm_start;
+ unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
+
+ dev_dbg(&context->dev->pdev->dev, "create mmap region\n");
+
+ if ((size != PAGE_SIZE) || (offset & ~PAGE_MASK)) {
+ dev_warn(&context->dev->pdev->dev,
+ "invalid params for mmap region\n");
+ return -EINVAL;
+ }
+
+ /* Map UAR to kernel space, VM_LOCKED? */
+ vma->vm_flags |= VM_DONTCOPY | VM_DONTEXPAND;
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+ if (io_remap_pfn_range(vma, start, context->uar.pfn, size,
+ vma->vm_page_prot))
+ return -EAGAIN;
+
+ return 0;
+}
+
+/**
+ * pvrdma_alloc_pd - allocate protection domain
+ * @ibdev: the IB device
+ * @context: user context
+ * @udata: user data
+ *
+ * @return: the ib_pd protection domain pointer on success, otherwise errno.
+ */
+struct ib_pd *pvrdma_alloc_pd(struct ib_device *ibdev,
+ struct ib_ucontext *context,
+ struct ib_udata *udata)
+{
+ struct pvrdma_pd *pd;
+ struct pvrdma_dev *dev = to_vdev(ibdev);
+ union pvrdma_cmd_req req;
+ union pvrdma_cmd_resp rsp;
+ struct pvrdma_cmd_create_pd *cmd = &req.create_pd;
+ struct pvrdma_cmd_create_pd_resp *resp = &rsp.create_pd_resp;
+ int ret;
+ void *ptr;
+
+ /* Check allowed max pds */
+ if (!atomic_add_unless(&dev->num_pds, 1, dev->dsr->caps.max_pd))
+ return ERR_PTR(-ENOMEM);
+
+ pd = kmalloc(sizeof(*pd), GFP_KERNEL);
+ if (!pd) {
+ ptr = ERR_PTR(-ENOMEM);
+ goto err;
+ }
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_CREATE_PD;
+ cmd->ctx_handle = (context) ? to_vucontext(context)->ctx_handle : 0;
+ ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_CREATE_PD_RESP);
+ if (ret < 0) {
+ dev_warn(&dev->pdev->dev,
+ "failed to allocate protection domain, error: %d\n",
+ ret);
+ ptr = ERR_PTR(ret);
+ goto freepd;
+ }
+
+ pd->privileged = !context;
+ pd->pd_handle = resp->pd_handle;
+ pd->pdn = resp->pd_handle;
+
+ if (context) {
+ if (ib_copy_to_udata(udata, &pd->pdn, sizeof(__u32))) {
+ dev_warn(&dev->pdev->dev,
+ "failed to copy back protection domain\n");
+ pvrdma_dealloc_pd(&pd->ibpd);
+ return ERR_PTR(-EFAULT);
+ }
+ }
+
+ /* u32 pd handle */
+ return &pd->ibpd;
+
+freepd:
+ kfree(pd);
+err:
+ atomic_dec(&dev->num_pds);
+ return ptr;
+}
+
+/**
+ * pvrdma_dealloc_pd - deallocate protection domain
+ * @pd: the protection domain to be released
+ *
+ * @return: 0 on success, otherwise errno.
+ */
+int pvrdma_dealloc_pd(struct ib_pd *pd)
+{
+ struct pvrdma_dev *dev = to_vdev(pd->device);
+ union pvrdma_cmd_req req;
+ struct pvrdma_cmd_destroy_pd *cmd = &req.destroy_pd;
+ int ret;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd->hdr.cmd = PVRDMA_CMD_DESTROY_PD;
+ cmd->pd_handle = to_vpd(pd)->pd_handle;
+
+ ret = pvrdma_cmd_post(dev, &req, NULL, 0);
+ if (ret)
+ dev_warn(&dev->pdev->dev,
+ "could not dealloc protection domain, error: %d\n",
+ ret);
+
+ kfree(to_vpd(pd));
+ atomic_dec(&dev->num_pds);
+
+ return 0;
+}
+
+/**
+ * pvrdma_create_ah - create an address handle
+ * @pd: the protection domain
+ * @ah_attr: the attributes of the AH
+ *
+ * @return: the ib_ah pointer on success, otherwise errno.
+ */
+struct ib_ah *pvrdma_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr)
+{
+ struct pvrdma_dev *dev = to_vdev(pd->device);
+ struct pvrdma_ah *ah;
+ enum rdma_link_layer ll;
+
+ if (!(ah_attr->ah_flags & IB_AH_GRH))
+ return ERR_PTR(-EINVAL);
+
+ ll = rdma_port_get_link_layer(pd->device, ah_attr->port_num);
+
+ if (ll != IB_LINK_LAYER_ETHERNET ||
+ rdma_is_multicast_addr((struct in6_addr *)ah_attr->grh.dgid.raw))
+ return ERR_PTR(-EINVAL);
+
+ if (!atomic_add_unless(&dev->num_ahs, 1, dev->dsr->caps.max_ah))
+ return ERR_PTR(-ENOMEM);
+
+ ah = kzalloc(sizeof(*ah), GFP_KERNEL);
+ if (!ah) {
+ atomic_dec(&dev->num_ahs);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ ah->av.port_pd = to_vpd(pd)->pd_handle | (ah_attr->port_num << 24);
+ ah->av.src_path_bits = ah_attr->src_path_bits;
+ ah->av.src_path_bits |= 0x80;
+ ah->av.gid_index = ah_attr->grh.sgid_index;
+ ah->av.hop_limit = ah_attr->grh.hop_limit;
+ ah->av.sl_tclass_flowlabel = (ah_attr->grh.traffic_class << 20) |
+ ah_attr->grh.flow_label;
+ memcpy(ah->av.dgid, ah_attr->grh.dgid.raw, 16);
+ memcpy(ah->av.dmac, ah_attr->dmac, 6);
+
+ ah->ibah.device = pd->device;
+ ah->ibah.pd = pd;
+ ah->ibah.uobject = NULL;
+
+ return &ah->ibah;
+}
+
+/**
+ * pvrdma_destroy_ah - destroy an address handle
+ * @ah: the address handle to destroyed
+ *
+ * @return: 0 on success.
+ */
+int pvrdma_destroy_ah(struct ib_ah *ah)
+{
+ struct pvrdma_dev *dev = to_vdev(ah->device);
+
+ kfree(to_vah(ah));
+ atomic_dec(&dev->num_ahs);
+
+ return 0;
+}
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_verbs.h b/drivers/infiniband/hw/pvrdma/pvrdma_verbs.h
new file mode 100644
index 0000000..e52fbe5
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_verbs.h
@@ -0,0 +1,108 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_VERBS_H__
+#define __PVRDMA_VERBS_H__
+
+int pvrdma_query_device(struct ib_device *ibdev,
+ struct ib_device_attr *props,
+ struct ib_udata *udata);
+int pvrdma_query_port(struct ib_device *ibdev, u8 port,
+ struct ib_port_attr *props);
+int pvrdma_query_gid(struct ib_device *ibdev, u8 port,
+ int index, union ib_gid *gid);
+int pvrdma_query_pkey(struct ib_device *ibdev, u8 port,
+ u16 index, u16 *pkey);
+enum rdma_link_layer pvrdma_port_link_layer(struct ib_device *ibdev,
+ u8 port);
+int pvrdma_modify_device(struct ib_device *ibdev, int mask,
+ struct ib_device_modify *props);
+int pvrdma_modify_port(struct ib_device *ibdev, u8 port,
+ int mask, struct ib_port_modify *props);
+int pvrdma_mmap(struct ib_ucontext *context, struct vm_area_struct *vma);
+struct ib_ucontext *pvrdma_alloc_ucontext(struct ib_device *ibdev,
+ struct ib_udata *udata);
+int pvrdma_dealloc_ucontext(struct ib_ucontext *context);
+struct ib_pd *pvrdma_alloc_pd(struct ib_device *ibdev,
+ struct ib_ucontext *context,
+ struct ib_udata *udata);
+int pvrdma_dealloc_pd(struct ib_pd *ibpd);
+struct ib_mr *pvrdma_get_dma_mr(struct ib_pd *pd, int acc);
+struct ib_mr *pvrdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
+ u64 virt_addr, int access_flags,
+ struct ib_udata *udata);
+int pvrdma_dereg_mr(struct ib_mr *mr);
+struct ib_mr *pvrdma_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
+ u32 max_num_sg);
+int pvrdma_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
+ int sg_nents, unsigned int *sg_offset);
+int pvrdma_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period);
+int pvrdma_resize_cq(struct ib_cq *ibcq, int entries,
+ struct ib_udata *udata);
+struct ib_cq *pvrdma_create_cq(struct ib_device *ibdev,
+ const struct ib_cq_init_attr *attr,
+ struct ib_ucontext *context,
+ struct ib_udata *udata);
+int pvrdma_resize_cq(struct ib_cq *ibcq, int entries,
+ struct ib_udata *udata);
+int pvrdma_destroy_cq(struct ib_cq *cq);
+int pvrdma_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
+int pvrdma_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
+struct ib_ah *pvrdma_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr);
+int pvrdma_destroy_ah(struct ib_ah *ah);
+struct ib_qp *pvrdma_create_qp(struct ib_pd *pd,
+ struct ib_qp_init_attr *init_attr,
+ struct ib_udata *udata);
+int pvrdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+ int attr_mask, struct ib_udata *udata);
+int pvrdma_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
+ int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
+int pvrdma_destroy_qp(struct ib_qp *qp);
+int pvrdma_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
+ struct ib_send_wr **bad_wr);
+int pvrdma_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
+ struct ib_recv_wr **bad_wr);
+
+#endif /* __PVRDMA_VERBS_H__ */
--
2.7.4
^ permalink raw reply related
* [PATCH v5 04/16] IB/pvrdma: Add the paravirtual RDMA device specification
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford, linux-rdma, pv-drivers
Cc: Adit Ranadive, netdev, linux-pci, jhansen, asarwade, georgezhang,
bryantan
In-Reply-To: <cover.1474759181.git.aditr@vmware.com>
This patch describes the main specification of the underlying virtual RDMA
device. The pvrdma_dev_api header file defines the Verbs commands and
their parameters that can be issued to the device backend.
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
Changes v4->v5:
- Update include for headers moved to UAPI.
- Removed __ prefix for unsigned vars.
Changes v3->v4:
- Removed explicit enum values.
Changes v2->v3:
- Defined 9 and 18 for page directory.
- Stripped spaces in some comments.
---
drivers/infiniband/hw/pvrdma/pvrdma_defs.h | 301 +++++++++++++++++++++++
drivers/infiniband/hw/pvrdma/pvrdma_dev_api.h | 342 ++++++++++++++++++++++++++
2 files changed, 643 insertions(+)
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_defs.h
create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_dev_api.h
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_defs.h b/drivers/infiniband/hw/pvrdma/pvrdma_defs.h
new file mode 100644
index 0000000..8105b01
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_defs.h
@@ -0,0 +1,301 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_DEFS_H__
+#define __PVRDMA_DEFS_H__
+
+#include <linux/types.h>
+#include <rdma/pvrdma-uapi.h>
+#include "pvrdma_ib_verbs.h"
+
+/*
+ * Masks and accessors for page directory, which is a two-level lookup:
+ * page directory -> page table -> page. Only one directory for now, but we
+ * could expand that easily. 9 bits for tables, 9 bits for pages, gives one
+ * gigabyte for memory regions and so forth.
+ */
+
+#define PVRDMA_PDIR_SHIFT 18
+#define PVRDMA_PTABLE_SHIFT 9
+#define PVRDMA_PAGE_DIR_DIR(x) (((x) >> PVRDMA_PDIR_SHIFT) & 0x1)
+#define PVRDMA_PAGE_DIR_TABLE(x) (((x) >> PVRDMA_PTABLE_SHIFT) & 0x1ff)
+#define PVRDMA_PAGE_DIR_PAGE(x) ((x) & 0x1ff)
+#define PVRDMA_PAGE_DIR_MAX_PAGES (1 * 512 * 512)
+#define PVRDMA_MAX_FAST_REG_PAGES 128
+
+/*
+ * Max MSI-X vectors.
+ */
+
+#define PVRDMA_MAX_INTERRUPTS 3
+
+/* Register offsets within PCI resource on BAR1. */
+#define PVRDMA_REG_VERSION 0x00 /* R: Version of device. */
+#define PVRDMA_REG_DSRLOW 0x04 /* W: Device shared region low PA. */
+#define PVRDMA_REG_DSRHIGH 0x08 /* W: Device shared region high PA. */
+#define PVRDMA_REG_CTL 0x0c /* W: PVRDMA_DEVICE_CTL */
+#define PVRDMA_REG_REQUEST 0x10 /* W: Indicate device request. */
+#define PVRDMA_REG_ERR 0x14 /* R: Device error. */
+#define PVRDMA_REG_ICR 0x18 /* R: Interrupt cause. */
+#define PVRDMA_REG_IMR 0x1c /* R/W: Interrupt mask. */
+#define PVRDMA_REG_MACL 0x20 /* R/W: MAC address low. */
+#define PVRDMA_REG_MACH 0x24 /* R/W: MAC address high. */
+
+/* Object flags. */
+#define PVRDMA_CQ_FLAG_ARMED_SOL BIT(0) /* Armed for solicited-only. */
+#define PVRDMA_CQ_FLAG_ARMED BIT(1) /* Armed. */
+#define PVRDMA_MR_FLAG_DMA BIT(0) /* DMA region. */
+#define PVRDMA_MR_FLAG_FRMR BIT(1) /* Fast reg memory region. */
+
+/*
+ * Atomic operation capability (masked versions are extended atomic
+ * operations.
+ */
+
+#define PVRDMA_ATOMIC_OP_COMP_SWAP BIT(0) /* Compare and swap. */
+#define PVRDMA_ATOMIC_OP_FETCH_ADD BIT(1) /* Fetch and add. */
+#define PVRDMA_ATOMIC_OP_MASK_COMP_SWAP BIT(2) /* Masked compare and swap. */
+#define PVRDMA_ATOMIC_OP_MASK_FETCH_ADD BIT(3) /* Masked fetch and add. */
+
+/*
+ * Base Memory Management Extension flags to support Fast Reg Memory Regions
+ * and Fast Reg Work Requests. Each flag represents a verb operation and we
+ * must support all of them to qualify for the BMME device cap.
+ */
+
+#define PVRDMA_BMME_FLAG_LOCAL_INV BIT(0) /* Local Invalidate. */
+#define PVRDMA_BMME_FLAG_REMOTE_INV BIT(1) /* Remote Invalidate. */
+#define PVRDMA_BMME_FLAG_FAST_REG_WR BIT(2) /* Fast Reg Work Request. */
+
+/*
+ * GID types. The interpretation of the gid_types bit field in the device
+ * capabilities will depend on the device mode. For now, the device only
+ * supports RoCE as mode, so only the different GID types for RoCE are
+ * defined.
+ */
+
+#define PVRDMA_GID_TYPE_FLAG_ROCE_V1 BIT(0)
+#define PVRDMA_GID_TYPE_FLAG_ROCE_V2 BIT(1)
+
+enum pvrdma_pci_resource {
+ PVRDMA_PCI_RESOURCE_MSIX, /* BAR0: MSI-X, MMIO. */
+ PVRDMA_PCI_RESOURCE_REG, /* BAR1: Registers, MMIO. */
+ PVRDMA_PCI_RESOURCE_UAR, /* BAR2: UAR pages, MMIO, 64-bit. */
+ PVRDMA_PCI_RESOURCE_LAST, /* Last. */
+};
+
+enum pvrdma_device_ctl {
+ PVRDMA_DEVICE_CTL_ACTIVATE, /* Activate device. */
+ PVRDMA_DEVICE_CTL_QUIESCE, /* Quiesce device. */
+ PVRDMA_DEVICE_CTL_RESET, /* Reset device. */
+};
+
+enum pvrdma_intr_vector {
+ PVRDMA_INTR_VECTOR_RESPONSE, /* Command response. */
+ PVRDMA_INTR_VECTOR_ASYNC, /* Async events. */
+ PVRDMA_INTR_VECTOR_CQ, /* CQ notification. */
+ /* Additional CQ notification vectors. */
+};
+
+enum pvrdma_intr_cause {
+ PVRDMA_INTR_CAUSE_RESPONSE = (1 << PVRDMA_INTR_VECTOR_RESPONSE),
+ PVRDMA_INTR_CAUSE_ASYNC = (1 << PVRDMA_INTR_VECTOR_ASYNC),
+ PVRDMA_INTR_CAUSE_CQ = (1 << PVRDMA_INTR_VECTOR_CQ),
+};
+
+enum pvrdma_intr_type {
+ PVRDMA_INTR_TYPE_INTX, /* Legacy. */
+ PVRDMA_INTR_TYPE_MSI, /* MSI. */
+ PVRDMA_INTR_TYPE_MSIX, /* MSI-X. */
+};
+
+enum pvrdma_gos_bits {
+ PVRDMA_GOS_BITS_UNK, /* Unknown. */
+ PVRDMA_GOS_BITS_32, /* 32-bit. */
+ PVRDMA_GOS_BITS_64, /* 64-bit. */
+};
+
+enum pvrdma_gos_type {
+ PVRDMA_GOS_TYPE_UNK, /* Unknown. */
+ PVRDMA_GOS_TYPE_LINUX, /* Linux. */
+};
+
+enum pvrdma_device_mode {
+ PVRDMA_DEVICE_MODE_ROCE, /* RoCE. */
+ PVRDMA_DEVICE_MODE_IWARP, /* iWarp. */
+ PVRDMA_DEVICE_MODE_IB, /* InfiniBand. */
+};
+
+struct pvrdma_gos_info {
+ u32 gos_bits:2; /* W: PVRDMA_GOS_BITS_ */
+ u32 gos_type:4; /* W: PVRDMA_GOS_TYPE_ */
+ u32 gos_ver:16; /* W: Guest OS version. */
+ u32 gos_misc:10; /* W: Other. */
+ u32 pad; /* Pad to 8-byte alignment. */
+};
+
+struct pvrdma_device_caps {
+ u64 fw_ver; /* R: Query device. */
+ __be64 node_guid;
+ __be64 sys_image_guid;
+ u64 max_mr_size;
+ u64 page_size_cap;
+ u64 atomic_arg_sizes; /* EXP verbs. */
+ u32 exp_comp_mask; /* EXP verbs. */
+ u32 device_cap_flags2; /* EXP verbs. */
+ u32 max_fa_bit_boundary; /* EXP verbs. */
+ u32 log_max_atomic_inline_arg; /* EXP verbs. */
+ u32 vendor_id;
+ u32 vendor_part_id;
+ u32 hw_ver;
+ u32 max_qp;
+ u32 max_qp_wr;
+ u32 device_cap_flags;
+ u32 max_sge;
+ u32 max_sge_rd;
+ u32 max_cq;
+ u32 max_cqe;
+ u32 max_mr;
+ u32 max_pd;
+ u32 max_qp_rd_atom;
+ u32 max_ee_rd_atom;
+ u32 max_res_rd_atom;
+ u32 max_qp_init_rd_atom;
+ u32 max_ee_init_rd_atom;
+ u32 max_ee;
+ u32 max_rdd;
+ u32 max_mw;
+ u32 max_raw_ipv6_qp;
+ u32 max_raw_ethy_qp;
+ u32 max_mcast_grp;
+ u32 max_mcast_qp_attach;
+ u32 max_total_mcast_qp_attach;
+ u32 max_ah;
+ u32 max_fmr;
+ u32 max_map_per_fmr;
+ u32 max_srq;
+ u32 max_srq_wr;
+ u32 max_srq_sge;
+ u32 max_uar;
+ u32 gid_tbl_len;
+ u16 max_pkeys;
+ u8 local_ca_ack_delay;
+ u8 phys_port_cnt;
+ u8 mode; /* PVRDMA_DEVICE_MODE_ */
+ u8 atomic_ops; /* PVRDMA_ATOMIC_OP_* bits */
+ u8 bmme_flags; /* FRWR Mem Mgmt Extensions */
+ u8 gid_types; /* PVRDMA_GID_TYPE_FLAG_ */
+ u8 reserved[4];
+};
+
+struct pvrdma_ring_page_info {
+ u32 num_pages; /* Num pages incl. header. */
+ u32 reserved; /* Reserved. */
+ u64 pdir_dma; /* Page directory PA. */
+};
+
+#pragma pack(push, 1)
+
+struct pvrdma_device_shared_region {
+ u32 driver_version; /* W: Driver version. */
+ u32 pad; /* Pad to 8-byte align. */
+ struct pvrdma_gos_info gos_info; /* W: Guest OS information. */
+ u64 cmd_slot_dma; /* W: Command slot address. */
+ u64 resp_slot_dma; /* W: Response slot address. */
+ struct pvrdma_ring_page_info async_ring_pages;
+ /* W: Async ring page info. */
+ struct pvrdma_ring_page_info cq_ring_pages;
+ /* W: CQ ring page info. */
+ u32 uar_pfn; /* W: UAR pageframe. */
+ u32 pad2; /* Pad to 8-byte align. */
+ struct pvrdma_device_caps caps; /* R: Device capabilities. */
+};
+
+#pragma pack(pop)
+
+
+/* Event types. Currently a 1:1 mapping with enum ib_event. */
+enum pvrdma_eqe_type {
+ PVRDMA_EVENT_CQ_ERR,
+ PVRDMA_EVENT_QP_FATAL,
+ PVRDMA_EVENT_QP_REQ_ERR,
+ PVRDMA_EVENT_QP_ACCESS_ERR,
+ PVRDMA_EVENT_COMM_EST,
+ PVRDMA_EVENT_SQ_DRAINED,
+ PVRDMA_EVENT_PATH_MIG,
+ PVRDMA_EVENT_PATH_MIG_ERR,
+ PVRDMA_EVENT_DEVICE_FATAL,
+ PVRDMA_EVENT_PORT_ACTIVE,
+ PVRDMA_EVENT_PORT_ERR,
+ PVRDMA_EVENT_LID_CHANGE,
+ PVRDMA_EVENT_PKEY_CHANGE,
+ PVRDMA_EVENT_SM_CHANGE,
+ PVRDMA_EVENT_SRQ_ERR,
+ PVRDMA_EVENT_SRQ_LIMIT_REACHED,
+ PVRDMA_EVENT_QP_LAST_WQE_REACHED,
+ PVRDMA_EVENT_CLIENT_REREGISTER,
+ PVRDMA_EVENT_GID_CHANGE,
+};
+
+/* Event queue element. */
+struct pvrdma_eqe {
+ u32 type; /* Event type. */
+ u32 info; /* Handle, other. */
+};
+
+/* CQ notification queue element. */
+struct pvrdma_cqne {
+ u32 info; /* Handle */
+};
+
+static inline void pvrdma_init_cqe(struct pvrdma_cqe *cqe, u64 wr_id, u64 qp)
+{
+ memset(cqe, 0, sizeof(*cqe));
+ cqe->status = PVRDMA_WC_GENERAL_ERR;
+ cqe->wr_id = wr_id;
+ cqe->qp = qp;
+}
+
+#endif /* __PVRDMA_DEFS_H__ */
diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_dev_api.h b/drivers/infiniband/hw/pvrdma/pvrdma_dev_api.h
new file mode 100644
index 0000000..66682cf
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_dev_api.h
@@ -0,0 +1,342 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_DEV_API_H__
+#define __PVRDMA_DEV_API_H__
+
+#include <linux/types.h>
+#include "pvrdma_ib_verbs.h"
+
+enum {
+ PVRDMA_CMD_FIRST,
+ PVRDMA_CMD_QUERY_PORT = PVRDMA_CMD_FIRST,
+ PVRDMA_CMD_QUERY_PKEY,
+ PVRDMA_CMD_CREATE_PD,
+ PVRDMA_CMD_DESTROY_PD,
+ PVRDMA_CMD_CREATE_MR,
+ PVRDMA_CMD_DESTROY_MR,
+ PVRDMA_CMD_CREATE_CQ,
+ PVRDMA_CMD_RESIZE_CQ,
+ PVRDMA_CMD_DESTROY_CQ,
+ PVRDMA_CMD_CREATE_QP,
+ PVRDMA_CMD_MODIFY_QP,
+ PVRDMA_CMD_QUERY_QP,
+ PVRDMA_CMD_DESTROY_QP,
+ PVRDMA_CMD_CREATE_UC,
+ PVRDMA_CMD_DESTROY_UC,
+ PVRDMA_CMD_CREATE_BIND,
+ PVRDMA_CMD_DESTROY_BIND,
+ PVRDMA_CMD_MAX,
+};
+
+enum {
+ PVRDMA_CMD_FIRST_RESP = (1 << 31),
+ PVRDMA_CMD_QUERY_PORT_RESP = PVRDMA_CMD_FIRST_RESP,
+ PVRDMA_CMD_QUERY_PKEY_RESP,
+ PVRDMA_CMD_CREATE_PD_RESP,
+ PVRDMA_CMD_DESTROY_PD_RESP_NOOP,
+ PVRDMA_CMD_CREATE_MR_RESP,
+ PVRDMA_CMD_DESTROY_MR_RESP_NOOP,
+ PVRDMA_CMD_CREATE_CQ_RESP,
+ PVRDMA_CMD_RESIZE_CQ_RESP,
+ PVRDMA_CMD_DESTROY_CQ_RESP_NOOP,
+ PVRDMA_CMD_CREATE_QP_RESP,
+ PVRDMA_CMD_MODIFY_QP_RESP,
+ PVRDMA_CMD_QUERY_QP_RESP,
+ PVRDMA_CMD_DESTROY_QP_RESP,
+ PVRDMA_CMD_CREATE_UC_RESP,
+ PVRDMA_CMD_DESTROY_UC_RESP_NOOP,
+ PVRDMA_CMD_CREATE_BIND_RESP_NOOP,
+ PVRDMA_CMD_DESTROY_BIND_RESP_NOOP,
+ PVRDMA_CMD_MAX_RESP,
+};
+
+struct pvrdma_cmd_hdr {
+ u64 response; /* Key for response lookup. */
+ u32 cmd; /* PVRDMA_CMD_ */
+ u32 reserved; /* Reserved. */
+};
+
+struct pvrdma_cmd_resp_hdr {
+ u64 response; /* From cmd hdr. */
+ u32 ack; /* PVRDMA_CMD_XXX_RESP */
+ u8 err; /* Error. */
+ u8 reserved[3]; /* Reserved. */
+};
+
+struct pvrdma_cmd_query_port {
+ struct pvrdma_cmd_hdr hdr;
+ u8 port_num;
+ u8 reserved[7];
+};
+
+struct pvrdma_cmd_query_port_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ struct pvrdma_port_attr attrs;
+};
+
+struct pvrdma_cmd_query_pkey {
+ struct pvrdma_cmd_hdr hdr;
+ u8 port_num;
+ u8 index;
+ u8 reserved[6];
+};
+
+struct pvrdma_cmd_query_pkey_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u16 pkey;
+ u8 reserved[6];
+};
+
+struct pvrdma_cmd_create_uc {
+ struct pvrdma_cmd_hdr hdr;
+ u32 pfn; /* UAR page frame number */
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_uc_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u32 ctx_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_destroy_uc {
+ struct pvrdma_cmd_hdr hdr;
+ u32 ctx_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_pd {
+ struct pvrdma_cmd_hdr hdr;
+ u32 ctx_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_pd_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u32 pd_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_destroy_pd {
+ struct pvrdma_cmd_hdr hdr;
+ u32 pd_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_mr {
+ struct pvrdma_cmd_hdr hdr;
+ u64 start;
+ u64 length;
+ u64 pdir_dma;
+ u32 pd_handle;
+ u32 access_flags;
+ u32 flags;
+ u32 nchunks;
+};
+
+struct pvrdma_cmd_create_mr_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u32 mr_handle;
+ u32 lkey;
+ u32 rkey;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_destroy_mr {
+ struct pvrdma_cmd_hdr hdr;
+ u32 mr_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_cq {
+ struct pvrdma_cmd_hdr hdr;
+ u64 pdir_dma;
+ u32 ctx_handle;
+ u32 cqe;
+ u32 nchunks;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_cq_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u32 cq_handle;
+ u32 cqe;
+};
+
+struct pvrdma_cmd_resize_cq {
+ struct pvrdma_cmd_hdr hdr;
+ u32 cq_handle;
+ u32 cqe;
+};
+
+struct pvrdma_cmd_resize_cq_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u32 cqe;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_destroy_cq {
+ struct pvrdma_cmd_hdr hdr;
+ u32 cq_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_qp {
+ struct pvrdma_cmd_hdr hdr;
+ u64 pdir_dma;
+ u32 pd_handle;
+ u32 send_cq_handle;
+ u32 recv_cq_handle;
+ u32 srq_handle;
+ u32 max_send_wr;
+ u32 max_recv_wr;
+ u32 max_send_sge;
+ u32 max_recv_sge;
+ u32 max_inline_data;
+ u32 lkey;
+ u32 access_flags;
+ u16 total_chunks;
+ u16 send_chunks;
+ u16 max_atomic_arg;
+ u8 sq_sig_all;
+ u8 qp_type;
+ u8 is_srq;
+ u8 reserved[3];
+};
+
+struct pvrdma_cmd_create_qp_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u32 qpn;
+ u32 max_send_wr;
+ u32 max_recv_wr;
+ u32 max_send_sge;
+ u32 max_recv_sge;
+ u32 max_inline_data;
+};
+
+struct pvrdma_cmd_modify_qp {
+ struct pvrdma_cmd_hdr hdr;
+ u32 qp_handle;
+ u32 attr_mask;
+ struct pvrdma_qp_attr attrs;
+};
+
+struct pvrdma_cmd_query_qp {
+ struct pvrdma_cmd_hdr hdr;
+ u32 qp_handle;
+ u32 attr_mask;
+};
+
+struct pvrdma_cmd_query_qp_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ struct pvrdma_qp_attr attrs;
+};
+
+struct pvrdma_cmd_destroy_qp {
+ struct pvrdma_cmd_hdr hdr;
+ u32 qp_handle;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_destroy_qp_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ u32 events_reported;
+ u8 reserved[4];
+};
+
+struct pvrdma_cmd_create_bind {
+ struct pvrdma_cmd_hdr hdr;
+ u32 mtu;
+ u32 vlan;
+ u32 index;
+ u8 new_gid[16];
+ u8 gid_type;
+ u8 reserved[3];
+};
+
+struct pvrdma_cmd_destroy_bind {
+ struct pvrdma_cmd_hdr hdr;
+ u32 index;
+ u8 dest_gid[16];
+ u8 reserved[4];
+};
+
+union pvrdma_cmd_req {
+ struct pvrdma_cmd_hdr hdr;
+ struct pvrdma_cmd_query_port query_port;
+ struct pvrdma_cmd_query_pkey query_pkey;
+ struct pvrdma_cmd_create_uc create_uc;
+ struct pvrdma_cmd_destroy_uc destroy_uc;
+ struct pvrdma_cmd_create_pd create_pd;
+ struct pvrdma_cmd_destroy_pd destroy_pd;
+ struct pvrdma_cmd_create_mr create_mr;
+ struct pvrdma_cmd_destroy_mr destroy_mr;
+ struct pvrdma_cmd_create_cq create_cq;
+ struct pvrdma_cmd_resize_cq resize_cq;
+ struct pvrdma_cmd_destroy_cq destroy_cq;
+ struct pvrdma_cmd_create_qp create_qp;
+ struct pvrdma_cmd_modify_qp modify_qp;
+ struct pvrdma_cmd_query_qp query_qp;
+ struct pvrdma_cmd_destroy_qp destroy_qp;
+ struct pvrdma_cmd_create_bind create_bind;
+ struct pvrdma_cmd_destroy_bind destroy_bind;
+};
+
+union pvrdma_cmd_resp {
+ struct pvrdma_cmd_resp_hdr hdr;
+ struct pvrdma_cmd_query_port_resp query_port_resp;
+ struct pvrdma_cmd_query_pkey_resp query_pkey_resp;
+ struct pvrdma_cmd_create_uc_resp create_uc_resp;
+ struct pvrdma_cmd_create_pd_resp create_pd_resp;
+ struct pvrdma_cmd_create_mr_resp create_mr_resp;
+ struct pvrdma_cmd_create_cq_resp create_cq_resp;
+ struct pvrdma_cmd_resize_cq_resp resize_cq_resp;
+ struct pvrdma_cmd_create_qp_resp create_qp_resp;
+ struct pvrdma_cmd_query_qp_resp query_qp_resp;
+ struct pvrdma_cmd_destroy_qp_resp destroy_qp_resp;
+};
+
+#endif /* __PVRDMA_DEV_API_H__ */
--
2.7.4
^ permalink raw reply related
* [PATCH v5 02/16] IB/pvrdma: Add user-level shared functions
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford, linux-rdma, pv-drivers
Cc: Adit Ranadive, netdev, linux-pci, jhansen, asarwade, georgezhang,
bryantan
In-Reply-To: <cover.1474759181.git.aditr@vmware.com>
We share some common structures with the user-level driver. This patch adds
those structures and shared functions to traverse the QP/CQ rings.
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
Changes v4->v5:
- Moved pvrdma_uapi.h and pvrdma_user.h into common UAPI folder.
- Renamed to pvrdma-uapi.h and pvrdma-abi.h respectively.
- Prefixed unsigned vars with __.
Changes v3->v4:
- Moved pvrdma_sge into pvrdma_uapi.h
---
include/uapi/rdma/Kbuild | 2 +
include/uapi/rdma/pvrdma-abi.h | 99 ++++++++++++++++
include/uapi/rdma/pvrdma-uapi.h | 255 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 356 insertions(+)
create mode 100644 include/uapi/rdma/pvrdma-abi.h
create mode 100644 include/uapi/rdma/pvrdma-uapi.h
diff --git a/include/uapi/rdma/Kbuild b/include/uapi/rdma/Kbuild
index 4edb0f2..fc2b285 100644
--- a/include/uapi/rdma/Kbuild
+++ b/include/uapi/rdma/Kbuild
@@ -7,3 +7,5 @@ header-y += rdma_netlink.h
header-y += rdma_user_cm.h
header-y += hfi/
header-y += rdma_user_rxe.h
+header-y += pvrdma-abi.h
+header-y += pvrdma-uapi.h
diff --git a/include/uapi/rdma/pvrdma-abi.h b/include/uapi/rdma/pvrdma-abi.h
new file mode 100644
index 0000000..6fa0ab6
--- /dev/null
+++ b/include/uapi/rdma/pvrdma-abi.h
@@ -0,0 +1,99 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_USER_H__
+#define __PVRDMA_USER_H__
+
+#include <linux/types.h>
+
+#define PVRDMA_UVERBS_ABI_VERSION 3
+#define PVRDMA_BOARD_ID 1
+#define PVRDMA_REV_ID 1
+
+struct pvrdma_alloc_ucontext_resp {
+ __u32 qp_tab_size;
+ __u32 reserved;
+};
+
+struct pvrdma_alloc_pd_resp {
+ __u32 pdn;
+ __u32 reserved;
+};
+
+struct pvrdma_create_cq {
+ __u64 buf_addr;
+ __u32 buf_size;
+ __u32 reserved;
+};
+
+struct pvrdma_create_cq_resp {
+ __u32 cqn;
+ __u32 reserved;
+};
+
+struct pvrdma_resize_cq {
+ __u64 buf_addr;
+ __u32 buf_size;
+ __u32 reserved;
+};
+
+struct pvrdma_create_srq {
+ __u64 buf_addr;
+};
+
+struct pvrdma_create_srq_resp {
+ __u32 srqn;
+ __u32 reserved;
+};
+
+struct pvrdma_create_qp {
+ __u64 rbuf_addr;
+ __u64 sbuf_addr;
+ __u32 rbuf_size;
+ __u32 sbuf_size;
+ __u64 qp_addr;
+};
+
+#endif /* __PVRDMA_USER_H__ */
diff --git a/include/uapi/rdma/pvrdma-uapi.h b/include/uapi/rdma/pvrdma-uapi.h
new file mode 100644
index 0000000..430d8a5
--- /dev/null
+++ b/include/uapi/rdma/pvrdma-uapi.h
@@ -0,0 +1,255 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_UAPI_H__
+#define __PVRDMA_UAPI_H__
+
+#include <linux/types.h>
+
+#define PVRDMA_VERSION 17
+
+#define PVRDMA_UAR_HANDLE_MASK 0x00FFFFFF /* Bottom 24 bits. */
+#define PVRDMA_UAR_QP_OFFSET 0 /* Offset of QP doorbell. */
+#define PVRDMA_UAR_QP_SEND BIT(30) /* Send bit. */
+#define PVRDMA_UAR_QP_RECV BIT(31) /* Recv bit. */
+#define PVRDMA_UAR_CQ_OFFSET 4 /* Offset of CQ doorbell. */
+#define PVRDMA_UAR_CQ_ARM_SOL BIT(29) /* Arm solicited bit. */
+#define PVRDMA_UAR_CQ_ARM BIT(30) /* Arm bit. */
+#define PVRDMA_UAR_CQ_POLL BIT(31) /* Poll bit. */
+#define PVRDMA_INVALID_IDX -1 /* Invalid index. */
+
+/* PVRDMA atomic compare and swap */
+struct pvrdma_exp_cmp_swap {
+ __u64 swap_val;
+ __u64 compare_val;
+ __u64 swap_mask;
+ __u64 compare_mask;
+};
+
+/* PVRDMA atomic fetch and add */
+struct pvrdma_exp_fetch_add {
+ __u64 add_val;
+ __u64 field_boundary;
+};
+
+/* PVRDMA address vector. */
+struct pvrdma_av {
+ __u32 port_pd;
+ __u32 sl_tclass_flowlabel;
+ __u8 dgid[16];
+ __u8 src_path_bits;
+ __u8 gid_index;
+ __u8 stat_rate;
+ __u8 hop_limit;
+ __u8 dmac[6];
+ __u8 reserved[6];
+};
+
+/* PVRDMA scatter/gather entry */
+struct pvrdma_sge {
+ __u64 addr;
+ __u32 length;
+ __u32 lkey;
+};
+
+/* PVRDMA receive queue work request */
+struct pvrdma_rq_wqe_hdr {
+ __u64 wr_id; /* wr id */
+ __u32 num_sge; /* size of s/g array */
+ __u32 total_len; /* reserved */
+};
+/* Use pvrdma_sge (ib_sge) for receive queue s/g array elements. */
+
+/* PVRDMA send queue work request */
+struct pvrdma_sq_wqe_hdr {
+ __u64 wr_id; /* wr id */
+ __u32 num_sge; /* size of s/g array */
+ __u32 total_len; /* reserved */
+ __u32 opcode; /* operation type */
+ __u32 send_flags; /* wr flags */
+ union {
+ __u32 imm_data;
+ __u32 invalidate_rkey;
+ } ex;
+ __u32 reserved;
+ union {
+ struct {
+ __u64 remote_addr;
+ __u32 rkey;
+ __u8 reserved[4];
+ } rdma;
+ struct {
+ __u64 remote_addr;
+ __u64 compare_add;
+ __u64 swap;
+ __u32 rkey;
+ __u32 reserved;
+ } atomic;
+ struct {
+ __u64 remote_addr;
+ __u32 log_arg_sz;
+ __u32 rkey;
+ union {
+ struct pvrdma_exp_cmp_swap cmp_swap;
+ struct pvrdma_exp_fetch_add fetch_add;
+ } wr_data;
+ } masked_atomics;
+ struct {
+ __u64 iova_start;
+ __u64 pl_pdir_dma;
+ __u32 page_shift;
+ __u32 page_list_len;
+ __u32 length;
+ __u32 access_flags;
+ __u32 rkey;
+ } fast_reg;
+ struct {
+ __u32 remote_qpn;
+ __u32 remote_qkey;
+ struct pvrdma_av av;
+ } ud;
+ } wr;
+};
+/* Use pvrdma_sge (ib_sge) for send queue s/g array elements. */
+
+/* Completion queue element. */
+struct pvrdma_cqe {
+ __u64 wr_id;
+ __u64 qp;
+ __u32 opcode;
+ __u32 status;
+ __u32 byte_len;
+ __u32 imm_data;
+ __u32 src_qp;
+ __u32 wc_flags;
+ __u32 vendor_err;
+ __u16 pkey_index;
+ __u16 slid;
+ __u8 sl;
+ __u8 dlid_path_bits;
+ __u8 port_num;
+ __u8 smac[6];
+ __u8 reserved2[7]; /* Pad to next power of 2 (64). */
+};
+
+struct pvrdma_ring {
+ atomic_t prod_tail; /* Producer tail. */
+ atomic_t cons_head; /* Consumer head. */
+};
+
+struct pvrdma_ring_state {
+ struct pvrdma_ring tx; /* Tx ring. */
+ struct pvrdma_ring rx; /* Rx ring. */
+};
+
+static inline int pvrdma_idx_valid(__u32 idx, __u32 max_elems)
+{
+ /* Generates fewer instructions than a less-than. */
+ return (idx & ~((max_elems << 1) - 1)) == 0;
+}
+
+static inline __s32 pvrdma_idx(atomic_t *var, __u32 max_elems)
+{
+ const unsigned int idx = atomic_read(var);
+
+ if (pvrdma_idx_valid(idx, max_elems))
+ return idx & (max_elems - 1);
+ return PVRDMA_INVALID_IDX;
+}
+
+static inline void pvrdma_idx_ring_inc(atomic_t *var, __u32 max_elems)
+{
+ __u32 idx = atomic_read(var) + 1; /* Increment. */
+
+ idx &= (max_elems << 1) - 1; /* Modulo size, flip gen. */
+ atomic_set(var, idx);
+}
+
+static inline __s32 pvrdma_idx_ring_has_space(const struct pvrdma_ring *r,
+ __u32 max_elems, __u32 *out_tail)
+{
+ const __u32 tail = atomic_read(&r->prod_tail);
+ const __u32 head = atomic_read(&r->cons_head);
+
+ if (pvrdma_idx_valid(tail, max_elems) &&
+ pvrdma_idx_valid(head, max_elems)) {
+ *out_tail = tail & (max_elems - 1);
+ return tail != (head ^ max_elems);
+ }
+ return PVRDMA_INVALID_IDX;
+}
+
+static inline __s32 pvrdma_idx_ring_has_data(const struct pvrdma_ring *r,
+ __u32 max_elems, __u32 *out_head)
+{
+ const __u32 tail = atomic_read(&r->prod_tail);
+ const __u32 head = atomic_read(&r->cons_head);
+
+ if (pvrdma_idx_valid(tail, max_elems) &&
+ pvrdma_idx_valid(head, max_elems)) {
+ *out_head = head & (max_elems - 1);
+ return tail != head;
+ }
+ return PVRDMA_INVALID_IDX;
+}
+
+static inline bool pvrdma_idx_ring_is_valid_idx(const struct pvrdma_ring *r,
+ __u32 max_elems, __u32 *idx)
+{
+ const __u32 tail = atomic_read(&r->prod_tail);
+ const __u32 head = atomic_read(&r->cons_head);
+
+ if (pvrdma_idx_valid(tail, max_elems) &&
+ pvrdma_idx_valid(head, max_elems) &&
+ pvrdma_idx_valid(*idx, max_elems)) {
+ if (tail > head && (*idx < tail && *idx >= head))
+ return true;
+ else if (head > tail && (*idx >= head || *idx < tail))
+ return true;
+ }
+ return false;
+}
+
+#endif /* __PVRDMA_UAPI_H__ */
--
2.7.4
^ permalink raw reply related
* [PATCH v5 01/16] vmxnet3: Move PCI Id to pci_ids.h
From: Adit Ranadive @ 2016-09-24 23:21 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-pci-u79uwXL29TY76Z2rM5mHXA, jhansen-pghWNbHTmq7QT0dZR+AlfA,
asarwade-pghWNbHTmq7QT0dZR+AlfA,
georgezhang-pghWNbHTmq7QT0dZR+AlfA,
bryantan-pghWNbHTmq7QT0dZR+AlfA
In-Reply-To: <cover.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
The VMXNet3 PCI Id will be shared with our paravirtual RDMA driver.
Moved it to the shared location in pci_ids.h.
Suggested-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
drivers/net/vmxnet3/vmxnet3_int.h | 3 +--
include/linux/pci_ids.h | 1 +
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h
index 74fc030..2bd6bf8 100644
--- a/drivers/net/vmxnet3/vmxnet3_int.h
+++ b/drivers/net/vmxnet3/vmxnet3_int.h
@@ -119,9 +119,8 @@ enum {
};
/*
- * PCI vendor and device IDs.
+ * Maximum devices supported.
*/
-#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07B0
#define MAX_ETHERNET_CARDS 10
#define MAX_PCI_PASSTHRU_DEVICE 6
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index c58752f..98bb455 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2251,6 +2251,7 @@
#define PCI_DEVICE_ID_RASTEL_2PORT 0x2000
#define PCI_VENDOR_ID_VMWARE 0x15ad
+#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07b0
#define PCI_VENDOR_ID_ZOLTRIX 0x15b0
#define PCI_DEVICE_ID_ZOLTRIX_2BD0 0x2bd0
--
2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 2/2] brcmfmac: compile fws(ignal) code only with BCDC support enabled
From: Rafał Miłecki @ 2016-09-24 20:44 UTC (permalink / raw)
To: Kalle Valo
Cc: Arend van Spriel, Franky Lin, Hante Meuleman,
Pieter-Paul Giesberts, Franky Lin, linux-wireless,
brcm80211-dev-list.pdl, netdev, linux-kernel,
Rafał Miłecki
In-Reply-To: <20160924204419.10213-1-zajec5@gmail.com>
From: Rafał Miłecki <rafal@milecki.pl>
It's not needed by the other (msgbuf) protocol, so let's save some size
and compile it conditionally.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
---
.../wireless/broadcom/brcm80211/brcmfmac/Makefile | 4 +-
.../broadcom/brcm80211/brcmfmac/fwsignal.h | 59 ++++++++++++++++++++++
2 files changed, 61 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/Makefile b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/Makefile
index 9e4b505..ad3b06e 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/Makefile
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/Makefile
@@ -27,7 +27,6 @@ brcmfmac-objs += \
chip.o \
fwil.o \
fweh.o \
- fwsignal.o \
p2p.o \
proto.o \
common.o \
@@ -37,7 +36,8 @@ brcmfmac-objs += \
btcoex.o \
vendor.o
brcmfmac-$(CONFIG_BRCMFMAC_PROTO_BCDC) += \
- bcdc.o
+ bcdc.o \
+ fwsignal.o
brcmfmac-$(CONFIG_BRCMFMAC_PROTO_MSGBUF) += \
commonring.o \
flowring.o \
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h
index 8f7c1d7..ba0c1bc 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h
@@ -18,6 +18,7 @@
#ifndef FWSIGNAL_H_
#define FWSIGNAL_H_
+#ifdef CONFIG_BRCMFMAC_PROTO_BCDC
int brcmf_fws_init(struct brcmf_pub *drvr);
void brcmf_fws_deinit(struct brcmf_pub *drvr);
bool brcmf_fws_skbs_queueing(struct brcmf_fws_info *fws);
@@ -31,5 +32,63 @@ void brcmf_fws_del_interface(struct brcmf_if *ifp);
void brcmf_fws_bustxfail(struct brcmf_fws_info *fws, struct sk_buff *skb);
void brcmf_fws_bus_blocked(struct brcmf_pub *drvr, bool flow_blocked);
void brcmf_fws_rxreorder(struct brcmf_if *ifp, struct sk_buff *skb);
+#else
+static inline int brcmf_fws_init(struct brcmf_pub *drvr)
+{
+ return -ENOTSUPP;
+}
+
+static inline void brcmf_fws_deinit(struct brcmf_pub *drvr)
+{
+}
+
+static inline bool brcmf_fws_skbs_queueing(struct brcmf_fws_info *fws)
+{
+ return false;
+}
+
+static inline bool brcmf_fws_fc_active(struct brcmf_fws_info *fws)
+{
+ return false;
+}
+
+static inline void brcmf_fws_hdrpull(struct brcmf_if *ifp, s16 siglen,
+ struct sk_buff *skb)
+{
+}
+
+static inline int brcmf_fws_process_skb(struct brcmf_if *ifp,
+ struct sk_buff *skb)
+{
+ return -ENOTSUPP;
+}
+
+static inline void brcmf_fws_reset_interface(struct brcmf_if *ifp)
+{
+}
+
+static inline void brcmf_fws_add_interface(struct brcmf_if *ifp)
+{
+}
+
+static inline void brcmf_fws_del_interface(struct brcmf_if *ifp)
+{
+}
+
+static inline void brcmf_fws_bustxfail(struct brcmf_fws_info *fws,
+ struct sk_buff *skb)
+{
+}
+
+static inline void brcmf_fws_bus_blocked(struct brcmf_pub *drvr,
+ bool flow_blocked)
+{
+}
+
+static inline void brcmf_fws_rxreorder(struct brcmf_if *ifp,
+ struct sk_buff *skb)
+{
+}
+#endif
#endif /* FWSIGNAL_H_ */
--
2.9.3
^ permalink raw reply related
* [PATCH 1/2] brcmfmac: initialize fws(ignal) for BCDC protocol only
From: Rafał Miłecki @ 2016-09-24 20:44 UTC (permalink / raw)
To: Kalle Valo
Cc: Arend van Spriel, Franky Lin, Hante Meuleman,
Pieter-Paul Giesberts, Franky Lin, linux-wireless,
brcm80211-dev-list.pdl, netdev, linux-kernel,
Rafał Miłecki
From: Rafał Miłecki <rafal@milecki.pl>
There are two protocols used by Broadcom FullMAC devices: BCDC and
msgbuf. They use different ways for (some part of) communication with
the firmware. Firmware Signaling is required for the first one only
(BCDC).
So far we were always initializing fws and always calling it's skb
processing function. It was fws that was passing skb processing to the
protocol specific function. It was redundant for the msgbuf case.
Simply taking few lines of code out of fws allows us to totally avoid
using it. This simplifies code flow, saves some memory & will allow
further optimizations like not compiling fwsignal.c.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
---
.../wireless/broadcom/brcm80211/brcmfmac/core.c | 24 ++++++++++++++++------
.../broadcom/brcm80211/brcmfmac/fwsignal.c | 17 ++++++---------
.../broadcom/brcm80211/brcmfmac/fwsignal.h | 1 +
3 files changed, 25 insertions(+), 17 deletions(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
index 27cd50a..bc3d8ab 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
@@ -250,7 +250,17 @@ static netdev_tx_t brcmf_netdev_start_xmit(struct sk_buff *skb,
if (eh->h_proto == htons(ETH_P_PAE))
atomic_inc(&ifp->pend_8021x_cnt);
- ret = brcmf_fws_process_skb(ifp, skb);
+ /* determine the priority */
+ if (skb->priority == 0 || skb->priority > 7)
+ skb->priority = cfg80211_classify8021d(skb, NULL);
+
+ if (drvr->fws && brcmf_fws_skbs_queueing(drvr->fws)) {
+ ret = brcmf_fws_process_skb(ifp, skb);
+ } else {
+ ret = brcmf_proto_txdata(drvr, ifp->ifidx, 0, skb);
+ if (ret < 0)
+ brcmf_txfinalize(ifp, skb, false);
+ }
done:
if (ret) {
@@ -405,7 +415,7 @@ void brcmf_txcomplete(struct device *dev, struct sk_buff *txp, bool success)
struct brcmf_if *ifp;
/* await txstatus signal for firmware if active */
- if (brcmf_fws_fc_active(drvr->fws)) {
+ if (drvr->fws && brcmf_fws_fc_active(drvr->fws)) {
if (!success)
brcmf_fws_bustxfail(drvr->fws, txp);
} else {
@@ -1006,11 +1016,13 @@ int brcmf_bus_start(struct device *dev)
}
brcmf_feat_attach(drvr);
- ret = brcmf_fws_init(drvr);
- if (ret < 0)
- goto fail;
+ if (bus_if->proto_type == BRCMF_PROTO_BCDC) {
+ ret = brcmf_fws_init(drvr);
+ if (ret < 0)
+ goto fail;
- brcmf_fws_add_interface(ifp);
+ brcmf_fws_add_interface(ifp);
+ }
drvr->config = brcmf_cfg80211_attach(drvr, bus_if->dev,
drvr->settings->p2p_enable);
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c
index a190f53..495eaf8 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c
@@ -2100,16 +2100,6 @@ int brcmf_fws_process_skb(struct brcmf_if *ifp, struct sk_buff *skb)
int rc = 0;
brcmf_dbg(DATA, "tx proto=0x%X\n", ntohs(eh->h_proto));
- /* determine the priority */
- if ((skb->priority == 0) || (skb->priority > 7))
- skb->priority = cfg80211_classify8021d(skb, NULL);
-
- if (fws->avoid_queueing) {
- rc = brcmf_proto_txdata(drvr, ifp->ifidx, 0, skb);
- if (rc < 0)
- brcmf_txfinalize(ifp, skb, false);
- return rc;
- }
/* set control buffer information */
skcb->if_flags = 0;
@@ -2155,7 +2145,7 @@ void brcmf_fws_add_interface(struct brcmf_if *ifp)
struct brcmf_fws_info *fws = ifp->drvr->fws;
struct brcmf_fws_mac_descriptor *entry;
- if (!ifp->ndev)
+ if (!fws || !ifp->ndev)
return;
entry = &fws->desc.iface[ifp->ifidx];
@@ -2442,6 +2432,11 @@ void brcmf_fws_deinit(struct brcmf_pub *drvr)
kfree(fws);
}
+bool brcmf_fws_skbs_queueing(struct brcmf_fws_info *fws)
+{
+ return !fws->avoid_queueing;
+}
+
bool brcmf_fws_fc_active(struct brcmf_fws_info *fws)
{
if (!fws->creditmap_received)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h
index ef0ad85..8f7c1d7 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.h
@@ -20,6 +20,7 @@
int brcmf_fws_init(struct brcmf_pub *drvr);
void brcmf_fws_deinit(struct brcmf_pub *drvr);
+bool brcmf_fws_skbs_queueing(struct brcmf_fws_info *fws);
bool brcmf_fws_fc_active(struct brcmf_fws_info *fws);
void brcmf_fws_hdrpull(struct brcmf_if *ifp, s16 siglen, struct sk_buff *skb);
int brcmf_fws_process_skb(struct brcmf_if *ifp, struct sk_buff *skb);
--
2.9.3
^ permalink raw reply related
* Re: [PATCH] realtek: Add switch variable to 'switch case not processed' messages
From: Jes Sorensen @ 2016-09-24 20:35 UTC (permalink / raw)
To: Joe Perches
Cc: Larry Finger, Jean Delvare, Chaoming Li, Kalle Valo,
linux-wireless, netdev, linux-kernel
In-Reply-To: <1474748954.23838.21.camel@perches.com>
Joe Perches <joe@perches.com> writes:
> On Sat, 2016-09-24 at 14:06 -0500, Larry Finger wrote:
>> On 09/24/2016 12:32 PM, Joe Perches wrote:
> []
>> o Reindent all the switch/case blocks to a more normal
>> kernel style (git diff -w would show no changes here)
>> That sounds like busy work to me, but if you want to do it, go ahead.
>
> It's really just to make the comparison case block reductions
> easier to verify for later steps done
>
>> > o cast, spacing and parenthesis reductions
>> > Lots of odd and somewhat unique styles in various
>> > drivers, looks like too many individual authors without
>> > a style guide / code enforcer using slightly different
>> > personalized code. Glancing at the code, it looks to be
>> > similar logic, just written in different styles.
>> Same comment.
>
> Same rationale
>
>> > o Logic changes like
>> > from:
>> > if (foo) func(..., bar, ...); else func(..., baz, ...);
>> > to:
>> > func(..., foo ? bar : baz, ...);
>> > to make the case statement code blocks more consistent
>> > and emit somewhat smaller object code.
>> I find if .. else constructs much easier to read than the cond ? xxxx : yyyy
>> form. I would reject any such patches.
>
> <shrug> I think object code reduction generally a good thing
> but then again, I'm not a maintainer here.
I missed this part, but I am with Larry here - 'foo ? bar : boo' are
just obfuscating the code and far less clear than if or switch
statements.
Jes
^ permalink raw reply
* Re: [PATCH] realtek: Add switch variable to 'switch case not processed' messages
From: Joe Perches @ 2016-09-24 20:29 UTC (permalink / raw)
To: Larry Finger, Jean Delvare, Jes Sorensen
Cc: Chaoming Li, Kalle Valo, linux-wireless, netdev, linux-kernel
In-Reply-To: <cef6d8da-c390-f435-7a5e-d294dfbebbfb@lwfinger.net>
On Sat, 2016-09-24 at 14:06 -0500, Larry Finger wrote:
> On 09/24/2016 12:32 PM, Joe Perches wrote:
[]
> o Reindent all the switch/case blocks to a more normal
> kernel style (git diff -w would show no changes here)
> That sounds like busy work to me, but if you want to do it, go ahead.
It's really just to make the comparison case block reductions
easier to verify for later steps done
> > o cast, spacing and parenthesis reductions
> > Lots of odd and somewhat unique styles in various
> > drivers, looks like too many individual authors without
> > a style guide / code enforcer using slightly different
> > personalized code. Glancing at the code, it looks to be
> > similar logic, just written in different styles.
> Same comment.
Same rationale
> > o Logic changes like
> > from:
> > if (foo) func(..., bar, ...); else func(..., baz, ...);
> > to:
> > func(..., foo ? bar : baz, ...);
> > to make the case statement code blocks more consistent
> > and emit somewhat smaller object code.
> I find if .. else constructs much easier to read than the cond ? xxxx : yyyy
> form. I would reject any such patches.
<shrug> I think object code reduction generally a good thing
but then again, I'm not a maintainer here.
> > o Consolidation of equivalent function spanning drivers
> > With the style only changes minimized, where possible
> > make the drivers use common ops/callback functions.
> The is no question that there are similar routines in different drivers. I would
> like to place as much as possible into common routines, but I never seem to find
> the time. There are too many bugs in other things I support to consider these
> niceties.
Consolidation generally reduces defects and improves ease of
updating.
>
^ permalink raw reply
* Huge static buffer in liquidio
From: Denys Vlasenko @ 2016-09-24 20:02 UTC (permalink / raw)
To: Raghu Vatsavayi, Satanand Burla, Felix Manlunas, Raghu Vatsavayi,
Derek Chickles
Cc: Dave Miller, Linux Kernel Mailing List, netdev
Hello,
I would like to discuss this commit:
commit d3d7e6c65f75de18ced10a98595a847f9f95f0ce
Author: Raghu Vatsavayi <rvatsavayi@caviumnetworks.com>
Date: Tue Jun 21 22:53:07 2016 -0700
liquidio: Firmware image download
This patch has firmware image related changes for: firmware
release upon failure, support latest firmware version and
firmware download in 4MB chunks.
Here is a part of it:
+u8 fbuf[4 * 1024 * 1024];
^^^^^^^^^^^^^^^ what?????????? [also, why it is not static?]
+
int octeon_download_firmware(struct octeon_device *oct, const u8 *data,
size_t size)
{
int ret = 0;
- u8 *p;
- u8 *buffer;
+ u8 *p = fbuf;
Don't you think that using 4 megabytes of static buffer
*just for the firmware upload* is not a good practice?
Further down the patch it's obvious that the buffer is not even
needed, because the firmware is already in memory, the "data"
and "size" parameters of this function point to it.
The meat of the change of the FW download is this (deleted
some debug messages code):
- buffer = kmemdup(data, size, GFP_KERNEL);
- if (!buffer)
- return -ENOMEM;
-
- p = buffer + sizeof(struct octeon_firmware_file_header);
+ data += sizeof(struct octeon_firmware_file_header);
/* load all images */
for (i = 0; i < be32_to_cpu(h->num_images); i++) {
load_addr = be64_to_cpu(h->desc[i].addr);
image_len = be32_to_cpu(h->desc[i].len);
- /* download the image */
- octeon_pci_write_core_mem(oct, load_addr, p, image_len);
+ /* Write in 4MB chunks*/
+ rem = image_len;
- p += image_len;
+ while (rem) {
+ if (rem < (4 * 1024 * 1024))
+ size = rem;
+ else
+ size = 4 * 1024 * 1024;
+
+ memcpy(p, data, size);
+
+ /* download the image */
+ octeon_pci_write_core_mem(oct, load_addr, p, (u32)size);
+
+ data += size;
+ rem -= (u32)size;
+ load_addr += size;
+ }
}
+ dev_info(&oct->pci_dev->dev, "Writing boot command: %s\n",
+ h->bootcmd);
/* Invoke the bootcmd */
ret = octeon_console_send_cmd(oct, h->bootcmd, 50);
-done_downloading:
- kfree(buffer);
octeon_pci_write_core_mem() takes spinlock around copy op,
I take this was the reason for this change: reduce
IRQ-disabled time.
Do you actually need an intermediate fbuf[] buffer here?
(IOW: can't you just write data to PCI from memory area pointed
by "data" ptr?)
If there is indeed a reason for intermediate buffer,
why did you drop the approach of having a temporary kmalloc'ed
buffer and switches to a static (and *huge*) buffer?
In my opinion, 4 Mbyte buffer is an overkill in any case.
A buffer of ~4..16 Kbyte one would work just fine - it's not like
you need to squeeze last 0.5% of speed when you upload firmware.
^ permalink raw reply
* Re: [PATCH] realtek: Add switch variable to 'switch case not processed' messages
From: Jes Sorensen @ 2016-09-24 20:02 UTC (permalink / raw)
To: Larry Finger
Cc: Joe Perches, Jean Delvare, Chaoming Li, Kalle Valo,
linux-wireless, netdev, linux-kernel
In-Reply-To: <cef6d8da-c390-f435-7a5e-d294dfbebbfb@lwfinger.net>
Larry Finger <Larry.Finger@lwfinger.net> writes:
> On 09/24/2016 12:32 PM, Joe Perches wrote:
>> Is there any value in that or is Jes' work going to make
>> doing any or all of this unnecessary and futile?
>
> That is not yet determined. The only driver that is to be replaced at
> this point is rtl8192cu. Jes only has USB I/O for his driver. We are
> looking at adding SDIO, and once that is done, PCI should be possible.
If someone else wants to address PCI then it could happen quite soon,
but at the current schedule I don't see PCI happen in my driver for at
least a year, probably more.
If you can reduce the size of rtlwifi in the mean time that probably
isn't going to upset a lot of people.
Jes
^ permalink raw reply
* [PATCH net] Revert "net: ethernet: bcmgenet: use phydev from struct net_device"
From: Florian Fainelli @ 2016-09-24 19:58 UTC (permalink / raw)
To: netdev; +Cc: davem, tremyfr, jaedon.shin, Florian Fainelli
This reverts commit 62469c76007e ("net: ethernet: bcmgenet: use phydev
from struct net_device") because it causes GENETv1/2/3 adapters to
expose the following behavior after an ifconfig down/up sequence:
PING fainelli-linux (10.112.156.244): 56 data bytes
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.352 ms
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.472 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.496 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.517 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.536 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.557 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=752.448 ms (DUP!)
This was previously fixed by commit 5dbebbb44a6a ("net: bcmgenet:
Software reset EPHY after power on") but the commit we are reverting was
essentially making this previous commit void, here is why.
Without commit 62469c76007e we would have the following scenario after
an ifconfig down then up sequence:
- bcmgenet_open() calls bcmgenet_power_up() to make sure the PHY is
initialized *before* we get to initialize the UniMAC, this is
critical to ensure the PHY is in a correct state, priv->phydev is
valid, this code executes fine
- second time from bcmgenet_mii_probe(), through the normal
phy_init_hw() call (which arguably could be optimized out)
Everything is fine in that case. With commit 62469c76007e, we would have
the following scenario to happen after an ifconfig down then up
sequence:
- bcmgenet_close() calls phy_disonnect() which makes dev->phydev become
NULL
- when bcmgenet_open() executes again and calls bcmgenet_mii_reset() from
bcmgenet_power_up() to initialize the internal PHY, the NULL check
becomes true, so we do not reset the PHY, yet we keep going on and
initialize the UniMAC, causing MAC activity to occur
- we call bcmgenet_mii_reset() from bcmgenet_mii_probe(), but this is
too late, the PHY is botched, and causes the above bogus pings/packets
transmission/reception to occur
Reported-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
David,
There is already a commit:
Revert "net: ethernet: bcmgenet: use phy_ethtool_{get|set}_link_ksettings"
which should make this apply cleanly to "net" now.
Thanks!
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 45 ++++++++++++++------------
drivers/net/ethernet/broadcom/genet/bcmgenet.h | 1 +
drivers/net/ethernet/broadcom/genet/bcmmii.c | 24 +++++++-------
3 files changed, 39 insertions(+), 31 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 8d4f8495dbb3..541456398dfb 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -453,25 +453,29 @@ static inline void bcmgenet_rdma_ring_writel(struct bcmgenet_priv *priv,
static int bcmgenet_get_settings(struct net_device *dev,
struct ethtool_cmd *cmd)
{
+ struct bcmgenet_priv *priv = netdev_priv(dev);
+
if (!netif_running(dev))
return -EINVAL;
- if (!dev->phydev)
+ if (!priv->phydev)
return -ENODEV;
- return phy_ethtool_gset(dev->phydev, cmd);
+ return phy_ethtool_gset(priv->phydev, cmd);
}
static int bcmgenet_set_settings(struct net_device *dev,
struct ethtool_cmd *cmd)
{
+ struct bcmgenet_priv *priv = netdev_priv(dev);
+
if (!netif_running(dev))
return -EINVAL;
- if (!dev->phydev)
+ if (!priv->phydev)
return -ENODEV;
- return phy_ethtool_sset(dev->phydev, cmd);
+ return phy_ethtool_sset(priv->phydev, cmd);
}
static int bcmgenet_set_rx_csum(struct net_device *dev,
@@ -937,7 +941,7 @@ static int bcmgenet_get_eee(struct net_device *dev, struct ethtool_eee *e)
e->eee_active = p->eee_active;
e->tx_lpi_timer = bcmgenet_umac_readl(priv, UMAC_EEE_LPI_TIMER);
- return phy_ethtool_get_eee(dev->phydev, e);
+ return phy_ethtool_get_eee(priv->phydev, e);
}
static int bcmgenet_set_eee(struct net_device *dev, struct ethtool_eee *e)
@@ -954,7 +958,7 @@ static int bcmgenet_set_eee(struct net_device *dev, struct ethtool_eee *e)
if (!p->eee_enabled) {
bcmgenet_eee_enable_set(dev, false);
} else {
- ret = phy_init_eee(dev->phydev, 0);
+ ret = phy_init_eee(priv->phydev, 0);
if (ret) {
netif_err(priv, hw, dev, "EEE initialization failed\n");
return ret;
@@ -964,12 +968,14 @@ static int bcmgenet_set_eee(struct net_device *dev, struct ethtool_eee *e)
bcmgenet_eee_enable_set(dev, true);
}
- return phy_ethtool_set_eee(dev->phydev, e);
+ return phy_ethtool_set_eee(priv->phydev, e);
}
static int bcmgenet_nway_reset(struct net_device *dev)
{
- return genphy_restart_aneg(dev->phydev);
+ struct bcmgenet_priv *priv = netdev_priv(dev);
+
+ return genphy_restart_aneg(priv->phydev);
}
/* standard ethtool support functions. */
@@ -996,13 +1002,12 @@ static struct ethtool_ops bcmgenet_ethtool_ops = {
static int bcmgenet_power_down(struct bcmgenet_priv *priv,
enum bcmgenet_power_mode mode)
{
- struct net_device *ndev = priv->dev;
int ret = 0;
u32 reg;
switch (mode) {
case GENET_POWER_CABLE_SENSE:
- phy_detach(ndev->phydev);
+ phy_detach(priv->phydev);
break;
case GENET_POWER_WOL_MAGIC:
@@ -1063,6 +1068,7 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
/* ioctl handle special commands that are not present in ethtool. */
static int bcmgenet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
{
+ struct bcmgenet_priv *priv = netdev_priv(dev);
int val = 0;
if (!netif_running(dev))
@@ -1072,10 +1078,10 @@ static int bcmgenet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
case SIOCGMIIPHY:
case SIOCGMIIREG:
case SIOCSMIIREG:
- if (!dev->phydev)
+ if (!priv->phydev)
val = -ENODEV;
else
- val = phy_mii_ioctl(dev->phydev, rq, cmd);
+ val = phy_mii_ioctl(priv->phydev, rq, cmd);
break;
default:
@@ -2458,7 +2464,6 @@ static void bcmgenet_irq_task(struct work_struct *work)
{
struct bcmgenet_priv *priv = container_of(
work, struct bcmgenet_priv, bcmgenet_irq_work);
- struct net_device *ndev = priv->dev;
netif_dbg(priv, intr, priv->dev, "%s\n", __func__);
@@ -2471,7 +2476,7 @@ static void bcmgenet_irq_task(struct work_struct *work)
/* Link UP/DOWN event */
if (priv->irq0_stat & UMAC_IRQ_LINK_EVENT) {
- phy_mac_interrupt(ndev->phydev,
+ phy_mac_interrupt(priv->phydev,
!!(priv->irq0_stat & UMAC_IRQ_LINK_UP));
priv->irq0_stat &= ~UMAC_IRQ_LINK_EVENT;
}
@@ -2833,7 +2838,7 @@ static void bcmgenet_netif_start(struct net_device *dev)
/* Monitor link interrupts now */
bcmgenet_link_intr_enable(priv);
- phy_start(dev->phydev);
+ phy_start(priv->phydev);
}
static int bcmgenet_open(struct net_device *dev)
@@ -2932,7 +2937,7 @@ static void bcmgenet_netif_stop(struct net_device *dev)
struct bcmgenet_priv *priv = netdev_priv(dev);
netif_tx_stop_all_queues(dev);
- phy_stop(dev->phydev);
+ phy_stop(priv->phydev);
bcmgenet_intr_disable(priv);
bcmgenet_disable_rx_napi(priv);
bcmgenet_disable_tx_napi(priv);
@@ -2958,7 +2963,7 @@ static int bcmgenet_close(struct net_device *dev)
bcmgenet_netif_stop(dev);
/* Really kill the PHY state machine and disconnect from it */
- phy_disconnect(dev->phydev);
+ phy_disconnect(priv->phydev);
/* Disable MAC receive */
umac_enable_set(priv, CMD_RX_EN, false);
@@ -3517,7 +3522,7 @@ static int bcmgenet_suspend(struct device *d)
bcmgenet_netif_stop(dev);
- phy_suspend(dev->phydev);
+ phy_suspend(priv->phydev);
netif_device_detach(dev);
@@ -3581,7 +3586,7 @@ static int bcmgenet_resume(struct device *d)
if (priv->wolopts)
clk_disable_unprepare(priv->clk_wol);
- phy_init_hw(dev->phydev);
+ phy_init_hw(priv->phydev);
/* Speed settings must be restored */
bcmgenet_mii_config(priv->dev);
@@ -3614,7 +3619,7 @@ static int bcmgenet_resume(struct device *d)
netif_device_attach(dev);
- phy_resume(dev->phydev);
+ phy_resume(priv->phydev);
if (priv->eee.eee_enabled)
bcmgenet_eee_enable_set(dev, true);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.h b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
index 0f0868c56f05..1e2dc34d331a 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.h
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
@@ -597,6 +597,7 @@ struct bcmgenet_priv {
/* MDIO bus variables */
wait_queue_head_t wq;
+ struct phy_device *phydev;
bool internal_phy;
struct device_node *phy_dn;
struct device_node *mdio_dn;
diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
index e907acd81da9..457c3bc8cfff 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
@@ -86,7 +86,7 @@ static int bcmgenet_mii_write(struct mii_bus *bus, int phy_id,
void bcmgenet_mii_setup(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
- struct phy_device *phydev = dev->phydev;
+ struct phy_device *phydev = priv->phydev;
u32 reg, cmd_bits = 0;
bool status_changed = false;
@@ -183,9 +183,9 @@ void bcmgenet_mii_reset(struct net_device *dev)
if (GENET_IS_V4(priv))
return;
- if (dev->phydev) {
- phy_init_hw(dev->phydev);
- phy_start_aneg(dev->phydev);
+ if (priv->phydev) {
+ phy_init_hw(priv->phydev);
+ phy_start_aneg(priv->phydev);
}
}
@@ -236,7 +236,6 @@ static void bcmgenet_internal_phy_setup(struct net_device *dev)
static void bcmgenet_moca_phy_setup(struct bcmgenet_priv *priv)
{
- struct net_device *ndev = priv->dev;
u32 reg;
/* Speed settings are set in bcmgenet_mii_setup() */
@@ -245,14 +244,14 @@ static void bcmgenet_moca_phy_setup(struct bcmgenet_priv *priv)
bcmgenet_sys_writel(priv, reg, SYS_PORT_CTRL);
if (priv->hw_params->flags & GENET_HAS_MOCA_LINK_DET)
- fixed_phy_set_link_update(ndev->phydev,
+ fixed_phy_set_link_update(priv->phydev,
bcmgenet_fixed_phy_link_update);
}
int bcmgenet_mii_config(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
- struct phy_device *phydev = dev->phydev;
+ struct phy_device *phydev = priv->phydev;
struct device *kdev = &priv->pdev->dev;
const char *phy_name = NULL;
u32 id_mode_dis = 0;
@@ -303,7 +302,7 @@ int bcmgenet_mii_config(struct net_device *dev)
* capabilities, use that knowledge to also configure the
* Reverse MII interface correctly.
*/
- if ((phydev->supported & PHY_BASIC_FEATURES) ==
+ if ((priv->phydev->supported & PHY_BASIC_FEATURES) ==
PHY_BASIC_FEATURES)
port_ctrl = PORT_MODE_EXT_RVMII_25;
else
@@ -372,7 +371,7 @@ int bcmgenet_mii_probe(struct net_device *dev)
return -ENODEV;
}
} else {
- phydev = dev->phydev;
+ phydev = priv->phydev;
phydev->dev_flags = phy_flags;
ret = phy_connect_direct(dev, phydev, bcmgenet_mii_setup,
@@ -383,6 +382,8 @@ int bcmgenet_mii_probe(struct net_device *dev)
}
}
+ priv->phydev = phydev;
+
/* Configure port multiplexer based on what the probed PHY device since
* reading the 'max-speed' property determines the maximum supported
* PHY speed which is needed for bcmgenet_mii_config() to configure
@@ -390,7 +391,7 @@ int bcmgenet_mii_probe(struct net_device *dev)
*/
ret = bcmgenet_mii_config(dev);
if (ret) {
- phy_disconnect(phydev);
+ phy_disconnect(priv->phydev);
return ret;
}
@@ -400,7 +401,7 @@ int bcmgenet_mii_probe(struct net_device *dev)
* Ethernet MAC ISRs
*/
if (priv->internal_phy)
- phydev->irq = PHY_IGNORE_INTERRUPT;
+ priv->phydev->irq = PHY_IGNORE_INTERRUPT;
return 0;
}
@@ -605,6 +606,7 @@ static int bcmgenet_mii_pd_init(struct bcmgenet_priv *priv)
}
+ priv->phydev = phydev;
priv->phy_interface = pd->phy_interface;
return 0;
--
2.7.4
^ permalink raw reply related
* Re: [PATCH] realtek: Add switch variable to 'switch case not processed' messages
From: Larry Finger @ 2016-09-24 19:06 UTC (permalink / raw)
To: Joe Perches, Jean Delvare, Jes Sorensen
Cc: Chaoming Li, Kalle Valo, linux-wireless, netdev, linux-kernel
In-Reply-To: <1474738358.23838.11.camel@perches.com>
On 09/24/2016 12:32 PM, Joe Perches wrote:
> (adding Jes Sorensen to recipients)
>
> On Sat, 2016-09-24 at 11:35 -0500, Larry Finger wrote:
>> I have patches that makes HAL_DEF_WOWLAN be a no-op for the rest of the drivers,
>> and one that sets the enum values for that particular statement to hex values. I
>> also looked at the other large enums and decided that they never need the human
>> lookup.
>
> Hey Larry.
>
> There are many somewhat common realtek wireless drivers.
>
> Not to step on your toes, but what do you think of
> rationalizing the switch/case statements of all the
> realtek drivers in a few steps:
>
> o Reindent all the switch/case blocks to a more normal
> kernel style (git diff -w would show no changes here)
That sounds like busy work to me, but if you want to do it, go ahead.
> o cast, spacing and parenthesis reductions
> Lots of odd and somewhat unique styles in various
> drivers, looks like too many individual authors without
> a style guide / code enforcer using slightly different
> personalized code. Glancing at the code, it looks to be
> similar logic, just written in different styles.
Same comment.
> o Logic changes like
> from:
> if (foo) func(..., bar, ...); else func(..., baz, ...);
> to:
> func(..., foo ? bar : baz, ...);
> to make the case statement code blocks more consistent
> and emit somewhat smaller object code.
I find if .. else constructs much easier to read than the cond ? xxxx : yyyy
form. I would reject any such patches.
> o Consolidation of equivalent function spanning drivers
> With the style only changes minimized, where possible
> make the drivers use common ops/callback functions.
The is no question that there are similar routines in different drivers. I would
like to place as much as possible into common routines, but I never seem to find
the time. There are too many bugs in other things I support to consider these
niceties.
> Is there any value in that or is Jes' work going to make
> doing any or all of this unnecessary and futile?
That is not yet determined. The only driver that is to be replaced at this point
is rtl8192cu. Jes only has USB I/O for his driver. We are looking at adding
SDIO, and once that is done, PCI should be possible.
Larry
^ permalink raw reply
* [PATCH v3] bpf: Set register type according to is_valid_access()
From: Mickaël Salaün @ 2016-09-24 18:01 UTC (permalink / raw)
To: linux-kernel
Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
Daniel Borkmann, Kees Cook, Sargun Dhillon, Tejun Heo, netdev
This prevent future potential pointer leaks when an unprivileged eBPF
program will read a pointer value from its context. Even if
is_valid_access() returns a pointer type, the eBPF verifier replace it
with UNKNOWN_VALUE. The register value that contains a kernel address is
then allowed to leak. Moreover, this fix allows unprivileged eBPF
programs to use functions with (legitimate) pointer arguments.
Not an issue currently since reg_type is only set for PTR_TO_PACKET or
PTR_TO_PACKET_END in XDP and TC programs that can only be loaded as
privileged. For now, the only unprivileged eBPF program allowed is for
socket filtering and all the types from its context are UNKNOWN_VALUE.
However, this fix is important for future unprivileged eBPF programs
which could use pointers in their context.
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
---
kernel/bpf/verifier.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index daea765d72e6..adbc7c161ba5 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -795,9 +795,8 @@ static int check_mem_access(struct verifier_env *env, u32 regno, int off,
err = check_ctx_access(env, off, size, t, ®_type);
if (!err && t == BPF_READ && value_regno >= 0) {
mark_reg_unknown_value(state->regs, value_regno);
- if (env->allow_ptr_leaks)
- /* note that reg.[id|off|range] == 0 */
- state->regs[value_regno].type = reg_type;
+ /* note that reg.[id|off|range] == 0 */
+ state->regs[value_regno].type = reg_type;
}
} else if (reg->type == FRAME_PTR || reg->type == PTR_TO_STACK) {
--
2.9.3
^ permalink raw reply related
* [PATCH net-next] gre: use nla_get_be32() to extract flowinfo
From: Lance Richardson @ 2016-09-24 18:01 UTC (permalink / raw)
To: netdev
Eliminate a sparse endianness mismatch warning, use nla_get_be32() to
extract a __be32 value instead of nla_get_u32().
Signed-off-by: Lance Richardson <lrichard@redhat.com>
---
net/ipv6/ip6_gre.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 397e1ed..4ce74f8 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1239,7 +1239,7 @@ static void ip6gre_netlink_parms(struct nlattr *data[],
parms->encap_limit = nla_get_u8(data[IFLA_GRE_ENCAP_LIMIT]);
if (data[IFLA_GRE_FLOWINFO])
- parms->flowinfo = nla_get_u32(data[IFLA_GRE_FLOWINFO]);
+ parms->flowinfo = nla_get_be32(data[IFLA_GRE_FLOWINFO]);
if (data[IFLA_GRE_FLAGS])
parms->flags = nla_get_u32(data[IFLA_GRE_FLAGS]);
--
2.5.5
^ permalink raw reply related
* Re: [PATCH] realtek: Add switch variable to 'switch case not processed' messages
From: Joe Perches @ 2016-09-24 17:32 UTC (permalink / raw)
To: Larry Finger, Jean Delvare, Jes Sorensen
Cc: Chaoming Li, Kalle Valo, linux-wireless, netdev, linux-kernel
In-Reply-To: <49094659-662d-a902-6740-ea3d1fea6660@lwfinger.net>
(adding Jes Sorensen to recipients)
On Sat, 2016-09-24 at 11:35 -0500, Larry Finger wrote:
> I have patches that makes HAL_DEF_WOWLAN be a no-op for the rest of the drivers,
> and one that sets the enum values for that particular statement to hex values. I
> also looked at the other large enums and decided that they never need the human
> lookup.
Hey Larry.
There are many somewhat common realtek wireless drivers.
Not to step on your toes, but what do you think of
rationalizing the switch/case statements of all the
realtek drivers in a few steps:
o Reindent all the switch/case blocks to a more normal
kernel style (git diff -w would show no changes here)
o cast, spacing and parenthesis reductions
Lots of odd and somewhat unique styles in various
drivers, looks like too many individual authors without
a style guide / code enforcer using slightly different
personalized code. Glancing at the code, it looks to be
similar logic, just written in different styles.
o Logic changes like
from:
if (foo) func(..., bar, ...); else func(..., baz, ...);
to:
func(..., foo ? bar : baz, ...);
to make the case statement code blocks more consistent
and emit somewhat smaller object code.
o Consolidation of equivalent function spanning drivers
With the style only changes minimized, where possible
make the drivers use common ops/callback functions.
Is there any value in that or is Jes' work going to make
doing any or all of this unnecessary and futile?
^ permalink raw reply
* Re: [PATCH] Net Driver: Add Cypress GX3 VID=04b4 PID=3610.
From: Chris Roth @ 2016-09-24 16:59 UTC (permalink / raw)
To: linux-usb, netdev, linux-kernel
In-Reply-To: <CAPZMiRa7mTJK5SOWm2eLbhuNZ5EZ7CCPzG40dPzzxsstP78VTQ@mail.gmail.com>
Due to my lack of familiarity with the how git send-email works, I've
unintentionally had my name listed as the first 'from' whereas I
intended Allan Chou to be listed as the first 'from' in the patch. If
anyone can correct this on my behalf, I would appreciate it.
Regards,
Chris
On Sat, Sep 24, 2016 at 10:57 AM, Chris Roth <chris.roth@usask.ca> wrote:
>
> Due to my lack of familiarity with the how git send-email works, I've unintentionally had my name listed as the first 'from' whereas I intended Allan Chou to be listed as the first 'from' in the patch. If anyone can correct this on my behalf, I would appreciate it.
>
> Regards,
> Chris
>
> On Fri, Sep 23, 2016 at 4:24 PM, <chris.roth@usask.ca> wrote:
>>
>> From: Chris Roth <chris.roth@usask.ca>
>>
>> From: Allan Chou <allan@asix.com.tw>
>>
>> Add support for Cypress GX3 SuperSpeed to Gigabit Ethernet
>> Bridge Controller (Vendor=04b4 ProdID=3610).
>>
>> Patch verified on x64 linux kernel 4.7.4 system with the
>> Kensington SD4600P USB-C Universal Dock with Power, which uses the
>> Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge Controller.
>>
>> A similar patch was signed-off and tested-by Allan Chou
>> <allan@asix.com.tw> on 2015-12-01.
>>
>> Allan verified his similar patch on x86 Linux kernel 4.1.6 system
>> with Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge Controller.
>>
>> Tested-by: Allan Chou <allan@asix.com.tw>
>> Tested-by: Chris Roth <chris.roth@usask.ca>
>>
>> Signed-off-by: Allan Chou <allan@asix.com.tw>
>> Signed-off-by: Chris Roth <chris.roth@usask.ca>
>> ---
>> drivers/net/usb/ax88179_178a.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
>> index e6338c1..8a6675d 100644
>> --- a/drivers/net/usb/ax88179_178a.c
>> +++ b/drivers/net/usb/ax88179_178a.c
>> @@ -1656,6 +1656,19 @@ static const struct driver_info ax88178a_info = {
>> .tx_fixup = ax88179_tx_fixup,
>> };
>>
>> +static const struct driver_info cypress_GX3_info = {
>> + .description = "Cypress GX3 SuperSpeed to Gigabit Ethernet Controller",
>> + .bind = ax88179_bind,
>> + .unbind = ax88179_unbind,
>> + .status = ax88179_status,
>> + .link_reset = ax88179_link_reset,
>> + .reset = ax88179_reset,
>> + .stop = ax88179_stop,
>> + .flags = FLAG_ETHER | FLAG_FRAMING_AX,
>> + .rx_fixup = ax88179_rx_fixup,
>> + .tx_fixup = ax88179_tx_fixup,
>> +};
>> +
>> static const struct driver_info dlink_dub1312_info = {
>> .description = "D-Link DUB-1312 USB 3.0 to Gigabit Ethernet Adapter",
>> .bind = ax88179_bind,
>> @@ -1718,6 +1731,10 @@ static const struct usb_device_id products[] = {
>> USB_DEVICE(0x0b95, 0x178a),
>> .driver_info = (unsigned long)&ax88178a_info,
>> }, {
>> + /* Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge Controller */
>> + USB_DEVICE(0x04b4, 0x3610),
>> + .driver_info = (unsigned long)&cypress_GX3_info,
>> +}, {
>> /* D-Link DUB-1312 USB 3.0 to Gigabit Ethernet Adapter */
>> USB_DEVICE(0x2001, 0x4a00),
>> .driver_info = (unsigned long)&dlink_dub1312_info,
>> --
>> 2.7.4
>>
>
^ permalink raw reply
* Re: [Intel-wired-lan] [PATCH net-next v2 1/2] i40e: remove superfluous I40E_DEBUG_USER statement
From: Alexander Duyck @ 2016-09-24 16:35 UTC (permalink / raw)
To: Stefan Assmann; +Cc: David Miller, intel-wired-lan, Netdev
In-Reply-To: <6457da99-5895-cd0f-4a43-01f0876b2ea6@kpanic.de>
On Sat, Sep 24, 2016 at 4:13 AM, Stefan Assmann <sassmann@kpanic.de> wrote:
> On 24.09.2016 04:48, Alexander Duyck wrote:
>> On Fri, Sep 23, 2016 at 6:30 AM, Stefan Assmann <sassmann@kpanic.de> wrote:
>>> This debug statement is confusing and never set in the code. Any debug
>>> output should be guarded by the proper I40E_DEBUG_* statement which can
>>> be enabled via the debug module parameter or ethtool.
>>> Remove or convert the I40E_DEBUG_USER cases to I40E_DEBUG_INIT.
>>>
>>> v2: re-add setting the debug_mask in i40e_set_msglevel() so that the
>>> debug level can still be altered via ethtool msglvl.
>>>
>>> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
>>> ---
>>> drivers/net/ethernet/intel/i40e/i40e_common.c | 3 ---
>>> drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 6 -----
>>> drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 3 +--
>>> drivers/net/ethernet/intel/i40e/i40e_main.c | 35 +++++++++++++-------------
>>> drivers/net/ethernet/intel/i40e/i40e_type.h | 2 --
>>> 5 files changed, 18 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
>>> index 2154a34..8ccb09c 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_common.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
>>> @@ -3207,9 +3207,6 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
>>> break;
>>> case I40E_AQ_CAP_ID_MSIX:
>>> p->num_msix_vectors = number;
>>> - i40e_debug(hw, I40E_DEBUG_INIT,
>>> - "HW Capability: MSIX vector count = %d\n",
>>> - p->num_msix_vectors);
>>> break;
>>> case I40E_AQ_CAP_ID_VF_MSIX:
>>> p->num_msix_vectors_vf = number;
>>
>> I'm assuming this is dropped because you considered it redundant with
>> the dump in i40e_get_capabilities. If so it would have been nice to
>> see this called out in your patch description somewhere as it doesn't
>> jive with the rest of the patch since you are stripping something that
>> is using I40E_DEBUG_INIT.
>
> Hi Alex,
>
> agreed, it seemed redundant. I'll make a note about it in the next
> version when we have decided how to proceed.
>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
>>> index 05cf9a7..e9c6f1c 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
>>> @@ -1210,12 +1210,6 @@ static ssize_t i40e_dbg_command_write(struct file *filp,
>>> u32 level;
>>> cnt = sscanf(&cmd_buf[10], "%i", &level);
>>> if (cnt) {
>>> - if (I40E_DEBUG_USER & level) {
>>> - pf->hw.debug_mask = level;
>>> - dev_info(&pf->pdev->dev,
>>> - "set hw.debug_mask = 0x%08x\n",
>>> - pf->hw.debug_mask);
>>> - }
>>> pf->msg_enable = level;
>>> dev_info(&pf->pdev->dev, "set msg_enable = 0x%08x\n",
>>> pf->msg_enable);
>>
>> From what I can tell the interface is completely redundant as ethtool
>> can already do this. I'd say it is okay to just remove this command
>> and section entirely from the debugfs interface.
>
> Yes, I didn't want to stray too far from what the description said and
> just removed the I40E_DEBUG_USER related code.
>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
>>> index 1835186..02f55ab 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
>>> @@ -987,8 +987,7 @@ static void i40e_set_msglevel(struct net_device *netdev, u32 data)
>>> struct i40e_netdev_priv *np = netdev_priv(netdev);
>>> struct i40e_pf *pf = np->vsi->back;
>>>
>>> - if (I40E_DEBUG_USER & data)
>>> - pf->hw.debug_mask = data;
>>> + pf->hw.debug_mask = data;
>>> pf->msg_enable = data;
>>> }
>>>
>>
>> So the way I view this is that I40E_DEBUG_USER appears to be a flag
>> that is being used to differentiate between some proprietary flags and
>> the standard msg level. The problem is that msg_enable and debug_mask
>> are playing off of two completely different bit definitions. For
>> example how much sense does it make for NETIF_F_MSG_TX_DONE to map to
>> I40E_DEBUG_DCB. If anything what should probably happen here is
>> instead of dropping the if there probably needs to be an else.
>
> As you said the flags don't match, which is part of the problem. What
> tipped me of starting to work on this is, that the debug module
> parameter doesn't do a thing atm and I had to debug some stuff during
> driver MSI-X initialization. So my main pain point here is to get the
> debug parameter in a sane state.
>
>> This is one of many things on my list of items to fix since I have
>> come back to Intel. It is just a matter of finding the time.
>> Basically what I would really prefer to see here is us move all of the
>> flags in i40e_debug_mask so that we didn't have any overlap with the
>> NETIF_F_MSG_* flags unless there is a relation between the two.
>
> That sounds like a good idea and I'm happy to join in. So for now, I
> could drop the I40E_DEBUG_USER changes and just focus on making the
> debug parameter usable. All the non-standard debug output could be
> handled by I40E_DEBUG_USER or whatever better name we could for the
> flag. The current name doesn't really explain what it's meant for.
>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> index 61b0fc4..56369761 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> @@ -6665,16 +6665,19 @@ static int i40e_get_capabilities(struct i40e_pf *pf)
>>> }
>>> } while (err);
>>>
>>> - if (pf->hw.debug_mask & I40E_DEBUG_USER)
>>> - dev_info(&pf->pdev->dev,
>>> - "pf=%d, num_vfs=%d, msix_pf=%d, msix_vf=%d, fd_g=%d, fd_b=%d, pf_max_q=%d num_vsi=%d\n",
>>> - pf->hw.pf_id, pf->hw.func_caps.num_vfs,
>>> - pf->hw.func_caps.num_msix_vectors,
>>> - pf->hw.func_caps.num_msix_vectors_vf,
>>> - pf->hw.func_caps.fd_filters_guaranteed,
>>> - pf->hw.func_caps.fd_filters_best_effort,
>>> - pf->hw.func_caps.num_tx_qp,
>>> - pf->hw.func_caps.num_vsis);
>>> + i40e_debug(&pf->hw, I40E_DEBUG_INIT,
>>> + "HW Capabilities: PF-id[%d] num_vfs=%d, msix_pf=%d, msix_vf=%d\n",
>>> + pf->hw.pf_id,
>>> + pf->hw.func_caps.num_vfs,
>>> + pf->hw.func_caps.num_msix_vectors,
>>> + pf->hw.func_caps.num_msix_vectors_vf);
>>> + i40e_debug(&pf->hw, I40E_DEBUG_INIT,
>>> + "HW Capabilities: PF-id[%d] fd_g=%d, fd_b=%d, pf_max_qp=%d num_vsis=%d\n",
>>> + pf->hw.pf_id,
>>> + pf->hw.func_caps.fd_filters_guaranteed,
>>> + pf->hw.func_caps.fd_filters_best_effort,
>>> + pf->hw.func_caps.num_tx_qp,
>>> + pf->hw.func_caps.num_vsis);
>>>
>>> #define DEF_NUM_VSI (1 + (pf->hw.func_caps.fcoe ? 1 : 0) \
>>> + pf->hw.func_caps.num_vfs)
>>
>> I'd say don't bother with this. There isn't any point.
>
> OK, I thought the same thing but wasn't sure if anybody relies on this
> info.
>
>>> @@ -8495,14 +8498,10 @@ static int i40e_sw_init(struct i40e_pf *pf)
>>> int err = 0;
>>> int size;
>>>
>>> - pf->msg_enable = netif_msg_init(I40E_DEFAULT_MSG_ENABLE,
>>> - (NETIF_MSG_DRV|NETIF_MSG_PROBE|NETIF_MSG_LINK));
>>> - if (debug != -1 && debug != I40E_DEFAULT_MSG_ENABLE) {
>>> - if (I40E_DEBUG_USER & debug)
>>> - pf->hw.debug_mask = debug;
>>> - pf->msg_enable = netif_msg_init((debug & ~I40E_DEBUG_USER),
>>> - I40E_DEFAULT_MSG_ENABLE);
>>> - }
>>> + pf->msg_enable = netif_msg_init(debug,
>>> + NETIF_MSG_DRV |
>>> + NETIF_MSG_PROBE |
>>> + NETIF_MSG_LINK);
>>>
>>> /* Set default capability flags */
>>> pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
>>
>> Okay so I think I now see why there is confusion about how debug is
>> used. The documentation in the driver is wrong for how it worked. It
>> wasn't being passed as a 0-16, somebody implemented this as a 32 bit
>> bitmask. So the question becomes how to fix it. The problem is with
>> the patch as it is so far we end up with pf->msg_enable being
>> populated but pf->hw.debug_mask never being populated. The values you
>> are passing as the default don't make any sense either since they
>> don't really map to the same functionality in I40e. They map to
>> DEBUG_INIT, DEBUG_RELEASE, and an unused bit.
>>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
>>> index bd5f13b..7e88e35 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_type.h
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
>>> @@ -85,8 +85,6 @@ enum i40e_debug_mask {
>>> I40E_DEBUG_AQ_COMMAND = 0x06000000,
>>> I40E_DEBUG_AQ = 0x0F000000,
>>>
>>> - I40E_DEBUG_USER = 0xF0000000,
>>> -
>>> I40E_DEBUG_ALL = 0xFFFFFFFF
>>> };
>>>
>>
>> This end piece is where the problem really lies. The problem
>> statement for this would essentially be that the i40e driver uses the
>> debug module parameter in a non-standard way. It is using a tg3 style
>> bitmask to populate the fields, but then documenting it and coding
>> part of it like it is expecting the default debug usage. To top it
>> off it is doing the same kind of nonsense with the ethtool msg level
>> interface.
>>
>> The one piece we probably need with all this in order to really "fix"
>> the issue and still maintain some sense of functionality would be to
>> look at adding something that would populate pf->hw.debug_mask. I'm
>> half tempted to say that we could try adding another module parameter
>> named i40e_debug that we could use like tg3 does with tg3_debug and
>> change the debugfs interface to only modify that instead of messing
>> with the msg level, but the fact is that would probably just be more
>> confusing. For now what I would suggest doing is just splitting
>> msg_enable and pf->hw.debug_mask and for now just default the value of
>> pf->hw.debug_mask to I40E_DEFAULT_MSG_ENABLE. That way in a week or
>> two after netdev/netconf we will hopefully had a chance to hash this
>> all out and can find a better way to solve this.
>
> I really appreciate your feedback. What you're suggesting makes sense.
> Let's split the generic (msg_enable) debug information provided by the
> debug parameter from driver specific debug information. Not sure I'd
> like to see another module parameter, but that doesn't have to be
> decided right away.
>
> If you agree, I'll rewrite the patches to make a clear separation
> between debug_mask and msg_enable, making the debug parameter actually
> usable.
> And we'll sort out the rest along the way.
>
> Stefan
I think that works for me. It will be easier for us to sort out the
I40E_DEBUG_* flags since I think the out-of-tree driver might be
hiding some additional debug info. So if we can split msg_enable and
hw.debug_mask for now that would probably work out best.
One thought I had is if we wanted to overload the debug value what we
could do is make it so that we use bit 31 as the "I40E_DEBUG_USER"
value. Then we could go through and essentially do an if/else setup
throughout the code where if debug is negative it is assumed to be
either -1 or meant to represent hw.debug_mask, and if they are
positive the value is assumed to be setting msg enable. With negative
values used to identify the hw.debug_mask values we have the added
advantage that it then automatically forces netif_msg_init into
setting the msg_enable to the value passed in default_msg_enable_bits.
- Alex
^ permalink raw reply
* Re: [PATCH] realtek: Add switch variable to 'switch case not processed' messages
From: Larry Finger @ 2016-09-24 16:35 UTC (permalink / raw)
To: Joe Perches, Jean Delvare
Cc: Chaoming Li, Kalle Valo, linux-wireless, netdev, linux-kernel
In-Reply-To: <1474733754.23838.3.camel@perches.com>
On 09/24/2016 11:15 AM, Joe Perches wrote:
> On Sat, 2016-09-24 at 17:55 +0200, Jean Delvare wrote:
>> Would it make sense to explicitly set the enum values, or add them as
>> comments, to make such look-ups easier?
>
> If you want to create enum->#ENUM structs and
> "const char *" lookup functions, please be my guest.
>
> otherwise, hex is at least a consistent way to display
> what should be infrequent output.
Displaying those values as hex is OK. As Joe says, they will not be shown very
often.
I have patches that makes HAL_DEF_WOWLAN be a no-op for the rest of the drivers,
and one that sets the enum values for that particular statement to hex values. I
also looked at the other large enums and decided that they never need the human
lookup.
Larry
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox