Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
* Re: [PATCH 01/28] bnxt_en: Add bnxt_set_max_func_irqs().
From: Michael Chan @ 2016-12-05 17:10 UTC (permalink / raw)
  To: David Miller; +Cc: Selvin Xavier, dledford, linux-rdma, Netdev
In-Reply-To: <20161205.114704.1700305660817315475.davem@davemloft.net>

On Mon, Dec 5, 2016 at 8:47 AM, David Miller <davem@davemloft.net> wrote:
>
> It really doesn't make any sense to only send 7 out of 28 of these
> patches to the networking list.
>
> In fact I would say that you need to split this series into two
> components.

OK.  I will resend those bnxt_en patches later today as a separate series.

>
> One that goes into the networking tree, and once that's accepted you
> can submit the IB parts to that subsystem's maintainer.

^ permalink raw reply

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
From: Jason Gunthorpe @ 2016-12-05 17:10 UTC (permalink / raw)
  To: Liran Liss
  Cc: Hefty, Sean, Tom Talpey, Steve Wise, 'Doug Ledford',
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	'Steve Wise', Marciniszyn, Mike, Dalessandro, Dennis,
	'Lijun Ou', 'Wei Hu(Xavier)', Latif, Faisal,
	Yishai Hadas, 'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', Moni
In-Reply-To: <HE1PR0501MB2812393A0A690DEC1E03DE3FB1800-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>

On Sun, Dec 04, 2016 at 08:38:14PM +0000, Liran Liss wrote:

> Anyway, returning to the initial matter at hand: I would like to
> start with each port reporting a bit mask of the supported protocols
> on that link (RoCE v1/v2, Raw Ethernet, iWARP, etc.)  It will be
> used for reporting device capabilities in general for tools, as well
> as by applications that don't use rdmacm.

Why don't we start by defining how it is supposed to even work and how
to fix the RDMA CM before adding even more random capability bits?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH V1 rdma-core 0/2] Optimize RoCE address handle creation for userspace
From: Jason Gunthorpe @ 2016-12-05 17:07 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w
In-Reply-To: <1480866806-5052-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

On Sun, Dec 04, 2016 at 05:53:24PM +0200, Yishai Hadas wrote:

> Test results:
> Upon testing the above use case the provider couldn't be loaded as expected,
> "libibverbs: Warning: couldn't load driver 'libmlx5-rdmav2.so':
> libmlx5-rdmav2.so: cannot open shared object file: No such file or directory"

The dynmic linker didn't print something too? That is surprising..

I'm not sure where we are in the release cycle right now.. Doug - if
we didn't make a release with the 1.3 symbol tag (this was created
just before rdma-core) then this should go into 1.3 when it gets
merged..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: Use ib_drain_qp instead of ib_drain_rq in ib_srp
From: Bart Van Assche @ 2016-12-05 17:05 UTC (permalink / raw)
  To: Max Gurtovoy, sagig, Christoph Hellwig,
	swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <fa2d38be-bc59-5120-6dfd-f24ab01d6d8f-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

On 12/05/2016 06:43 AM, Max Gurtovoy wrote:
> hi guys,
> I've noticed that we use ib_drain_rq in teardown flow in ib_srp driver.
> Trying to figure out why is this better than ib_drain_qp ?
> BTW, the recv_cq != send_cq in srp so it's even better to use
> ib_drain_qp, isn't it ?
>
> I haven't encountered a bug in this area yet, but just trying to
> understand if it's there.

Hello Max,

The description of a patch that was accepted upstream about two years 
ago explains the purpose of the ib_drain_rq() call:

commit 7dad6b2e440d810273946b0e7092a8fe043c3b8a
Author: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Date:   Tue Oct 21 18:00:35 2014 +0200

     IB/srp: Fix a race condition triggered by destroying a queue pair

     At least LID reassignment can trigger a race condition in the SRP
     initiator driver, namely the receive completion handler trying to
     post a request on a QP during or after QP destruction and before
     the CQ's have been destroyed. Avoid this race by modifying a QP
     into the error state and by waiting until all receive completions
     have been processed before destroying a QP.

     Reported-by: Max Gurtuvoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
     Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
     Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
     Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>

There is no risk that any send work will be posted while the ib_srp 
driver destroys a QP.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 00/28] Broadcom RoCE Driver (bnxt_re)
From: Jason Gunthorpe @ 2016-12-05 16:51 UTC (permalink / raw)
  To: Selvin Xavier
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

On Sun, Dec 04, 2016 at 10:38:04PM -0800, Selvin Xavier wrote:
>  drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h |   60 +

This file probably needs to be in include/uapi/rdma like the others

Do you have a git tree someplace for this?

Do you have the user space component ready for rdma-core as well?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 01/28] bnxt_en: Add bnxt_set_max_func_irqs().
From: David Miller @ 2016-12-05 16:47 UTC (permalink / raw)
  To: selvin.xavier; +Cc: dledford, linux-rdma, michael.chan, netdev
In-Reply-To: <1480919912-1079-2-git-send-email-selvin.xavier@broadcom.com>


It really doesn't make any sense to only send 7 out of 28 of these
patches to the networking list.

In fact I would say that you need to split this series into two
components.

One that goes into the networking tree, and once that's accepted you
can submit the IB parts to that subsystem's maintainer.

^ permalink raw reply

* Re: [PATCH 00/28] Broadcom RoCE Driver (bnxt_re)
From: Doug Ledford @ 2016-12-05 15:17 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 5329 bytes --]

On 12/5/2016 1:38 AM, Selvin Xavier wrote:
> Hi Doug,
> 
> This series introduces the RoCE driver for the Broadcom
> NetXtreme-C/E 10/25/40/50 gigabit RoCE HCAs. This driver
> is dependent on the bnxt_en NIC driver.
> 
> This patch series is based on the "bnxt_re" branch on Doug's
> repository + four pending bnxt_en NIC driver patches that are
> yet to be pulled into net-next tree.

Please be clear about what you are expecting done with those four
patches.  If Dave is currently reviewing them, then I need to know
if/when he takes them so I can rebase the bnxt_en branch on his net-next
and pick them up that way.

> Please review and consider applying this to linux-rdma repository.
> 
> Thanks,
> Selvin Xavier
> 
> Michael Chan (7):
>   bnxt_en: Add bnxt_set_max_func_irqs().
>   bnxt_en: Enable MSIX early in bnxt_init_one().
>   bnxt_en: Move function reset to bnxt_init_one().
>   bnxt_en: Improve completion ring allocation for VFs.
>   bnxt_en: Reserve RDMA resources by default.
>   bnxt_en: Refactor the driver registration function with firmware.
>   bnxt_en: Add interface to support RDMA driver.
> 
> Selvin Xavier (21):
>   bnxt_re: Add bnxt_re RoCE driver files
>   bnxt_re: Introducing autogenerated Host Software Interface(hsi) file
>   bnxt_re: register with the NIC driver
>   bnxt_re: Enabling RoCE control path
>   bnxt_re: Adding Notification Queue support
>   bnxt_re: Support for PD, ucontext and mmap verbs
>   bnxt_re: Support for query and modify device verbs
>   bnxt_re: Adding support for port related verbs
>   bnxt_re: Support for GID related verbs
>   bnxt_re: Support for CQ verbs
>   bnxt_re: Support for AH verbs
>   bnxt_re: Support memory registration verbs
>   bnxt_re: Support QP verbs
>   bnxt_re: Support post_send verb
>   bnxt_re: Support post_recv
>   bnxt_re: Support poll_cq verb
>   bnxt_re: Handling dispatching of events to IB stack and cleanup during
>     unload
>   bnxt_re: Support for DCB
>   bnxt_re: Support debugfs
>   bnxt_re: Set uverbs command mask
>   bnxt_re: Add QP event handling
> 
>  drivers/infiniband/Kconfig                        |    2 +
>  drivers/infiniband/hw/Makefile                    |    1 +
>  drivers/infiniband/hw/bnxtre/Kconfig              |    9 +
>  drivers/infiniband/hw/bnxtre/Makefile             |    6 +
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c      | 2146 +++++++++
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h      |  391 ++
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.c    |  660 +++
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.h    |  193 +
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_res.c     |  802 ++++
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_res.h     |  197 +
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.c      |  811 ++++
>  drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.h      |  134 +
>  drivers/infiniband/hw/bnxtre/bnxt_re.h            |  120 +
>  drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.c    |  136 +
>  drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.h    |   25 +
>  drivers/infiniband/hw/bnxtre/bnxt_re_hsi.h        | 5201 +++++++++++++++++++++
>  drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c   | 3190 +++++++++++++
>  drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h   |  171 +
>  drivers/infiniband/hw/bnxtre/bnxt_re_main.c       | 1339 ++++++
>  drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h |   60 +
>  drivers/net/ethernet/broadcom/bnxt/Makefile       |    2 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c         |  355 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h         |   22 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c   |   14 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c     |  288 ++
>  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h     |   91 +
>  26 files changed, 16251 insertions(+), 115 deletions(-)
>  create mode 100644 drivers/infiniband/hw/bnxtre/Kconfig
>  create mode 100644 drivers/infiniband/hw/bnxtre/Makefile
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.c
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_res.c
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_res.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.c
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.c
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_hsi.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_main.c
>  create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h
>  create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
>  create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
> 


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply

* Re: [PATCH 08/28] bnxt_re: Add bnxt_re RoCE driver files
From: Doug Ledford @ 2016-12-05 15:16 UTC (permalink / raw)
  To: Selvin Xavier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eddie Wai, Devesh Sharma,
	Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-9-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 4093 bytes --]

On 12/5/2016 1:38 AM, Selvin Xavier wrote:

> diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
> index fb3fb89..a4fab22 100644
> --- a/drivers/infiniband/Kconfig
> +++ b/drivers/infiniband/Kconfig
> @@ -91,4 +91,6 @@ source "drivers/infiniband/hw/hfi1/Kconfig"
>  
>  source "drivers/infiniband/hw/qedr/Kconfig"
>  
> +source "drivers/infiniband/hw/bnxtre/Kconfig"
> +
>  endif # INFINIBAND
> diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile
> index e7a5ed9..7227b36 100644
> --- a/drivers/infiniband/hw/Makefile
> +++ b/drivers/infiniband/hw/Makefile
> @@ -11,3 +11,4 @@ obj-$(CONFIG_INFINIBAND_USNIC)		+= usnic/
>  obj-$(CONFIG_INFINIBAND_HFI1)		+= hfi1/
>  obj-$(CONFIG_INFINIBAND_HNS)		+= hns/
>  obj-$(CONFIG_INFINIBAND_QEDR)		+= qedr/
> +obj-$(CONFIG_INFINIBAND_BNXTRE)		+= bnxtre/
> diff --git a/drivers/infiniband/hw/bnxtre/Kconfig b/drivers/infiniband/hw/bnxtre/Kconfig
> new file mode 100644
> index 0000000..2637544
> --- /dev/null
> +++ b/drivers/infiniband/hw/bnxtre/Kconfig
> @@ -0,0 +1,9 @@
> +config INFINIBAND_BNXTRE
> +    tristate "Broadcom Netxtreme HCA support"
> +    depends on ETHERNET && NETDEVICES && PCI && INET
> +    select NET_VENDOR_BROADCOM
> +    select BNXT
> +    ---help---
> +	  This driver supports Broadcom NetXtreme-C/E 10/25/40/50 gigabit
> +	  RoCE HCAs.  To compile this driver as a module, choose M here:
> +	  the module will be called bnxt_re.
> diff --git a/drivers/infiniband/hw/bnxtre/Makefile b/drivers/infiniband/hw/bnxtre/Makefile
> new file mode 100644
> index 0000000..0521489
> --- /dev/null
> +++ b/drivers/infiniband/hw/bnxtre/Makefile
> @@ -0,0 +1,5 @@
> +
> +obj-$(CONFIG_INFINIBAND_BNXTRE) += bnxt_re.o
> +bnxt_re-y := bnxt_re_main.o bnxt_re_ib_verbs.o \
> +	     bnxt_qplib_res.o bnxt_qplib_rcfw.o	\
> +	     bnxt_qplib_sp.o bnxt_qplib_fp.o

A lot of times I prefer these files to be the final patch in the series.
 It's completely not possible to break bisectability if these are last.
Then again, if I squash this down to one commit it doesn't really matter....

> diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
> new file mode 100644
> index 0000000..34873f4
> --- /dev/null
> +++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
> @@ -0,0 +1,12 @@
> +/* Broadcom NetXtreme-C/E RoCE driver.
> + *
> + * Copyright (c) 2016 Broadcom Corporation
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation.
> + */

This copyright, repeated many times in the various skeleton files...

> diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
> new file mode 100644
> index 0000000..4c377dc
> --- /dev/null
> +++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
> @@ -0,0 +1,96 @@
> +/* Broadcom NetXtreme-C/E RoCE driver.
> + *
> + * Copyright (c) 2016 Broadcom Corporation
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation.
> + */
> +
> +/*
> + * Description: Main component of the bnxt_re driver
> + */
> +
> +#include <linux/module.h>
> +#include <linux/netdevice.h>
> +#include <linux/mutex.h>
> +#include <linux/list.h>
> +#include <linux/rculist.h>
> +#include "bnxt_re.h"
> +static char version[] =
> +		BNXT_RE_DESC " v" ROCE_DRV_MODULE_VERSION "\n";
> +
> +
> +MODULE_AUTHOR("Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>");
> +MODULE_DESCRIPTION(BNXT_RE_DESC " Driver");
> +MODULE_LICENSE("Dual BSD/GPL");

and this module license description do not agree.  Please make
everything consistent.  I don't care if it's GPLv2 or Dual licensed, it
simply needs to be consistent.




-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply

* Re: Use ib_drain_qp instead of ib_drain_rq in ib_srp
From: Max Gurtovoy @ 2016-12-05 14:51 UTC (permalink / raw)
  To: Bart Van Assche, sagig, Christoph Hellwig,
	swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <fa2d38be-bc59-5120-6dfd-f24ab01d6d8f-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Just noticed that send_cq has IB_POLL_DIRECT restriction in srp.

what about iSER then ? we use there ib_drain_sq...


On 12/5/2016 4:43 PM, Max Gurtovoy wrote:
> hi guys,
> I've noticed that we use ib_drain_rq in teardown flow in ib_srp driver.
> Trying to figure out why is this better than ib_drain_qp ?
> BTW, the recv_cq != send_cq in srp so it's even better to use
> ib_drain_qp, isn't it ?
>
> I haven't encountered a bug in this area yet, but just trying to
> understand if it's there.
>
> thanks,
> Max.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Use ib_drain_qp instead of ib_drain_rq in ib_srp
From: Max Gurtovoy @ 2016-12-05 14:43 UTC (permalink / raw)
  To: Bart Van Assche, sagig, Christoph Hellwig,
	swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

hi guys,
I've noticed that we use ib_drain_rq in teardown flow in ib_srp driver.
Trying to figure out why is this better than ib_drain_qp ?
BTW, the recv_cq != send_cq in srp so it's even better to use 
ib_drain_qp, isn't it ?

I haven't encountered a bug in this area yet, but just trying to 
understand if it's there.

thanks,
Max.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Open Fabrics meetings
From: Weiny, Ira @ 2016-12-05 14:10 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
  Cc: Christoph Lameter,
	'Jason Gunthorpe (jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org)',
	Doug Ledford (dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	Susan Coulter (skc-YOWKrPYUwWM@public.gmane.org),
	Liran Liss (liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org),
	Leon Romanovsky, Fleck, John, Hefty, Sean

John and I are trying to make plans to attend the OFA conference in Austin.

https://openfabrics.org/index.php/2017-ofa-workshop.html

Are there any plans to meet outside of the Monday - Friday already scheduled?

Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2 0/8] RXE improvements
From: Boyer, Andrew @ 2016-12-05 13:58 UTC (permalink / raw)
  To: Moni Shoua; +Cc: Yonatan Cohen, linux-rdma, Bart Van Assche
In-Reply-To: <CAG9sBKMkoCbeCO6SgPZrW-x0rBNTQUaar+wAqeVDTLeoT8HfHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 11/23/16, 12:59 PM, "monisonlists-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org on behalf of Moni Shoua"
<monisonlists-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org on behalf of monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:

>> Andrew Boyer (8):
>>   IB/rxe: Remove buffer used for printing IP address
>>   IB/rxe: Advance the consumer pointer before posting the CQE
>>   IB/rxe: Don't update the response PSN unless it's going forwards
>>   IB/rxe: Unblock loopback by moving skb_out increment
>>   IB/rxe: Add support for zero-byte operations
>>   IB/rxe: Add support for IB_CQ_REPORT_MISSED_EVENTS
>>   IB/rxe: Fix ref leak in rxe_create_qp()
>>   IB/rxe: Fix ref leak in duplicate_request()
>
>Thanks for the series.
>
>Please see comment in response for the first patch but except that
>
>Acked-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
>for everything else

Hello Moni,
I saw the comment from Bart in response to the first patch, but I prefer
it this way because of how much other code is required to take advantage
of %pIS. If you want it changed to %pIS, we can do that, though. What
would you prefer?

-Andrew

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH v2 2/2] IB/rxe: Hold refs when running tasklets
From: Andrew Boyer @ 2016-12-05 13:43 UTC (permalink / raw)
  To: monis-VPRAkNaXOzVWk0Htik3J/w, yonatanc-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Andrew Boyer
In-Reply-To: <1480945401-3025-2-git-send-email-andrew.boyer-8PEkshWhKlo@public.gmane.org>

It might be possible for all of a QP's references to be dropped
while one of that QP's tasklets is running.

For example, the completer might run during QP destroy.
If qp->valid is false, it will drop all of the packets on
the resp_pkts list, potentially removing the last reference.
Then it tries to advance the SQ consumer pointer. If the
SQ's buffer has already been destroyed, the system will
panic.

To be safe, hold a reference on the QP for the duration
of each tasklet.

Signed-off-by: Andrew Boyer <andrew.boyer-8PEkshWhKlo@public.gmane.org>
---
 drivers/infiniband/sw/rxe/rxe_comp.c | 4 ++++
 drivers/infiniband/sw/rxe/rxe_req.c  | 4 ++++
 drivers/infiniband/sw/rxe/rxe_resp.c | 3 +++
 3 files changed, 11 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 6c5e29d..3687fcd 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -510,6 +510,8 @@ int rxe_completer(void *arg)
 	struct rxe_pkt_info *pkt = NULL;
 	enum comp_state state;
 
+	rxe_add_ref(qp);
+
 	if (!qp->valid) {
 		while ((skb = skb_dequeue(&qp->resp_pkts))) {
 			rxe_drop_ref(qp);
@@ -739,11 +741,13 @@ int rxe_completer(void *arg)
 	/* we come here if we are done with processing and want the task to
 	 * exit from the loop calling us
 	 */
+	rxe_drop_ref(qp);
 	return -EAGAIN;
 
 done:
 	/* we come here if we have processed a packet we want the task to call
 	 * us again to see if there is anything else to do
 	 */
+	rxe_drop_ref(qp);
 	return 0;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 22bd963..035cb90 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -596,6 +596,8 @@ int rxe_requester(void *arg)
 	struct rxe_qp rollback_qp;
 	struct rxe_send_wqe rollback_wqe;
 
+	rxe_add_ref(qp);
+
 next_wqe:
 	if (unlikely(!qp->valid || qp->req.state == QP_STATE_ERROR))
 		goto exit;
@@ -756,8 +758,10 @@ int rxe_requester(void *arg)
 	 */
 	wqe->wr.send_flags |= IB_SEND_SIGNALED;
 	__rxe_do_task(&qp->comp.task);
+	rxe_drop_ref(qp);
 	return -EAGAIN;
 
 exit:
+	rxe_drop_ref(qp);
 	return -EAGAIN;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index dd3d88a..337a1cb 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -1198,6 +1198,8 @@ int rxe_responder(void *arg)
 	struct rxe_pkt_info *pkt = NULL;
 	int ret = 0;
 
+	rxe_add_ref(qp);
+
 	qp->resp.aeth_syndrome = AETH_ACK_UNLIMITED;
 
 	if (!qp->valid) {
@@ -1386,5 +1388,6 @@ int rxe_responder(void *arg)
 exit:
 	ret = -EAGAIN;
 done:
+	rxe_drop_ref(qp);
 	return ret;
 }
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v2 1/2] IB/rxe: Wait for tasklets to finish before tearing down QP
From: Andrew Boyer @ 2016-12-05 13:43 UTC (permalink / raw)
  To: monis-VPRAkNaXOzVWk0Htik3J/w, yonatanc-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Andrew Boyer

The system may crash when a malformed request is received and
the error is detected by the responder.

NodeA: $ ibv_rc_pingpong -g 0 -d rxe0 -i 1 -n 1 -s 50000
NodeB: $ ibv_rc_pingpong -g 0 -d rxe0 -i 1 -n 1 -s 1024 <NodeA_ip>

The responder generates a receive error on node B since the incoming
SEND is oversized. If the client tears down the QP before the responder
or the completer finish running, a page fault may occur.

The fix makes the destroy operation spin until the tasks complete, which
appears to be original intent of the design.

Signed-off-by: Andrew Boyer <andrew.boyer-8PEkshWhKlo@public.gmane.org>
Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 drivers/infiniband/sw/rxe/rxe_task.c | 19 +++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_task.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 1e19bf8..d2a14a1 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -121,6 +121,7 @@ int rxe_init_task(void *obj, struct rxe_task *task,
 	task->arg	= arg;
 	task->func	= func;
 	snprintf(task->name, sizeof(task->name), "%s", name);
+	task->destroyed	= false;
 
 	tasklet_init(&task->tasklet, rxe_do_task, (unsigned long)task);
 
@@ -132,11 +133,29 @@ int rxe_init_task(void *obj, struct rxe_task *task,
 
 void rxe_cleanup_task(struct rxe_task *task)
 {
+	unsigned long flags;
+	bool idle;
+
+	/*
+	 * Mark the task, then wait for it to finish. It might be
+	 * running in a non-tasklet (direct call) context.
+	 */
+	task->destroyed = true;
+
+	do {
+		spin_lock_irqsave(&task->state_lock, flags);
+		idle = (task->state == TASK_STATE_START);
+		spin_unlock_irqrestore(&task->state_lock, flags);
+	} while (!idle);
+
 	tasklet_kill(&task->tasklet);
 }
 
 void rxe_run_task(struct rxe_task *task, int sched)
 {
+	if (task->destroyed)
+		return;
+
 	if (sched)
 		tasklet_schedule(&task->tasklet);
 	else
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index d14aa6d..08ff42d 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -54,6 +54,7 @@ struct rxe_task {
 	int			(*func)(void *arg);
 	int			ret;
 	char			name[16];
+	bool			destroyed;
 };
 
 /*
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v2 0/2] IB/rxe: Fix kernel panics when tearing down QPs
From: Andrew Boyer @ 2016-12-05 13:42 UTC (permalink / raw)
  To: monis-VPRAkNaXOzVWk0Htik3J/w, yonatanc-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Andrew Boyer

This is a set of two patches that prevent kernel panics seen when tearing
down QPs. The second patch (holding refs in tasklets) might or might not be
needed once the first patch (waiting for tasklets to finish) is applied.
Feedback welcomed.

Update for v2:
 - Remove default initialization of idle in rxe_cleanup_task() per review

Andrew Boyer (2):
  IB/rxe: Wait for tasklets to finish before tearing down QP
  IB/rxe: Hold refs when running tasklets

 drivers/infiniband/sw/rxe/rxe_comp.c |  4 ++++
 drivers/infiniband/sw/rxe/rxe_req.c  |  4 ++++
 drivers/infiniband/sw/rxe/rxe_resp.c |  3 +++
 drivers/infiniband/sw/rxe/rxe_task.c | 19 +++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_task.h |  1 +
 5 files changed, 31 insertions(+)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: I/O error on dd commands
From: Max Gurtovoy @ 2016-12-05 10:12 UTC (permalink / raw)
  To: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil,
	monis-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <b8b48041a15444bc9c62176d6807433a-DqNMWkYM789gYsFYm0uEO7jjLBE8jN/0@public.gmane.org>



On 12/5/2016 7:41 AM, Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil@public.gmane.org wrote:
> Hi Moni,
>
> Does a rxe driver of vanilla 4.9-rc6 fine work?
> When the dd command is tested for a read and write, it'll be the following error.
>
> (read)
> # dd if=/dev/nvme0n1 of=<readfile>.bin bs=1024 count=10000 iflag=direct
>
> blk_update_request: I/O error, dev nvme0n1, sector 1860
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> or
>
> nvme nvme0: failed nvme_keep_alive_end_io error=16391
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> (write)
> # dd if=<writefile>.bin of=/dev/nvme0n1 bs=1024 count=10000 oflag=direct
>
> blk_update_request: I/O error, dev nvme0n1, sector 1860
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> or
>
> nvme nvme0: failed nvme_keep_alive_end_io error=16391
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> I'd like to investigate the root cause of this error, are there any ideas?

Hi Haruo,

can you try to repro it with iSER ?
what is your backing store device ?
is this happens in 1k bs only or in different bs as well ?

thanks,
Max.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH 28/28] bnxt_re: Add QP event handling
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Implements callback handler for processing affiliated Async events of a QP.
This patch also implements the control path command completion handling.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.c | 49 ++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.c b/drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.c
index 64077ae..6cbe472 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_rcfw.c
@@ -221,6 +221,46 @@ static int bnxt_qplib_process_func_event(struct bnxt_qplib_rcfw *rcfw,
 	return 0;
 }
 
+static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
+				       struct creq_qp_event *qp_event)
+{
+	struct bnxt_qplib_crsq *crsq = &rcfw->crsq;
+	struct bnxt_qplib_hwq *cmdq = &rcfw->cmdq;
+	struct bnxt_qplib_crsqe *crsqe;
+	u16 cbit, cookie, blocked = 0;
+	unsigned long flags;
+	u32 sw_cons;
+
+	switch (qp_event->event) {
+	case CREQ_QP_EVENT_EVENT_QP_ERROR_NOTIFICATION:
+		break;
+	default:
+	{
+		/* Command Response */
+		spin_lock_irqsave(&cmdq->lock, flags);
+		sw_cons = HWQ_CMP(crsq->cons, crsq);
+		crsqe = &crsq->crsq[sw_cons];
+		crsq->cons++;
+		memcpy(&crsqe->qp_event, qp_event, sizeof(crsqe->qp_event));
+
+		cookie = le16_to_cpu(crsqe->qp_event.cookie);
+		blocked = cookie & RCFW_CMD_IS_BLOCKING;
+		cookie &= RCFW_MAX_COOKIE_VALUE;
+		cbit = cookie % RCFW_MAX_OUTSTANDING_CMD;
+		if (!test_and_clear_bit(cbit, rcfw->cmdq_bitmap))
+			dev_warn(&rcfw->pdev->dev,
+				 "QPLIB: CMD bit %d was not requested", cbit);
+
+		cmdq->cons += crsqe->req_size;
+		spin_unlock_irqrestore(&cmdq->lock, flags);
+		if (!blocked)
+			wake_up(&rcfw->waitq);
+		break;
+	}
+	}
+	return 0;
+}
+
 /* SP - CREQ Completion handlers */
 static void bnxt_qplib_service_creq(unsigned long data)
 {
@@ -244,6 +284,15 @@ static void bnxt_qplib_service_creq(unsigned long data)
 		type = creqe->type & CREQ_BASE_TYPE_MASK;
 		switch (type) {
 		case CREQ_BASE_TYPE_QP_EVENT:
+			if (!bnxt_qplib_process_qp_event
+			    (rcfw, (struct creq_qp_event *)creqe))
+				rcfw->creq_qp_event_processed++;
+			else {
+				dev_warn(&rcfw->pdev->dev, "QPLIB: crsqe with");
+				dev_warn(&rcfw->pdev->dev,
+					 "QPLIB: type = 0x%x not handled",
+					 type);
+			}
 			break;
 		case CREQ_BASE_TYPE_FUNC_EVENT:
 			if (!bnxt_qplib_process_func_event
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 27/28] bnxt_re: Set uverbs command mask
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

This patch exports available uverbs command mask to the IB stack.
Also, populating some of the missing parameters in the ibdev structure
used for registration.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c | 36 +++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index 67b6a80..9f39600 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -36,6 +36,7 @@
 #include "bnxt_re.h"
 #include "bnxt_re_debugfs.h"
 #include "bnxt_re_ib_verbs.h"
+#include "bnxt_re_uverbs_abi.h"
 #include "bnxt.h"
 static char version[] =
 		BNXT_RE_DESC " v" ROCE_DRV_MODULE_VERSION "\n";
@@ -418,8 +419,42 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 		strlen(BNXT_RE_DESC) + 5);
 	ibdev->phys_port_cnt = 1;
 
+	bnxt_qplib_get_guid(rdev->netdev->dev_addr, (u8 *)&ibdev->node_guid);
+
 	ibdev->num_comp_vectors	= 1;
 	ibdev->dma_device = &rdev->en_dev->pdev->dev;
+	ibdev->local_dma_lkey = BNXT_QPLIB_RSVD_LKEY;
+
+	/* User space */
+	ibdev->uverbs_abi_ver = BNXT_RE_ABI_VERSION;
+	ibdev->uverbs_cmd_mask =
+			(1ull << IB_USER_VERBS_CMD_GET_CONTEXT)		|
+			(1ull << IB_USER_VERBS_CMD_QUERY_DEVICE)	|
+			(1ull << IB_USER_VERBS_CMD_QUERY_PORT)		|
+			(1ull << IB_USER_VERBS_CMD_ALLOC_PD)		|
+			(1ull << IB_USER_VERBS_CMD_DEALLOC_PD)		|
+			(1ull << IB_USER_VERBS_CMD_REG_MR)		|
+			(1ull << IB_USER_VERBS_CMD_REREG_MR)		|
+			(1ull << IB_USER_VERBS_CMD_DEREG_MR)		|
+			(1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) |
+			(1ull << IB_USER_VERBS_CMD_CREATE_CQ)		|
+			(1ull << IB_USER_VERBS_CMD_RESIZE_CQ)		|
+			(1ull << IB_USER_VERBS_CMD_DESTROY_CQ)		|
+			(1ull << IB_USER_VERBS_CMD_CREATE_QP)		|
+			(1ull << IB_USER_VERBS_CMD_MODIFY_QP)		|
+			(1ull << IB_USER_VERBS_CMD_QUERY_QP)		|
+			(1ull << IB_USER_VERBS_CMD_DESTROY_QP)		|
+			(1ull << IB_USER_VERBS_CMD_CREATE_SRQ)		|
+			(1ull << IB_USER_VERBS_CMD_MODIFY_SRQ)		|
+			(1ull << IB_USER_VERBS_CMD_QUERY_SRQ)		|
+			(1ull << IB_USER_VERBS_CMD_DESTROY_SRQ)		|
+			(1ull << IB_USER_VERBS_CMD_CREATE_AH)		|
+			(1ull << IB_USER_VERBS_CMD_MODIFY_AH)		|
+			(1ull << IB_USER_VERBS_CMD_QUERY_AH)		|
+			(1ull << IB_USER_VERBS_CMD_DESTROY_AH);
+	/* POLL_CQ and REQ_NOTIFY_CQ is directly handled in libbnxt_re */
+
+	/* Kernel verbs */
 	ibdev->query_device		= bnxt_re_query_device;
 	ibdev->modify_device		= bnxt_re_modify_device;
 
@@ -1046,6 +1081,7 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 		pr_err("Failed to register with IB: %#x\n", rc);
 		goto fail;
 	}
+	dev_info(rdev_to_dev(rdev), "Device registered successfully");
 	for (i = 0; i < ARRAY_SIZE(bnxt_re_attributes); i++) {
 		rc = device_create_file(&rdev->ibdev.dev,
 					bnxt_re_attributes[i]);
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 26/28] bnxt_re: Support debugfs
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

This patch exports some of the FW debug counters

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/Makefile          |   2 +-
 drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.c | 136 +++++++++++++++++++++++++
 drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.h |  25 +++++
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c    |   4 +
 4 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.c
 create mode 100644 drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.h

diff --git a/drivers/infiniband/hw/bnxtre/Makefile b/drivers/infiniband/hw/bnxtre/Makefile
index 71aa5a1..39df4f1 100644
--- a/drivers/infiniband/hw/bnxtre/Makefile
+++ b/drivers/infiniband/hw/bnxtre/Makefile
@@ -1,6 +1,6 @@
 
 ccflags-y := -Idrivers/net/ethernet/broadcom/bnxt
 obj-$(CONFIG_INFINIBAND_BNXTRE) += bnxt_re.o
-bnxt_re-y := bnxt_re_main.o bnxt_re_ib_verbs.o \
+bnxt_re-y := bnxt_re_main.o bnxt_re_ib_verbs.o bnxt_re_debugfs.o \
 	     bnxt_qplib_res.o bnxt_qplib_rcfw.o	\
 	     bnxt_qplib_sp.o bnxt_qplib_fp.o
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.c b/drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.c
new file mode 100644
index 0000000..a80a241
--- /dev/null
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.c
@@ -0,0 +1,136 @@
+
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * Copyright (c) 2014-2016 Broadcom Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ */
+
+/*
+ * Description: DebugFS specifics
+ */
+
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+#include <linux/netdevice.h>
+
+#include <rdma/ib_verbs.h>
+#include "bnxt_re_hsi.h"
+#include "bnxt_ulp.h"
+#include "bnxt_qplib_res.h"
+#include "bnxt_qplib_sp.h"
+#include "bnxt_qplib_fp.h"
+#include "bnxt_qplib_rcfw.h"
+
+#include "bnxt_re.h"
+#include "bnxt_re_debugfs.h"
+
+static struct dentry *bnxt_re_debugfs_root;
+static struct dentry *bnxt_re_debugfs_info;
+
+static ssize_t bnxt_re_debugfs_clear(struct file *fil, const char __user *u,
+				     size_t size, loff_t *off)
+{
+	return size;
+}
+
+static int bnxt_re_debugfs_show(struct seq_file *s, void *unused)
+{
+	struct bnxt_re_dev *rdev;
+
+	seq_puts(s, "bnxt_re debug info:\n");
+
+	mutex_lock(&bnxt_re_dev_lock);
+	list_for_each_entry(rdev, &bnxt_re_dev_list, list) {
+		struct ctx_hw_stats *stats = rdev->qplib_ctx.stats.dma;
+
+		seq_printf(s, "=====[ IBDEV %s ]=============================\n",
+			   rdev->ibdev.name);
+		if (rdev->netdev)
+			seq_printf(s, "\tlink state: %s\n",
+				   test_bit(__LINK_STATE_START,
+					    &rdev->netdev->state) ?
+				   (test_bit(__LINK_STATE_NOCARRIER,
+					     &rdev->netdev->state) ?
+				    "DOWN" : "UP") : "DOWN");
+		seq_printf(s, "\tMax QP: 0x%x\n", rdev->dev_attr.max_qp);
+		seq_printf(s, "\tMax SRQ: 0x%x\n", rdev->dev_attr.max_srq);
+		seq_printf(s, "\tMax CQ: 0x%x\n", rdev->dev_attr.max_cq);
+		seq_printf(s, "\tMax MR: 0x%x\n", rdev->dev_attr.max_mr);
+		seq_printf(s, "\tMax MW: 0x%x\n", rdev->dev_attr.max_mw);
+
+		seq_printf(s, "\tActive QP: %d\n",
+			   atomic_read(&rdev->qp_count));
+		seq_printf(s, "\tActive SRQ: %d\n",
+			   atomic_read(&rdev->srq_count));
+		seq_printf(s, "\tActive CQ: %d\n",
+			   atomic_read(&rdev->cq_count));
+		seq_printf(s, "\tActive MR: %d\n",
+			   atomic_read(&rdev->mr_count));
+		seq_printf(s, "\tActive MW: %d\n",
+			   atomic_read(&rdev->mw_count));
+		seq_printf(s, "\tRx Pkts: %lld\n",
+			   stats ? stats->rx_ucast_pkts : 0);
+		seq_printf(s, "\tRx Bytes: %lld\n",
+			   stats ? stats->rx_ucast_bytes : 0);
+		seq_printf(s, "\tTx Pkts: %lld\n",
+			   stats ? stats->tx_ucast_pkts : 0);
+		seq_printf(s, "\tTx Bytes: %lld\n",
+			   stats ? stats->tx_ucast_bytes : 0);
+		seq_printf(s, "\tRecoverable Errors: %lld\n",
+			   stats ? stats->tx_bcast_pkts : 0);
+		seq_puts(s, "\n");
+	}
+	mutex_unlock(&bnxt_re_dev_lock);
+	return 0;
+}
+
+static int bnxt_re_debugfs_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, bnxt_re_debugfs_show, NULL);
+}
+
+static int bnxt_re_debugfs_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations bnxt_re_dbg_ops = {
+	.owner		= THIS_MODULE,
+	.open		= bnxt_re_debugfs_open,
+	.read		= seq_read,
+	.write		= bnxt_re_debugfs_clear,
+	.llseek		= seq_lseek,
+	.release	= bnxt_re_debugfs_release,
+};
+
+void bnxt_re_debugfs_remove(void)
+{
+	debugfs_remove_recursive(bnxt_re_debugfs_root);
+	bnxt_re_debugfs_root = NULL;
+}
+
+void bnxt_re_debugfs_init(void)
+{
+	bnxt_re_debugfs_root = debugfs_create_dir(ROCE_DRV_MODULE_NAME, NULL);
+	if (IS_ERR_OR_NULL(bnxt_re_debugfs_root)) {
+		dev_dbg(NULL, "%s: Unable to create debugfs root directory ",
+			ROCE_DRV_MODULE_NAME);
+		dev_dbg(NULL, "with err 0x%lx", PTR_ERR(bnxt_re_debugfs_root));
+		return;
+	}
+	bnxt_re_debugfs_info = debugfs_create_file("info", 0400,
+						   bnxt_re_debugfs_root, NULL,
+						   &bnxt_re_dbg_ops);
+	if (IS_ERR_OR_NULL(bnxt_re_debugfs_info)) {
+		dev_dbg(NULL, "%s: Unable to create debugfs info node ",
+			ROCE_DRV_MODULE_NAME);
+		dev_dbg(NULL, "with err 0x%lx", PTR_ERR(bnxt_re_debugfs_info));
+		bnxt_re_debugfs_remove();
+	}
+}
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.h b/drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.h
new file mode 100644
index 0000000..4089fa5
--- /dev/null
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_debugfs.h
@@ -0,0 +1,25 @@
+
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * Copyright (c) 2014-2016 Broadcom Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ */
+
+/*
+ * Description: DebugFS header
+ */
+
+#ifndef __BNXT_RE_DEBUGFS__
+#define __BNXT_RE_DEBUGFS__
+
+extern struct list_head bnxt_re_dev_list;
+extern struct mutex bnxt_re_dev_lock;
+
+void bnxt_re_debugfs_init(void);
+void bnxt_re_debugfs_remove(void);
+
+#endif
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index ab70d23..67b6a80 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -34,6 +34,7 @@
 #include "bnxt_qplib_fp.h"
 #include "bnxt_qplib_rcfw.h"
 #include "bnxt_re.h"
+#include "bnxt_re_debugfs.h"
 #include "bnxt_re_ib_verbs.h"
 #include "bnxt.h"
 static char version[] =
@@ -1252,6 +1253,7 @@ static int __init bnxt_re_mod_init(void)
 	if (!bnxt_re_wq)
 		return -ENOMEM;
 
+	bnxt_re_debugfs_init();
 	INIT_LIST_HEAD(&bnxt_re_dev_list);
 
 	rc = register_netdevice_notifier(&bnxt_re_netdev_notifier);
@@ -1263,6 +1265,7 @@ static int __init bnxt_re_mod_init(void)
 	return 0;
 
 err_netdev:
+	bnxt_re_debugfs_remove();
 	destroy_workqueue(bnxt_re_wq);
 
 	return rc;
@@ -1291,6 +1294,7 @@ static void __exit bnxt_re_mod_exit(void)
 	}
 
 	unregister_netdevice_notifier(&bnxt_re_netdev_notifier);
+	bnxt_re_debugfs_remove();
 	if (bnxt_re_wq)
 		destroy_workqueue(bnxt_re_wq);
 }
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 25/28] bnxt_re: Support for DCB
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

This patch queries the configured RoCE APP Priority on the host
using the dcbnl API and programs the RoCE FW with the corresponding
Traffic Class(es) for the priority.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.h |   3 +-
 drivers/infiniband/hw/bnxtre/bnxt_re.h       |   6 ++
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c  | 134 +++++++++++++++++++++++++++
 3 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.h b/drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.h
index c299c02..d86ea48 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_sp.h
@@ -130,4 +130,5 @@ int bnxt_qplib_alloc_fast_reg_page_list(struct bnxt_qplib_res *res,
 					struct bnxt_qplib_frpl *frpl, int max);
 int bnxt_qplib_free_fast_reg_page_list(struct bnxt_qplib_res *res,
 				       struct bnxt_qplib_frpl *frpl);
-#endif /* __BNXT_QPLIB_SP_H__*/
+int bnxt_qplib_map_tc2cos(struct bnxt_qplib_res *res, u16 *cids);
+#endif
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re.h b/drivers/infiniband/hw/bnxtre/bnxt_re.h
index 2a49d10..f27acca 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re.h
@@ -16,6 +16,9 @@
 #define BNXT_RE_REF_WAIT_COUNT		10
 #define BNXT_RE_DESC	"Broadcom NetXtreme-C/E RoCE Driver"
 
+#define BNXT_RE_ROCE_V1_ETH_TYPE	0x8915
+#define BNXT_RE_ROCE_V2_PORT_NO		4791
+
 #define BNXT_RE_PAGE_SIZE_4K		BIT(12)
 #define BNXT_RE_PAGE_SIZE_8K		BIT(13)
 #define BNXT_RE_PAGE_SIZE_64K		BIT(16)
@@ -66,6 +69,9 @@ struct bnxt_re_dev {
 
 	int				id;
 
+	struct delayed_work		worker;
+	u8				cur_prio_map;
+
 	/* FP Notification Queue (CQ & SRQ) */
 	struct tasklet_struct		nq_task;
 
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index 3f55548..ab70d23 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -728,6 +728,45 @@ static void bnxt_re_dispatch_event(struct ib_device *ibdev, struct ib_qp *qp,
 	ib_dispatch_event(&ib_event);
 }
 
+#define HWRM_QUEUE_PRI2COS_QCFG_INPUT_FLAGS_IVLAN      0x02
+int bnxt_re_query_hwrm_pri2cos(struct bnxt_re_dev *rdev, u8 prio_mask, u8 dir,
+			       u64 *cid_map)
+{
+	struct hwrm_queue_pri2cos_qcfg_input req = {0};
+	struct bnxt *bp = netdev_priv(rdev->netdev);
+	struct hwrm_queue_pri2cos_qcfg_output resp;
+	struct bnxt_en_dev *en_dev = rdev->en_dev;
+	struct bnxt_fw_msg fw_msg = {0};
+	u32 flags = 0;
+	u8 *cidmap;
+	int rc = 0;
+
+	bnxt_re_init_hwrm_hdr(rdev, (void *)&req,
+			      HWRM_QUEUE_PRI2COS_QCFG, -1, -1);
+	flags |= (dir & 0x01);
+	flags |= HWRM_QUEUE_PRI2COS_QCFG_INPUT_FLAGS_IVLAN;
+	req.flags = cpu_to_le32(flags);
+	req.port_id = bp->pf.port_id;
+
+	bnxt_re_fill_fw_msg(&fw_msg, (void *)&req, sizeof(req), (void *)&resp,
+			    sizeof(resp), DFLT_HWRM_CMD_TIMEOUT);
+	rc = en_dev->en_ops->bnxt_send_fw_msg(en_dev, BNXT_ROCE_ULP, &fw_msg);
+	if (rc)
+		return rc;
+
+	if (resp.queue_cfg_info) {
+		dev_warn(rdev_to_dev(rdev),
+			 "Asymmetric cos queue configuration detected");
+		dev_warn(rdev_to_dev(rdev),
+			 " on device, QoS may not be fully functional\n");
+	}
+	cidmap = &resp.pri0_cos_queue_id;
+	if (cid_map)
+		*cid_map = le64_to_cpu(*((u64 *)cidmap));
+
+	return rc;
+}
+
 static bool bnxt_re_is_qp1_or_shadow_qp(struct bnxt_re_dev *rdev,
 					struct bnxt_re_qp *qp)
 {
@@ -768,6 +807,80 @@ static void bnxt_re_dev_stop(struct bnxt_re_dev *rdev, bool qp_wait)
 	}
 }
 
+static u32 bnxt_re_get_priority_mask(struct bnxt_re_dev *rdev)
+{
+	u32 prio_map = 0, tmp_map = 0;
+	struct net_device *netdev;
+	struct dcb_app app;
+
+	netdev = rdev->netdev;
+
+	memset(&app, 0, sizeof(app));
+	app.selector = IEEE_8021QAZ_APP_SEL_ETHERTYPE;
+	app.protocol = BNXT_RE_ROCE_V1_ETH_TYPE;
+	tmp_map = dcb_ieee_getapp_mask(netdev, &app);
+	prio_map = tmp_map;
+
+	app.selector = IEEE_8021QAZ_APP_SEL_DGRAM;
+	app.protocol = BNXT_RE_ROCE_V2_PORT_NO;
+	tmp_map = dcb_ieee_getapp_mask(netdev, &app);
+	prio_map |= tmp_map;
+
+	if (!prio_map)
+		prio_map = -EFAULT;
+	return prio_map;
+}
+
+static void bnxt_re_parse_cid_map(u8 prio_map, u8 *cid_map, u16 *cosq)
+{
+	u16 prio;
+	u8 id;
+
+	for (prio = 0, id = 0; prio < 8; prio++) {
+		if (prio_map & (1 << prio)) {
+			cosq[id] = cid_map[prio];
+			id++;
+			if (id == 2) /* Max 2 tcs supported */
+				break;
+		}
+	}
+}
+
+static int bnxt_re_setup_qos(struct bnxt_re_dev *rdev)
+{
+	u8 prio_map = 0;
+	u64 cid_map;
+	int rc;
+
+	/* Get priority for roce */
+	rc = bnxt_re_get_priority_mask(rdev);
+	if (rc < 0)
+		return rc;
+	prio_map = (u8)rc;
+
+	if (prio_map == rdev->cur_prio_map)
+		return 0;
+	rdev->cur_prio_map = prio_map;
+	/* Get cosq id for this priority */
+	rc = bnxt_re_query_hwrm_pri2cos(rdev, prio_map, 0, &cid_map);
+	if (rc) {
+		dev_warn(rdev_to_dev(rdev), "no cos for p_mask %x\n", prio_map);
+		return rc;
+	}
+	/* Parse CoS IDs for app priority */
+	bnxt_re_parse_cid_map(prio_map, (u8 *)&cid_map, rdev->cosq);
+
+	/* Config BONO. */
+	rc = bnxt_qplib_map_tc2cos(&rdev->qplib_res, rdev->cosq);
+	if (rc) {
+		dev_warn(rdev_to_dev(rdev), "no tc for cos{%x, %x}\n",
+			 rdev->cosq[0], rdev->cosq[1]);
+		return rc;
+	}
+
+	return 0;
+}
+
 static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
 {
 	int i, rc;
@@ -779,6 +892,9 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
 		/* Cleanup ib dev */
 		bnxt_re_unregister_ib(rdev);
 	}
+	if (test_and_clear_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags))
+		cancel_delayed_work(&rdev->worker);
+
 	bnxt_re_cleanup_res(rdev);
 	bnxt_re_free_res(rdev, lock_wait);
 
@@ -821,6 +937,16 @@ static void bnxt_re_set_resource_limits(struct bnxt_re_dev *rdev)
 		rdev->dev_attr.tqm_alloc_reqs[i];
 }
 
+/* worker thread for polling periodic events. Now used for QoS programming*/
+static void bnxt_re_worker(struct work_struct *work)
+{
+	struct bnxt_re_dev *rdev = container_of(work, struct bnxt_re_dev,
+						worker.work);
+
+	bnxt_re_setup_qos(rdev);
+	schedule_delayed_work(&rdev->worker, msecs_to_jiffies(30000));
+}
+
 static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 {
 	int i, j, rc;
@@ -905,6 +1031,14 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 		goto fail;
 	}
 
+	rc = bnxt_re_setup_qos(rdev);
+	if (rc)
+		pr_info("RoCE priority not yet configured\n");
+
+	INIT_DELAYED_WORK(&rdev->worker, bnxt_re_worker);
+	set_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags);
+	schedule_delayed_work(&rdev->worker, msecs_to_jiffies(30000));
+
 	/* Register ib dev */
 	rc = bnxt_re_register_ib(rdev);
 	if (rc) {
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 24/28] bnxt_re: Handling dispatching of events to IB stack and cleanup during unload
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

This patch implements handling dispatch of appropriate event to the IB stack
based on NETDEV events received.
Also implements cleanup of the resources during driver unload.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c | 77 +++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index a138741..3f55548 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -714,6 +714,60 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 	return rc;
 }
 
+static void bnxt_re_dispatch_event(struct ib_device *ibdev, struct ib_qp *qp,
+				   u8 port_num, enum ib_event_type event)
+{
+	struct ib_event ib_event;
+
+	ib_event.device = ibdev;
+	if (qp)
+		ib_event.element.qp = qp;
+	else
+		ib_event.element.port_num = port_num;
+	ib_event.event = event;
+	ib_dispatch_event(&ib_event);
+}
+
+static bool bnxt_re_is_qp1_or_shadow_qp(struct bnxt_re_dev *rdev,
+					struct bnxt_re_qp *qp)
+{
+	return (qp->ib_qp.qp_type == IB_QPT_GSI) || (qp == rdev->qp1_sqp);
+}
+
+static void bnxt_re_dev_stop(struct bnxt_re_dev *rdev, bool qp_wait)
+{
+	int mask = IB_QP_STATE, qp_count, count = 1;
+	struct ib_qp_attr qp_attr;
+	struct bnxt_re_qp *qp;
+
+	qp_attr.qp_state = IB_QPS_ERR;
+	mutex_lock(&rdev->qp_lock);
+	list_for_each_entry(qp, &rdev->qp_list, list) {
+		/* Modify the state of all QPs except QP1/Shadow QP */
+		if (qp && !bnxt_re_is_qp1_or_shadow_qp(rdev, qp)) {
+			if (qp->qplib_qp.state !=
+			    CMDQ_MODIFY_QP_NEW_STATE_RESET ||
+			    qp->qplib_qp.state !=
+			    CMDQ_MODIFY_QP_NEW_STATE_ERR) {
+				bnxt_re_dispatch_event(&rdev->ibdev, &qp->ib_qp,
+						       1, IB_EVENT_QP_FATAL);
+				bnxt_re_modify_qp(&qp->ib_qp, &qp_attr, mask,
+						  NULL);
+			}
+		}
+	}
+
+	mutex_unlock(&rdev->qp_lock);
+	if (qp_wait) {
+		/* Give the application some time to clean up */
+		do {
+			qp_count = atomic_read(&rdev->qp_count);
+			msleep(100);
+		} while ((qp_count != atomic_read(&rdev->qp_count)) &&
+			  count--);
+	}
+}
+
 static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
 {
 	int i, rc;
@@ -872,6 +926,8 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 		}
 	}
 	set_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
+	bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_PORT_ACTIVE);
+	bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_GID_CHANGE);
 
 	return 0;
 free_sctx:
@@ -950,10 +1006,18 @@ static void bnxt_re_task(struct work_struct *work)
 				"Failed to register with IB: %#x", rc);
 			break;
 	case NETDEV_UP:
+		bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1,
+				       IB_EVENT_PORT_ACTIVE);
 		break;
 	case NETDEV_DOWN:
+		bnxt_re_dev_stop(rdev, false);
 		break;
 	case NETDEV_CHANGE:
+		if (!netif_carrier_ok(rdev->netdev))
+			bnxt_re_dev_stop(rdev, false);
+		else if (netif_carrier_ok(rdev->netdev))
+			bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1,
+					       IB_EVENT_PORT_ACTIVE);
 		break;
 	default:
 		break;
@@ -1071,6 +1135,7 @@ static int __init bnxt_re_mod_init(void)
 }
 static void __exit bnxt_re_mod_exit(void)
 {
+	struct bnxt_re_dev *rdev, *next;
 	LIST_HEAD(to_be_deleted);
 
 	/* Free all adapter allocated resources */
@@ -1079,6 +1144,18 @@ static void __exit bnxt_re_mod_exit(void)
 		list_splice_init_rcu(&bnxt_re_dev_list, &to_be_deleted,
 				     synchronize_rcu);
 	mutex_unlock(&bnxt_re_dev_lock);
+
+	/* Can use the new list without protection */
+	/* Cleanup the devices in reverse order so that the VF device
+	 * cleanup is done before PF cleanup
+	 */
+	list_for_each_entry_safe_reverse(rdev, next, &to_be_deleted, list) {
+		bnxt_re_dev_stop(rdev, true);
+		bnxt_re_ib_unreg(rdev, true);
+		bnxt_re_remove_one(rdev);
+		bnxt_re_dev_unreg(rdev);
+	}
+
 	unregister_netdevice_notifier(&bnxt_re_netdev_notifier);
 	if (bnxt_re_wq)
 		destroy_workqueue(bnxt_re_wq);
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 23/28] bnxt_re: Support poll_cq verb
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Enables the fastpath ib_poll_cq verb.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c    | 553 +++++++++++++++++++++++-
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h    |   7 +-
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c | 519 ++++++++++++++++++++++
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h |   1 +
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c     |  22 +-
 5 files changed, 1097 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
index 9be1f39..6ef9761 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
@@ -185,7 +185,7 @@ void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq)
 int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
 			 int msix_vector, int bar_reg_offset,
 			 int (*cqn_handler)(struct bnxt_qplib_nq *nq,
-					    void *),
+					    struct bnxt_qplib_cq *),
 			 int (*srqn_handler)(struct bnxt_qplib_nq *nq,
 					     void *, u8 event))
 {
@@ -1583,6 +1583,557 @@ int bnxt_qplib_destroy_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
 	return 0;
 }
 
+static int __flush_sq(struct bnxt_qplib_q *sq, struct bnxt_qplib_qp *qp,
+		      struct bnxt_qplib_cqe **pcqe, int *budget)
+{
+	u32 sw_prod, sw_cons;
+	struct bnxt_qplib_cqe *cqe;
+	int rc = 0;
+
+	/* Now complete all outstanding SQEs with FLUSHED_ERR */
+	sw_prod = HWQ_CMP(sq->hwq.prod, &sq->hwq);
+	cqe = *pcqe;
+	while (*budget) {
+		sw_cons = HWQ_CMP(sq->hwq.cons, &sq->hwq);
+		if (sw_cons == sw_prod) {
+			sq->flush_in_progress = false;
+			break;
+		}
+		memset(cqe, 0, sizeof(*cqe));
+		cqe->status = CQ_REQ_STATUS_WORK_REQUEST_FLUSHED_ERR;
+		cqe->opcode = CQ_BASE_CQE_TYPE_REQ;
+		cqe->qp_handle = (u64)qp;
+		cqe->wr_id = sq->swq[sw_cons].wr_id;
+		cqe->src_qp = qp->id;
+		cqe->type = sq->swq[sw_cons].type;
+		cqe++;
+		(*budget)--;
+		sq->hwq.cons++;
+	}
+	*pcqe = cqe;
+	if (!budget && HWQ_CMP(sq->hwq.cons, &sq->hwq) != sw_prod)
+		/* Out of budget */
+		rc = -EAGAIN;
+
+	return rc;
+}
+
+static int __flush_rq(struct bnxt_qplib_q *rq, struct bnxt_qplib_qp *qp,
+		      int opcode, struct bnxt_qplib_cqe **pcqe, int *budget)
+{
+	struct bnxt_qplib_cqe *cqe;
+	u32 sw_prod, sw_cons;
+	int rc = 0;
+
+	/* Flush the rest of the RQ */
+	sw_prod = HWQ_CMP(rq->hwq.prod, &rq->hwq);
+	cqe = *pcqe;
+	while (*budget) {
+		sw_cons = HWQ_CMP(rq->hwq.cons, &rq->hwq);
+		if (sw_cons == sw_prod)
+			break;
+		memset(cqe, 0, sizeof(*cqe));
+		cqe->status =
+		    CQ_RES_RC_STATUS_WORK_REQUEST_FLUSHED_ERR;
+		cqe->opcode = opcode;
+		cqe->qp_handle = (u64)qp;
+		cqe->wr_id = rq->swq[sw_cons].wr_id;
+		cqe++;
+		(*budget)--;
+		rq->hwq.cons++;
+	}
+	*pcqe = cqe;
+	if (!*budget && HWQ_CMP(rq->hwq.cons, &rq->hwq) != sw_prod)
+		/* Out of budget */
+		rc = -EAGAIN;
+
+	return rc;
+}
+
+static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
+				     struct cq_req *hwcqe,
+				     struct bnxt_qplib_cqe **pcqe, int *budget)
+{
+	struct bnxt_qplib_qp *qp;
+	struct bnxt_qplib_q *sq;
+	struct bnxt_qplib_cqe *cqe;
+	u32 sw_cons, cqe_cons;
+	int rc = 0;
+
+	qp = (struct bnxt_qplib_qp *)le64_to_cpu(hwcqe->qp_handle);
+	if (!qp) {
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: FP: Process Req qp is NULL");
+		return -EINVAL;
+	}
+	sq = &qp->sq;
+
+	cqe_cons = HWQ_CMP(le16_to_cpu(hwcqe->sq_cons_idx), &sq->hwq);
+	if (cqe_cons > sq->hwq.max_elements) {
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: FP: CQ Process req reported ");
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: sq_cons_idx 0x%x which exceeded max 0x%x",
+			cqe_cons, sq->hwq.max_elements);
+		return -EINVAL;
+	}
+	/* If we were in the middle of flushing the SQ, continue */
+	if (sq->flush_in_progress)
+		goto flush;
+
+	/* Require to walk the sq's swq to fabricate CQEs for all previously
+	 * signaled SWQEs due to CQE aggregation from the current sq cons
+	 * to the cqe_cons
+	 */
+	cqe = *pcqe;
+	while (*budget) {
+		sw_cons = HWQ_CMP(sq->hwq.cons, &sq->hwq);
+		if (sw_cons == cqe_cons)
+			break;
+		memset(cqe, 0, sizeof(*cqe));
+		cqe->opcode = CQ_BASE_CQE_TYPE_REQ;
+		cqe->qp_handle = (u64)qp;
+		cqe->src_qp = qp->id;
+		cqe->wr_id = sq->swq[sw_cons].wr_id;
+		cqe->type = sq->swq[sw_cons].type;
+
+		/* For the last CQE, check for status.  For errors, regardless
+		 * of the request being signaled or not, it must complete with
+		 * the hwcqe error status
+		 */
+		if (HWQ_CMP((sw_cons + 1), &sq->hwq) == cqe_cons &&
+		    hwcqe->status != CQ_REQ_STATUS_OK) {
+			cqe->status = hwcqe->status;
+			dev_err(&cq->hwq.pdev->dev,
+				"QPLIB: FP: CQ Processed Req ");
+			dev_err(&cq->hwq.pdev->dev,
+				"QPLIB: wr_id[%d] = 0x%llx with status 0x%x",
+				sw_cons, cqe->wr_id, cqe->status);
+			cqe++;
+			(*budget)--;
+			sq->flush_in_progress = true;
+			/* Must block new posting of SQ and RQ */
+			qp->state = CMDQ_MODIFY_QP_NEW_STATE_ERR;
+		} else {
+			if (sq->swq[sw_cons].flags &
+			    SQ_SEND_FLAGS_SIGNAL_COMP) {
+				cqe->status = CQ_REQ_STATUS_OK;
+				cqe++;
+				(*budget)--;
+			}
+		}
+		sq->hwq.cons++;
+	}
+	*pcqe = cqe;
+	if (!*budget && HWQ_CMP(sq->hwq.cons, &sq->hwq) != cqe_cons) {
+		/* Out of budget */
+		rc = -EAGAIN;
+		goto done;
+	}
+	if (!sq->flush_in_progress)
+		goto done;
+flush:
+	/* Require to walk the sq's swq to fabricate CQEs for all
+	 * previously posted SWQEs due to the error CQE received
+	 */
+	rc = __flush_sq(sq, qp, pcqe, budget);
+	if (!rc)
+		sq->flush_in_progress = false;
+done:
+	return rc;
+}
+
+static int bnxt_qplib_cq_process_res_rc(struct bnxt_qplib_cq *cq,
+					struct cq_res_rc *hwcqe,
+					struct bnxt_qplib_cqe **pcqe,
+					int *budget)
+{
+	struct bnxt_qplib_qp *qp;
+	struct bnxt_qplib_q *rq;
+	struct bnxt_qplib_cqe *cqe;
+	u64 wr_id_idx;
+	int rc = 0;
+
+	qp = (struct bnxt_qplib_qp *)le64_to_cpu(hwcqe->qp_handle);
+	if (!qp) {
+		dev_err(&cq->hwq.pdev->dev, "QPLIB: process_cq RC qp is NULL");
+		return -EINVAL;
+	}
+	cqe = *pcqe;
+	cqe->opcode = hwcqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK;
+	cqe->length = le32_to_cpu(hwcqe->length);
+	cqe->immdata_or_invrkey = le32_to_cpu(hwcqe->imm_data_or_inv_r_key);
+	cqe->mr_handle = le64_to_cpu(hwcqe->mr_handle);
+	cqe->flags = le16_to_cpu(hwcqe->flags);
+	cqe->status = hwcqe->status;
+	cqe->qp_handle = (u64)qp;
+
+	wr_id_idx = le64_to_cpu(hwcqe->srq_or_rq_wr_id &
+				CQ_RES_RC_SRQ_OR_RQ_WR_ID_MASK);
+	rq = &qp->rq;
+	if (wr_id_idx > rq->hwq.max_elements) {
+		dev_err(&cq->hwq.pdev->dev, "QPLIB: FP: CQ Process RC ");
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: wr_id idx 0x%llx exceeded RQ max 0x%x",
+			wr_id_idx, rq->hwq.max_elements);
+		return -EINVAL;
+	}
+	if (rq->flush_in_progress)
+		goto flush_rq;
+
+	cqe->wr_id = rq->swq[wr_id_idx].wr_id;
+	cqe++;
+	(*budget)--;
+	rq->hwq.cons++;
+	*pcqe = cqe;
+
+	if (hwcqe->status != CQ_RES_RC_STATUS_OK) {
+		rq->flush_in_progress = true;
+flush_rq:
+		rc = __flush_rq(rq, qp, CQ_BASE_CQE_TYPE_RES_RC, pcqe, budget);
+		if (!rc)
+			rq->flush_in_progress = false;
+	}
+	return rc;
+}
+
+static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq,
+					struct cq_res_ud *hwcqe,
+					struct bnxt_qplib_cqe **pcqe,
+					int *budget)
+{
+	struct bnxt_qplib_qp *qp;
+	struct bnxt_qplib_q *rq;
+	struct bnxt_qplib_cqe *cqe;
+	u64 wr_id_idx;
+	int rc = 0;
+
+	qp = (struct bnxt_qplib_qp *)le64_to_cpu(hwcqe->qp_handle);
+	if (!qp) {
+		dev_err(&cq->hwq.pdev->dev, "QPLIB: process_cq UD qp is NULL");
+		return -EINVAL;
+	}
+	cqe = *pcqe;
+	cqe->opcode = hwcqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK;
+	cqe->length = le32_to_cpu(hwcqe->length);
+	cqe->immdata_or_invrkey = le32_to_cpu(hwcqe->imm_data);
+	cqe->flags = le16_to_cpu(hwcqe->flags);
+	cqe->status = hwcqe->status;
+	cqe->qp_handle = (u64)qp;
+	memcpy(cqe->smac, hwcqe->src_mac, 6);
+	wr_id_idx = le64_to_cpu(hwcqe->src_qp_high_srq_or_rq_wr_id
+				& CQ_RES_UD_SRQ_OR_RQ_WR_ID_MASK);
+	cqe->src_qp = le16_to_cpu(hwcqe->src_qp_low) |
+				(hwcqe->src_qp_high_srq_or_rq_wr_id &
+				 CQ_RES_UD_SRC_QP_HIGH_MASK >> 8);
+
+	rq = &qp->rq;
+	if (wr_id_idx > rq->hwq.max_elements) {
+		dev_err(&cq->hwq.pdev->dev, "QPLIB: FP: CQ Process UD ");
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: wr_id idx 0x%llx exceeded RQ max 0x%x",
+			wr_id_idx, rq->hwq.max_elements);
+			return -EINVAL;
+	}
+	if (rq->flush_in_progress)
+		goto flush_rq;
+
+	cqe->wr_id = rq->swq[wr_id_idx].wr_id;
+	cqe++;
+	(*budget)--;
+	rq->hwq.cons++;
+	*pcqe = cqe;
+
+	if (hwcqe->status != CQ_RES_RC_STATUS_OK) {
+		rq->flush_in_progress = true;
+flush_rq:
+		rc = __flush_rq(rq, qp, CQ_BASE_CQE_TYPE_RES_UD, pcqe, budget);
+		if (!rc)
+			rq->flush_in_progress = false;
+	}
+	return rc;
+}
+
+static int bnxt_qplib_cq_process_res_raweth_qp1(struct bnxt_qplib_cq *cq,
+						struct cq_res_raweth_qp1 *hwcqe,
+						struct bnxt_qplib_cqe **pcqe,
+						int *budget)
+{
+	struct bnxt_qplib_qp *qp;
+	struct bnxt_qplib_q *rq;
+	struct bnxt_qplib_cqe *cqe;
+	u64 wr_id_idx;
+	int rc = 0;
+
+	qp = (struct bnxt_qplib_qp *)le64_to_cpu(hwcqe->qp_handle);
+	if (!qp) {
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: process_cq Raw/QP1 qp is NULL");
+		return -EINVAL;
+	}
+	cqe = *pcqe;
+	cqe->opcode = hwcqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK;
+	cqe->flags = le16_to_cpu(hwcqe->flags);
+	cqe->qp_handle = (u64)qp;
+
+	wr_id_idx = le64_to_cpu(hwcqe->raweth_qp1_payload_offset_srq_or_rq_wr_id
+				& CQ_RES_RAWETH_QP1_SRQ_OR_RQ_WR_ID_MASK);
+	cqe->src_qp = qp->id;
+	if (qp->id == 1 && !cqe->length) {
+		/* Add workaround for the length misdetection */
+		cqe->length = 296;
+	} else {
+		cqe->length = le16_to_cpu(hwcqe->length);
+	}
+	cqe->pkey_index = qp->pkey_index;
+	memcpy(cqe->smac, qp->smac, 6);
+
+	cqe->raweth_qp1_flags = le16_to_cpu(hwcqe->raweth_qp1_flags);
+	cqe->raweth_qp1_flags2 = le16_to_cpu(hwcqe->raweth_qp1_flags2);
+
+	rq = &qp->rq;
+	if (wr_id_idx > rq->hwq.max_elements) {
+		dev_err(&cq->hwq.pdev->dev, "QPLIB: FP: CQ Process Raw/QP1 RQ wr_id ");
+		dev_err(&cq->hwq.pdev->dev, "QPLIB: ix 0x%llx exceeded RQ max 0x%x",
+			wr_id_idx, rq->hwq.max_elements);
+		return -EINVAL;
+	}
+	if (rq->flush_in_progress)
+		goto flush_rq;
+
+	cqe->wr_id = rq->swq[wr_id_idx].wr_id;
+	cqe++;
+	(*budget)--;
+	rq->hwq.cons++;
+	*pcqe = cqe;
+
+	if (hwcqe->status != CQ_RES_RC_STATUS_OK) {
+		rq->flush_in_progress = true;
+flush_rq:
+		rc = __flush_rq(rq, qp, CQ_BASE_CQE_TYPE_RES_RAWETH_QP1, pcqe,
+				budget);
+		if (!rc)
+			rq->flush_in_progress = false;
+	}
+	return rc;
+}
+
+static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq,
+					  struct cq_terminal *hwcqe,
+					  struct bnxt_qplib_cqe **pcqe,
+					  int *budget)
+{
+	struct bnxt_qplib_qp *qp;
+	struct bnxt_qplib_q *sq, *rq;
+	struct bnxt_qplib_cqe *cqe;
+	u32 sw_cons, cqe_cons;
+	int rc = 0;
+	u8 opcode = 0;
+
+	/* Check the Status */
+	if (hwcqe->status != CQ_TERMINAL_STATUS_OK)
+		dev_warn(&cq->hwq.pdev->dev,
+			 "QPLIB: FP: CQ Process Terminal Error status = 0x%x",
+			 hwcqe->status);
+
+	qp = (struct bnxt_qplib_qp *)le64_to_cpu(hwcqe->qp_handle);
+	if (!qp) {
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: FP: CQ Process terminal qp is NULL");
+		return -EINVAL;
+	}
+	/* Must block new posting of SQ and RQ */
+	qp->state = CMDQ_MODIFY_QP_NEW_STATE_ERR;
+
+	sq = &qp->sq;
+	rq = &qp->rq;
+
+	cqe_cons = le16_to_cpu(hwcqe->sq_cons_idx);
+	if (cqe_cons == 0xFFFF)
+		goto do_rq;
+
+	if (cqe_cons > sq->hwq.max_elements) {
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: FP: CQ Process terminal reported ");
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: sq_cons_idx 0x%x which exceeded max 0x%x",
+			cqe_cons, sq->hwq.max_elements);
+		goto do_rq;
+	}
+	/* If we were in the middle of flushing, continue */
+	if (sq->flush_in_progress)
+		goto flush_sq;
+
+	/* Terminal CQE can also include aggregated successful CQEs prior.
+	 * So we must complete all CQEs from the current sq's cons to the
+	 * cq_cons with status OK
+	 */
+	cqe = *pcqe;
+	while (*budget) {
+		sw_cons = HWQ_CMP(sq->hwq.cons, &sq->hwq);
+		if (sw_cons == cqe_cons)
+			break;
+		if (sq->swq[sw_cons].flags & SQ_SEND_FLAGS_SIGNAL_COMP) {
+			memset(cqe, 0, sizeof(*cqe));
+			cqe->status = CQ_REQ_STATUS_OK;
+			cqe->opcode = CQ_BASE_CQE_TYPE_REQ;
+			cqe->qp_handle = (u64)qp;
+			cqe->src_qp = qp->id;
+			cqe->wr_id = sq->swq[sw_cons].wr_id;
+			cqe->type = sq->swq[sw_cons].type;
+			cqe++;
+			(*budget)--;
+		}
+		sq->hwq.cons++;
+	}
+	*pcqe = cqe;
+	if (!budget && sw_cons != cqe_cons) {
+		/* Out of budget */
+		rc = -EAGAIN;
+		goto sq_done;
+	}
+	sq->flush_in_progress = true;
+flush_sq:
+	rc = __flush_sq(sq, qp, pcqe, budget);
+	if (!rc)
+		sq->flush_in_progress = false;
+sq_done:
+	if (rc)
+		return rc;
+do_rq:
+	cqe_cons = le16_to_cpu(hwcqe->rq_cons_idx);
+	if (cqe_cons == 0xFFFF) {
+		goto done;
+	} else if (cqe_cons > rq->hwq.max_elements) {
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: FP: CQ Processed terminal ");
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: reported rq_cons_idx 0x%x exceeds max 0x%x",
+			cqe_cons, rq->hwq.max_elements);
+		goto done;
+	}
+	/* Terminal CQE requires all posted RQEs to complete with FLUSHED_ERR
+	 * from the current rq->cons to the rq->prod regardless what the
+	 * rq->cons the terminal CQE indicates
+	 */
+	rq->flush_in_progress = true;
+	switch (qp->type) {
+	case CMDQ_CREATE_QP1_TYPE_GSI:
+		opcode = CQ_BASE_CQE_TYPE_RES_RAWETH_QP1;
+		break;
+	case CMDQ_CREATE_QP_TYPE_RC:
+		opcode = CQ_BASE_CQE_TYPE_RES_RC;
+		break;
+	case CMDQ_CREATE_QP_TYPE_UD:
+		opcode = CQ_BASE_CQE_TYPE_RES_UD;
+		break;
+	}
+
+	rc = __flush_rq(rq, qp, opcode, pcqe, budget);
+	if (!rc)
+		rq->flush_in_progress = false;
+done:
+	return rc;
+}
+
+static int bnxt_qplib_cq_process_cutoff(struct bnxt_qplib_cq *cq,
+					struct cq_cutoff *hwcqe)
+{
+	/* Check the Status */
+	if (hwcqe->status != CQ_CUTOFF_STATUS_OK) {
+		dev_err(&cq->hwq.pdev->dev,
+			"QPLIB: FP: CQ Process Cutoff Error status = 0x%x",
+			hwcqe->status);
+		return -EINVAL;
+	}
+	clear_bit(CQ_FLAGS_RESIZE_IN_PROG, &cq->flags);
+	wake_up_interruptible(&cq->waitq);
+
+	return 0;
+}
+
+int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,
+		       int num_cqes)
+{
+	struct cq_base *hw_cqe, **hw_cqe_ptr;
+	unsigned long flags;
+	u32 sw_cons, raw_cons;
+	int budget, rc = 0;
+
+	spin_lock_irqsave(&cq->hwq.lock, flags);
+	raw_cons = cq->hwq.cons;
+	budget = num_cqes;
+
+	while (budget) {
+		sw_cons = HWQ_CMP(raw_cons, &cq->hwq);
+		hw_cqe_ptr = (struct cq_base **)cq->hwq.pbl_ptr;
+		hw_cqe = &hw_cqe_ptr[CQE_PG(sw_cons)][CQE_IDX(sw_cons)];
+
+		/* Check for Valid bit */
+		if (!CQE_CMP_VALID(hw_cqe, raw_cons, cq->hwq.max_elements))
+			break;
+
+		/* From the device's respective CQE format to qplib_wc*/
+		switch (hw_cqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK) {
+		case CQ_BASE_CQE_TYPE_REQ:
+			rc = bnxt_qplib_cq_process_req(cq,
+						       (struct cq_req *)hw_cqe,
+						       &cqe, &budget);
+			break;
+		case CQ_BASE_CQE_TYPE_RES_RC:
+			rc = bnxt_qplib_cq_process_res_rc(cq,
+							  (struct cq_res_rc *)
+							  hw_cqe, &cqe,
+							  &budget);
+			break;
+		case CQ_BASE_CQE_TYPE_RES_UD:
+			rc = bnxt_qplib_cq_process_res_ud
+					(cq, (struct cq_res_ud *)hw_cqe, &cqe,
+					 &budget);
+			break;
+		case CQ_BASE_CQE_TYPE_RES_RAWETH_QP1:
+			rc = bnxt_qplib_cq_process_res_raweth_qp1
+					(cq, (struct cq_res_raweth_qp1 *)
+					 hw_cqe, &cqe, &budget);
+			break;
+		case CQ_BASE_CQE_TYPE_TERMINAL:
+			rc = bnxt_qplib_cq_process_terminal
+					(cq, (struct cq_terminal *)hw_cqe,
+					 &cqe, &budget);
+			break;
+		case CQ_BASE_CQE_TYPE_CUT_OFF:
+			bnxt_qplib_cq_process_cutoff
+					(cq, (struct cq_cutoff *)hw_cqe);
+			/* Done processing this CQ */
+			goto exit;
+		default:
+			dev_err(&cq->hwq.pdev->dev,
+				"QPLIB: process_cq unknown type 0x%lx",
+				hw_cqe->cqe_type_toggle &
+				CQ_BASE_CQE_TYPE_MASK);
+			rc = -EINVAL;
+			break;
+		}
+		if (rc < 0) {
+			if (rc == -EAGAIN)
+				break;
+			/* Error while processing the CQE, just skip to the
+			 * next one
+			 */
+			dev_err(&cq->hwq.pdev->dev,
+				"QPLIB: process_cqe error rc = 0x%x", rc);
+		}
+		raw_cons++;
+	}
+	if (cq->hwq.cons != raw_cons) {
+		cq->hwq.cons = raw_cons;
+		bnxt_qplib_arm_cq(cq, DBR_DBR_TYPE_CQ);
+	}
+exit:
+	spin_unlock_irqrestore(&cq->hwq.lock, flags);
+	return num_cqes - budget;
+}
+
 void bnxt_qplib_req_notify_cq(struct bnxt_qplib_cq *cq, u32 arm_type)
 {
 	unsigned long flags;
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
index 12e9fcb..f598e17 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
@@ -348,7 +348,7 @@ struct bnxt_qplib_nq {
 
 	int				(*cqn_handler)
 						(struct bnxt_qplib_nq *nq,
-						 void *cq);
+						 struct bnxt_qplib_cq *cq);
 	int				(*srqn_handler)
 						(struct bnxt_qplib_nq *nq,
 						 void *srq,
@@ -359,7 +359,7 @@ void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq);
 int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
 			 int msix_vector, int bar_reg_offset,
 			 int (*cqn_handler)(struct bnxt_qplib_nq *nq,
-					    void *cq),
+					    struct bnxt_qplib_cq *cq),
 			 int (*srqn_handler)(struct bnxt_qplib_nq *nq,
 					     void *srq,
 					     u8 event));
@@ -383,7 +383,8 @@ int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
 			 struct bnxt_qplib_swqe *wqe);
 int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 int bnxt_qplib_destroy_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
-
+int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,
+		       int num);
 void bnxt_qplib_req_notify_cq(struct bnxt_qplib_cq *cq, u32 arm_type);
 void bnxt_qplib_free_nq(struct bnxt_qplib_nq *nq);
 int bnxt_qplib_alloc_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq);
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
index 6d870eb..3ff87ff 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
@@ -2222,6 +2222,525 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 	return ERR_PTR(rc);
 }
 
+static u8 __req_to_ib_wc_status(u8 qstatus)
+{
+	switch (qstatus) {
+	case CQ_REQ_STATUS_OK:
+		return IB_WC_SUCCESS;
+	case CQ_REQ_STATUS_BAD_RESPONSE_ERR:
+		return IB_WC_BAD_RESP_ERR;
+	case CQ_REQ_STATUS_LOCAL_LENGTH_ERR:
+		return IB_WC_LOC_LEN_ERR;
+	case CQ_REQ_STATUS_LOCAL_QP_OPERATION_ERR:
+		return IB_WC_LOC_QP_OP_ERR;
+	case CQ_REQ_STATUS_LOCAL_PROTECTION_ERR:
+		return IB_WC_LOC_PROT_ERR;
+	case CQ_REQ_STATUS_MEMORY_MGT_OPERATION_ERR:
+		return IB_WC_GENERAL_ERR;
+	case CQ_REQ_STATUS_REMOTE_INVALID_REQUEST_ERR:
+		return IB_WC_REM_INV_REQ_ERR;
+	case CQ_REQ_STATUS_REMOTE_ACCESS_ERR:
+		return IB_WC_REM_ACCESS_ERR;
+	case CQ_REQ_STATUS_REMOTE_OPERATION_ERR:
+		return IB_WC_REM_OP_ERR;
+	case CQ_REQ_STATUS_RNR_NAK_RETRY_CNT_ERR:
+		return IB_WC_RNR_RETRY_EXC_ERR;
+	case CQ_REQ_STATUS_TRANSPORT_RETRY_CNT_ERR:
+		return IB_WC_RETRY_EXC_ERR;
+	case CQ_REQ_STATUS_WORK_REQUEST_FLUSHED_ERR:
+		return IB_WC_WR_FLUSH_ERR;
+	default:
+		return IB_WC_GENERAL_ERR;
+	}
+	return 0;
+}
+
+static u8 __rawqp1_to_ib_wc_status(u8 qstatus)
+{
+	switch (qstatus) {
+	case CQ_RES_RAWETH_QP1_STATUS_OK:
+		return IB_WC_SUCCESS;
+	case CQ_RES_RAWETH_QP1_STATUS_LOCAL_ACCESS_ERROR:
+		return IB_WC_LOC_ACCESS_ERR;
+	case CQ_RES_RAWETH_QP1_STATUS_HW_LOCAL_LENGTH_ERR:
+		return IB_WC_LOC_LEN_ERR;
+	case CQ_RES_RAWETH_QP1_STATUS_LOCAL_PROTECTION_ERR:
+		return IB_WC_LOC_PROT_ERR;
+	case CQ_RES_RAWETH_QP1_STATUS_LOCAL_QP_OPERATION_ERR:
+		return IB_WC_LOC_QP_OP_ERR;
+	case CQ_RES_RAWETH_QP1_STATUS_MEMORY_MGT_OPERATION_ERR:
+		return IB_WC_GENERAL_ERR;
+	case CQ_RES_RAWETH_QP1_STATUS_WORK_REQUEST_FLUSHED_ERR:
+		return IB_WC_WR_FLUSH_ERR;
+	case CQ_RES_RAWETH_QP1_STATUS_HW_FLUSH_ERR:
+		return IB_WC_WR_FLUSH_ERR;
+	default:
+		return IB_WC_GENERAL_ERR;
+	}
+}
+
+static u8 __rc_to_ib_wc_status(u8 qstatus)
+{
+	switch (qstatus) {
+	case CQ_RES_RC_STATUS_OK:
+		return IB_WC_SUCCESS;
+	case CQ_RES_RC_STATUS_LOCAL_ACCESS_ERROR:
+		return IB_WC_LOC_ACCESS_ERR;
+	case CQ_RES_RC_STATUS_LOCAL_LENGTH_ERR:
+		return IB_WC_LOC_LEN_ERR;
+	case CQ_RES_RC_STATUS_LOCAL_PROTECTION_ERR:
+		return IB_WC_LOC_PROT_ERR;
+	case CQ_RES_RC_STATUS_LOCAL_QP_OPERATION_ERR:
+		return IB_WC_LOC_QP_OP_ERR;
+	case CQ_RES_RC_STATUS_MEMORY_MGT_OPERATION_ERR:
+		return IB_WC_GENERAL_ERR;
+	case CQ_RES_RC_STATUS_REMOTE_INVALID_REQUEST_ERR:
+		return IB_WC_REM_INV_REQ_ERR;
+	case CQ_RES_RC_STATUS_WORK_REQUEST_FLUSHED_ERR:
+		return IB_WC_WR_FLUSH_ERR;
+	case CQ_RES_RC_STATUS_HW_FLUSH_ERR:
+		return IB_WC_WR_FLUSH_ERR;
+	default:
+		return IB_WC_GENERAL_ERR;
+	}
+}
+
+static void bnxt_re_process_req_wc(struct ib_wc *wc, struct bnxt_qplib_cqe *cqe)
+{
+	switch (cqe->type) {
+	case BNXT_QPLIB_SWQE_TYPE_SEND:
+		wc->opcode = IB_WC_SEND;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_SEND_WITH_IMM:
+		wc->opcode = IB_WC_SEND;
+		wc->wc_flags |= IB_WC_WITH_IMM;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_SEND_WITH_INV:
+		wc->opcode = IB_WC_SEND;
+		wc->wc_flags |= IB_WC_WITH_INVALIDATE;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE:
+		wc->opcode = IB_WC_RDMA_WRITE;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE_WITH_IMM:
+		wc->opcode = IB_WC_RDMA_WRITE;
+		wc->wc_flags |= IB_WC_WITH_IMM;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_RDMA_READ:
+		wc->opcode = IB_WC_RDMA_READ;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_ATOMIC_CMP_AND_SWP:
+		wc->opcode = IB_WC_COMP_SWAP;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_ATOMIC_FETCH_AND_ADD:
+		wc->opcode = IB_WC_FETCH_ADD;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_LOCAL_INV:
+		wc->opcode = IB_WC_LOCAL_INV;
+		break;
+	case BNXT_QPLIB_SWQE_TYPE_REG_MR:
+		wc->opcode = IB_WC_REG_MR;
+		break;
+	default:
+		wc->opcode = IB_WC_SEND;
+		break;
+	}
+
+	wc->status = __req_to_ib_wc_status(cqe->status);
+}
+
+int bnxt_re_check_packet_type(u16 raweth_qp1_flags, u16 raweth_qp1_flags2)
+{
+	bool is_udp = false, is_ipv6 = false, is_ipv4 = false;
+
+	/* raweth_qp1_flags Bit 9-6 indicates itype */
+	if ((raweth_qp1_flags & CQ_RES_RAWETH_QP1_RAWETH_QP1_FLAGS_ITYPE_ROCE)
+	    != CQ_RES_RAWETH_QP1_RAWETH_QP1_FLAGS_ITYPE_ROCE)
+		return -1;
+
+	if (raweth_qp1_flags2 &
+	    CQ_RES_RAWETH_QP1_RAWETH_QP1_FLAGS2_IP_CS_CALC &&
+	    raweth_qp1_flags2 &
+	    CQ_RES_RAWETH_QP1_RAWETH_QP1_FLAGS2_L4_CS_CALC) {
+		is_udp = true;
+		/* raweth_qp1_flags2 Bit 8 indicates ip_type. 0-v4 1 - v6 */
+		(raweth_qp1_flags2 &
+		 CQ_RES_RAWETH_QP1_RAWETH_QP1_FLAGS2_IP_TYPE) ?
+			(is_ipv6 = true) : (is_ipv4 = true);
+		return ((is_ipv6) ?
+			 BNXT_RE_ROCEV2_IPV6_PACKET :
+			 BNXT_RE_ROCEV2_IPV4_PACKET);
+	} else {
+		return BNXT_RE_ROCE_V1_PACKET;
+	}
+}
+
+static int bnxt_re_to_ib_nw_type(int nw_type)
+{
+	u8 nw_hdr_type = 0xFF;
+
+	switch (nw_type) {
+	case BNXT_RE_ROCE_V1_PACKET:
+		nw_hdr_type = RDMA_NETWORK_ROCE_V1;
+		break;
+	case BNXT_RE_ROCEV2_IPV4_PACKET:
+		nw_hdr_type = RDMA_NETWORK_IPV4;
+		break;
+	case BNXT_RE_ROCEV2_IPV6_PACKET:
+		nw_hdr_type = RDMA_NETWORK_IPV6;
+		break;
+	}
+	return nw_hdr_type;
+}
+
+static bool bnxt_re_is_loopback_packet(struct bnxt_re_dev *rdev,
+				       void *rq_hdr_buf)
+{
+	u8 *tmp_buf = NULL;
+	struct ethhdr *eth_hdr;
+	u16 eth_type;
+	bool rc = false;
+
+	tmp_buf = (u8 *)rq_hdr_buf;
+	/*
+	 * If dest mac is not same as I/F mac, this could be a
+	 * loopback address or multicast address, check whether
+	 * it is a loopback packet
+	 */
+	if (!ether_addr_equal(tmp_buf, rdev->netdev->dev_addr)) {
+		tmp_buf += 4;
+		/* Check the  ether type */
+		eth_hdr = (struct ethhdr *)tmp_buf;
+		eth_type = ntohs(eth_hdr->h_proto);
+		switch (eth_type) {
+		case BNXT_QPLIB_ETHTYPE_ROCEV1:
+			rc = true;
+			break;
+		case ETH_P_IP:
+		case ETH_P_IPV6: {
+			u32 len;
+			struct udphdr *udp_hdr;
+
+			len = (eth_type == ETH_P_IP ? sizeof(struct iphdr) :
+						      sizeof(struct ipv6hdr));
+			tmp_buf += sizeof(struct ethhdr) + len;
+			udp_hdr = (struct udphdr *)tmp_buf;
+			if (ntohs(udp_hdr->dest) ==
+				    ROCE_V2_UDP_DPORT)
+				rc = true;
+				break;
+			}
+		default:
+			break;
+		}
+	}
+
+	return rc;
+}
+
+static int bnxt_re_process_raw_qp_pkt_rx(struct bnxt_re_qp *qp1_qp,
+					 struct bnxt_qplib_cqe *cqe)
+{
+	struct bnxt_re_dev *rdev = qp1_qp->rdev;
+	struct bnxt_re_sqp_entries *sqp_entry = NULL;
+	struct bnxt_re_qp *qp = rdev->qp1_sqp;
+	struct ib_send_wr *swr;
+	struct ib_ud_wr udwr;
+	struct ib_recv_wr rwr;
+	u8 pkt_type = 0;
+	u32 tbl_idx;
+	void *rq_hdr_buf;
+	dma_addr_t rq_hdr_buf_map;
+	dma_addr_t shrq_hdr_buf_map;
+	u32 offset = 0;
+	u32 skip_bytes = 0;
+	struct ib_sge s_sge[2];
+	struct ib_sge r_sge[2];
+	int rc;
+
+	memset(&udwr, 0, sizeof(udwr));
+	memset(&rwr, 0, sizeof(rwr));
+	memset(&s_sge, 0, sizeof(s_sge));
+	memset(&r_sge, 0, sizeof(r_sge));
+
+	swr = &udwr.wr;
+	tbl_idx = cqe->wr_id;
+
+	rq_hdr_buf = qp1_qp->qplib_qp.rq_hdr_buf +
+			(tbl_idx * qp1_qp->qplib_qp.rq_hdr_buf_size);
+	rq_hdr_buf_map = bnxt_qplib_get_qp_buf_from_index(&qp1_qp->qplib_qp,
+							  tbl_idx);
+
+	/* Shadow QP header buffer */
+	shrq_hdr_buf_map = bnxt_qplib_get_qp_buf_from_index(&qp->qplib_qp,
+							    tbl_idx);
+	sqp_entry = &rdev->sqp_tbl[tbl_idx];
+
+	/* Store this cqe */
+	memcpy(&sqp_entry->cqe, cqe, sizeof(struct bnxt_qplib_cqe));
+	sqp_entry->qp1_qp = qp1_qp;
+
+	/* Find packet type from the cqe */
+
+	pkt_type = bnxt_re_check_packet_type(cqe->raweth_qp1_flags,
+					     cqe->raweth_qp1_flags2);
+	if (pkt_type < 0) {
+		dev_err(rdev_to_dev(rdev), "Invalid packet\n");
+		return -EINVAL;
+	}
+
+	/* Adjust the offset for the user buffer and post in the rq */
+
+	if (pkt_type == BNXT_RE_ROCEV2_IPV4_PACKET)
+		offset = 20;
+
+	/*
+	 * QP1 loopback packet has 4 bytes of internal header before
+	 * ether header. Skip these four bytes.
+	 */
+	if (bnxt_re_is_loopback_packet(rdev, rq_hdr_buf))
+		skip_bytes = 4;
+
+	/* First send SGE . Skip the ether header*/
+	s_sge[0].addr = rq_hdr_buf_map + BNXT_QPLIB_MAX_QP1_RQ_ETH_HDR_SIZE
+			+ skip_bytes;
+	s_sge[0].lkey = 0xFFFFFFFF;
+	s_sge[0].length = offset ? BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV4 :
+				BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV6;
+
+	/* Second Send SGE */
+	s_sge[1].addr = s_sge[0].addr + s_sge[0].length +
+			BNXT_QPLIB_MAX_QP1_RQ_BDETH_HDR_SIZE;
+	if (pkt_type != BNXT_RE_ROCE_V1_PACKET)
+		s_sge[1].addr += 8;
+	s_sge[1].lkey = 0xFFFFFFFF;
+	s_sge[1].length = 256;
+
+	/* First recv SGE */
+
+	r_sge[0].addr = shrq_hdr_buf_map;
+	r_sge[0].lkey = 0xFFFFFFFF;
+	r_sge[0].length = 40;
+
+	r_sge[1].addr = sqp_entry->sge.addr + offset;
+	r_sge[1].lkey = sqp_entry->sge.lkey;
+	r_sge[1].length = BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV6 + 256 - offset;
+
+	/* Create receive work request */
+	rwr.num_sge = 2;
+	rwr.sg_list = r_sge;
+	rwr.wr_id = tbl_idx;
+	rwr.next = NULL;
+
+	rc = bnxt_re_post_recv_shadow_qp(rdev, qp, &rwr);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev),
+			"Failed to post Rx buffers to shadow QP");
+		return -ENOMEM;
+	}
+
+	swr->num_sge = 2;
+	swr->sg_list = s_sge;
+	swr->wr_id = tbl_idx;
+	swr->opcode = IB_WR_SEND;
+	swr->next = NULL;
+
+	udwr.ah = &rdev->sqp_ah->ib_ah;
+	udwr.remote_qpn = rdev->qp1_sqp->qplib_qp.id;
+	udwr.remote_qkey = rdev->qp1_sqp->qplib_qp.qkey;
+
+	/* post data received  in the send queue */
+	rc = bnxt_re_post_send_shadow_qp(rdev, qp, swr);
+
+	return 0;
+}
+
+static void bnxt_re_process_res_rawqp1_wc(struct ib_wc *wc,
+					  struct bnxt_qplib_cqe *cqe)
+{
+	wc->opcode = IB_WC_RECV;
+	wc->status = __rawqp1_to_ib_wc_status(cqe->status);
+	wc->wc_flags |= IB_WC_GRH;
+}
+
+static void bnxt_re_process_res_rc_wc(struct ib_wc *wc,
+				      struct bnxt_qplib_cqe *cqe)
+{
+	wc->opcode = IB_WC_RECV;
+	wc->status = __rc_to_ib_wc_status(cqe->status);
+
+	if (cqe->flags & CQ_RES_RC_FLAGS_IMM)
+		wc->wc_flags |= IB_WC_WITH_IMM;
+	if (cqe->flags & CQ_RES_RC_FLAGS_INV)
+		wc->wc_flags |= IB_WC_WITH_INVALIDATE;
+	if ((cqe->flags & (CQ_RES_RC_FLAGS_RDMA | CQ_RES_RC_FLAGS_IMM)) ==
+	    (CQ_RES_RC_FLAGS_RDMA | CQ_RES_RC_FLAGS_IMM))
+		wc->opcode = IB_WC_RECV_RDMA_WITH_IMM;
+}
+
+static void bnxt_re_process_res_shadow_qp_wc(struct bnxt_re_qp *qp,
+					     struct ib_wc *wc,
+					     struct bnxt_qplib_cqe *cqe)
+{
+	u32 tbl_idx;
+	struct bnxt_re_dev *rdev = qp->rdev;
+	struct bnxt_re_qp *qp1_qp = NULL;
+	struct bnxt_qplib_cqe *orig_cqe = NULL;
+	struct bnxt_re_sqp_entries *sqp_entry = NULL;
+	int nw_type;
+
+	tbl_idx = cqe->wr_id;
+
+	sqp_entry = &rdev->sqp_tbl[tbl_idx];
+	qp1_qp = sqp_entry->qp1_qp;
+	orig_cqe = &sqp_entry->cqe;
+
+	wc->wr_id = sqp_entry->wrid;
+	wc->byte_len = orig_cqe->length;
+	wc->qp = &qp1_qp->ib_qp;
+
+	wc->ex.imm_data = orig_cqe->immdata_or_invrkey;
+	wc->src_qp = orig_cqe->src_qp;
+	memcpy(wc->smac, orig_cqe->smac, ETH_ALEN);
+	wc->port_num = 1;
+	wc->vendor_err = orig_cqe->status;
+
+	wc->opcode = IB_WC_RECV;
+	wc->status = __rawqp1_to_ib_wc_status(orig_cqe->status);
+	wc->wc_flags |= IB_WC_GRH;
+
+	nw_type = bnxt_re_check_packet_type(orig_cqe->raweth_qp1_flags,
+					    orig_cqe->raweth_qp1_flags2);
+	if (nw_type >= 0) {
+		wc->network_hdr_type = bnxt_re_to_ib_nw_type(nw_type);
+		wc->wc_flags |= IB_WC_WITH_NETWORK_HDR_TYPE;
+	}
+}
+
+static void bnxt_re_process_res_ud_wc(struct ib_wc *wc,
+				      struct bnxt_qplib_cqe *cqe)
+{
+	wc->opcode = IB_WC_RECV;
+	wc->status = __rc_to_ib_wc_status(cqe->status);
+
+	if (cqe->flags & CQ_RES_RC_FLAGS_IMM)
+		wc->wc_flags |= IB_WC_WITH_IMM;
+	if (cqe->flags & CQ_RES_RC_FLAGS_INV)
+		wc->wc_flags |= IB_WC_WITH_INVALIDATE;
+	if ((cqe->flags & (CQ_RES_RC_FLAGS_RDMA | CQ_RES_RC_FLAGS_IMM)) ==
+	    (CQ_RES_RC_FLAGS_RDMA | CQ_RES_RC_FLAGS_IMM))
+		wc->opcode = IB_WC_RECV_RDMA_WITH_IMM;
+}
+
+int bnxt_re_poll_cq(struct ib_cq *ib_cq, int num_entries, struct ib_wc *wc)
+{
+	struct bnxt_re_cq *cq = to_bnxt_re(ib_cq, struct bnxt_re_cq, ib_cq);
+	struct bnxt_re_qp *qp;
+	struct bnxt_qplib_cqe *cqe;
+	int i, ncqe, budget;
+	u32 tbl_idx;
+	struct bnxt_re_sqp_entries *sqp_entry = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cq->cq_lock, flags);
+	budget = min_t(u32, num_entries, cq->max_cql);
+	if (!cq->cql) {
+		dev_err(rdev_to_dev(cq->rdev), "POLL CQ : no CQL to use");
+		goto exit;
+	}
+	cqe = &cq->cql[0];
+	while (budget) {
+		ncqe = bnxt_qplib_poll_cq(&cq->qplib_cq, cqe, budget);
+		if (!ncqe)
+			break;
+
+		for (i = 0; i < ncqe; i++, cqe++) {
+			/* Transcribe each qplib_wqe back to ib_wc */
+			memset(wc, 0, sizeof(*wc));
+
+			wc->wr_id = cqe->wr_id;
+			wc->byte_len = cqe->length;
+			qp = to_bnxt_re((struct bnxt_qplib_qp *)cqe->qp_handle,
+					struct bnxt_re_qp, qplib_qp);
+			if (!qp) {
+				dev_err(rdev_to_dev(cq->rdev),
+					"POLL CQ : bad QP handle");
+				continue;
+			}
+			wc->qp = &qp->ib_qp;
+			wc->ex.imm_data = cqe->immdata_or_invrkey;
+			wc->src_qp = cqe->src_qp;
+			memcpy(wc->smac, cqe->smac, ETH_ALEN);
+			wc->port_num = 1;
+			wc->vendor_err = cqe->status;
+
+			switch (cqe->opcode) {
+			case CQ_BASE_CQE_TYPE_REQ:
+				if (qp->qplib_qp.id ==
+				    qp->rdev->qp1_sqp->qplib_qp.id) {
+					/* Handle this completion with
+					 * the stored completion
+					 */
+					memset(wc, 0, sizeof(*wc));
+					continue;
+				}
+				bnxt_re_process_req_wc(wc, cqe);
+				break;
+			case CQ_BASE_CQE_TYPE_RES_RAWETH_QP1:
+				if (!cqe->status) {
+					int rc = 0;
+
+					rc = bnxt_re_process_raw_qp_pkt_rx
+								(qp, cqe);
+					if (!rc) {
+						memset(wc, 0, sizeof(*wc));
+						continue;
+					}
+					cqe->status = -1;
+				}
+				/* Errors need not be looped back.
+				 * But change the wr_id to the one
+				 * stored in the table
+				 */
+				tbl_idx = cqe->wr_id;
+				sqp_entry = &cq->rdev->sqp_tbl[tbl_idx];
+				wc->wr_id = sqp_entry->wrid;
+				bnxt_re_process_res_rawqp1_wc(wc, cqe);
+				break;
+			case CQ_BASE_CQE_TYPE_RES_RC:
+				bnxt_re_process_res_rc_wc(wc, cqe);
+				break;
+			case CQ_BASE_CQE_TYPE_RES_UD:
+				if (qp->qplib_qp.id ==
+				    qp->rdev->qp1_sqp->qplib_qp.id) {
+					/* Handle this completion with
+					 * the stored completion
+					 */
+					if (cqe->status) {
+						continue;
+					} else {
+						bnxt_re_process_res_shadow_qp_wc
+								(qp, wc, cqe);
+						break;
+					}
+				}
+				bnxt_re_process_res_ud_wc(wc, cqe);
+				break;
+			default:
+				dev_err(rdev_to_dev(cq->rdev),
+					"POLL CQ : type 0x%x not handled",
+					cqe->opcode);
+				continue;
+			}
+			wc++;
+			budget--;
+		}
+	}
+exit:
+	spin_unlock_irqrestore(&cq->cq_lock, flags);
+	return num_entries - budget;
+}
+
 int bnxt_re_req_notify_cq(struct ib_cq *ib_cq,
 			  enum ib_cq_notify_flags ib_cqn_flags)
 {
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
index f36ce98..9e00bb9 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
@@ -146,6 +146,7 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 				struct ib_ucontext *context,
 				struct ib_udata *udata);
 int bnxt_re_destroy_cq(struct ib_cq *cq);
+int bnxt_re_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
 int bnxt_re_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
 struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
 
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index 0811e10..a138741 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -450,6 +450,7 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 
 	ibdev->create_cq		= bnxt_re_create_cq;
 	ibdev->destroy_cq		= bnxt_re_destroy_cq;
+	ibdev->poll_cq			= bnxt_re_poll_cq;
 	ibdev->req_notify_cq		= bnxt_re_req_notify_cq;
 
 	ibdev->get_dma_mr		= bnxt_re_get_dma_mr;
@@ -599,6 +600,25 @@ static int bnxt_re_aeq_handler(struct bnxt_qplib_rcfw *rcfw,
 	return 0;
 }
 
+static int bnxt_re_cqn_handler(struct bnxt_qplib_nq *nq,
+			       struct bnxt_qplib_cq *handle)
+{
+	struct bnxt_re_cq *cq = to_bnxt_re(handle, struct bnxt_re_cq,
+					   qplib_cq);
+
+	if (!cq) {
+		dev_err(NULL, "%s: CQ is NULL, CQN not handled",
+			ROCE_DRV_MODULE_NAME);
+		return -EINVAL;
+	}
+	if (cq->ib_cq.comp_handler) {
+		/* Lock comp_handler? */
+		(*cq->ib_cq.comp_handler)(&cq->ib_cq, cq->ib_cq.cq_context);
+	}
+
+	return 0;
+}
+
 static void bnxt_re_cleanup_res(struct bnxt_re_dev *rdev)
 {
 	if (rdev->nq.hwq.max_elements)
@@ -620,7 +640,7 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
 	rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nq,
 				  rdev->msix_entries[BNXT_RE_NQ_IDX].vector,
 				  rdev->msix_entries[BNXT_RE_NQ_IDX].db_offset,
-				  NULL,
+				  &bnxt_re_cqn_handler,
 				  NULL);
 
 	if (rc)
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 22/28] bnxt_re: Support post_recv
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Enables the fastpath verb ib_post_recv.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c    | 100 ++++++++++++++++++
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h    |   8 ++
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c | 133 ++++++++++++++++++++++++
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h |   2 +
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c     |   2 +
 5 files changed, 245 insertions(+)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
index 41359c8..9be1f39 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
@@ -1082,6 +1082,37 @@ void *bnxt_qplib_get_qp1_sq_buf(struct bnxt_qplib_qp *qp,
 	return NULL;
 }
 
+u32 bnxt_qplib_get_rq_prod_index(struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_q *rq = &qp->rq;
+
+	return HWQ_CMP(rq->hwq.prod, &rq->hwq);
+}
+
+dma_addr_t bnxt_qplib_get_qp_buf_from_index(struct bnxt_qplib_qp *qp, u32 index)
+{
+	return (qp->rq_hdr_buf_map + index * qp->rq_hdr_buf_size);
+}
+
+void *bnxt_qplib_get_qp1_rq_buf(struct bnxt_qplib_qp *qp,
+				struct bnxt_qplib_sge *sge)
+{
+	struct bnxt_qplib_q *rq = &qp->rq;
+	u32 sw_prod;
+
+	memset(sge, 0, sizeof(*sge));
+
+	if (qp->rq_hdr_buf) {
+		sw_prod = HWQ_CMP(rq->hwq.prod, &rq->hwq);
+		sge->addr = (dma_addr_t)(qp->rq_hdr_buf_map +
+					 sw_prod * qp->rq_hdr_buf_size);
+		sge->lkey = 0xFFFFFFFF;
+		sge->size = qp->rq_hdr_buf_size;
+		return qp->rq_hdr_buf + sw_prod * sge->size;
+	}
+	return NULL;
+}
+
 void bnxt_qplib_post_send_db(struct bnxt_qplib_qp *qp)
 {
 	struct bnxt_qplib_q *sq = &qp->sq;
@@ -1330,6 +1361,75 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
 	return rc;
 }
 
+void bnxt_qplib_post_recv_db(struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_q *rq = &qp->rq;
+	struct dbr_dbr db_msg = { 0 };
+	u32 sw_prod;
+
+	sw_prod = HWQ_CMP(rq->hwq.prod, &rq->hwq);
+	db_msg.index = cpu_to_le32((sw_prod << DBR_DBR_INDEX_SFT) &
+				   DBR_DBR_INDEX_MASK);
+	db_msg.type_xid =
+		cpu_to_le32(((qp->id << DBR_DBR_XID_SFT) & DBR_DBR_XID_MASK) |
+			    DBR_DBR_TYPE_RQ);
+
+	/* Flush the writes to HW Rx WQE before the ringing Rx DB */
+	wmb();
+	__iowrite64_copy(qp->dpi->dbr, &db_msg, sizeof(db_msg) / sizeof(u64));
+}
+
+int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
+			 struct bnxt_qplib_swqe *wqe)
+{
+	struct bnxt_qplib_q *rq = &qp->rq;
+	struct rq_wqe *rqe, **rqe_ptr;
+	struct sq_sge *hw_sge;
+	u32 sw_prod;
+	int i, rc = 0;
+
+	if (qp->state == CMDQ_MODIFY_QP_NEW_STATE_ERR) {
+		dev_err(&rq->hwq.pdev->dev,
+			"QPLIB: FP: QP (0x%x) is in the 0x%x state",
+			qp->id, qp->state);
+		rc = -EINVAL;
+		goto done;
+	}
+	if (HWQ_CMP((rq->hwq.prod + 1), &rq->hwq) ==
+	    HWQ_CMP(rq->hwq.cons, &rq->hwq)) {
+		dev_err(&rq->hwq.pdev->dev,
+			"QPLIB: FP: QP (0x%x) RQ is full!", qp->id);
+		rc = -EINVAL;
+		goto done;
+	}
+	sw_prod = HWQ_CMP(rq->hwq.prod, &rq->hwq);
+	rq->swq[sw_prod].wr_id = wqe->wr_id;
+
+	rqe_ptr = (struct rq_wqe **)rq->hwq.pbl_ptr;
+	rqe = &rqe_ptr[RQE_PG(sw_prod)][RQE_IDX(sw_prod)];
+
+	memset(rqe, 0, BNXT_QPLIB_MAX_RQE_ENTRY_SIZE);
+
+	/* Calculate wqe_size16 and data_len */
+	for (i = 0, hw_sge = (struct sq_sge *)rqe->data;
+	     i < wqe->num_sge; i++, hw_sge++) {
+		hw_sge->va_or_pa = cpu_to_le64(wqe->sg_list[i].addr);
+		hw_sge->l_key = cpu_to_le32(wqe->sg_list[i].lkey);
+		hw_sge->size = cpu_to_le32(wqe->sg_list[i].size);
+	}
+	rqe->wqe_type = wqe->type;
+	rqe->flags = wqe->flags;
+	rqe->wqe_size = wqe->num_sge +
+			((offsetof(typeof(*rqe), data) + 15) >> 4);
+
+	/* Supply the rqe->wr_id index to the wr_id_tbl for now */
+	rqe->wr_id[0] = cpu_to_le32(sw_prod);
+
+	rq->hwq.prod++;
+done:
+	return rc;
+}
+
 /* CQ */
 
 /* Spinlock must be held */
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
index f0a1198..12e9fcb 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
@@ -370,9 +370,17 @@ int bnxt_qplib_query_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
 int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
 void *bnxt_qplib_get_qp1_sq_buf(struct bnxt_qplib_qp *qp,
 				struct bnxt_qplib_sge *sge);
+void *bnxt_qplib_get_qp1_rq_buf(struct bnxt_qplib_qp *qp,
+				struct bnxt_qplib_sge *sge);
+u32 bnxt_qplib_get_rq_prod_index(struct bnxt_qplib_qp *qp);
+dma_addr_t bnxt_qplib_get_qp_buf_from_index(struct bnxt_qplib_qp *qp,
+					    u32 index);
 void bnxt_qplib_post_send_db(struct bnxt_qplib_qp *qp);
 int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
 			 struct bnxt_qplib_swqe *wqe);
+void bnxt_qplib_post_recv_db(struct bnxt_qplib_qp *qp);
+int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
+			 struct bnxt_qplib_swqe *wqe);
 int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 int bnxt_qplib_destroy_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
index 118ed37..6d870eb 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
@@ -1619,6 +1619,61 @@ static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
 	return rc;
 }
 
+/* For the MAD layer, it only provides the recv SGE the size of
+ * ib_grh + MAD datagram.  No Ethernet headers, Ethertype, BTH, DETH,
+ * nor RoCE iCRC.  The Cu+ solution must provide buffer for the entire
+ * receive packet (334 bytes) with no VLAN and then copy the GRH
+ * and the MAD datagram out to the provided SGE.
+ */
+static int bnxt_re_build_qp1_shadow_qp_recv(struct bnxt_re_qp *qp,
+					    struct ib_recv_wr *wr,
+					    struct bnxt_qplib_swqe *wqe,
+					    int payload_size)
+{
+	struct bnxt_qplib_sge ref, sge;
+	int rc = 0;
+	u32 rq_prod_index;
+	struct bnxt_re_sqp_entries *sqp_entry;
+
+	rq_prod_index = bnxt_qplib_get_rq_prod_index(&qp->qplib_qp);
+
+	if (bnxt_qplib_get_qp1_rq_buf(&qp->qplib_qp, &sge)) {
+		/* Create 1 SGE to receive the entire
+		 * ethernet packet
+		 */
+		/* Save the reference from ULP */
+		ref.addr = wqe->sg_list[0].addr;
+		ref.lkey = wqe->sg_list[0].lkey;
+		ref.size = wqe->sg_list[0].size;
+
+		sqp_entry = &qp->rdev->sqp_tbl[rq_prod_index];
+
+		/* SGE 1 */
+		wqe->sg_list[0].addr = sge.addr;
+		wqe->sg_list[0].lkey = sge.lkey;
+		wqe->sg_list[0].size = BNXT_QPLIB_MAX_QP1_RQ_HDR_SIZE_V2;
+		sge.size -= wqe->sg_list[0].size;
+		if (sge.size < 0) {
+			dev_err(rdev_to_dev(qp->rdev),
+				"QP1 rq buffer is empty!");
+			rc = -ENOMEM;
+			goto done;
+		}
+
+		sqp_entry->sge.addr = ref.addr;
+		sqp_entry->sge.lkey = ref.lkey;
+		sqp_entry->sge.size = ref.size;
+		/* Store the wrid for reporting completion */
+		sqp_entry->wrid = wqe->wr_id;
+		/* change the wqe->wrid to table index */
+		wqe->wr_id = rq_prod_index;
+	}
+	return 0;
+done:
+
+	return rc;
+}
+
 int is_ud_qp(struct bnxt_re_qp *qp)
 {
 	return qp->qplib_qp.type == CMDQ_CREATE_QP_TYPE_UD;
@@ -1963,6 +2018,84 @@ int bnxt_re_post_send(struct ib_qp *ib_qp, struct ib_send_wr *wr,
 	return rc;
 }
 
+int bnxt_re_post_recv_shadow_qp(struct bnxt_re_dev *rdev,
+				struct bnxt_re_qp *qp,
+				struct ib_recv_wr *wr)
+{
+	struct bnxt_qplib_swqe wqe;
+	int rc = 0, payload_sz = 0;
+
+	memset(&wqe, 0, sizeof(wqe));
+	while (wr) {
+		/* House keeping */
+		memset(&wqe, 0, sizeof(wqe));
+
+		/* Common */
+		wqe.num_sge = wr->num_sge;
+		if (wr->num_sge > qp->qplib_qp.rq.max_sge) {
+			dev_err(rdev_to_dev(rdev),
+				"Limit exceeded for Receive SGEs");
+			rc = -EINVAL;
+			goto bad;
+		}
+		payload_sz = bnxt_re_build_sgl(wr->sg_list, wqe.sg_list,
+					       wr->num_sge);
+		wqe.wr_id = wr->wr_id;
+		wqe.type = BNXT_QPLIB_SWQE_TYPE_RECV;
+
+		if (!rc)
+			rc = bnxt_qplib_post_recv(&qp->qplib_qp, &wqe);
+bad:
+		if (rc)
+			break;
+
+		wr = wr->next;
+	}
+	bnxt_qplib_post_recv_db(&qp->qplib_qp);
+	return rc;
+}
+
+int bnxt_re_post_recv(struct ib_qp *ib_qp, struct ib_recv_wr *wr,
+		      struct ib_recv_wr **bad_wr)
+{
+	struct bnxt_re_qp *qp = to_bnxt_re(ib_qp, struct bnxt_re_qp, ib_qp);
+	struct bnxt_qplib_swqe wqe;
+	int rc = 0, payload_sz = 0;
+
+	while (wr) {
+		/* House keeping */
+		memset(&wqe, 0, sizeof(wqe));
+
+		/* Common */
+		wqe.num_sge = wr->num_sge;
+		if (wr->num_sge > qp->qplib_qp.rq.max_sge) {
+			dev_err(rdev_to_dev(qp->rdev),
+				"Limit exceeded for Receive SGEs");
+			rc = -EINVAL;
+			goto bad;
+		}
+
+		payload_sz = bnxt_re_build_sgl(wr->sg_list, wqe.sg_list,
+					       wr->num_sge);
+		wqe.wr_id = wr->wr_id;
+		wqe.type = BNXT_QPLIB_SWQE_TYPE_RECV;
+
+		if (ib_qp->qp_type == IB_QPT_GSI)
+			rc = bnxt_re_build_qp1_shadow_qp_recv(qp, wr, &wqe,
+							      payload_sz);
+		if (!rc)
+			rc = bnxt_qplib_post_recv(&qp->qplib_qp, &wqe);
+bad:
+		if (rc) {
+			*bad_wr = wr;
+			break;
+		}
+		wr = wr->next;
+	}
+	bnxt_qplib_post_recv_db(&qp->qplib_qp);
+	return rc;
+}
+
 /* Completion Queues */
 int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
 {
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
index bb6f395..f36ce98 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
@@ -139,6 +139,8 @@ int bnxt_re_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 int bnxt_re_destroy_qp(struct ib_qp *qp);
 int bnxt_re_post_send(struct ib_qp *qp, struct ib_send_wr *send_wr,
 		      struct ib_send_wr **bad_send_wr);
+int bnxt_re_post_recv(struct ib_qp *qp, struct ib_recv_wr *recv_wr,
+		      struct ib_recv_wr **bad_recv_wr);
 struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 				const struct ib_cq_init_attr *attr,
 				struct ib_ucontext *context,
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index be77d91..0811e10 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -446,6 +446,8 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 	ibdev->destroy_qp		= bnxt_re_destroy_qp;
 
 	ibdev->post_send		= bnxt_re_post_send;
+	ibdev->post_recv		= bnxt_re_post_recv;
+
 	ibdev->create_cq		= bnxt_re_create_cq;
 	ibdev->destroy_cq		= bnxt_re_destroy_cq;
 	ibdev->req_notify_cq		= bnxt_re_req_notify_cq;
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 21/28] bnxt_re: Support post_send verb
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Enables the ib_post_send fastpath verb for posting Send work requests on QPs.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c    | 267 ++++++++++++
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h    |   8 +
 drivers/infiniband/hw/bnxtre/bnxt_re.h          |   5 +
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c | 542 +++++++++++++++++++++++-
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h |   2 +
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c     |   1 +
 6 files changed, 822 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
index 5d502857..41359c8 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
@@ -1063,6 +1063,273 @@ int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
 	return 0;
 }
 
+void *bnxt_qplib_get_qp1_sq_buf(struct bnxt_qplib_qp *qp,
+				struct bnxt_qplib_sge *sge)
+{
+	struct bnxt_qplib_q *sq = &qp->sq;
+	u32 sw_prod;
+
+	memset(sge, 0, sizeof(*sge));
+
+	if (qp->sq_hdr_buf) {
+		sw_prod = HWQ_CMP(sq->hwq.prod, &sq->hwq);
+		sge->addr = (dma_addr_t)(qp->sq_hdr_buf_map +
+					 sw_prod * qp->sq_hdr_buf_size);
+		sge->lkey = 0xFFFFFFFF;
+		sge->size = qp->sq_hdr_buf_size;
+		return qp->sq_hdr_buf + sw_prod * sge->size;
+	}
+	return NULL;
+}
+
+void bnxt_qplib_post_send_db(struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_q *sq = &qp->sq;
+	struct dbr_dbr db_msg = { 0 };
+	u32 sw_prod;
+
+	sw_prod = HWQ_CMP(sq->hwq.prod, &sq->hwq);
+
+	db_msg.index = cpu_to_le32((sw_prod << DBR_DBR_INDEX_SFT) &
+				   DBR_DBR_INDEX_MASK);
+	db_msg.type_xid =
+		cpu_to_le32(((qp->id << DBR_DBR_XID_SFT) & DBR_DBR_XID_MASK) |
+			    DBR_DBR_TYPE_SQ);
+	/* Flush all the WQE writes to HW */
+	wmb();
+	__iowrite64_copy(qp->dpi->dbr, &db_msg, sizeof(db_msg) / sizeof(u64));
+}
+
+int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
+			 struct bnxt_qplib_swqe *wqe)
+{
+	struct bnxt_qplib_q *sq = &qp->sq;
+	struct bnxt_qplib_swq *swq;
+	struct sq_send *hw_sq_send_hdr, **hw_sq_send_ptr;
+	struct sq_sge *hw_sge;
+	u32 sw_prod;
+	u8 wqe_size16;
+	int i, rc = 0, data_len = 0, pkt_num = 0;
+	u32 temp32;
+
+	if (qp->state != CMDQ_MODIFY_QP_NEW_STATE_RTS) {
+		rc = -EINVAL;
+		goto done;
+	}
+	if (HWQ_CMP((sq->hwq.prod + 1), &sq->hwq) ==
+	    HWQ_CMP(sq->hwq.cons, &sq->hwq)) {
+		rc = -ENOMEM;
+		goto done;
+	}
+	sw_prod = HWQ_CMP(sq->hwq.prod, &sq->hwq);
+	swq = &sq->swq[sw_prod];
+	swq->wr_id = wqe->wr_id;
+	swq->type = wqe->type;
+	swq->flags = wqe->flags;
+	if (qp->sig_type)
+		swq->flags |= SQ_SEND_FLAGS_SIGNAL_COMP;
+	swq->start_psn = sq->psn & BTH_PSN_MASK;
+
+	hw_sq_send_ptr = (struct sq_send **)sq->hwq.pbl_ptr;
+	hw_sq_send_hdr = &hw_sq_send_ptr[SQE_PG(sw_prod)][SQE_IDX(sw_prod)];
+
+	memset(hw_sq_send_hdr, 0, BNXT_QPLIB_MAX_SQE_ENTRY_SIZE);
+
+	if (wqe->flags & BNXT_QPLIB_SWQE_FLAGS_INLINE) {
+		/* Copy the inline data */
+		if (wqe->inline_len > BNXT_QPLIB_SWQE_MAX_INLINE_LENGTH) {
+			dev_warn(&sq->hwq.pdev->dev,
+				 "QPLIB: Inline data length > 96 detected");
+			data_len = BNXT_QPLIB_SWQE_MAX_INLINE_LENGTH;
+		} else {
+			data_len = wqe->inline_len;
+		}
+		memcpy(hw_sq_send_hdr->data, wqe->inline_data, data_len);
+		wqe_size16 = (data_len + 15) >> 4;
+	} else {
+		for (i = 0, hw_sge = (struct sq_sge *)hw_sq_send_hdr->data;
+		     i < wqe->num_sge; i++, hw_sge++) {
+			hw_sge->va_or_pa = cpu_to_le64(wqe->sg_list[i].addr);
+			hw_sge->l_key = cpu_to_le32(wqe->sg_list[i].lkey);
+			hw_sge->size = cpu_to_le32(wqe->sg_list[i].size);
+			data_len += hw_sge->size;
+		}
+		/* Each SGE entry = 1 WQE size16 */
+		wqe_size16 = wqe->num_sge;
+	}
+
+	/* Specifics */
+	switch (wqe->type) {
+	case BNXT_QPLIB_SWQE_TYPE_SEND:
+		if (qp->type == CMDQ_CREATE_QP1_TYPE_GSI) {
+			/* Assemble info for Raw Ethertype QPs */
+			struct sq_send_raweth_qp1 *sqe =
+				(struct sq_send_raweth_qp1 *)hw_sq_send_hdr;
+
+			sqe->wqe_type = wqe->type;
+			sqe->flags = wqe->flags;
+			sqe->wqe_size = wqe_size16 +
+				((offsetof(typeof(*sqe), data) + 15) >> 4);
+			sqe->cfa_action = cpu_to_le16(wqe->rawqp1.cfa_action);
+			sqe->lflags = cpu_to_le16(wqe->rawqp1.lflags);
+			sqe->length = cpu_to_le32(data_len);
+			sqe->cfa_meta = cpu_to_le32((wqe->rawqp1.cfa_meta &
+				SQ_SEND_RAWETH_QP1_CFA_META_VLAN_VID_MASK) <<
+				SQ_SEND_RAWETH_QP1_CFA_META_VLAN_VID_SFT);
+
+			break;
+		}
+		/* else, just fall thru */
+	case BNXT_QPLIB_SWQE_TYPE_SEND_WITH_IMM:
+	case BNXT_QPLIB_SWQE_TYPE_SEND_WITH_INV:
+	{
+		struct sq_send *sqe = (struct sq_send *)hw_sq_send_hdr;
+
+		sqe->wqe_type = wqe->type;
+		sqe->flags = wqe->flags;
+		sqe->wqe_size = wqe_size16 +
+				((offsetof(typeof(*sqe), data) + 15) >> 4);
+		sqe->inv_key_or_imm_data = cpu_to_le32(
+						wqe->send.imm_data_or_inv_key);
+		if (qp->type == CMDQ_CREATE_QP_TYPE_UD) {
+			sqe->q_key = cpu_to_le32(wqe->send.q_key);
+			sqe->dst_qp = cpu_to_le32(
+					wqe->send.dst_qp & SQ_SEND_DST_QP_MASK);
+			sqe->length = cpu_to_le32(data_len);
+			sqe->avid = cpu_to_le32(wqe->send.avid &
+						SQ_SEND_AVID_MASK);
+			sq->psn = (sq->psn + 1) & BTH_PSN_MASK;
+		} else {
+			sqe->length = cpu_to_le32(data_len);
+			sqe->dst_qp = 0;
+			sqe->avid = 0;
+			if (qp->mtu)
+				pkt_num = (data_len + qp->mtu - 1) / qp->mtu;
+			if (!pkt_num)
+				pkt_num = 1;
+			sq->psn = (sq->psn + pkt_num) & BTH_PSN_MASK;
+		}
+		break;
+	}
+	case BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE:
+	case BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE_WITH_IMM:
+	case BNXT_QPLIB_SWQE_TYPE_RDMA_READ:
+	{
+		struct sq_rdma *sqe = (struct sq_rdma *)hw_sq_send_hdr;
+
+		sqe->wqe_type = wqe->type;
+		sqe->flags = wqe->flags;
+		sqe->wqe_size = wqe_size16 +
+				((offsetof(typeof(*sqe), data) + 15) >> 4);
+		sqe->imm_data = cpu_to_le32(wqe->rdma.imm_data_or_inv_key);
+		sqe->length = cpu_to_le32((u32)data_len);
+		sqe->remote_va = cpu_to_le64(wqe->rdma.remote_va);
+		sqe->remote_key = cpu_to_le32(wqe->rdma.r_key);
+		if (qp->mtu)
+			pkt_num = (data_len + qp->mtu - 1) / qp->mtu;
+		if (!pkt_num)
+			pkt_num = 1;
+		sq->psn = (sq->psn + pkt_num) & BTH_PSN_MASK;
+		break;
+	}
+	case BNXT_QPLIB_SWQE_TYPE_ATOMIC_CMP_AND_SWP:
+	case BNXT_QPLIB_SWQE_TYPE_ATOMIC_FETCH_AND_ADD:
+	{
+		struct sq_atomic *sqe = (struct sq_atomic *)hw_sq_send_hdr;
+
+		sqe->wqe_type = wqe->type;
+		sqe->flags = wqe->flags;
+		sqe->remote_key = cpu_to_le32(wqe->atomic.r_key);
+		sqe->remote_va = cpu_to_le64(wqe->atomic.remote_va);
+		sqe->swap_data = cpu_to_le64(wqe->atomic.swap_data);
+		sqe->cmp_data = cpu_to_le64(wqe->atomic.cmp_data);
+		if (qp->mtu)
+			pkt_num = (data_len + qp->mtu - 1) / qp->mtu;
+		if (!pkt_num)
+			pkt_num = 1;
+		sq->psn = (sq->psn + pkt_num) & BTH_PSN_MASK;
+		break;
+	}
+	case BNXT_QPLIB_SWQE_TYPE_LOCAL_INV:
+	{
+		struct sq_localinvalidate *sqe =
+				(struct sq_localinvalidate *)hw_sq_send_hdr;
+
+		sqe->wqe_type = wqe->type;
+		sqe->flags = wqe->flags;
+		sqe->inv_l_key = cpu_to_le32(wqe->local_inv.inv_l_key);
+
+		break;
+	}
+	case BNXT_QPLIB_SWQE_TYPE_FAST_REG_MR:
+	{
+		struct sq_fr_pmr *sqe = (struct sq_fr_pmr *)hw_sq_send_hdr;
+
+		sqe->wqe_type = wqe->type;
+		sqe->flags = wqe->flags;
+		sqe->access_cntl = wqe->frmr.access_cntl |
+				   SQ_FR_PMR_ACCESS_CNTL_LOCAL_WRITE;
+		sqe->zero_based_page_size_log =
+			(wqe->frmr.pg_sz_log & SQ_FR_PMR_PAGE_SIZE_LOG_MASK) <<
+			SQ_FR_PMR_PAGE_SIZE_LOG_SFT |
+			(wqe->frmr.zero_based ? SQ_FR_PMR_ZERO_BASED : 0);
+		sqe->l_key = cpu_to_le32(wqe->frmr.l_key);
+		temp32 = cpu_to_le32(wqe->frmr.length);
+		memcpy(sqe->length, &temp32, sizeof(wqe->frmr.length));
+		sqe->numlevels_pbl_page_size_log =
+			((wqe->frmr.pbl_pg_sz_log <<
+					SQ_FR_PMR_PBL_PAGE_SIZE_LOG_SFT) &
+					SQ_FR_PMR_PBL_PAGE_SIZE_LOG_MASK) |
+			((wqe->frmr.levels << SQ_FR_PMR_NUMLEVELS_SFT) &
+					SQ_FR_PMR_NUMLEVELS_MASK);
+
+		for (i = 0; i < wqe->frmr.page_list_len; i++)
+			wqe->frmr.pbl_ptr[i] = cpu_to_le64(
+						wqe->frmr.page_list[i] |
+						PTU_PTE_VALID);
+		sqe->pblptr = cpu_to_le64(wqe->frmr.pbl_dma_ptr);
+		sqe->va = cpu_to_le64(wqe->frmr.va);
+
+		break;
+	}
+	case BNXT_QPLIB_SWQE_TYPE_BIND_MW:
+	{
+		struct sq_bind *sqe = (struct sq_bind *)hw_sq_send_hdr;
+
+		sqe->wqe_type = wqe->type;
+		sqe->flags = wqe->flags;
+		sqe->access_cntl = wqe->bind.access_cntl;
+		sqe->mw_type_zero_based = wqe->bind.mw_type |
+			(wqe->bind.zero_based ? SQ_BIND_ZERO_BASED : 0);
+		sqe->parent_l_key = cpu_to_le32(wqe->bind.parent_l_key);
+		sqe->l_key = cpu_to_le32(wqe->bind.r_key);
+		sqe->va = cpu_to_le64(wqe->bind.va);
+		temp32 = cpu_to_le32(wqe->bind.length);
+		memcpy(&sqe->length, &temp32, sizeof(wqe->bind.length));
+		break;
+	}
+	default:
+		/* Bad wqe, return error */
+		rc = -EINVAL;
+		goto done;
+	}
+	swq->next_psn = sq->psn & BTH_PSN_MASK;
+	if (swq->psn_search) {
+		swq->psn_search->opcode_start_psn = cpu_to_le32(
+			((swq->start_psn << SQ_PSN_SEARCH_START_PSN_SFT) &
+			 SQ_PSN_SEARCH_START_PSN_MASK) |
+			((wqe->type << SQ_PSN_SEARCH_OPCODE_SFT) &
+			 SQ_PSN_SEARCH_OPCODE_MASK));
+		swq->psn_search->flags_next_psn = cpu_to_le32(
+			((swq->next_psn << SQ_PSN_SEARCH_NEXT_PSN_SFT) &
+			 SQ_PSN_SEARCH_NEXT_PSN_MASK));
+	}
+
+	sq->hwq.prod++;
+done:
+	return rc;
+}
+
 /* CQ */
 
 /* Spinlock must be held */
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
index 61c8bcc7..f0a1198 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
@@ -13,6 +13,9 @@
 
 #ifndef __BNXT_QPLIB_FP_H__
 #define __BNXT_QPLIB_FP_H__
+
+#define BNXT_QPLIB_ETHTYPE_ROCEV1      0x8915
+
 struct bnxt_qplib_sge {
 	u64				addr;
 	u32				lkey;
@@ -365,6 +368,11 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
 int bnxt_qplib_modify_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
 int bnxt_qplib_query_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
 int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
+void *bnxt_qplib_get_qp1_sq_buf(struct bnxt_qplib_qp *qp,
+				struct bnxt_qplib_sge *sge);
+void bnxt_qplib_post_send_db(struct bnxt_qplib_qp *qp);
+int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
+			 struct bnxt_qplib_swqe *wqe);
 int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 int bnxt_qplib_destroy_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re.h b/drivers/infiniband/hw/bnxtre/bnxt_re.h
index 30dee42..2a49d10 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re.h
@@ -104,6 +104,11 @@ struct bnxt_re_dev {
 
 #define to_bnxt_re_dev(ptr, member)	\
 	container_of((ptr), struct bnxt_re_dev, member)
+
+#define BNXT_RE_ROCE_V1_PACKET		0
+#define BNXT_RE_ROCEV2_IPV4_PACKET	2
+#define BNXT_RE_ROCEV2_IPV6_PACKET	3
+
 #define	rdev_to_dev(rdev)	((rdev) ? (&(rdev)->ibdev.dev) : NULL)
 
 #endif
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
index 5e35d19..118ed37 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
@@ -45,6 +45,20 @@ static int bnxt_re_copy_to_udata(struct bnxt_re_dev *rdev, void *data, int len,
 	return rc ? -EFAULT : 0;
 }
 
+static int bnxt_re_build_sgl(struct ib_sge *ib_sg_list,
+			     struct bnxt_qplib_sge *sg_list, int num)
+{
+	int i, total = 0;
+
+	for (i = 0; i < num; i++) {
+		sg_list[i].addr = ib_sg_list[i].addr;
+		sg_list[i].lkey = ib_sg_list[i].lkey;
+		sg_list[i].size = ib_sg_list[i].length;
+		total += sg_list[i].size;
+	}
+	return total;
+}
+
 /* Device */
 struct net_device *bnxt_re_get_netdev(struct ib_device *ibdev, u8 port_num)
 {
@@ -683,8 +697,6 @@ static u8 __from_ib_qp_type(enum ib_qp_type type)
 		return CMDQ_CREATE_QP_TYPE_RC;
 	case IB_QPT_UD:
 		return CMDQ_CREATE_QP_TYPE_UD;
-	case IB_QPT_RAW_ETHERTYPE:
-		return CMDQ_CREATE_QP_TYPE_RAW_ETHERTYPE;
 	default:
 		return IB_QPT_MAX;
 	}
@@ -856,7 +868,6 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
 	struct bnxt_re_dev *rdev = pd->rdev;
 	struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
 	struct bnxt_re_qp *qp;
-	struct bnxt_re_srq *srq;
 	struct bnxt_re_cq *cq;
 	int rc, entries;
 
@@ -1427,6 +1438,531 @@ int bnxt_re_query_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
 	return 0;
 }
 
+/* Routine for sending QP1 packets for RoCE V1 an V2
+ */
+static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
+				     struct ib_send_wr *wr,
+				     struct bnxt_qplib_swqe *wqe,
+				     int payload_size)
+{
+	struct ib_device *ibdev = &qp->rdev->ibdev;
+	struct bnxt_re_ah *ah = to_bnxt_re(ud_wr(wr)->ah, struct bnxt_re_ah,
+					   ib_ah);
+	struct bnxt_qplib_ah *qplib_ah = &ah->qplib_ah;
+	struct bnxt_qplib_sge sge;
+	union ib_gid sgid;
+	u8 nw_type;
+	u16 ether_type;
+	struct ib_gid_attr sgid_attr;
+	union ib_gid dgid;
+	bool is_eth = false;
+	bool is_vlan = false;
+	bool is_grh = false;
+	bool is_udp = false;
+	u8 ip_version = 0;
+	u16 vlan_id = 0xFFFF;
+	void *buf;
+	int i, rc = 0, size;
+
+	memset(&qp->qp1_hdr, 0, sizeof(qp->qp1_hdr));
+
+	rc = ib_get_cached_gid(ibdev, 1,
+			       qplib_ah->host_sgid_index, &sgid,
+			       &sgid_attr);
+	if (rc) {
+		dev_err(rdev_to_dev(qp->rdev),
+			"Failed to query gid at index %d",
+			qplib_ah->host_sgid_index);
+		return rc;
+	}
+	if (sgid_attr.ndev) {
+		if (is_vlan_dev(sgid_attr.ndev))
+			vlan_id = vlan_dev_vlan_id(sgid_attr.ndev);
+		dev_put(sgid_attr.ndev);
+	}
+	/* Get network header type for this GID */
+	nw_type = ib_gid_to_network_type(sgid_attr.gid_type, &sgid);
+	switch (nw_type) {
+	case RDMA_NETWORK_IPV4:
+		nw_type = BNXT_RE_ROCEV2_IPV4_PACKET;
+		break;
+	case RDMA_NETWORK_IPV6:
+		nw_type = BNXT_RE_ROCEV2_IPV6_PACKET;
+		break;
+	default:
+		nw_type = BNXT_RE_ROCE_V1_PACKET;
+		break;
+	}
+	memcpy(&dgid.raw, &qplib_ah->dgid, 16);
+	is_udp = sgid_attr.gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP;
+	if (is_udp) {
+		if (ipv6_addr_v4mapped((struct in6_addr *)&sgid)) {
+			ip_version = 4;
+			ether_type = ETH_P_IP;
+		} else {
+			ip_version = 6;
+			ether_type = ETH_P_IPV6;
+		}
+		is_grh = false;
+	} else {
+		ether_type = BNXT_QPLIB_ETHTYPE_ROCEV1;
+		is_grh = true;
+	}
+
+	is_eth = true;
+	is_vlan = (vlan_id && (vlan_id < 0x1000)) ? true : false;
+
+	ib_ud_header_init(payload_size, !is_eth, is_eth, is_vlan, is_grh,
+			  ip_version, is_udp, 0, &qp->qp1_hdr);
+
+	/* ETH */
+	ether_addr_copy(qp->qp1_hdr.eth.dmac_h, ah->qplib_ah.dmac);
+	ether_addr_copy(qp->qp1_hdr.eth.smac_h, qp->qplib_qp.smac);
+
+	/* For vlan, check the sgid for vlan existence */
+
+	if (!is_vlan) {
+		qp->qp1_hdr.eth.type = cpu_to_be16(ether_type);
+	} else {
+		qp->qp1_hdr.vlan.type = cpu_to_be16(ether_type);
+		qp->qp1_hdr.vlan.tag = cpu_to_be16(vlan_id);
+	}
+
+	if (is_grh || (ip_version == 6)) {
+		memcpy(qp->qp1_hdr.grh.source_gid.raw, sgid.raw, sizeof(sgid));
+		memcpy(qp->qp1_hdr.grh.destination_gid.raw, qplib_ah->dgid.data,
+		       sizeof(sgid));
+		qp->qp1_hdr.grh.hop_limit     = qplib_ah->hop_limit;
+	}
+
+	if (ip_version == 4) {
+		qp->qp1_hdr.ip4.tos = 0;
+		qp->qp1_hdr.ip4.id = 0;
+		qp->qp1_hdr.ip4.frag_off = htons(IP_DF);
+		qp->qp1_hdr.ip4.ttl = qplib_ah->hop_limit;
+
+		memcpy(&qp->qp1_hdr.ip4.saddr, sgid.raw + 12, 4);
+		memcpy(&qp->qp1_hdr.ip4.daddr, qplib_ah->dgid.data + 12, 4);
+		qp->qp1_hdr.ip4.check = ib_ud_ip4_csum(&qp->qp1_hdr);
+	}
+
+	if (is_udp) {
+		qp->qp1_hdr.udp.dport = htons(ROCE_V2_UDP_DPORT);
+		qp->qp1_hdr.udp.sport = htons(0x8CD1);
+		qp->qp1_hdr.udp.csum = 0;
+	}
+
+	/* BTH */
+	if (wr->opcode == IB_WR_SEND_WITH_IMM) {
+		qp->qp1_hdr.bth.opcode = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE;
+		qp->qp1_hdr.immediate_present = 1;
+	} else {
+		qp->qp1_hdr.bth.opcode = IB_OPCODE_UD_SEND_ONLY;
+	}
+	if (wr->send_flags & IB_SEND_SOLICITED)
+		qp->qp1_hdr.bth.solicited_event = 1;
+	/* pad_count */
+	qp->qp1_hdr.bth.pad_count = (4 - payload_size) & 3;
+
+	/* P_key for QP1 is for all members */
+	qp->qp1_hdr.bth.pkey = cpu_to_be16(0xFFFF);
+	qp->qp1_hdr.bth.destination_qpn = IB_QP1;
+	qp->qp1_hdr.bth.ack_req = 0;
+	qp->send_psn++;
+	qp->send_psn &= BTH_PSN_MASK;
+	qp->qp1_hdr.bth.psn = cpu_to_be32(qp->send_psn);
+	/* DETH */
+	/* Use the priviledged Q_Key for QP1 */
+	qp->qp1_hdr.deth.qkey = cpu_to_be32(IB_QP1_QKEY);
+	qp->qp1_hdr.deth.source_qpn = IB_QP1;
+
+	/* Pack the QP1 to the transmit buffer */
+	buf = bnxt_qplib_get_qp1_sq_buf(&qp->qplib_qp, &sge);
+	if (buf) {
+		size = ib_ud_header_pack(&qp->qp1_hdr, buf);
+		for (i = wqe->num_sge; i; i--) {
+			wqe->sg_list[i].addr = wqe->sg_list[i - 1].addr;
+			wqe->sg_list[i].lkey = wqe->sg_list[i - 1].lkey;
+			wqe->sg_list[i].size = wqe->sg_list[i - 1].size;
+		}
+
+		/*
+		 * Max Header buf size for IPV6 RoCE V2 is 86,
+		 * which is same as the QP1 SQ header buffer.
+		 * Header buf size for IPV4 RoCE V2 can be 66.
+		 * ETH(14) + VLAN(4)+ IP(20) + UDP (8) + BTH(20).
+		 * Subtract 20 bytes from QP1 SQ header buf size
+		 */
+		if (is_udp && ip_version == 4)
+			sge.size -= 20;
+		/*
+		 * Max Header buf size for RoCE V1 is 78.
+		 * ETH(14) + VLAN(4) + GRH(40) + BTH(20).
+		 * Subtract 8 bytes from QP1 SQ header buf size
+		 */
+		if (!is_udp)
+			sge.size -= 8;
+
+		/* Subtract 4 bytes for non vlan packets */
+		if (!is_vlan)
+			sge.size -= 4;
+
+		wqe->sg_list[0].addr = sge.addr;
+		wqe->sg_list[0].lkey = sge.lkey;
+		wqe->sg_list[0].size = sge.size;
+		wqe->num_sge++;
+
+	} else {
+		dev_err(rdev_to_dev(qp->rdev), "QP1 buffer is empty!");
+		rc = -ENOMEM;
+	}
+	return rc;
+}
+
+int is_ud_qp(struct bnxt_re_qp *qp)
+{
+	return qp->qplib_qp.type == CMDQ_CREATE_QP_TYPE_UD;
+}
+
+static int bnxt_re_build_send_wqe(struct bnxt_re_qp *qp,
+				  struct ib_send_wr *wr,
+				  struct bnxt_qplib_swqe *wqe)
+{
+	struct bnxt_re_ah *ah = NULL;
+
+	if (is_ud_qp(qp)) {
+		ah = to_bnxt_re(ud_wr(wr)->ah, struct bnxt_re_ah, ib_ah);
+		wqe->send.q_key = ud_wr(wr)->remote_qkey;
+		wqe->send.dst_qp = ud_wr(wr)->remote_qpn;
+		wqe->send.avid = ah->qplib_ah.id;
+	}
+	switch (wr->opcode) {
+	case IB_WR_SEND:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_SEND;
+		break;
+	case IB_WR_SEND_WITH_IMM:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_SEND_WITH_IMM;
+		wqe->send.imm_data_or_inv_key = wr->ex.imm_data;
+		break;
+	case IB_WR_SEND_WITH_INV:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_SEND_WITH_INV;
+		wqe->send.imm_data_or_inv_key = wr->ex.invalidate_rkey;
+		break;
+	default:
+		return -EINVAL;
+	}
+	if (wr->send_flags & IB_SEND_SIGNALED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP;
+	if (wr->send_flags & IB_SEND_FENCE)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE;
+	if (wr->send_flags & IB_SEND_SOLICITED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SOLICIT_EVENT;
+	if (wr->send_flags & IB_SEND_INLINE)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_INLINE;
+
+	return 0;
+}
+
+static int bnxt_re_build_rdma_wqe(struct ib_send_wr *wr,
+				  struct bnxt_qplib_swqe *wqe)
+{
+	switch (wr->opcode) {
+	case IB_WR_RDMA_WRITE:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE;
+		break;
+	case IB_WR_RDMA_WRITE_WITH_IMM:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE_WITH_IMM;
+		wqe->rdma.imm_data_or_inv_key = wr->ex.imm_data;
+		break;
+	case IB_WR_RDMA_READ:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_RDMA_READ;
+		wqe->rdma.imm_data_or_inv_key = wr->ex.invalidate_rkey;
+		break;
+	default:
+		return -EINVAL;
+	}
+	wqe->rdma.remote_va = rdma_wr(wr)->remote_addr;
+	wqe->rdma.r_key = rdma_wr(wr)->rkey;
+	if (wr->send_flags & IB_SEND_SIGNALED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP;
+	if (wr->send_flags & IB_SEND_FENCE)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE;
+	if (wr->send_flags & IB_SEND_SOLICITED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SOLICIT_EVENT;
+	if (wr->send_flags & IB_SEND_INLINE)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_INLINE;
+
+	return 0;
+}
+
+static int bnxt_re_build_atomic_wqe(struct ib_send_wr *wr,
+				    struct bnxt_qplib_swqe *wqe)
+{
+	switch (wr->opcode) {
+	case IB_WR_ATOMIC_CMP_AND_SWP:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_ATOMIC_CMP_AND_SWP;
+		wqe->atomic.swap_data = atomic_wr(wr)->swap;
+		break;
+	case IB_WR_ATOMIC_FETCH_AND_ADD:
+		wqe->type = BNXT_QPLIB_SWQE_TYPE_ATOMIC_FETCH_AND_ADD;
+		wqe->atomic.cmp_data = atomic_wr(wr)->compare_add;
+		break;
+	default:
+		return -EINVAL;
+	}
+	wqe->atomic.remote_va = atomic_wr(wr)->remote_addr;
+	wqe->atomic.r_key = atomic_wr(wr)->rkey;
+	if (wr->send_flags & IB_SEND_SIGNALED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP;
+	if (wr->send_flags & IB_SEND_FENCE)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE;
+	if (wr->send_flags & IB_SEND_SOLICITED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SOLICIT_EVENT;
+	return 0;
+}
+
+static int bnxt_re_build_inv_wqe(struct ib_send_wr *wr,
+				 struct bnxt_qplib_swqe *wqe)
+{
+	wqe->type = BNXT_QPLIB_SWQE_TYPE_LOCAL_INV;
+	wqe->local_inv.inv_l_key = wr->ex.invalidate_rkey;
+
+	if (wr->send_flags & IB_SEND_SIGNALED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP;
+	if (wr->send_flags & IB_SEND_FENCE)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE;
+	if (wr->send_flags & IB_SEND_SOLICITED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SOLICIT_EVENT;
+
+	return 0;
+}
+
+static int bnxt_re_build_reg_wqe(struct ib_reg_wr *wr,
+				 struct bnxt_qplib_swqe *wqe)
+{
+	struct bnxt_re_mr *mr = to_bnxt_re(wr->mr, struct bnxt_re_mr, ib_mr);
+	struct bnxt_qplib_frpl *qplib_frpl = &mr->qplib_frpl;
+	int access = wr->access;
+
+	wqe->frmr.pbl_ptr = (u64 *)qplib_frpl->hwq.pbl_ptr[0];
+	wqe->frmr.pbl_dma_ptr = qplib_frpl->hwq.pbl_dma_ptr[0];
+	wqe->frmr.page_list = mr->pages;
+	wqe->frmr.page_list_len = mr->npages;
+	wqe->frmr.levels = qplib_frpl->hwq.level + 1;
+	wqe->type = BNXT_QPLIB_SWQE_TYPE_REG_MR;
+
+	if (wr->wr.send_flags & IB_SEND_FENCE)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE;
+	if (wr->wr.send_flags & IB_SEND_SIGNALED)
+		wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP;
+
+	if (access & IB_ACCESS_LOCAL_WRITE)
+		wqe->frmr.access_cntl |= SQ_FR_PMR_ACCESS_CNTL_LOCAL_WRITE;
+	if (access & IB_ACCESS_REMOTE_READ)
+		wqe->frmr.access_cntl |= SQ_FR_PMR_ACCESS_CNTL_REMOTE_READ;
+	if (access & IB_ACCESS_REMOTE_WRITE)
+		wqe->frmr.access_cntl |= SQ_FR_PMR_ACCESS_CNTL_REMOTE_WRITE;
+	if (access & IB_ACCESS_REMOTE_ATOMIC)
+		wqe->frmr.access_cntl |= SQ_FR_PMR_ACCESS_CNTL_REMOTE_ATOMIC;
+	if (access & IB_ACCESS_MW_BIND)
+		wqe->frmr.access_cntl |= SQ_FR_PMR_ACCESS_CNTL_WINDOW_BIND;
+
+	wqe->frmr.l_key = wr->key;
+	wqe->frmr.length = wr->mr->length;
+	wqe->frmr.pbl_pg_sz_log = (wr->mr->page_size >> PAGE_SHIFT_4K) - 1;
+	wqe->frmr.va = wr->mr->iova;
+	return 0;
+}
+
+int bnxt_re_copy_inline_data(struct bnxt_re_dev *rdev, struct ib_send_wr *wr,
+			     struct bnxt_qplib_swqe *wqe)
+{
+	/*  Copy the inline data to the data  field */
+	u8 *in_data;
+	u32 i, sge_len;
+	void *sge_addr;
+
+	in_data = wqe->inline_data;
+	for (i = 0; i < wr->num_sge; i++) {
+		sge_addr = (void *)(unsigned long)
+				wr->sg_list[i].addr;
+		sge_len = wr->sg_list[i].length;
+
+		if ((sge_len + wqe->inline_len) >
+		    BNXT_QPLIB_SWQE_MAX_INLINE_LENGTH) {
+			dev_err(rdev_to_dev(rdev),
+				"Inline data size requested > supported value");
+			return -EINVAL;
+		}
+		sge_len = wr->sg_list[i].length;
+
+		memcpy(in_data, sge_addr, sge_len);
+		in_data += wr->sg_list[i].length;
+		wqe->inline_len += wr->sg_list[i].length;
+	}
+	return wqe->inline_len;
+}
+
+int bnxt_re_copy_wr_payload(struct bnxt_re_dev *rdev, struct ib_send_wr *wr,
+			    struct bnxt_qplib_swqe *wqe)
+{
+	int payload_sz = 0;
+
+	if (wr->send_flags & IB_SEND_INLINE)
+		payload_sz = bnxt_re_copy_inline_data(rdev, wr, wqe);
+	else
+		payload_sz = bnxt_re_build_sgl(wr->sg_list, wqe->sg_list,
+					       wqe->num_sge);
+
+	return payload_sz;
+}
+
+int bnxt_re_post_send_shadow_qp(struct bnxt_re_dev *rdev,
+				struct bnxt_re_qp *qp,
+				struct ib_send_wr *wr)
+{
+	struct bnxt_qplib_swqe wqe;
+	int rc = 0, payload_sz = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&qp->sq_lock, flags);
+	memset(&wqe, 0, sizeof(wqe));
+	while (wr) {
+		/* House keeping */
+		memset(&wqe, 0, sizeof(wqe));
+
+		/* Common */
+		wqe.num_sge = wr->num_sge;
+		if (wr->num_sge > qp->qplib_qp.sq.max_sge) {
+			dev_err(rdev_to_dev(rdev),
+				"Limit exceeded for Send SGEs");
+			rc = -EINVAL;
+			goto bad;
+		}
+
+		payload_sz = bnxt_re_copy_wr_payload(qp->rdev, wr, &wqe);
+		if (payload_sz < 0) {
+			rc = -EINVAL;
+			goto bad;
+		}
+		wqe.wr_id = wr->wr_id;
+
+		wqe.type = BNXT_QPLIB_SWQE_TYPE_SEND;
+
+		rc = bnxt_re_build_send_wqe(qp, wr, &wqe);
+		if (!rc)
+			rc = bnxt_qplib_post_send(&qp->qplib_qp, &wqe);
+bad:
+		if (rc) {
+			dev_err(rdev_to_dev(rdev),
+				"Post send failed opcode = %#x rc = %d",
+				wr->opcode, rc);
+			break;
+		}
+		wr = wr->next;
+	}
+	bnxt_qplib_post_send_db(&qp->qplib_qp);
+	spin_unlock_irqrestore(&qp->sq_lock, flags);
+	return rc;
+}
+
+int bnxt_re_post_send(struct ib_qp *ib_qp, struct ib_send_wr *wr,
+		      struct ib_send_wr **bad_wr)
+{
+	struct bnxt_re_qp *qp = to_bnxt_re(ib_qp, struct bnxt_re_qp, ib_qp);
+	struct bnxt_qplib_swqe wqe;
+	int rc = 0, payload_sz = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&qp->sq_lock, flags);
+	while (wr) {
+		/* House keeping */
+		memset(&wqe, 0, sizeof(wqe));
+
+		/* Common */
+		wqe.num_sge = wr->num_sge;
+		if (wr->num_sge > qp->qplib_qp.sq.max_sge) {
+			dev_err(rdev_to_dev(qp->rdev),
+				"Limit exceeded for Send SGEs");
+			rc = -EINVAL;
+			goto bad;
+		}
+
+		payload_sz = bnxt_re_copy_wr_payload(qp->rdev, wr, &wqe);
+		if (payload_sz < 0) {
+			rc = -EINVAL;
+			goto bad;
+		}
+		wqe.wr_id = wr->wr_id;
+
+		switch (wr->opcode) {
+		case IB_WR_SEND:
+		case IB_WR_SEND_WITH_IMM:
+			if (ib_qp->qp_type == IB_QPT_GSI) {
+				rc = bnxt_re_build_qp1_send_v2(qp, wr, &wqe,
+							       payload_sz);
+				if (rc)
+					goto bad;
+				wqe.rawqp1.lflags |=
+					SQ_SEND_RAWETH_QP1_LFLAGS_ROCE_CRC;
+			}
+			switch (wr->send_flags) {
+			case IB_SEND_IP_CSUM:
+				wqe.rawqp1.lflags |=
+					SQ_SEND_RAWETH_QP1_LFLAGS_IP_CHKSUM;
+				break;
+			default:
+				break;
+			}
+			/* Fall thru to build the wqe */
+		case IB_WR_SEND_WITH_INV:
+			rc = bnxt_re_build_send_wqe(qp, wr, &wqe);
+			break;
+		case IB_WR_RDMA_WRITE:
+		case IB_WR_RDMA_WRITE_WITH_IMM:
+		case IB_WR_RDMA_READ:
+			rc = bnxt_re_build_rdma_wqe(wr, &wqe);
+			break;
+		case IB_WR_ATOMIC_CMP_AND_SWP:
+		case IB_WR_ATOMIC_FETCH_AND_ADD:
+			rc = bnxt_re_build_atomic_wqe(wr, &wqe);
+			break;
+		case IB_WR_RDMA_READ_WITH_INV:
+			dev_err(rdev_to_dev(qp->rdev),
+				"RDMA Read with Invalidate is not supported");
+			rc = -EINVAL;
+			goto bad;
+		case IB_WR_LOCAL_INV:
+			rc = bnxt_re_build_inv_wqe(wr, &wqe);
+			break;
+		case IB_WR_REG_MR:
+			rc = bnxt_re_build_reg_wqe(reg_wr(wr), &wqe);
+			break;
+		default:
+			/* Unsupported WRs */
+			dev_err(rdev_to_dev(qp->rdev),
+				"WR (%#x) is not supported", wr->opcode);
+			rc = -EINVAL;
+			goto bad;
+		}
+		if (!rc)
+			rc = bnxt_qplib_post_send(&qp->qplib_qp, &wqe);
+bad:
+		if (rc) {
+			dev_err(rdev_to_dev(qp->rdev),
+				"post_send failed op:%#x qps = %#x rc = %d\n",
+				wr->opcode, qp->qplib_qp.state, rc);
+			*bad_wr = wr;
+			break;
+		}
+		wr = wr->next;
+	}
+	bnxt_qplib_post_send_db(&qp->qplib_qp);
+	spin_unlock_irqrestore(&qp->sq_lock, flags);
+
+	return rc;
+}
+
 /* Completion Queues */
 int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
 {
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
index fdfe8bf..bb6f395 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
@@ -137,6 +137,8 @@ int bnxt_re_modify_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 int bnxt_re_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 		     int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
 int bnxt_re_destroy_qp(struct ib_qp *qp);
+int bnxt_re_post_send(struct ib_qp *qp, struct ib_send_wr *send_wr,
+		      struct ib_send_wr **bad_send_wr);
 struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 				const struct ib_cq_init_attr *attr,
 				struct ib_ucontext *context,
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index 80ee5b7..be77d91 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -445,6 +445,7 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 	ibdev->query_qp			= bnxt_re_query_qp;
 	ibdev->destroy_qp		= bnxt_re_destroy_qp;
 
+	ibdev->post_send		= bnxt_re_post_send;
 	ibdev->create_cq		= bnxt_re_create_cq;
 	ibdev->destroy_cq		= bnxt_re_destroy_cq;
 	ibdev->req_notify_cq		= bnxt_re_req_notify_cq;
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 20/28] bnxt_re: Support QP verbs
From: Selvin Xavier @ 2016-12-05  6:38 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Selvin Xavier, Eddie Wai,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna
In-Reply-To: <1480919912-1079-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

This patch implements create_qp, destroy_qp, query_qp and modify_qp verbs.

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c      | 873 ++++++++++++++++++++++
 drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h      | 250 +++++++
 drivers/infiniband/hw/bnxtre/bnxt_re.h            |  14 +
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c   | 762 +++++++++++++++++++
 drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h   |  21 +
 drivers/infiniband/hw/bnxtre/bnxt_re_main.c       |   6 +
 drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h |  10 +
 7 files changed, 1936 insertions(+)

diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
index fa2adab..5d502857 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.c
@@ -25,6 +25,69 @@
 #include "bnxt_qplib_fp.h"
 
 static void bnxt_qplib_arm_cq_enable(struct bnxt_qplib_cq *cq);
+
+static void bnxt_qplib_free_qp_hdr_buf(struct bnxt_qplib_res *res,
+				       struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_q *rq = &qp->rq;
+	struct bnxt_qplib_q *sq = &qp->sq;
+
+	if (qp->rq_hdr_buf)
+		dma_free_coherent(&res->pdev->dev,
+				  rq->hwq.max_elements * qp->rq_hdr_buf_size,
+				  qp->rq_hdr_buf, qp->rq_hdr_buf_map);
+	if (qp->sq_hdr_buf)
+		dma_free_coherent(&res->pdev->dev,
+				  sq->hwq.max_elements * qp->sq_hdr_buf_size,
+				  qp->sq_hdr_buf, qp->sq_hdr_buf_map);
+	qp->rq_hdr_buf = NULL;
+	qp->sq_hdr_buf = NULL;
+	qp->rq_hdr_buf_map = 0;
+	qp->sq_hdr_buf_map = 0;
+	qp->sq_hdr_buf_size = 0;
+	qp->rq_hdr_buf_size = 0;
+}
+
+static int bnxt_qplib_alloc_qp_hdr_buf(struct bnxt_qplib_res *res,
+				       struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_q *rq = &qp->rq;
+	struct bnxt_qplib_q *sq = &qp->rq;
+	int rc = 0;
+
+	if (qp->sq_hdr_buf_size && sq->hwq.max_elements) {
+		qp->sq_hdr_buf = dma_alloc_coherent(&res->pdev->dev,
+					sq->hwq.max_elements *
+					qp->sq_hdr_buf_size,
+					&qp->sq_hdr_buf_map, GFP_KERNEL);
+		if (!qp->sq_hdr_buf) {
+			rc = -ENOMEM;
+			dev_err(&res->pdev->dev,
+				"QPLIB: Failed to create sq_hdr_buf");
+			goto fail;
+		}
+	}
+
+	if (qp->rq_hdr_buf_size && rq->hwq.max_elements) {
+		qp->rq_hdr_buf = dma_alloc_coherent(&res->pdev->dev,
+						    rq->hwq.max_elements *
+						    qp->rq_hdr_buf_size,
+						    &qp->rq_hdr_buf_map,
+						    GFP_KERNEL);
+		if (!qp->rq_hdr_buf) {
+			rc = -ENOMEM;
+			dev_err(&res->pdev->dev,
+				"QPLIB: Failed to create rq_hdr_buf");
+			goto fail;
+		}
+	}
+	return 0;
+
+fail:
+	bnxt_qplib_free_qp_hdr_buf(res, qp);
+	return rc;
+}
+
 static void bnxt_qplib_service_nq(unsigned long data)
 {
 	struct bnxt_qplib_nq *nq = (struct bnxt_qplib_nq *)data;
@@ -190,6 +253,816 @@ int bnxt_qplib_alloc_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq)
 	return 0;
 }
 
+/* QP */
+int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+	struct cmdq_create_qp1 req;
+	struct creq_create_qp1_resp *resp;
+	struct bnxt_qplib_pbl *pbl;
+	struct bnxt_qplib_q *sq = &qp->sq;
+	struct bnxt_qplib_q *rq = &qp->rq;
+	int rc;
+	u16 cmd_flags = 0;
+	u32 qp_flags = 0;
+
+	RCFW_CMD_PREP(req, CREATE_QP1, cmd_flags);
+
+	/* General */
+	req.type = qp->type;
+	req.dpi = cpu_to_le32(qp->dpi->dpi);
+	req.qp_handle = cpu_to_le64(qp->qp_handle);
+
+	/* SQ */
+	sq->hwq.max_elements = sq->max_wqe;
+	rc = bnxt_qplib_alloc_init_hwq(res->pdev, &sq->hwq, NULL, 0,
+				       &sq->hwq.max_elements,
+				       BNXT_QPLIB_MAX_SQE_ENTRY_SIZE, 0,
+				       PAGE_SIZE, HWQ_TYPE_QUEUE);
+	if (rc)
+		goto exit;
+
+	sq->swq = kcalloc(sq->hwq.max_elements, sizeof(*sq->swq), GFP_KERNEL);
+	if (!sq->swq) {
+		rc = -ENOMEM;
+		goto fail_sq;
+	}
+	pbl = &sq->hwq.pbl[PBL_LVL_0];
+	req.sq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
+	req.sq_pg_size_sq_lvl =
+		((sq->hwq.level & CMDQ_CREATE_QP1_SQ_LVL_MASK)
+				<<  CMDQ_CREATE_QP1_SQ_LVL_SFT) |
+		(pbl->pg_size == ROCE_PG_SIZE_4K ?
+				CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_4K :
+		 pbl->pg_size == ROCE_PG_SIZE_8K ?
+				CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_8K :
+		 pbl->pg_size == ROCE_PG_SIZE_64K ?
+				CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_64K :
+		 pbl->pg_size == ROCE_PG_SIZE_2M ?
+				CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_2M :
+		 pbl->pg_size == ROCE_PG_SIZE_8M ?
+				CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_8M :
+		 pbl->pg_size == ROCE_PG_SIZE_1G ?
+				CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_1G :
+		 CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_4K);
+
+	if (qp->scq)
+		req.scq_cid = cpu_to_le32(qp->scq->id);
+
+	qp_flags |= CMDQ_CREATE_QP1_QP_FLAGS_RESERVED_LKEY_ENABLE;
+
+	/* RQ */
+	if (rq->max_wqe) {
+		rq->hwq.max_elements = qp->rq.max_wqe;
+		rc = bnxt_qplib_alloc_init_hwq(res->pdev, &rq->hwq, NULL, 0,
+					       &rq->hwq.max_elements,
+					       BNXT_QPLIB_MAX_RQE_ENTRY_SIZE, 0,
+					       PAGE_SIZE, HWQ_TYPE_QUEUE);
+		if (rc)
+			goto fail_sq;
+
+		rq->swq = kcalloc(rq->hwq.max_elements, sizeof(*rq->swq),
+				  GFP_KERNEL);
+		if (!rq->swq) {
+			rc = -ENOMEM;
+			goto fail_rq;
+		}
+		pbl = &rq->hwq.pbl[PBL_LVL_0];
+		req.rq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
+		req.rq_pg_size_rq_lvl =
+			((rq->hwq.level & CMDQ_CREATE_QP1_RQ_LVL_MASK) <<
+			 CMDQ_CREATE_QP1_RQ_LVL_SFT) |
+				(pbl->pg_size == ROCE_PG_SIZE_4K ?
+					CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_4K :
+				 pbl->pg_size == ROCE_PG_SIZE_8K ?
+					CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_8K :
+				 pbl->pg_size == ROCE_PG_SIZE_64K ?
+					CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_64K :
+				 pbl->pg_size == ROCE_PG_SIZE_2M ?
+					CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_2M :
+				 pbl->pg_size == ROCE_PG_SIZE_8M ?
+					CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_8M :
+				 pbl->pg_size == ROCE_PG_SIZE_1G ?
+					CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_1G :
+				 CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_4K);
+		if (qp->rcq)
+			req.rcq_cid = cpu_to_le32(qp->rcq->id);
+	}
+
+	/* Header buffer - allow hdr_buf pass in */
+	rc = bnxt_qplib_alloc_qp_hdr_buf(res, qp);
+	if (rc) {
+		rc = -ENOMEM;
+		goto fail;
+	}
+	req.qp_flags = cpu_to_le32(qp_flags);
+	req.sq_size = cpu_to_le32(sq->hwq.max_elements);
+	req.rq_size = cpu_to_le32(rq->hwq.max_elements);
+
+	req.sq_fwo_sq_sge =
+		cpu_to_le16((sq->max_sge & CMDQ_CREATE_QP1_SQ_SGE_MASK) <<
+			    CMDQ_CREATE_QP1_SQ_SGE_SFT);
+	req.rq_fwo_rq_sge =
+		cpu_to_le16((rq->max_sge & CMDQ_CREATE_QP1_RQ_SGE_MASK) <<
+			    CMDQ_CREATE_QP1_RQ_SGE_SFT);
+
+	req.pd_id = cpu_to_le32(qp->pd->id);
+
+	resp = (struct creq_create_qp1_resp *)
+			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+						     NULL, 0);
+	if (!resp) {
+		dev_err(&res->pdev->dev, "QPLIB: FP: CREATE_QP1 send failed");
+		rc = -EINVAL;
+		goto fail;
+	}
+	/**/
+	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
+		/* Cmd timed out */
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP1 timed out");
+		rc = -ETIMEDOUT;
+		goto fail;
+	}
+	if (RCFW_RESP_STATUS(resp) ||
+	    RCFW_RESP_COOKIE(resp) != RCFW_CMDQ_COOKIE(req)) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP1 failed ");
+		dev_err(&rcfw->pdev->dev,
+			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
+			RCFW_RESP_STATUS(resp), RCFW_CMDQ_COOKIE(req),
+			RCFW_RESP_COOKIE(resp));
+		rc = -EINVAL;
+		goto fail;
+	}
+	qp->id = le32_to_cpu(resp->xid);
+	qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
+	sq->flush_in_progress = false;
+	rq->flush_in_progress = false;
+
+	return 0;
+
+fail:
+	bnxt_qplib_free_qp_hdr_buf(res, qp);
+fail_rq:
+	bnxt_qplib_free_hwq(res->pdev, &rq->hwq);
+	kfree(rq->swq);
+fail_sq:
+	bnxt_qplib_free_hwq(res->pdev, &sq->hwq);
+	kfree(sq->swq);
+exit:
+	return rc;
+}
+
+int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+	struct sq_send *hw_sq_send_hdr, **hw_sq_send_ptr;
+	struct cmdq_create_qp req;
+	struct creq_create_qp_resp *resp;
+	struct bnxt_qplib_pbl *pbl;
+	struct sq_psn_search **psn_search_ptr;
+	unsigned long long int psn_search, poff = 0;
+	struct bnxt_qplib_q *sq = &qp->sq;
+	struct bnxt_qplib_q *rq = &qp->rq;
+	struct bnxt_qplib_hwq *xrrq;
+	int i, rc, req_size, psn_sz;
+	u16 cmd_flags = 0, max_ssge;
+	u32 sw_prod, qp_flags = 0;
+
+	RCFW_CMD_PREP(req, CREATE_QP, cmd_flags);
+
+	/* General */
+	req.type = qp->type;
+	req.dpi = cpu_to_le32(qp->dpi->dpi);
+	req.qp_handle = cpu_to_le64(qp->qp_handle);
+
+	/* SQ */
+	psn_sz = (qp->type == CMDQ_CREATE_QP_TYPE_RC) ?
+		 sizeof(struct sq_psn_search) : 0;
+	sq->hwq.max_elements = sq->max_wqe;
+	rc = bnxt_qplib_alloc_init_hwq(res->pdev, &sq->hwq, sq->sglist,
+				       sq->nmap, &sq->hwq.max_elements,
+				       BNXT_QPLIB_MAX_SQE_ENTRY_SIZE,
+				       psn_sz,
+				       PAGE_SIZE, HWQ_TYPE_QUEUE);
+	if (rc)
+		goto exit;
+
+	sq->swq = kcalloc(sq->hwq.max_elements, sizeof(*sq->swq), GFP_KERNEL);
+	if (!sq->swq) {
+		rc = -ENOMEM;
+		goto fail_sq;
+	}
+	hw_sq_send_ptr = (struct sq_send **)sq->hwq.pbl_ptr;
+	if (psn_sz) {
+		psn_search_ptr = (struct sq_psn_search **)
+				  &hw_sq_send_ptr[SQE_PG(sq->hwq.max_elements)];
+		psn_search = (unsigned long long int)
+			      &hw_sq_send_ptr[SQE_PG(sq->hwq.max_elements)]
+			      [SQE_IDX(sq->hwq.max_elements)];
+		if (psn_search & ~PAGE_MASK) {
+			/* If the psn_search does not start on a page boundary,
+			 * then calculate the offset
+			 */
+			poff = (psn_search & ~PAGE_MASK) /
+				BNXT_QPLIB_MAX_PSNE_ENTRY_SIZE;
+		}
+		for (i = 0; i < sq->hwq.max_elements; i++)
+			sq->swq[i].psn_search =
+				&psn_search_ptr[PSNE_PG(i + poff)]
+					       [PSNE_IDX(i + poff)];
+	}
+	pbl = &sq->hwq.pbl[PBL_LVL_0];
+	req.sq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
+	req.sq_pg_size_sq_lvl =
+		((sq->hwq.level & CMDQ_CREATE_QP_SQ_LVL_MASK)
+				 <<  CMDQ_CREATE_QP_SQ_LVL_SFT) |
+		(pbl->pg_size == ROCE_PG_SIZE_4K ?
+				CMDQ_CREATE_QP_SQ_PG_SIZE_PG_4K :
+		 pbl->pg_size == ROCE_PG_SIZE_8K ?
+				CMDQ_CREATE_QP_SQ_PG_SIZE_PG_8K :
+		 pbl->pg_size == ROCE_PG_SIZE_64K ?
+				CMDQ_CREATE_QP_SQ_PG_SIZE_PG_64K :
+		 pbl->pg_size == ROCE_PG_SIZE_2M ?
+				CMDQ_CREATE_QP_SQ_PG_SIZE_PG_2M :
+		 pbl->pg_size == ROCE_PG_SIZE_8M ?
+				CMDQ_CREATE_QP_SQ_PG_SIZE_PG_8M :
+		 pbl->pg_size == ROCE_PG_SIZE_1G ?
+				CMDQ_CREATE_QP_SQ_PG_SIZE_PG_1G :
+		 CMDQ_CREATE_QP_SQ_PG_SIZE_PG_4K);
+
+	/* initialize all SQ WQEs to LOCAL_INVALID (sq prep for hw fetch) */
+	hw_sq_send_ptr = (struct sq_send **)sq->hwq.pbl_ptr;
+	for (sw_prod = 0; sw_prod < sq->hwq.max_elements; sw_prod++) {
+		hw_sq_send_hdr = &hw_sq_send_ptr[SQE_PG(sw_prod)]
+						[SQE_IDX(sw_prod)];
+		hw_sq_send_hdr->wqe_type = SQ_BASE_WQE_TYPE_LOCAL_INVALID;
+	}
+
+	if (qp->scq)
+		req.scq_cid = cpu_to_le32(qp->scq->id);
+
+	qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_RESERVED_LKEY_ENABLE;
+	qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_FR_PMR_ENABLED;
+	if (qp->sig_type)
+		qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_FORCE_COMPLETION;
+
+	/* RQ */
+	if (rq->max_wqe) {
+		rq->hwq.max_elements = rq->max_wqe;
+		rc = bnxt_qplib_alloc_init_hwq(res->pdev, &rq->hwq, rq->sglist,
+					       rq->nmap, &rq->hwq.max_elements,
+					       BNXT_QPLIB_MAX_RQE_ENTRY_SIZE, 0,
+					       PAGE_SIZE, HWQ_TYPE_QUEUE);
+		if (rc)
+			goto fail_sq;
+
+		rq->swq = kcalloc(rq->hwq.max_elements, sizeof(*rq->swq),
+				  GFP_KERNEL);
+		if (!rq->swq) {
+			rc = -ENOMEM;
+			goto fail_rq;
+		}
+		pbl = &rq->hwq.pbl[PBL_LVL_0];
+		req.rq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
+		req.rq_pg_size_rq_lvl =
+			((rq->hwq.level & CMDQ_CREATE_QP_RQ_LVL_MASK) <<
+			 CMDQ_CREATE_QP_RQ_LVL_SFT) |
+				(pbl->pg_size == ROCE_PG_SIZE_4K ?
+					CMDQ_CREATE_QP_RQ_PG_SIZE_PG_4K :
+				 pbl->pg_size == ROCE_PG_SIZE_8K ?
+					CMDQ_CREATE_QP_RQ_PG_SIZE_PG_8K :
+				 pbl->pg_size == ROCE_PG_SIZE_64K ?
+					CMDQ_CREATE_QP_RQ_PG_SIZE_PG_64K :
+				 pbl->pg_size == ROCE_PG_SIZE_2M ?
+					CMDQ_CREATE_QP_RQ_PG_SIZE_PG_2M :
+				 pbl->pg_size == ROCE_PG_SIZE_8M ?
+					CMDQ_CREATE_QP_RQ_PG_SIZE_PG_8M :
+				 pbl->pg_size == ROCE_PG_SIZE_1G ?
+					CMDQ_CREATE_QP_RQ_PG_SIZE_PG_1G :
+				 CMDQ_CREATE_QP_RQ_PG_SIZE_PG_4K);
+	}
+
+	if (qp->rcq)
+		req.rcq_cid = cpu_to_le32(qp->rcq->id);
+	req.qp_flags = cpu_to_le32(qp_flags);
+	req.sq_size = cpu_to_le32(sq->hwq.max_elements);
+	req.rq_size = cpu_to_le32(rq->hwq.max_elements);
+	qp->sq_hdr_buf = NULL;
+	qp->rq_hdr_buf = NULL;
+
+	rc = bnxt_qplib_alloc_qp_hdr_buf(res, qp);
+	if (rc)
+		goto fail_rq;
+
+	/* CTRL-22434: Irrespective of the requested SGE count on the SQ
+	 * always create the QP with max send sges possible if the requested
+	 * inline size is greater than 0.
+	 */
+	max_ssge = qp->max_inline_data ? 6 : sq->max_sge;
+	req.sq_fwo_sq_sge = cpu_to_le16(
+				((max_ssge & CMDQ_CREATE_QP_SQ_SGE_MASK)
+				 << CMDQ_CREATE_QP_SQ_SGE_SFT) | 0);
+	req.rq_fwo_rq_sge = cpu_to_le16(
+				((rq->max_sge & CMDQ_CREATE_QP_RQ_SGE_MASK)
+				 << CMDQ_CREATE_QP_RQ_SGE_SFT) | 0);
+	/* ORRQ and IRRQ */
+	if (psn_sz) {
+		xrrq = &qp->orrq;
+		xrrq->max_elements =
+			ORD_LIMIT_TO_ORRQ_SLOTS(qp->max_rd_atomic);
+		req_size = xrrq->max_elements *
+			   BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE + PAGE_SIZE - 1;
+		req_size &= ~(PAGE_SIZE - 1);
+		rc = bnxt_qplib_alloc_init_hwq(res->pdev, xrrq, NULL, 0,
+					       &xrrq->max_elements,
+					       BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE,
+					       0, req_size, HWQ_TYPE_CTX);
+		if (rc)
+			goto fail_buf_free;
+		pbl = &xrrq->pbl[PBL_LVL_0];
+		req.orrq_addr = cpu_to_le64(pbl->pg_map_arr[0]);
+
+		xrrq = &qp->irrq;
+		xrrq->max_elements = IRD_LIMIT_TO_IRRQ_SLOTS(
+						qp->max_dest_rd_atomic);
+		req_size = xrrq->max_elements *
+			   BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE + PAGE_SIZE - 1;
+		req_size &= ~(PAGE_SIZE - 1);
+
+		rc = bnxt_qplib_alloc_init_hwq(res->pdev, xrrq, NULL, 0,
+					       &xrrq->max_elements,
+					       BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE,
+					       0, req_size, HWQ_TYPE_CTX);
+		if (rc)
+			goto fail_orrq;
+
+		pbl = &xrrq->pbl[PBL_LVL_0];
+		req.irrq_addr = cpu_to_le64(pbl->pg_map_arr[0]);
+	}
+	req.pd_id = cpu_to_le32(qp->pd->id);
+
+	resp = (struct creq_create_qp_resp *)
+			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+						     NULL, 0);
+	if (!resp) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP send failed");
+		rc = -EINVAL;
+		goto fail;
+	}
+	/**/
+	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
+		/* Cmd timed out */
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP timed out");
+		rc = -ETIMEDOUT;
+		goto fail;
+	}
+	if (RCFW_RESP_STATUS(resp) ||
+	    RCFW_RESP_COOKIE(resp) != RCFW_CMDQ_COOKIE(req)) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP failed ");
+		dev_err(&rcfw->pdev->dev,
+			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
+			RCFW_RESP_STATUS(resp), RCFW_CMDQ_COOKIE(req),
+			RCFW_RESP_COOKIE(resp));
+		rc = -EINVAL;
+		goto fail;
+	}
+	qp->id = le32_to_cpu(resp->xid);
+	qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
+	sq->flush_in_progress = false;
+	rq->flush_in_progress = false;
+
+	return 0;
+
+fail:
+	if (qp->irrq.max_elements)
+		bnxt_qplib_free_hwq(res->pdev, &qp->irrq);
+fail_orrq:
+	if (qp->orrq.max_elements)
+		bnxt_qplib_free_hwq(res->pdev, &qp->orrq);
+fail_buf_free:
+	bnxt_qplib_free_qp_hdr_buf(res, qp);
+fail_rq:
+	bnxt_qplib_free_hwq(res->pdev, &rq->hwq);
+	kfree(rq->swq);
+fail_sq:
+	bnxt_qplib_free_hwq(res->pdev, &sq->hwq);
+	kfree(sq->swq);
+exit:
+	return rc;
+}
+
+static void __filter_modify_flags(struct bnxt_qplib_qp *qp)
+{
+	switch (qp->cur_qp_state) {
+	case CMDQ_MODIFY_QP_NEW_STATE_RESET:
+		switch (qp->state) {
+		case CMDQ_MODIFY_QP_NEW_STATE_INIT:
+			break;
+		default:
+			break;
+		}
+		break;
+	case CMDQ_MODIFY_QP_NEW_STATE_INIT:
+		switch (qp->state) {
+		case CMDQ_MODIFY_QP_NEW_STATE_RTR:
+			/* INIT->RTR, configure the path_mtu to the default
+			 * 2048 if not being requested
+			 */
+			if (!(qp->modify_flags &
+			      CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU)) {
+				qp->modify_flags |=
+					CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU;
+				qp->path_mtu = CMDQ_MODIFY_QP_PATH_MTU_MTU_2048;
+			}
+			qp->modify_flags &=
+				~CMDQ_MODIFY_QP_MODIFY_MASK_VLAN_ID;
+			/* Bono FW requires the max_dest_rd_atomic to be >= 1 */
+			if (qp->max_dest_rd_atomic < 1)
+				qp->max_dest_rd_atomic = 1;
+			qp->modify_flags &= ~CMDQ_MODIFY_QP_MODIFY_MASK_SRC_MAC;
+			/* Bono FW 20.6.5 requires SGID_INDEX configuration */
+			if (!(qp->modify_flags &
+			      CMDQ_MODIFY_QP_MODIFY_MASK_SGID_INDEX)) {
+				qp->modify_flags |=
+					CMDQ_MODIFY_QP_MODIFY_MASK_SGID_INDEX;
+				qp->ah.sgid_index = 0;
+			}
+			break;
+		default:
+			break;
+		}
+		break;
+	case CMDQ_MODIFY_QP_NEW_STATE_RTR:
+		switch (qp->state) {
+		case CMDQ_MODIFY_QP_NEW_STATE_RTS:
+			/* Bono FW requires the max_rd_atomic to be >= 1 */
+			if (qp->max_rd_atomic < 1)
+				qp->max_rd_atomic = 1;
+			/* Bono FW does not allow PKEY_INDEX,
+			 * DGID, FLOW_LABEL, SGID_INDEX, HOP_LIMIT,
+			 * TRAFFIC_CLASS, DEST_MAC, PATH_MTU, RQ_PSN,
+			 * MIN_RNR_TIMER, MAX_DEST_RD_ATOMIC, DEST_QP_ID
+			 * modification
+			 */
+			qp->modify_flags &=
+				~(CMDQ_MODIFY_QP_MODIFY_MASK_PKEY |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_DGID |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_FLOW_LABEL |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_SGID_INDEX |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_HOP_LIMIT |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_TRAFFIC_CLASS |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_DEST_MAC |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_RQ_PSN |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_MIN_RNR_TIMER |
+				  CMDQ_MODIFY_QP_MODIFY_MASK_MAX_DEST_RD_ATOMIC
+				  | CMDQ_MODIFY_QP_MODIFY_MASK_DEST_QP_ID);
+			break;
+		default:
+			break;
+		}
+		break;
+	case CMDQ_MODIFY_QP_NEW_STATE_RTS:
+		break;
+	case CMDQ_MODIFY_QP_NEW_STATE_SQD:
+		break;
+	case CMDQ_MODIFY_QP_NEW_STATE_SQE:
+		break;
+	case CMDQ_MODIFY_QP_NEW_STATE_ERR:
+		break;
+	default:
+		break;
+	}
+}
+
+int bnxt_qplib_modify_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+	struct cmdq_modify_qp req;
+	struct creq_modify_qp_resp *resp;
+	u16 cmd_flags = 0, pkey;
+	u32 temp32[4];
+	u32 bmask;
+
+	RCFW_CMD_PREP(req, MODIFY_QP, cmd_flags);
+
+	/* Filter out the qp_attr_mask based on the state->new transition */
+	__filter_modify_flags(qp);
+	bmask = qp->modify_flags;
+	req.modify_mask = cpu_to_le64(qp->modify_flags);
+	req.qp_cid = cpu_to_le32(qp->id);
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_STATE) {
+		req.network_type_en_sqd_async_notify_new_state =
+				(qp->state & CMDQ_MODIFY_QP_NEW_STATE_MASK) |
+				(qp->en_sqd_async_notify ?
+					CMDQ_MODIFY_QP_EN_SQD_ASYNC_NOTIFY : 0);
+	}
+	req.network_type_en_sqd_async_notify_new_state |= qp->nw_type;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_ACCESS)
+		req.access = qp->access;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_PKEY) {
+		if (!bnxt_qplib_get_pkey(res, &res->pkey_tbl,
+					 qp->pkey_index, &pkey))
+			req.pkey = cpu_to_le16(pkey);
+	}
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_QKEY)
+		req.qkey = cpu_to_le32(qp->qkey);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_DGID) {
+		memcpy(temp32, qp->ah.dgid.data, sizeof(struct bnxt_qplib_gid));
+		req.dgid[0] = cpu_to_le32(temp32[0]);
+		req.dgid[1] = cpu_to_le32(temp32[1]);
+		req.dgid[2] = cpu_to_le32(temp32[2]);
+		req.dgid[3] = cpu_to_le32(temp32[3]);
+	}
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_FLOW_LABEL)
+		req.flow_label = cpu_to_le32(qp->ah.flow_label);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_SGID_INDEX)
+		req.sgid_index = cpu_to_le16(res->sgid_tbl.hw_id
+					     [qp->ah.sgid_index]);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_HOP_LIMIT)
+		req.hop_limit = qp->ah.hop_limit;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_TRAFFIC_CLASS)
+		req.traffic_class = qp->ah.traffic_class;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_DEST_MAC)
+		memcpy(req.dest_mac, qp->ah.dmac, 6);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU)
+		req.path_mtu = cpu_to_le16(qp->path_mtu);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_TIMEOUT)
+		req.timeout = qp->timeout;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_RETRY_CNT)
+		req.retry_cnt = qp->retry_cnt;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_RNR_RETRY)
+		req.rnr_retry = qp->rnr_retry;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_MIN_RNR_TIMER)
+		req.min_rnr_timer = qp->min_rnr_timer;
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_RQ_PSN)
+		req.rq_psn = cpu_to_le32(qp->rq.psn);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_SQ_PSN)
+		req.sq_psn = cpu_to_le32(qp->sq.psn);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_MAX_RD_ATOMIC)
+		req.max_rd_atomic =
+			ORD_LIMIT_TO_ORRQ_SLOTS(qp->max_rd_atomic);
+
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_MAX_DEST_RD_ATOMIC)
+		req.max_dest_rd_atomic =
+			IRD_LIMIT_TO_IRRQ_SLOTS(qp->max_dest_rd_atomic);
+
+	req.sq_size = cpu_to_le32(qp->sq.hwq.max_elements);
+	req.rq_size = cpu_to_le32(qp->rq.hwq.max_elements);
+	req.sq_sge = cpu_to_le16(qp->sq.max_sge);
+	req.rq_sge = cpu_to_le16(qp->rq.max_sge);
+	req.max_inline_data = cpu_to_le32(qp->max_inline_data);
+	if (bmask & CMDQ_MODIFY_QP_MODIFY_MASK_DEST_QP_ID)
+		req.dest_qp_id = cpu_to_le32(qp->dest_qpn);
+
+	req.vlan_pcp_vlan_dei_vlan_id = cpu_to_le16(qp->vlan_id);
+
+	resp = (struct creq_modify_qp_resp *)
+			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+						     NULL, 0);
+	if (!resp) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: MODIFY_QP send failed");
+		return -EINVAL;
+	}
+	/**/
+	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
+		/* Cmd timed out */
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: MODIFY_QP timed out");
+		return -ETIMEDOUT;
+	}
+	if (RCFW_RESP_STATUS(resp) ||
+	    RCFW_RESP_COOKIE(resp) != RCFW_CMDQ_COOKIE(req)) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: MODIFY_QP failed ");
+		dev_err(&rcfw->pdev->dev,
+			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
+			RCFW_RESP_STATUS(resp), RCFW_CMDQ_COOKIE(req),
+			RCFW_RESP_COOKIE(resp));
+		return -EINVAL;
+	}
+	qp->cur_qp_state = qp->state;
+	return 0;
+}
+
+int bnxt_qplib_query_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+	struct cmdq_query_qp req;
+	struct creq_query_qp_resp *resp;
+	struct creq_query_qp_resp_sb *sb;
+	u16 cmd_flags = 0;
+	u32 temp32[4];
+	int i;
+
+	RCFW_CMD_PREP(req, QUERY_QP, cmd_flags);
+
+	req.qp_cid = cpu_to_le32(qp->id);
+	req.resp_size = sizeof(*sb) / BNXT_QPLIB_CMDQE_UNITS;
+	resp = (struct creq_query_qp_resp *)
+			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+						     (void **)&sb, 0);
+	if (!resp) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: QUERY_QP send failed");
+		return -EINVAL;
+	}
+	/**/
+	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
+		/* Cmd timed out */
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: QUERY_QP timed out");
+		return -ETIMEDOUT;
+	}
+	if (RCFW_RESP_STATUS(resp) ||
+	    RCFW_RESP_COOKIE(resp) != RCFW_CMDQ_COOKIE(req)) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: QUERY_QP failed ");
+		dev_err(&rcfw->pdev->dev,
+			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
+			RCFW_RESP_STATUS(resp), RCFW_CMDQ_COOKIE(req),
+			RCFW_RESP_COOKIE(resp));
+		return -EINVAL;
+	}
+	/* Extract the context from the side buffer */
+	qp->state = sb->en_sqd_async_notify_state &
+			CREQ_QUERY_QP_RESP_SB_STATE_MASK;
+	qp->en_sqd_async_notify = sb->en_sqd_async_notify_state &
+				  CREQ_QUERY_QP_RESP_SB_EN_SQD_ASYNC_NOTIFY ?
+				  true : false;
+	qp->access = sb->access;
+	qp->pkey_index = le16_to_cpu(sb->pkey);
+	qp->qkey = le32_to_cpu(sb->qkey);
+
+	temp32[0] = le32_to_cpu(sb->dgid[0]);
+	temp32[1] = le32_to_cpu(sb->dgid[1]);
+	temp32[2] = le32_to_cpu(sb->dgid[2]);
+	temp32[3] = le32_to_cpu(sb->dgid[3]);
+	memcpy(qp->ah.dgid.data, temp32, sizeof(qp->ah.dgid.data));
+
+	qp->ah.flow_label = le32_to_cpu(sb->flow_label);
+
+	qp->ah.sgid_index = 0;
+	for (i = 0; i < res->sgid_tbl.max; i++) {
+		if (res->sgid_tbl.hw_id[i] == le16_to_cpu(sb->sgid_index)) {
+			qp->ah.sgid_index = i;
+			break;
+		}
+	}
+	if (i == res->sgid_tbl.max)
+		dev_warn(&res->pdev->dev, "QPLIB: SGID not found??");
+
+	qp->ah.hop_limit = sb->hop_limit;
+	qp->ah.traffic_class = sb->traffic_class;
+	memcpy(qp->ah.dmac, sb->dest_mac, 6);
+	qp->ah.vlan_id = le16_to_cpu((sb->path_mtu_dest_vlan_id &
+				CREQ_QUERY_QP_RESP_SB_VLAN_ID_MASK) >>
+				CREQ_QUERY_QP_RESP_SB_VLAN_ID_SFT);
+	qp->path_mtu = sb->path_mtu_dest_vlan_id &
+				    CREQ_QUERY_QP_RESP_SB_PATH_MTU_MASK;
+	qp->timeout = sb->timeout;
+	qp->retry_cnt = sb->retry_cnt;
+	qp->rnr_retry = sb->rnr_retry;
+	qp->min_rnr_timer = sb->min_rnr_timer;
+	qp->rq.psn = le32_to_cpu(sb->rq_psn);
+	qp->max_rd_atomic = ORRQ_SLOTS_TO_ORD_LIMIT(sb->max_rd_atomic);
+	qp->sq.psn = le32_to_cpu(sb->sq_psn);
+	qp->max_dest_rd_atomic =
+			IRRQ_SLOTS_TO_IRD_LIMIT(sb->max_dest_rd_atomic);
+	qp->sq.max_wqe = qp->sq.hwq.max_elements;
+	qp->rq.max_wqe = qp->rq.hwq.max_elements;
+	qp->sq.max_sge = le16_to_cpu(sb->sq_sge);
+	qp->rq.max_sge = le32_to_cpu(sb->rq_sge);
+	qp->max_inline_data = le32_to_cpu(sb->max_inline_data);
+	qp->dest_qpn = le32_to_cpu(sb->dest_qp_id);
+	memcpy(qp->smac, sb->src_mac, 6);
+	qp->vlan_id = le16_to_cpu(sb->vlan_pcp_vlan_dei_vlan_id);
+	return 0;
+}
+
+static void __clean_cq(struct bnxt_qplib_cq *cq, u64 qp)
+{
+	struct bnxt_qplib_hwq *cq_hwq = &cq->hwq;
+	struct cq_base *hw_cqe, **hw_cqe_ptr;
+	int i;
+
+	for (i = 0; i < cq_hwq->max_elements; i++) {
+		hw_cqe_ptr = (struct cq_base **)cq_hwq->pbl_ptr;
+		hw_cqe = &hw_cqe_ptr[CQE_PG(i)][CQE_IDX(i)];
+		if (!CQE_CMP_VALID(hw_cqe, i, cq_hwq->max_elements))
+			continue;
+		switch (hw_cqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK) {
+		case CQ_BASE_CQE_TYPE_REQ:
+		case CQ_BASE_CQE_TYPE_TERMINAL:
+		{
+			struct cq_req *cqe = (struct cq_req *)hw_cqe;
+
+			if (qp == le64_to_cpu(cqe->qp_handle))
+				cqe->qp_handle = 0;
+			break;
+		}
+		case CQ_BASE_CQE_TYPE_RES_RC:
+		case CQ_BASE_CQE_TYPE_RES_UD:
+		case CQ_BASE_CQE_TYPE_RES_RAWETH_QP1:
+		{
+			struct cq_res_rc *cqe = (struct cq_res_rc *)hw_cqe;
+
+			if (qp == le64_to_cpu(cqe->qp_handle))
+				cqe->qp_handle = 0;
+			break;
+		}
+		default:
+			break;
+		}
+	}
+}
+
+static unsigned long bnxt_qplib_lock_cqs(struct bnxt_qplib_qp *qp)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&qp->scq->hwq.lock, flags);
+	if (qp->rcq && qp->rcq != qp->scq)
+		spin_lock(&qp->rcq->hwq.lock);
+
+	return flags;
+}
+
+static void bnxt_qplib_unlock_cqs(struct bnxt_qplib_qp *qp,
+				  unsigned long flags)
+{
+	if (qp->rcq && qp->rcq != qp->scq)
+		spin_unlock(&qp->rcq->hwq.lock);
+	spin_unlock_irqrestore(&qp->scq->hwq.lock, flags);
+}
+
+int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
+			  struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+	struct cmdq_destroy_qp req;
+	struct creq_destroy_qp_resp *resp;
+	unsigned long flags;
+	u16 cmd_flags = 0;
+
+	RCFW_CMD_PREP(req, DESTROY_QP, cmd_flags);
+
+	req.qp_cid = cpu_to_le32(qp->id);
+	resp = (struct creq_destroy_qp_resp *)
+			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+						     NULL, 0);
+	if (!resp) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_QP send failed");
+		return -EINVAL;
+	}
+	/**/
+	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
+		/* Cmd timed out */
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_QP timed out");
+		return -ETIMEDOUT;
+	}
+	if (RCFW_RESP_STATUS(resp) ||
+	    RCFW_RESP_COOKIE(resp) != RCFW_CMDQ_COOKIE(req)) {
+		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_QP failed ");
+		dev_err(&rcfw->pdev->dev,
+			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
+			RCFW_RESP_STATUS(resp), RCFW_CMDQ_COOKIE(req),
+			RCFW_RESP_COOKIE(resp));
+		return -EINVAL;
+	}
+
+	/* Must walk the associated CQs to nullified the QP ptr */
+	flags = bnxt_qplib_lock_cqs(qp);
+	__clean_cq(qp->scq, (u64)qp);
+	if (qp->rcq != qp->scq)
+		__clean_cq(qp->rcq, (u64)qp);
+	bnxt_qplib_unlock_cqs(qp, flags);
+
+	bnxt_qplib_free_qp_hdr_buf(res, qp);
+	bnxt_qplib_free_hwq(res->pdev, &qp->sq.hwq);
+	kfree(qp->sq.swq);
+
+	bnxt_qplib_free_hwq(res->pdev, &qp->rq.hwq);
+	kfree(qp->rq.swq);
+
+	if (qp->irrq.max_elements)
+		bnxt_qplib_free_hwq(res->pdev, &qp->irrq);
+	if (qp->orrq.max_elements)
+		bnxt_qplib_free_hwq(res->pdev, &qp->orrq);
+
+	return 0;
+}
+
 /* CQ */
 
 /* Spinlock must be held */
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
index 4cb4d47..61c8bcc7 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_qplib_fp.h
@@ -13,8 +13,246 @@
 
 #ifndef __BNXT_QPLIB_FP_H__
 #define __BNXT_QPLIB_FP_H__
+struct bnxt_qplib_sge {
+	u64				addr;
+	u32				lkey;
+	u32				size;
+};
+
+#define BNXT_QPLIB_MAX_SQE_ENTRY_SIZE	sizeof(struct sq_send)
+
+#define SQE_CNT_PER_PG		(PAGE_SIZE / BNXT_QPLIB_MAX_SQE_ENTRY_SIZE)
+#define SQE_MAX_IDX_PER_PG	(SQE_CNT_PER_PG - 1)
+#define SQE_PG(x)		(((x) & ~SQE_MAX_IDX_PER_PG) / SQE_CNT_PER_PG)
+#define SQE_IDX(x)		((x) & SQE_MAX_IDX_PER_PG)
+
+#define BNXT_QPLIB_MAX_PSNE_ENTRY_SIZE	sizeof(struct sq_psn_search)
+
+#define PSNE_CNT_PER_PG		(PAGE_SIZE / BNXT_QPLIB_MAX_PSNE_ENTRY_SIZE)
+#define PSNE_MAX_IDX_PER_PG	(PSNE_CNT_PER_PG - 1)
+#define PSNE_PG(x)		(((x) & ~PSNE_MAX_IDX_PER_PG) / PSNE_CNT_PER_PG)
+#define PSNE_IDX(x)		((x) & PSNE_MAX_IDX_PER_PG)
+
+#define BNXT_QPLIB_QP_MAX_SGL	6
+
+struct bnxt_qplib_swq {
+	u64				wr_id;
+	u8				type;
+	u8				flags;
+	u32				start_psn;
+	u32				next_psn;
+	struct sq_psn_search		*psn_search;
+};
+
+struct bnxt_qplib_swqe {
+	/* General */
+	u64				wr_id;
+	u8				reqs_type;
+	u8				type;
+#define BNXT_QPLIB_SWQE_TYPE_SEND			0
+#define BNXT_QPLIB_SWQE_TYPE_SEND_WITH_IMM		1
+#define BNXT_QPLIB_SWQE_TYPE_SEND_WITH_INV		2
+#define BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE			4
+#define BNXT_QPLIB_SWQE_TYPE_RDMA_WRITE_WITH_IMM	5
+#define BNXT_QPLIB_SWQE_TYPE_RDMA_READ			6
+#define BNXT_QPLIB_SWQE_TYPE_ATOMIC_CMP_AND_SWP		8
+#define BNXT_QPLIB_SWQE_TYPE_ATOMIC_FETCH_AND_ADD	11
+#define BNXT_QPLIB_SWQE_TYPE_LOCAL_INV			12
+#define BNXT_QPLIB_SWQE_TYPE_FAST_REG_MR		13
+#define BNXT_QPLIB_SWQE_TYPE_REG_MR			13
+#define BNXT_QPLIB_SWQE_TYPE_BIND_MW			14
+#define BNXT_QPLIB_SWQE_TYPE_RECV			128
+#define BNXT_QPLIB_SWQE_TYPE_RECV_RDMA_IMM		129
+	u8				flags;
+#define BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP		BIT(0)
+#define BNXT_QPLIB_SWQE_FLAGS_RD_ATOMIC_FENCE		BIT(1)
+#define BNXT_QPLIB_SWQE_FLAGS_UC_FENCE			BIT(2)
+#define BNXT_QPLIB_SWQE_FLAGS_SOLICIT_EVENT		BIT(3)
+#define BNXT_QPLIB_SWQE_FLAGS_INLINE			BIT(4)
+	struct bnxt_qplib_sge		sg_list[BNXT_QPLIB_QP_MAX_SGL];
+	int				num_sge;
+	/* Max inline data is 96 bytes */
+	u32				inline_len;
+#define BNXT_QPLIB_SWQE_MAX_INLINE_LENGTH		96
+	u8		inline_data[BNXT_QPLIB_SWQE_MAX_INLINE_LENGTH];
+
+	union {
+		/* Send, with imm, inval key */
+		struct {
+			u32		imm_data_or_inv_key;
+			u32		q_key;
+			u32		dst_qp;
+			u16		avid;
+		} send;
+
+		/* Send Raw Ethernet and QP1 */
+		struct {
+			u16		lflags;
+			u16		cfa_action;
+			u32		cfa_meta;
+		} rawqp1;
+
+		/* RDMA write, with imm, read */
+		struct {
+			u32		imm_data_or_inv_key;
+			u64		remote_va;
+			u32		r_key;
+		} rdma;
+
+		/* Atomic cmp/swap, fetch/add */
+		struct {
+			u64		remote_va;
+			u32		r_key;
+			u64		swap_data;
+			u64		cmp_data;
+		} atomic;
+
+		/* Local Invalidate */
+		struct {
+			u32		inv_l_key;
+		} local_inv;
+
+		/* FR-PMR */
+		struct {
+			u8		access_cntl;
+			u8		pg_sz_log;
+			bool		zero_based;
+			u32		l_key;
+			u32		length;
+			u8		pbl_pg_sz_log;
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_4K			0
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_8K			1
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_64K			4
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_256K			6
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_1M			8
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_2M			9
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_4M			10
+#define BNXT_QPLIB_SWQE_PAGE_SIZE_1G			18
+			u8		levels;
+#define PAGE_SHIFT_4K	12
+			u64		*pbl_ptr;
+			dma_addr_t	pbl_dma_ptr;
+			u64		*page_list;
+			u16		page_list_len;
+			u64		va;
+		} frmr;
+
+		/* Bind */
+		struct {
+			u8		access_cntl;
+#define BNXT_QPLIB_BIND_SWQE_ACCESS_LOCAL_WRITE		BIT(0)
+#define BNXT_QPLIB_BIND_SWQE_ACCESS_REMOTE_READ		BIT(1)
+#define BNXT_QPLIB_BIND_SWQE_ACCESS_REMOTE_WRITE	BIT(2)
+#define BNXT_QPLIB_BIND_SWQE_ACCESS_REMOTE_ATOMIC	BIT(3)
+#define BNXT_QPLIB_BIND_SWQE_ACCESS_WINDOW_BIND		BIT(4)
+			bool		zero_based;
+			u8		mw_type;
+			u32		parent_l_key;
+			u32		r_key;
+			u64		va;
+			u32		length;
+		} bind;
+	};
+};
+
+#define BNXT_QPLIB_MAX_RQE_ENTRY_SIZE	sizeof(struct rq_wqe)
+
+#define RQE_CNT_PER_PG		(PAGE_SIZE / BNXT_QPLIB_MAX_RQE_ENTRY_SIZE)
+#define RQE_MAX_IDX_PER_PG	(RQE_CNT_PER_PG - 1)
+#define RQE_PG(x)		(((x) & ~RQE_MAX_IDX_PER_PG) / RQE_CNT_PER_PG)
+#define RQE_IDX(x)		((x) & RQE_MAX_IDX_PER_PG)
+
+struct bnxt_qplib_q {
+	struct bnxt_qplib_hwq		hwq;
+	struct bnxt_qplib_swq		*swq;
+	struct scatterlist		*sglist;
+	u32				nmap;
+	u32				max_wqe;
+	u16				max_sge;
+	u32				psn;
+	bool				flush_in_progress;
+};
+
+struct bnxt_qplib_qp {
+	struct bnxt_qplib_pd		*pd;
+	struct bnxt_qplib_dpi		*dpi;
+	u64				qp_handle;
+	u32				id;
+	u8				type;
+	u8				sig_type;
+	u64				modify_flags;
+	u8				state;
+	u8				cur_qp_state;
+	u32				max_inline_data;
+	u32				mtu;
+	u32				path_mtu;
+	bool				en_sqd_async_notify;
+	u16				pkey_index;
+	u32				qkey;
+	u32				dest_qp_id;
+	u8				access;
+	u8				timeout;
+	u8				retry_cnt;
+	u8				rnr_retry;
+	u32				min_rnr_timer;
+	u32				max_rd_atomic;
+	u32				max_dest_rd_atomic;
+	u32				dest_qpn;
+	u8				smac[6];
+	u16				vlan_id;
+	u8				nw_type;
+	struct bnxt_qplib_ah		ah;
+
+#define BTH_PSN_MASK			((1 << 24) - 1)
+	/* SQ */
+	struct bnxt_qplib_q		sq;
+	/* RQ */
+	struct bnxt_qplib_q		rq;
+	/* SRQ */
+	struct bnxt_qplib_srq		*srq;
+	/* CQ */
+	struct bnxt_qplib_cq		*scq;
+	struct bnxt_qplib_cq		*rcq;
+	/* IRRQ and ORRQ */
+	struct bnxt_qplib_hwq		irrq;
+	struct bnxt_qplib_hwq		orrq;
+	/* Header buffer for QP1 */
+	int				sq_hdr_buf_size;
+	int				rq_hdr_buf_size;
+/*
+ * Buffer space for ETH(14), IP or GRH(40), UDP header(8)
+ * and ib_bth + ib_deth (20).
+ * Max required is 82 when RoCE V2 is enabled
+ */
+#define BNXT_QPLIB_MAX_QP1_SQ_HDR_SIZE_V2	86
+	/* Ethernet header	=  14 */
+	/* ib_grh		=  40 (provided by MAD) */
+	/* ib_bth + ib_deth	=  20 */
+	/* MAD			= 256 (provided by MAD) */
+	/* iCRC			=   4 */
+#define BNXT_QPLIB_MAX_QP1_RQ_ETH_HDR_SIZE	14
+#define BNXT_QPLIB_MAX_QP1_RQ_HDR_SIZE_V2	512
+#define BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV4	20
+#define BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV6	40
+#define BNXT_QPLIB_MAX_QP1_RQ_BDETH_HDR_SIZE	20
+	void				*sq_hdr_buf;
+	dma_addr_t			sq_hdr_buf_map;
+	void				*rq_hdr_buf;
+	dma_addr_t			rq_hdr_buf_map;
+};
+
 #define BNXT_QPLIB_MAX_CQE_ENTRY_SIZE	sizeof(struct cq_base)
 
+#define CQE_CNT_PER_PG		(PAGE_SIZE / BNXT_QPLIB_MAX_CQE_ENTRY_SIZE)
+#define CQE_MAX_IDX_PER_PG	(CQE_CNT_PER_PG - 1)
+#define CQE_PG(x)		(((x) & ~CQE_MAX_IDX_PER_PG) / CQE_CNT_PER_PG)
+#define CQE_IDX(x)		((x) & CQE_MAX_IDX_PER_PG)
+
+#define ROCE_CQE_CMP_V			0
+#define CQE_CMP_VALID(hdr, raw_cons, cp_bit)			\
+	(!!((hdr)->cqe_type_toggle & CQ_BASE_TOGGLE) ==		\
+	   !((raw_cons) & (cp_bit)))
+
 struct bnxt_qplib_cqe {
 	u8				status;
 	u8				type;
@@ -57,6 +295,13 @@ struct bnxt_qplib_cq {
 	wait_queue_head_t		waitq;
 };
 
+#define BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE	sizeof(struct xrrq_irrq)
+#define BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE	sizeof(struct xrrq_orrq)
+#define IRD_LIMIT_TO_IRRQ_SLOTS(x)	(2 * (x) + 2)
+#define IRRQ_SLOTS_TO_IRD_LIMIT(s)	(((s) >> 1) - 1)
+#define ORD_LIMIT_TO_ORRQ_SLOTS(x)	((x) + 1)
+#define ORRQ_SLOTS_TO_ORD_LIMIT(s)	((s) - 1)
+
 #define BNXT_QPLIB_MAX_NQE_ENTRY_SIZE	sizeof(struct nq_base)
 
 #define NQE_CNT_PER_PG		(PAGE_SIZE / BNXT_QPLIB_MAX_NQE_ENTRY_SIZE)
@@ -115,6 +360,11 @@ int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
 			 int (*srqn_handler)(struct bnxt_qplib_nq *nq,
 					     void *srq,
 					     u8 event));
+int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
+int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
+int bnxt_qplib_modify_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
+int bnxt_qplib_query_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
+int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
 int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 int bnxt_qplib_destroy_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re.h b/drivers/infiniband/hw/bnxtre/bnxt_re.h
index fc01340..30dee42 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re.h
@@ -35,6 +35,14 @@ struct bnxt_re_work {
 	struct net_device	*vlan_dev;
 };
 
+struct bnxt_re_sqp_entries {
+	struct bnxt_qplib_sge sge;
+	u64 wrid;
+	/* For storing the actual qp1 cqe */
+	struct bnxt_qplib_cqe cqe;
+	struct bnxt_re_qp *qp1_qp;
+};
+
 #define BNXT_RE_MIN_MSIX		2
 #define BNXT_RE_MAX_MSIX		16
 #define BNXT_RE_AEQ_IDX			0
@@ -83,6 +91,12 @@ struct bnxt_re_dev {
 	atomic_t			mw_count;
 	/* Max of 2 lossless traffic class supported per port */
 	u16				cosq[2];
+
+	/* QP for for handling QP1 packets */
+	u32				sqp_id;
+	struct bnxt_re_qp		*qp1_sqp;
+	struct bnxt_re_ah		*sqp_ah;
+	struct bnxt_re_sqp_entries sqp_tbl[1024];
 };
 
 #define to_bnxt_re(ptr, type, member)	\
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
index 5884920..5e35d19 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.c
@@ -624,6 +624,481 @@ int bnxt_re_query_ah(struct ib_ah *ib_ah, struct ib_ah_attr *ah_attr)
 	return 0;
 }
 
+/* Queue Pairs */
+int bnxt_re_destroy_qp(struct ib_qp *ib_qp)
+{
+	struct bnxt_re_qp *qp = to_bnxt_re(ib_qp, struct bnxt_re_qp, ib_qp);
+	struct bnxt_re_dev *rdev = qp->rdev;
+	int rc;
+
+	rc = bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Failed to destroy HW QP");
+		return rc;
+	}
+	if (ib_qp->qp_type == IB_QPT_GSI && rdev->qp1_sqp) {
+		rc = bnxt_qplib_destroy_ah(&rdev->qplib_res,
+					   &rdev->sqp_ah->qplib_ah);
+		if (rc) {
+			dev_err(rdev_to_dev(rdev),
+				"Failed to destroy HW AH for shadow QP");
+			return rc;
+		}
+
+		rc = bnxt_qplib_destroy_qp(&rdev->qplib_res,
+					   &rdev->qp1_sqp->qplib_qp);
+		if (rc) {
+			dev_err(rdev_to_dev(rdev),
+				"Failed to destroy Shadow QP");
+			return rc;
+		}
+		mutex_lock(&rdev->qp_lock);
+		list_del(&rdev->qp1_sqp->list);
+		atomic_dec(&rdev->qp_count);
+		mutex_unlock(&rdev->qp_lock);
+
+		kfree(rdev->sqp_ah);
+		kfree(rdev->qp1_sqp);
+	}
+
+	if (qp->rumem && !IS_ERR(qp->rumem))
+		ib_umem_release(qp->rumem);
+	if (qp->sumem && !IS_ERR(qp->sumem))
+		ib_umem_release(qp->sumem);
+
+	mutex_lock(&rdev->qp_lock);
+	list_del(&qp->list);
+	atomic_dec(&rdev->qp_count);
+	mutex_unlock(&rdev->qp_lock);
+	kfree(qp);
+	return 0;
+}
+
+static u8 __from_ib_qp_type(enum ib_qp_type type)
+{
+	switch (type) {
+	case IB_QPT_GSI:
+		return CMDQ_CREATE_QP1_TYPE_GSI;
+	case IB_QPT_RC:
+		return CMDQ_CREATE_QP_TYPE_RC;
+	case IB_QPT_UD:
+		return CMDQ_CREATE_QP_TYPE_UD;
+	case IB_QPT_RAW_ETHERTYPE:
+		return CMDQ_CREATE_QP_TYPE_RAW_ETHERTYPE;
+	default:
+		return IB_QPT_MAX;
+	}
+}
+
+int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
+			 struct bnxt_re_qp *qp, struct ib_udata *udata)
+{
+	struct bnxt_re_qp_req ureq;
+	struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
+	struct ib_umem *umem;
+	int bytes = 0;
+	struct ib_ucontext *context = pd->ib_pd.uobject->context;
+	struct bnxt_re_ucontext *cntx = to_bnxt_re(context,
+						  struct bnxt_re_ucontext,
+						  ib_uctx);
+	if (ib_copy_from_udata(&ureq, udata, sizeof(ureq)))
+		return -EFAULT;
+
+	bytes = (qplib_qp->sq.max_wqe * BNXT_QPLIB_MAX_SQE_ENTRY_SIZE);
+	/* Consider mapping PSN search memory only for RC QPs. */
+	if (qplib_qp->type == CMDQ_CREATE_QP_TYPE_RC)
+		bytes += (qplib_qp->sq.max_wqe * sizeof(struct sq_psn_search));
+	bytes = PAGE_ALIGN(bytes);
+	umem = ib_umem_get(context, ureq.qpsva, bytes,
+			   IB_ACCESS_LOCAL_WRITE, 1);
+	if (IS_ERR(umem))
+		return PTR_ERR(umem);
+
+	qp->sumem = umem;
+	qplib_qp->sq.sglist = umem->sg_head.sgl;
+	qplib_qp->sq.nmap = umem->nmap;
+	qplib_qp->qp_handle = ureq.qp_handle;
+
+	if (!qp->qplib_qp.srq) {
+		bytes = (qplib_qp->rq.max_wqe * BNXT_QPLIB_MAX_RQE_ENTRY_SIZE);
+		bytes = PAGE_ALIGN(bytes);
+		umem = ib_umem_get(context, ureq.qprva, bytes,
+				   IB_ACCESS_LOCAL_WRITE, 1);
+		if (IS_ERR(umem))
+			goto rqfail;
+		qp->rumem = umem;
+		qplib_qp->rq.sglist = umem->sg_head.sgl;
+		qplib_qp->rq.nmap = umem->nmap;
+	}
+
+	qplib_qp->dpi = cntx->dpi;
+	return 0;
+rqfail:
+	ib_umem_release(qp->sumem);
+	qp->sumem = NULL;
+	qplib_qp->sq.sglist = NULL;
+	qplib_qp->sq.nmap = 0;
+
+	return PTR_ERR(umem);
+}
+
+struct bnxt_re_ah *bnxt_re_create_shadow_qp_ah(struct bnxt_re_pd *pd,
+					       struct bnxt_qplib_res *qp1_res,
+					       struct bnxt_qplib_qp *qp1_qp)
+{
+	struct bnxt_re_dev *rdev = pd->rdev;
+	struct bnxt_re_ah *ah;
+	union ib_gid sgid;
+	int rc;
+
+	ah = kzalloc(sizeof(*ah), GFP_KERNEL);
+	if (!ah)
+		return NULL;
+
+	memset(ah, 0, sizeof(*ah));
+	ah->rdev = rdev;
+	ah->qplib_ah.pd = &pd->qplib_pd;
+
+	rc = bnxt_re_query_gid(&rdev->ibdev, 1, 0, &sgid);
+	if (rc)
+		goto fail;
+
+	/* supply the dgid data same as sgid */
+	memcpy(ah->qplib_ah.dgid.data, &sgid.raw,
+	       sizeof(union ib_gid));
+	ah->qplib_ah.sgid_index = 0;
+
+	ah->qplib_ah.traffic_class = 0;
+	ah->qplib_ah.flow_label = 0;
+	ah->qplib_ah.hop_limit = 1;
+	ah->qplib_ah.sl = 0;
+	/* Have DMAC same as SMAC */
+	ether_addr_copy(ah->qplib_ah.dmac, rdev->netdev->dev_addr);
+
+	rc = bnxt_qplib_create_ah(&rdev->qplib_res, &ah->qplib_ah);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev),
+			"Failed to allocate HW AH for Shadow QP");
+		goto fail;
+	}
+
+	return ah;
+
+fail:
+	kfree(ah);
+	return NULL;
+}
+
+struct bnxt_re_qp *bnxt_re_create_shadow_qp(struct bnxt_re_pd *pd,
+					    struct bnxt_qplib_res *qp1_res,
+					    struct bnxt_qplib_qp *qp1_qp)
+{
+	struct bnxt_re_dev *rdev = pd->rdev;
+	struct bnxt_re_qp *qp;
+	int rc;
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp)
+		return NULL;
+
+	memset(qp, 0, sizeof(*qp));
+	qp->rdev = rdev;
+
+	/* Initialize the shadow QP structure from the QP1 values */
+	ether_addr_copy(qp->qplib_qp.smac, rdev->netdev->dev_addr);
+
+	qp->qplib_qp.pd = &pd->qplib_pd;
+	qp->qplib_qp.qp_handle = (u64)&qp->qplib_qp;
+	qp->qplib_qp.type = IB_QPT_UD;
+
+	qp->qplib_qp.max_inline_data = 0;
+	qp->qplib_qp.sig_type = true;
+
+	/* Shadow QP SQ depth should be same as QP1 RQ depth */
+	qp->qplib_qp.sq.max_wqe = qp1_qp->rq.max_wqe;
+	qp->qplib_qp.sq.max_sge = 2;
+
+	qp->qplib_qp.scq = qp1_qp->scq;
+	qp->qplib_qp.rcq = qp1_qp->rcq;
+
+	qp->qplib_qp.rq.max_wqe = qp1_qp->rq.max_wqe;
+	qp->qplib_qp.rq.max_sge = qp1_qp->rq.max_sge;
+
+	qp->qplib_qp.mtu = qp1_qp->mtu;
+
+	qp->qplib_qp.sq_hdr_buf_size = 0;
+	qp->qplib_qp.rq_hdr_buf_size = BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV6;
+	qp->qplib_qp.dpi = &rdev->dpi_privileged;
+
+	rc = bnxt_qplib_create_qp(qp1_res, &qp->qplib_qp);
+	if (rc)
+		goto fail;
+
+	rdev->sqp_id = qp->qplib_qp.id;
+
+	spin_lock_init(&qp->sq_lock);
+	INIT_LIST_HEAD(&qp->list);
+	mutex_lock(&rdev->qp_lock);
+	list_add_tail(&qp->list, &rdev->qp_list);
+	atomic_inc(&rdev->qp_count);
+	mutex_unlock(&rdev->qp_lock);
+	return qp;
+fail:
+	kfree(qp);
+	return NULL;
+}
+
+struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
+				struct ib_qp_init_attr *qp_init_attr,
+				struct ib_udata *udata)
+{
+	struct bnxt_re_pd *pd = to_bnxt_re(ib_pd, struct bnxt_re_pd, ib_pd);
+	struct bnxt_re_dev *rdev = pd->rdev;
+	struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
+	struct bnxt_re_qp *qp;
+	struct bnxt_re_srq *srq;
+	struct bnxt_re_cq *cq;
+	int rc, entries;
+
+	if ((qp_init_attr->cap.max_send_wr > dev_attr->max_qp_wqes) ||
+	    (qp_init_attr->cap.max_recv_wr > dev_attr->max_qp_wqes) ||
+	    (qp_init_attr->cap.max_send_sge > dev_attr->max_qp_sges) ||
+	    (qp_init_attr->cap.max_recv_sge > dev_attr->max_qp_sges) ||
+	    (qp_init_attr->cap.max_inline_data > dev_attr->max_inline_data))
+		return ERR_PTR(-EINVAL);
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp)
+		return ERR_PTR(-ENOMEM);
+
+	qp->rdev = rdev;
+	ether_addr_copy(qp->qplib_qp.smac, rdev->netdev->dev_addr);
+	qp->qplib_qp.pd = &pd->qplib_pd;
+	qp->qplib_qp.qp_handle = (u64)&qp->qplib_qp;
+	qp->qplib_qp.type = __from_ib_qp_type(qp_init_attr->qp_type);
+	if (qp->qplib_qp.type == IB_QPT_MAX) {
+		dev_err(rdev_to_dev(rdev), "QP type 0x%x not supported",
+			qp->qplib_qp.type);
+		rc = -EINVAL;
+		goto fail;
+	}
+	qp->qplib_qp.max_inline_data = qp_init_attr->cap.max_inline_data;
+	qp->qplib_qp.sig_type = ((qp_init_attr->sq_sig_type ==
+				  IB_SIGNAL_ALL_WR) ? true : false);
+
+	entries = roundup_pow_of_two(qp_init_attr->cap.max_send_wr + 1);
+	if (entries > dev_attr->max_qp_wqes + 1)
+		entries = dev_attr->max_qp_wqes + 1;
+	qp->qplib_qp.sq.max_wqe = entries;
+
+	qp->qplib_qp.sq.max_sge = qp_init_attr->cap.max_send_sge;
+	if (qp->qplib_qp.sq.max_sge > dev_attr->max_qp_sges)
+		qp->qplib_qp.sq.max_sge = dev_attr->max_qp_sges;
+
+	if (qp_init_attr->send_cq) {
+		cq = to_bnxt_re(qp_init_attr->send_cq, struct bnxt_re_cq,
+				ib_cq);
+		if (!cq) {
+			dev_err(rdev_to_dev(rdev), "Send CQ not found");
+			rc = -EINVAL;
+			goto fail;
+		}
+		qp->qplib_qp.scq = &cq->qplib_cq;
+	}
+
+	if (qp_init_attr->recv_cq) {
+		cq = to_bnxt_re(qp_init_attr->recv_cq, struct bnxt_re_cq,
+				ib_cq);
+		if (!cq) {
+			dev_err(rdev_to_dev(rdev), "Receive CQ not found");
+			rc = -EINVAL;
+			goto fail;
+		}
+		qp->qplib_qp.rcq = &cq->qplib_cq;
+	}
+
+	if (qp_init_attr->srq) {
+		dev_err(rdev_to_dev(rdev), "SRQ not supported");
+		rc = -ENOTSUPP;
+		goto fail;
+	} else {
+		/* Allocate 1 more than what's provided so posting max doesn't
+		 * mean empty
+		 */
+		entries = roundup_pow_of_two(qp_init_attr->cap.max_recv_wr + 1);
+		if (entries > dev_attr->max_qp_wqes + 1)
+			entries = dev_attr->max_qp_wqes + 1;
+		qp->qplib_qp.rq.max_wqe = entries;
+
+		qp->qplib_qp.rq.max_sge = qp_init_attr->cap.max_recv_sge;
+		if (qp->qplib_qp.rq.max_sge > dev_attr->max_qp_sges)
+			qp->qplib_qp.rq.max_sge = dev_attr->max_qp_sges;
+	}
+
+	qp->qplib_qp.mtu = ib_mtu_enum_to_int(iboe_get_mtu(rdev->netdev->mtu));
+
+	if (qp_init_attr->qp_type == IB_QPT_GSI) {
+		qp->qplib_qp.rq.max_sge = dev_attr->max_qp_sges;
+		if (qp->qplib_qp.rq.max_sge > dev_attr->max_qp_sges)
+			qp->qplib_qp.rq.max_sge = dev_attr->max_qp_sges;
+		qp->qplib_qp.sq.max_sge++;
+		if (qp->qplib_qp.sq.max_sge > dev_attr->max_qp_sges)
+			qp->qplib_qp.sq.max_sge = dev_attr->max_qp_sges;
+
+		qp->qplib_qp.rq_hdr_buf_size =
+					BNXT_QPLIB_MAX_QP1_RQ_HDR_SIZE_V2;
+
+		qp->qplib_qp.sq_hdr_buf_size =
+					BNXT_QPLIB_MAX_QP1_SQ_HDR_SIZE_V2;
+		qp->qplib_qp.dpi = &rdev->dpi_privileged;
+		rc = bnxt_qplib_create_qp1(&rdev->qplib_res, &qp->qplib_qp);
+		if (rc) {
+			dev_err(rdev_to_dev(rdev), "Failed to create HW QP1");
+			goto fail;
+		}
+		/* Create a shadow QP to handle the QP1 traffic */
+		rdev->qp1_sqp = bnxt_re_create_shadow_qp(pd, &rdev->qplib_res,
+							 &qp->qplib_qp);
+		if (!rdev->qp1_sqp) {
+			rc = -EINVAL;
+			dev_err(rdev_to_dev(rdev),
+				"Failed to create Shadow QP for QP1");
+			goto qp_destroy;
+		}
+		rdev->sqp_ah = bnxt_re_create_shadow_qp_ah(pd, &rdev->qplib_res,
+							   &qp->qplib_qp);
+		if (!rdev->sqp_ah) {
+			bnxt_qplib_destroy_qp(&rdev->qplib_res,
+					      &rdev->qp1_sqp->qplib_qp);
+			rc = -EINVAL;
+			dev_err(rdev_to_dev(rdev),
+				"Failed to create AH entry for ShadowQP");
+			goto qp_destroy;
+		}
+
+	} else {
+		qp->qplib_qp.max_rd_atomic = dev_attr->max_qp_rd_atom;
+		qp->qplib_qp.max_dest_rd_atomic = dev_attr->max_qp_init_rd_atom;
+		if (udata) {
+			rc = bnxt_re_init_user_qp(rdev, pd, qp, udata);
+			if (rc)
+				goto fail;
+		} else {
+			qp->qplib_qp.dpi = &rdev->dpi_privileged;
+		}
+
+		rc = bnxt_qplib_create_qp(&rdev->qplib_res, &qp->qplib_qp);
+		if (rc) {
+			dev_err(rdev_to_dev(rdev), "Failed to create HW QP");
+			goto fail;
+		}
+	}
+
+	qp->ib_qp.qp_num = qp->qplib_qp.id;
+	spin_lock_init(&qp->sq_lock);
+
+	if (udata) {
+		struct bnxt_re_qp_resp resp;
+
+		resp.qpid = qp->ib_qp.qp_num;
+		rc = bnxt_re_copy_to_udata(rdev, &resp, sizeof(resp), udata);
+		if (rc) {
+			dev_err(rdev_to_dev(rdev), "Failed to copy QP udata");
+			goto qp_destroy;
+		}
+	}
+	INIT_LIST_HEAD(&qp->list);
+	mutex_lock(&rdev->qp_lock);
+	list_add_tail(&qp->list, &rdev->qp_list);
+	atomic_inc(&rdev->qp_count);
+	mutex_unlock(&rdev->qp_lock);
+
+	return &qp->ib_qp;
+qp_destroy:
+	bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
+fail:
+	kfree(qp);
+	return ERR_PTR(rc);
+}
+
+static u8 __from_ib_qp_state(enum ib_qp_state state)
+{
+	switch (state) {
+	case IB_QPS_RESET:
+		return CMDQ_MODIFY_QP_NEW_STATE_RESET;
+	case IB_QPS_INIT:
+		return CMDQ_MODIFY_QP_NEW_STATE_INIT;
+	case IB_QPS_RTR:
+		return CMDQ_MODIFY_QP_NEW_STATE_RTR;
+	case IB_QPS_RTS:
+		return CMDQ_MODIFY_QP_NEW_STATE_RTS;
+	case IB_QPS_SQD:
+		return CMDQ_MODIFY_QP_NEW_STATE_SQD;
+	case IB_QPS_SQE:
+		return CMDQ_MODIFY_QP_NEW_STATE_SQE;
+	case IB_QPS_ERR:
+	default:
+		return CMDQ_MODIFY_QP_NEW_STATE_ERR;
+	}
+}
+
+static enum ib_qp_state __to_ib_qp_state(u8 state)
+{
+	switch (state) {
+	case CMDQ_MODIFY_QP_NEW_STATE_RESET:
+		return IB_QPS_RESET;
+	case CMDQ_MODIFY_QP_NEW_STATE_INIT:
+		return IB_QPS_INIT;
+	case CMDQ_MODIFY_QP_NEW_STATE_RTR:
+		return IB_QPS_RTR;
+	case CMDQ_MODIFY_QP_NEW_STATE_RTS:
+		return IB_QPS_RTS;
+	case CMDQ_MODIFY_QP_NEW_STATE_SQD:
+		return IB_QPS_SQD;
+	case CMDQ_MODIFY_QP_NEW_STATE_SQE:
+		return IB_QPS_SQE;
+	case CMDQ_MODIFY_QP_NEW_STATE_ERR:
+	default:
+		return IB_QPS_ERR;
+	}
+}
+
+static u32 __from_ib_mtu(enum ib_mtu mtu)
+{
+	switch (mtu) {
+	case IB_MTU_256:
+		return CMDQ_MODIFY_QP_PATH_MTU_MTU_256;
+	case IB_MTU_512:
+		return CMDQ_MODIFY_QP_PATH_MTU_MTU_512;
+	case IB_MTU_1024:
+		return CMDQ_MODIFY_QP_PATH_MTU_MTU_1024;
+	case IB_MTU_2048:
+		return CMDQ_MODIFY_QP_PATH_MTU_MTU_2048;
+	case IB_MTU_4096:
+		return CMDQ_MODIFY_QP_PATH_MTU_MTU_4096;
+	default:
+		return CMDQ_MODIFY_QP_PATH_MTU_MTU_2048;
+	}
+}
+
+static enum ib_mtu __to_ib_mtu(u32 mtu)
+{
+	switch (mtu & CREQ_QUERY_QP_RESP_SB_PATH_MTU_MASK) {
+	case CMDQ_MODIFY_QP_PATH_MTU_MTU_256:
+		return IB_MTU_256;
+	case CMDQ_MODIFY_QP_PATH_MTU_MTU_512:
+		return IB_MTU_512;
+	case CMDQ_MODIFY_QP_PATH_MTU_MTU_1024:
+		return IB_MTU_1024;
+	case CMDQ_MODIFY_QP_PATH_MTU_MTU_2048:
+		return IB_MTU_2048;
+	case CMDQ_MODIFY_QP_PATH_MTU_MTU_4096:
+		return IB_MTU_4096;
+	default:
+		return IB_MTU_2048;
+	}
+}
+
 static int __from_ib_access_flags(int iflags)
 {
 	int qflags = 0;
@@ -665,6 +1140,293 @@ static enum ib_access_flags __to_ib_access_flags(int qflags)
 		iflags |= IB_ACCESS_ON_DEMAND;
 	return iflags;
 };
+
+int bnxt_re_modify_shadow_qp(struct bnxt_re_dev *rdev,
+			     struct bnxt_re_qp *qp1_qp,
+			     int qp_attr_mask)
+{
+	struct bnxt_re_qp *qp = rdev->qp1_sqp;
+	int rc = 0;
+
+	if (qp_attr_mask & IB_QP_STATE) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_STATE;
+		qp->qplib_qp.state = qp1_qp->qplib_qp.state;
+	}
+	if (qp_attr_mask & IB_QP_PKEY_INDEX) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_PKEY;
+		qp->qplib_qp.pkey_index = qp1_qp->qplib_qp.pkey_index;
+	}
+
+	if (qp_attr_mask & IB_QP_QKEY) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_QKEY;
+		/* Using a Random  QKEY */
+		qp->qplib_qp.qkey = 0x81818181;
+	}
+	if (qp_attr_mask & IB_QP_SQ_PSN) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_SQ_PSN;
+		qp->qplib_qp.sq.psn = qp1_qp->qplib_qp.sq.psn;
+	}
+
+	rc = bnxt_qplib_modify_qp(&rdev->qplib_res, &qp->qplib_qp);
+	if (rc)
+		dev_err(rdev_to_dev(rdev),
+			"Failed to modify Shadow QP for QP1");
+	return rc;
+}
+
+int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
+		      int qp_attr_mask, struct ib_udata *udata)
+{
+	struct bnxt_re_qp *qp = to_bnxt_re(ib_qp, struct bnxt_re_qp, ib_qp);
+	struct bnxt_re_dev *rdev = qp->rdev;
+	struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
+	enum ib_qp_state curr_qp_state, new_qp_state;
+	int rc, entries;
+	int status;
+	union ib_gid sgid;
+	struct ib_gid_attr sgid_attr;
+	u8 nw_type;
+
+	qp->qplib_qp.modify_flags = 0;
+	if (qp_attr_mask & IB_QP_STATE) {
+		curr_qp_state = __to_ib_qp_state(qp->qplib_qp.cur_qp_state);
+		new_qp_state = qp_attr->qp_state;
+		if (!ib_modify_qp_is_ok(curr_qp_state, new_qp_state,
+					ib_qp->qp_type, qp_attr_mask,
+					IB_LINK_LAYER_ETHERNET)) {
+			dev_err(rdev_to_dev(rdev),
+				"Invalid attribute mask: %#x specified ",
+				qp_attr_mask);
+			dev_err(rdev_to_dev(rdev),
+				"for qpn: %#x type: %#x",
+				ib_qp->qp_num, ib_qp->qp_type);
+			dev_err(rdev_to_dev(rdev),
+				"curr_qp_state=0x%x, new_qp_state=0x%x\n",
+				curr_qp_state, new_qp_state);
+			return -EINVAL;
+		}
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_STATE;
+		qp->qplib_qp.state = __from_ib_qp_state(qp_attr->qp_state);
+	}
+	if (qp_attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_EN_SQD_ASYNC_NOTIFY;
+		qp->qplib_qp.en_sqd_async_notify = true;
+	}
+	if (qp_attr_mask & IB_QP_ACCESS_FLAGS) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_ACCESS;
+		qp->qplib_qp.access =
+			__from_ib_access_flags(qp_attr->qp_access_flags);
+		/* LOCAL_WRITE access must be set to allow RC receive */
+		qp->qplib_qp.access |= BNXT_QPLIB_ACCESS_LOCAL_WRITE;
+	}
+	if (qp_attr_mask & IB_QP_PKEY_INDEX) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_PKEY;
+		qp->qplib_qp.pkey_index = qp_attr->pkey_index;
+	}
+	if (qp_attr_mask & IB_QP_QKEY) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_QKEY;
+		qp->qplib_qp.qkey = qp_attr->qkey;
+	}
+	if (qp_attr_mask & IB_QP_AV) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_DGID |
+				     CMDQ_MODIFY_QP_MODIFY_MASK_FLOW_LABEL |
+				     CMDQ_MODIFY_QP_MODIFY_MASK_SGID_INDEX |
+				     CMDQ_MODIFY_QP_MODIFY_MASK_HOP_LIMIT |
+				     CMDQ_MODIFY_QP_MODIFY_MASK_TRAFFIC_CLASS |
+				     CMDQ_MODIFY_QP_MODIFY_MASK_DEST_MAC |
+				     CMDQ_MODIFY_QP_MODIFY_MASK_VLAN_ID;
+		memcpy(qp->qplib_qp.ah.dgid.data, qp_attr->ah_attr.grh.dgid.raw,
+		       sizeof(qp->qplib_qp.ah.dgid.data));
+		qp->qplib_qp.ah.flow_label = qp_attr->ah_attr.grh.flow_label;
+		/* If RoCE V2 is enabled, stack will have two entries for
+		 * each GID entry. Avoiding this duplicte entry in HW. Dividing
+		 * the GID index by 2 for RoCE V2
+		 */
+		qp->qplib_qp.ah.sgid_index =
+					qp_attr->ah_attr.grh.sgid_index / 2;
+		qp->qplib_qp.ah.host_sgid_index =
+					qp_attr->ah_attr.grh.sgid_index;
+		qp->qplib_qp.ah.hop_limit = qp_attr->ah_attr.grh.hop_limit;
+		qp->qplib_qp.ah.traffic_class =
+					qp_attr->ah_attr.grh.traffic_class;
+		qp->qplib_qp.ah.sl = qp_attr->ah_attr.sl;
+		ether_addr_copy(qp->qplib_qp.ah.dmac, qp_attr->ah_attr.dmac);
+
+		status = ib_get_cached_gid(&rdev->ibdev, 1,
+					   qp_attr->ah_attr.grh.sgid_index,
+					   &sgid, &sgid_attr);
+		if (!status && sgid_attr.ndev) {
+			memcpy(qp->qplib_qp.smac, sgid_attr.ndev->dev_addr,
+			       ETH_ALEN);
+			dev_put(sgid_attr.ndev);
+			nw_type = ib_gid_to_network_type(sgid_attr.gid_type,
+							 &sgid);
+			switch (nw_type) {
+			case RDMA_NETWORK_IPV4:
+				qp->qplib_qp.nw_type =
+					CMDQ_MODIFY_QP_NETWORK_TYPE_ROCEV2_IPV4;
+				break;
+			case RDMA_NETWORK_IPV6:
+				qp->qplib_qp.nw_type =
+					CMDQ_MODIFY_QP_NETWORK_TYPE_ROCEV2_IPV6;
+				break;
+			default:
+				qp->qplib_qp.nw_type =
+					CMDQ_MODIFY_QP_NETWORK_TYPE_ROCEV1;
+				break;
+			}
+		}
+	}
+
+	if (qp_attr_mask & IB_QP_PATH_MTU) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU;
+		qp->qplib_qp.path_mtu = __from_ib_mtu(qp_attr->path_mtu);
+	} else if (qp_attr->qp_state == IB_QPS_RTR) {
+		qp->qplib_qp.modify_flags |=
+			CMDQ_MODIFY_QP_MODIFY_MASK_PATH_MTU;
+		qp->qplib_qp.path_mtu =
+			__from_ib_mtu(iboe_get_mtu(rdev->netdev->mtu));
+	}
+
+	if (qp_attr_mask & IB_QP_TIMEOUT) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_TIMEOUT;
+		qp->qplib_qp.timeout = qp_attr->timeout;
+	}
+	if (qp_attr_mask & IB_QP_RETRY_CNT) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_RETRY_CNT;
+		qp->qplib_qp.retry_cnt = qp_attr->retry_cnt;
+	}
+	if (qp_attr_mask & IB_QP_RNR_RETRY) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_RNR_RETRY;
+		qp->qplib_qp.rnr_retry = qp_attr->rnr_retry;
+	}
+	if (qp_attr_mask & IB_QP_MIN_RNR_TIMER) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_MIN_RNR_TIMER;
+		qp->qplib_qp.min_rnr_timer = qp_attr->min_rnr_timer;
+	}
+	if (qp_attr_mask & IB_QP_RQ_PSN) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_RQ_PSN;
+		qp->qplib_qp.rq.psn = qp_attr->rq_psn;
+	}
+	if (qp_attr_mask & IB_QP_MAX_QP_RD_ATOMIC) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_MAX_RD_ATOMIC;
+		qp->qplib_qp.max_rd_atomic = qp_attr->max_rd_atomic;
+	}
+	if (qp_attr_mask & IB_QP_SQ_PSN) {
+		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_SQ_PSN;
+		qp->qplib_qp.sq.psn = qp_attr->sq_psn;
+	}
+	if (qp_attr_mask & IB_QP_MAX_DEST_RD_ATOMIC) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_MAX_DEST_RD_ATOMIC;
+		qp->qplib_qp.max_dest_rd_atomic = qp_attr->max_dest_rd_atomic;
+	}
+	if (qp_attr_mask & IB_QP_CAP) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_SQ_SIZE |
+				CMDQ_MODIFY_QP_MODIFY_MASK_RQ_SIZE |
+				CMDQ_MODIFY_QP_MODIFY_MASK_SQ_SGE |
+				CMDQ_MODIFY_QP_MODIFY_MASK_RQ_SGE |
+				CMDQ_MODIFY_QP_MODIFY_MASK_MAX_INLINE_DATA;
+		if ((qp_attr->cap.max_send_wr >= dev_attr->max_qp_wqes) ||
+		    (qp_attr->cap.max_recv_wr >= dev_attr->max_qp_wqes) ||
+		    (qp_attr->cap.max_send_sge >= dev_attr->max_qp_sges) ||
+		    (qp_attr->cap.max_recv_sge >= dev_attr->max_qp_sges) ||
+		    (qp_attr->cap.max_inline_data >=
+						dev_attr->max_inline_data)) {
+			dev_err(rdev_to_dev(rdev),
+				"Create QP failed - max exceeded");
+			return -EINVAL;
+		}
+		entries = roundup_pow_of_two(qp_attr->cap.max_send_wr);
+		if (entries > dev_attr->max_qp_wqes)
+			entries = dev_attr->max_qp_wqes;
+		qp->qplib_qp.sq.max_wqe = entries;
+		qp->qplib_qp.sq.max_sge = qp_attr->cap.max_send_sge;
+		if (qp->qplib_qp.rq.max_wqe) {
+			entries = roundup_pow_of_two(qp_attr->cap.max_recv_wr);
+			if (entries > dev_attr->max_qp_wqes)
+				entries = dev_attr->max_qp_wqes;
+			qp->qplib_qp.rq.max_wqe = entries;
+			qp->qplib_qp.rq.max_sge = qp_attr->cap.max_recv_sge;
+		} else {
+			/* SRQ was used prior, just ignore the RQ caps */
+		}
+	}
+	if (qp_attr_mask & IB_QP_DEST_QPN) {
+		qp->qplib_qp.modify_flags |=
+				CMDQ_MODIFY_QP_MODIFY_MASK_DEST_QP_ID;
+		qp->qplib_qp.dest_qpn = qp_attr->dest_qp_num;
+	}
+	rc = bnxt_qplib_modify_qp(&rdev->qplib_res, &qp->qplib_qp);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Failed to modify HW QP");
+		return rc;
+	}
+	if (ib_qp->qp_type == IB_QPT_GSI && rdev->qp1_sqp)
+		rc = bnxt_re_modify_shadow_qp(rdev, qp, qp_attr_mask);
+	return rc;
+}
+
+int bnxt_re_query_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
+		     int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr)
+{
+	struct bnxt_re_qp *qp = to_bnxt_re(ib_qp, struct bnxt_re_qp, ib_qp);
+	struct bnxt_re_dev *rdev = qp->rdev;
+	struct bnxt_qplib_qp qplib_qp;
+	int rc;
+
+	memset(&qplib_qp, 0, sizeof(struct bnxt_qplib_qp));
+	qplib_qp.id = qp->qplib_qp.id;
+	qplib_qp.ah.host_sgid_index = qp->qplib_qp.ah.host_sgid_index;
+
+	rc = bnxt_qplib_query_qp(&rdev->qplib_res, &qplib_qp);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Failed to query HW QP");
+		return rc;
+	}
+	qp_attr->qp_state = __to_ib_qp_state(qplib_qp.state);
+	qp_attr->en_sqd_async_notify = qplib_qp.en_sqd_async_notify ? 1 : 0;
+	qp_attr->qp_access_flags = __to_ib_access_flags(qplib_qp.access);
+	qp_attr->pkey_index = qplib_qp.pkey_index;
+	qp_attr->qkey = qplib_qp.qkey;
+	memcpy(qp_attr->ah_attr.grh.dgid.raw, qplib_qp.ah.dgid.data,
+	       sizeof(qplib_qp.ah.dgid.data));
+	qp_attr->ah_attr.grh.flow_label = qplib_qp.ah.flow_label;
+	qp_attr->ah_attr.grh.sgid_index = qplib_qp.ah.host_sgid_index;
+	qp_attr->ah_attr.grh.hop_limit = qplib_qp.ah.hop_limit;
+	qp_attr->ah_attr.grh.traffic_class = qplib_qp.ah.traffic_class;
+	qp_attr->ah_attr.sl = qplib_qp.ah.sl;
+	ether_addr_copy(qp_attr->ah_attr.dmac, qplib_qp.ah.dmac);
+	qp_attr->path_mtu = __to_ib_mtu(qplib_qp.path_mtu);
+	qp_attr->timeout = qplib_qp.timeout;
+	qp_attr->retry_cnt = qplib_qp.retry_cnt;
+	qp_attr->rnr_retry = qplib_qp.rnr_retry;
+	qp_attr->min_rnr_timer = qplib_qp.min_rnr_timer;
+	qp_attr->rq_psn = qplib_qp.rq.psn;
+	qp_attr->max_rd_atomic = qplib_qp.max_rd_atomic;
+	qp_attr->sq_psn = qplib_qp.sq.psn;
+	qp_attr->max_dest_rd_atomic = qplib_qp.max_dest_rd_atomic;
+	qp_init_attr->sq_sig_type = qplib_qp.sig_type ? IB_SIGNAL_ALL_WR :
+							IB_SIGNAL_REQ_WR;
+	qp_attr->dest_qp_num = qplib_qp.dest_qpn;
+
+	qp_attr->cap.max_send_wr = qp->qplib_qp.sq.max_wqe;
+	qp_attr->cap.max_send_sge = qp->qplib_qp.sq.max_sge;
+	qp_attr->cap.max_recv_wr = qp->qplib_qp.rq.max_wqe;
+	qp_attr->cap.max_recv_sge = qp->qplib_qp.rq.max_sge;
+	qp_attr->cap.max_inline_data = qp->qplib_qp.max_inline_data;
+	qp_init_attr->cap = qp_attr->cap;
+
+	return 0;
+}
+
 /* Completion Queues */
 int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
 {
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
index 50b30ed..fdfe8bf 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_ib_verbs.h
@@ -32,6 +32,19 @@ struct bnxt_re_ah {
 	struct bnxt_qplib_ah	qplib_ah;
 };
 
+struct bnxt_re_qp {
+	struct list_head	list;
+	struct bnxt_re_dev	*rdev;
+	struct ib_qp		ib_qp;
+	spinlock_t		sq_lock;	/* protect sq */
+	struct bnxt_qplib_qp	qplib_qp;
+	struct ib_umem		*sumem;
+	struct ib_umem		*rumem;
+	/* QP1 */
+	u32			send_psn;
+	struct ib_ud_header	qp1_hdr;
+};
+
 struct bnxt_re_cq {
 	struct bnxt_re_dev	*rdev;
 	spinlock_t              cq_lock;	/* protect cq */
@@ -116,6 +129,14 @@ struct ib_ah *bnxt_re_create_ah(struct ib_pd *pd,
 int bnxt_re_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr);
 int bnxt_re_query_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr);
 int bnxt_re_destroy_ah(struct ib_ah *ah);
+struct ib_qp *bnxt_re_create_qp(struct ib_pd *pd,
+				struct ib_qp_init_attr *qp_init_attr,
+				struct ib_udata *udata);
+int bnxt_re_modify_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
+		      int qp_attr_mask, struct ib_udata *udata);
+int bnxt_re_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
+		     int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
+int bnxt_re_destroy_qp(struct ib_qp *qp);
 struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 				const struct ib_cq_init_attr *attr,
 				struct ib_ucontext *context,
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
index dc11612..80ee5b7 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_main.c
@@ -439,6 +439,12 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 	ibdev->modify_ah		= bnxt_re_modify_ah;
 	ibdev->query_ah			= bnxt_re_query_ah;
 	ibdev->destroy_ah		= bnxt_re_destroy_ah;
+
+	ibdev->create_qp		= bnxt_re_create_qp;
+	ibdev->modify_qp		= bnxt_re_modify_qp;
+	ibdev->query_qp			= bnxt_re_query_qp;
+	ibdev->destroy_qp		= bnxt_re_destroy_qp;
+
 	ibdev->create_cq		= bnxt_re_create_cq;
 	ibdev->destroy_cq		= bnxt_re_destroy_cq;
 	ibdev->req_notify_cq		= bnxt_re_req_notify_cq;
diff --git a/drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h b/drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h
index fec6d35..6aeda5e 100644
--- a/drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h
+++ b/drivers/infiniband/hw/bnxtre/bnxt_re_uverbs_abi.h
@@ -40,6 +40,16 @@ struct bnxt_re_cq_resp {
 	__u32 phase;
 } __packed;
 
+struct bnxt_re_qp_req {
+	__u64 qpsva;
+	__u64 qprva;
+	__u64 qp_handle;
+} __packed;
+
+struct bnxt_re_qp_resp {
+	__u32 qpid;
+} __packed;
+
 enum bnxt_re_shpg_offt {
 	BNXT_RE_BEG_RESV_OFFT	= 0x00,
 	BNXT_RE_AVID_OFFT	= 0x10,
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox