Netdev List

* [RFC PATCH 0/9] Hirschmann Hellcreek DSA driver
From: Kurt Kanzenbach @ 2020-06-18  6:40 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli
  Cc: David S. Miller, Jakub Kicinski, netdev, Rob Herring, devicetree,
	Sebastian Andrzej Siewior, Richard Cochran, Kamil Alkhouri,
	ilias.apalodimas, Kurt Kanzenbach

Hi,

This series adds a DSA driver for the Hirschmann Hellcreek TSN switch
IP. Characteristics of that IP:

 * Full duplex Ethernet interface at 100/1000 Mbps on three ports
 * IEEE 802.1Q-compliant Ethernet Switch
 * IEEE 802.1Qbv Time-Aware scheduling support
 * IEEE 1588 and IEEE 802.1AS support

That IP is used e.g. in

 https://www.arrow.com/en/campaigns/arrow-kairos

Because of the hardware setup, the switch driver is implemented using DSA
with a special tagging protocol. The driver also supports PTP, hardware
timestamping, and TAPRIO offloading.

This work is part of the AccessTSN project: https://www.accesstsn.com/

If there are any objections, let me know.

Thanks,
Kurt

Kamil Alkhouri (2):
  net: dsa: hellcreek: Add PTP clock support
  net: dsa: hellcreek: Add support for hardware timestamping

Kurt Kanzenbach (7):
  net: dsa: Add tag handling for Hirschmann Hellcreek switches
  net: dsa: Add DSA driver for Hirschmann Hellcreek switches
  net: dsa: hellcreek: Add TAPRIO offloading support
  net: dsa: hellcreek: Add debugging mechanisms
  net: dsa: hellcreek: Add PTP status LEDs
  dt-bindings: Add vendor prefix for Hirschmann
  dt-bindings: net: dsa: Add documentation for Hellcreek switches

 .../devicetree/bindings/net/dsa/hellcreek.txt |   72 +
 .../devicetree/bindings/vendor-prefixes.yaml  |    2 +
 drivers/net/dsa/Kconfig                       |    2 +
 drivers/net/dsa/Makefile                      |    1 +
 drivers/net/dsa/hirschmann/Kconfig            |    7 +
 drivers/net/dsa/hirschmann/Makefile           |    5 +
 drivers/net/dsa/hirschmann/hellcreek.c        | 1751 +++++++++++++++++
 drivers/net/dsa/hirschmann/hellcreek.h        |  302 +++
 .../net/dsa/hirschmann/hellcreek_hwtstamp.c   |  492 +++++
 .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |   58 +
 drivers/net/dsa/hirschmann/hellcreek_ptp.c    |  369 ++++
 drivers/net/dsa/hirschmann/hellcreek_ptp.h    |   76 +
 include/net/dsa.h                             |    2 +
 net/dsa/Kconfig                               |    6 +
 net/dsa/Makefile                              |    1 +
 net/dsa/tag_hellcreek.c                       |  101 +
 16 files changed, 3247 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/hellcreek.txt
 create mode 100644 drivers/net/dsa/hirschmann/Kconfig
 create mode 100644 drivers/net/dsa/hirschmann/Makefile
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek.c
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek.h
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_ptp.c
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_ptp.h
 create mode 100644 net/dsa/tag_hellcreek.c

-- 
2.20.1


* Re: [PATCH net-next 0/3] add MP_PRIO, MP_FAIL and MP_FASTCLOSE suboptions handling
From: Geliang Tang @ 2020-06-18  6:27 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: Mat Martineau, David S. Miller, Jakub Kicinski, netdev, mptcp,
	linux-kernel
In-Reply-To: <04ae76d9-231a-de8e-ad33-1e4e80bb314c@tessares.net>

On Tue, Jun 16, 2020 at 05:18:56PM +0200, Matthieu Baerts wrote:
> Hi Geliang
> 
> On 16/06/2020 08:47, Geliang Tang wrote:
> > Add handling for sending and receiving the MP_PRIO, MP_FAIL, and
> > MP_FASTCLOSE suboptions.
> 
> Thank you for the patches!
> 
> Unfortunately, I don't think it would be wise to accept them now: for the
> moment, these suboptions are ignored on reception. If we accept them and
> change some variables like you did, we would need to make sure the kernel
> still behaves correctly. In other words, we would need tests:
> * For MP_PRIO, there is still quite a lot of work to do on scheduling
> packets between the different MPTCP subflows before we can support this.
> * For MP_FAIL, we should forward the info to the path manager.
> * For MP_FASTCLOSE, we should close connections and ACK this.
> 
> Also, net-next is closed for the moment:
> http://vger.kernel.org/~davem/net-next.html
> 
> I would suggest discussing this on the MPTCP mailing list. We also
> have meetings every Thursday. New devs are always welcome to contribute
> new features and bug fixes!
> 

Hi Matt,

Thanks for your reply. I will do these tests and improve my patches.

-Geliang

> Cheers,
> Matt
> -- 
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net

* Re: [PATCH] [net/sched]: Remove redundant condition in qdisc_graft
From: kernel test robot @ 2020-06-18  6:05 UTC (permalink / raw)
  To: Gaurav Singh, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	David S. Miller, Jakub Kicinski, open list:TC subsystem,
	open list
  Cc: kbuild-all, clang-built-linux, netdev
In-Reply-To: <20200618005526.27101-1-gaurav1086@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4398 bytes --]

Hi Gaurav,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.8-rc1 next-20200617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Gaurav-Singh/Remove-redundant-condition-in-qdisc_graft/20200618-085703
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 1b5044021070efa3259f3e9548dc35d1eb6aa844
config: s390-randconfig-r016-20200618 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 487ca07fcc75d52755c9fe2ee05bcb3b6eeeec44)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=s390 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):


vim +1097 net/sched/sch_api.c

  1019	
  1020	/* Graft qdisc "new" to class "classid" of qdisc "parent" or
  1021	 * to device "dev".
  1022	 *
  1023	 * When appropriate send a netlink notification using 'skb'
  1024	 * and "n".
  1025	 *
  1026	 * On success, destroy old qdisc.
  1027	 */
  1028	
  1029	static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
  1030			       struct sk_buff *skb, struct nlmsghdr *n, u32 classid,
  1031			       struct Qdisc *new, struct Qdisc *old,
  1032			       struct netlink_ext_ack *extack)
  1033	{
  1034		struct Qdisc *q = old;
  1035		struct net *net = dev_net(dev);
  1036	
  1037		if (parent == NULL) {
  1038			unsigned int i, num_q, ingress;
  1039	
  1040			ingress = 0;
  1041			num_q = dev->num_tx_queues;
  1042			if ((q && q->flags & TCQ_F_INGRESS) ||
  1043			    (new && new->flags & TCQ_F_INGRESS)) {
  1044				num_q = 1;
  1045				ingress = 1;
  1046				if (!dev_ingress_queue(dev)) {
  1047					NL_SET_ERR_MSG(extack, "Device does not have an ingress queue");
  1048					return -ENOENT;
  1049				}
  1050			}
  1051	
  1052			if (dev->flags & IFF_UP)
  1053				dev_deactivate(dev);
  1054	
  1055			qdisc_offload_graft_root(dev, new, old, extack);
  1056	
  1057			if (new && new->ops->attach)
  1058				goto skip;
  1059	
  1060			for (i = 0; i < num_q; i++) {
  1061				struct netdev_queue *dev_queue = dev_ingress_queue(dev);
  1062	
  1063				if (!ingress)
  1064					dev_queue = netdev_get_tx_queue(dev, i);
  1065	
  1066				old = dev_graft_qdisc(dev_queue, new);
  1067				if (new && i > 0)
  1068					qdisc_refcount_inc(new);
  1069	
  1070				if (!ingress)
  1071					qdisc_put(old);
  1072			}
  1073	
  1074	skip:
  1075			if (!ingress) {
  1076				notify_and_destroy(net, skb, n, classid,
  1077						   dev->qdisc, new);
  1078				if (new && !new->ops->attach)
  1079					qdisc_refcount_inc(new);
  1080				dev->qdisc = new ? : &noop_qdisc;
  1081	
  1082				if (new && new->ops->attach)
  1083					new->ops->attach(new);
  1084			} else {
  1085				notify_and_destroy(net, skb, n, classid, old, new);
  1086			}
  1087	
  1088			if (dev->flags & IFF_UP)
  1089				dev_activate(dev);
  1090		} else {
  1091			const struct Qdisc_class_ops *cops = parent->ops->cl_ops;
  1092			unsigned long cl;
  1093			int err;
  1094	
  1095			/* Only support running class lockless if parent is lockless */
  1096			if (new && (new->flags & TCQ_F_NOLOCK) &&
> 1097			    && !(parent->flags & TCQ_F_NOLOCK))
  1098				qdisc_clear_nolock(new);
  1099	
  1100			if (!cops || !cops->graft)
  1101				return -EOPNOTSUPP;
  1102	
  1103			cl = cops->find(parent, classid);
  1104			if (!cl) {
  1105				NL_SET_ERR_MSG(extack, "Specified class not found");
  1106				return -ENOENT;
  1107			}
  1108	
  1109			err = cops->graft(parent, cl, new, &old, extack);
  1110			if (err)
  1111				return err;
  1112			notify_and_destroy(net, skb, n, classid, old, new);
  1113		}
  1114		return 0;
  1115	}
  1116	
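For reference, line 1097 above begins with `&&` although line 1096 already
ends with one, which leaves an empty operand between the two operators. A
minimal standalone sketch of what the condition presumably intends once the
redundant parent check is dropped (SKETCH_F_NOLOCK and
sketch_must_clear_nolock() are hypothetical stand-ins, not kernel symbols):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TCQ_F_NOLOCK; the bit value is illustrative. */
#define SKETCH_F_NOLOCK 0x100u

/*
 * Returns true when the new qdisc is lockless but its parent is not,
 * i.e. when the lockless flag would have to be cleared on the new qdisc.
 */
static bool sketch_must_clear_nolock(bool have_new, uint32_t new_flags,
				     uint32_t parent_flags)
{
	return have_new && (new_flags & SKETCH_F_NOLOCK) &&
	       !(parent_flags & SKETCH_F_NOLOCK);
}
```

Each `&&` here joins two complete operands, which is the property the
reported line lacks.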

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 21348 bytes --]

* [PATCH net-next v2 5/5] cxgb4: add support to read serial flash
From: Vishal Kulkarni @ 2020-06-18  6:05 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, dt, Vishal Kulkarni
In-Reply-To: <20200618060556.14410-1-vishal@chelsio.com>

This patch adds support for dumping flash memory via
ethtool --get-dump.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
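A rough illustration of the page-wise collection loop in
cudbg_collect_flash(), written as standalone userspace C. Everything
prefixed sketch_/SKETCH_ is hypothetical: the flash is modelled as a byte
array, the page size is an assumed value, and the reader takes a byte
count, which may not match the units t4_read_flash() expects.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 256u /* assumed page size, for illustration only */

/* Hypothetical reader: copy n bytes starting at addr out of a flash image. */
static int sketch_read_flash(const uint8_t *flash, size_t flash_len,
			     uint32_t addr, uint32_t n, uint8_t *out)
{
	if (addr > flash_len || n > flash_len - addr)
		return -1;
	memcpy(out, flash + addr, n);
	return 0;
}

/* Dump count bytes page by page; stop at the first failing read. */
static int sketch_dump_flash(const uint8_t *flash, size_t flash_len,
			     uint32_t count, uint8_t *out)
{
	uint32_t i, n;

	for (i = 0; i < count; i += SKETCH_PAGE_SIZE) {
		/* The last chunk may be shorter than a full page. */
		n = count - i < SKETCH_PAGE_SIZE ? count - i : SKETCH_PAGE_SIZE;
		if (sketch_read_flash(flash, flash_len, i, n, out + i))
			return -1;
	}
	return 0;
}
```

The point of the sketch is only the chunking: the source address and the
destination offset advance by the same number of bytes that each read
returns.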
 drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h |  3 +-
 .../net/ethernet/chelsio/cxgb4/cudbg_lib.c    | 38 +++++++++++++++++++
 .../net/ethernet/chelsio/cxgb4/cudbg_lib.h    |  4 +-
 .../net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c  | 14 +++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h  |  1 +
 5 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h b/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h
index fc3813050f0d..c84719e3ca08 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h
@@ -70,7 +70,8 @@ enum cudbg_dbg_entity_type {
 	CUDBG_HMA_INDIRECT = 67,
 	CUDBG_HMA = 68,
 	CUDBG_QDESC = 70,
-	CUDBG_MAX_ENTITY = 71,
+	CUDBG_FLASH = 71,
+	CUDBG_MAX_ENTITY = 72,
 };
 
 struct cudbg_init {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
index 7b9cd69f9844..a09790989584 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
@@ -3156,3 +3156,41 @@ int cudbg_collect_qdesc(struct cudbg_init *pdbg_init,
 
 	return rc;
 }
+
+int cudbg_collect_flash(struct cudbg_init *pdbg_init,
+			struct cudbg_buffer *dbg_buff,
+			struct cudbg_error *cudbg_err)
+{
+	struct adapter *padap = pdbg_init->adap;
+	u32 count = padap->params.sf_size, n;
+	struct cudbg_buffer temp_buff = {0};
+	u32 addr, i;
+	int rc;
+
+	addr = FLASH_EXP_ROM_START;
+
+	for (i = 0; i < count; i += SF_PAGE_SIZE) {
+		n = min_t(u32, count - i, SF_PAGE_SIZE);
+
+		rc = cudbg_get_buff(pdbg_init, dbg_buff, n, &temp_buff);
+		if (rc) {
+			cudbg_err->sys_warn = CUDBG_STATUS_PARTIAL_DATA;
+			goto out;
+		}
+		rc = t4_read_flash(padap, addr, n, (u32 *)temp_buff.data, 0);
+		if (rc)
+			goto out;
+
+		addr += (n * 4);
+		rc = cudbg_write_and_release_buff(pdbg_init, &temp_buff,
+						  dbg_buff);
+		if (rc) {
+			cudbg_err->sys_warn = CUDBG_STATUS_PARTIAL_DATA;
+			goto out;
+		}
+	}
+
+out:
+	return rc;
+}
+
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h
index 10ee6ed1d932..0f488d52797b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h
@@ -162,7 +162,9 @@ int cudbg_collect_hma_meminfo(struct cudbg_init *pdbg_init,
 int cudbg_collect_qdesc(struct cudbg_init *pdbg_init,
 			struct cudbg_buffer *dbg_buff,
 			struct cudbg_error *cudbg_err);
-
+int cudbg_collect_flash(struct cudbg_init *pdbg_init,
+			struct cudbg_buffer *dbg_buff,
+			struct cudbg_error *cudbg_err);
 struct cudbg_entity_hdr *cudbg_get_entity_hdr(void *outbuf, int i);
 void cudbg_align_debug_buffer(struct cudbg_buffer *dbg_buff,
 			      struct cudbg_entity_hdr *entity_hdr);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
index e374b413d9ac..d7afe0746878 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
@@ -66,6 +66,10 @@ static const struct cxgb4_collect_entity cxgb4_collect_hw_dump[] = {
 	{ CUDBG_HMA_INDIRECT, cudbg_collect_hma_indirect },
 };
 
+static const struct cxgb4_collect_entity cxgb4_collect_flash_dump[] = {
+	{ CUDBG_FLASH, cudbg_collect_flash },
+};
+
 static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
 {
 	struct cudbg_tcam tcam_region = { 0 };
@@ -330,6 +334,9 @@ u32 cxgb4_get_dump_length(struct adapter *adap, u32 flag)
 		}
 	}
 
+	if (flag & CXGB4_ETH_DUMP_FLASH)
+		len += adap->params.sf_size;
+
 	/* If compression is enabled, a smaller destination buffer is enough */
 	wsize = cudbg_get_workspace_size();
 	if (wsize && len > CUDBG_DUMP_BUFF_SIZE)
@@ -468,6 +475,13 @@ int cxgb4_cudbg_collect(struct adapter *adap, void *buf, u32 *buf_size,
 					   buf,
 					   &total_size);
 
+	if (flag & CXGB4_ETH_DUMP_FLASH)
+		cxgb4_cudbg_collect_entity(&cudbg_init, &dbg_buff,
+					   cxgb4_collect_flash_dump,
+					   ARRAY_SIZE(cxgb4_collect_flash_dump),
+					   buf,
+					   &total_size);
+
 	cudbg_free_compress_buff(&cudbg_init);
 	cudbg_hdr->data_len = total_size;
 	if (cudbg_init.compress_type != CUDBG_COMPRESSION_NONE)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h
index 66b805c7a92c..c04a49b6378d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h
@@ -27,6 +27,7 @@ enum CXGB4_ETHTOOL_DUMP_FLAGS {
 	CXGB4_ETH_DUMP_NONE = ETH_FW_DUMP_DISABLE,
 	CXGB4_ETH_DUMP_MEM = (1 << 0), /* On-Chip Memory Dumps */
 	CXGB4_ETH_DUMP_HW = (1 << 1), /* various FW and HW dumps */
+	CXGB4_ETH_DUMP_FLASH = (1 << 2), /* Dump flash memory */
 };
 
 #define CXGB4_ETH_DUMP_ALL (CXGB4_ETH_DUMP_MEM | CXGB4_ETH_DUMP_HW)
-- 
2.21.1


* [PATCH net-next v2 4/5] cxgb4: add support to flash boot cfg image
From: Vishal Kulkarni @ 2020-06-18  6:05 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, dt, Vishal Kulkarni
In-Reply-To: <20200618060556.14410-1-vishal@chelsio.com>

Update set_flash to write the boot cfg image to its flash region.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
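To make the signature check and the padding arithmetic in
t4_load_bootcfg() concrete, here is a small userspace sketch. The sketch_
helpers are hypothetical names; the length check in sketch_is_bootcfg() is
an addition that the patch itself does not perform.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BOOT_CFG_SIG 0x4243u /* same value as BOOT_CFG_SIG in the patch */

/* Read a 16-bit little-endian value regardless of host byte order. */
static uint16_t sketch_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 if the first two bytes carry the boot cfg signature. */
static int sketch_is_bootcfg(const uint8_t *data, size_t len)
{
	return len >= 2 && sketch_get_le16(data) == SKETCH_BOOT_CFG_SIG;
}

/* Number of zero bytes needed to round size up to a multiple of 4. */
static unsigned int sketch_npad(unsigned int size)
{
	return ((size + 4 - 1) & ~3u) - size;
}
```

For example, a 6-byte image needs 2 pad bytes, a 7-byte image needs 1, and
a size that is already a multiple of 4 needs none.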
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  9 ++
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 30 +++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c    | 90 +++++++++++++++++++
 3 files changed, 129 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 7b5f1869d8e7..999816273328 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -143,6 +143,12 @@ enum {
 	CXGB4_ETHTOOL_FLASH_FW = 1,
 	CXGB4_ETHTOOL_FLASH_PHY = 2,
 	CXGB4_ETHTOOL_FLASH_BOOT = 3,
+	CXGB4_ETHTOOL_FLASH_BOOTCFG = 4
+};
+
+struct cxgb4_bootcfg_data {
+	__le16 signature;
+	__u8 reserved[2];
 };
 
 struct cxgb4_pcir_data {
@@ -183,6 +189,7 @@ struct legacy_pci_rom_hdr {
 
 /* BOOT constants */
 enum {
+	BOOT_CFG_SIG = 0x4243,
 	BOOT_SIZE_INC = 512,
 	BOOT_SIGNATURE = 0xaa55,
 	BOOT_MIN_SIZE = sizeof(struct cxgb4_pci_exp_rom_header),
@@ -2046,6 +2053,8 @@ int t4_i2c_rd(struct adapter *adap, unsigned int mbox, int port,
 	      unsigned int len, u8 *buf);
 int t4_load_boot(struct adapter *adap, u8 *boot_data,
 		 unsigned int boot_addr, unsigned int size);
+int t4_load_bootcfg(struct adapter *adap,
+		    const u8 *cfg_data, unsigned int size);
 void free_rspq_fl(struct adapter *adap, struct sge_rspq *rq, struct sge_fl *fl);
 void free_tx_desc(struct adapter *adap, struct sge_txq *q,
 		  unsigned int n, bool unmap);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 29645252aa27..0bfdc97e9083 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -28,6 +28,7 @@ static const char * const flash_region_strings[] = {
 	"Firmware",
 	"PHY Firmware",
 	"Boot",
+	"Boot CFG",
 };
 
 static const char stats_strings[][ETH_GSTRING_LEN] = {
@@ -1242,6 +1243,19 @@ static int set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
 	return err;
 }
 
+static int cxgb4_ethtool_flash_bootcfg(struct net_device *netdev,
+				       const u8 *data, u32 size)
+{
+	struct adapter *adap = netdev2adap(netdev);
+	int ret;
+
+	ret = t4_load_bootcfg(adap, data, size);
+	if (ret)
+		dev_err(adap->pdev_dev, "Failed to load boot cfg image\n");
+
+	return ret;
+}
+
 static int cxgb4_ethtool_flash_boot(struct net_device *netdev,
 				    const u8 *bdata, u32 size)
 {
@@ -1336,6 +1350,9 @@ static int cxgb4_ethtool_flash_region(struct net_device *netdev,
 	case CXGB4_ETHTOOL_FLASH_BOOT:
 		ret = cxgb4_ethtool_flash_boot(netdev, data, size);
 		break;
+	case CXGB4_ETHTOOL_FLASH_BOOTCFG:
+		ret = cxgb4_ethtool_flash_bootcfg(netdev, data, size);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
@@ -1365,6 +1382,17 @@ static int cxgb4_validate_fw_image(const u8 *data, u32 *size)
 	return 0;
 }
 
+static int cxgb4_validate_bootcfg_image(const u8 *data, u32 *size)
+{
+	struct cxgb4_bootcfg_data *header;
+
+	header = (struct cxgb4_bootcfg_data *)data;
+	if (le16_to_cpu(header->signature) != BOOT_CFG_SIG)
+		return -EINVAL;
+
+	return 0;
+}
+
 static int cxgb4_validate_boot_image(const u8 *data, u32 *size)
 {
 	struct cxgb4_pci_exp_rom_header *exp_header;
@@ -1401,6 +1429,8 @@ static int cxgb4_ethtool_get_flash_region(const u8 *data, u32 *size)
 		return CXGB4_ETHTOOL_FLASH_BOOT;
 	if (!cxgb4_validate_phy_image(data, size))
 		return CXGB4_ETHTOOL_FLASH_PHY;
+	if (!cxgb4_validate_bootcfg_image(data, size))
+		return CXGB4_ETHTOOL_FLASH_BOOTCFG;
 
 	return -EOPNOTSUPP;
 }
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index ccb550c6fab0..9d557f3cd3aa 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -10668,3 +10668,93 @@ int t4_load_boot(struct adapter *adap, u8 *boot_data,
 			ret);
 	return ret;
 }
+
+/**
+ *	t4_flash_bootcfg_addr - return the address of the flash
+ *	optionrom configuration
+ *	@adapter: the adapter
+ *
+ *	Return the address within the flash where the OptionROM Configuration
+ *	is stored, or an error if the device FLASH is too small to contain
+ *	an OptionROM Configuration.
+ */
+static int t4_flash_bootcfg_addr(struct adapter *adapter)
+{
+	/**
+	 * If the device FLASH isn't large enough to hold an OptionROM
+	 * Configuration File, return an error.
+	 */
+	if (adapter->params.sf_size <
+	    FLASH_BOOTCFG_START + FLASH_BOOTCFG_MAX_SIZE)
+		return -ENOSPC;
+
+	return FLASH_BOOTCFG_START;
+}
+
+int t4_load_bootcfg(struct adapter *adap, const u8 *cfg_data, unsigned int size)
+{
+	unsigned int sf_sec_size = adap->params.sf_size / adap->params.sf_nsec;
+	struct cxgb4_bootcfg_data *header;
+	unsigned int flash_cfg_start_sec;
+	unsigned int addr, npad;
+	int ret, i, n, cfg_addr;
+
+	cfg_addr = t4_flash_bootcfg_addr(adap);
+	if (cfg_addr < 0)
+		return cfg_addr;
+
+	addr = cfg_addr;
+	flash_cfg_start_sec = addr / SF_SEC_SIZE;
+
+	if (size > FLASH_BOOTCFG_MAX_SIZE) {
+		dev_err(adap->pdev_dev, "bootcfg file too large, max is %u bytes\n",
+			FLASH_BOOTCFG_MAX_SIZE);
+		return -EFBIG;
+	}
+
+	header = (struct cxgb4_bootcfg_data *)cfg_data;
+	if (le16_to_cpu(header->signature) != BOOT_CFG_SIG) {
+		dev_err(adap->pdev_dev, "Wrong bootcfg signature\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	i = DIV_ROUND_UP(FLASH_BOOTCFG_MAX_SIZE,
+			 sf_sec_size);
+	ret = t4_flash_erase_sectors(adap, flash_cfg_start_sec,
+				     flash_cfg_start_sec + i - 1);
+
+	/**
+	 * If size == 0 then we're simply erasing the FLASH sectors associated
+	 * with the on-adapter OptionROM Configuration File.
+	 */
+	if (ret || size == 0)
+		goto out;
+
+	/* this will write to the flash up to SF_PAGE_SIZE at a time */
+	for (i = 0; i < size; i += SF_PAGE_SIZE) {
+		n = min_t(u32, size - i, SF_PAGE_SIZE);
+
+		ret = t4_write_flash(adap, addr, n, cfg_data);
+		if (ret)
+			goto out;
+
+		addr += SF_PAGE_SIZE;
+		cfg_data += SF_PAGE_SIZE;
+	}
+
+	npad = ((size + 4 - 1) & ~3) - size;
+	for (i = 0; i < npad; i++) {
+		u8 data = 0;
+
+		ret = t4_write_flash(adap, cfg_addr + size + i, 1, &data);
+		if (ret)
+			goto out;
+	}
+
+out:
+	if (ret)
+		dev_err(adap->pdev_dev, "boot config data %s failed %d\n",
+			(size == 0 ? "clear" : "download"), ret);
+	return ret;
+}
-- 
2.21.1


* [PATCH net-next v2 3/5] cxgb4: add support to flash boot image
From: Vishal Kulkarni @ 2020-06-18  6:05 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, dt, Vishal Kulkarni
In-Reply-To: <20200618060556.14410-1-vishal@chelsio.com>

Update set_flash to write the boot image to its flash region.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
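The checksum handling in modify_device_id() relies on all bytes of an
image summing to zero modulo 256. A minimal userspace sketch of that step,
assuming the checksum byte sits at offset 7 as in struct legacy_pci_rom_hdr
(the sketch_ names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CKSUM_OFFSET 7 /* byte index the patch writes the checksum to */

/* Sum of all bytes in the image, modulo 256. */
static uint8_t sketch_byte_sum(const uint8_t *image, size_t len)
{
	uint8_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum = (uint8_t)(sum + image[i]);
	return sum;
}

/* Rewrite the checksum byte so that the whole image sums to zero. */
static void sketch_fix_checksum(uint8_t *image, size_t len)
{
	uint8_t sum;

	if (len <= SKETCH_CKSUM_OFFSET)
		return; /* too short to contain a checksum byte */

	image[SKETCH_CKSUM_OFFSET] = 0;
	sum = sketch_byte_sum(image, len);
	image[SKETCH_CKSUM_OFFSET] = (uint8_t)(0u - sum);
}
```

Clearing the checksum byte before summing is what makes the stored value
the two's-complement negation of the sum of all other bytes.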
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  48 +++++
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    |  56 ++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c    | 187 ++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h  |   6 +
 4 files changed, 297 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index b49d16a54ada..7b5f1869d8e7 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -142,6 +142,52 @@ enum cc_fec {
 enum {
 	CXGB4_ETHTOOL_FLASH_FW = 1,
 	CXGB4_ETHTOOL_FLASH_PHY = 2,
+	CXGB4_ETHTOOL_FLASH_BOOT = 3,
+};
+
+struct cxgb4_pcir_data {
+	__le32 signature;	/* Signature. The string "PCIR" */
+	__le16 vendor_id;	/* Vendor Identification */
+	__le16 device_id;	/* Device Identification */
+	__u8 vital_product[2];	/* Pointer to Vital Product Data */
+	__u8 length[2];		/* PCIR Data Structure Length */
+	__u8 revision;		/* PCIR Data Structure Revision */
+	__u8 class_code[3];	/* Class Code */
+	__u8 image_length[2];	/* Image Length. Multiple of 512B */
+	__u8 code_revision[2];	/* Revision Level of Code/Data */
+	__u8 code_type;
+	__u8 indicator;
+	__u8 reserved[2];
+};
+
+/* BIOS boot headers */
+struct cxgb4_pci_exp_rom_header {
+	__le16 signature;	/* ROM Signature. Should be 0xaa55 */
+	__u8 reserved[22];	/* Reserved per processor Architecture data */
+	__le16 pcir_offset;	/* Offset to PCI Data Structure */
+};
+
+/* Legacy PCI Expansion ROM Header */
+struct legacy_pci_rom_hdr {
+	__u8 signature[2];	/* ROM Signature. Should be 0xaa55 */
+	__u8 size512;		/* Current Image Size in units of 512 bytes */
+	__u8 initentry_point[4];
+	__u8 cksum;		/* Checksum computed on the entire Image */
+	__u8 reserved[16];	/* Reserved */
+	__le16 pcir_offset;	/* Offset to PCI Data Structure */
+};
+
+#define CXGB4_HDR_CODE1 0x00
+#define CXGB4_HDR_CODE2 0x03
+#define CXGB4_HDR_INDI 0x80
+
+/* BOOT constants */
+enum {
+	BOOT_SIZE_INC = 512,
+	BOOT_SIGNATURE = 0xaa55,
+	BOOT_MIN_SIZE = sizeof(struct cxgb4_pci_exp_rom_header),
+	BOOT_MAX_SIZE = 1024 * BOOT_SIZE_INC,
+	PCIR_SIGNATURE = 0x52494350
 };
 
 struct port_stats {
@@ -1998,6 +2044,8 @@ void t4_register_netevent_notifier(void);
 int t4_i2c_rd(struct adapter *adap, unsigned int mbox, int port,
 	      unsigned int devid, unsigned int offset,
 	      unsigned int len, u8 *buf);
+int t4_load_boot(struct adapter *adap, u8 *boot_data,
+		 unsigned int boot_addr, unsigned int size);
 void free_rspq_fl(struct adapter *adap, struct sge_rspq *rq, struct sge_fl *fl);
 void free_tx_desc(struct adapter *adap, struct sge_txq *q,
 		  unsigned int n, bool unmap);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 7118ba016f01..29645252aa27 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -27,6 +27,7 @@ static const char * const flash_region_strings[] = {
 	"All",
 	"Firmware",
 	"PHY Firmware",
+	"Boot",
 };
 
 static const char stats_strings[][ETH_GSTRING_LEN] = {
@@ -1241,6 +1242,28 @@ static int set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
 	return err;
 }
 
+static int cxgb4_ethtool_flash_boot(struct net_device *netdev,
+				    const u8 *bdata, u32 size)
+{
+	struct adapter *adap = netdev2adap(netdev);
+	unsigned int offset;
+	u8 *data;
+	int ret;
+
+	data = kmemdup(bdata, size, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	offset = OFFSET_G(t4_read_reg(adap, PF_REG(0, PCIE_PF_EXPROM_OFST_A)));
+
+	ret = t4_load_boot(adap, data, offset, size);
+	if (ret)
+		dev_err(adap->pdev_dev, "Failed to load boot image\n");
+
+	kfree(data);
+	return ret;
+}
+
 #define CXGB4_PHY_SIG 0x130000ea
 
 static int cxgb4_validate_phy_image(const u8 *data, u32 *size)
@@ -1310,6 +1333,9 @@ static int cxgb4_ethtool_flash_region(struct net_device *netdev,
 	case CXGB4_ETHTOOL_FLASH_PHY:
 		ret = cxgb4_ethtool_flash_phy(netdev, data, size);
 		break;
+	case CXGB4_ETHTOOL_FLASH_BOOT:
+		ret = cxgb4_ethtool_flash_boot(netdev, data, size);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
@@ -1339,10 +1365,40 @@ static int cxgb4_validate_fw_image(const u8 *data, u32 *size)
 	return 0;
 }
 
+static int cxgb4_validate_boot_image(const u8 *data, u32 *size)
+{
+	struct cxgb4_pci_exp_rom_header *exp_header;
+	struct cxgb4_pcir_data *pcir_header;
+	struct legacy_pci_rom_hdr *header;
+	const u8 *cur_header = data;
+	u16 pcir_offset;
+
+	exp_header = (struct cxgb4_pci_exp_rom_header *)data;
+
+	if (le16_to_cpu(exp_header->signature) != BOOT_SIGNATURE)
+		return -EINVAL;
+
+	if (size) {
+		do {
+			header = (struct legacy_pci_rom_hdr *)cur_header;
+			pcir_offset = le16_to_cpu(header->pcir_offset);
+			pcir_header = (struct cxgb4_pcir_data *)(cur_header +
+				      pcir_offset);
+
+			*size += header->size512 * 512;
+			cur_header += header->size512 * 512;
+		} while (!(pcir_header->indicator & CXGB4_HDR_INDI));
+	}
+
+	return 0;
+}
+
 static int cxgb4_ethtool_get_flash_region(const u8 *data, u32 *size)
 {
 	if (!cxgb4_validate_fw_image(data, size))
 		return CXGB4_ETHTOOL_FLASH_FW;
+	if (!cxgb4_validate_boot_image(data, size))
+		return CXGB4_ETHTOOL_FLASH_BOOT;
 	if (!cxgb4_validate_phy_image(data, size))
 		return CXGB4_ETHTOOL_FLASH_PHY;
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 1c8068c02728..ccb550c6fab0 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -10481,3 +10481,190 @@ int t4_set_vlan_acl(struct adapter *adap, unsigned int mbox, unsigned int vf,
 
 	return t4_wr_mbox(adap, adap->mbox, &vlan_cmd, sizeof(vlan_cmd), NULL);
 }
+
+/**
+ *	modify_device_id - Modifies the device ID of the Boot BIOS image
+ *	@device_id: the device ID to write.
+ *	@boot_data: the boot image to modify.
+ *
+ *	Write the supplied device ID to the boot BIOS image.
+ */
+static void modify_device_id(int device_id, u8 *boot_data)
+{
+	struct cxgb4_pcir_data *pcir_header;
+	struct legacy_pci_rom_hdr *header;
+	u8 *cur_header = boot_data;
+	u16 pcir_offset;
+
+	/* Loop through all chained images and change the device IDs */
+	do {
+		header = (struct legacy_pci_rom_hdr *)cur_header;
+		pcir_offset = le16_to_cpu(header->pcir_offset);
+		pcir_header = (struct cxgb4_pcir_data *)(cur_header +
+			      pcir_offset);
+
+		/**
+		 * Only modify the Device ID if code type is Legacy or HP.
+		 * 0x00: Okay to modify
+		 * 0x01: FCODE. Do not modify
+		 * 0x03: Okay to modify
+		 * 0x04-0xFF: Do not modify
+		 */
+		if (pcir_header->code_type == CXGB4_HDR_CODE1) {
+			u8 csum = 0;
+			int i;
+
+			/**
+			 * Modify Device ID to match current adapter
+			 */
+			pcir_header->device_id = cpu_to_le16(device_id);
+
+			/**
+			 * Set checksum temporarily to 0.
+			 * We will recalculate it later.
+			 */
+			header->cksum = 0x0;
+
+			/**
+			 * Calculate and update checksum
+			 */
+			for (i = 0; i < (header->size512 * 512); i++)
+				csum += cur_header[i];
+
+			/**
+			 * Invert summed value to create the checksum
+			 * Writing new checksum value directly to the boot data
+			 */
+			cur_header[7] = -csum;
+
+		} else if (pcir_header->code_type == CXGB4_HDR_CODE2) {
+			/**
+			 * Modify Device ID to match current adapter
+			 */
+			pcir_header->device_id = cpu_to_le16(device_id);
+		}
+
+		/**
+		 * Move header pointer up to the next image in the ROM.
+		 */
+		cur_header += header->size512 * 512;
+	} while (!(pcir_header->indicator & CXGB4_HDR_INDI));
+}
+
+/**
+ *	t4_load_boot - download boot flash
+ *	@adap: the adapter
+ *	@boot_data: the boot image to write
+ *	@boot_addr: offset in flash to write boot_data
+ *	@size: image size
+ *
+ *	Write the supplied boot image to the card's serial flash.
+ *	The boot image has the following sections: a 28-byte header and the
+ *	boot image.
+ */
+int t4_load_boot(struct adapter *adap, u8 *boot_data,
+		 unsigned int boot_addr, unsigned int size)
+{
+	unsigned int sf_sec_size = adap->params.sf_size / adap->params.sf_nsec;
+	unsigned int boot_sector = (boot_addr * 1024);
+	struct cxgb4_pci_exp_rom_header *header;
+	struct cxgb4_pcir_data *pcir_header;
+	int pcir_offset;
+	unsigned int i;
+	u16 device_id;
+	int ret, addr;
+
+	/**
+	 * Make sure the boot image does not encroach on the firmware region
+	 */
+	if ((boot_sector + size) >> 16 > FLASH_FW_START_SEC) {
+		dev_err(adap->pdev_dev, "boot image encroaching on firmware region\n");
+		return -EFBIG;
+	}
+
+	/* Get boot header */
+	header = (struct cxgb4_pci_exp_rom_header *)boot_data;
+	pcir_offset = le16_to_cpu(header->pcir_offset);
+	/* PCIR Data Structure */
+	pcir_header = (struct cxgb4_pcir_data *)&boot_data[pcir_offset];
+
+	/**
+	 * Perform some primitive sanity testing to avoid accidentally
+	 * writing garbage over the boot sectors.  We ought to check for
+	 * more but it's not worth it for now ...
+	 */
+	if (size < BOOT_MIN_SIZE || size > BOOT_MAX_SIZE) {
+		dev_err(adap->pdev_dev, "boot image too small/large\n");
+		return -EFBIG;
+	}
+
+	if (le16_to_cpu(header->signature) != BOOT_SIGNATURE) {
+		dev_err(adap->pdev_dev, "Boot image missing signature\n");
+		return -EINVAL;
+	}
+
+	/* Check PCI header signature */
+	if (le32_to_cpu(pcir_header->signature) != PCIR_SIGNATURE) {
+		dev_err(adap->pdev_dev, "PCI header missing signature\n");
+		return -EINVAL;
+	}
+
+	/* Check Vendor ID matches Chelsio ID */
+	if (le16_to_cpu(pcir_header->vendor_id) != PCI_VENDOR_ID_CHELSIO) {
+		dev_err(adap->pdev_dev, "Vendor ID does not match Chelsio ID\n");
+		return -EINVAL;
+	}
+
+	/**
+	 * The boot sector is comprised of the Expansion-ROM boot, iSCSI boot,
+	 * and Boot configuration data sections. These 3 boot sections span
+	 * sectors 0 to 7 in flash and live right before the FW image location.
+	 */
+	i = DIV_ROUND_UP(size ? size : FLASH_FW_START,  sf_sec_size);
+	ret = t4_flash_erase_sectors(adap, boot_sector >> 16,
+				     (boot_sector >> 16) + i - 1);
+
+	/**
+	 * If size == 0 then we're simply erasing the FLASH sectors associated
+	 * with the on-adapter option ROM file
+	 */
+	if (ret || size == 0)
+		goto out;
+	/* Retrieve adapter's device ID */
+	pci_read_config_word(adap->pdev, PCI_DEVICE_ID, &device_id);
+	/* Want to deal with PF 0 so I strip off PF 4 indicator */
+	device_id = device_id & 0xf0ff;
+
+	/* Check PCIE Device ID */
+	if (le16_to_cpu(pcir_header->device_id) != device_id) {
+		/**
+		 * Change the device ID in the Boot BIOS image to match
+		 * the Device ID of the current adapter.
+		 */
+		modify_device_id(device_id, boot_data);
+	}
+
+	/**
+	 * Skip over the first SF_PAGE_SIZE worth of data and write it after
+	 * we finish copying the rest of the boot image. This will ensure
+	 * that the BIOS boot header will only be written if the boot image
+	 * was written in full.
+	 */
+	addr = boot_sector;
+	for (size -= SF_PAGE_SIZE; size; size -= SF_PAGE_SIZE) {
+		addr += SF_PAGE_SIZE;
+		boot_data += SF_PAGE_SIZE;
+		ret = t4_write_flash(adap, addr, SF_PAGE_SIZE, boot_data);
+		if (ret)
+			goto out;
+	}
+
+	ret = t4_write_flash(adap, boot_sector, SF_PAGE_SIZE,
+			     (const u8 *)header);
+
+out:
+	if (ret)
+		dev_err(adap->pdev_dev, "boot image load failed, error %d\n",
+			ret);
+	return ret;
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
index 4a9fcd6c226c..4b697550f08d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
@@ -563,6 +563,12 @@
 #define AIVEC_V(x) ((x) << AIVEC_S)
 
 #define PCIE_PF_CLI_A	0x44
+
+#define PCIE_PF_EXPROM_OFST_A 0x4c
+#define OFFSET_S    10
+#define OFFSET_M    0x3fffU
+#define OFFSET_G(x) (((x) >> OFFSET_S) & OFFSET_M)
+
 #define PCIE_INT_CAUSE_A	0x3004
 
 #define UNXSPLCPLERR_S    29
-- 
2.21.1


^ permalink raw reply related
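
The comment in t4_load_boot() about skipping the first SF_PAGE_SIZE bytes
describes a "header last" write order: the body is written first, and the
first page is written only once everything else has succeeded. A minimal
userspace sketch of that ordering follows. It is an illustration under
stated assumptions, not the driver's code: the in-memory flash[] array,
flash_write(), write_image_header_last(), and the 256-byte page size are
all hypothetical stand-ins.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 256u	/* assumed page size for this sketch */
#define FLASH_BYTES     4096u	/* size of the simulated flash */

/* Simulated flash; a real driver would talk to hardware instead. */
static unsigned char flash[FLASH_BYTES];

/* Hypothetical page write: returns 0 on success, -1 if out of range. */
static int flash_write(size_t addr, const unsigned char *src, size_t len)
{
	if (addr > FLASH_BYTES || len > FLASH_BYTES - addr)
		return -1;
	memcpy(&flash[addr], src, len);
	return 0;
}

/*
 * Write every page after the first one, then write the first page.
 * If any body write fails, the header page is never written, so a
 * partially written image is not left with a valid-looking header.
 */
static int write_image_header_last(size_t base, const unsigned char *img,
				   size_t size)
{
	size_t off, chunk = 0;
	int ret;

	if (size < PAGE_SIZE_BYTES)
		return -1;

	for (off = PAGE_SIZE_BYTES; off < size; off += chunk) {
		/* Clamp the last chunk so a short tail is handled safely. */
		chunk = size - off;
		if (chunk > PAGE_SIZE_BYTES)
			chunk = PAGE_SIZE_BYTES;

		ret = flash_write(base + off, img + off, chunk);
		if (ret)
			return ret;
	}

	/* Header goes last, only after the whole body was written. */
	return flash_write(base, img, PAGE_SIZE_BYTES);
}
```

The sketch clamps the final chunk instead of assuming the image size is a
multiple of the page size; whether the real caller guarantees that
alignment is not visible in this excerpt.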

* [PATCH net-next v2 2/5] cxgb4: add support to flash PHY image
From: Vishal Kulkarni @ 2020-06-18  6:05 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, dt, Vishal Kulkarni
In-Reply-To: <20200618060556.14410-1-vishal@chelsio.com>

Update set_flash to write the PHY image to its flash region.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  1 +
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 39 +++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index a7a1e1f5d554..b49d16a54ada 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -141,6 +141,7 @@ enum cc_fec {
 
 enum {
 	CXGB4_ETHTOOL_FLASH_FW = 1,
+	CXGB4_ETHTOOL_FLASH_PHY = 2,
 };
 
 struct port_stats {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 92f79d0cd6ab..7118ba016f01 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -26,6 +26,7 @@ static void set_msglevel(struct net_device *dev, u32 val)
 static const char * const flash_region_strings[] = {
 	"All",
 	"Firmware",
+	"PHY Firmware",
 };
 
 static const char stats_strings[][ETH_GSTRING_LEN] = {
@@ -1240,6 +1241,39 @@ static int set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
 	return err;
 }
 
+#define CXGB4_PHY_SIG 0x130000ea
+
+static int cxgb4_validate_phy_image(const u8 *data, u32 *size)
+{
+	struct cxgb4_fw_data *header;
+
+	header = (struct cxgb4_fw_data *)data;
+	if (be32_to_cpu(header->signature) != CXGB4_PHY_SIG)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int cxgb4_ethtool_flash_phy(struct net_device *netdev,
+				   const u8 *data, u32 size)
+{
+	struct adapter *adap = netdev2adap(netdev);
+	int ret;
+
+	ret = cxgb4_validate_phy_image(data, NULL);
+	if (ret) {
+		dev_err(adap->pdev_dev, "PHY signature mismatch\n");
+		return ret;
+	}
+
+	ret = t4_load_phy_fw(adap, MEMWIN_NIC, &adap->win0_lock,
+			     NULL, data, size);
+	if (ret)
+		dev_err(adap->pdev_dev, "Failed to load PHY FW\n");
+
+	return ret;
+}
+
 static int cxgb4_ethtool_flash_fw(struct net_device *netdev,
 				  const u8 *data, u32 size)
 {
@@ -1273,6 +1307,9 @@ static int cxgb4_ethtool_flash_region(struct net_device *netdev,
 	case CXGB4_ETHTOOL_FLASH_FW:
 		ret = cxgb4_ethtool_flash_fw(netdev, data, size);
 		break;
+	case CXGB4_ETHTOOL_FLASH_PHY:
+		ret = cxgb4_ethtool_flash_phy(netdev, data, size);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
@@ -1306,6 +1343,8 @@ static int cxgb4_ethtool_get_flash_region(const u8 *data, u32 *size)
 {
 	if (!cxgb4_validate_fw_image(data, size))
 		return CXGB4_ETHTOOL_FLASH_FW;
+	if (!cxgb4_validate_phy_image(data, size))
+		return CXGB4_ETHTOOL_FLASH_PHY;
 
 	return -EOPNOTSUPP;
 }
-- 
2.21.1


^ permalink raw reply related

* [PATCH net-next v2 1/5] cxgb4: update set_flash to flash different images
From: Vishal Kulkarni @ 2020-06-18  6:05 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, dt, Vishal Kulkarni
In-Reply-To: <20200618060556.14410-1-vishal@chelsio.com>

Chelsio adapters contain several flash regions, and each region
is used by a different binary file. This patch adds support for
flashing images such as PHY firmware, boot, and boot config
using ethtool -f N.

The N value mapping is as follows.
N = 0 : Parse image and decide which region to flash
N = 1 : Firmware
N = 2 : PHY firmware
N = 3 : boot image
N = 4 : boot cfg
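
As a minimal sketch of the mapping above, the region numbers could be
modelled as an enum with a bounds-checked name lookup. The identifiers
flash_region, region_names, and flash_region_name() are hypothetical and
not taken from the driver, and the strings for regions 3 and 4 are
assumed, since only "All", "Firmware", and "PHY Firmware" appear in the
patches shown here.

```c
#include <stddef.h>

/* Region numbers as listed in the commit message (ethtool -f N). */
enum flash_region {
	FLASH_REGION_ALL     = 0,	/* parse image and decide */
	FLASH_REGION_FW      = 1,
	FLASH_REGION_PHY     = 2,
	FLASH_REGION_BOOT    = 3,
	FLASH_REGION_BOOTCFG = 4,
};

/* Names indexed by region number; entries 3 and 4 are assumed. */
static const char * const region_names[] = {
	"All",
	"Firmware",
	"PHY Firmware",
	"Boot",
	"Boot Cfg",
};

/* Return the region name, or NULL when the number is out of range. */
static const char *flash_region_name(unsigned int region)
{
	if (region >= sizeof(region_names) / sizeof(region_names[0]))
		return NULL;
	return region_names[region];
}
```

Checking the index before the table lookup keeps an unexpected N from
reading past the end of the array.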

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |   9 ++
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 121 +++++++++++++++---
 2 files changed, 115 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index cf69c6edcfec..a7a1e1f5d554 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -139,6 +139,10 @@ enum cc_fec {
 	FEC_BASER_RS  = 1 << 2   /* BaseR/Reed-Solomon */
 };
 
+enum {
+	CXGB4_ETHTOOL_FLASH_FW = 1,
+};
+
 struct port_stats {
 	u64 tx_octets;            /* total # of octets in good frames */
 	u64 tx_frames;            /* all good frames */
@@ -492,6 +496,11 @@ struct trace_params {
 	unsigned char port;
 };
 
+struct cxgb4_fw_data {
+	__be32 signature;
+	__u8 reserved[4];
+};
+
 /* Firmware Port Capabilities types. */
 
 typedef u16 fw_port_cap16_t;	/* 16-bit Port Capabilities integral value */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 9fd496732b2c..92f79d0cd6ab 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -23,6 +23,11 @@ static void set_msglevel(struct net_device *dev, u32 val)
 	netdev2adap(dev)->msg_enable = val;
 }
 
+static const char * const flash_region_strings[] = {
+	"All",
+	"Firmware",
+};
+
 static const char stats_strings[][ETH_GSTRING_LEN] = {
 	"tx_octets_ok           ",
 	"tx_frames_ok           ",
@@ -1235,15 +1240,88 @@ static int set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
 	return err;
 }
 
-static int set_flash(struct net_device *netdev, struct ethtool_flash *ef)
+static int cxgb4_ethtool_flash_fw(struct net_device *netdev,
+				  const u8 *data, u32 size)
 {
-	int ret;
-	const struct firmware *fw;
 	struct adapter *adap = netdev2adap(netdev);
 	unsigned int mbox = PCIE_FW_MASTER_M + 1;
-	u32 pcie_fw;
+	int ret;
+
+	/* If the adapter has been fully initialized then we'll go ahead and
+	 * try to get the firmware's cooperation in upgrading to the new
+	 * firmware image otherwise we'll try to do the entire job from the
+	 * host ... and we always "force" the operation in this path.
+	 */
+	if (adap->flags & CXGB4_FULL_INIT_DONE)
+		mbox = adap->mbox;
+
+	ret = t4_fw_upgrade(adap, mbox, data, size, 1);
+	if (ret)
+		dev_err(adap->pdev_dev,
+			"Failed to flash firmware\n");
+
+	return ret;
+}
+
+static int cxgb4_ethtool_flash_region(struct net_device *netdev,
+				      const u8 *data, u32 size, u32 region)
+{
+	struct adapter *adap = netdev2adap(netdev);
+	int ret;
+
+	switch (region) {
+	case CXGB4_ETHTOOL_FLASH_FW:
+		ret = cxgb4_ethtool_flash_fw(netdev, data, size);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+
+	if (!ret)
+		dev_info(adap->pdev_dev,
+			 "loading %s successful, reload cxgb4 driver\n",
+			 flash_region_strings[region]);
+	return ret;
+}
+
+#define CXGB4_FW_SIG 0x4368656c
+#define CXGB4_FW_SIG_OFFSET 0x160
+
+static int cxgb4_validate_fw_image(const u8 *data, u32 *size)
+{
+	struct cxgb4_fw_data *header;
+
+	header = (struct cxgb4_fw_data *)&data[CXGB4_FW_SIG_OFFSET];
+	if (be32_to_cpu(header->signature) != CXGB4_FW_SIG)
+		return -EINVAL;
+
+	if (size)
+		*size = be16_to_cpu(((struct fw_hdr *)data)->len512) * 512;
+
+	return 0;
+}
+
+static int cxgb4_ethtool_get_flash_region(const u8 *data, u32 *size)
+{
+	if (!cxgb4_validate_fw_image(data, size))
+		return CXGB4_ETHTOOL_FLASH_FW;
+
+	return -EOPNOTSUPP;
+}
+
+static int set_flash(struct net_device *netdev, struct ethtool_flash *ef)
+{
+	struct adapter *adap = netdev2adap(netdev);
+	const struct firmware *fw;
 	unsigned int master;
 	u8 master_vld = 0;
+	const u8 *fw_data;
+	size_t fw_size;
+	u32 size = 0;
+	u32 pcie_fw;
+	int region;
+	int ret;
 
 	pcie_fw = t4_read_reg(adap, PCIE_FW_A);
 	master = PCIE_FW_MASTER_G(pcie_fw);
@@ -1261,19 +1339,32 @@ static int set_flash(struct net_device *netdev, struct ethtool_flash *ef)
 	if (ret < 0)
 		return ret;
 
-	/* If the adapter has been fully initialized then we'll go ahead and
-	 * try to get the firmware's cooperation in upgrading to the new
-	 * firmware image otherwise we'll try to do the entire job from the
-	 * host ... and we always "force" the operation in this path.
-	 */
-	if (adap->flags & CXGB4_FULL_INIT_DONE)
-		mbox = adap->mbox;
+	fw_data = fw->data;
+	fw_size = fw->size;
+	if (ef->region == ETHTOOL_FLASH_ALL_REGIONS) {
+		while (fw_size > 0) {
+			size = 0;
+			region = cxgb4_ethtool_get_flash_region(fw_data, &size);
+			if (region < 0 || !size) {
+				ret = region;
+				goto out_free_fw;
+			}
+
+			ret = cxgb4_ethtool_flash_region(netdev, fw_data, size,
+							 region);
+			if (ret)
+				goto out_free_fw;
+
+			fw_data += size;
+			fw_size -= size;
+		}
+	} else {
+		ret = cxgb4_ethtool_flash_region(netdev, fw_data, fw_size,
+						 ef->region);
+	}
 
-	ret = t4_fw_upgrade(adap, mbox, fw->data, fw->size, 1);
+out_free_fw:
 	release_firmware(fw);
-	if (!ret)
-		dev_info(adap->pdev_dev,
-			 "loaded firmware %s, reload cxgb4 driver\n", ef->data);
 	return ret;
 }
 
-- 
2.21.1


^ permalink raw reply related

* [PATCH net-next v2 0/5] cxgb4: add support to read/write flash
From: Vishal Kulkarni @ 2020-06-18  6:05 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, dt, Vishal Kulkarni

This patch series adds support for reading and writing the different binary
images in the serial flash present on Chelsio Terminator adapters.

V2 changes:
Patch 1: No change
Patch 2: No change
Patch 3: Fix 4 compilation warnings reported by C=1, W=1 flags
Patch 4: No change
Patch 5: No change
 
Vishal Kulkarni (5):
  cxgb4: update set_flash to flash different images
  cxgb4: add support to flash PHY image
  cxgb4: add support to flash boot image
  cxgb4: add support to flash boot cfg image
  cxgb4: add support to read serial flash

 drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h |   3 +-
 .../net/ethernet/chelsio/cxgb4/cudbg_lib.c    |  38 +++
 .../net/ethernet/chelsio/cxgb4/cudbg_lib.h    |   4 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  67 +++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c  |  14 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h  |   1 +
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 244 ++++++++++++++-
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c    | 277 ++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h  |   6 +
 9 files changed, 638 insertions(+), 16 deletions(-)

-- 
2.21.1


^ permalink raw reply

* Re: [PATCH bpf-next 8/9] tools/bpftool: show info for processes holding BPF map/prog/link/btf FDs
From: Andrii Nakryiko @ 2020-06-18  6:01 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Hao Luo, Arnaldo Carvalho de Melo,
	Song Liu
In-Reply-To: <eebb2cea-dc27-77c6-936e-06ac5272921a@isovalent.com>

On Wed, Jun 17, 2020 at 5:24 PM Quentin Monnet <quentin@isovalent.com> wrote:
>
> 2020-06-17 09:18 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> > Add bpf_iter-based way to find all the processes that hold open FDs against
> > BPF object (map, prog, link, btf). bpftool always attempts to discover this,
> > but will silently give up if kernel doesn't yet support bpf_iter BPF programs.
> > Process name and PID are emitted for each process (task group).
> >
> > Sample output for each of 4 BPF objects:
> >
> > $ sudo ./bpftool prog show
> > 2694: cgroup_device  tag 8c42dee26e8cd4c2  gpl
> >         loaded_at 2020-06-16T15:34:32-0700  uid 0
> >         xlated 648B  jited 409B  memlock 4096B
> >         pids systemd(1)
> > 2907: cgroup_skb  name egress  tag 9ad187367cf2b9e8  gpl
> >         loaded_at 2020-06-16T18:06:54-0700  uid 0
> >         xlated 48B  jited 59B  memlock 4096B  map_ids 2436
> >         btf_id 1202
> >         pids test_progs(2238417), test_progs(2238445)
> >
> > $ sudo ./bpftool map show
> > 2436: array  name test_cgr.bss  flags 0x400
> >         key 4B  value 8B  max_entries 1  memlock 8192B
> >         btf_id 1202
> >         pids test_progs(2238417), test_progs(2238445)
> > 2445: array  name pid_iter.rodata  flags 0x480
> >         key 4B  value 4B  max_entries 1  memlock 8192B
> >         btf_id 1214  frozen
> >         pids bpftool(2239612)
> >
> > $ sudo ./bpftool link show
> > 61: cgroup  prog 2908
> >         cgroup_id 375301  attach_type egress
> >         pids test_progs(2238417), test_progs(2238445)
> > 62: cgroup  prog 2908
> >         cgroup_id 375344  attach_type egress
> >         pids test_progs(2238417), test_progs(2238445)
> >
> > $ sudo ./bpftool btf show
> > 1202: size 1527B  prog_ids 2908,2907  map_ids 2436
> >         pids test_progs(2238417), test_progs(2238445)
> > 1242: size 34684B
> >         pids bpftool(2258892)
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
>
> [...]
>
> > diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
> > new file mode 100644
> > index 000000000000..3474a91743ff
> > --- /dev/null
> > +++ b/tools/bpf/bpftool/pids.c
> > @@ -0,0 +1,229 @@
>
> [...]
>
> > +int build_obj_refs_table(struct obj_refs_table *table, enum bpf_obj_type type)
> > +{
> > +     char buf[4096];
> > +     struct pid_iter_bpf *skel;
> > +     struct pid_iter_entry *e;
> > +     int err, ret, fd = -1, i;
> > +     libbpf_print_fn_t default_print;
> > +
> > +     hash_init(table->table);
> > +     set_max_rlimit();
> > +
> > +     skel = pid_iter_bpf__open();
> > +     if (!skel) {
> > +             p_err("failed to open PID iterator skeleton");
> > +             return -1;
> > +     }
> > +
> > +     skel->rodata->obj_type = type;
> > +
> > +     /* we don't want output polluted with libbpf errors if bpf_iter is not
> > +      * supported
> > +      */
> > +     default_print = libbpf_set_print(libbpf_print_none);
> > +     err = pid_iter_bpf__load(skel);
> > +     libbpf_set_print(default_print);
> > +     if (err) {
> > +             /* too bad, kernel doesn't support BPF iterators yet */
> > +             err = 0;
> > +             goto out;
> > +     }
> > +     err = pid_iter_bpf__attach(skel);
> > +     if (err) {
> > +             /* if we loaded above successfully, attach has to succeed */
> > +             p_err("failed to attach PID iterator: %d", err);
>
> Nit: What about using strerror(err) for the error messages, here and
> below? It's easier to read than an integer value.

I'm actually against it. A purely textual error message is often quite
confusing. It takes an extra step, and sometimes quite a quest in
itself, to find out which integer errno value it was, just to
understand what kind of error it actually is. So I certainly prefer
having the integer value, optionally with a string error message.

But that's too much hassle for these "shouldn't happen" types of errors.
If they happen, the user is unlikely to infer anything useful and fix
the problem by themselves. They will most probably have to ask on the
mailing list, paste the error messages verbatim, and let people like
you and me try to guess what's going on. In such cases, having an errno
number is much more helpful.
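
For illustration only, the "integer value, optionally with a string"
format could look roughly like the sketch below. format_err() is a
hypothetical helper and not part of bpftool; it assumes err holds a
negative errno value, as in the err = -errno assignments in the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats "<msg>: <errno value> (<errno text>)".
 * 'err' is expected to be a negative errno value, e.g. -EINVAL.
 * Returns what snprintf() returns.
 */
static int format_err(char *buf, size_t len, const char *msg, int err)
{
	return snprintf(buf, len, "%s: %d (%s)", msg, err, strerror(-err));
}
```

Keeping the number first means a pasted log line still identifies the
errno even when the strerror() text differs between C libraries.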

>
> > +             goto out;
> > +     }
> > +
> > +     fd = bpf_iter_create(bpf_link__fd(skel->links.iter));
> > +     if (fd < 0) {
> > +             err = -errno;
> > +             p_err("failed to create PID iterator session: %d", err);
> > +             goto out;
> > +     }
> > +
> > +     while (true) {
> > +             ret = read(fd, buf, sizeof(buf));
> > +             if (ret < 0) {
> > +                     err = -errno;
> > +                     p_err("failed to read PID iterator output: %d", err);
> > +                     goto out;
> > +             }
> > +             if (ret == 0)
> > +                     break;
> > +             if (ret % sizeof(*e)) {
> > +                     err = -EINVAL;
> > +                     p_err("invalid PID iterator output format");
> > +                     goto out;
> > +             }
> > +             ret /= sizeof(*e);
> > +
> > +             e = (void *)buf;
> > +             for (i = 0; i < ret; i++, e++) {
> > +                     add_ref(table, e);
> > +             }
> > +     }
> > +     err = 0;
> > +out:
> > +     if (fd >= 0)
> > +             close(fd);
> > +     pid_iter_bpf__destroy(skel);
> > +     return err;
> > +}
>
> [...]
>
> > diff --git a/tools/bpf/bpftool/skeleton/pid_iter.bpf.c b/tools/bpf/bpftool/skeleton/pid_iter.bpf.c
> > new file mode 100644
> > index 000000000000..f560e48add07
> > --- /dev/null
> > +++ b/tools/bpf/bpftool/skeleton/pid_iter.bpf.c
> > @@ -0,0 +1,80 @@
> > +// SPDX-License-Identifier: GPL-2.0
>
> This would make it the only file not dual-licensed GPL/BSD in bpftool.
> We've had issues with that before [0], although linking to libbfd is no
> more a hard requirement. But I see you used a dual-license in the
> corresponding header file pid_iter.h, so is the single license
> intentional here? Or would you consider GPL/BSD?
>

The other BPF program (skeleton/profiler.bpf.c) is also GPL-2.0, so we
should probably switch both.

> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=896165#38
>
> > +// Copyright (c) 2020 Facebook
> > +#include <vmlinux.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_core_read.h>
> > +#include <bpf/bpf_tracing.h>
> > +#include "pid_iter.h"
>
> [...]
>
> > +
> > +char LICENSE[] SEC("license") = "GPL";

I wonder if leaving this as GPL would be ok, if the source code itself
is dual GPL/BSD?


> > diff --git a/tools/bpf/bpftool/skeleton/pid_iter.h b/tools/bpf/bpftool/skeleton/pid_iter.h
> > new file mode 100644
> > index 000000000000..5692cf257adb
> > --- /dev/null
> > +++ b/tools/bpf/bpftool/skeleton/pid_iter.h
> > @@ -0,0 +1,12 @@
> > +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
>
> [...]
>

^ permalink raw reply

* Re: [PATCH bpf-next 9/9] tools/bpftool: add documentation and sample output for process info
From: Andrii Nakryiko @ 2020-06-18  5:51 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Hao Luo, Arnaldo Carvalho de Melo,
	Song Liu
In-Reply-To: <e22ae69f-c174-1cc8-d3b3-68fdda8934ae@isovalent.com>

On Wed, Jun 17, 2020 at 5:25 PM Quentin Monnet <quentin@isovalent.com> wrote:
>
> 2020-06-17 09:18 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> > Add statements about bpftool being able to discover process info, holding
> > reference to BPF map, prog, link, or BTF. Show example output as well.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/bpf/bpftool/Documentation/bpftool-btf.rst  |  5 +++++
> >  tools/bpf/bpftool/Documentation/bpftool-link.rst | 13 ++++++++++++-
> >  tools/bpf/bpftool/Documentation/bpftool-map.rst  |  8 +++++++-
> >  tools/bpf/bpftool/Documentation/bpftool-prog.rst | 11 +++++++++++
> >  4 files changed, 35 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/bpf/bpftool/Documentation/bpftool-btf.rst b/tools/bpf/bpftool/Documentation/bpftool-btf.rst
> > index ce3a724f50c1..85f7c82ebb28 100644
> > --- a/tools/bpf/bpftool/Documentation/bpftool-btf.rst
> > +++ b/tools/bpf/bpftool/Documentation/bpftool-btf.rst
> > @@ -36,6 +36,11 @@ DESCRIPTION
> >                 otherwise list all BTF objects currently loaded on the
> >                 system.
> >
> > +               Since Linux 5.8 bpftool is able to discover information about
> > +               processes that hold open file descriptors (FDs) against BPF
> > +               links. On such kernels bpftool will automatically emit this
>
> Copy-paste error: s/BPF links/BTF objects/
>

oops, will fix

> > +               information as well.
> > +
> >       **bpftool btf dump** *BTF_SRC*
> >                 Dump BTF entries from a given *BTF_SRC*.
> >
> > diff --git a/tools/bpf/bpftool/Documentation/bpftool-link.rst b/tools/bpf/bpftool/Documentation/bpftool-link.rst
> > index 0e43d7b06c11..1da7ef65b514 100644
> > --- a/tools/bpf/bpftool/Documentation/bpftool-link.rst
> > +++ b/tools/bpf/bpftool/Documentation/bpftool-link.rst
> > @@ -37,6 +37,11 @@ DESCRIPTION
> >                 zero or more named attributes, some of which depend on type
> >                 of link.
> >
> > +               Since Linux 5.8 bpftool is able to discover information about
> > +               processes that hold open file descriptors (FDs) against BPF
> > +               links. On such kernels bpftool will automatically emit this
> > +               information as well.
> > +
> >       **bpftool link pin** *LINK* *FILE*
> >                 Pin link *LINK* as *FILE*.
> >
> > @@ -82,6 +87,7 @@ EXAMPLES
> >
> >      10: cgroup  prog 25
> >              cgroup_id 614  attach_type egress
> > +            pids test_progs(2238417)
>
> (That's a big PID. Maybe something below the default max pid (32768)
> might be less confusing for users, but also maybe that's just me
> nitpicking too much.)

heh, real system, but yeah, I can make up a smaller PID :)

>
> >
> >  **# bpftool --json --pretty link show**
> >
> > @@ -91,7 +97,12 @@ EXAMPLES
> >              "type": "cgroup",
> >              "prog_id": 25,
> >              "cgroup_id": 614,
> > -            "attach_type": "egress"
> > +            "attach_type": "egress",
> > +            "pids": [{
> > +                    "pid": 2238417,
> > +                    "comm": "test_progs"
> > +                }
> > +            ]
> >          }
> >      ]
> >

^ permalink raw reply

* Re: [PATCH v5 3/7] fs: Add fd_install_received() wrapper for __fd_install_received()
From: Sargun Dhillon @ 2020-06-18  5:49 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, Christian Brauner, Tycho Andersen, David Laight,
	Christoph Hellwig, David S. Miller, Jakub Kicinski,
	Alexander Viro, Aleksa Sarai, Matt Denton, Jann Horn,
	Chris Palmer, Robert Sesek, Giuseppe Scrivano, Greg Kroah-Hartman,
	Andy Lutomirski, Will Drewry, Shuah Khan, netdev, containers,
	linux-api, linux-fsdevel, linux-kselftest
In-Reply-To: <20200617220327.3731559-4-keescook@chromium.org>

On Wed, Jun 17, 2020 at 03:03:23PM -0700, Kees Cook wrote:
> For both pidfd and seccomp, the __user pointer is not used. Update
> __fd_install_received() to make writing to ufd optional via a NULL check.
> However, for the fd_install_received_user() wrapper, ufd is NULL checked
> so an -EFAULT can be returned to avoid changing the SCM_RIGHTS interface
> behavior. Add new wrapper fd_install_received() for pidfd and seccomp
> that does not use the ufd argument. For the new helper, the new fd needs
> to be returned on success. Update the existing callers to handle it.
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  fs/file.c            | 22 ++++++++++++++--------
>  include/linux/file.h |  7 +++++++
>  net/compat.c         |  2 +-
>  net/core/scm.c       |  2 +-
>  4 files changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/file.c b/fs/file.c
> index f2167d6feec6..de85a42defe2 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -942,9 +942,10 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags)
>   * @o_flags: the O_* flags to apply to the new fd entry
>   *
>   * Installs a received file into the file descriptor table, with appropriate
> - * checks and count updates. Writes the fd number to userspace.
> + * checks and count updates. Optionally writes the fd number to userspace, if
> + * @ufd is non-NULL.
>   *
> - * Returns -ve on error.
> + * Returns newly installed fd or -ve on error.
>   */
>  int __fd_install_received(struct file *file, int __user *ufd, unsigned int o_flags)
>  {
> @@ -960,20 +961,25 @@ int __fd_install_received(struct file *file, int __user *ufd, unsigned int o_fla
>  	if (new_fd < 0)
>  		return new_fd;
>  
> -	error = put_user(new_fd, ufd);
> -	if (error) {
> -		put_unused_fd(new_fd);
> -		return error;
> +	if (ufd) {
> +		error = put_user(new_fd, ufd);
> +		if (error) {
> +			put_unused_fd(new_fd);
> +			return error;
> +		}
>  	}
>  
> -	/* Bump the usage count and install the file. */
> +	/* Bump the usage count and install the file. The resulting value of
> +	 * "error" is ignored here since we only need to take action when
> +	 * the file is a socket and testing "sock" for NULL is sufficient.
> +	 */
>  	sock = sock_from_file(file, &error);
>  	if (sock) {
>  		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
>  		sock_update_classid(&sock->sk->sk_cgrp_data);
>  	}
>  	fd_install(new_fd, get_file(file));
> -	return 0;
> +	return new_fd;
>  }
>  
>  static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
> diff --git a/include/linux/file.h b/include/linux/file.h
> index fe18a1a0d555..e19974ed9322 100644
> --- a/include/linux/file.h
> +++ b/include/linux/file.h
> @@ -9,6 +9,7 @@
>  #include <linux/compiler.h>
>  #include <linux/types.h>
>  #include <linux/posix_types.h>
> +#include <linux/errno.h>
>  
>  struct file;
>  
> @@ -96,8 +97,14 @@ extern int __fd_install_received(struct file *file, int __user *ufd,
>  static inline int fd_install_received_user(struct file *file, int __user *ufd,
>  					   unsigned int o_flags)
>  {
> +	if (ufd == NULL)
> +		return -EFAULT;
Isn't this *technically* a behaviour change? Nonetheless, I think this is a much better
approach than forcing everyone to do NULL checking, and it avoids at least one error case
where the kernel installs FDs for SCM_RIGHTS and they're not actually usable.

>  	return __fd_install_received(file, ufd, o_flags);
>  }
> +static inline int fd_install_received(struct file *file, unsigned int o_flags)
> +{
> +	return __fd_install_received(file, NULL, o_flags);
> +}
>  
>  extern void flush_delayed_fput(void);
>  extern void __fput_sync(struct file *);
> diff --git a/net/compat.c b/net/compat.c
> index 94f288e8dac5..71494337cca7 100644
> --- a/net/compat.c
> +++ b/net/compat.c
> @@ -299,7 +299,7 @@ void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm)
>  
>  	for (i = 0; i < fdmax; i++) {
>  		err = fd_install_received_user(scm->fp->fp[i], cmsg_data + i, o_flags);
> -		if (err)
> +		if (err < 0)
>  			break;
>  	}
>  
> diff --git a/net/core/scm.c b/net/core/scm.c
> index df190f1fdd28..b9a0442ebd26 100644
> --- a/net/core/scm.c
> +++ b/net/core/scm.c
> @@ -307,7 +307,7 @@ void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
>  
>  	for (i = 0; i < fdmax; i++) {
>  		err = fd_install_received_user(scm->fp->fp[i], cmsg_data + i, o_flags);
> -		if (err)
> +		if (err < 0)
>  			break;
>  	}
>  
> -- 
> 2.25.1
> 

Reviewed-by: Sargun Dhillon <sargun@sargun.me>
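
For readers following the thread, the wrapper arrangement discussed above
(one core helper with an optional output pointer, one wrapper that
rejects NULL, one wrapper that passes NULL) can be sketched in userspace
C as follows. This is only an analogue under stated assumptions: the
demo_* names and the next_fd counter are hypothetical, and there is no
real fd table or put_user() involved.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for fd allocation; purely illustrative. */
static int next_fd = 3;

/*
 * Core helper: stores the new fd through 'out' only when 'out' is
 * non-NULL, and always returns the new fd (or a negative errno).
 */
static int demo_install_common(int *out)
{
	int new_fd = next_fd++;

	if (out)
		*out = new_fd;
	return new_fd;
}

/* Wrapper for callers that must report the fd through a pointer. */
static int demo_install_to_ptr(int *out)
{
	if (out == NULL)
		return -EFAULT;
	return demo_install_common(out);
}

/* Wrapper for callers that only need the returned fd. */
static int demo_install(void)
{
	return demo_install_common(NULL);
}
```

Because success now returns a non-negative fd rather than 0, callers
have to test for "< 0" instead of "!= 0", which matches the changes to
the scm_detach_fds() loops in the quoted patch.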

^ permalink raw reply

* [net-next 13/15] iecm: Add ethtool
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

Implement ethtool interface for the common module.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../net/ethernet/intel/iecm/iecm_ethtool.c    | 1107 ++++++++++++++++-
 drivers/net/ethernet/intel/iecm/iecm_lib.c    |  100 +-
 2 files changed, 1203 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/iecm/iecm_ethtool.c b/drivers/net/ethernet/intel/iecm/iecm_ethtool.c
index a6532592f2f4..2031d736bac6 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_ethtool.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_ethtool.c
@@ -3,6 +3,1111 @@
 
 #include <linux/net/intel/iecm.h>
 
+/**
+ * iecm_get_rxnfc - command to get RX flow classification rules
+ * @netdev: network interface device structure
+ * @cmd: ethtool rxnfc command
+ * @rule_locs: pointer to store rule locations
+ *
+ * Returns Success if the command is supported.
+ */
+static int iecm_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
+			  u32 *rule_locs)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	int ret = -EOPNOTSUPP;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_GRXRINGS:
+		cmd->data = vport->num_rxq;
+		ret = 0;
+		break;
+	case ETHTOOL_GRXFH:
+		netdev_info(netdev, "RSS hash info is not available\n");
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+/**
+ * iecm_get_rxfh_key_size - get the RSS hash key size
+ * @netdev: network interface device structure
+ *
+ * Returns the key size.
+ */
+static u32 iecm_get_rxfh_key_size(struct net_device *netdev)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+
+	if (!iecm_is_cap_ena(vport->adapter, VIRTCHNL_CAP_RSS)) {
+		dev_info(&vport->adapter->pdev->dev, "RSS is not supported on this device\n");
+		return 0;
+	}
+
+	return vport->adapter->rss_data.rss_key_size;
+}
+
+/**
+ * iecm_get_rxfh_indir_size - get the Rx flow hash indirection table size
+ * @netdev: network interface device structure
+ *
+ * Returns the table size.
+ */
+static u32 iecm_get_rxfh_indir_size(struct net_device *netdev)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+
+	if (!iecm_is_cap_ena(vport->adapter, VIRTCHNL_CAP_RSS)) {
+		dev_info(&vport->adapter->pdev->dev, "RSS is not supported on this device\n");
+		return 0;
+	}
+
+	return vport->adapter->rss_data.rss_lut_size;
+}
+
+/**
+ * iecm_find_virtual_qid - Finds the virtual RX qid from the absolute RX qid
+ * @vport: virtual port structure
+ * @qid_list: List of the RX qid's
+ *
+ * Returns the virtual RX QID.
+ */
+static u32 iecm_find_virtual_qid(struct iecm_vport *vport, u16 *qid_list,
+				 u32 abs_rx_qid)
+{
+	u32 i;
+
+	for (i = 0; i < vport->num_rxq; i++)
+		if ((u32)qid_list[i] == abs_rx_qid)
+			break;
+	return i;
+}
+
+/**
+ * iecm_get_rxfh - get the Rx flow hash indirection table
+ * @netdev: network interface device structure
+ * @indir: indirection table
+ * @key: hash key
+ * @hfunc: hash function in use
+ *
+ * Reads the indirection table from the cached RSS LUT. Returns 0 or -ENOMEM.
+ */
+static int iecm_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
+			 u8 *hfunc)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	struct iecm_adapter *adapter;
+	u16 i, *qid_list;
+	u32 abs_qid;
+
+	adapter = vport->adapter;
+
+	if (!iecm_is_cap_ena(adapter, VIRTCHNL_CAP_RSS)) {
+		dev_info(&vport->adapter->pdev->dev, "RSS is not supported on this device\n");
+		return 0;
+	}
+
+	if (adapter->state != __IECM_UP)
+		return 0;
+
+	if (hfunc)
+		*hfunc = ETH_RSS_HASH_TOP;
+
+	if (key)
+		memcpy(key, adapter->rss_data.rss_key,
+		       adapter->rss_data.rss_key_size);
+
+	qid_list = kcalloc(vport->num_rxq, sizeof(u16), GFP_KERNEL);
+	if (!qid_list)
+		return -ENOMEM;
+
+	iecm_get_rx_qid_list(vport, qid_list);
+
+	if (indir)
+		/* Each 32-bit entry of 'indir' stores one LUT entry */
+		for (i = 0; i < adapter->rss_data.rss_lut_size; i++) {
+			abs_qid = (u32)adapter->rss_data.rss_lut[i];
+			indir[i] = iecm_find_virtual_qid(vport, qid_list,
+							 abs_qid);
+		}
+
+	kfree(qid_list);
+
+	return 0;
+}
+
+/**
+ * iecm_set_rxfh - set the Rx flow hash indirection table
+ * @netdev: network interface device structure
+ * @indir: indirection table
+ * @key: hash key
+ * @hfunc: hash function to use
+ *
+ * Returns 0 after programming the table, or a negative error code (for
+ * example, -EINVAL if the table specifies an invalid queue id).
+ */
+static int iecm_set_rxfh(struct net_device *netdev, const u32 *indir,
+			 const u8 *key, const u8 hfunc)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	struct iecm_adapter *adapter;
+	u16 *qid_list;
+	u16 lut;
+
+	adapter = vport->adapter;
+
+	if (!iecm_is_cap_ena(adapter, VIRTCHNL_CAP_RSS)) {
+		dev_info(&adapter->pdev->dev, "RSS is not supported on this device.\n");
+		return 0;
+	}
+	if (adapter->state != __IECM_UP)
+		return 0;
+
+	if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP)
+		return -EOPNOTSUPP;
+
+	if (key)
+		memcpy(adapter->rss_data.rss_key, key,
+		       adapter->rss_data.rss_key_size);
+
+	qid_list = kcalloc(vport->num_rxq, sizeof(u16), GFP_KERNEL);
+	if (!qid_list)
+		return -ENOMEM;
+
+	iecm_get_rx_qid_list(vport, qid_list);
+
+	if (indir) {
+		for (lut = 0; lut < adapter->rss_data.rss_lut_size; lut++) {
+			u32 index = indir[lut];
+
+			if (index >= vport->num_rxq) {
+				kfree(qid_list);
+				return -EINVAL;
+			}
+			adapter->rss_data.rss_lut[lut] = qid_list[index];
+		}
+	} else {
+		iecm_fill_dflt_rss_lut(vport, qid_list);
+	}
+
+	kfree(qid_list);
+
+	return iecm_config_rss(vport);
+}
+
+/**
+ * iecm_get_channels - get the number of channels supported by the device
+ * @netdev: network interface device structure
+ * @ch: channel information structure
+ *
+ * Report maximum of TX and RX. Report one extra channel to match our mailbox
+ * queue.
+ */
+static void iecm_get_channels(struct net_device *netdev,
+			      struct ethtool_channels *ch)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+
+	/* Report maximum channels */
+	ch->max_combined = IECM_MAX_Q;
+
+	ch->max_other = IECM_MAX_NONQ;
+	ch->other_count = IECM_MAX_NONQ;
+
+	ch->combined_count = max(vport->num_txq, vport->num_rxq);
+}
+
+/**
+ * iecm_set_channels - set the new channel count
+ * @netdev: network interface device structure
+ * @ch: channel information structure
+ *
+ * Negotiate a new number of channels with CP. Returns 0 on success, negative
+ * on failure.
+ */
+static int iecm_set_channels(struct net_device *netdev,
+			     struct ethtool_channels *ch)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	int num_req_q = ch->combined_count;
+
+	if (num_req_q == max(vport->num_txq, vport->num_rxq))
+		return 0;
+
+	/* All of these should have already been checked by ethtool before this
+	 * even gets to us, but just to be sure.
+	 */
+	if (num_req_q <= 0 || num_req_q > IECM_MAX_Q)
+		return -EINVAL;
+
+	if (ch->rx_count || ch->tx_count || ch->other_count != IECM_MAX_NONQ)
+		return -EINVAL;
+
+	vport->adapter->config_data.num_req_qs = num_req_q;
+
+	return iecm_initiate_soft_reset(vport, __IECM_SR_Q_CHANGE);
+}
+
+/**
+ * iecm_get_ringparam - Get ring parameters
+ * @netdev: network interface device structure
+ * @ring: ethtool ringparam structure
+ *
+ * Returns current ring parameters. TX and RX rings are reported separately,
+ * but the number of rings is not reported.
+ */
+static void iecm_get_ringparam(struct net_device *netdev,
+			       struct ethtool_ringparam *ring)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+
+	ring->rx_max_pending = IECM_MAX_RXQ_DESC;
+	ring->tx_max_pending = IECM_MAX_TXQ_DESC;
+	ring->rx_pending = vport->rxq_desc_count;
+	ring->tx_pending = vport->txq_desc_count;
+}
+
+/**
+ * iecm_set_ringparam - Set ring parameters
+ * @netdev: network interface device structure
+ * @ring: ethtool ringparam structure
+ *
+ * Sets ring parameters. TX and RX rings are controlled separately, but the
+ * number of rings is not specified, so all rings get the same settings.
+ */
+static int iecm_set_ringparam(struct net_device *netdev,
+			      struct ethtool_ringparam *ring)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	u32 new_rx_count, new_tx_count;
+
+	if (ring->rx_mini_pending || ring->rx_jumbo_pending)
+		return -EINVAL;
+
+	new_tx_count = clamp_t(u32, ring->tx_pending,
+			       IECM_MIN_TXQ_DESC,
+			       IECM_MAX_TXQ_DESC);
+	new_tx_count = ALIGN(new_tx_count, IECM_REQ_DESC_MULTIPLE);
+
+	new_rx_count = clamp_t(u32, ring->rx_pending,
+			       IECM_MIN_RXQ_DESC,
+			       IECM_MAX_RXQ_DESC);
+	new_rx_count = ALIGN(new_rx_count, IECM_REQ_DESC_MULTIPLE);
+
+	/* if nothing to do return success */
+	if (new_tx_count == vport->txq_desc_count &&
+	    new_rx_count == vport->rxq_desc_count)
+		return 0;
+
+	vport->adapter->config_data.num_req_txq_desc = new_tx_count;
+	vport->adapter->config_data.num_req_rxq_desc = new_rx_count;
+
+	return iecm_initiate_soft_reset(vport, __IECM_SR_Q_DESC_CHANGE);
+}
+
+/**
+ * struct iecm_stats - definition for an ethtool statistic
+ * @stat_string: statistic name to display in ethtool -S output
+ * @sizeof_stat: the sizeof() the stat, must be no greater than sizeof(u64)
+ * @stat_offset: offsetof() the stat from a base pointer
+ *
+ * This structure defines a statistic to be added to the ethtool stats buffer.
+ * It defines a statistic as offset from a common base pointer. Stats should
+ * be defined in constant arrays using the IECM_STAT macro, with every element
+ * of the array using the same _type for calculating the sizeof_stat and
+ * stat_offset.
+ *
+ * The @sizeof_stat is expected to be sizeof(u8), sizeof(u16), sizeof(u32) or
+ * sizeof(u64). Other sizes are not expected and will produce a WARN_ONCE from
+ * the iecm_add_one_ethtool_stat() helper function.
+ *
+ * The @stat_string is interpreted as a format string, allowing formatted
+ * values to be inserted while looping over multiple structures for a given
+ * statistics array. Thus, every statistic string in an array should have the
+ * same type and number of format specifiers, to be formatted by variadic
+ * arguments to the iecm_add_stat_strings() helper macro.
+ */
+struct iecm_stats {
+	char stat_string[ETH_GSTRING_LEN];
+	int sizeof_stat;
+	int stat_offset;
+};
+
+/* Helper macro to define an iecm_stats structure with proper size and type.
+ * Use this when defining constant statistics arrays. Note that @_type expects
+ * only a type name and is used multiple times.
+ */
+#define IECM_STAT(_type, _name, _stat) { \
+	.stat_string = _name, \
+	.sizeof_stat = sizeof_field(_type, _stat), \
+	.stat_offset = offsetof(_type, _stat) \
+}
+
+/* Helper macro for defining some statistics related to queues */
+#define IECM_QUEUE_STAT(_name, _stat) \
+	IECM_STAT(struct iecm_queue, _name, _stat)
+
+/* Stats associated with a Tx queue */
+static const struct iecm_stats iecm_gstrings_tx_queue_stats[] = {
+	IECM_QUEUE_STAT("%s-%u.packets", q_stats.tx.packets),
+	IECM_QUEUE_STAT("%s-%u.bytes", q_stats.tx.bytes),
+};
+
+/* Stats associated with an Rx queue */
+static const struct iecm_stats iecm_gstrings_rx_queue_stats[] = {
+	IECM_QUEUE_STAT("%s-%u.packets", q_stats.rx.packets),
+	IECM_QUEUE_STAT("%s-%u.bytes", q_stats.rx.bytes),
+	IECM_QUEUE_STAT("%s-%u.generic_csum", q_stats.rx.generic_csum),
+	IECM_QUEUE_STAT("%s-%u.basic_csum", q_stats.rx.basic_csum),
+	IECM_QUEUE_STAT("%s-%u.csum_err", q_stats.rx.csum_err),
+	IECM_QUEUE_STAT("%s-%u.hsplit_buf_overflow", q_stats.rx.hsplit_hbo),
+};
+
+#define IECM_TX_QUEUE_STATS_LEN		ARRAY_SIZE(iecm_gstrings_tx_queue_stats)
+#define IECM_RX_QUEUE_STATS_LEN		ARRAY_SIZE(iecm_gstrings_rx_queue_stats)
+
+/* For now we have one and only one private flag and it is only defined
+ * when we have support for the SKIP_CPU_SYNC DMA attribute.  Instead
+ * of leaving all this code sitting around empty we will strip it unless
+ * our one private flag is actually available.
+ */
+struct iecm_priv_flags {
+	char flag_string[ETH_GSTRING_LEN];
+	bool read_only;
+	u32 flag;
+};
+
+#define IECM_PRIV_FLAG(_name, _flag, _read_only) { \
+	.read_only = _read_only, \
+	.flag_string = _name, \
+	.flag = _flag, \
+}
+
+static const struct iecm_priv_flags iecm_gstrings_priv_flags[] = {
+	IECM_PRIV_FLAG("", 0, 0),
+};
+
+#define IECM_PRIV_FLAGS_STR_LEN ARRAY_SIZE(iecm_gstrings_priv_flags)
+
+/**
+ * __iecm_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ * @size: size of the stats array
+ *
+ * Format and copy the strings described by stats into the buffer pointed at
+ * by p.
+ */
+static void __iecm_add_stat_strings(u8 **p, const struct iecm_stats stats[],
+				    const unsigned int size, ...)
+{
+	unsigned int i;
+
+	for (i = 0; i < size; i++) {
+		va_list args;
+
+		va_start(args, size);
+		vsnprintf((char *)*p, ETH_GSTRING_LEN,
+			  stats[i].stat_string, args);
+		*p += ETH_GSTRING_LEN;
+		va_end(args);
+	}
+}
+
+/**
+ * iecm_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ *
+ * Format and copy the strings described by the const static stats value into
+ * the buffer pointed at by p.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided. Additionally, stats must be an array such that
+ * ARRAY_SIZE can be called on it.
+ */
+#define iecm_add_stat_strings(p, stats, ...) \
+	__iecm_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
+
+/**
+ * iecm_get_stat_strings - Get stat strings
+ * @netdev: network interface device structure
+ * @data: buffer for string data
+ *
+ * Builds the statistics string table
+ */
+static void iecm_get_stat_strings(struct net_device *netdev, u8 *data)
+{
+	unsigned int i;
+
+	/* It's critical that we always report a constant number of strings and
+	 * that the strings are reported in the same order regardless of how
+	 * many queues are actually in use.
+	 */
+	for (i = 0; i < IECM_MAX_Q; i++)
+		iecm_add_stat_strings(&data, iecm_gstrings_tx_queue_stats,
+				      "tx", i);
+	for (i = 0; i < IECM_MAX_Q; i++)
+		iecm_add_stat_strings(&data, iecm_gstrings_rx_queue_stats,
+				      "rx", i);
+}
+
+/**
+ * iecm_get_priv_flag_strings - Get private flag strings
+ * @netdev: network interface device structure
+ * @data: buffer for string data
+ *
+ * Builds the private flags string table
+ */
+static void iecm_get_priv_flag_strings(struct net_device *netdev, u8 *data)
+{
+	unsigned int i;
+
+	for (i = 0; i < IECM_PRIV_FLAGS_STR_LEN; i++) {
+		snprintf((char *)data, ETH_GSTRING_LEN, "%s",
+			 iecm_gstrings_priv_flags[i].flag_string);
+		data += ETH_GSTRING_LEN;
+	}
+}
+
+/**
+ * iecm_get_strings - Get string set
+ * @netdev: network interface device structure
+ * @sset: id of string set
+ * @data: buffer for string data
+ *
+ * Builds string tables for various string sets
+ */
+static void iecm_get_strings(struct net_device *netdev, u32 sset, u8 *data)
+{
+	switch (sset) {
+	case ETH_SS_STATS:
+		iecm_get_stat_strings(netdev, data);
+		break;
+	case ETH_SS_PRIV_FLAGS:
+		iecm_get_priv_flag_strings(netdev, data);
+		break;
+	default:
+		break;
+	}
+}
+
+/**
+ * iecm_get_sset_count - Get length of string set
+ * @netdev: network interface device structure
+ * @sset: id of string set
+ *
+ * Reports size of various string tables.
+ */
+static int iecm_get_sset_count(struct net_device *netdev, int sset)
+{
+	if (sset == ETH_SS_STATS)
+		/* This size reported back here *must* be constant throughout
+		 * the lifecycle of the netdevice, i.e. we must report the
+		 * maximum length even for queues that don't technically exist.
+		 * This is due to the fact that this userspace API uses three
+		 * separate ioctl calls to get stats data but has no way to
+		 * communicate back to userspace when that size has changed,
+		 * which can typically happen as a result of changing number of
+		 * queues. If the number/order of stats change in the middle of
+		 * this call chain it will lead to userspace crashing/accessing
+		 * bad data through buffer under/overflow.
+		 */
+		return (IECM_TX_QUEUE_STATS_LEN * IECM_MAX_Q) +
+			(IECM_RX_QUEUE_STATS_LEN * IECM_MAX_Q);
+	else if (sset == ETH_SS_PRIV_FLAGS)
+		return IECM_PRIV_FLAGS_STR_LEN;
+	else
+		return -EINVAL;
+}
+
+/**
+ * iecm_add_one_ethtool_stat - copy the stat into the supplied buffer
+ * @data: location to store the stat value
+ * @pstat: old stat pointer to copy from
+ * @stat: the stat definition
+ *
+ * Copies the stat data defined by the pointer and stat structure pair into
+ * the memory supplied as data. Used to implement iecm_add_ethtool_stats and
+ * iecm_add_queue_stats. If the pointer is null, data will be zero'd.
+ */
+static void
+iecm_add_one_ethtool_stat(u64 *data, void *pstat,
+			  const struct iecm_stats *stat)
+{
+	char *p;
+
+	if (!pstat) {
+		/* ensure that the ethtool data buffer is zero'd for any stats
+		 * which don't have a valid pointer.
+		 */
+		*data = 0;
+		return;
+	}
+
+	p = (char *)pstat + stat->stat_offset;
+	switch (stat->sizeof_stat) {
+	case sizeof(u64):
+		*data = *((u64 *)p);
+		break;
+	case sizeof(u32):
+		*data = *((u32 *)p);
+		break;
+	case sizeof(u16):
+		*data = *((u16 *)p);
+		break;
+	case sizeof(u8):
+		*data = *((u8 *)p);
+		break;
+	default:
+		WARN_ONCE(1, "unexpected stat size for %s",
+			  stat->stat_string);
+		*data = 0;
+	}
+}
+
+/**
+ * iecm_add_queue_stats - copy queue statistics into supplied buffer
+ * @data: ethtool stats buffer
+ * @q: the queue to copy
+ *
+ * Queue statistics must be copied while protected by
+ * u64_stats_fetch_begin_irq, so we can't directly use iecm_add_ethtool_stats.
+ * Assumes that queue stats are defined in iecm_gstrings_tx_queue_stats or
+ * iecm_gstrings_rx_queue_stats, selected by the queue type. The queue pointer
+ * must be non-null; callers use iecm_add_empty_queue_stats() for missing
+ * queues. Copies the stats into the buffer and then updates the data pointer.
+ *
+ * This function expects to be called while under rcu_read_lock().
+ */
+static void
+iecm_add_queue_stats(u64 **data, struct iecm_queue *q)
+{
+	const struct iecm_stats *stats;
+	unsigned int start;
+	unsigned int size;
+	unsigned int i;
+
+	if (q->q_type == VIRTCHNL_QUEUE_TYPE_RX) {
+		size = IECM_RX_QUEUE_STATS_LEN;
+		stats = iecm_gstrings_rx_queue_stats;
+	} else {
+		size = IECM_TX_QUEUE_STATS_LEN;
+		stats = iecm_gstrings_tx_queue_stats;
+	}
+
+	/* To avoid invalid statistics values, ensure that we keep retrying
+	 * the copy until we get a consistent value according to
+	 * u64_stats_fetch_retry_irq. The queue pointer is not checked here;
+	 * the caller must pass a valid queue.
+	 */
+	do {
+		start = u64_stats_fetch_begin_irq(&q->stats_sync);
+		for (i = 0; i < size; i++)
+			iecm_add_one_ethtool_stat(&(*data)[i], q, &stats[i]);
+	} while (u64_stats_fetch_retry_irq(&q->stats_sync, start));
+
+	/* Once we successfully copy the stats in, update the data pointer */
+	*data += size;
+}
+
+/**
+ * iecm_add_empty_queue_stats - Add stats for a non-existent queue
+ * @data: pointer to data buffer
+ * @qtype: type of data queue
+ *
+ * We must report a constant length of stats back to userspace regardless of
+ * how many queues are actually in use because stats collection happens over
+ * three separate ioctls and there's no way to notify userspace the size
+ * changed between those calls. This adds empty data to the stats since we
+ * don't have a real queue to refer to for this stats slot.
+ */
+static void
+iecm_add_empty_queue_stats(u64 **data, enum virtchnl_queue_type qtype)
+{
+	unsigned int i;
+	int stats_len;
+
+	if (qtype == VIRTCHNL_QUEUE_TYPE_RX)
+		stats_len = IECM_RX_QUEUE_STATS_LEN;
+	else
+		stats_len = IECM_TX_QUEUE_STATS_LEN;
+
+	for (i = 0; i < stats_len; i++)
+		(*data)[i] = 0;
+	*data += stats_len;
+}
+
+/**
+ * iecm_get_ethtool_stats - report device statistics
+ * @netdev: network interface device structure
+ * @stats: ethtool statistics structure
+ * @data: pointer to data buffer
+ *
+ * All statistics are added to the data buffer as an array of u64.
+ */
+static void iecm_get_ethtool_stats(struct net_device *netdev,
+				   struct ethtool_stats *stats, u64 *data)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	enum virtchnl_queue_type qtype;
+	unsigned int total = 0;
+	unsigned int i, j;
+
+	if (vport->adapter->state != __IECM_UP)
+		return;
+
+	rcu_read_lock();
+	for (i = 0; i < vport->num_txq_grp; i++) {
+		struct iecm_txq_group *txq_grp = &vport->txq_grps[i];
+
+		qtype = VIRTCHNL_QUEUE_TYPE_TX;
+
+		for (j = 0; j < txq_grp->num_txq; j++, total++) {
+			struct iecm_queue *txq = &txq_grp->txqs[j];
+
+			if (!txq)
+				iecm_add_empty_queue_stats(&data, qtype);
+			else
+				iecm_add_queue_stats(&data, txq);
+		}
+	}
+	/* It is critical we provide a constant number of stats back to
+	 * userspace regardless of how many queues are actually in use because
+	 * there is no way to inform userspace the size has changed between
+	 * ioctl calls. This will fill in any missing stats with zero.
+	 */
+	for (; total < IECM_MAX_Q; total++)
+		iecm_add_empty_queue_stats(&data, VIRTCHNL_QUEUE_TYPE_TX);
+	total = 0;
+
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		struct iecm_rxq_group *rxq_grp = &vport->rxq_grps[i];
+		int num_rxq;
+
+		qtype = VIRTCHNL_QUEUE_TYPE_RX;
+
+		if (iecm_is_queue_model_split(vport->rxq_model))
+			num_rxq = rxq_grp->splitq.num_rxq_sets;
+		else
+			num_rxq = rxq_grp->singleq.num_rxq;
+
+		for (j = 0; j < num_rxq; j++, total++) {
+			struct iecm_queue *rxq;
+
+			if (iecm_is_queue_model_split(vport->rxq_model))
+				rxq = &rxq_grp->splitq.rxq_sets[j].rxq;
+			else
+				rxq = &rxq_grp->singleq.rxqs[j];
+			if (!rxq)
+				iecm_add_empty_queue_stats(&data, qtype);
+			else
+				iecm_add_queue_stats(&data, rxq);
+		}
+	}
+	for (; total < IECM_MAX_Q; total++)
+		iecm_add_empty_queue_stats(&data, VIRTCHNL_QUEUE_TYPE_RX);
+	rcu_read_unlock();
+}
+
+/**
+ * iecm_find_rxq - find rxq from q index
+ * @vport: virtual port associated to queue
+ * @q_num: q index used to find queue
+ *
+ * Returns pointer to Rx queue
+ */
+static struct iecm_queue *
+iecm_find_rxq(struct iecm_vport *vport, int q_num)
+{
+	struct iecm_queue *rxq;
+	int q_grp, q_idx;
+
+	if (iecm_is_queue_model_split(vport->rxq_model)) {
+		q_grp = q_num / IECM_DFLT_SPLITQ_RXQ_PER_GROUP;
+		q_idx = q_num % IECM_DFLT_SPLITQ_RXQ_PER_GROUP;
+
+		rxq = &vport->rxq_grps[q_grp].splitq.rxq_sets[q_idx].rxq;
+	} else {
+		q_grp = q_num / IECM_DFLT_SINGLEQ_RXQ_PER_GROUP;
+		q_idx = q_num % IECM_DFLT_SINGLEQ_RXQ_PER_GROUP;
+
+		rxq = &vport->rxq_grps[q_grp].singleq.rxqs[q_idx];
+	}
+
+	return rxq;
+}
+
+/**
+ * iecm_find_txq - find txq from q index
+ * @vport: virtual port associated to queue
+ * @q_num: q index used to find queue
+ *
+ * Returns pointer to Tx queue
+ */
+static struct iecm_queue *
+iecm_find_txq(struct iecm_vport *vport, int q_num)
+{
+	struct iecm_queue *txq;
+
+	if (iecm_is_queue_model_split(vport->txq_model)) {
+		int q_grp = q_num / IECM_DFLT_SPLITQ_TXQ_PER_GROUP;
+
+		txq = vport->txq_grps[q_grp].complq;
+	} else {
+		txq = vport->txqs[q_num];
+	}
+
+	return txq;
+}
+
+/**
+ * __iecm_get_q_coalesce - get ITR values for specific queue
+ * @ec: ethtool structure to fill with driver's coalesce settings
+ * @q: queue of Rx or Tx
+ */
+static void
+__iecm_get_q_coalesce(struct ethtool_coalesce *ec, struct iecm_queue *q)
+{
+	u16 itr_setting;
+	bool dyn_ena;
+
+	itr_setting = IECM_ITR_SETTING(q->itr.target_itr);
+	dyn_ena = IECM_ITR_IS_DYNAMIC(q->itr.target_itr);
+	if (q->q_type == VIRTCHNL_QUEUE_TYPE_RX) {
+		ec->use_adaptive_rx_coalesce = dyn_ena;
+		ec->rx_coalesce_usecs = itr_setting;
+	} else {
+		ec->use_adaptive_tx_coalesce = dyn_ena;
+		ec->tx_coalesce_usecs = itr_setting;
+	}
+}
+
+/**
+ * iecm_get_q_coalesce - get ITR values for specific queue
+ * @netdev: pointer to the netdev associated with this query
+ * @ec: coalesce settings to be filled
+ * @q_num: queue number/index to read ITR/INTRL (coalesce) settings from
+ *
+ * Return 0 on success, and negative on failure
+ */
+static int
+iecm_get_q_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec,
+		    u32 q_num)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+
+	if (vport->adapter->state != __IECM_UP)
+		return 0;
+
+	if (q_num >= vport->num_rxq && q_num >= vport->num_txq)
+		return -EINVAL;
+
+	if (q_num < vport->num_rxq) {
+		struct iecm_queue *rxq = iecm_find_rxq(vport, q_num);
+
+		__iecm_get_q_coalesce(ec, rxq);
+	}
+
+	if (q_num < vport->num_txq) {
+		struct iecm_queue *txq = iecm_find_txq(vport, q_num);
+
+		__iecm_get_q_coalesce(ec, txq);
+	}
+
+	return 0;
+}
+
+/**
+ * iecm_get_coalesce - get ITR values as requested by user
+ * @netdev: pointer to the netdev associated with this query
+ * @ec: coalesce settings to be filled
+ *
+ * Return 0 on success, and negative on failure
+ */
+static int
+iecm_get_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec)
+{
+	/* Return coalesce based on queue number zero */
+	return iecm_get_q_coalesce(netdev, ec, 0);
+}
+
+/**
+ * iecm_get_per_q_coalesce - get ITR values as requested by user
+ * @netdev: pointer to the netdev associated with this query
+ * @q_num: queue for which the ITR values have to be retrieved
+ * @ec: coalesce settings to be filled
+ *
+ * Return 0 on success, and negative on failure
+ */
+
+static int
+iecm_get_per_q_coalesce(struct net_device *netdev, u32 q_num,
+			struct ethtool_coalesce *ec)
+{
+	return iecm_get_q_coalesce(netdev, ec, q_num);
+}
+
+/**
+ * __iecm_set_q_coalesce - set ITR values for specific queue
+ * @ec: ethtool structure from user to update ITR settings
+ * @q: queue for which ITR values have to be set
+ *
+ * Returns 0 on success, negative otherwise.
+ */
+static int
+__iecm_set_q_coalesce(struct ethtool_coalesce *ec, struct iecm_queue *q)
+{
+	const char *q_type_str = (q->q_type == VIRTCHNL_QUEUE_TYPE_RX)
+				  ? "Rx" : "Tx";
+	u32 use_adaptive_coalesce, coalesce_usecs;
+	struct iecm_vport *vport;
+	u16 itr_setting;
+
+	itr_setting = IECM_ITR_SETTING(q->itr.target_itr);
+	vport = q->vport;
+	if (q->q_type == VIRTCHNL_QUEUE_TYPE_RX) {
+		use_adaptive_coalesce = ec->use_adaptive_rx_coalesce;
+		coalesce_usecs = ec->rx_coalesce_usecs;
+	} else {
+		use_adaptive_coalesce = ec->use_adaptive_tx_coalesce;
+		coalesce_usecs = ec->tx_coalesce_usecs;
+	}
+
+	if (itr_setting != coalesce_usecs && use_adaptive_coalesce) {
+		netdev_info(vport->netdev, "%s ITR cannot be changed if adaptive-%s is enabled\n",
+			    q_type_str, q_type_str);
+		return -EINVAL;
+	}
+
+	if (coalesce_usecs > IECM_ITR_MAX) {
+		netdev_info(vport->netdev,
+			    "Invalid value, %u-usecs range is 0-%d\n",
+			    coalesce_usecs, IECM_ITR_MAX);
+		return -EINVAL;
+	}
+
+	/* hardware only supports an ITR granularity of 2us */
+	if (coalesce_usecs % 2 != 0) {
+		netdev_info(vport->netdev,
+			    "Invalid value, %u-usecs must be even\n",
+			    coalesce_usecs);
+		return -EINVAL;
+	}
+
+	q->itr.target_itr = coalesce_usecs;
+	if (use_adaptive_coalesce)
+		q->itr.target_itr |= IECM_ITR_DYNAMIC;
+	/* Update of static/dynamic ITR will be taken care of when the
+	 * interrupt fires
+	 */
+	return 0;
+}
+
+/**
+ * iecm_set_q_coalesce - set ITR values for specific queue
+ * @vport: vport associated to the queue that needs updating
+ * @ec: coalesce settings to program the device with
+ * @q_num: update ITR/INTRL (coalesce) settings for this queue number/index
+ * @is_rxq: is queue type Rx
+ *
+ * Return 0 on success, and negative on failure
+ */
+static int
+iecm_set_q_coalesce(struct iecm_vport *vport, struct ethtool_coalesce *ec,
+		    int q_num, bool is_rxq)
+{
+	if (is_rxq) {
+		struct iecm_queue *rxq = iecm_find_rxq(vport, q_num);
+
+		if (rxq && __iecm_set_q_coalesce(ec, rxq))
+			return -EINVAL;
+	} else {
+		struct iecm_queue *txq = iecm_find_txq(vport, q_num);
+
+		if (txq && __iecm_set_q_coalesce(ec, txq))
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+/**
+ * iecm_set_coalesce - set ITR values as requested by user
+ * @netdev: pointer to the netdev associated with this query
+ * @ec: coalesce settings to program the device with
+ *
+ * Return 0 on success, and negative on failure
+ */
+static int
+iecm_set_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	int i, err = 0;
+
+	if (vport->adapter->state != __IECM_UP)
+		return 0;
+
+	for (i = 0; i < vport->num_txq; i++) {
+		err = iecm_set_q_coalesce(vport, ec, i, false);
+		if (err)
+			goto set_coalesce_err;
+	}
+
+	for (i = 0; i < vport->num_rxq; i++) {
+		err = iecm_set_q_coalesce(vport, ec, i, true);
+		if (err)
+			goto set_coalesce_err;
+	}
+set_coalesce_err:
+	return err;
+}
+
+/**
+ * iecm_set_per_q_coalesce - set ITR values as requested by user
+ * @netdev: pointer to the netdev associated with this query
+ * @q_num: queue for which the ITR values have to be set
+ * @ec: coalesce settings to program the device with
+ *
+ * Return 0 on success, and negative on failure
+ */
+static int
+iecm_set_per_q_coalesce(struct net_device *netdev, u32 q_num,
+			struct ethtool_coalesce *ec)
+{
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	int err;
+
+	if (vport->adapter->state != __IECM_UP)
+		return 0;
+
+	err = iecm_set_q_coalesce(vport, ec, q_num, false);
+	if (!err)
+		err = iecm_set_q_coalesce(vport, ec, q_num, true);
+
+	return err;
+}
+
+/**
+ * iecm_get_msglevel - Get debug message level
+ * @netdev: network interface device structure
+ *
+ * Returns current debug message level.
+ */
+static u32 iecm_get_msglevel(struct net_device *netdev)
+{
+	struct iecm_adapter *adapter = iecm_netdev_to_adapter(netdev);
+
+	return adapter->msg_enable;
+}
+
+/**
+ * iecm_set_msglevel - Set debug message level
+ * @netdev: network interface device structure
+ * @data: message level
+ *
+ * Set current debug message level. Higher values cause the driver to
+ * be noisier.
+ */
+static void iecm_set_msglevel(struct net_device *netdev, u32 data)
+{
+	struct iecm_adapter *adapter = iecm_netdev_to_adapter(netdev);
+
+	adapter->msg_enable = data;
+}
+
+/**
+ * iecm_get_link_ksettings - Get Link Speed and Duplex settings
+ * @netdev: network interface device structure
+ * @cmd: ethtool command
+ *
+ * Reports speed/duplex settings.
+ */
+static int iecm_get_link_ksettings(struct net_device *netdev,
+				   struct ethtool_link_ksettings *cmd)
+{
+	struct iecm_netdev_priv *np = netdev_priv(netdev);
+	struct iecm_adapter *adapter = np->vport->adapter;
+
+	ethtool_link_ksettings_zero_link_mode(cmd, supported);
+	cmd->base.autoneg = AUTONEG_DISABLE;
+	cmd->base.port = PORT_NONE;
+	/* Set speed and duplex */
+	switch (adapter->link_speed) {
+	case VIRTCHNL_LINK_SPEED_40GB:
+		cmd->base.speed = SPEED_40000;
+		break;
+	case VIRTCHNL_LINK_SPEED_25GB:
+#ifdef SPEED_25000
+		cmd->base.speed = SPEED_25000;
+#else
+		netdev_info(netdev,
+			    "Speed is 25G, display not supported by this version of ethtool.\n");
+#endif
+		break;
+	case VIRTCHNL_LINK_SPEED_20GB:
+		cmd->base.speed = SPEED_20000;
+		break;
+	case VIRTCHNL_LINK_SPEED_10GB:
+		cmd->base.speed = SPEED_10000;
+		break;
+	case VIRTCHNL_LINK_SPEED_1GB:
+		cmd->base.speed = SPEED_1000;
+		break;
+	case VIRTCHNL_LINK_SPEED_100MB:
+		cmd->base.speed = SPEED_100;
+		break;
+	default:
+		break;
+	}
+	cmd->base.duplex = DUPLEX_FULL;
+
+	return 0;
+}
+
+/**
+ * iecm_get_drvinfo - Get driver info
+ * @netdev: network interface device structure
+ * @drvinfo: ethtool driver info structure
+ *
+ * Returns information about the driver and device for display to the user.
+ */
+static void iecm_get_drvinfo(struct net_device *netdev,
+			     struct ethtool_drvinfo *drvinfo)
+{
+	struct iecm_adapter *adapter = iecm_netdev_to_adapter(netdev);
+
+	strlcpy(drvinfo->driver, iecm_drv_name, 32);
+	strlcpy(drvinfo->fw_version, "N/A", 4);
+	strlcpy(drvinfo->bus_info, pci_name(adapter->pdev), 32);
+}
+
+static const struct ethtool_ops iecm_ethtool_ops = {
+	.get_drvinfo		= iecm_get_drvinfo,
+	.get_msglevel		= iecm_get_msglevel,
+	.set_msglevel		= iecm_set_msglevel,
+	.get_coalesce		= iecm_get_coalesce,
+	.set_coalesce		= iecm_set_coalesce,
+	.get_per_queue_coalesce	= iecm_get_per_q_coalesce,
+	.set_per_queue_coalesce	= iecm_set_per_q_coalesce,
+	.get_ethtool_stats	= iecm_get_ethtool_stats,
+	.get_strings		= iecm_get_strings,
+	.get_sset_count		= iecm_get_sset_count,
+	.get_rxnfc		= iecm_get_rxnfc,
+	.get_rxfh_key_size	= iecm_get_rxfh_key_size,
+	.get_rxfh_indir_size	= iecm_get_rxfh_indir_size,
+	.get_rxfh		= iecm_get_rxfh,
+	.set_rxfh		= iecm_set_rxfh,
+	.get_channels		= iecm_get_channels,
+	.set_channels		= iecm_set_channels,
+	.get_ringparam		= iecm_get_ringparam,
+	.set_ringparam		= iecm_set_ringparam,
+	.get_link_ksettings	= iecm_get_link_ksettings,
+};
+
 /**
  * iecm_set_ethtool_ops - Initialize ethtool ops struct
  * @netdev: network interface device structure
@@ -12,5 +1117,5 @@
  */
 void iecm_set_ethtool_ops(struct net_device *netdev)
 {
-	/* stub */
+	netdev->ethtool_ops = &iecm_ethtool_ops;
 }
diff --git a/drivers/net/ethernet/intel/iecm/iecm_lib.c b/drivers/net/ethernet/intel/iecm/iecm_lib.c
index 707520553912..096f24fa2f15 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_lib.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_lib.c
@@ -764,7 +764,37 @@ void iecm_deinit_task(struct iecm_adapter *adapter)
 static enum iecm_status
 iecm_init_hard_reset(struct iecm_adapter *adapter)
 {
-	/* stub */
+	enum iecm_status err;
+
+	/* Prepare for reset */
+	if (test_bit(__IECM_HR_FUNC_RESET, adapter->flags)) {
+		iecm_deinit_task(adapter);
+		adapter->dev_ops.reg_ops.trigger_reset(adapter,
+						       __IECM_HR_FUNC_RESET);
+		set_bit(__IECM_UP_REQUESTED, adapter->flags);
+		clear_bit(__IECM_HR_FUNC_RESET, adapter->flags);
+	} else if (test_bit(__IECM_HR_CORE_RESET, adapter->flags)) {
+		if (adapter->state == __IECM_UP)
+			set_bit(__IECM_UP_REQUESTED, adapter->flags);
+		iecm_deinit_task(adapter);
+		clear_bit(__IECM_HR_CORE_RESET, adapter->flags);
+	} else if (test_and_clear_bit(__IECM_HR_DRV_LOAD, adapter->flags)) {
+		/* Trigger reset */
+	} else {
+		dev_err(&adapter->pdev->dev, "Unhandled hard reset cause\n");
+		err = IECM_ERR_PARAM;
+		goto handle_err;
+	}
+
+	/* Reset is complete and so start building the driver resources again */
+	err = iecm_init_dflt_mbx(adapter);
+	if (err) {
+		dev_err(&adapter->pdev->dev, "Failed to initialize default mailbox: %d\n",
+			err);
+	}
+handle_err:
+	mutex_unlock(&adapter->reset_lock);
+	return err;
 }
 
 /**
@@ -793,7 +823,57 @@ static void iecm_vc_event_task(struct work_struct *work)
 int iecm_initiate_soft_reset(struct iecm_vport *vport,
 			     enum iecm_flags reset_cause)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	enum iecm_state current_state;
+	enum iecm_status status;
+	int err = 0;
+
+	/* Make sure we do not end up initiating multiple resets */
+	mutex_lock(&adapter->reset_lock);
+
+	current_state = vport->adapter->state;
+	switch (reset_cause) {
+	case __IECM_SR_Q_CHANGE:
+		/* If we're changing number of queues requested, we need to
+		 * send a 'delete' message before freeing the queue resources.
+		 * We'll send an 'add' message in adjust_qs which doesn't
+		 * require the queue resources to be reallocated yet.
+		 */
+		if (current_state <= __IECM_DOWN) {
+			iecm_send_delete_queues_msg(vport);
+		} else {
+			set_bit(__IECM_DEL_QUEUES, adapter->flags);
+			iecm_vport_stop(vport);
+		}
+		iecm_deinit_rss(vport);
+		status = adapter->dev_ops.vc_ops.adjust_qs(vport);
+		if (status) {
+			err = -EFAULT;
+			goto reset_failure;
+		}
+		iecm_intr_rel(adapter);
+		iecm_vport_calc_num_q_vec(vport);
+		iecm_intr_req(adapter);
+		break;
+	case __IECM_SR_Q_DESC_CHANGE:
+		iecm_vport_stop(vport);
+		iecm_vport_calc_num_q_desc(vport);
+		break;
+	case __IECM_SR_Q_SCH_CHANGE:
+	case __IECM_SR_MTU_CHANGE:
+		iecm_vport_stop(vport);
+		break;
+	default:
+		dev_err(&adapter->pdev->dev, "Unhandled soft reset cause\n");
+		err = -EINVAL;
+		goto reset_failure;
+	}
+
+	if (current_state == __IECM_UP)
+		err = iecm_vport_open(vport);
+reset_failure:
+	mutex_unlock(&adapter->reset_lock);
+	return err;
 }
 
 /**
@@ -884,6 +964,7 @@ int iecm_probe(struct pci_dev *pdev,
 	INIT_DELAYED_WORK(&adapter->init_task, iecm_init_task);
 	INIT_DELAYED_WORK(&adapter->vc_event_task, iecm_vc_event_task);
 
+	adapter->dev_ops.reg_ops.reset_reg_init(&adapter->reset_reg);
 	mutex_lock(&adapter->reset_lock);
 	set_bit(__IECM_HR_DRV_LOAD, adapter->flags);
 	err = iecm_init_hard_reset(adapter);
@@ -977,7 +1058,20 @@ static int iecm_open(struct net_device *netdev)
  */
 static int iecm_change_mtu(struct net_device *netdev, int new_mtu)
 {
-	/* stub */
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+
+	if (new_mtu < netdev->min_mtu) {
+		netdev_err(netdev, "new MTU invalid. min_mtu is %d\n",
+			   netdev->min_mtu);
+		return -EINVAL;
+	} else if (new_mtu > netdev->max_mtu) {
+		netdev_err(netdev, "new MTU invalid. max_mtu is %d\n",
+			   netdev->max_mtu);
+		return -EINVAL;
+	}
+	netdev->mtu = new_mtu;
+
+	return iecm_initiate_soft_reset(vport, __IECM_SR_MTU_CHANGE);
 }
 
 static const struct net_device_ops iecm_netdev_ops_splitq = {
-- 
2.26.2



* Re: [net-next 02/15] iecm: Add framework set of header files
From: Joe Perches @ 2020-06-18  5:32 UTC (permalink / raw)
  To: Jeff Kirsher, davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala
In-Reply-To: <20200618051344.516587-3-jeffrey.t.kirsher@intel.com>

On Wed, 2020-06-17 at 22:13 -0700, Jeff Kirsher wrote:
> From: Alice Michael <alice.michael@intel.com>
[]
> diff --git a/include/linux/net/intel/iecm_controlq_api.h b/include/linux/net/intel/iecm_controlq_api.h
[]
> +enum iecm_ctlq_err {
> +	IECM_CTLQ_RC_OK		= 0,  /* Success */

Why is it necessary to effectively duplicate the
generic error codes with different error numbers?

> +	IECM_CTLQ_RC_EPERM	= 1,  /* Operation not permitted */
> +	IECM_CTLQ_RC_ENOENT	= 2,  /* No such element */
> +	IECM_CTLQ_RC_ESRCH	= 3,  /* Bad opcode */
> +	IECM_CTLQ_RC_EINTR	= 4,  /* Operation interrupted */
> +	IECM_CTLQ_RC_EIO	= 5,  /* I/O error */
> +	IECM_CTLQ_RC_ENXIO	= 6,  /* No such resource */
> +	IECM_CTLQ_RC_E2BIG	= 7,  /* Arg too long */
> +	IECM_CTLQ_RC_EAGAIN	= 8,  /* Try again */
> +	IECM_CTLQ_RC_ENOMEM	= 9,  /* Out of memory */
> +	IECM_CTLQ_RC_EACCES	= 10, /* Permission denied */
> +	IECM_CTLQ_RC_EFAULT	= 11, /* Bad address */
> +	IECM_CTLQ_RC_EBUSY	= 12, /* Device or resource busy */
> +	IECM_CTLQ_RC_EEXIST	= 13, /* object already exists */
> +	IECM_CTLQ_RC_EINVAL	= 14, /* Invalid argument */
> +	IECM_CTLQ_RC_ENOTTY	= 15, /* Not a typewriter */
> +	IECM_CTLQ_RC_ENOSPC	= 16, /* No space left or allocation failure */
> +	IECM_CTLQ_RC_ENOSYS	= 17, /* Function not implemented */
> +	IECM_CTLQ_RC_ERANGE	= 18, /* Parameter out of range */
> +	IECM_CTLQ_RC_EFLUSHED	= 19, /* Cmd flushed due to prev cmd error */
> +	IECM_CTLQ_RC_BAD_ADDR	= 20, /* Descriptor contains a bad pointer */
> +	IECM_CTLQ_RC_EMODE	= 21, /* Op not allowed in current dev mode */
> +	IECM_CTLQ_RC_EFBIG	= 22, /* File too big */
> +	IECM_CTLQ_RC_ENOSEC	= 24, /* Missing security manifest */
> +	IECM_CTLQ_RC_EBADSIG	= 25, /* Bad RSA signature */
> +	IECM_CTLQ_RC_ESVN	= 26, /* SVN number prohibits this package */
> +	IECM_CTLQ_RC_EBADMAN	= 27, /* Manifest hash mismatch */
> +	IECM_CTLQ_RC_EBADBUF	= 28, /* Buffer hash mismatches manifest */
> +};
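
To make the question concrete, here is a minimal sketch of one alternative:
keep the device-defined values private to the control queue code and translate
them to the generic errno values at the boundary. The enum subset and the
function name are hypothetical; only the numeric values are taken from the
quoted enum, and the -EIO fallback is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical subset of the device return codes quoted above. */
enum ctlq_rc {
	CTLQ_RC_OK	= 0,
	CTLQ_RC_EPERM	= 1,
	CTLQ_RC_ENOENT	= 2,
	CTLQ_RC_ENOMEM	= 9,
	CTLQ_RC_EINVAL	= 14,
};

/* Translate a device return code into a negative errno value. */
static int ctlq_rc_to_errno(enum ctlq_rc rc)
{
	switch (rc) {
	case CTLQ_RC_OK:
		return 0;
	case CTLQ_RC_EPERM:
		return -EPERM;
	case CTLQ_RC_ENOENT:
		return -ENOENT;
	case CTLQ_RC_ENOMEM:
		return -ENOMEM;
	case CTLQ_RC_EINVAL:
		return -EINVAL;
	default:
		/* Assumption: unknown device codes collapse to a generic I/O error. */
		return -EIO;
	}
}
```

With a helper of this shape, callers outside the control queue code would only
ever see the generic error numbers.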




* [PATCH net] net: increment xmit_recursion level in dev_direct_xmit()
From: Eric Dumazet @ 2020-06-18  5:23 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Jakub Kicinski, syzbot

Back in commit f60e5990d9c1 ("ipv6: protect skb->sk accesses
from recursive dereference inside the stack") Hannes added code
so that the IPv6 stack would not trust skb->sk for typical cases
where a packet goes through the 'standard' xmit path (__dev_queue_xmit()).

Alas, af_packet had a dev_direct_xmit() path that was not yet
dealing with the xmit_recursion level.

Also change sk_mc_loop() to dump a stack once only.

Without this patch, syzbot was able to trigger:

[1]
[  153.567378] WARNING: CPU: 7 PID: 11273 at net/core/sock.c:721 sk_mc_loop+0x51/0x70
[  153.567378] Modules linked in: nfnetlink ip6table_raw ip6table_filter iptable_raw iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 nf_defrag_ipv6 iptable_filter macsec macvtap tap macvlan 8021q hsr wireguard libblake2s blake2s_x86_64 libblake2s_generic udp_tunnel ip6_udp_tunnel libchacha20poly1305 poly1305_x86_64 chacha_x86_64 libchacha curve25519_x86_64 libcurve25519_generic netdevsim batman_adv dummy team bridge stp llc w1_therm wire i2c_mux_pca954x i2c_mux cdc_acm ehci_pci ehci_hcd mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core
[  153.567386] CPU: 7 PID: 11273 Comm: b159172088 Not tainted 5.8.0-smp-DEV #273
[  153.567387] RIP: 0010:sk_mc_loop+0x51/0x70
[  153.567388] Code: 66 83 f8 0a 75 24 0f b6 4f 12 b8 01 00 00 00 31 d2 d3 e0 a9 bf ef ff ff 74 07 48 8b 97 f0 02 00 00 0f b6 42 3a 83 e0 01 5d c3 <0f> 0b b8 01 00 00 00 5d c3 0f b6 87 18 03 00 00 5d c0 e8 04 83 e0
[  153.567388] RSP: 0018:ffff95c69bb93990 EFLAGS: 00010212
[  153.567388] RAX: 0000000000000011 RBX: ffff95c6e0ee3e00 RCX: 0000000000000007
[  153.567389] RDX: ffff95c69ae50000 RSI: ffff95c6c30c3000 RDI: ffff95c6c30c3000
[  153.567389] RBP: ffff95c69bb93990 R08: ffff95c69a77f000 R09: 0000000000000008
[  153.567389] R10: 0000000000000040 R11: 00003e0e00026128 R12: ffff95c6c30c3000
[  153.567390] R13: ffff95c6cc4fd500 R14: ffff95c6f84500c0 R15: ffff95c69aa13c00
[  153.567390] FS:  00007fdc3a283700(0000) GS:ffff95c6ff9c0000(0000) knlGS:0000000000000000
[  153.567390] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  153.567391] CR2: 00007ffee758e890 CR3: 0000001f9ba20003 CR4: 00000000001606e0
[  153.567391] Call Trace:
[  153.567391]  ip6_finish_output2+0x34e/0x550
[  153.567391]  __ip6_finish_output+0xe7/0x110
[  153.567391]  ip6_finish_output+0x2d/0xb0
[  153.567392]  ip6_output+0x77/0x120
[  153.567392]  ? __ip6_finish_output+0x110/0x110
[  153.567392]  ip6_local_out+0x3d/0x50
[  153.567392]  ipvlan_queue_xmit+0x56c/0x5e0
[  153.567393]  ? ksize+0x19/0x30
[  153.567393]  ipvlan_start_xmit+0x18/0x50
[  153.567393]  dev_direct_xmit+0xf3/0x1c0
[  153.567393]  packet_direct_xmit+0x69/0xa0
[  153.567394]  packet_sendmsg+0xbf0/0x19b0
[  153.567394]  ? plist_del+0x62/0xb0
[  153.567394]  sock_sendmsg+0x65/0x70
[  153.567394]  sock_write_iter+0x93/0xf0
[  153.567394]  new_sync_write+0x18e/0x1a0
[  153.567395]  __vfs_write+0x29/0x40
[  153.567395]  vfs_write+0xb9/0x1b0
[  153.567395]  ksys_write+0xb1/0xe0
[  153.567395]  __x64_sys_write+0x1a/0x20
[  153.567395]  do_syscall_64+0x43/0x70
[  153.567396]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  153.567396] RIP: 0033:0x453549
[  153.567396] Code: Bad RIP value.
[  153.567396] RSP: 002b:00007fdc3a282cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  153.567397] RAX: ffffffffffffffda RBX: 00000000004d32d0 RCX: 0000000000453549
[  153.567397] RDX: 0000000000000020 RSI: 0000000020000300 RDI: 0000000000000003
[  153.567398] RBP: 00000000004d32d8 R08: 0000000000000000 R09: 0000000000000000
[  153.567398] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d32dc
[  153.567398] R13: 00007ffee742260f R14: 00007fdc3a282dc0 R15: 00007fdc3a283700
[  153.567399] ---[ end trace c1d5ae2b1059ec62 ]---

Fixes: f60e5990d9c1 ("ipv6: protect skb->sk accesses from recursive dereference inside the stack")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
---
 net/core/dev.c  | 2 ++
 net/core/sock.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)
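
For readers unfamiliar with the recursion counter, here is a minimal userspace
sketch of the pattern, not the kernel code. All names are hypothetical
stand-ins, and the kernel keeps this state per CPU, which a plain global only
approximates. It shows that code running underneath the driver xmit hook can
only tell that it is nested if the caller bumped the counter first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU xmit recursion counter. */
static int xmit_recursion;

static void xmit_recursion_inc(void)
{
	xmit_recursion++;
}

static void xmit_recursion_dec(void)
{
	/* An unbalanced decrement would be a bug in the caller. */
	assert(xmit_recursion > 0);
	xmit_recursion--;
}

/* Lower layers consult this before trusting per-packet socket state. */
static bool in_nested_xmit(void)
{
	return xmit_recursion > 0;
}

/* What the "driver" observed during the most recent transmit. */
static bool seen_nested;

static void driver_start_xmit(void)
{
	seen_nested = in_nested_xmit();
}

/* Direct path without the guard: nested code cannot tell it is nested. */
static void direct_xmit_unguarded(void)
{
	driver_start_xmit();
}

/* Direct path with the guard, mirroring the shape of the change below. */
static void direct_xmit_guarded(void)
{
	xmit_recursion_inc();
	driver_start_xmit();
	xmit_recursion_dec();
}
```

In the unguarded variant the driver callback sees a recursion level of zero,
which is the situation the commit message describes for dev_direct_xmit().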

diff --git a/net/core/dev.c b/net/core/dev.c
index 6bc2388141f6fd7c66c0e8349514a326e5106db2..b16e2d3151ee30e555a89069021b197a084bce58 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4192,10 +4192,12 @@ int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
 
 	local_bh_disable();
 
+	dev_xmit_recursion_inc();
 	HARD_TX_LOCK(dev, txq, smp_processor_id());
 	if (!netif_xmit_frozen_or_drv_stopped(txq))
 		ret = netdev_start_xmit(skb, dev, txq, false);
 	HARD_TX_UNLOCK(dev, txq);
+	dev_xmit_recursion_dec();
 
 	local_bh_enable();
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 6c4acf1f0220b1f925ebcfaa847632ec0dbe0b9b..94391da277544e12c8a9c9eb52c51b0678b46dc4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -718,7 +718,7 @@ bool sk_mc_loop(struct sock *sk)
 		return inet6_sk(sk)->mc_loop;
 #endif
 	}
-	WARN_ON(1);
+	WARN_ON_ONCE(1);
 	return true;
 }
 EXPORT_SYMBOL(sk_mc_loop);
-- 
2.27.0.290.gba653c62da-goog



* [net-next 05/15] iecm: Add basic netdevice functionality
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

This implements probe, interface up/down, and netdev_ops.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/iecm/iecm_lib.c    | 404 +++++++++++++++++-
 drivers/net/ethernet/intel/iecm/iecm_main.c   |   7 +-
 drivers/net/ethernet/intel/iecm/iecm_txrx.c   |   6 +-
 .../net/ethernet/intel/iecm/iecm_virtchnl.c   |  73 +++-
 4 files changed, 467 insertions(+), 23 deletions(-)
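
As a reading aid for the slot search in iecm_get_free_slot(), here is a
standalone sketch of the same lookup order: prefer the entry right after the
current one, otherwise scan from the start. The function name and the value of
NO_FREE_SLOT are hypothetical; the real IECM_NO_FREE_SLOT is defined elsewhere
in the series and is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IECM_NO_FREE_SLOT. */
#define NO_FREE_SLOT (-1)

/* Return the index of a NULL entry, preferring curr + 1 when it is free. */
static int get_free_slot(void *const *array, int size, int curr)
{
	int i;

	assert(array && size >= 0);

	/* Fast path: the slot right after the current one is unused. */
	if (curr < size - 1 && !array[curr + 1])
		return curr + 1;

	/* Slow path: first unused slot anywhere in the array. */
	for (i = 0; i < size; i++)
		if (!array[i])
			return i;

	return NO_FREE_SLOT;
}
```

For an array holding { A, NULL, B, NULL }, a search from index 0 yields 1, a
search from index 2 yields 3, and a search from index 3 wraps around to 1.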

diff --git a/drivers/net/ethernet/intel/iecm/iecm_lib.c b/drivers/net/ethernet/intel/iecm/iecm_lib.c
index 57a20204a7c8..6023d0c727fb 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_lib.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_lib.c
@@ -24,7 +24,17 @@ static void iecm_mb_intr_rel_irq(struct iecm_adapter *adapter)
  */
 static void iecm_intr_rel(struct iecm_adapter *adapter)
 {
-	/* stub */
+	if (!adapter->msix_entries)
+		return;
+	clear_bit(__IECM_MB_INTR_MODE, adapter->flags);
+	clear_bit(__IECM_MB_INTR_TRIGGER, adapter->flags);
+	iecm_mb_intr_rel_irq(adapter);
+
+	pci_free_irq_vectors(adapter->pdev);
+	kfree(adapter->msix_entries);
+	adapter->msix_entries = NULL;
+	kfree(adapter->req_vec_chunks);
+	adapter->req_vec_chunks = NULL;
 }
 
 /**
@@ -96,7 +106,53 @@ void iecm_intr_distribute(struct iecm_adapter *adapter)
  */
 static int iecm_intr_req(struct iecm_adapter *adapter)
 {
-	/* stub */
+	int min_vectors, max_vectors, err = 0;
+	unsigned int vector;
+	int num_vecs;
+	int v_actual;
+
+	num_vecs = adapter->vports[0]->num_q_vectors +
+		   IECM_MAX_NONQ_VEC + IECM_MAX_RDMA_VEC;
+
+	min_vectors = IECM_MIN_VEC;
+#define IECM_MAX_EVV_MAPPED_VEC 16
+	max_vectors = min(num_vecs, IECM_MAX_EVV_MAPPED_VEC);
+
+	v_actual = pci_alloc_irq_vectors(adapter->pdev, min_vectors,
+					 max_vectors, PCI_IRQ_MSIX);
+	if (v_actual < 0) {
+		dev_err(&adapter->pdev->dev, "Failed to allocate MSIX vectors: %d\n",
+			v_actual);
+		return v_actual;
+	}
+
+	adapter->msix_entries = kcalloc(v_actual, sizeof(struct msix_entry),
+					GFP_KERNEL);
+
+	if (!adapter->msix_entries) {
+		pci_free_irq_vectors(adapter->pdev);
+		return -ENOMEM;
+	}
+
+	for (vector = 0; vector < v_actual; vector++) {
+		adapter->msix_entries[vector].entry = vector;
+		adapter->msix_entries[vector].vector =
+			pci_irq_vector(adapter->pdev, vector);
+	}
+	adapter->num_msix_entries = v_actual;
+	adapter->num_req_msix = num_vecs;
+
+	iecm_intr_distribute(adapter);
+
+	err = iecm_mb_intr_init(adapter);
+	if (err)
+		goto intr_rel;
+	iecm_mb_irq_enable(adapter);
+	return err;
+
+intr_rel:
+	iecm_intr_rel(adapter);
+	return err;
 }
 
 /**
@@ -118,7 +174,21 @@ static int iecm_cfg_netdev(struct iecm_vport *vport)
  */
 static int iecm_cfg_hw(struct iecm_adapter *adapter)
 {
-	/* stub */
+	struct pci_dev *pdev = adapter->pdev;
+	struct iecm_hw *hw = &adapter->hw;
+
+	hw->hw_addr_len = pci_resource_len(pdev, 0);
+	hw->hw_addr = ioremap(pci_resource_start(pdev, 0), hw->hw_addr_len);
+
+	if (!hw->hw_addr)
+		return -EIO;
+
+	hw->back = adapter;
+	hw->bus.device = PCI_SLOT(pdev->devfn);
+	hw->bus.func = PCI_FUNC(pdev->devfn);
+	hw->bus.bus_id = pdev->bus->number;
+
+	return 0;
 }
 
 /**
@@ -132,7 +202,22 @@ static int iecm_cfg_hw(struct iecm_adapter *adapter)
  */
 static int iecm_get_free_slot(void *array, int size, int curr)
 {
-	/* stub */
+	int **tmp_array = (int **)array;
+	int next;
+
+	if (curr < (size - 1) && !tmp_array[curr + 1]) {
+		next = curr + 1;
+	} else {
+		int i = 0;
+
+		while ((i < size) && (tmp_array[i]))
+			i++;
+		if (i == size)
+			next = IECM_NO_FREE_SLOT;
+		else
+			next = i;
+	}
+	return next;
 }
 
 /**
@@ -141,7 +226,9 @@ static int iecm_get_free_slot(void *array, int size, int curr)
  */
 struct iecm_vport *iecm_netdev_to_vport(struct net_device *netdev)
 {
-	/* stub */
+	struct iecm_netdev_priv *np = netdev_priv(netdev);
+
+	return np->vport;
 }
 
 /**
@@ -150,7 +237,9 @@ struct iecm_vport *iecm_netdev_to_vport(struct net_device *netdev)
  */
 struct iecm_adapter *iecm_netdev_to_adapter(struct net_device *netdev)
 {
-	/* stub */
+	struct iecm_netdev_priv *np = netdev_priv(netdev);
+
+	return np->vport->adapter;
 }
 
 /**
@@ -185,7 +274,22 @@ static int iecm_stop(struct net_device *netdev)
  */
 int iecm_vport_rel(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter;
+
+	if (!vport->adapter)
+		return -ENODEV;
+	adapter = vport->adapter;
+
+	iecm_vport_stop(vport);
+	iecm_deinit_rss(vport);
+	unregister_netdev(vport->netdev);
+	free_netdev(vport->netdev);
+	vport->netdev = NULL;
+	if (adapter->dev_ops.vc_ops.destroy_vport)
+		adapter->dev_ops.vc_ops.destroy_vport(vport);
+	kfree(vport);
+
+	return 0;
 }
 
 /**
@@ -194,7 +298,24 @@ int iecm_vport_rel(struct iecm_vport *vport)
  */
 static void iecm_vport_rel_all(struct iecm_adapter *adapter)
 {
-	/* stub */
+	int err, i;
+
+	if (!adapter->vports)
+		return;
+
+	for (i = 0; i < adapter->num_alloc_vport; i++) {
+		if (!adapter->vports[i])
+			continue;
+
+		err = iecm_vport_rel(adapter->vports[i]);
+		if (err)
+			dev_dbg(&adapter->pdev->dev,
+				"Failed to release adapter->vports[%d], err %d\n",
+				i, err);
+		else
+			adapter->vports[i] = NULL;
+	}
+	adapter->num_alloc_vport = 0;
 }
 
 /**
@@ -217,7 +338,47 @@ void iecm_vport_set_hsplit(struct iecm_vport *vport, struct bpf_prog *prog)
 static struct iecm_vport *
 iecm_vport_alloc(struct iecm_adapter *adapter, int vport_id)
 {
-	/* stub */
+	struct iecm_vport *vport = NULL;
+
+	if (adapter->next_vport == IECM_NO_FREE_SLOT)
+		return vport;
+
+	/* Need to protect the allocation of the vports at the adapter level */
+	mutex_lock(&adapter->sw_mutex);
+
+	vport = kzalloc(sizeof(*vport), GFP_KERNEL);
+	if (!vport)
+		goto unlock_adapter;
+
+	vport->adapter = adapter;
+	vport->idx = adapter->next_vport;
+	vport->compln_clean_budget = IECM_TX_COMPLQ_CLEAN_BUDGET;
+	adapter->num_alloc_vport++;
+	adapter->dev_ops.vc_ops.vport_init(vport, vport_id);
+
+	/* Setup default MSIX irq handler for the vport */
+	vport->irq_q_handler = iecm_vport_intr_clean_queues;
+	vport->q_vector_base = IECM_MAX_NONQ_VEC;
+
+	/* fill vport slot in the adapter struct */
+	adapter->vports[adapter->next_vport] = vport;
+	if (iecm_cfg_netdev(vport))
+		goto cfg_netdev_fail;
+
+	/* prepare adapter->next_vport for next use */
+	adapter->next_vport = iecm_get_free_slot(adapter->vports,
+						 adapter->num_alloc_vport,
+						 adapter->next_vport);
+
+	goto unlock_adapter;
+
+cfg_netdev_fail:
+	adapter->vports[adapter->next_vport] = NULL;
+	kfree(vport);
+	vport = NULL;
+unlock_adapter:
+	mutex_unlock(&adapter->sw_mutex);
+	return vport;
 }
 
 /**
@@ -227,7 +388,22 @@ iecm_vport_alloc(struct iecm_adapter *adapter, int vport_id)
  */
 static void iecm_service_task(struct work_struct *work)
 {
-	/* stub */
+	struct iecm_adapter *adapter = container_of(work,
+						    struct iecm_adapter,
+						    serv_task.work);
+
+	if (test_bit(__IECM_MB_INTR_MODE, adapter->flags)) {
+		if (test_and_clear_bit(__IECM_MB_INTR_TRIGGER,
+				       adapter->flags)) {
+			iecm_recv_mb_msg(adapter, VIRTCHNL_OP_UNKNOWN, NULL, 0);
+			iecm_mb_irq_enable(adapter);
+		}
+	} else {
+		iecm_recv_mb_msg(adapter, VIRTCHNL_OP_UNKNOWN, NULL, 0);
+	}
+
+	queue_delayed_work(adapter->serv_wq, &adapter->serv_task,
+			   msecs_to_jiffies(300));
 }
 
 /**
@@ -261,7 +437,41 @@ static int iecm_vport_open(struct iecm_vport *vport)
  */
 static void iecm_init_task(struct work_struct *work)
 {
-	/* stub */
+	struct iecm_adapter *adapter = container_of(work,
+						    struct iecm_adapter,
+						    init_task.work);
+	struct iecm_vport *vport;
+	struct pci_dev *pdev;
+	int vport_id, err;
+
+	err = adapter->dev_ops.vc_ops.core_init(adapter, &vport_id);
+	if (err)
+		return;
+
+	pdev = adapter->pdev;
+	vport = iecm_vport_alloc(adapter, vport_id);
+	if (!vport) {
+		err = -EFAULT;
+		dev_err(&pdev->dev, "probe failed on vport setup:%d\n",
+			err);
+		return;
+	}
+	/* Start the service task before requesting vectors. This will ensure
+	 * vector information response from mailbox is handled
+	 */
+	queue_delayed_work(adapter->serv_wq, &adapter->serv_task,
+			   msecs_to_jiffies(5 * (pdev->devfn & 0x07)));
+	err = iecm_intr_req(adapter);
+	if (err) {
+		dev_err(&pdev->dev, "failed to enable interrupt vectors: %d\n",
+			err);
+		iecm_vport_rel(vport);
+		return;
+	}
+	/* Once state is put into DOWN, driver is ready for dev_open */
+	adapter->state = __IECM_DOWN;
+	if (test_and_clear_bit(__IECM_UP_REQUESTED, adapter->flags))
+		iecm_vport_open(vport);
 }
 
 /**
@@ -272,7 +482,40 @@ static void iecm_init_task(struct work_struct *work)
  */
 static int iecm_api_init(struct iecm_adapter *adapter)
 {
-	/* stub */
+	struct iecm_reg_ops *reg_ops = &adapter->dev_ops.reg_ops;
+	struct pci_dev *pdev = adapter->pdev;
+
+	if (!adapter->dev_ops.reg_ops_init) {
+		dev_err(&pdev->dev, "Invalid device, register API init not defined.\n");
+		return -EINVAL;
+	}
+	adapter->dev_ops.reg_ops_init(adapter);
+	if (!(reg_ops->ctlq_reg_init && reg_ops->vportq_reg_init &&
+	      reg_ops->intr_reg_init && reg_ops->mb_intr_reg_init &&
+	      reg_ops->reset_reg_init && reg_ops->trigger_reset)) {
+		dev_err(&pdev->dev, "Invalid device, missing one or more register functions\n");
+		return -EINVAL;
+	}
+
+	if (adapter->dev_ops.vc_ops_init) {
+		struct iecm_virtchnl_ops *vc_ops;
+
+		adapter->dev_ops.vc_ops_init(adapter);
+		vc_ops = &adapter->dev_ops.vc_ops;
+		if (!(vc_ops->core_init && vc_ops->vport_init &&
+		      vc_ops->vport_queue_ids_init && vc_ops->get_caps &&
+		      vc_ops->config_queues && vc_ops->enable_queues &&
+		      vc_ops->disable_queues && vc_ops->irq_map_unmap &&
+		      vc_ops->get_set_rss_lut && vc_ops->get_set_rss_hash &&
+		      vc_ops->adjust_qs && vc_ops->get_ptype)) {
+			dev_err(&pdev->dev, "Invalid device, missing one or more virtchnl functions\n");
+			return -EINVAL;
+		}
+	} else {
+		iecm_vc_ops_init(adapter);
+	}
+
+	return 0;
 }
 
 /**
@@ -284,7 +527,11 @@ static int iecm_api_init(struct iecm_adapter *adapter)
  */
 void iecm_deinit_task(struct iecm_adapter *adapter)
 {
-	/* stub */
+	iecm_vport_rel_all(adapter);
+	cancel_delayed_work_sync(&adapter->serv_task);
+	iecm_deinit_dflt_mbx(adapter);
+	iecm_vport_params_buf_rel(adapter);
+	iecm_intr_rel(adapter);
 }
 
 /**
@@ -306,7 +553,13 @@ iecm_init_hard_reset(struct iecm_adapter *adapter)
  */
 static void iecm_vc_event_task(struct work_struct *work)
 {
-	/* stub */
+	struct iecm_adapter *adapter = container_of(work,
+						    struct iecm_adapter,
+						    vc_event_task.work);
+
+	if (test_bit(__IECM_HR_CORE_RESET, adapter->flags) ||
+	    test_bit(__IECM_HR_FUNC_RESET, adapter->flags))
+		iecm_init_hard_reset(adapter);
 }
 
 /**
@@ -335,7 +588,103 @@ int iecm_probe(struct pci_dev *pdev,
 	       const struct pci_device_id __always_unused *ent,
 	       struct iecm_adapter *adapter)
 {
-	/* stub */
+	int err;
+
+	adapter->pdev = pdev;
+	err = iecm_api_init(adapter);
+	if (err) {
+		dev_err(&pdev->dev, "Device API is incorrectly configured\n");
+		return err;
+	}
+
+	err = pcim_iomap_regions(pdev, BIT(IECM_BAR0), pci_name(pdev));
+	if (err) {
+		dev_err(&pdev->dev, "BAR0 I/O map error %d\n", err);
+		return err;
+	}
+
+	/* set up for high or low DMA */
+	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+	if (err)
+		err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+	if (err) {
+		dev_err(&pdev->dev, "DMA configuration failed: 0x%x\n", err);
+		return err;
+	}
+
+	pci_enable_pcie_error_reporting(pdev);
+	pci_set_master(pdev);
+	pci_set_drvdata(pdev, adapter);
+
+	adapter->init_wq =
+		alloc_workqueue("%s", WQ_MEM_RECLAIM, 0, KBUILD_MODNAME);
+	if (!adapter->init_wq) {
+		dev_err(&pdev->dev, "Failed to allocate workqueue\n");
+		err = -ENOMEM;
+		goto err_wq_alloc;
+	}
+
+	adapter->serv_wq =
+		alloc_workqueue("%s", WQ_MEM_RECLAIM, 0, KBUILD_MODNAME);
+	if (!adapter->serv_wq) {
+		dev_err(&pdev->dev, "Failed to allocate workqueue\n");
+		err = -ENOMEM;
+		goto err_mbx_wq_alloc;
+	}
+	/* setup msglvl */
+	adapter->msg_enable = netif_msg_init(debug, IECM_DFLT_NETIF_M);
+
+	adapter->vports = kcalloc(IECM_MAX_NUM_VPORTS,
+				  sizeof(*adapter->vports), GFP_KERNEL);
+	if (!adapter->vports) {
+		err = -ENOMEM;
+		goto err_vport_alloc;
+	}
+
+	err = iecm_vport_params_buf_alloc(adapter);
+	if (err) {
+		dev_err(&pdev->dev, "Failed to alloc vport params buffer: %d\n",
+			err);
+		goto err_mb_res;
+	}
+
+	err = iecm_cfg_hw(adapter);
+	if (err) {
+		dev_err(&pdev->dev, "Failed to configure HW structure for adapter: %d\n",
+			err);
+		goto err_cfg_hw;
+	}
+
+	mutex_init(&adapter->sw_mutex);
+	mutex_init(&adapter->vc_msg_lock);
+	mutex_init(&adapter->reset_lock);
+	init_waitqueue_head(&adapter->vchnl_wq);
+
+	INIT_DELAYED_WORK(&adapter->serv_task, iecm_service_task);
+	INIT_DELAYED_WORK(&adapter->init_task, iecm_init_task);
+	INIT_DELAYED_WORK(&adapter->vc_event_task, iecm_vc_event_task);
+
+	mutex_lock(&adapter->reset_lock);
+	set_bit(__IECM_HR_DRV_LOAD, adapter->flags);
+	err = iecm_init_hard_reset(adapter);
+	if (err) {
+		dev_err(&pdev->dev, "Failed to reset device: %d\n", err);
+		goto err_mb_init;
+	}
+
+	return 0;
+err_mb_init:
+err_cfg_hw:
+	iecm_vport_params_buf_rel(adapter);
+err_mb_res:
+	kfree(adapter->vports);
+err_vport_alloc:
+	destroy_workqueue(adapter->serv_wq);
+err_mbx_wq_alloc:
+	destroy_workqueue(adapter->init_wq);
+err_wq_alloc:
+	pci_disable_pcie_error_reporting(pdev);
+	return err;
 }
 EXPORT_SYMBOL(iecm_probe);
 
@@ -345,7 +694,22 @@ EXPORT_SYMBOL(iecm_probe);
  */
 void iecm_remove(struct pci_dev *pdev)
 {
-	/* stub */
+	struct iecm_adapter *adapter = pci_get_drvdata(pdev);
+
+	if (!adapter)
+		return;
+
+	iecm_deinit_task(adapter);
+	cancel_delayed_work_sync(&adapter->vc_event_task);
+	destroy_workqueue(adapter->serv_wq);
+	destroy_workqueue(adapter->init_wq);
+	kfree(adapter->vports);
+	kfree(adapter->vport_params_recvd);
+	kfree(adapter->vport_params_reqd);
+	mutex_destroy(&adapter->sw_mutex);
+	mutex_destroy(&adapter->vc_msg_lock);
+	mutex_destroy(&adapter->reset_lock);
+	pci_disable_pcie_error_reporting(pdev);
 }
 EXPORT_SYMBOL(iecm_remove);
 
@@ -355,7 +719,13 @@ EXPORT_SYMBOL(iecm_remove);
  */
 void iecm_shutdown(struct pci_dev *pdev)
 {
-	/* stub */
+	struct iecm_adapter *adapter;
+
+	adapter = pci_get_drvdata(pdev);
+	adapter->state = __IECM_REMOVE;
+
+	if (system_state == SYSTEM_POWER_OFF)
+		pci_set_power_state(pdev, PCI_D3hot);
 }
 EXPORT_SYMBOL(iecm_shutdown);
 
diff --git a/drivers/net/ethernet/intel/iecm/iecm_main.c b/drivers/net/ethernet/intel/iecm/iecm_main.c
index 0644581fc746..3b6eb44643de 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_main.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_main.c
@@ -30,7 +30,10 @@ MODULE_PARM_DESC(debug, "netif level (0=none,...,16=all)");
  */
 static int __init iecm_module_init(void)
 {
-	/* stub */
+	pr_info("%s\n", iecm_driver_string);
+	pr_info("%s\n", iecm_copyright);
+
+	return 0;
 }
 module_init(iecm_module_init);
 
@@ -42,6 +45,6 @@ module_init(iecm_module_init);
  */
 static void __exit iecm_module_exit(void)
 {
-	/* stub */
+	pr_info("module unloaded\n");
 }
 module_exit(iecm_module_exit);
diff --git a/drivers/net/ethernet/intel/iecm/iecm_txrx.c b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
index b4688daa744d..0d684adc15e5 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_txrx.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
@@ -989,7 +989,11 @@ static int iecm_rx_splitq_clean(struct iecm_queue *rxq, int budget)
 irqreturn_t
 iecm_vport_intr_clean_queues(int __always_unused irq, void *data)
 {
-	/* stub */
+	struct iecm_q_vector *q_vector = (struct iecm_q_vector *)data;
+
+	napi_schedule(&q_vector->napi);
+
+	return IRQ_HANDLED;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
index 271009350503..7bf7c02f2d6f 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
@@ -418,7 +418,47 @@ void iecm_deinit_dflt_mbx(struct iecm_adapter *adapter)
  */
 enum iecm_status iecm_init_dflt_mbx(struct iecm_adapter *adapter)
 {
-	/* stub */
+	struct iecm_ctlq_create_info ctlq_info[] = {
+		{
+			.type = IECM_CTLQ_TYPE_MAILBOX_TX,
+			.id = IECM_DFLT_MBX_ID,
+			.len = IECM_DFLT_MBX_Q_LEN,
+			.buf_size = IECM_DFLT_MBX_BUF_SIZE
+		},
+		{
+			.type = IECM_CTLQ_TYPE_MAILBOX_RX,
+			.id = IECM_DFLT_MBX_ID,
+			.len = IECM_DFLT_MBX_Q_LEN,
+			.buf_size = IECM_DFLT_MBX_BUF_SIZE
+		}
+	};
+	struct iecm_hw *hw = &adapter->hw;
+	enum iecm_status ret;
+
+	adapter->dev_ops.reg_ops.ctlq_reg_init(ctlq_info);
+
+#define NUM_Q 2
+	ret = iecm_ctlq_init(hw, NUM_Q, ctlq_info);
+	if (ret)
+		goto init_mbx_done;
+
+	hw->asq = iecm_find_ctlq(hw, IECM_CTLQ_TYPE_MAILBOX_TX,
+				 IECM_DFLT_MBX_ID);
+	hw->arq = iecm_find_ctlq(hw, IECM_CTLQ_TYPE_MAILBOX_RX,
+				 IECM_DFLT_MBX_ID);
+
+	if (!hw->asq || !hw->arq) {
+		iecm_ctlq_deinit(hw);
+		ret = IECM_ERR_CTLQ_ERROR;
+	}
+	adapter->state = __IECM_STARTUP;
+	/* Skew the delay for init tasks for each function based on fn number
+	 * to prevent every function from making the same call simultaneously.
+	 */
+	queue_delayed_work(adapter->init_wq, &adapter->init_task,
+			   msecs_to_jiffies(5 * (adapter->pdev->devfn & 0x07)));
+init_mbx_done:
+	return ret;
 }
 
 /**
@@ -440,7 +480,15 @@ int iecm_vport_params_buf_alloc(struct iecm_adapter *adapter)
  */
 void iecm_vport_params_buf_rel(struct iecm_adapter *adapter)
 {
-	/* stub */
+	int i = 0;
+
+	for (i = 0; i < IECM_MAX_NUM_VPORTS; i++) {
+		kfree(adapter->vport_params_recvd[i]);
+		kfree(adapter->vport_params_reqd[i]);
+	}
+
+	kfree(adapter->caps);
+	kfree(adapter->config_data.req_qs_chunks);
 }
 
 /**
@@ -565,6 +613,25 @@ static bool iecm_is_capability_ena(struct iecm_adapter *adapter, u64 flag)
  */
 void iecm_vc_ops_init(struct iecm_adapter *adapter)
 {
-	/* stub */
+	adapter->dev_ops.vc_ops.core_init = iecm_vc_core_init;
+	adapter->dev_ops.vc_ops.vport_init = iecm_vport_init;
+	adapter->dev_ops.vc_ops.vport_queue_ids_init =
+		iecm_vport_queue_ids_init;
+	adapter->dev_ops.vc_ops.get_caps = iecm_send_get_caps_msg;
+	adapter->dev_ops.vc_ops.is_cap_ena = iecm_is_capability_ena;
+	adapter->dev_ops.vc_ops.config_queues = iecm_send_config_queues_msg;
+	adapter->dev_ops.vc_ops.enable_queues = iecm_send_enable_queues_msg;
+	adapter->dev_ops.vc_ops.disable_queues = iecm_send_disable_queues_msg;
+	adapter->dev_ops.vc_ops.irq_map_unmap =
+		iecm_send_map_unmap_queue_vector_msg;
+	adapter->dev_ops.vc_ops.enable_vport = iecm_send_enable_vport_msg;
+	adapter->dev_ops.vc_ops.disable_vport = iecm_send_disable_vport_msg;
+	adapter->dev_ops.vc_ops.destroy_vport = iecm_send_destroy_vport_msg;
+	adapter->dev_ops.vc_ops.get_ptype = iecm_send_get_rx_ptype_msg;
+	adapter->dev_ops.vc_ops.get_set_rss_lut = iecm_send_get_set_rss_lut_msg;
+	adapter->dev_ops.vc_ops.get_set_rss_hash =
+		iecm_send_get_set_rss_hash_msg;
+	adapter->dev_ops.vc_ops.adjust_qs = iecm_vport_adjust_qs;
+	adapter->dev_ops.vc_ops.recv_mbx_msg = NULL;
 }
 EXPORT_SYMBOL(iecm_vc_ops_init);
-- 
2.26.2



* [net-next 14/15] iecm: Add iecm to the kernel build system
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

This introduces iecm as a module to the kernel, and adds
relevant documentation.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../networking/device_drivers/intel/iecm.rst  | 93 +++++++++++++++++++
 MAINTAINERS                                   |  2 +
 drivers/net/ethernet/intel/Kconfig            |  7 ++
 drivers/net/ethernet/intel/Makefile           |  1 +
 drivers/net/ethernet/intel/iecm/Makefile      | 19 ++++
 5 files changed, 122 insertions(+)
 create mode 100644 Documentation/networking/device_drivers/intel/iecm.rst
 create mode 100644 drivers/net/ethernet/intel/iecm/Makefile
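
The documentation added below describes required register hooks and optional
virtchnl hooks with a library default. A minimal standalone sketch of that
contract follows. The struct layouts and function names are hypothetical,
heavily trimmed stand-ins and not the iecm definitions; the control flow is
modelled on iecm_api_init() from patch 05.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down hook tables. */
struct reg_ops {
	void (*ctlq_reg_init)(void);
	void (*trigger_reset)(void);
};

struct vc_ops {
	int (*core_init)(void);
};

struct dev_ops {
	void (*reg_ops_init)(struct dev_ops *ops);	/* required */
	void (*vc_ops_init)(struct dev_ops *ops);	/* optional */
	struct reg_ops reg_ops;
	struct vc_ops vc_ops;
};

/* Library default used when the driver supplies no vc_ops_init. */
static int default_core_init(void)
{
	return 0;
}

static void default_vc_ops_init(struct dev_ops *ops)
{
	ops->vc_ops.core_init = default_core_init;
}

/* Validate the driver supplied hooks and fill in defaults. */
static int api_init(struct dev_ops *ops)
{
	assert(ops);

	/* No sensible default exists for register addresses. */
	if (!ops->reg_ops_init)
		return -EINVAL;
	ops->reg_ops_init(ops);
	if (!(ops->reg_ops.ctlq_reg_init && ops->reg_ops.trigger_reset))
		return -EINVAL;

	if (ops->vc_ops_init) {
		ops->vc_ops_init(ops);
		if (!ops->vc_ops.core_init)
			return -EINVAL;
	} else {
		default_vc_ops_init(ops);
	}

	return 0;
}

/* Example dependent driver that only provides the register hooks. */
static void my_ctlq_reg_init(void) { }
static void my_trigger_reset(void) { }

static void my_reg_ops_init(struct dev_ops *ops)
{
	ops->reg_ops.ctlq_reg_init = my_ctlq_reg_init;
	ops->reg_ops.trigger_reset = my_trigger_reset;
}
```

A driver that sets only reg_ops_init passes validation and ends up with the
default virtchnl hooks; a driver that sets nothing is rejected with -EINVAL.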

diff --git a/Documentation/networking/device_drivers/intel/iecm.rst b/Documentation/networking/device_drivers/intel/iecm.rst
new file mode 100644
index 000000000000..5634e3e65c74
--- /dev/null
+++ b/Documentation/networking/device_drivers/intel/iecm.rst
@@ -0,0 +1,93 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Intel Ethernet Common Module
+============================
+
+The Intel Ethernet Common Module is meant to serve as an abstraction layer
+between device specific implementation details and common net device driver
+flows. This library provides several function hooks which allow a device driver
+to specify register addresses, control queue communications, and other device
+specific functionality.  Some functions are required to be implemented while
+others have a default implementation that is used when none is supplied by the
+device driver.  Doing this, a device driver can be written to take advantage
+of existing code while also giving the flexibility to implement device specific
+features.
+
+The common use case for this library is for a network device driver that wants
+to specify its own device specific details but also leverage the more common
+code flows found in network device drivers.
+
+Sections in this document:
+	Entry Point
+	Exit Point
+	Register Operations API
+	Virtchnl Operations API
+
+Entry Point
+~~~~~~~~~~~
+The primary entry point to the library is the iecm_probe function.  Prior to
+calling this, device drivers must have allocated an iecm_adapter struct and
+initialized it with the required API functions.  The adapter struct, along with
+the pci_dev struct and the pci_device_id struct, is provided to iecm_probe
+which finalizes device initialization and prepares the device for open.
+
+The iecm_dev_ops struct within the iecm_adapter struct is the primary vehicle
+for passing information from device drivers to the common module.  A dependent
+module must define and assign a reg_ops_init function which will assign the
+respective function pointers to initialize register values (see iecm_reg_ops
+struct).  These are required to be provided by the dependent device driver as
+no suitable default can be assumed for register addresses.
+
+The vc_ops_init function pointer and the related iecm_virtchnl_ops struct are
+optional and should only be necessary for device drivers which use a different
+method/timing for communicating across a mailbox to the hardware.  iecm
+provides a default interface for the case where one is not supplied by the
+device driver.
+
+Exit Point
+~~~~~~~~~~
+When the device driver is being prepared to be removed through the pci_driver
+remove callback, it is required for the device driver to call iecm_remove with
+the pci_dev struct provided.  This is required to ensure all resources are
+properly freed and returned to the operating system.
+
+Register Operations API
+~~~~~~~~~~~~~~~~~~~~~~~
+iecm_reg_ops contains three different function pointers relating to initializing
+registers for the specific net device using the library.
+
+ctlq_reg_init relates specifically to setting up registers related to control
+queue/mailbox communications.  Registers that should be defined include: head,
+tail, len, bah, bal, len_mask, len_ena_mask, and head_mask.
+
+vportq_reg_init relates to setting up queue registers.  The tail registers are
+to be assigned to the iecm_queue struct for each RX/TX queue.
+
+intr_reg_init relates to any registers needed to setup interrupts.  These
+include registers needed to enable the interrupt and change ITR settings.
+
+If the initialization function finds that one or more required function
+pointers were not provided, an error will be issued and the device will be
+inoperable.
+
+
+Virtchnl Operations API
+~~~~~~~~~~~~~~~~~~~~~~~
+The virtchnl is a conduit between driver and hardware that allows device
+drivers to send and receive control messages to/from hardware.  Specifying it
+is optional, as there is a general interface that can be assumed when using
+this library.  However, if a device must deviate from that general interface
+to communicate across the mailbox correctly, this interface is provided to
+allow that.
+
+If vc_ops_init is set in the dev_ops field of the iecm_adapter struct, then it
+is assumed the device driver is providing its own interface to do
+virtchnl communications.  While providing vc_ops_init is optional, if it is
+provided, it is required that the device driver provide function pointers for
+those functions in vc_ops, with exception for the enable_vport, disable_vport,
+and destroy_vport functions which may not be required for all devices.
+
+If the initialization function finds that vc_ops_init was defined but one or
+more required function pointers were not provided, an error will be issued and
+the device will be inoperable.
diff --git a/MAINTAINERS b/MAINTAINERS
index 301330e02bca..102ee1e4aef0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8654,6 +8654,7 @@ F:	Documentation/networking/device_drivers/intel/fm10k.rst
 F:	Documentation/networking/device_drivers/intel/i40e.rst
 F:	Documentation/networking/device_drivers/intel/iavf.rst
 F:	Documentation/networking/device_drivers/intel/ice.rst
+F:	Documentation/networking/device_drivers/intel/iecm.rst
 F:	Documentation/networking/device_drivers/intel/igb.rst
 F:	Documentation/networking/device_drivers/intel/igbvf.rst
 F:	Documentation/networking/device_drivers/intel/ixgb.rst
@@ -8662,6 +8663,7 @@ F:	Documentation/networking/device_drivers/intel/ixgbevf.rst
 F:	drivers/net/ethernet/intel/
 F:	drivers/net/ethernet/intel/*/
 F:	include/linux/avf/virtchnl.h
+F:	include/linux/net/intel/
 
 INTEL FRAMEBUFFER DRIVER (excluding 810 and 815)
 M:	Maik Broemme <mbroemme@libmpq.org>
diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index 48a8f9aa1dd0..6dd985cbdb6d 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -342,4 +342,11 @@ config IGC
 	  To compile this driver as a module, choose M here. The module
 	  will be called igc.
 
+config IECM
+	tristate "Intel(R) Ethernet Common Module Support"
+	default n
+	depends on PCI
+	help
+	 To compile this driver as a module, choose M here. The module
+	 will be called iecm.  MSI-X interrupt support is required.
 endif # NET_VENDOR_INTEL
diff --git a/drivers/net/ethernet/intel/Makefile b/drivers/net/ethernet/intel/Makefile
index 3075290063f6..c9eba9cc5087 100644
--- a/drivers/net/ethernet/intel/Makefile
+++ b/drivers/net/ethernet/intel/Makefile
@@ -16,3 +16,4 @@ obj-$(CONFIG_IXGB) += ixgb/
 obj-$(CONFIG_IAVF) += iavf/
 obj-$(CONFIG_FM10K) += fm10k/
 obj-$(CONFIG_ICE) += ice/
+obj-$(CONFIG_IECM) += iecm/
diff --git a/drivers/net/ethernet/intel/iecm/Makefile b/drivers/net/ethernet/intel/iecm/Makefile
new file mode 100644
index 000000000000..61c814013582
--- /dev/null
+++ b/drivers/net/ethernet/intel/iecm/Makefile
@@ -0,0 +1,19 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# Copyright (C) 2020 Intel Corporation
+
+#
+# Makefile for the Intel(R) Ethernet Common Module
+#
+
+obj-$(CONFIG_IECM) += iecm.o
+
+iecm-y := \
+	iecm_lib.o \
+	iecm_virtchnl.o \
+	iecm_txrx.o \
+	iecm_singleq_txrx.o \
+	iecm_ethtool.o \
+	iecm_controlq.o \
+	iecm_osdep.o \
+	iecm_controlq_setup.o \
+	iecm_main.o
-- 
2.26.2



* [net-next 09/15] iecm: Init and allocate vport
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

Initialize vport and allocate queue resources.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
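The refill loop in iecm_rx_buf_hw_alloc_all() below walks the ring from
next_to_alloc, wraps at the end of the ring, bumps the tail only if progress
was made, and reports whether any requested buffers are still missing. A
minimal userspace sketch of that shape follows. All sketch_ names, the fixed
ring size and the alloc_budget field (a simulated allocator) are assumptions
made for illustration; descriptor writes and DMA mapping are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8

struct sketch_ring {
	bool filled[SKETCH_RING_SIZE];	/* slot already has a buffer */
	uint16_t next_to_alloc;
	uint16_t tail;			/* stands in for the tail register */
	int alloc_budget;		/* allocations left before failure */
};

/* Returns true if slot idx has a buffer after the call */
static bool sketch_buf_alloc(struct sketch_ring *r, uint16_t idx)
{
	if (r->filled[idx])
		return true;	/* recycled buffer, nothing to allocate */
	if (r->alloc_budget <= 0)
		return false;	/* simulated allocation failure */
	r->alloc_budget--;
	r->filled[idx] = true;
	return true;
}

/* Returns true if some of the requested buffers could not be posted */
static bool sketch_refill(struct sketch_ring *r, uint16_t count)
{
	uint16_t nta = r->next_to_alloc;

	if (!count)
		return false;

	do {
		if (!sketch_buf_alloc(r, nta))
			break;

		nta++;
		if (nta == SKETCH_RING_SIZE)
			nta = 0;	/* wrap to the start of the ring */

		count--;
	} while (count);

	/* Only touch the tail when at least one slot was posted */
	if (r->next_to_alloc != nta) {
		r->next_to_alloc = nta;
		r->tail = nta;
	}

	return count != 0;
}
```

With this convention a caller treats a true return as "ring not fully
replenished", which is how iecm_rx_buf_alloc_all() in the patch maps the
result to IECM_ERR_NO_MEMORY.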
 drivers/net/ethernet/intel/iecm/iecm_lib.c    |  87 +-
 drivers/net/ethernet/intel/iecm/iecm_txrx.c   | 797 +++++++++++++++++-
 .../net/ethernet/intel/iecm/iecm_virtchnl.c   |  37 +-
 3 files changed, 890 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/intel/iecm/iecm_lib.c b/drivers/net/ethernet/intel/iecm/iecm_lib.c
index a4fd04fd0500..d855d6238740 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_lib.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_lib.c
@@ -443,7 +443,15 @@ static void iecm_vport_rel_all(struct iecm_adapter *adapter)
  */
 void iecm_vport_set_hsplit(struct iecm_vport *vport, struct bpf_prog *prog)
 {
-	/* stub */
+	if (prog) {
+		vport->rx_hsplit_en = IECM_RX_NO_HDR_SPLIT;
+		return;
+	}
+	if (iecm_is_cap_ena(vport->adapter, VIRTCHNL_CAP_HEADER_SPLIT) &&
+	    iecm_is_queue_model_split(vport->rxq_model))
+		vport->rx_hsplit_en = IECM_RX_HDR_SPLIT;
+	else
+		vport->rx_hsplit_en = IECM_RX_NO_HDR_SPLIT;
 }
 
 /**
@@ -531,7 +539,12 @@ static void iecm_service_task(struct work_struct *work)
  */
 static void iecm_up_complete(struct iecm_vport *vport)
 {
-	/* stub */
+	netif_set_real_num_rx_queues(vport->netdev, vport->num_rxq);
+	netif_set_real_num_tx_queues(vport->netdev, vport->num_txq);
+	netif_carrier_on(vport->netdev);
+	netif_tx_start_all_queues(vport->netdev);
+
+	vport->adapter->state = __IECM_UP;
 }
 
 /**
@@ -540,7 +553,71 @@ static void iecm_up_complete(struct iecm_vport *vport)
  */
 static int iecm_vport_open(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	int err;
+
+	if (vport->adapter->state != __IECM_DOWN)
+		return -EBUSY;
+
+	/* we do not allow interface up just yet */
+	netif_carrier_off(vport->netdev);
+
+	if (adapter->dev_ops.vc_ops.enable_vport) {
+		err = adapter->dev_ops.vc_ops.enable_vport(vport);
+		if (err)
+			return -EAGAIN;
+	}
+
+	if (iecm_vport_queues_alloc(vport)) {
+		err = -ENOMEM;
+		goto unroll_queues_alloc;
+	}
+
+	err = iecm_vport_intr_init(vport);
+	if (err)
+		goto unroll_intr_init;
+
+	err = vport->adapter->dev_ops.vc_ops.config_queues(vport);
+	if (err)
+		goto unroll_config_queues;
+	err = vport->adapter->dev_ops.vc_ops.irq_map_unmap(vport, true);
+	if (err) {
+		dev_err(&vport->adapter->pdev->dev,
+			"Call to irq_map_unmap returned %d\n", err);
+		goto unroll_config_queues;
+	}
+	err = vport->adapter->dev_ops.vc_ops.enable_queues(vport);
+	if (err)
+		goto unroll_enable_queues;
+
+	err = vport->adapter->dev_ops.vc_ops.get_ptype(vport);
+	if (err)
+		goto unroll_get_ptype;
+
+	if (adapter->rss_data.rss_lut)
+		err = iecm_config_rss(vport);
+	else
+		err = iecm_init_rss(vport);
+	if (err)
+		goto unroll_init_rss;
+	iecm_up_complete(vport);
+
+	netif_info(vport->adapter, hw, vport->netdev, "%s\n", __func__);
+
+	return 0;
+unroll_init_rss:
+unroll_get_ptype:
+	vport->adapter->dev_ops.vc_ops.disable_queues(vport);
+unroll_enable_queues:
+	vport->adapter->dev_ops.vc_ops.irq_map_unmap(vport, false);
+unroll_config_queues:
+	iecm_vport_intr_deinit(vport);
+unroll_intr_init:
+	iecm_vport_queues_rel(vport);
+unroll_queues_alloc:
+	adapter->dev_ops.vc_ops.disable_vport(vport);
+
+	return err;
 }
 
 /**
@@ -861,7 +938,9 @@ EXPORT_SYMBOL(iecm_shutdown);
  */
 static int iecm_open(struct net_device *netdev)
 {
-	/* stub */
+	struct iecm_netdev_priv *np = netdev_priv(netdev);
+
+	return iecm_vport_open(np->vport);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/iecm/iecm_txrx.c b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
index da3065a87c2c..16fea9ad6545 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_txrx.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
@@ -82,7 +82,37 @@ void iecm_tx_desc_rel_all(struct iecm_vport *vport)
  */
 static enum iecm_status iecm_tx_buf_alloc_all(struct iecm_queue *tx_q)
 {
-	/* stub */
+	int buf_size;
+	int i = 0;
+
+	/* Allocate book keeping buffers only. Buffers to be supplied to HW
+	 * are allocated by kernel network stack and received as part of skb
+	 */
+	buf_size = sizeof(struct iecm_tx_buf) * tx_q->desc_count;
+	tx_q->tx_buf = kzalloc(buf_size, GFP_KERNEL);
+	if (!tx_q->tx_buf)
+		return IECM_ERR_NO_MEMORY;
+
+	/* Initialize Tx buf stack for out-of-order completions if
+	 * flow scheduling offload is enabled
+	 */
+	tx_q->buf_stack.bufs =
+		kcalloc(tx_q->desc_count, sizeof(struct iecm_tx_buf *),
+			GFP_KERNEL);
+	if (!tx_q->buf_stack.bufs)
+		return IECM_ERR_NO_MEMORY;
+
+	for (i = 0; i < tx_q->desc_count; i++) {
+		tx_q->buf_stack.bufs[i] = kzalloc(sizeof(struct iecm_tx_buf),
+						  GFP_KERNEL);
+		if (!tx_q->buf_stack.bufs[i])
+			return IECM_ERR_NO_MEMORY;
+	}
+
+	tx_q->buf_stack.size = tx_q->desc_count;
+	tx_q->buf_stack.top = tx_q->desc_count;
+
+	return 0;
 }
 
 /**
@@ -92,7 +122,40 @@ static enum iecm_status iecm_tx_buf_alloc_all(struct iecm_queue *tx_q)
  */
 static enum iecm_status iecm_tx_desc_alloc(struct iecm_queue *tx_q, bool bufq)
 {
-	/* stub */
+	struct device *dev = tx_q->dev;
+	enum iecm_status err = 0;
+
+	if (bufq) {
+		err = iecm_tx_buf_alloc_all(tx_q);
+		if (err)
+			goto err_alloc;
+		tx_q->size = tx_q->desc_count *
+				sizeof(struct iecm_base_tx_desc);
+	} else {
+		tx_q->size = tx_q->desc_count *
+				sizeof(struct iecm_splitq_tx_compl_desc);
+	}
+
+	/* Allocate descriptors and also round up to nearest 4K */
+	tx_q->size = ALIGN(tx_q->size, 4096);
+	tx_q->desc_ring = dmam_alloc_coherent(dev, tx_q->size, &tx_q->dma,
+					      GFP_KERNEL);
+	if (!tx_q->desc_ring) {
+		dev_info(dev, "Unable to allocate memory for the Tx descriptor ring, size=%d\n",
+			 tx_q->size);
+		err = IECM_ERR_NO_MEMORY;
+		goto err_alloc;
+	}
+
+	tx_q->next_to_alloc = 0;
+	tx_q->next_to_use = 0;
+	tx_q->next_to_clean = 0;
+	set_bit(__IECM_Q_GEN_CHK, tx_q->flags);
+
+err_alloc:
+	if (err)
+		iecm_tx_desc_rel(tx_q, bufq);
+	return err;
 }
 
 /**
@@ -101,7 +164,41 @@ static enum iecm_status iecm_tx_desc_alloc(struct iecm_queue *tx_q, bool bufq)
  */
 static enum iecm_status iecm_tx_desc_alloc_all(struct iecm_vport *vport)
 {
-	/* stub */
+	struct pci_dev *pdev = vport->adapter->pdev;
+	enum iecm_status err = 0;
+	int i, j;
+
+	/* Setup buffer queues. In single queue model buffer queues and
+	 * completion queues will be same
+	 */
+	for (i = 0; i < vport->num_txq_grp; i++) {
+		for (j = 0; j < vport->txq_grps[i].num_txq; j++) {
+			err = iecm_tx_desc_alloc(&vport->txq_grps[i].txqs[j],
+						 true);
+			if (err) {
+				dev_err(&pdev->dev,
+					"Allocation for Tx Queue %u failed\n",
+					i);
+				goto err_out;
+			}
+		}
+
+		if (iecm_is_queue_model_split(vport->txq_model)) {
+			/* Setup completion queues */
+			err = iecm_tx_desc_alloc(vport->txq_grps[i].complq,
+						 false);
+			if (err) {
+				dev_err(&pdev->dev,
+					"Allocation for Tx Completion Queue %u failed\n",
+					i);
+				goto err_out;
+			}
+		}
+	}
+err_out:
+	if (err)
+		iecm_tx_desc_rel_all(vport);
+	return err;
 }
 
 /**
@@ -156,7 +253,17 @@ void iecm_rx_desc_rel_all(struct iecm_vport *vport)
  */
 void iecm_rx_buf_hw_update(struct iecm_queue *rxq, u32 val)
 {
-	/* stub */
+	/* update next to alloc since we have filled the ring */
+	rxq->next_to_alloc = val;
+
+	rxq->next_to_use = val;
+	/* Force memory writes to complete before letting h/w
+	 * know there are new descriptors to fetch.  (Only
+	 * applicable for weak-ordered memory model archs,
+	 * such as IA-64).
+	 */
+	wmb();
+	writel_relaxed(val, rxq->tail);
 }
 
 /**
@@ -169,7 +276,34 @@ void iecm_rx_buf_hw_update(struct iecm_queue *rxq, u32 val)
  */
 bool iecm_rx_buf_hw_alloc(struct iecm_queue *rxq, struct iecm_rx_buf *buf)
 {
-	/* stub */
+	struct page *page = buf->page;
+	dma_addr_t dma;
+
+	/* since we are recycling buffers we should seldom need to alloc */
+	if (likely(page))
+		return true;
+
+	/* alloc new page for storage */
+	page = alloc_page(GFP_ATOMIC | __GFP_NOWARN);
+	if (unlikely(!page))
+		return false;
+
+	/* map page for use */
+	dma = dma_map_page(rxq->dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
+
+	/* if mapping failed free memory back to system since
+	 * there isn't much point in holding memory we can't use
+	 */
+	if (dma_mapping_error(rxq->dev, dma)) {
+		__free_pages(page, 0);
+		return false;
+	}
+
+	buf->dma = dma;
+	buf->page = page;
+	buf->page_offset = iecm_rx_offset(rxq);
+
+	return true;
 }
 
 /**
@@ -183,7 +317,34 @@ bool iecm_rx_buf_hw_alloc(struct iecm_queue *rxq, struct iecm_rx_buf *buf)
 bool iecm_rx_hdr_buf_hw_alloc(struct iecm_queue *rxq,
 			      struct iecm_rx_buf *hdr_buf)
 {
-	/* stub */
+	struct page *page = hdr_buf->page;
+	dma_addr_t dma;
+
+	/* since we are recycling buffers we should seldom need to alloc */
+	if (likely(page))
+		return true;
+
+	/* alloc new page for storage */
+	page = alloc_page(GFP_ATOMIC | __GFP_NOWARN);
+	if (unlikely(!page))
+		return false;
+
+	/* map page for use */
+	dma = dma_map_page(rxq->dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
+
+	/* if mapping failed free memory back to system since
+	 * there isn't much point in holding memory we can't use
+	 */
+	if (dma_mapping_error(rxq->dev, dma)) {
+		__free_pages(page, 0);
+		return false;
+	}
+
+	hdr_buf->dma = dma;
+	hdr_buf->page = page;
+	hdr_buf->page_offset = 0;
+
+	return true;
 }
 
 /**
@@ -197,7 +358,59 @@ static bool
 iecm_rx_buf_hw_alloc_all(struct iecm_queue *rxq,
 			 u16 cleaned_count)
 {
-	/* stub */
+	struct iecm_splitq_rx_buf_desc *splitq_rx_desc = NULL;
+	struct iecm_rx_buf *hdr_buf = NULL;
+	u16 nta = rxq->next_to_alloc;
+	struct iecm_rx_buf *buf;
+
+	/* do nothing if no valid netdev defined */
+	if (!rxq->vport->netdev || !cleaned_count)
+		return false;
+
+	splitq_rx_desc = IECM_SPLITQ_RX_BUF_DESC(rxq, nta);
+
+	buf = &rxq->rx_buf.buf[nta];
+	if (rxq->rx_hsplit_en)
+		hdr_buf = &rxq->rx_buf.hdr_buf[nta];
+
+	do {
+		if (rxq->rx_hsplit_en) {
+			if (!iecm_rx_hdr_buf_hw_alloc(rxq, hdr_buf))
+				break;
+
+			splitq_rx_desc->hdr_addr =
+				cpu_to_le64(hdr_buf->dma +
+					    hdr_buf->page_offset);
+			hdr_buf++;
+		}
+
+		if (!iecm_rx_buf_hw_alloc(rxq, buf))
+			break;
+
+		/* Refresh the desc even if buffer_addrs didn't change
+		 * because each write-back erases this info.
+		 */
+		splitq_rx_desc->pkt_addr =
+			cpu_to_le64(buf->dma + buf->page_offset);
+		splitq_rx_desc->qword0.buf_id = cpu_to_le64(nta);
+
+		splitq_rx_desc++;
+		buf++;
+		nta++;
+		if (unlikely(nta == rxq->desc_count)) {
+			splitq_rx_desc = IECM_SPLITQ_RX_BUF_DESC(rxq, 0);
+			buf = rxq->rx_buf.buf;
+			hdr_buf = rxq->rx_buf.hdr_buf;
+			nta = 0;
+		}
+
+		cleaned_count--;
+	} while (cleaned_count);
+
+	if (rxq->next_to_alloc != nta)
+		iecm_rx_buf_hw_update(rxq, nta);
+
+	return !!cleaned_count;
 }
 
 /**
@@ -206,7 +419,44 @@ iecm_rx_buf_hw_alloc_all(struct iecm_queue *rxq,
  */
 static enum iecm_status iecm_rx_buf_alloc_all(struct iecm_queue *rxq)
 {
-	/* stub */
+	enum iecm_status err = 0;
+
+	/* Allocate book keeping buffers */
+	rxq->rx_buf.buf = kcalloc(rxq->desc_count, sizeof(struct iecm_rx_buf),
+				  GFP_KERNEL);
+	if (!rxq->rx_buf.buf) {
+		err = IECM_ERR_NO_MEMORY;
+		goto rx_buf_alloc_all_out;
+	}
+
+	if (rxq->rx_hsplit_en) {
+		rxq->rx_buf.hdr_buf =
+			kcalloc(rxq->desc_count, sizeof(struct iecm_rx_buf),
+				GFP_KERNEL);
+		if (!rxq->rx_buf.hdr_buf) {
+			err = IECM_ERR_NO_MEMORY;
+			goto rx_buf_alloc_all_out;
+		}
+	} else {
+		rxq->rx_buf.hdr_buf = NULL;
+	}
+
+	/* Allocate buffers to be given to HW. Allocate one less than
+	 * total descriptor count as RX splits 4k buffers to 2K and recycles
+	 */
+	if (iecm_is_queue_model_split(rxq->vport->rxq_model)) {
+		if (iecm_rx_buf_hw_alloc_all(rxq,
+					     rxq->desc_count - 1))
+			err = IECM_ERR_NO_MEMORY;
+	} else if (iecm_rx_singleq_buf_hw_alloc_all(rxq,
+						    rxq->desc_count - 1)) {
+		err = IECM_ERR_NO_MEMORY;
+	}
+
+rx_buf_alloc_all_out:
+	if (err)
+		iecm_rx_buf_rel_all(rxq);
+	return err;
 }
 
 /**
@@ -218,7 +468,50 @@ static enum iecm_status iecm_rx_buf_alloc_all(struct iecm_queue *rxq)
 static enum iecm_status iecm_rx_desc_alloc(struct iecm_queue *rxq, bool bufq,
 					   enum virtchnl_queue_model q_model)
 {
-	/* stub */
+	struct device *dev = rxq->dev;
+	enum iecm_status err = 0;
+
+	/* As both single and split descriptors are 32 byte, memory size
+	 * will be same for all three singleq_base Rx, buf., splitq_base
+	 * Rx. So pick any one of them for size
+	 */
+	if (bufq) {
+		rxq->size = rxq->desc_count *
+			sizeof(struct iecm_splitq_rx_buf_desc);
+	} else {
+		rxq->size = rxq->desc_count *
+			sizeof(union iecm_rx_desc);
+	}
+
+	/* Allocate descriptors and also round up to nearest 4K */
+	rxq->size = ALIGN(rxq->size, 4096);
+	rxq->desc_ring = dmam_alloc_coherent(dev, rxq->size,
+					     &rxq->dma, GFP_KERNEL);
+	if (!rxq->desc_ring) {
+		dev_info(dev, "Unable to allocate memory for the Rx descriptor ring, size=%d\n",
+			 rxq->size);
+		err = IECM_ERR_NO_MEMORY;
+		return err;
+	}
+
+	rxq->next_to_alloc = 0;
+	rxq->next_to_clean = 0;
+	rxq->next_to_use = 0;
+	set_bit(__IECM_Q_GEN_CHK, rxq->flags);
+
+	/* Allocate buffers for a Rx queue if the q_model is single OR if it
+	 * is a buffer queue in split queue model
+	 */
+	if (bufq || !iecm_is_queue_model_split(q_model)) {
+		err = iecm_rx_buf_alloc_all(rxq);
+		if (err)
+			goto err_alloc;
+	}
+
+err_alloc:
+	if (err)
+		iecm_rx_desc_rel(rxq, bufq, q_model);
+	return err;
 }
 
 /**
@@ -227,7 +520,48 @@ static enum iecm_status iecm_rx_desc_alloc(struct iecm_queue *rxq, bool bufq,
  */
 static enum iecm_status iecm_rx_desc_alloc_all(struct iecm_vport *vport)
 {
-	/* stub */
+	struct device *dev = &vport->adapter->pdev->dev;
+	enum iecm_status err = 0;
+	struct iecm_queue *q;
+	int i, j, num_rxq;
+
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		if (iecm_is_queue_model_split(vport->rxq_model))
+			num_rxq = vport->rxq_grps[i].splitq.num_rxq_sets;
+		else
+			num_rxq = vport->rxq_grps[i].singleq.num_rxq;
+
+		for (j = 0; j < num_rxq; j++) {
+			if (iecm_is_queue_model_split(vport->rxq_model))
+				q = &vport->rxq_grps[i].splitq.rxq_sets[j].rxq;
+			else
+				q = &vport->rxq_grps[i].singleq.rxqs[j];
+			err = iecm_rx_desc_alloc(q, false, vport->rxq_model);
+			if (err) {
+				dev_err(dev, "Memory allocation for Rx Queue %u failed\n",
+					i);
+				goto err_out;
+			}
+		}
+
+		if (iecm_is_queue_model_split(vport->rxq_model)) {
+			for (j = 0; j < IECM_BUFQS_PER_RXQ_SET; j++) {
+				q =
+				  &vport->rxq_grps[i].splitq.bufq_sets[j].bufq;
+				err = iecm_rx_desc_alloc(q, true,
+							 vport->rxq_model);
+				if (err) {
+					dev_err(dev, "Memory allocation for Rx Buffer Queue %u failed\n",
+						i);
+					goto err_out;
+				}
+			}
+		}
+	}
+err_out:
+	if (err)
+		iecm_rx_desc_rel_all(vport);
+	return err;
 }
 
 /**
@@ -279,7 +613,26 @@ void iecm_vport_queues_rel(struct iecm_vport *vport)
 static enum iecm_status
 iecm_vport_init_fast_path_txqs(struct iecm_vport *vport)
 {
-	/* stub */
+	enum iecm_status err = 0;
+	int i, j, k = 0;
+
+	vport->txqs = kcalloc(vport->num_txq, sizeof(struct iecm_queue *),
+			      GFP_KERNEL);
+
+	if (!vport->txqs) {
+		err = IECM_ERR_NO_MEMORY;
+		goto err_alloc;
+	}
+	for (i = 0; i < vport->num_txq_grp; i++) {
+		struct iecm_txq_group *tx_grp = &vport->txq_grps[i];
+
+		for (j = 0; j < tx_grp->num_txq; j++, k++) {
+			vport->txqs[k] = &tx_grp->txqs[j];
+			vport->txqs[k]->idx = k;
+		}
+	}
+err_alloc:
+	return err;
 }
 
 /**
@@ -290,7 +643,12 @@ iecm_vport_init_fast_path_txqs(struct iecm_vport *vport)
 void iecm_vport_init_num_qs(struct iecm_vport *vport,
 			    struct virtchnl_create_vport *vport_msg)
 {
-	/* stub */
+	vport->num_txq = vport_msg->num_tx_q;
+	vport->num_rxq = vport_msg->num_rx_q;
+	if (iecm_is_queue_model_split(vport->txq_model))
+		vport->num_complq = vport_msg->num_tx_complq;
+	if (iecm_is_queue_model_split(vport->rxq_model))
+		vport->num_bufq = vport_msg->num_rx_bufq;
 }
 
 /**
@@ -299,7 +657,32 @@ void iecm_vport_init_num_qs(struct iecm_vport *vport,
  */
 void iecm_vport_calc_num_q_desc(struct iecm_vport *vport)
 {
-	/* stub */
+	int num_req_txq_desc = vport->adapter->config_data.num_req_txq_desc;
+	int num_req_rxq_desc = vport->adapter->config_data.num_req_rxq_desc;
+
+	vport->complq_desc_count = 0;
+	vport->bufq_desc_count = 0;
+	if (num_req_txq_desc) {
+		vport->txq_desc_count = num_req_txq_desc;
+		if (iecm_is_queue_model_split(vport->txq_model))
+			vport->complq_desc_count = num_req_txq_desc;
+	} else {
+		vport->txq_desc_count =
+			IECM_DFLT_TX_Q_DESC_COUNT;
+		if (iecm_is_queue_model_split(vport->txq_model)) {
+			vport->complq_desc_count =
+				IECM_DFLT_TX_COMPLQ_DESC_COUNT;
+		}
+	}
+	if (num_req_rxq_desc) {
+		vport->rxq_desc_count = num_req_rxq_desc;
+		if (iecm_is_queue_model_split(vport->rxq_model))
+			vport->bufq_desc_count = num_req_rxq_desc;
+	} else {
+		vport->rxq_desc_count = IECM_DFLT_RX_Q_DESC_COUNT;
+		if (iecm_is_queue_model_split(vport->rxq_model))
+			vport->bufq_desc_count = IECM_DFLT_RX_BUFQ_DESC_COUNT;
+	}
 }
 EXPORT_SYMBOL(iecm_vport_calc_num_q_desc);
 
@@ -311,7 +694,51 @@ EXPORT_SYMBOL(iecm_vport_calc_num_q_desc);
 void iecm_vport_calc_total_qs(struct virtchnl_create_vport *vport_msg,
 			      int num_req_qs)
 {
-	/* stub */
+	int dflt_splitq_txq_grps, dflt_singleq_txqs;
+	int dflt_splitq_rxq_grps, dflt_singleq_rxqs;
+	int num_txq_grps, num_rxq_grps;
+	int num_cpus;
+
+	/* Restrict num of queues to cpus online as a default configuration to
+	 * give best performance. User can always override to a max number
+	 * of queues via ethtool.
+	 */
+	num_cpus = num_online_cpus();
+	dflt_splitq_txq_grps = min_t(int, IECM_DFLT_SPLITQ_TX_Q_GROUPS,
+				     num_cpus);
+	dflt_singleq_txqs = min_t(int, IECM_DFLT_SINGLEQ_TXQ_PER_GROUP,
+				  num_cpus);
+	dflt_splitq_rxq_grps = min_t(int, IECM_DFLT_SPLITQ_RX_Q_GROUPS,
+				     num_cpus);
+	dflt_singleq_rxqs = min_t(int, IECM_DFLT_SINGLEQ_RXQ_PER_GROUP,
+				  num_cpus);
+
+	if (iecm_is_queue_model_split(vport_msg->txq_model)) {
+		num_txq_grps = num_req_qs ? num_req_qs : dflt_splitq_txq_grps;
+		vport_msg->num_tx_complq = num_txq_grps *
+			IECM_COMPLQ_PER_GROUP;
+		vport_msg->num_tx_q = num_txq_grps *
+				      IECM_DFLT_SPLITQ_TXQ_PER_GROUP;
+	} else {
+		num_txq_grps = IECM_DFLT_SINGLEQ_TX_Q_GROUPS;
+		vport_msg->num_tx_q = num_txq_grps *
+				      (num_req_qs ? num_req_qs :
+				       dflt_singleq_txqs);
+		vport_msg->num_tx_complq = 0;
+	}
+	if (iecm_is_queue_model_split(vport_msg->rxq_model)) {
+		num_rxq_grps = num_req_qs ? num_req_qs : dflt_splitq_rxq_grps;
+		vport_msg->num_rx_bufq = num_rxq_grps *
+					 IECM_BUFQS_PER_RXQ_SET;
+		vport_msg->num_rx_q = num_rxq_grps *
+				      IECM_DFLT_SPLITQ_RXQ_PER_GROUP;
+	} else {
+		num_rxq_grps = IECM_DFLT_SINGLEQ_RX_Q_GROUPS;
+		vport_msg->num_rx_bufq = 0;
+		vport_msg->num_rx_q = num_rxq_grps *
+				      (num_req_qs ? num_req_qs :
+				       dflt_singleq_rxqs);
+	}
 }
 
 /**
@@ -320,7 +747,15 @@ void iecm_vport_calc_total_qs(struct virtchnl_create_vport *vport_msg,
  */
 void iecm_vport_calc_num_q_groups(struct iecm_vport *vport)
 {
-	/* stub */
+	if (iecm_is_queue_model_split(vport->txq_model))
+		vport->num_txq_grp = vport->num_txq;
+	else
+		vport->num_txq_grp = IECM_DFLT_SINGLEQ_TX_Q_GROUPS;
+
+	if (iecm_is_queue_model_split(vport->rxq_model))
+		vport->num_rxq_grp = vport->num_rxq;
+	else
+		vport->num_rxq_grp = IECM_DFLT_SINGLEQ_RX_Q_GROUPS;
 }
 EXPORT_SYMBOL(iecm_vport_calc_num_q_groups);
 
@@ -333,7 +768,15 @@ EXPORT_SYMBOL(iecm_vport_calc_num_q_groups);
 static void iecm_vport_calc_numq_per_grp(struct iecm_vport *vport,
 					 int *num_txq, int *num_rxq)
 {
-	/* stub */
+	if (iecm_is_queue_model_split(vport->txq_model))
+		*num_txq = IECM_DFLT_SPLITQ_TXQ_PER_GROUP;
+	else
+		*num_txq = vport->num_txq;
+
+	if (iecm_is_queue_model_split(vport->rxq_model))
+		*num_rxq = IECM_DFLT_SPLITQ_RXQ_PER_GROUP;
+	else
+		*num_rxq = vport->num_rxq;
 }
 
 /**
@@ -344,7 +787,10 @@ static void iecm_vport_calc_numq_per_grp(struct iecm_vport *vport,
  */
 void iecm_vport_calc_num_q_vec(struct iecm_vport *vport)
 {
-	/* stub */
+	if (iecm_is_queue_model_split(vport->txq_model))
+		vport->num_q_vectors = vport->num_txq_grp;
+	else
+		vport->num_q_vectors = vport->num_txq;
 }
 
 /**
@@ -355,7 +801,68 @@ void iecm_vport_calc_num_q_vec(struct iecm_vport *vport)
 static enum iecm_status iecm_txq_group_alloc(struct iecm_vport *vport,
 					     int num_txq)
 {
-	/* stub */
+	struct iecm_itr tx_itr = { 0 };
+	enum iecm_status err = 0;
+	int i;
+
+	vport->txq_grps = kcalloc(vport->num_txq_grp,
+				  sizeof(*vport->txq_grps), GFP_KERNEL);
+	if (!vport->txq_grps)
+		return IECM_ERR_NO_MEMORY;
+
+	tx_itr.target_itr = IECM_ITR_TX_DEF;
+	tx_itr.itr_idx = VIRTCHNL_ITR_IDX_1;
+	tx_itr.next_update = jiffies + 1;
+
+	for (i = 0; i < vport->num_txq_grp; i++) {
+		struct iecm_txq_group *tx_qgrp = &vport->txq_grps[i];
+		int j;
+
+		tx_qgrp->vport = vport;
+		tx_qgrp->num_txq = num_txq;
+		tx_qgrp->txqs = kcalloc(num_txq, sizeof(*tx_qgrp->txqs),
+					GFP_KERNEL);
+		if (!tx_qgrp->txqs) {
+			err = IECM_ERR_NO_MEMORY;
+			goto err_alloc;
+		}
+
+		for (j = 0; j < tx_qgrp->num_txq; j++) {
+			struct iecm_queue *q = &tx_qgrp->txqs[j];
+
+			q->dev = &vport->adapter->pdev->dev;
+			q->desc_count = vport->txq_desc_count;
+			q->vport = vport;
+			q->txq_grp = tx_qgrp;
+			hash_init(q->sched_buf_hash);
+
+			if (!iecm_is_queue_model_split(vport->txq_model))
+				q->itr = tx_itr;
+		}
+
+		if (!iecm_is_queue_model_split(vport->txq_model))
+			continue;
+
+		tx_qgrp->complq = kcalloc(IECM_COMPLQ_PER_GROUP,
+					  sizeof(*tx_qgrp->complq),
+					  GFP_KERNEL);
+		if (!tx_qgrp->complq) {
+			err = IECM_ERR_NO_MEMORY;
+			goto err_alloc;
+		}
+
+		tx_qgrp->complq->dev = &vport->adapter->pdev->dev;
+		tx_qgrp->complq->desc_count = vport->complq_desc_count;
+		tx_qgrp->complq->vport = vport;
+		tx_qgrp->complq->txq_grp = tx_qgrp;
+
+		tx_qgrp->complq->itr = tx_itr;
+	}
+
+err_alloc:
+	if (err)
+		iecm_txq_group_rel(vport);
+	return err;
 }
 
 /**
@@ -366,7 +873,118 @@ static enum iecm_status iecm_txq_group_alloc(struct iecm_vport *vport,
 static enum iecm_status iecm_rxq_group_alloc(struct iecm_vport *vport,
 					     int num_rxq)
 {
-	/* stub */
+	enum iecm_status err = 0;
+	struct iecm_itr rx_itr = {0};
+	struct iecm_queue *q;
+	int i;
+
+	vport->rxq_grps = kcalloc(vport->num_rxq_grp,
+				  sizeof(struct iecm_rxq_group), GFP_KERNEL);
+	if (!vport->rxq_grps) {
+		err = IECM_ERR_NO_MEMORY;
+		goto err_alloc;
+	}
+
+	rx_itr.target_itr = IECM_ITR_RX_DEF;
+	rx_itr.itr_idx = VIRTCHNL_ITR_IDX_0;
+	rx_itr.next_update = jiffies + 1;
+
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		struct iecm_rxq_group *rx_qgrp = &vport->rxq_grps[i];
+		int j;
+
+		rx_qgrp->vport = vport;
+		if (iecm_is_queue_model_split(vport->rxq_model)) {
+			rx_qgrp->splitq.num_rxq_sets = num_rxq;
+			rx_qgrp->splitq.rxq_sets =
+				kcalloc(num_rxq,
+					sizeof(struct iecm_rxq_set),
+					GFP_KERNEL);
+			if (!rx_qgrp->splitq.rxq_sets) {
+				err = IECM_ERR_NO_MEMORY;
+				goto err_alloc;
+			}
+
+			rx_qgrp->splitq.bufq_sets =
+				kcalloc(IECM_BUFQS_PER_RXQ_SET,
+					sizeof(struct iecm_bufq_set),
+					GFP_KERNEL);
+			if (!rx_qgrp->splitq.bufq_sets) {
+				err = IECM_ERR_NO_MEMORY;
+				goto err_alloc;
+			}
+
+			for (j = 0; j < IECM_BUFQS_PER_RXQ_SET; j++) {
+				int swq_size = sizeof(struct iecm_sw_queue);
+
+				q = &rx_qgrp->splitq.bufq_sets[j].bufq;
+				q->dev = &vport->adapter->pdev->dev;
+				q->desc_count = vport->bufq_desc_count;
+				q->vport = vport;
+				q->rxq_grp = rx_qgrp;
+				q->idx = j;
+				q->rx_buf_size = IECM_RX_BUF_2048;
+				q->rsc_low_watermark = IECM_LOW_WATERMARK;
+				q->rx_buf_stride = IECM_RX_BUF_STRIDE;
+				q->itr = rx_itr;
+
+				if (vport->rx_hsplit_en) {
+					q->rx_hsplit_en = vport->rx_hsplit_en;
+					q->rx_hbuf_size = IECM_HDR_BUF_SIZE;
+				}
+
+				rx_qgrp->splitq.bufq_sets[j].num_refillqs =
+					num_rxq;
+				rx_qgrp->splitq.bufq_sets[j].refillqs =
+					kcalloc(num_rxq, swq_size, GFP_KERNEL);
+				if (!rx_qgrp->splitq.bufq_sets[j].refillqs) {
+					err = IECM_ERR_NO_MEMORY;
+					goto err_alloc;
+				}
+			}
+		} else {
+			rx_qgrp->singleq.num_rxq = num_rxq;
+			rx_qgrp->singleq.rxqs = kcalloc(num_rxq,
+							sizeof(struct iecm_queue),
+							GFP_KERNEL);
+			if (!rx_qgrp->singleq.rxqs)  {
+				err = IECM_ERR_NO_MEMORY;
+				goto err_alloc;
+			}
+		}
+
+		for (j = 0; j < num_rxq; j++) {
+			if (iecm_is_queue_model_split(vport->rxq_model)) {
+				q = &rx_qgrp->splitq.rxq_sets[j].rxq;
+				rx_qgrp->splitq.rxq_sets[j].refillq0 =
+				      &rx_qgrp->splitq.bufq_sets[0].refillqs[j];
+				rx_qgrp->splitq.rxq_sets[j].refillq1 =
+				      &rx_qgrp->splitq.bufq_sets[1].refillqs[j];
+
+				if (vport->rx_hsplit_en) {
+					q->rx_hsplit_en = vport->rx_hsplit_en;
+					q->rx_hbuf_size = IECM_HDR_BUF_SIZE;
+				}
+
+			} else {
+				q = &rx_qgrp->singleq.rxqs[j];
+			}
+			q->dev = &vport->adapter->pdev->dev;
+			q->desc_count = vport->rxq_desc_count;
+			q->vport = vport;
+			q->rxq_grp = rx_qgrp;
+			q->idx = (i * num_rxq) + j;
+			q->rx_buf_size = IECM_RX_BUF_2048;
+			q->rsc_low_watermark = IECM_LOW_WATERMARK;
+			q->rx_max_pkt_size = vport->netdev->mtu +
+					     IECM_PACKET_HDR_PAD;
+			q->itr = rx_itr;
+		}
+	}
+err_alloc:
+	if (err)
+		iecm_rxq_group_rel(vport);
+	return err;
 }
 
 /**
@@ -376,7 +994,20 @@ static enum iecm_status iecm_rxq_group_alloc(struct iecm_vport *vport,
 static enum iecm_status
 iecm_vport_queue_grp_alloc_all(struct iecm_vport *vport)
 {
-	/* stub */
+	int num_txq, num_rxq;
+	enum iecm_status err;
+
+	iecm_vport_calc_numq_per_grp(vport, &num_txq, &num_rxq);
+
+	err = iecm_txq_group_alloc(vport, num_txq);
+	if (err)
+		goto err_out;
+
+	err = iecm_rxq_group_alloc(vport, num_rxq);
+err_out:
+	if (err)
+		iecm_vport_queue_grp_rel_all(vport);
+	return err;
 }
 
 /**
@@ -387,7 +1018,35 @@ iecm_vport_queue_grp_alloc_all(struct iecm_vport *vport)
  */
 enum iecm_status iecm_vport_queues_alloc(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	enum iecm_status err;
+
+	err = iecm_vport_queue_grp_alloc_all(vport);
+	if (err)
+		goto err_out;
+
+	err = adapter->dev_ops.vc_ops.vport_queue_ids_init(vport);
+	if (err)
+		goto err_out;
+
+	adapter->dev_ops.reg_ops.vportq_reg_init(vport);
+
+	err = iecm_tx_desc_alloc_all(vport);
+	if (err)
+		goto err_out;
+
+	err = iecm_rx_desc_alloc_all(vport);
+	if (err)
+		goto err_out;
+
+	err = iecm_vport_init_fast_path_txqs(vport);
+	if (err)
+		goto err_out;
+
+	return 0;
+err_out:
+	iecm_vport_queues_rel(vport);
+	return err;
 }
 
 /**
@@ -1786,7 +2445,16 @@ EXPORT_SYMBOL(iecm_vport_calc_num_q_vec);
  */
 int iecm_config_rss(struct iecm_vport *vport)
 {
-	/* stub */
+	int err = iecm_send_get_set_rss_key_msg(vport, false);
+
+	if (!err)
+		err = vport->adapter->dev_ops.vc_ops.get_set_rss_lut(vport,
+								     false);
+	if (!err)
+		err = vport->adapter->dev_ops.vc_ops.get_set_rss_hash(vport,
+								      false);
+
+	return err;
 }
 
 /**
@@ -1797,7 +2465,20 @@ int iecm_config_rss(struct iecm_vport *vport)
  */
 void iecm_get_rx_qid_list(struct iecm_vport *vport, u16 *qid_list)
 {
-	/* stub */
+	int i, j, k = 0;
+
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		struct iecm_rxq_group *rx_qgrp = &vport->rxq_grps[i];
+
+		if (iecm_is_queue_model_split(vport->rxq_model)) {
+			for (j = 0; j < rx_qgrp->splitq.num_rxq_sets; j++)
+				qid_list[k++] =
+					rx_qgrp->splitq.rxq_sets[j].rxq.q_id;
+		} else {
+			for (j = 0; j < rx_qgrp->singleq.num_rxq; j++)
+				qid_list[k++] = rx_qgrp->singleq.rxqs[j].q_id;
+		}
+	}
 }
 
 /**
@@ -1809,7 +2490,13 @@ void iecm_get_rx_qid_list(struct iecm_vport *vport, u16 *qid_list)
  */
 void iecm_fill_dflt_rss_lut(struct iecm_vport *vport, u16 *qid_list)
 {
-	/* stub */
+	int num_lut_segs, lut_seg, i, k = 0;
+
+	num_lut_segs = vport->adapter->rss_data.rss_lut_size / vport->num_rxq;
+	for (lut_seg = 0; lut_seg < num_lut_segs; lut_seg++) {
+		for (i = 0; i < vport->num_rxq; i++)
+			vport->adapter->rss_data.rss_lut[k++] = qid_list[i];
+	}
 }
 
 /**
@@ -1820,7 +2507,67 @@ void iecm_fill_dflt_rss_lut(struct iecm_vport *vport, u16 *qid_list)
  */
 int iecm_init_rss(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	u16 *qid_list;
+	int err;
+
+	adapter->rss_data.rss_key = kzalloc(adapter->rss_data.rss_key_size,
+					    GFP_KERNEL);
+	if (!adapter->rss_data.rss_key)
+		return IECM_ERR_NO_MEMORY;
+	adapter->rss_data.rss_lut = kzalloc(adapter->rss_data.rss_lut_size,
+					    GFP_KERNEL);
+	if (!adapter->rss_data.rss_lut) {
+		kfree(adapter->rss_data.rss_key);
+		adapter->rss_data.rss_key = NULL;
+		return IECM_ERR_NO_MEMORY;
+	}
+
+	/* Initialize default rss key */
+	netdev_rss_key_fill((void *)adapter->rss_data.rss_key,
+			    adapter->rss_data.rss_key_size);
+
+	/* Initialize default rss lut */
+	if (adapter->rss_data.rss_lut_size % vport->num_rxq) {
+		u16 dflt_qid;
+		int i;
+
+		/* Set all entries to a default RX queue if the algorithm below
+		 * won't fill all entries
+		 */
+		if (iecm_is_queue_model_split(vport->rxq_model))
+			dflt_qid =
+				vport->rxq_grps[0].splitq.rxq_sets[0].rxq.q_id;
+		else
+			dflt_qid =
+				vport->rxq_grps[0].singleq.rxqs[0].q_id;
+
+		for (i = 0; i < adapter->rss_data.rss_lut_size; i++)
+			adapter->rss_data.rss_lut[i] = dflt_qid;
+	}
+
+	qid_list = kcalloc(vport->num_rxq, sizeof(u16), GFP_KERNEL);
+	if (!qid_list) {
+		kfree(adapter->rss_data.rss_lut);
+		adapter->rss_data.rss_lut = NULL;
+		kfree(adapter->rss_data.rss_key);
+		adapter->rss_data.rss_key = NULL;
+		return IECM_ERR_NO_MEMORY;
+	}
+
+	iecm_get_rx_qid_list(vport, qid_list);
+
+	/* Fill the default RSS LUT values */
+	iecm_fill_dflt_rss_lut(vport, qid_list);
+
+	kfree(qid_list);
+
+	/* Initialize default RSS hash */
+	adapter->rss_data.rss_hash = IECM_DEFAULT_RSS_HASH_EXPANDED;
+
+	err = iecm_config_rss(vport);
+
+	return err;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
index b1775cc38924..d56f8126521a 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
@@ -664,7 +664,20 @@ iecm_send_destroy_vport_msg(struct iecm_vport *vport)
 enum iecm_status
 iecm_send_enable_vport_msg(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	struct virtchnl_vport v_id;
+	enum iecm_status err;
+
+	v_id.vport_id = vport->vport_id;
+
+	err = iecm_send_mb_msg(adapter, VIRTCHNL_OP_ENABLE_VPORT,
+			       sizeof(v_id), (u8 *)&v_id);
+
+	if (!err)
+		err = iecm_wait_for_event(adapter, IECM_VC_ENA_VPORT,
+					  IECM_VC_ENA_VPORT_ERR);
+
+	return err;
 }
 
 /**
@@ -1858,7 +1871,27 @@ EXPORT_SYMBOL(iecm_vc_core_init);
  */
 static void iecm_vport_init(struct iecm_vport *vport, int vport_id)
 {
-	/* stub */
+	struct virtchnl_create_vport *vport_msg;
+
+	vport_msg = (struct virtchnl_create_vport *)
+				vport->adapter->vport_params_recvd[0];
+	vport->txq_model = vport_msg->txq_model;
+	vport->rxq_model = vport_msg->rxq_model;
+	vport->vport_type = (u16)vport_msg->vport_type;
+	vport->vport_id = vport_msg->vport_id;
+	vport->adapter->rss_data.rss_key_size = min_t(u16, NETDEV_RSS_KEY_LEN,
+						      vport_msg->rss_key_size);
+	vport->adapter->rss_data.rss_lut_size = vport_msg->rss_lut_size;
+	ether_addr_copy(vport->default_mac_addr, vport_msg->default_mac_addr);
+	vport->max_mtu = IECM_MAX_MTU;
+
+	iecm_vport_set_hsplit(vport, NULL);
+
+	init_waitqueue_head(&vport->sw_marker_wq);
+	iecm_vport_init_num_qs(vport, vport_msg);
+	iecm_vport_calc_num_q_desc(vport);
+	iecm_vport_calc_num_q_groups(vport);
+	iecm_vport_calc_num_q_vec(vport);
 }
 
 /**
-- 
2.26.2



* [net-next 03/15] iecm: Add TX/RX header files
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

Introduce the register offsets, descriptor layouts, and related data
definitions for the TX/RX paths, for use by the common module.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
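Illustration (not part of the commit): a minimal userspace sketch of how the
_S/_M pairs for the splitq TX completion descriptor could be used to decode
the qid_comptype_gen word. BIT_ULL() and MAKEMASK() below are stand-in
definitions (MAKEMASK is assumed to be a plain shift), the ex_* helper names
are hypothetical, and the value is assumed to be already converted from
__le16 to CPU byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-ins for the kernel's BIT_ULL() and the driver's
 * MAKEMASK(); the real definitions live outside this patch.
 */
#define BIT_ULL(n)		(1ULL << (n))
#define MAKEMASK(m, s)		((m) << (s))

/* Copied from iecm_lan_txrx.h in this patch */
#define IECM_TXD_COMPLQ_GEN_S		15
#define IECM_TXD_COMPLQ_GEN_M		BIT_ULL(IECM_TXD_COMPLQ_GEN_S)
#define IECM_TXD_COMPLQ_COMPL_TYPE_S	11
#define IECM_TXD_COMPLQ_COMPL_TYPE_M	\
	MAKEMASK(0x7UL, IECM_TXD_COMPLQ_COMPL_TYPE_S)
#define IECM_TXD_COMPLQ_QID_S		0
#define IECM_TXD_COMPLQ_QID_M		MAKEMASK(0x3FFUL, IECM_TXD_COMPLQ_QID_S)
#define IECM_TXD_COMPLT_RS		2

/* Hypothetical helper: queue id field of the completion word */
static inline unsigned int ex_compl_qid(uint16_t val)
{
	return (val & IECM_TXD_COMPLQ_QID_M) >> IECM_TXD_COMPLQ_QID_S;
}

/* Hypothetical helper: completion type field (IECM_TXD_COMPLT_*) */
static inline unsigned int ex_compl_type(uint16_t val)
{
	return (val & IECM_TXD_COMPLQ_COMPL_TYPE_M) >>
	       IECM_TXD_COMPLQ_COMPL_TYPE_S;
}

/* Hypothetical helper: generation bit */
static inline unsigned int ex_compl_gen(uint16_t val)
{
	return (val & IECM_TXD_COMPLQ_GEN_M) >> IECM_TXD_COMPLQ_GEN_S;
}

/* Hypothetical helper: build a completion word, e.g. for a unit test */
static inline uint16_t ex_compl_pack(unsigned int qid, unsigned int type,
				     unsigned int gen)
{
	assert(qid <= 0x3FF && type <= 0x7 && gen <= 1);
	return (uint16_t)((qid << IECM_TXD_COMPLQ_QID_S) |
			  (type << IECM_TXD_COMPLQ_COMPL_TYPE_S) |
			  (gen << IECM_TXD_COMPLQ_GEN_S));
}
```

For example, ex_compl_pack(5, IECM_TXD_COMPLT_RS, 1) gives 0x9005, and the
three accessors split that back into qid 5, type 2 and gen 1.
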
 include/linux/net/intel/iecm_lan_pf_regs.h | 120 ++++
 include/linux/net/intel/iecm_lan_txrx.h    | 636 +++++++++++++++++++++
 include/linux/net/intel/iecm_txrx.h        | 610 ++++++++++++++++++++
 3 files changed, 1366 insertions(+)
 create mode 100644 include/linux/net/intel/iecm_lan_pf_regs.h
 create mode 100644 include/linux/net/intel/iecm_lan_txrx.h
 create mode 100644 include/linux/net/intel/iecm_txrx.h
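
Illustration (not part of the commit): a small userspace sketch of how the
per-queue and per-vector register macros in iecm_lan_pf_regs.h expand, and of
one plausible way to compose a PF_GLINT_DYN_CTL value. BIT() is a stand-in
definition, the ex_* helper names are hypothetical, and treating
PF_GLINT_ITR_MAX_INDEX as the upper bound for the ITR index field is an
assumption, not something this patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's BIT() */
#define BIT(n)				(1UL << (n))

/* Copied from iecm_lan_pf_regs.h in this patch */
#define PF_QRX_BASE			0x00000000
#define PF_QRX_TAIL(_QRX)		(PF_QRX_BASE + (((_QRX) * 0x1000)))
#define PF_GLINT_BASE			0x08900000
#define PF_GLINT_DYN_CTL(_INT)		(PF_GLINT_BASE + ((_INT) * 0x1000))
#define PF_GLINT_DYN_CTL_INTENA_S	0
#define PF_GLINT_DYN_CTL_INTENA_M	BIT(PF_GLINT_DYN_CTL_INTENA_S)
#define PF_GLINT_DYN_CTL_ITR_INDX_S	3
#define PF_GLINT_ITR_MAX_INDEX		2

/* Hypothetical helper: offset of the tail register of RX queue 'q' */
static inline uint32_t ex_rxq_tail_off(unsigned int q)
{
	return (uint32_t)PF_QRX_TAIL(q);
}

/* Hypothetical helper: offset of the DYN_CTL register of vector 'vec' */
static inline uint32_t ex_dyn_ctl_off(unsigned int vec)
{
	return (uint32_t)PF_GLINT_DYN_CTL(vec);
}

/* Hypothetical helper: DYN_CTL value enabling the interrupt and
 * selecting ITR index 'itr_idx'
 */
static inline uint32_t ex_dyn_ctl_ena_val(unsigned int itr_idx)
{
	assert(itr_idx <= PF_GLINT_ITR_MAX_INDEX);
	return (uint32_t)(PF_GLINT_DYN_CTL_INTENA_M |
			  (itr_idx << PF_GLINT_DYN_CTL_ITR_INDX_S));
}
```

Each queue or vector occupies its own 0x1000 stride, so RX queue 3 has its
tail at offset 0x3000 and vector 2 has its DYN_CTL at 0x08902000.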

diff --git a/include/linux/net/intel/iecm_lan_pf_regs.h b/include/linux/net/intel/iecm_lan_pf_regs.h
new file mode 100644
index 000000000000..6690a2645608
--- /dev/null
+++ b/include/linux/net/intel/iecm_lan_pf_regs.h
@@ -0,0 +1,120 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2020, Intel Corporation. */
+
+#ifndef _IECM_LAN_PF_REGS_H_
+#define _IECM_LAN_PF_REGS_H_
+
+/* Receive queues */
+#define PF_QRX_BASE			0x00000000
+#define PF_QRX_TAIL(_QRX)		(PF_QRX_BASE + (((_QRX) * 0x1000)))
+#define PF_QRX_BUFFQ_BASE		0x03000000
+#define PF_QRX_BUFFQ_TAIL(_QRX)		\
+	(PF_QRX_BUFFQ_BASE + (((_QRX) * 0x1000)))
+
+/* Transmit queues */
+#define PF_QTX_BASE			0x05000000
+#define PF_QTX_COMM_DBELL(_DBQM)	(PF_QTX_BASE + ((_DBQM) * 0x1000))
+
+/* Control(PF Mailbox) Queue */
+#define PF_FW_BASE			0x08400000
+
+#define PF_FW_ARQBAL			(PF_FW_BASE)
+#define PF_FW_ARQBAH			(PF_FW_BASE + 0x4)
+#define PF_FW_ARQLEN			(PF_FW_BASE + 0x8)
+#define PF_FW_ARQLEN_ARQLEN_S		0
+#define PF_FW_ARQLEN_ARQLEN_M		MAKEMASK(0x1FFF, PF_FW_ARQLEN_ARQLEN_S)
+#define PF_FW_ARQLEN_ARQVFE_S		28
+#define PF_FW_ARQLEN_ARQVFE_M		BIT(PF_FW_ARQLEN_ARQVFE_S)
+#define PF_FW_ARQLEN_ARQOVFL_S		29
+#define PF_FW_ARQLEN_ARQOVFL_M		BIT(PF_FW_ARQLEN_ARQOVFL_S)
+#define PF_FW_ARQLEN_ARQCRIT_S		30
+#define PF_FW_ARQLEN_ARQCRIT_M		BIT(PF_FW_ARQLEN_ARQCRIT_S)
+#define PF_FW_ARQLEN_ARQENABLE_S	31
+#define PF_FW_ARQLEN_ARQENABLE_M	BIT(PF_FW_ARQLEN_ARQENABLE_S)
+#define PF_FW_ARQH			(PF_FW_BASE + 0xC)
+#define PF_FW_ARQH_ARQH_S		0
+#define PF_FW_ARQH_ARQH_M		MAKEMASK(0x1FFF, PF_FW_ARQH_ARQH_S)
+#define PF_FW_ARQT			(PF_FW_BASE + 0x10)
+
+#define PF_FW_ATQBAL			(PF_FW_BASE + 0x14)
+#define PF_FW_ATQBAH			(PF_FW_BASE + 0x18)
+#define PF_FW_ATQLEN			(PF_FW_BASE + 0x1C)
+#define PF_FW_ATQLEN_ATQLEN_S		0
+#define PF_FW_ATQLEN_ATQLEN_M		MAKEMASK(0x3FF, PF_FW_ATQLEN_ATQLEN_S)
+#define PF_FW_ATQLEN_ATQVFE_S		28
+#define PF_FW_ATQLEN_ATQVFE_M		BIT(PF_FW_ATQLEN_ATQVFE_S)
+#define PF_FW_ATQLEN_ATQOVFL_S		29
+#define PF_FW_ATQLEN_ATQOVFL_M		BIT(PF_FW_ATQLEN_ATQOVFL_S)
+#define PF_FW_ATQLEN_ATQCRIT_S		30
+#define PF_FW_ATQLEN_ATQCRIT_M		BIT(PF_FW_ATQLEN_ATQCRIT_S)
+#define PF_FW_ATQLEN_ATQENABLE_S	31
+#define PF_FW_ATQLEN_ATQENABLE_M	BIT(PF_FW_ATQLEN_ATQENABLE_S)
+#define PF_FW_ATQH			(PF_FW_BASE + 0x20)
+#define PF_FW_ATQH_ATQH_S		0
+#define PF_FW_ATQH_ATQH_M		MAKEMASK(0x3FF, PF_FW_ATQH_ATQH_S)
+#define PF_FW_ATQT			(PF_FW_BASE + 0x24)
+
+/* Interrupts */
+#define PF_GLINT_BASE			0x08900000
+#define PF_GLINT_DYN_CTL(_INT)		(PF_GLINT_BASE + ((_INT) * 0x1000))
+#define PF_GLINT_DYN_CTL_INTENA_S	0
+#define PF_GLINT_DYN_CTL_INTENA_M	BIT(PF_GLINT_DYN_CTL_INTENA_S)
+#define PF_GLINT_DYN_CTL_CLEARPBA_S	1
+#define PF_GLINT_DYN_CTL_CLEARPBA_M	BIT(PF_GLINT_DYN_CTL_CLEARPBA_S)
+#define PF_GLINT_DYN_CTL_SWINT_TRIG_S	2
+#define PF_GLINT_DYN_CTL_SWINT_TRIG_M	BIT(PF_GLINT_DYN_CTL_SWINT_TRIG_S)
+#define PF_GLINT_DYN_CTL_ITR_INDX_S	3
+#define PF_GLINT_DYN_CTL_INTERVAL_S	5
+#define PF_GLINT_DYN_CTL_INTERVAL_M	BIT(PF_GLINT_DYN_CTL_INTERVAL_S)
+#define PF_GLINT_DYN_CTL_SW_ITR_INDX_ENA_S	24
+#define PF_GLINT_DYN_CTL_SW_ITR_INDX_ENA_M \
+	BIT(PF_GLINT_DYN_CTL_SW_ITR_INDX_ENA_S)
+#define PF_GLINT_DYN_CTL_SW_ITR_INDX_S	25
+#define PF_GLINT_DYN_CTL_SW_ITR_INDX_M	BIT(PF_GLINT_DYN_CTL_SW_ITR_INDX_S)
+#define PF_GLINT_DYN_CTL_INTENA_MSK_S	31
+#define PF_GLINT_DYN_CTL_INTENA_MSK_M	BIT(PF_GLINT_DYN_CTL_INTENA_MSK_S)
+#define PF_GLINT_ITR(_i, _INT) (PF_GLINT_BASE + (((_INT) + \
+			       (((_i) + 1) * 4)) * 0x1000))
+#define PF_GLINT_ITR_MAX_INDEX		2
+#define PF_GLINT_ITR_INTERVAL_S		0
+#define PF_GLINT_ITR_INTERVAL_M		MAKEMASK(0xFFF, PF_GLINT_ITR_INTERVAL_S)
+
+/* Generic registers */
+#define PF_INT_DIR_OICR_ENA		0x08406000
+#define PF_INT_DIR_OICR_ENA_S		0
+#define PF_INT_DIR_OICR_ENA_M	MAKEMASK(0xFFFFFFFF, PF_INT_DIR_OICR_ENA_S)
+#define PF_INT_DIR_OICR			0x08406004
+#define PF_INT_DIR_OICR_TSYN_EVNT	0
+#define PF_INT_DIR_OICR_PHY_TS_0	BIT(1)
+#define PF_INT_DIR_OICR_PHY_TS_1	BIT(2)
+#define PF_INT_DIR_OICR_CAUSE		0x08406008
+#define PF_INT_DIR_OICR_CAUSE_CAUSE_S	0
+#define PF_INT_DIR_OICR_CAUSE_CAUSE_M	\
+	MAKEMASK(0xFFFFFFFF, PF_INT_DIR_OICR_CAUSE_CAUSE_S)
+#define PF_INT_PBA_CLEAR		0x0840600C
+
+#define PF_FUNC_RID			0x08406010
+#define PF_FUNC_RID_FUNCTION_NUMBER_S	0
+#define PF_FUNC_RID_FUNCTION_NUMBER_M	\
+	MAKEMASK(0x7, PF_FUNC_RID_FUNCTION_NUMBER_S)
+#define PF_FUNC_RID_DEVICE_NUMBER_S	3
+#define PF_FUNC_RID_DEVICE_NUMBER_M	\
+	MAKEMASK(0x1F, PF_FUNC_RID_DEVICE_NUMBER_S)
+#define PF_FUNC_RID_BUS_NUMBER_S	8
+#define PF_FUNC_RID_BUS_NUMBER_M	MAKEMASK(0xFF, PF_FUNC_RID_BUS_NUMBER_S)
+
+/* Reset registers */
+#define PFGEN_RTRIG			0x08407000
+#define PFGEN_RTRIG_CORER_S		0
+#define PFGEN_RTRIG_CORER_M		BIT(0)
+#define PFGEN_RTRIG_LINKR_S		1
+#define PFGEN_RTRIG_LINKR_M		BIT(1)
+#define PFGEN_RTRIG_IMCR_S		2
+#define PFGEN_RTRIG_IMCR_M		BIT(2)
+#define PFGEN_RSTAT			0x08407008 /* PFR Status */
+#define PFGEN_RSTAT_PFR_STATE_S		0
+#define PFGEN_RSTAT_PFR_STATE_M		MAKEMASK(0x3, PFGEN_RSTAT_PFR_STATE_S)
+#define PFGEN_CTRL			0x0840700C
+#define PFGEN_CTRL_PFSWR		BIT(0)
+
+#endif
diff --git a/include/linux/net/intel/iecm_lan_txrx.h b/include/linux/net/intel/iecm_lan_txrx.h
new file mode 100644
index 000000000000..6dc210925c7a
--- /dev/null
+++ b/include/linux/net/intel/iecm_lan_txrx.h
@@ -0,0 +1,636 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2020, Intel Corporation. */
+
+#ifndef _IECM_LAN_TXRX_H_
+#define _IECM_LAN_TXRX_H_
+#include <linux/net/intel/iecm_osdep.h>
+
+enum iecm_rss_hash {
+	/* Values 0 - 28 are reserved for future use */
+	IECM_HASH_INVALID		= 0,
+	IECM_HASH_NONF_UNICAST_IPV4_UDP	= 29,
+	IECM_HASH_NONF_MULTICAST_IPV4_UDP,
+	IECM_HASH_NONF_IPV4_UDP,
+	IECM_HASH_NONF_IPV4_TCP_SYN_NO_ACK,
+	IECM_HASH_NONF_IPV4_TCP,
+	IECM_HASH_NONF_IPV4_SCTP,
+	IECM_HASH_NONF_IPV4_OTHER,
+	IECM_HASH_FRAG_IPV4,
+	/* Values 37-38 are reserved */
+	IECM_HASH_NONF_UNICAST_IPV6_UDP	= 39,
+	IECM_HASH_NONF_MULTICAST_IPV6_UDP,
+	IECM_HASH_NONF_IPV6_UDP,
+	IECM_HASH_NONF_IPV6_TCP_SYN_NO_ACK,
+	IECM_HASH_NONF_IPV6_TCP,
+	IECM_HASH_NONF_IPV6_SCTP,
+	IECM_HASH_NONF_IPV6_OTHER,
+	IECM_HASH_FRAG_IPV6,
+	IECM_HASH_NONF_RSVD47,
+	IECM_HASH_NONF_FCOE_OX,
+	IECM_HASH_NONF_FCOE_RX,
+	IECM_HASH_NONF_FCOE_OTHER,
+	/* Values 51-62 are reserved */
+	IECM_HASH_L2_PAYLOAD		= 63,
+	IECM_HASH_MAX
+};
+
+/* Supported RSS offloads */
+#define IECM_DEFAULT_RSS_HASH ( \
+	BIT_ULL(IECM_HASH_NONF_IPV4_UDP) | \
+	BIT_ULL(IECM_HASH_NONF_IPV4_SCTP) | \
+	BIT_ULL(IECM_HASH_NONF_IPV4_TCP) | \
+	BIT_ULL(IECM_HASH_NONF_IPV4_OTHER) | \
+	BIT_ULL(IECM_HASH_FRAG_IPV4) | \
+	BIT_ULL(IECM_HASH_NONF_IPV6_UDP) | \
+	BIT_ULL(IECM_HASH_NONF_IPV6_TCP) | \
+	BIT_ULL(IECM_HASH_NONF_IPV6_SCTP) | \
+	BIT_ULL(IECM_HASH_NONF_IPV6_OTHER) | \
+	BIT_ULL(IECM_HASH_FRAG_IPV6) | \
+	BIT_ULL(IECM_HASH_L2_PAYLOAD))
+
+#define IECM_DEFAULT_RSS_HASH_EXPANDED (IECM_DEFAULT_RSS_HASH | \
+	BIT_ULL(IECM_HASH_NONF_IPV4_TCP_SYN_NO_ACK) | \
+	BIT_ULL(IECM_HASH_NONF_UNICAST_IPV4_UDP) | \
+	BIT_ULL(IECM_HASH_NONF_MULTICAST_IPV4_UDP) | \
+	BIT_ULL(IECM_HASH_NONF_IPV6_TCP_SYN_NO_ACK) | \
+	BIT_ULL(IECM_HASH_NONF_UNICAST_IPV6_UDP) | \
+	BIT_ULL(IECM_HASH_NONF_MULTICAST_IPV6_UDP))
+
+/* For iecm_splitq_base_tx_compl_desc */
+#define IECM_TXD_COMPLQ_GEN_S	15
+#define IECM_TXD_COMPLQ_GEN_M		BIT_ULL(IECM_TXD_COMPLQ_GEN_S)
+#define IECM_TXD_COMPLQ_COMPL_TYPE_S	11
+#define IECM_TXD_COMPLQ_COMPL_TYPE_M	\
+	MAKEMASK(0x7UL, IECM_TXD_COMPLQ_COMPL_TYPE_S)
+#define IECM_TXD_COMPLQ_QID_S	0
+#define IECM_TXD_COMPLQ_QID_M		MAKEMASK(0x3FFUL, IECM_TXD_COMPLQ_QID_S)
+
+/* For base mode TX descriptors */
+#define IECM_TXD_CTX_QW1_MSS_S		50
+#define IECM_TXD_CTX_QW1_MSS_M		\
+	MAKEMASK(0x3FFFULL, IECM_TXD_CTX_QW1_MSS_S)
+#define IECM_TXD_CTX_QW1_TSO_LEN_S	30
+#define IECM_TXD_CTX_QW1_TSO_LEN_M	\
+	MAKEMASK(0x3FFFFULL, IECM_TXD_CTX_QW1_TSO_LEN_S)
+#define IECM_TXD_CTX_QW1_CMD_S		4
+#define IECM_TXD_CTX_QW1_CMD_M		\
+	MAKEMASK(0xFFFUL, IECM_TXD_CTX_QW1_CMD_S)
+#define IECM_TXD_CTX_QW1_DTYPE_S	0
+#define IECM_TXD_CTX_QW1_DTYPE_M	\
+	MAKEMASK(0xFUL, IECM_TXD_CTX_QW1_DTYPE_S)
+#define IECM_TXD_QW1_L2TAG1_S		48
+#define IECM_TXD_QW1_L2TAG1_M		\
+	MAKEMASK(0xFFFFULL, IECM_TXD_QW1_L2TAG1_S)
+#define IECM_TXD_QW1_TX_BUF_SZ_S	34
+#define IECM_TXD_QW1_TX_BUF_SZ_M	\
+	MAKEMASK(0x3FFFULL, IECM_TXD_QW1_TX_BUF_SZ_S)
+#define IECM_TXD_QW1_OFFSET_S		16
+#define IECM_TXD_QW1_OFFSET_M		\
+	MAKEMASK(0x3FFFFULL, IECM_TXD_QW1_OFFSET_S)
+#define IECM_TXD_QW1_CMD_S		4
+#define IECM_TXD_QW1_CMD_M		MAKEMASK(0xFFFUL, IECM_TXD_QW1_CMD_S)
+#define IECM_TXD_QW1_DTYPE_S		0
+#define IECM_TXD_QW1_DTYPE_M		MAKEMASK(0xFUL, IECM_TXD_QW1_DTYPE_S)
+
+/* TX Completion Descriptor Completion Types */
+#define IECM_TXD_COMPLT_ITR_FLUSH	0
+#define IECM_TXD_COMPLT_RULE_MISS	1
+#define IECM_TXD_COMPLT_RS		2
+#define IECM_TXD_COMPLT_REINJECTED	3
+#define IECM_TXD_COMPLT_RE		4
+#define IECM_TXD_COMPLT_SW_MARKER	5
+
+enum iecm_tx_desc_dtype_value {
+	IECM_TX_DESC_DTYPE_DATA				= 0,
+	IECM_TX_DESC_DTYPE_CTX				= 1,
+	IECM_TX_DESC_DTYPE_REINJECT_CTX			= 2,
+	IECM_TX_DESC_DTYPE_FLEX_DATA			= 3,
+	IECM_TX_DESC_DTYPE_FLEX_CTX			= 4,
+	IECM_TX_DESC_DTYPE_FLEX_TSO_CTX			= 5,
+	IECM_TX_DESC_DTYPE_FLEX_TSYN_L2TAG1		= 6,
+	IECM_TX_DESC_DTYPE_FLEX_L2TAG1_L2TAG2		= 7,
+	IECM_TX_DESC_DTYPE_FLEX_TSO_L2TAG2_PARSTAG_CTX	= 8,
+	IECM_TX_DESC_DTYPE_FLEX_HOSTSPLIT_SA_TSO_CTX	= 9,
+	IECM_TX_DESC_DTYPE_FLEX_HOSTSPLIT_SA_CTX	= 10,
+	IECM_TX_DESC_DTYPE_FLEX_L2TAG2_CTX		= 11,
+	IECM_TX_DESC_DTYPE_FLEX_FLOW_SCHE		= 12,
+	IECM_TX_DESC_DTYPE_FLEX_HOSTSPLIT_TSO_CTX	= 13,
+	IECM_TX_DESC_DTYPE_FLEX_HOSTSPLIT_CTX		= 14,
+	/* DESC_DONE - HW has completed write-back of descriptor */
+	IECM_TX_DESC_DTYPE_DESC_DONE			= 15,
+};
+
+enum iecm_tx_ctx_desc_cmd_bits {
+	IECM_TX_CTX_DESC_TSO		= 0x01,
+	IECM_TX_CTX_DESC_TSYN		= 0x02,
+	IECM_TX_CTX_DESC_IL2TAG2	= 0x04,
+	IECM_TX_CTX_DESC_RSVD		= 0x08,
+	IECM_TX_CTX_DESC_SWTCH_NOTAG	= 0x00,
+	IECM_TX_CTX_DESC_SWTCH_UPLINK	= 0x10,
+	IECM_TX_CTX_DESC_SWTCH_LOCAL	= 0x20,
+	IECM_TX_CTX_DESC_SWTCH_VSI	= 0x30,
+	IECM_TX_CTX_DESC_FILT_AU_EN	= 0x40,
+	IECM_TX_CTX_DESC_FILT_AU_EVICT	= 0x80,
+	IECM_TX_CTX_DESC_RSVD1		= 0xF00
+};
+
+enum iecm_tx_desc_len_fields {
+	/* Note: These are predefined bit offsets */
+	IECM_TX_DESC_LEN_MACLEN_S	= 0, /* 7 BITS */
+	IECM_TX_DESC_LEN_IPLEN_S	= 7, /* 7 BITS */
+	IECM_TX_DESC_LEN_L4_LEN_S	= 14 /* 4 BITS */
+};
+
+enum iecm_tx_base_desc_cmd_bits {
+	IECM_TX_DESC_CMD_EOP			= 0x0001,
+	IECM_TX_DESC_CMD_RS			= 0x0002,
+	/* only on VFs, else RSVD */
+	IECM_TX_DESC_CMD_ICRC			= 0x0004,
+	IECM_TX_DESC_CMD_IL2TAG1		= 0x0008,
+	IECM_TX_DESC_CMD_RSVD1			= 0x0010,
+	IECM_TX_DESC_CMD_IIPT_NONIP		= 0x0000, /* 2 BITS */
+	IECM_TX_DESC_CMD_IIPT_IPV6		= 0x0020, /* 2 BITS */
+	IECM_TX_DESC_CMD_IIPT_IPV4		= 0x0040, /* 2 BITS */
+	IECM_TX_DESC_CMD_IIPT_IPV4_CSUM		= 0x0060, /* 2 BITS */
+	IECM_TX_DESC_CMD_RSVD2			= 0x0080,
+	IECM_TX_DESC_CMD_L4T_EOFT_UNK		= 0x0000, /* 2 BITS */
+	IECM_TX_DESC_CMD_L4T_EOFT_TCP		= 0x0100, /* 2 BITS */
+	IECM_TX_DESC_CMD_L4T_EOFT_SCTP		= 0x0200, /* 2 BITS */
+	IECM_TX_DESC_CMD_L4T_EOFT_UDP		= 0x0300, /* 2 BITS */
+	IECM_TX_DESC_CMD_RSVD3			= 0x0400,
+	IECM_TX_DESC_CMD_RSVD4			= 0x0800,
+};
+
+/* Transmit descriptors */
+/* splitq Tx buf, singleq Tx buf and singleq compl desc */
+struct iecm_base_tx_desc {
+	__le64 buf_addr; /* Address of descriptor's data buf */
+	__le64 qw1; /* type_cmd_offset_bsz_l2tag1 */
+}; /* read used with buffer queues */
+
+struct iecm_splitq_tx_compl_desc {
+	/* qid=[10:0] comptype=[13:11] rsvd=[14] gen=[15] */
+	__le16 qid_comptype_gen;
+	union {
+		__le16 q_head; /* Queue head */
+		__le16 compl_tag; /* Completion tag */
+	} q_head_compl_tag;
+	u8 ts[3];
+	u8 rsvd; /* Reserved */
+}; /* writeback used with completion queues */
+
+/* Context descriptors */
+struct iecm_base_tx_ctx_desc {
+	struct {
+		__le32 rsvd0;
+		__le16 l2tag2;
+		__le16 rsvd1;
+	} qw0;
+	__le64 qw1; /* type_cmd_tlen_mss/rt_hint */
+};
+
+/* Common cmd field defines for all desc except Flex Flow Scheduler (0x0C) */
+enum iecm_tx_flex_desc_cmd_bits {
+	IECM_TX_FLEX_DESC_CMD_EOP			= 0x01,
+	IECM_TX_FLEX_DESC_CMD_RS			= 0x02,
+	IECM_TX_FLEX_DESC_CMD_RE			= 0x04,
+	IECM_TX_FLEX_DESC_CMD_IL2TAG1			= 0x08,
+	IECM_TX_FLEX_DESC_CMD_DUMMY			= 0x10,
+	IECM_TX_FLEX_DESC_CMD_CS_EN			= 0x20,
+	IECM_TX_FLEX_DESC_CMD_FILT_AU_EN		= 0x40,
+	IECM_TX_FLEX_DESC_CMD_FILT_AU_EVICT		= 0x80,
+};
+
+struct iecm_flex_tx_desc {
+	__le64 buf_addr;	/* Packet buffer address */
+	struct {
+		__le16 cmd_dtype;
+#define IECM_FLEX_TXD_QW1_DTYPE_S		0
+#define IECM_FLEX_TXD_QW1_DTYPE_M		\
+		(0x1FUL << IECM_FLEX_TXD_QW1_DTYPE_S)
+#define IECM_FLEX_TXD_QW1_CMD_S		5
+#define IECM_FLEX_TXD_QW1_CMD_M		MAKEMASK(0x7FFUL, IECM_FLEX_TXD_QW1_CMD_S)
+		union {
+			/* DTYPE = IECM_TX_DESC_DTYPE_FLEX_DATA (0x03) */
+			u8 raw[4];
+
+			/* DTYPE = IECM_TX_DESC_DTYPE_FLEX_TSYN_L2TAG1 (0x06) */
+			struct {
+				__le16 l2tag1;
+				u8 flex;
+				u8 tsync;
+			} tsync;
+
+			/* DTYPE=IECM_TX_DESC_DTYPE_FLEX_L2TAG1_L2TAG2 (0x07) */
+			struct {
+				__le16 l2tag1;
+				__le16 l2tag2;
+			} l2tags;
+		} flex;
+		__le16 buf_size;
+	} qw1;
+};
+
+struct iecm_flex_tx_sched_desc {
+	__le64 buf_addr;	/* Packet buffer address */
+
+	/* DTYPE = IECM_TX_DESC_DTYPE_FLEX_FLOW_SCHE_16B (0x0C) */
+	struct {
+		u8 cmd_dtype;
+#define IECM_TXD_FLEX_FLOW_DTYPE_M	0x1F
+#define IECM_TXD_FLEX_FLOW_CMD_EOP	0x20
+#define IECM_TXD_FLEX_FLOW_CMD_CS_EN	0x40
+#define IECM_TXD_FLEX_FLOW_CMD_RE	0x80
+
+		/* [23:23] Horizon Overflow bit, [22:0] timestamp */
+		u8 ts[3];
+#define IECM_TXD_FLOW_SCH_HORIZON_OVERFLOW_M	0x80
+
+		__le16 compl_tag;
+		__le16 rxr_bufsize;
+#define IECM_TXD_FLEX_FLOW_RXR		0x4000
+#define IECM_TXD_FLEX_FLOW_BUFSIZE_M	0x3FFF
+	} qw1;
+};
+
+/* Common cmd fields for all flex context descriptors
+ * Note: these defines already account for the 5 bit dtype in the cmd_dtype
+ * field
+ */
+enum iecm_tx_flex_ctx_desc_cmd_bits {
+	IECM_TX_FLEX_CTX_DESC_CMD_TSO			= 0x0020,
+	IECM_TX_FLEX_CTX_DESC_CMD_TSYN_EN		= 0x0040,
+	IECM_TX_FLEX_CTX_DESC_CMD_L2TAG2		= 0x0080,
+	IECM_TX_FLEX_CTX_DESC_CMD_SWTCH_UPLNK		= 0x0200, /* 2 bits */
+	IECM_TX_FLEX_CTX_DESC_CMD_SWTCH_LOCAL		= 0x0400, /* 2 bits */
+	IECM_TX_FLEX_CTX_DESC_CMD_SWTCH_TARGETVSI	= 0x0600, /* 2 bits */
+};
+
+/* Standard flex descriptor TSO context quad word */
+struct iecm_flex_tx_tso_ctx_qw {
+	__le32 flex_tlen;
+#define IECM_TXD_FLEX_CTX_TLEN_M	0x1FFFF
+#define IECM_TXD_FLEX_TSO_CTX_FLEX_S	24
+	__le16 mss_rt;
+#define IECM_TXD_FLEX_CTX_MSS_RT_M	0x3FFF
+	u8 hdr_len;
+	u8 flex;
+};
+
+union iecm_flex_tx_ctx_desc {
+	/* DTYPE = IECM_TX_DESC_DTYPE_FLEX_CTX (0x04) */
+	struct {
+		u8 qw0_flex[8];
+		struct {
+			__le16 cmd_dtype;
+			__le16 l2tag1;
+			u8 qw1_flex[4];
+		} qw1;
+	} gen;
+
+	/* DTYPE = IECM_TX_DESC_DTYPE_FLEX_TSO_CTX (0x05) */
+	struct {
+		struct iecm_flex_tx_tso_ctx_qw qw0;
+		struct {
+			__le16 cmd_dtype;
+			u8 flex[6];
+		} qw1;
+	} tso;
+
+	/* DTYPE = IECM_TX_DESC_DTYPE_FLEX_TSO_L2TAG2_PARSTAG_CTX (0x08) */
+	struct {
+		struct iecm_flex_tx_tso_ctx_qw qw0;
+		struct {
+			__le16 cmd_dtype;
+			__le16 l2tag2;
+			u8 flex0;
+			u8 ptag;
+			u8 flex1[2];
+		} qw1;
+	} tso_l2tag2_ptag;
+
+	/* DTYPE = IECM_TX_DESC_DTYPE_FLEX_L2TAG2_CTX (0x0B) */
+	struct {
+		u8 qw0_flex[8];
+		struct {
+			__le16 cmd_dtype;
+			__le16 l2tag2;
+			u8 flex[4];
+		} qw1;
+	} l2tag2;
+
+	/* DTYPE = IECM_TX_DESC_DTYPE_REINJECT_CTX (0x02) */
+	struct {
+		struct {
+			__le32 sa_domain;
+#define IECM_TXD_FLEX_CTX_SA_DOM_M	0xFFFF
+#define IECM_TXD_FLEX_CTX_SA_DOM_VAL	0x10000
+			__le32 sa_idx;
+#define IECM_TXD_FLEX_CTX_SAIDX_M	0x1FFFFF
+		} qw0;
+		struct {
+			__le16 cmd_dtype;
+			__le16 txr2comp;
+#define IECM_TXD_FLEX_CTX_TXR2COMP	0x1
+			__le16 miss_txq_comp_tag;
+			__le16 miss_txq_id;
+		} qw1;
+	} reinjection_pkt;
+};
+
+/* Host Split Context Descriptors */
+struct iecm_flex_tx_hs_ctx_desc {
+	union {
+		struct {
+			__le32 host_fnum_tlen;
+#define IECM_TXD_FLEX_CTX_TLEN_S	0
+#define IECM_TXD_FLEX_CTX_TLEN_M	0x1FFFF
+#define IECM_TXD_FLEX_CTX_FNUM_S	18
+#define IECM_TXD_FLEX_CTX_FNUM_M	0x7FF
+#define IECM_TXD_FLEX_CTX_HOST_S	29
+#define IECM_TXD_FLEX_CTX_HOST_M	0x7
+			__le16 ftype_mss_rt;
+#define IECM_TXD_FLEX_CTX_MSS_RT_0	0
+#define IECM_TXD_FLEX_CTX_MSS_RT_M	0x3FFF
+#define IECM_TXD_FLEX_CTX_FTYPE_S	14
+#define IECM_TXD_FLEX_CTX_FTYPE_VF	MAKEMASK(0x0, IECM_TXD_FLEX_CTX_FTYPE_S)
+#define IECM_TXD_FLEX_CTX_FTYPE_VDEV	MAKEMASK(0x1, IECM_TXD_FLEX_CTX_FTYPE_S)
+#define IECM_TXD_FLEX_CTX_FTYPE_PF	MAKEMASK(0x2, IECM_TXD_FLEX_CTX_FTYPE_S)
+			u8 hdr_len;
+			u8 ptag;
+		} tso;
+		struct {
+			u8 flex0[2];
+			__le16 host_fnum_ftype;
+			u8 flex1[3];
+			u8 ptag;
+		} no_tso;
+	} qw0;
+
+	__le64 qw1_cmd_dtype;
+#define IECM_TXD_FLEX_CTX_QW1_PASID_S		16
+#define IECM_TXD_FLEX_CTX_QW1_PASID_M		0xFFFFF
+#define IECM_TXD_FLEX_CTX_QW1_PASID_VALID_S	36
+#define IECM_TXD_FLEX_CTX_QW1_PASID_VALID	\
+	MAKEMASK(0x1ULL, IECM_TXD_FLEX_CTX_QW1_PASID_VALID_S)
+#define IECM_TXD_FLEX_CTX_QW1_TPH_S		37
+#define IECM_TXD_FLEX_CTX_QW1_TPH		\
+	MAKEMASK(0x1ULL, IECM_TXD_FLEX_CTX_QW1_TPH_S)
+#define IECM_TXD_FLEX_CTX_QW1_PFNUM_S		38
+#define IECM_TXD_FLEX_CTX_QW1_PFNUM_M		0xF
+/* The following are only valid for DTYPE = 0x09 and DTYPE = 0x0A */
+#define IECM_TXD_FLEX_CTX_QW1_SAIDX_S		42
+#define IECM_TXD_FLEX_CTX_QW1_SAIDX_M		0x1FFFFF
+#define IECM_TXD_FLEX_CTX_QW1_SAIDX_VAL_S	63
+#define IECM_TXD_FLEX_CTX_QW1_SAIDX_VALID	\
+	MAKEMASK(0x1ULL, IECM_TXD_FLEX_CTX_QW1_SAIDX_VAL_S)
+/* The following are only valid for DTYPE = 0x0D and DTYPE = 0x0E */
+#define IECM_TXD_FLEX_CTX_QW1_FLEX0_S		48
+#define IECM_TXD_FLEX_CTX_QW1_FLEX0_M		0xFF
+#define IECM_TXD_FLEX_CTX_QW1_FLEX1_S		56
+#define IECM_TXD_FLEX_CTX_QW1_FLEX1_M		0xFF
+};
+
+/* Rx */
+/* For iecm_splitq_base_rx_flex desc members */
+#define IECM_RXD_FLEX_PTYPE_S		0
+#define IECM_RXD_FLEX_PTYPE_M		MAKEMASK(0x3FFUL, IECM_RXD_FLEX_PTYPE_S)
+#define IECM_RXD_FLEX_UMBCAST_S		10
+#define IECM_RXD_FLEX_UMBCAST_M		MAKEMASK(0x3UL, IECM_RXD_FLEX_UMBCAST_S)
+#define IECM_RXD_FLEX_FF0_S		12
+#define IECM_RXD_FLEX_FF0_M		MAKEMASK(0xFUL, IECM_RXD_FLEX_FF0_S)
+#define IECM_RXD_FLEX_LEN_PBUF_S	0
+#define IECM_RXD_FLEX_LEN_PBUF_M	\
+	MAKEMASK(0x3FFFUL, IECM_RXD_FLEX_LEN_PBUF_S)
+#define IECM_RXD_FLEX_GEN_S		14
+#define IECM_RXD_FLEX_GEN_M		BIT_ULL(IECM_RXD_FLEX_GEN_S)
+#define IECM_RXD_FLEX_BUFQ_ID_S		15
+#define IECM_RXD_FLEX_BUFQ_ID_M		BIT_ULL(IECM_RXD_FLEX_BUFQ_ID_S)
+#define IECM_RXD_FLEX_LEN_HDR_S		0
+#define IECM_RXD_FLEX_LEN_HDR_M		\
+	MAKEMASK(0x3FFUL, IECM_RXD_FLEX_LEN_HDR_S)
+#define IECM_RXD_FLEX_RSC_S		10
+#define IECM_RXD_FLEX_RSC_M		BIT_ULL(IECM_RXD_FLEX_RSC_S)
+#define IECM_RXD_FLEX_SPH_S		11
+#define IECM_RXD_FLEX_SPH_M		BIT_ULL(IECM_RXD_FLEX_SPH_S)
+#define IECM_RXD_FLEX_MISS_S		12
+#define IECM_RXD_FLEX_MISS_M		BIT_ULL(IECM_RXD_FLEX_MISS_S)
+#define IECM_RXD_FLEX_FF1_S		13
+#define IECM_RXD_FLEX_FF1_M		MAKEMASK(0x7UL, IECM_RXD_FLEX_FF1_S)
+
+/* For iecm_singleq_base_rx_legacy desc members */
+#define IECM_RXD_QW1_LEN_SPH_S	63
+#define IECM_RXD_QW1_LEN_SPH_M	BIT_ULL(IECM_RXD_QW1_LEN_SPH_S)
+#define IECM_RXD_QW1_LEN_HBUF_S	52
+#define IECM_RXD_QW1_LEN_HBUF_M	MAKEMASK(0x7FFULL, IECM_RXD_QW1_LEN_HBUF_S)
+#define IECM_RXD_QW1_LEN_PBUF_S	38
+#define IECM_RXD_QW1_LEN_PBUF_M	MAKEMASK(0x3FFFULL, IECM_RXD_QW1_LEN_PBUF_S)
+#define IECM_RXD_QW1_PTYPE_S	30
+#define IECM_RXD_QW1_PTYPE_M	MAKEMASK(0xFFULL, IECM_RXD_QW1_PTYPE_S)
+#define IECM_RXD_QW1_ERROR_S	19
+#define IECM_RXD_QW1_ERROR_M	MAKEMASK(0xFFUL, IECM_RXD_QW1_ERROR_S)
+#define IECM_RXD_QW1_STATUS_S	0
+#define IECM_RXD_QW1_STATUS_M	MAKEMASK(0x7FFFFUL, IECM_RXD_QW1_STATUS_S)
+
+enum iecm_rx_flex_desc_status_error_0_qw1_bits {
+	/* Note: These are predefined bit offsets */
+	IECM_RX_FLEX_DESC_STATUS0_DD_S = 0,
+	IECM_RX_FLEX_DESC_STATUS0_EOF_S,
+	IECM_RX_FLEX_DESC_STATUS0_HBO_S,
+	IECM_RX_FLEX_DESC_STATUS0_L3L4P_S,
+	IECM_RX_FLEX_DESC_STATUS0_XSUM_IPE_S,
+	IECM_RX_FLEX_DESC_STATUS0_XSUM_L4E_S,
+	IECM_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S,
+	IECM_RX_FLEX_DESC_STATUS0_XSUM_EUDPE_S,
+};
+
+enum iecm_rx_flex_desc_status_error_0_qw0_bits {
+	IECM_RX_FLEX_DESC_STATUS0_LPBK_S = 0,
+	IECM_RX_FLEX_DESC_STATUS0_IPV6EXADD_S,
+	IECM_RX_FLEX_DESC_STATUS0_RXE_S,
+	IECM_RX_FLEX_DESC_STATUS0_CRCP_S,
+	IECM_RX_FLEX_DESC_STATUS0_RSS_VALID_S,
+	IECM_RX_FLEX_DESC_STATUS0_L2TAG1P_S,
+	IECM_RX_FLEX_DESC_STATUS0_XTRMD0_VALID_S,
+	IECM_RX_FLEX_DESC_STATUS0_XTRMD1_VALID_S,
+	IECM_RX_FLEX_DESC_STATUS0_LAST /* this entry must be last!!! */
+};
+
+enum iecm_rx_flex_desc_status_error_1_bits {
+	/* Note: These are predefined bit offsets */
+	IECM_RX_FLEX_DESC_STATUS1_RSVD_S = 0, /* 2 bits */
+	IECM_RX_FLEX_DESC_STATUS1_ATRAEFAIL_S = 2,
+	IECM_RX_FLEX_DESC_STATUS1_L2TAG2P_S = 3,
+	IECM_RX_FLEX_DESC_STATUS1_XTRMD2_VALID_S = 4,
+	IECM_RX_FLEX_DESC_STATUS1_XTRMD3_VALID_S = 5,
+	IECM_RX_FLEX_DESC_STATUS1_XTRMD4_VALID_S = 6,
+	IECM_RX_FLEX_DESC_STATUS1_XTRMD5_VALID_S = 7,
+	IECM_RX_FLEX_DESC_STATUS1_LAST /* this entry must be last!!! */
+};
+
+enum iecm_rx_base_desc_status_bits {
+	/* Note: These are predefined bit offsets */
+	IECM_RX_BASE_DESC_STATUS_DD_S		= 0,
+	IECM_RX_BASE_DESC_STATUS_EOF_S		= 1,
+	IECM_RX_BASE_DESC_STATUS_L2TAG1P_S	= 2,
+	IECM_RX_BASE_DESC_STATUS_L3L4P_S	= 3,
+	IECM_RX_BASE_DESC_STATUS_CRCP_S		= 4,
+	IECM_RX_BASE_DESC_STATUS_RSVD_S		= 5, /* 3 BITS */
+	IECM_RX_BASE_DESC_STATUS_EXT_UDP_0_S	= 8,
+	IECM_RX_BASE_DESC_STATUS_UMBCAST_S	= 9, /* 2 BITS */
+	IECM_RX_BASE_DESC_STATUS_FLM_S		= 11,
+	IECM_RX_BASE_DESC_STATUS_FLTSTAT_S	= 12, /* 2 BITS */
+	IECM_RX_BASE_DESC_STATUS_LPBK_S		= 14,
+	IECM_RX_BASE_DESC_STATUS_IPV6EXADD_S	= 15,
+	IECM_RX_BASE_DESC_STATUS_RSVD1_S	= 16, /* 2 BITS */
+	IECM_RX_BASE_DESC_STATUS_INT_UDP_0_S	= 18,
+	IECM_RX_BASE_DESC_STATUS_LAST /* this entry must be last!!! */
+};
+
+enum iecm_rx_desc_fltstat_values {
+	IECM_RX_DESC_FLTSTAT_NO_DATA	= 0,
+	IECM_RX_DESC_FLTSTAT_RSV_FD_ID	= 1, /* 16byte desc? FD_ID : RSV */
+	IECM_RX_DESC_FLTSTAT_RSV	= 2,
+	IECM_RX_DESC_FLTSTAT_RSS_HASH	= 3,
+};
+
+enum iecm_rx_base_desc_error_bits {
+	/* Note: These are predefined bit offsets */
+	IECM_RX_BASE_DESC_ERROR_RXE_S		= 0,
+	IECM_RX_BASE_DESC_ERROR_ATRAEFAIL_S	= 1,
+	IECM_RX_BASE_DESC_ERROR_HBO_S		= 2,
+	IECM_RX_BASE_DESC_ERROR_L3L4E_S		= 3, /* 3 BITS */
+	IECM_RX_BASE_DESC_ERROR_IPE_S		= 3,
+	IECM_RX_BASE_DESC_ERROR_L4E_S		= 4,
+	IECM_RX_BASE_DESC_ERROR_EIPE_S		= 5,
+	IECM_RX_BASE_DESC_ERROR_OVERSIZE_S	= 6,
+	IECM_RX_BASE_DESC_ERROR_RSVD_S		= 7
+};
+
+/* Receive Descriptors */
+/* splitq buf */
+struct iecm_splitq_rx_buf_desc {
+	struct {
+		__le16  buf_id; /* Buffer Identifier */
+		__le16  rsvd0;
+		__le32  rsvd1;
+	} qword0;
+	__le64  pkt_addr; /* Packet buffer address */
+	__le64  hdr_addr; /* Header buffer address */
+	__le64  rsvd2;
+}; /* read used with buffer queues */
+
+/* singleq buf */
+struct iecm_singleq_rx_buf_desc {
+	__le64  pkt_addr; /* Packet buffer address */
+	__le64  hdr_addr; /* Header buffer address */
+	__le64  rsvd1;
+	__le64  rsvd2;
+}; /* read used with buffer queues */
+
+union iecm_rx_buf_desc {
+	struct iecm_singleq_rx_buf_desc		read;
+	struct iecm_splitq_rx_buf_desc		split_rd;
+};
+
+/* splitq compl */
+struct iecm_flex_rx_desc {
+	/* Qword 0 */
+	u8 rxdid_ucast; /* profile_id=[3:0] */
+			/* rsvd=[5:4] */
+			/* ucast=[7:6] */
+	u8 status_err0_qw0;
+	__le16 ptype_err_fflags0;	/* ptype=[9:0] */
+					/* ip_hdr_err=[10:10] */
+					/* udp_len_err=[11:11] */
+					/* ff0=[15:12] */
+	__le16 pktlen_gen_bufq_id;	/* plen=[13:0] */
+					/* gen=[14:14]  only in splitq */
+					/* bufq_id=[15:15] only in splitq */
+	__le16 hdrlen_flags;		/* header=[9:0] */
+					/* rsc=[10:10] only in splitq */
+					/* sph=[11:11] only in splitq */
+					/* ext_udp_0=[12:12] */
+					/* int_udp_0=[13:13] */
+					/* trunc_mirr=[14:14] */
+					/* miss_prepend=[15:15] */
+	/* Qword 1 */
+	u8 status_err0_qw1;
+	u8 status_err1;
+	u8 fflags1;
+	u8 ts_low;
+	union {
+		__le16 fmd0;
+		__le16 buf_id; /* only in splitq */
+	} fmd0_bufid;
+	union {
+		__le16 fmd1;
+		__le16 raw_cs;
+		__le16 l2tag1;
+		__le16 rscseglen;
+	} fmd1_misc;
+	/* Qword 2 */
+	union {
+		__le16 fmd2;
+		__le16 hash1;
+	} fmd2_hash1;
+	union {
+		u8 fflags2;
+		u8 mirrorid;
+		u8 hash2;
+	} ff2_mirrid_hash2;
+	u8 hash3;
+	union {
+		__le16 fmd3;
+		__le16 l2tag2;
+	} fmd3_l2tag2;
+	__le16 fmd4;
+	/* Qword 3 */
+	union {
+		__le16 fmd5;
+		__le16 l2tag1;
+	} fmd5_l2tag1;
+	__le16 fmd6;
+	union {
+		struct {
+			__le16 fmd7_0;
+			__le16 fmd7_1;
+		} fmd7;
+		__le32 ts_high;
+	} flex_ts;
+}; /* writeback */
+
+/* singleq wb(compl) */
+struct iecm_singleq_base_rx_desc {
+	struct {
+		struct {
+			__le16 mirroring_status;
+			__le16 l2tag1;
+		} lo_dword;
+		union {
+			__le32 rss; /* RSS Hash */
+			__le32 fd_id; /* Flow Director filter id */
+		} hi_dword;
+	} qword0;
+	struct {
+		/* status/error/PTYPE/length */
+		__le64 status_error_ptype_len;
+	} qword1;
+	struct {
+		__le16 ext_status; /* extended status */
+		__le16 rsvd;
+		__le16 l2tag2_1;
+		__le16 l2tag2_2;
+	} qword2;
+	struct {
+		__le32 reserved;
+		__le32 fd_id;
+	} qword3;
+}; /* writeback */
+
+union iecm_rx_desc {
+	struct iecm_singleq_rx_buf_desc	read;
+	struct iecm_singleq_base_rx_desc	base_wb;
+	struct iecm_flex_rx_desc		flex_wb;
+};
+#endif /* _IECM_LAN_TXRX_H_ */
diff --git a/include/linux/net/intel/iecm_txrx.h b/include/linux/net/intel/iecm_txrx.h
new file mode 100644
index 000000000000..bf6f02592bb7
--- /dev/null
+++ b/include/linux/net/intel/iecm_txrx.h
@@ -0,0 +1,610 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2020 Intel Corporation */
+
+#ifndef _IECM_TXRX_H_
+#define _IECM_TXRX_H_
+
+#define IECM_MAX_Q				16
+/* Mailbox Queue */
+#define IECM_MAX_NONQ				1
+#define IECM_MAX_TXQ_DESC			512
+#define IECM_MAX_RXQ_DESC			512
+#define IECM_MIN_TXQ_DESC			128
+#define IECM_MIN_RXQ_DESC			128
+#define IECM_REQ_DESC_MULTIPLE			32
+
+#define IECM_DFLT_SINGLEQ_TX_Q_GROUPS		1
+#define IECM_DFLT_SINGLEQ_RX_Q_GROUPS		1
+#define IECM_DFLT_SINGLEQ_TXQ_PER_GROUP		4
+#define IECM_DFLT_SINGLEQ_RXQ_PER_GROUP		4
+
+#define IECM_COMPLQ_PER_GROUP			1
+#define IECM_BUFQS_PER_RXQ_SET			2
+
+#define IECM_DFLT_SPLITQ_TX_Q_GROUPS		4
+#define IECM_DFLT_SPLITQ_RX_Q_GROUPS		4
+#define IECM_DFLT_SPLITQ_TXQ_PER_GROUP		1
+#define IECM_DFLT_SPLITQ_RXQ_PER_GROUP		1
+
+/* Default vector sharing */
+#define IECM_MAX_NONQ_VEC	1
+#define IECM_MAX_Q_VEC		4 /* For Tx Completion queue and Rx queue */
+#define IECM_MAX_RDMA_VEC	2 /* To share with RDMA */
+#define IECM_MIN_RDMA_VEC	1 /* Minimum vectors to be shared with RDMA */
+#define IECM_MIN_VEC		3 /* One for mailbox, one for data queues, one
+				   * for RDMA
+				   */
+
+#define IECM_DFLT_TX_Q_DESC_COUNT		512
+#define IECM_DFLT_TX_COMPLQ_DESC_COUNT		512
+#define IECM_DFLT_RX_Q_DESC_COUNT		512
+#define IECM_DFLT_RX_BUFQ_DESC_COUNT		512
+
+#define IECM_RX_BUF_WRITE			16 /* Must be power of 2 */
+#define IECM_RX_HDR_SIZE			256
+#define IECM_RX_BUF_2048			2048
+#define IECM_RX_BUF_STRIDE			64
+#define IECM_LOW_WATERMARK			64
+#define IECM_HDR_BUF_SIZE			256
+#define IECM_PACKET_HDR_PAD	\
+	(ETH_HLEN + ETH_FCS_LEN + (VLAN_HLEN * 2))
+#define IECM_MAX_RXBUFFER			9728
+#define IECM_MAX_MTU		\
+	(IECM_MAX_RXBUFFER - IECM_PACKET_HDR_PAD)
+
+#define IECM_SINGLEQ_RX_BUF_DESC(R, i)	\
+	(&(((struct iecm_singleq_rx_buf_desc *)((R)->desc_ring))[i]))
+#define IECM_SPLITQ_RX_BUF_DESC(R, i)	\
+	(&(((struct iecm_splitq_rx_buf_desc *)((R)->desc_ring))[i]))
+
+#define IECM_BASE_TX_DESC(R, i)	\
+	(&(((struct iecm_base_tx_desc *)((R)->desc_ring))[i]))
+#define IECM_SPLITQ_TX_COMPLQ_DESC(R, i)	\
+	(&(((struct iecm_splitq_tx_compl_desc *)((R)->desc_ring))[i]))
+
+#define IECM_FLEX_TX_DESC(R, i)	\
+	(&(((union iecm_tx_flex_desc *)((R)->desc_ring))[i]))
+#define IECM_FLEX_TX_CTX_DESC(R, i)	\
+	(&(((union iecm_flex_tx_ctx_desc *)((R)->desc_ring))[i]))
+
+#define IECM_DESC_UNUSED(R)	\
+	((((R)->next_to_clean > (R)->next_to_use) ? 0 : (R)->desc_count) + \
+	(R)->next_to_clean - (R)->next_to_use - 1)
+
+union iecm_tx_flex_desc {
+	struct iecm_flex_tx_desc q; /* queue based scheduling */
+	struct iecm_flex_tx_sched_desc flow; /* flow based scheduling */
+};
+
+struct iecm_tx_buf {
+	struct hlist_node hlist;
+	void *next_to_watch;
+	struct sk_buff *skb;
+	unsigned int bytecount;
+	unsigned short gso_segs;
+#define IECM_TX_FLAGS_TSO	BIT(0)
+	u32 tx_flags;
+	DEFINE_DMA_UNMAP_ADDR(dma);
+	DEFINE_DMA_UNMAP_LEN(len);
+	u16 compl_tag;		/* Unique identifier for buffer; used to
+				 * compare with completion tag returned
+				 * in buffer completion event
+				 */
+};
+
+struct iecm_buf_lifo {
+	u16 top;
+	u16 size;
+	struct iecm_tx_buf **bufs;
+};
+
+struct iecm_tx_offload_params {
+	u16 td_cmd;	/* command field to be inserted into descriptor */
+	u32 tso_len;	/* total length of payload to segment */
+	u16 mss;
+	u8 tso_hdr_len;	/* length of headers to be duplicated */
+
+	/* Flow scheduling offload timestamp, formatting as hw expects it */
+#define IECM_TW_TIME_STAMP_GRAN_512_DIV_S	9
+#define IECM_TW_TIME_STAMP_GRAN_1024_DIV_S	10
+#define IECM_TW_TIME_STAMP_GRAN_2048_DIV_S	11
+#define IECM_TW_TIME_STAMP_GRAN_4096_DIV_S	12
+	u64 desc_ts;
+
+	/* For legacy offloads */
+	u32 hdr_offsets;
+};
+
+struct iecm_tx_splitq_params {
+	/* Descriptor build function pointer */
+	void (*splitq_build_ctb)(union iecm_tx_flex_desc *desc,
+				 struct iecm_tx_splitq_params *params,
+				 u16 td_cmd, u16 size);
+
+	/* General descriptor info */
+	enum iecm_tx_desc_dtype_value dtype;
+	u16 eop_cmd;
+	u16 compl_tag; /* only relevant for flow scheduling */
+
+	struct iecm_tx_offload_params offload;
+};
+
+#define IECM_TX_COMPLQ_CLEAN_BUDGET	256
+#define IECM_TX_MIN_LEN			17
+#define IECM_TX_DESCS_FOR_SKB_DATA_PTR	1
+#define IECM_TX_MAX_BUF			8
+#define IECM_TX_DESCS_PER_CACHE_LINE	4
+#define IECM_TX_DESCS_FOR_CTX		1
+/* TX descriptors needed, worst case */
+#define IECM_TX_DESC_NEEDED (MAX_SKB_FRAGS + IECM_TX_DESCS_FOR_CTX + \
+			     IECM_TX_DESCS_PER_CACHE_LINE + \
+			     IECM_TX_DESCS_FOR_SKB_DATA_PTR)
+
+/* The size limit for a transmit buffer in a descriptor is (16K - 1).
+ * In order to align with the read requests we will align the value to
+ * the nearest 4K which represents our maximum read request size.
+ */
+#define IECM_TX_MAX_READ_REQ_SIZE	4096
+#define IECM_TX_MAX_DESC_DATA		(16 * 1024 - 1)
+#define IECM_TX_MAX_DESC_DATA_ALIGNED \
+	(~(IECM_TX_MAX_READ_REQ_SIZE - 1) & IECM_TX_MAX_DESC_DATA)
+
+#define IECM_RX_DMA_ATTR \
+	(DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING)
+#define IECM_RX_DESC(R, i)	\
+	(&(((union iecm_rx_desc *)((R)->desc_ring))[i]))
+
+struct iecm_rx_buf {
+	struct sk_buff *skb;
+	dma_addr_t dma;
+	struct page *page;
+	unsigned int page_offset;
+	u16 pagecnt_bias;
+	u16 buf_id;
+};
+
+/* Packet type non-ip values */
+enum iecm_rx_ptype_l2 {
+	IECM_RX_PTYPE_L2_RESERVED	= 0,
+	IECM_RX_PTYPE_L2_MAC_PAY2	= 1,
+	IECM_RX_PTYPE_L2_TIMESYNC_PAY2	= 2,
+	IECM_RX_PTYPE_L2_FIP_PAY2	= 3,
+	IECM_RX_PTYPE_L2_OUI_PAY2	= 4,
+	IECM_RX_PTYPE_L2_MACCNTRL_PAY2	= 5,
+	IECM_RX_PTYPE_L2_LLDP_PAY2	= 6,
+	IECM_RX_PTYPE_L2_ECP_PAY2	= 7,
+	IECM_RX_PTYPE_L2_EVB_PAY2	= 8,
+	IECM_RX_PTYPE_L2_QCN_PAY2	= 9,
+	IECM_RX_PTYPE_L2_EAPOL_PAY2	= 10,
+	IECM_RX_PTYPE_L2_ARP		= 11,
+};
+
+enum iecm_rx_ptype_outer_ip {
+	IECM_RX_PTYPE_OUTER_L2	= 0,
+	IECM_RX_PTYPE_OUTER_IP	= 1,
+};
+
+enum iecm_rx_ptype_outer_ip_ver {
+	IECM_RX_PTYPE_OUTER_NONE	= 0,
+	IECM_RX_PTYPE_OUTER_IPV4	= 1,
+	IECM_RX_PTYPE_OUTER_IPV6	= 2,
+};
+
+enum iecm_rx_ptype_outer_fragmented {
+	IECM_RX_PTYPE_NOT_FRAG	= 0,
+	IECM_RX_PTYPE_FRAG	= 1,
+};
+
+enum iecm_rx_ptype_tunnel_type {
+	IECM_RX_PTYPE_TUNNEL_NONE		= 0,
+	IECM_RX_PTYPE_TUNNEL_IP_IP		= 1,
+	IECM_RX_PTYPE_TUNNEL_IP_GRENAT		= 2,
+	IECM_RX_PTYPE_TUNNEL_IP_GRENAT_MAC	= 3,
+	IECM_RX_PTYPE_TUNNEL_IP_GRENAT_MAC_VLAN	= 4,
+};
+
+enum iecm_rx_ptype_tunnel_end_prot {
+	IECM_RX_PTYPE_TUNNEL_END_NONE	= 0,
+	IECM_RX_PTYPE_TUNNEL_END_IPV4	= 1,
+	IECM_RX_PTYPE_TUNNEL_END_IPV6	= 2,
+};
+
+enum iecm_rx_ptype_inner_prot {
+	IECM_RX_PTYPE_INNER_PROT_NONE		= 0,
+	IECM_RX_PTYPE_INNER_PROT_UDP		= 1,
+	IECM_RX_PTYPE_INNER_PROT_TCP		= 2,
+	IECM_RX_PTYPE_INNER_PROT_SCTP		= 3,
+	IECM_RX_PTYPE_INNER_PROT_ICMP		= 4,
+	IECM_RX_PTYPE_INNER_PROT_TIMESYNC	= 5,
+};
+
+enum iecm_rx_ptype_payload_layer {
+	IECM_RX_PTYPE_PAYLOAD_LAYER_NONE	= 0,
+	IECM_RX_PTYPE_PAYLOAD_LAYER_PAY2	= 1,
+	IECM_RX_PTYPE_PAYLOAD_LAYER_PAY3	= 2,
+	IECM_RX_PTYPE_PAYLOAD_LAYER_PAY4	= 3,
+};
+
+struct iecm_rx_ptype_decoded {
+	u32 ptype:10;
+	u32 known:1;
+	u32 outer_ip:1;
+	u32 outer_ip_ver:2;
+	u32 outer_frag:1;
+	u32 tunnel_type:3;
+	u32 tunnel_end_prot:2;
+	u32 tunnel_end_frag:1;
+	u32 inner_prot:4;
+	u32 payload_layer:3;
+};
+
+enum iecm_rx_hsplit {
+	IECM_RX_NO_HDR_SPLIT = 0,
+	IECM_RX_HDR_SPLIT = 1,
+	IECM_RX_HDR_SPLIT_PERF = 2,
+};
+
+/* The iecm_rx_ptype_lkup table is used to convert from the 10-bit ptype in
+ * the hardware to a bit-field that can be used by SW to more easily determine
+ * the packet type.
+ *
+ * Macros are used to shorten the table lines and make this table human
+ * readable.
+ *
+ * We store the PTYPE in the top byte of the bit field - this is just so that
+ * we can check that the table doesn't have a row missing, as the index into
+ * the table should be the PTYPE.
+ *
+ * Typical work flow:
+ *
+ * IF NOT iecm_rx_ptype_lkup[ptype].known
+ * THEN
+ *      Packet is unknown
+ * ELSE IF iecm_rx_ptype_lkup[ptype].outer_ip == IECM_RX_PTYPE_OUTER_IP
+ *      Use the rest of the fields to look at the tunnels, inner protocols, etc
+ * ELSE
+ *      Use the enum iecm_rx_ptype_l2 to decode the packet type
+ * ENDIF
+ */
+/* macro to make the table lines short */
+#define IECM_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
+	{	PTYPE, \
+		1, \
+		IECM_RX_PTYPE_OUTER_##OUTER_IP, \
+		IECM_RX_PTYPE_OUTER_##OUTER_IP_VER, \
+		IECM_RX_PTYPE_##OUTER_FRAG, \
+		IECM_RX_PTYPE_TUNNEL_##T, \
+		IECM_RX_PTYPE_TUNNEL_END_##TE, \
+		IECM_RX_PTYPE_##TEF, \
+		IECM_RX_PTYPE_INNER_PROT_##I, \
+		IECM_RX_PTYPE_PAYLOAD_LAYER_##PL }
+
+#define IECM_PTT_UNUSED_ENTRY(PTYPE) { PTYPE, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
+
+/* shorter macros make the table fit but are terse */
+#define IECM_RX_PTYPE_NOF		IECM_RX_PTYPE_NOT_FRAG
+#define IECM_RX_PTYPE_FRG		IECM_RX_PTYPE_FRAG
+#define IECM_RX_PTYPE_INNER_PROT_TS	IECM_RX_PTYPE_INNER_PROT_TIMESYNC
+#define IECM_RX_SUPP_PTYPE		18
+#define IECM_RX_MAX_PTYPE		1024
+
+/* Lookup table mapping the HW PTYPE to the bit field for decoding */
+static const
+struct iecm_rx_ptype_decoded iecm_rx_ptype_lkup[IECM_RX_SUPP_PTYPE] = {
+	/* L2 Packet types */
+	IECM_PTT_UNUSED_ENTRY(0),
+	IECM_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),
+	IECM_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
+	IECM_PTT_UNUSED_ENTRY(12),
+
+	/* Non Tunneled IPv4 */
+	IECM_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),
+	IECM_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),
+	IECM_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP,  PAY4),
+	IECM_PTT_UNUSED_ENTRY(25),
+	IECM_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP,  PAY4),
+	IECM_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),
+	IECM_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),
+
+	/* Non Tunneled IPv6 */
+	IECM_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),
+	IECM_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),
+	IECM_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP,  PAY4),
+	IECM_PTT_UNUSED_ENTRY(91),
+	IECM_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP,  PAY4),
+	IECM_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),
+	IECM_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),
+};
+
+#define IECM_INT_NAME_STR_LEN	(IFNAMSIZ + 16)
+
+enum iecm_queue_flags_t {
+	__IECM_Q_GEN_CHK,
+	__IECM_Q_FLOW_SCH_EN,
+	__IECM_Q_SW_MARKER,
+	__IECM_Q_FLAGS_NBITS,
+};
+
+struct iecm_intr_reg {
+	u32 dyn_ctl;
+	u32 dyn_ctl_intena_m;
+	u32 dyn_ctl_clrpba_m;
+	u32 dyn_ctl_itridx_s;
+	u32 dyn_ctl_itridx_m;
+	u32 dyn_ctl_intrvl_s;
+	u32 itr;
+};
+
+struct iecm_q_vector {
+	struct iecm_vport *vport;
+	cpumask_t affinity_mask;
+	struct napi_struct napi;
+	u16 v_idx;		/* index in the vport->q_vector array */
+	u8 itr_countdown;	/* when 0 should adjust ITR */
+	struct iecm_intr_reg intr_reg;
+	int num_txq;
+	struct iecm_queue **tx;
+	int num_rxq;
+	struct iecm_queue **rx;
+	char name[IECM_INT_NAME_STR_LEN];
+};
+
+struct iecm_rx_queue_stats {
+	u64 packets;
+	u64 bytes;
+	u64 generic_csum;
+	u64 basic_csum;
+	u64 csum_err;
+	u64 hsplit_hbo;
+};
+
+struct iecm_tx_queue_stats {
+	u64 packets;
+	u64 bytes;
+};
+
+union iecm_queue_stats {
+	struct iecm_rx_queue_stats rx;
+	struct iecm_tx_queue_stats tx;
+};
+
+enum iecm_latency_range {
+	IECM_LOWEST_LATENCY = 0,
+	IECM_LOW_LATENCY = 1,
+	IECM_BULK_LATENCY = 2,
+};
+
+struct iecm_itr {
+	u16 current_itr;
+	u16 target_itr;
+	enum virtchnl_itr_idx itr_idx;
+	union iecm_queue_stats stats; /* will reset to 0 when adjusting ITR */
+	enum iecm_latency_range latency_range;
+	unsigned long next_update;	/* jiffies value of next ITR update */
+};
+
+/* indices into GLINT_ITR registers */
+#define IECM_ITR_ADAPTIVE_MIN_INC	0x0002
+#define IECM_ITR_ADAPTIVE_MIN_USECS	0x0002
+#define IECM_ITR_ADAPTIVE_MAX_USECS	0x007e
+#define IECM_ITR_ADAPTIVE_LATENCY	0x8000
+#define IECM_ITR_ADAPTIVE_BULK		0x0000
+#define ITR_IS_BULK(x) (!((x) & IECM_ITR_ADAPTIVE_LATENCY))
+
+#define IECM_ITR_DYNAMIC	0x8000	/* use top bit as a flag */
+#define IECM_ITR_MAX		0x1FE0
+#define IECM_ITR_100K		0x000A
+#define IECM_ITR_50K		0x0014
+#define IECM_ITR_20K		0x0032
+#define IECM_ITR_18K		0x003C
+#define IECM_ITR_GRAN_S		1	/* Assume ITR granularity is 2us */
+#define IECM_ITR_MASK		0x1FFE	/* ITR register value alignment mask */
+#define ITR_REG_ALIGN(setting)	__ALIGN_MASK(setting, ~IECM_ITR_MASK)
+#define IECM_ITR_IS_DYNAMIC(setting) (!!((setting) & IECM_ITR_DYNAMIC))
+#define IECM_ITR_SETTING(setting)	((setting) & ~IECM_ITR_DYNAMIC)
+#define ITR_COUNTDOWN_START	100
+#define IECM_ITR_TX_DEF		IECM_ITR_20K
+#define IECM_ITR_RX_DEF		IECM_ITR_50K
+
+/* queue associated with a vport */
+struct iecm_queue {
+	struct device *dev;		/* Used for DMA mapping */
+	struct iecm_vport *vport;	/* Back reference to associated vport */
+	union {
+		struct iecm_txq_group *txq_grp;
+		struct iecm_rxq_group *rxq_grp;
+	};
+	/* bufq: Used as group id, either 0 or 1, on clean Buf Q uses this
+	 *       index to determine which group of refill queues to clean.
+	 *       Bufqs are used in splitq only.
+	 * txq: Index to map between Tx Q group and hot path Tx ptrs stored in
+	 *      vport.  Used in both single Q/split Q.
+	 */
+	u16 idx;
+	/* Used for both Q models single and split. In split Q model relevant
+	 * only to Tx Q and Rx Q
+	 */
+	u8 __iomem *tail;
+	/* Used in both single and split Q.  In single Q, Tx Q uses tx_buf and
+	 * Rx Q uses rx_buf.  In split Q, Tx Q uses tx_buf, Rx Q uses skb, and
+	 * Buf Q uses rx_buf.
+	 */
+	union {
+		struct iecm_tx_buf *tx_buf;
+		struct {
+			struct iecm_rx_buf *buf;
+			struct iecm_rx_buf *hdr_buf;
+		} rx_buf;
+		struct sk_buff *skb;
+	};
+	enum virtchnl_queue_type q_type;
+	/* Queue id(Tx/Tx compl/Rx/Bufq) */
+	u16 q_id;
+	u16 desc_count;		/* Number of descriptors */
+
+	/* Relevant in both split & single Tx Q & Buf Q*/
+	u16 next_to_use;
+	/* In split q model only relevant for Tx Compl Q and Rx Q */
+	u16 next_to_clean;	/* used in interrupt processing */
+	/* Used only for Rx. In split Q model only relevant to Rx Q */
+	u16 next_to_alloc;
+	/* Generation bit check stored, as HW flips the bit at Queue end */
+	DECLARE_BITMAP(flags, __IECM_Q_FLAGS_NBITS);
+
+	union iecm_queue_stats q_stats;
+	struct u64_stats_sync stats_sync;
+
+	enum iecm_rx_hsplit rx_hsplit_en;
+
+	u16 rx_hbuf_size;	/* Header buffer size */
+	u16 rx_buf_size;
+	u16 rx_max_pkt_size;
+	u16 rx_buf_stride;
+	u8 rsc_low_watermark;
+	/* Used for both Q models single and split. In split Q model relevant
+	 * only to Tx compl Q and Rx compl Q
+	 */
+	struct iecm_q_vector *q_vector;	/* Back reference to associated vector */
+	struct iecm_itr itr;
+	unsigned int size;		/* length of descriptor ring in bytes */
+	dma_addr_t dma;			/* physical address of ring */
+	void *desc_ring;		/* Descriptor ring memory */
+
+	struct iecm_buf_lifo buf_stack; /* Stack of empty buffers to store
+					 * buffer info for out of order
+					 * buffer completions
+					 */
+	u16 tx_buf_key;			/* 16 bit unique "identifier" (index)
+					 * to be used as the completion tag when
+					 * queue is using flow based scheduling
+					 */
+	DECLARE_HASHTABLE(sched_buf_hash, 12);
+} ____cacheline_internodealigned_in_smp;
+
+/* Software queues are used in splitq mode to manage buffers between rxq
+ * producer and the bufq consumer.  These are required in order to maintain a
+ * lockless buffer management system and are strictly software only constructs.
+ */
+struct iecm_sw_queue {
+	u16 next_to_clean ____cacheline_aligned_in_smp;
+	u16 next_to_alloc ____cacheline_aligned_in_smp;
+	DECLARE_BITMAP(flags, __IECM_Q_FLAGS_NBITS)
+		____cacheline_aligned_in_smp;
+	u16 *ring ____cacheline_aligned_in_smp;
+	u16 q_entries;
+} ____cacheline_internodealigned_in_smp;
+
+/* Splitq only.  iecm_rxq_set associates an rxq with at most two refillqs.
+ * Each rxq needs a refillq to return used buffers back to the respective bufq.
+ * Bufqs then clean these refillqs for buffers to give to hardware.
+ */
+struct iecm_rxq_set {
+	struct iecm_queue rxq;
+	/* refillqs assoc with bufqX mapped to this rxq */
+	struct iecm_sw_queue *refillq0;
+	struct iecm_sw_queue *refillq1;
+};
+
+/* Splitq only.  iecm_bufq_set associates a bufq with an overflowq and an array
+ * of refillqs.  In this bufq_set, there will be one refillq for each rxq in
+ * this rxq_group.  Used buffers received by rxqs will be put on refillqs which
+ * bufqs will clean to return new buffers back to hardware.
+ *
+ * Buffers needed by some number of rxqs associated in this rxq_group are
+ * managed by at most two bufqs (depending on performance configuration).
+ */
+struct iecm_bufq_set {
+	struct iecm_queue bufq;
+	struct iecm_sw_queue overflowq;
+	/* This is always equal to num_rxq_sets in iecm_rxq_group */
+	int num_refillqs;
+	struct iecm_sw_queue *refillqs;
+};
+
+/* In singleq mode, an rxq_group is simply an array of rxqs.  In splitq, a
+ * rxq_group contains all the rxqs, bufqs, refillqs, and overflowqs needed to
+ * manage buffers in splitq mode.
+ */
+struct iecm_rxq_group {
+	struct iecm_vport *vport; /* back pointer */
+
+	union {
+		struct {
+			int num_rxq;
+			struct iecm_queue *rxqs;
+		} singleq;
+		struct {
+			int num_rxq_sets;
+			struct iecm_rxq_set *rxq_sets;
+			struct iecm_bufq_set *bufq_sets;
+		} splitq;
+	};
+};
+
+/* Between singleq and splitq, a txq_group is largely the same except for the
+ * complq.  In splitq a single complq is responsible for handling completions
+ * for some number of txqs associated in this txq_group.
+ */
+struct iecm_txq_group {
+	struct iecm_vport *vport; /* back pointer */
+
+	int num_txq;
+	struct iecm_queue *txqs;
+
+	/* splitq only */
+	struct iecm_queue *complq;
+};
+
+int iecm_vport_singleq_napi_poll(struct napi_struct *napi, int budget);
+void iecm_vport_init_num_qs(struct iecm_vport *vport,
+			    struct virtchnl_create_vport *vport_msg);
+void iecm_vport_calc_num_q_desc(struct iecm_vport *vport);
+void iecm_vport_calc_total_qs(struct virtchnl_create_vport *vport_msg,
+			      int num_req_qs);
+void iecm_vport_calc_num_q_groups(struct iecm_vport *vport);
+int iecm_vport_queues_alloc(struct iecm_vport *vport);
+void iecm_vport_queues_rel(struct iecm_vport *vport);
+void iecm_vport_calc_num_q_vec(struct iecm_vport *vport);
+void iecm_vport_intr_dis_irq_all(struct iecm_vport *vport);
+void iecm_vport_intr_clear_dflt_itr(struct iecm_vport *vport);
+void iecm_vport_intr_update_itr_ena_irq(struct iecm_q_vector *q_vector);
+void iecm_vport_intr_deinit(struct iecm_vport *vport);
+int iecm_vport_intr_init(struct iecm_vport *vport);
+irqreturn_t
+iecm_vport_intr_clean_queues(int __always_unused irq, void *data);
+void iecm_vport_intr_ena_irq_all(struct iecm_vport *vport);
+int iecm_config_rss(struct iecm_vport *vport);
+void iecm_get_rx_qid_list(struct iecm_vport *vport, u16 *qid_list);
+void iecm_fill_dflt_rss_lut(struct iecm_vport *vport, u16 *qid_list);
+int iecm_init_rss(struct iecm_vport *vport);
+void iecm_deinit_rss(struct iecm_vport *vport);
+int iecm_config_rss(struct iecm_vport *vport);
+void iecm_rx_reuse_page(struct iecm_queue *rx_bufq, bool hsplit,
+			struct iecm_rx_buf *old_buf);
+void iecm_rx_add_frag(struct iecm_rx_buf *rx_buf, struct sk_buff *skb,
+		      unsigned int size);
+struct sk_buff *iecm_rx_construct_skb(struct iecm_queue *rxq,
+				      struct iecm_rx_buf *rx_buf,
+				      unsigned int size);
+bool iecm_rx_cleanup_headers(struct sk_buff *skb);
+bool iecm_rx_recycle_buf(struct iecm_queue *rx_bufq, bool hsplit,
+			 struct iecm_rx_buf *rx_buf);
+void iecm_rx_skb(struct iecm_queue *rxq, struct sk_buff *skb);
+bool iecm_rx_buf_hw_alloc(struct iecm_queue *rxq, struct iecm_rx_buf *buf);
+void iecm_rx_buf_hw_update(struct iecm_queue *rxq, u32 val);
+void iecm_tx_buf_hw_update(struct iecm_queue *tx_q, u32 val,
+			   struct sk_buff *skb);
+void iecm_tx_buf_rel(struct iecm_queue *tx_q, struct iecm_tx_buf *tx_buf);
+unsigned int iecm_tx_desc_count_required(struct sk_buff *skb);
+int iecm_tx_maybe_stop(struct iecm_queue *tx_q, unsigned int size);
+void iecm_tx_timeout(struct net_device *netdev,
+		     unsigned int __always_unused txqueue);
+netdev_tx_t iecm_tx_splitq_start(struct sk_buff *skb,
+				 struct net_device *netdev);
+netdev_tx_t iecm_tx_singleq_start(struct sk_buff *skb,
+				  struct net_device *netdev);
+bool iecm_rx_singleq_buf_hw_alloc_all(struct iecm_queue *rxq,
+				      u16 cleaned_count);
+void iecm_get_stats64(struct net_device *netdev,
+		      struct rtnl_link_stats64 *stats);
+#endif /* !_IECM_TXRX_H_ */
-- 
2.26.2
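
A note on IECM_DESC_UNUSED() from iecm_txrx.h above: the expression keeps one
descriptor in reserve, so next_to_use == next_to_clean always means an empty
ring rather than a full one. A minimal userspace sketch of the same
arithmetic follows. struct demo_ring and demo_desc_unused() are hypothetical
names for illustration only, carrying just the three fields the macro reads;
they are not the driver's struct iecm_queue.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields IECM_DESC_UNUSED() reads. */
struct demo_ring {
	uint16_t desc_count;	/* number of descriptors in the ring */
	uint16_t next_to_use;	/* producer index */
	uint16_t next_to_clean;	/* consumer index */
};

/* Same expression as IECM_DESC_UNUSED() in the patch, renamed. */
#define DEMO_DESC_UNUSED(R)	\
	((((R)->next_to_clean > (R)->next_to_use) ? 0 : (R)->desc_count) + \
	(R)->next_to_clean - (R)->next_to_use - 1)

/* Function wrapper so the result has a fixed type. */
static int demo_desc_unused(const struct demo_ring *r)
{
	return DEMO_DESC_UNUSED(r);
}
```

With desc_count = 512, an empty ring reports 511 free descriptors, and a
ring whose producer sits one slot behind the consumer reports 0.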

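On IECM_TX_MAX_DESC_DATA_ALIGNED from iecm_txrx.h above: masking (16K - 1)
with ~(4096 - 1) rounds the per-descriptor limit down to a multiple of the
4K read request size, which works out to 12288 bytes. The sketch below
copies the three defines under DEMO_ names and adds a hypothetical helper,
demo_descs_for_len(), that is not part of the patch. It only illustrates how
such a limit could translate into a descriptor count for one buffer.

```c
/* Values copied from the patch, renamed for the sketch. */
#define DEMO_TX_MAX_READ_REQ_SIZE	4096
#define DEMO_TX_MAX_DESC_DATA		(16 * 1024 - 1)
#define DEMO_TX_MAX_DESC_DATA_ALIGNED \
	(~(DEMO_TX_MAX_READ_REQ_SIZE - 1) & DEMO_TX_MAX_DESC_DATA)

/* Hypothetical helper: number of data descriptors needed for a buffer of
 * len bytes when each descriptor carries at most the aligned maximum.
 */
static unsigned int demo_descs_for_len(unsigned int len)
{
	return (len + DEMO_TX_MAX_DESC_DATA_ALIGNED - 1) /
	       DEMO_TX_MAX_DESC_DATA_ALIGNED;
}
```

Under these assumptions a 12288 byte buffer fits in one descriptor, while a
16383 byte buffer needs two.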

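On the "Typical work flow" comment above the ptype lookup table in
iecm_txrx.h: the sketch below walks through the same known / outer_ip
decision in plain C. Everything prefixed demo_ or DEMO_ is hypothetical. The
struct is a cut-down version of struct iecm_rx_ptype_decoded, and because
the table in the patch is compact (IECM_RX_SUPP_PTYPE entries, not one per
ptype value), the sketch searches for a matching ptype instead of indexing
by it. That search is an assumption of this example, not something the
patch shows.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_OUTER_L2 = 0, DEMO_OUTER_IP = 1 };
enum { DEMO_INNER_NONE = 0, DEMO_INNER_UDP = 1, DEMO_INNER_TCP = 2 };

/* Cut-down, hypothetical version of struct iecm_rx_ptype_decoded. */
struct demo_ptype_decoded {
	unsigned int ptype:10;
	unsigned int known:1;
	unsigned int outer_ip:1;
	unsigned int inner_prot:4;
};

/* A few rows mirroring entries 0, 1, 24 and 26 of the table in the patch. */
static const struct demo_ptype_decoded demo_lkup[] = {
	{ 0,  0, 0,             0 },			/* unused entry */
	{ 1,  1, DEMO_OUTER_L2, DEMO_INNER_NONE },
	{ 24, 1, DEMO_OUTER_IP, DEMO_INNER_UDP },
	{ 26, 1, DEMO_OUTER_IP, DEMO_INNER_TCP },
};

enum demo_class {
	DEMO_UNKNOWN,
	DEMO_L2,
	DEMO_IP_UDP,
	DEMO_IP_TCP,
	DEMO_IP_OTHER,
};

static enum demo_class demo_classify(uint16_t ptype)
{
	const struct demo_ptype_decoded *d = NULL;
	size_t i;

	/* Compact table: find the row whose ptype field matches. */
	for (i = 0; i < sizeof(demo_lkup) / sizeof(demo_lkup[0]); i++) {
		if (demo_lkup[i].ptype == ptype) {
			d = &demo_lkup[i];
			break;
		}
	}

	/* IF NOT known THEN packet is unknown */
	if (!d || !d->known)
		return DEMO_UNKNOWN;

	/* ELSE IF outer_ip: look at the remaining fields */
	if (d->outer_ip == DEMO_OUTER_IP) {
		switch (d->inner_prot) {
		case DEMO_INNER_UDP:
			return DEMO_IP_UDP;
		case DEMO_INNER_TCP:
			return DEMO_IP_TCP;
		default:
			return DEMO_IP_OTHER;
		}
	}

	/* ELSE: an L2 packet type */
	return DEMO_L2;
}
```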

* [net-next 15/15] idpf: Introduce idpf driver
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alan Brady, netdev, nhorman, sassmann, Alice Michael, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, kbuild test robot,
	Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alan Brady <alan.brady@intel.com>

Utilize the Intel Ethernet Common Module (iecm) and provide a
device-specific implementation for data plane devices.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../networking/device_drivers/intel/idpf.rst  |  47 ++++++
 MAINTAINERS                                   |   1 +
 drivers/net/ethernet/intel/Kconfig            |   8 +
 drivers/net/ethernet/intel/Makefile           |   1 +
 drivers/net/ethernet/intel/idpf/Makefile      |  12 ++
 drivers/net/ethernet/intel/idpf/idpf_dev.h    |  17 ++
 drivers/net/ethernet/intel/idpf/idpf_devids.h |  10 ++
 drivers/net/ethernet/intel/idpf/idpf_main.c   | 136 ++++++++++++++++
 drivers/net/ethernet/intel/idpf/idpf_reg.c    | 152 ++++++++++++++++++
 9 files changed, 384 insertions(+)
 create mode 100644 Documentation/networking/device_drivers/intel/idpf.rst
 create mode 100644 drivers/net/ethernet/intel/idpf/Makefile
 create mode 100644 drivers/net/ethernet/intel/idpf/idpf_dev.h
 create mode 100644 drivers/net/ethernet/intel/idpf/idpf_devids.h
 create mode 100644 drivers/net/ethernet/intel/idpf/idpf_main.c
 create mode 100644 drivers/net/ethernet/intel/idpf/idpf_reg.c
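
For readers new to the split between iecm and idpf: idpf_main.c below hands
the common module a reg_ops_init callback, which in turn fills a table of
register-init function pointers. The userspace sketch here shows that
pattern in isolation. All demo_ names, the single reset_reg field and the
0x1234 value are made up for illustration, and how iecm_probe() actually
invokes these callbacks is not visible in this patch, so the "common" side
below is an assumption.

```c
struct demo_adapter;

/* Table of device-specific register setup hooks. */
struct demo_reg_ops {
	void (*reset_reg_init)(struct demo_adapter *adapter);
};

struct demo_dev_ops {
	/* Called by the common code to populate reg_ops. */
	void (*reg_ops_init)(struct demo_adapter *adapter);
	struct demo_reg_ops reg_ops;
};

struct demo_adapter {
	struct demo_dev_ops dev_ops;
	unsigned int reset_reg;	/* stands in for a device register offset */
};

/* "Device-specific" side, analogous in spirit to idpf_reg_ops_init(). */
static void demo_dev_reset_reg_init(struct demo_adapter *adapter)
{
	adapter->reset_reg = 0x1234;	/* made-up offset */
}

static void demo_dev_reg_ops_init(struct demo_adapter *adapter)
{
	adapter->dev_ops.reg_ops.reset_reg_init = demo_dev_reset_reg_init;
}

/* "Common" side: knows nothing about the device except the ops table. */
static int demo_common_probe(struct demo_adapter *adapter)
{
	if (!adapter->dev_ops.reg_ops_init)
		return -1;

	adapter->dev_ops.reg_ops_init(adapter);

	if (!adapter->dev_ops.reg_ops.reset_reg_init)
		return -1;

	adapter->dev_ops.reg_ops.reset_reg_init(adapter);
	return 0;
}
```

The point of the indirection is that a second device driver could reuse
demo_common_probe() unchanged and supply only its own reg_ops_init.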

diff --git a/Documentation/networking/device_drivers/intel/idpf.rst b/Documentation/networking/device_drivers/intel/idpf.rst
new file mode 100644
index 000000000000..973fa9613428
--- /dev/null
+++ b/Documentation/networking/device_drivers/intel/idpf.rst
@@ -0,0 +1,47 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================================================
+Linux Base Driver for the Intel(R) Smart Network Adapter Family Series
+======================================================================
+
+Intel idpf Linux driver.
+Copyright(c) 2020 Intel Corporation.
+
+Contents
+========
+
+- Enabling the driver
+- Support
+
+The driver in this release supports Intel's Smart Network Adapter Family Series
+of products. For more information, visit Intel's support page at
+https://support.intel.com.
+
+Enabling the driver
+===================
+The driver is enabled via the standard kernel configuration system,
+using the make command::
+
+  make oldconfig/menuconfig/etc.
+
+The driver is located in the menu structure at:
+
+  -> Device Drivers
+    -> Network device support (NETDEVICES [=y])
+      -> Ethernet driver support
+        -> Intel devices
+          -> Intel(R) Data Plane Function Support
+
+Support
+=======
+For general information, go to the Intel support website at:
+
+https://www.intel.com/support/
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+https://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on a supported kernel
+with a supported adapter, email the specific information related to the issue
+to e1000-devel@lists.sf.net.
diff --git a/MAINTAINERS b/MAINTAINERS
index 102ee1e4aef0..97ac0d417067 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8654,6 +8654,7 @@ F:	Documentation/networking/device_drivers/intel/fm10k.rst
 F:	Documentation/networking/device_drivers/intel/i40e.rst
 F:	Documentation/networking/device_drivers/intel/iavf.rst
 F:	Documentation/networking/device_drivers/intel/ice.rst
+F:	Documentation/networking/device_drivers/intel/idpf.rst
 F:	Documentation/networking/device_drivers/intel/iecm.rst
 F:	Documentation/networking/device_drivers/intel/igb.rst
 F:	Documentation/networking/device_drivers/intel/igbvf.rst
diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index 6dd985cbdb6d..9e0b3c1bf7c6 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -349,4 +349,12 @@ config IECM
 	help
 	 To compile this driver as a module, choose M here. The module
 	 will be called iecm.  MSI-X interrupt support is required
+
+config IDPF
+	tristate "Intel(R) Data Plane Function Support"
+	default n
+	depends on PCI
+	help
+	 To compile this driver as a module, choose M here. The module
+	 will be called idpf.
 endif # NET_VENDOR_INTEL
diff --git a/drivers/net/ethernet/intel/Makefile b/drivers/net/ethernet/intel/Makefile
index c9eba9cc5087..3786c2269f3d 100644
--- a/drivers/net/ethernet/intel/Makefile
+++ b/drivers/net/ethernet/intel/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_IAVF) += iavf/
 obj-$(CONFIG_FM10K) += fm10k/
 obj-$(CONFIG_ICE) += ice/
 obj-$(CONFIG_IECM) += iecm/
+obj-$(CONFIG_IDPF) += idpf/
diff --git a/drivers/net/ethernet/intel/idpf/Makefile b/drivers/net/ethernet/intel/idpf/Makefile
new file mode 100644
index 000000000000..ac6cac6c6360
--- /dev/null
+++ b/drivers/net/ethernet/intel/idpf/Makefile
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# Copyright (C) 2020 Intel Corporation
+
+#
+# Makefile for the Intel(R) Data Plane Function Linux Driver
+#
+
+obj-$(CONFIG_IDPF) += idpf.o
+
+idpf-y := \
+	idpf_main.o \
+	idpf_reg.o
diff --git a/drivers/net/ethernet/intel/idpf/idpf_dev.h b/drivers/net/ethernet/intel/idpf/idpf_dev.h
new file mode 100644
index 000000000000..1da33f5120a2
--- /dev/null
+++ b/drivers/net/ethernet/intel/idpf/idpf_dev.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2020 Intel Corporation */
+
+#ifndef _IDPF_DEV_H_
+#define _IDPF_DEV_H_
+
+#include <linux/net/intel/iecm.h>
+
+void idpf_intr_reg_init(struct iecm_vport *vport);
+void idpf_mb_intr_reg_init(struct iecm_adapter *adapter);
+void idpf_reset_reg_init(struct iecm_reset_reg *reset_reg);
+void idpf_trigger_reset(struct iecm_adapter *adapter,
+			enum iecm_flags trig_cause);
+void idpf_vportq_reg_init(struct iecm_vport *vport);
+void idpf_ctlq_reg_init(struct iecm_ctlq_create_info *cq);
+
+#endif /* _IDPF_DEV_H_ */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_devids.h b/drivers/net/ethernet/intel/idpf/idpf_devids.h
new file mode 100644
index 000000000000..ee373a04cb20
--- /dev/null
+++ b/drivers/net/ethernet/intel/idpf/idpf_devids.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2020 Intel Corporation */
+
+#ifndef _IDPF_DEVIDS_H_
+#define _IDPF_DEVIDS_H_
+
+/* Device IDs */
+#define IDPF_DEV_ID_PF			0x1452
+
+#endif /* _IDPF_DEVIDS_H_ */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c
new file mode 100644
index 000000000000..56f20abd57ee
--- /dev/null
+++ b/drivers/net/ethernet/intel/idpf/idpf_main.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2020 Intel Corporation */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include "idpf_dev.h"
+#include "idpf_devids.h"
+
+#define DRV_SUMMARY	"Intel(R) Data Plane Function Linux Driver"
+static const char idpf_driver_string[] = DRV_SUMMARY;
+static const char idpf_copyright[] = "Copyright (c) 2020, Intel Corporation.";
+
+MODULE_AUTHOR("Intel Corporation, <linux.nics@intel.com>");
+MODULE_DESCRIPTION(DRV_SUMMARY);
+MODULE_LICENSE("GPL v2");
+
+static int debug = -1;
+module_param(debug, int, 0644);
+#ifndef CONFIG_DYNAMIC_DEBUG
+MODULE_PARM_DESC(debug, "netif level (0=none,...,16=all), hw debug_mask (0x8XXXXXXX)");
+#else
+MODULE_PARM_DESC(debug, "netif level (0=none,...,16=all)");
+#endif /* !CONFIG_DYNAMIC_DEBUG */
+
+/**
+ * idpf_reg_ops_init - Initialize register API function pointers
+ * @adapter: Driver specific private structure
+ */
+static void idpf_reg_ops_init(struct iecm_adapter *adapter)
+{
+	adapter->dev_ops.reg_ops.ctlq_reg_init = idpf_ctlq_reg_init;
+	adapter->dev_ops.reg_ops.vportq_reg_init = idpf_vportq_reg_init;
+	adapter->dev_ops.reg_ops.intr_reg_init = idpf_intr_reg_init;
+	adapter->dev_ops.reg_ops.mb_intr_reg_init = idpf_mb_intr_reg_init;
+	adapter->dev_ops.reg_ops.reset_reg_init = idpf_reset_reg_init;
+	adapter->dev_ops.reg_ops.trigger_reset = idpf_trigger_reset;
+}
+
+/**
+ * idpf_probe - Device initialization routine
+ * @pdev: PCI device information struct
+ * @ent: entry in idpf_pci_tbl
+ *
+ * Returns 0 on success, negative on failure
+ */
+static int idpf_probe(struct pci_dev *pdev,
+		      const struct pci_device_id __always_unused *ent)
+{
+	struct iecm_adapter *adapter = NULL;
+	int err;
+
+	err = pcim_enable_device(pdev);
+	if (err)
+		return err;
+
+	adapter = kzalloc(sizeof(*adapter), GFP_KERNEL);
+	if (!adapter)
+		return -ENOMEM;
+
+	adapter->dev_ops.reg_ops_init = idpf_reg_ops_init;
+
+	err = iecm_probe(pdev, ent, adapter);
+	if (err)
+		kfree(adapter);
+
+	return err;
+}
+
+/**
+ * idpf_remove - Device removal routine
+ * @pdev: PCI device information struct
+ */
+static void idpf_remove(struct pci_dev *pdev)
+{
+	struct iecm_adapter *adapter = pci_get_drvdata(pdev);
+
+	iecm_remove(pdev);
+	kfree(adapter);
+}
+
+/* idpf_pci_tbl - PCI Dev idpf ID Table
+ *
+ * Wildcard entries (PCI_ANY_ID) should come last
+ * Last entry must be all 0s
+ *
+ * { Vendor ID, Device ID, SubVendor ID, SubDevice ID,
+ *   Class, Class Mask, private data (not used) }
+ */
+static const struct pci_device_id idpf_pci_tbl[] = {
+	{ PCI_VDEVICE(INTEL, IDPF_DEV_ID_PF), 0 },
+	/* required last entry */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, idpf_pci_tbl);
+
+static struct pci_driver idpf_driver = {
+	.name = KBUILD_MODNAME,
+	.id_table = idpf_pci_tbl,
+	.probe = idpf_probe,
+	.remove = idpf_remove,
+	.shutdown = iecm_shutdown,
+};
+
+/**
+ * idpf_module_init - Driver registration routine
+ *
+ * idpf_module_init is the first routine called when the driver is
+ * loaded. All it does is register with the PCI subsystem.
+ */
+static int __init idpf_module_init(void)
+{
+	int status;
+
+	pr_info("%s\n", idpf_driver_string);
+	pr_info("%s\n", idpf_copyright);
+
+	status = pci_register_driver(&idpf_driver);
+	if (status)
+		pr_err("failed to register pci driver, err %d\n", status);
+
+	return status;
+}
+module_init(idpf_module_init);
+
+/**
+ * idpf_module_exit - Driver exit cleanup routine
+ *
+ * idpf_module_exit is called just before the driver is removed
+ * from memory.
+ */
+static void __exit idpf_module_exit(void)
+{
+	pci_unregister_driver(&idpf_driver);
+	pr_info("module unloaded\n");
+}
+module_exit(idpf_module_exit);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_reg.c b/drivers/net/ethernet/intel/idpf/idpf_reg.c
new file mode 100644
index 000000000000..f5f364639cfc
--- /dev/null
+++ b/drivers/net/ethernet/intel/idpf/idpf_reg.c
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2020 Intel Corporation */
+
+#include "idpf_dev.h"
+#include <linux/net/intel/iecm_lan_pf_regs.h>
+
+/**
+ * idpf_ctlq_reg_init - initialize default mailbox registers
+ * @cq: pointer to the array of create control queues
+ */
+void idpf_ctlq_reg_init(struct iecm_ctlq_create_info *cq)
+{
+	int i;
+
+#define NUM_Q 2
+	for (i = 0; i < NUM_Q; i++) {
+		struct iecm_ctlq_create_info *ccq = cq + i;
+
+		switch (ccq->type) {
+		case IECM_CTLQ_TYPE_MAILBOX_TX:
+			/* set head and tail registers in our local struct */
+			ccq->reg.head = PF_FW_ATQH;
+			ccq->reg.tail = PF_FW_ATQT;
+			ccq->reg.len = PF_FW_ATQLEN;
+			ccq->reg.bah = PF_FW_ATQBAH;
+			ccq->reg.bal = PF_FW_ATQBAL;
+			ccq->reg.len_mask = PF_FW_ATQLEN_ATQLEN_M;
+			ccq->reg.len_ena_mask = PF_FW_ATQLEN_ATQENABLE_M;
+			ccq->reg.head_mask = PF_FW_ATQH_ATQH_M;
+			break;
+		case IECM_CTLQ_TYPE_MAILBOX_RX:
+			/* set head and tail registers in our local struct */
+			ccq->reg.head = PF_FW_ARQH;
+			ccq->reg.tail = PF_FW_ARQT;
+			ccq->reg.len = PF_FW_ARQLEN;
+			ccq->reg.bah = PF_FW_ARQBAH;
+			ccq->reg.bal = PF_FW_ARQBAL;
+			ccq->reg.len_mask = PF_FW_ARQLEN_ARQLEN_M;
+			ccq->reg.len_ena_mask = PF_FW_ARQLEN_ARQENABLE_M;
+			ccq->reg.head_mask = PF_FW_ARQH_ARQH_M;
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+/**
+ * idpf_vportq_reg_init - Initialize tail registers
+ * @vport: virtual port structure
+ */
+void idpf_vportq_reg_init(struct iecm_vport *vport)
+{
+	struct iecm_hw *hw = &vport->adapter->hw;
+	struct iecm_queue *q;
+	int i, j;
+
+	for (i = 0; i < vport->num_txq_grp; i++) {
+		int num_txq = vport->txq_grps[i].num_txq;
+
+		for (j = 0; j < num_txq; j++) {
+			q = &vport->txq_grps[i].txqs[j];
+			q->tail = hw->hw_addr + PF_QTX_COMM_DBELL(q->q_id);
+		}
+	}
+
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		struct iecm_rxq_group *rxq_grp = &vport->rxq_grps[i];
+		int num_rxq;
+
+		if (iecm_is_queue_model_split(vport->rxq_model)) {
+			for (j = 0; j < IECM_BUFQS_PER_RXQ_SET; j++) {
+				q = &rxq_grp->splitq.bufq_sets[j].bufq;
+				q->tail = hw->hw_addr +
+					  PF_QRX_BUFFQ_TAIL(q->q_id);
+			}
+
+			num_rxq = rxq_grp->splitq.num_rxq_sets;
+		} else {
+			num_rxq = rxq_grp->singleq.num_rxq;
+		}
+
+		for (j = 0; j < num_rxq; j++) {
+			if (iecm_is_queue_model_split(vport->rxq_model))
+				q = &rxq_grp->splitq.rxq_sets[j].rxq;
+			else
+				q = &rxq_grp->singleq.rxqs[j];
+			q->tail = hw->hw_addr + PF_QRX_TAIL(q->q_id);
+		}
+	}
+}
+
+/**
+ * idpf_mb_intr_reg_init - Initialize mailbox interrupt register
+ * @adapter: adapter structure
+ */
+void idpf_mb_intr_reg_init(struct iecm_adapter *adapter)
+{
+	struct iecm_intr_reg *intr = &adapter->mb_vector.intr_reg;
+	int vidx;
+
+	vidx = adapter->mb_vector.v_idx;
+	intr->dyn_ctl = PF_GLINT_DYN_CTL(vidx);
+	intr->dyn_ctl_intena_m = PF_GLINT_DYN_CTL_INTENA_M;
+	intr->dyn_ctl_itridx_m = 0x3 << PF_GLINT_DYN_CTL_ITR_INDX_S;
+}
+
+/**
+ * idpf_intr_reg_init - Initialize interrupt registers
+ * @vport: virtual port structure
+ */
+void idpf_intr_reg_init(struct iecm_vport *vport)
+{
+	int q_idx;
+
+	for (q_idx = 0; q_idx < vport->num_q_vectors; q_idx++) {
+		struct iecm_q_vector *q_vector = &vport->q_vectors[q_idx];
+		struct iecm_intr_reg *intr = &q_vector->intr_reg;
+		u32 vidx = q_vector->v_idx;
+
+		intr->dyn_ctl = PF_GLINT_DYN_CTL(vidx);
+		intr->dyn_ctl_clrpba_m = PF_GLINT_DYN_CTL_CLEARPBA_M;
+		intr->dyn_ctl_intena_m = PF_GLINT_DYN_CTL_INTENA_M;
+		intr->dyn_ctl_itridx_s = PF_GLINT_DYN_CTL_ITR_INDX_S;
+		intr->dyn_ctl_intrvl_s = PF_GLINT_DYN_CTL_INTERVAL_S;
+		intr->itr = PF_GLINT_ITR(VIRTCHNL_ITR_IDX_0, vidx);
+	}
+}
+
+/**
+ * idpf_reset_reg_init - Initialize reset registers
+ * @reset_reg: struct to be filled in with reset registers
+ */
+void idpf_reset_reg_init(struct iecm_reset_reg *reset_reg)
+{
+	reset_reg->rstat = PFGEN_RSTAT;
+	reset_reg->rstat_m = PFGEN_RSTAT_PFR_STATE_M;
+}
+
+/**
+ * idpf_trigger_reset - trigger reset
+ * @adapter: Driver specific private structure
+ * @trig_cause: Reason to trigger a reset
+ */
+void idpf_trigger_reset(struct iecm_adapter *adapter,
+			enum iecm_flags trig_cause)
+{
+	u32 reset_reg;
+
+	reset_reg = rd32(&adapter->hw, PFGEN_CTRL);
+	wr32(&adapter->hw, PFGEN_CTRL, (reset_reg | PFGEN_CTRL_PFSWR));
+}
-- 
2.26.2



* [net-next 12/15] iecm: Add singleq TX/RX
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

Implement legacy single queue model for TX/RX flows.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
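Reviewer note, not part of the commit: iecm_tx_singleq_csum() packs the
L2, L3 and L4 header lengths into one offsets word, with the MAC length
counted in 2-byte words and the IP and L4 lengths in 4-byte words. The
sketch below illustrates that packing in standalone C. The EX_* shift
values and the ex_pack_hdr_offsets() helper are hypothetical; the real
IECM_TX_DESC_LEN_* shifts live in headers that this patch does not show.
Unlike the driver, which takes the TCP length directly from doff (already
in 4-byte words), the helper takes all three lengths in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field positions, chosen only for illustration. */
#define EX_MACLEN_S	0	/* L2 header length, in 2-byte words */
#define EX_IPLEN_S	7	/* L3 header length, in 4-byte words */
#define EX_L4LEN_S	14	/* L4 header length, in 4-byte words */

/* Pack header lengths (given in bytes) into one offsets word. */
static uint32_t ex_pack_hdr_offsets(uint32_t l2_len, uint32_t l3_len,
				    uint32_t l4_len)
{
	uint32_t offset;

	/* The fields cannot express lengths that are not word multiples. */
	assert(l2_len % 2 == 0 && l3_len % 4 == 0 && l4_len % 4 == 0);

	offset  = (l2_len / 2) << EX_MACLEN_S;
	offset |= (l3_len / 4) << EX_IPLEN_S;
	offset |= (l4_len / 4) << EX_L4LEN_S;

	return offset;
}
```

For an untagged Ethernet frame carrying IPv4 and TCP without options,
the call would be ex_pack_hdr_offsets(14, 20, 20).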
 .../ethernet/intel/iecm/iecm_singleq_txrx.c   | 670 +++++++++++++++++-
 1 file changed, 652 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/intel/iecm/iecm_singleq_txrx.c b/drivers/net/ethernet/intel/iecm/iecm_singleq_txrx.c
index a85471e72d66..acb3e56063f5 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_singleq_txrx.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_singleq_txrx.c
@@ -17,7 +17,11 @@ static __le64
 iecm_tx_singleq_build_ctob(u64 td_cmd, u64 td_offset, unsigned int size,
 			   u64 td_tag)
 {
-	/* stub */
+	return cpu_to_le64(IECM_TX_DESC_DTYPE_DATA |
+			   (td_cmd    << IECM_TXD_QW1_CMD_S) |
+			   (td_offset << IECM_TXD_QW1_OFFSET_S) |
+			   ((u64)size << IECM_TXD_QW1_TX_BUF_SZ_S) |
+			   (td_tag    << IECM_TXD_QW1_L2TAG1_S));
 }
 
 /**
@@ -31,7 +35,93 @@ static
 int iecm_tx_singleq_csum(struct iecm_tx_buf *first,
 			 struct iecm_tx_offload_params *off)
 {
-	/* stub */
+	u32 l4_len = 0, l3_len = 0, l2_len = 0;
+	struct sk_buff *skb = first->skb;
+	union {
+		struct iphdr *v4;
+		struct ipv6hdr *v6;
+		unsigned char *hdr;
+	} ip;
+	union {
+		struct tcphdr *tcp;
+		unsigned char *hdr;
+	} l4;
+	__be16 frag_off, protocol;
+	unsigned char *exthdr;
+	u32 offset, cmd = 0;
+	u8 l4_proto = 0;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return 0;
+
+	if (skb->encapsulation)
+		return -1;
+
+	ip.hdr = skb_network_header(skb);
+	l4.hdr = skb_transport_header(skb);
+
+	/* compute outer L2 header size */
+	l2_len = ip.hdr - skb->data;
+	offset = (l2_len / 2) << IECM_TX_DESC_LEN_MACLEN_S;
+
+	/* Enable IP checksum offloads */
+	protocol = vlan_get_protocol(skb);
+	if (protocol == htons(ETH_P_IP)) {
+		l4_proto = ip.v4->protocol;
+		/* the stack already computes the IP header checksum; the
+		 * hardware only needs to recompute it in the TSO case.
+		 */
+		if (first->tx_flags & IECM_TX_FLAGS_TSO)
+			cmd |= IECM_TX_DESC_CMD_IIPT_IPV4_CSUM;
+		else
+			cmd |= IECM_TX_DESC_CMD_IIPT_IPV4;
+
+	} else if (protocol == htons(ETH_P_IPV6)) {
+		cmd |= IECM_TX_DESC_CMD_IIPT_IPV6;
+		exthdr = ip.hdr + sizeof(*ip.v6);
+		l4_proto = ip.v6->nexthdr;
+		if (l4.hdr != exthdr)
+			ipv6_skip_exthdr(skb, exthdr - skb->data, &l4_proto,
+					 &frag_off);
+	} else {
+		return -1;
+	}
+
+	/* compute inner L3 header size */
+	l3_len = l4.hdr - ip.hdr;
+	offset |= (l3_len / 4) << IECM_TX_DESC_LEN_IPLEN_S;
+
+	/* Enable L4 checksum offloads */
+	switch (l4_proto) {
+	case IPPROTO_TCP:
+		/* enable checksum offloads */
+		cmd |= IECM_TX_DESC_CMD_L4T_EOFT_TCP;
+		l4_len = l4.tcp->doff;
+		offset |= l4_len << IECM_TX_DESC_LEN_L4_LEN_S;
+		break;
+	case IPPROTO_UDP:
+		/* enable UDP checksum offload */
+		cmd |= IECM_TX_DESC_CMD_L4T_EOFT_UDP;
+		l4_len = (sizeof(struct udphdr) >> 2);
+		offset |= l4_len << IECM_TX_DESC_LEN_L4_LEN_S;
+		break;
+	case IPPROTO_SCTP:
+		/* enable SCTP checksum offload */
+		cmd |= IECM_TX_DESC_CMD_L4T_EOFT_SCTP;
+		l4_len = sizeof(struct sctphdr) >> 2;
+		offset |= l4_len << IECM_TX_DESC_LEN_L4_LEN_S;
+		break;
+
+	default:
+		if (first->tx_flags & IECM_TX_FLAGS_TSO)
+			return -1;
+		skb_checksum_help(skb);
+		return 0;
+	}
+
+	off->td_cmd |= cmd;
+	off->hdr_offsets |= offset;
+	return 1;
 }
 
 /**
@@ -48,7 +138,125 @@ static void
 iecm_tx_singleq_map(struct iecm_queue *tx_q, struct iecm_tx_buf *first,
 		    struct iecm_tx_offload_params *offloads)
 {
-	/* stub */
+	u32 offsets = offloads->hdr_offsets;
+	struct iecm_base_tx_desc *tx_desc;
+	u64  td_cmd = offloads->td_cmd;
+	unsigned int data_len, size;
+	struct iecm_tx_buf *tx_buf;
+	u16 i = tx_q->next_to_use;
+	struct netdev_queue *nq;
+	struct sk_buff *skb;
+	skb_frag_t *frag;
+	dma_addr_t dma;
+
+	skb = first->skb;
+
+	data_len = skb->data_len;
+	size = skb_headlen(skb);
+
+	tx_desc = IECM_BASE_TX_DESC(tx_q, i);
+
+	dma = dma_map_single(tx_q->dev, skb->data, size, DMA_TO_DEVICE);
+
+	tx_buf = first;
+
+	/* write each descriptor with CRC bit */
+	if (tx_q->vport->adapter->dev_ops.crc_enable)
+		tx_q->vport->adapter->dev_ops.crc_enable(&td_cmd);
+
+	for (frag = &skb_shinfo(skb)->frags[0];; frag++) {
+		unsigned int max_data = IECM_TX_MAX_DESC_DATA_ALIGNED;
+
+		if (dma_mapping_error(tx_q->dev, dma))
+			goto dma_error;
+
+		/* record length, and DMA address */
+		dma_unmap_len_set(tx_buf, len, size);
+		dma_unmap_addr_set(tx_buf, dma, dma);
+
+		/* align size to end of page */
+		max_data += -dma & (IECM_TX_MAX_READ_REQ_SIZE - 1);
+		tx_desc->buf_addr = cpu_to_le64(dma);
+
+		/* account for data chunks larger than the hardware
+		 * can handle
+		 */
+		while (unlikely(size > IECM_TX_MAX_DESC_DATA)) {
+			tx_desc->qw1 = iecm_tx_singleq_build_ctob(td_cmd,
+								  offsets,
+								  size, 0);
+			tx_desc++;
+			i++;
+
+			if (i == tx_q->desc_count) {
+				tx_desc = IECM_BASE_TX_DESC(tx_q, 0);
+				i = 0;
+			}
+
+			dma += max_data;
+			size -= max_data;
+
+			max_data = IECM_TX_MAX_DESC_DATA_ALIGNED;
+			tx_desc->buf_addr = cpu_to_le64(dma);
+		}
+
+		if (likely(!data_len))
+			break;
+		tx_desc->qw1 = iecm_tx_singleq_build_ctob(td_cmd, offsets,
+							  size, 0);
+		tx_desc++;
+		i++;
+
+		if (i == tx_q->desc_count) {
+			tx_desc = IECM_BASE_TX_DESC(tx_q, 0);
+			i = 0;
+		}
+
+		size = skb_frag_size(frag);
+		data_len -= size;
+
+		dma = skb_frag_dma_map(tx_q->dev, frag, 0, size,
+				       DMA_TO_DEVICE);
+
+		tx_buf = &tx_q->tx_buf[i];
+	}
+
+	/* record bytecount for BQL */
+	nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+	netdev_tx_sent_queue(nq, first->bytecount);
+
+	/* record SW timestamp if HW timestamp is not available */
+	skb_tx_timestamp(first->skb);
+
+	/* write last descriptor with RS and EOP bits */
+	td_cmd |= (u64)(IECM_TX_DESC_CMD_EOP | IECM_TX_DESC_CMD_RS);
+
+	tx_desc->qw1 = iecm_tx_singleq_build_ctob(td_cmd, offsets, size, 0);
+
+	i++;
+	if (i == tx_q->desc_count)
+		i = 0;
+
+	/* set next_to_watch value indicating a packet is present */
+	first->next_to_watch = tx_desc;
+
+	iecm_tx_buf_hw_update(tx_q, i, skb);
+
+	return;
+
+dma_error:
+	/* clear DMA mappings for failed tx_buf map */
+	for (;;) {
+		tx_buf = &tx_q->tx_buf[i];
+		iecm_tx_buf_rel(tx_q, tx_buf);
+		if (tx_buf == first)
+			break;
+		if (i == 0)
+			i = tx_q->desc_count;
+		i--;
+	}
+
+	tx_q->next_to_use = i;
 }
 
 /**
@@ -61,7 +269,42 @@ iecm_tx_singleq_map(struct iecm_queue *tx_q, struct iecm_tx_buf *first,
 static netdev_tx_t
 iecm_tx_singleq_frame(struct sk_buff *skb, struct iecm_queue *tx_q)
 {
-	/* stub */
+	struct iecm_tx_offload_params offload = {0};
+	struct iecm_tx_buf *first;
+	unsigned int count;
+	int csum;
+
+	count = iecm_tx_desc_count_required(skb);
+
+	/* need: 1 descriptor per page * PAGE_SIZE/IECM_MAX_DATA_PER_TXD,
+	 *       + 1 desc for skb_head_len/IECM_MAX_DATA_PER_TXD,
+	 *       + 4 desc gap to avoid the cache line where head is,
+	 *       + 1 desc for context descriptor,
+	 * otherwise try next time
+	 */
+	if (iecm_tx_maybe_stop(tx_q, count + IECM_TX_DESCS_PER_CACHE_LINE +
+			       IECM_TX_DESCS_FOR_CTX)) {
+		return NETDEV_TX_BUSY;
+	}
+
+	/* record the location of the first descriptor for this packet */
+	first = &tx_q->tx_buf[tx_q->next_to_use];
+	first->skb = skb;
+	first->bytecount = max_t(unsigned int, skb->len, ETH_ZLEN);
+	first->gso_segs = 1;
+	first->tx_flags = 0;
+
+	csum = iecm_tx_singleq_csum(first, &offload);
+	if (csum < 0)
+		goto out_drop;
+
+	iecm_tx_singleq_map(tx_q, first, &offload);
+
+	return NETDEV_TX_OK;
+
+out_drop:
+	dev_kfree_skb_any(skb);
+	return NETDEV_TX_OK;
 }
 
 /**
@@ -74,7 +317,18 @@ iecm_tx_singleq_frame(struct sk_buff *skb, struct iecm_queue *tx_q)
 netdev_tx_t iecm_tx_singleq_start(struct sk_buff *skb,
 				  struct net_device *netdev)
 {
-	/* stub */
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	struct iecm_queue *tx_q;
+
+	tx_q = vport->txqs[skb->queue_mapping];
+
+	/* hardware can't handle really short frames, hardware padding works
+	 * beyond this point
+	 */
+	if (skb_put_padto(skb, IECM_TX_MIN_LEN))
+		return NETDEV_TX_OK;
+
+	return iecm_tx_singleq_frame(skb, tx_q);
 }
 
 /**
@@ -85,7 +339,98 @@ netdev_tx_t iecm_tx_singleq_start(struct sk_buff *skb,
  */
 static bool iecm_tx_singleq_clean(struct iecm_queue *tx_q, int napi_budget)
 {
-	/* stub */
+	unsigned int budget = tx_q->vport->compln_clean_budget;
+	unsigned int total_bytes = 0, total_pkts = 0;
+	struct iecm_base_tx_desc *tx_desc;
+	s16 ntc = tx_q->next_to_clean;
+	struct iecm_tx_buf *tx_buf;
+	struct netdev_queue *nq;
+
+	tx_desc = IECM_BASE_TX_DESC(tx_q, ntc);
+	tx_buf = &tx_q->tx_buf[ntc];
+	ntc -= tx_q->desc_count;
+
+	do {
+		struct iecm_base_tx_desc *eop_desc = tx_buf->next_to_watch;
+
+		/* if next_to_watch is not set then no work pending */
+		if (!eop_desc)
+			break;
+
+		/* prevent any other reads prior to eop_desc */
+		smp_rmb();
+
+		/* if the descriptor isn't done, no work yet to do */
+		if (!(eop_desc->qw1 &
+		      cpu_to_le64(IECM_TX_DESC_DTYPE_DESC_DONE)))
+			break;
+
+		/* clear next_to_watch to prevent false hangs */
+		tx_buf->next_to_watch = NULL;
+
+		/* update the statistics for this packet */
+		total_bytes += tx_buf->bytecount;
+		total_pkts += tx_buf->gso_segs;
+
+		/* free the skb */
+		napi_consume_skb(tx_buf->skb, napi_budget);
+
+		/* unmap skb header data */
+		dma_unmap_single(tx_q->dev,
+				 dma_unmap_addr(tx_buf, dma),
+				 dma_unmap_len(tx_buf, len),
+				 DMA_TO_DEVICE);
+
+		/* clear tx_buf data */
+		tx_buf->skb = NULL;
+		dma_unmap_len_set(tx_buf, len, 0);
+
+		/* unmap remaining buffers */
+		while (tx_desc != eop_desc) {
+			tx_buf++;
+			tx_desc++;
+			ntc++;
+			if (unlikely(!ntc)) {
+				ntc -= tx_q->desc_count;
+				tx_buf = tx_q->tx_buf;
+				tx_desc = IECM_BASE_TX_DESC(tx_q, 0);
+			}
+
+			/* unmap any remaining paged data */
+			if (dma_unmap_len(tx_buf, len)) {
+				dma_unmap_page(tx_q->dev,
+					       dma_unmap_addr(tx_buf, dma),
+					       dma_unmap_len(tx_buf, len),
+					       DMA_TO_DEVICE);
+				dma_unmap_len_set(tx_buf, len, 0);
+			}
+		}
+
+		tx_buf++;
+		tx_desc++;
+		ntc++;
+		if (unlikely(!ntc)) {
+			ntc -= tx_q->desc_count;
+			tx_buf = tx_q->tx_buf;
+			tx_desc = IECM_BASE_TX_DESC(tx_q, 0);
+		}
+		/* update budget */
+		budget--;
+	} while (likely(budget));
+
+	ntc += tx_q->desc_count;
+	tx_q->next_to_clean = ntc;
+	nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+	netdev_tx_completed_queue(nq, total_pkts, total_bytes);
+	tx_q->itr.stats.tx.packets += total_pkts;
+	tx_q->itr.stats.tx.bytes += total_bytes;
+
+	u64_stats_update_begin(&tx_q->stats_sync);
+	tx_q->q_stats.tx.packets += total_pkts;
+	tx_q->q_stats.tx.bytes += total_bytes;
+	u64_stats_update_end(&tx_q->stats_sync);
+
+	return !!budget;
 }
 
 /**
@@ -98,7 +443,16 @@ static bool iecm_tx_singleq_clean(struct iecm_queue *tx_q, int napi_budget)
 static inline bool
 iecm_tx_singleq_clean_all(struct iecm_q_vector *q_vec, int budget)
 {
-	/* stub */
+	bool clean_complete = true;
+	int i, budget_per_q;
+
+	budget_per_q = max(budget / q_vec->num_txq, 1);
+	for (i = 0; i < q_vec->num_txq; i++) {
+		if (!iecm_tx_singleq_clean(q_vec->tx[i], budget_per_q))
+			clean_complete = false;
+	}
+
+	return clean_complete;
 }
 
 /**
@@ -116,7 +470,8 @@ static bool
 iecm_rx_singleq_test_staterr(struct iecm_singleq_base_rx_desc *rx_desc,
 			     const u64 stat_err_bits)
 {
-	/* stub */
+	return !!(rx_desc->qword1.status_error_ptype_len &
+		  cpu_to_le64(stat_err_bits));
 }
 
 /**
@@ -129,7 +484,15 @@ static bool iecm_rx_singleq_is_non_eop(struct iecm_queue *rxq,
 				       struct iecm_singleq_base_rx_desc
 				       *rx_desc, struct sk_buff *skb)
 {
-	/* stub */
+	/* if we are the last buffer then there is nothing else to do */
+#define IECM_RXD_EOF BIT(IECM_RX_BASE_DESC_STATUS_EOF_S)
+	if (likely(iecm_rx_singleq_test_staterr(rx_desc, IECM_RXD_EOF)))
+		return false;
+
+	/* place skb in next buffer to be received */
+	rxq->rx_buf.buf[rxq->next_to_clean].skb = skb;
+
+	return true;
 }
 
 /**
@@ -145,7 +508,63 @@ static void iecm_rx_singleq_csum(struct iecm_queue *rxq, struct sk_buff *skb,
 				 struct iecm_singleq_base_rx_desc *rx_desc,
 				 u8 ptype)
 {
-	/* stub */
+	u64 qw1 = le64_to_cpu(rx_desc->qword1.status_error_ptype_len);
+	struct iecm_rx_ptype_decoded decoded;
+	bool ipv4, ipv6;
+	u32 rx_status;
+	u8 rx_error;
+
+	/* Start with CHECKSUM_NONE and by default csum_level = 0 */
+	skb->ip_summed = CHECKSUM_NONE;
+	skb_checksum_none_assert(skb);
+
+	/* check if Rx checksum is enabled */
+	if (!(rxq->vport->netdev->features & NETIF_F_RXCSUM))
+		return;
+
+	rx_status = ((qw1 & IECM_RXD_QW1_STATUS_M) >> IECM_RXD_QW1_STATUS_S);
+	rx_error = ((qw1 & IECM_RXD_QW1_ERROR_M) >> IECM_RXD_QW1_ERROR_S);
+
+	/* check if HW has decoded the packet and checksum */
+	if (!(rx_status & BIT(IECM_RX_BASE_DESC_STATUS_L3L4P_S)))
+		return;
+
+	decoded = rxq->vport->rx_ptype_lkup[ptype];
+	if (!(decoded.known && decoded.outer_ip))
+		return;
+
+	ipv4 = (decoded.outer_ip == IECM_RX_PTYPE_OUTER_IP) &&
+	       (decoded.outer_ip_ver == IECM_RX_PTYPE_OUTER_IPV4);
+	ipv6 = (decoded.outer_ip == IECM_RX_PTYPE_OUTER_IP) &&
+	       (decoded.outer_ip_ver == IECM_RX_PTYPE_OUTER_IPV6);
+
+	if (ipv4 && (rx_error & (BIT(IECM_RX_BASE_DESC_ERROR_IPE_S) |
+				 BIT(IECM_RX_BASE_DESC_ERROR_EIPE_S))))
+		goto checksum_fail;
+	else if (ipv6 && (rx_status &
+		 (BIT(IECM_RX_BASE_DESC_STATUS_IPV6EXADD_S))))
+		goto checksum_fail;
+
+	/* check for L4 errors and handle packets that were not able to be
+	 * checksummed due to arrival speed
+	 */
+	if (rx_error & BIT(IECM_RX_BASE_DESC_ERROR_L3L4E_S))
+		goto checksum_fail;
+
+	/* Only report checksum unnecessary for ICMP, TCP, UDP, or SCTP */
+	switch (decoded.inner_prot) {
+	case IECM_RX_PTYPE_INNER_PROT_ICMP:
+	case IECM_RX_PTYPE_INNER_PROT_TCP:
+	case IECM_RX_PTYPE_INNER_PROT_UDP:
+	case IECM_RX_PTYPE_INNER_PROT_SCTP:
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+	default:
+		break;
+	}
+	return;
+
+checksum_fail:
+	dev_dbg(rxq->dev, "RX Checksum not available\n");
 }
 
 /**
@@ -163,7 +582,10 @@ iecm_rx_singleq_process_skb_fields(struct iecm_queue *rxq, struct sk_buff *skb,
 				   struct iecm_singleq_base_rx_desc *rx_desc,
 				   u8 ptype)
 {
-	/* stub */
+	/* modifies the skb - consumes the enet header */
+	skb->protocol = eth_type_trans(skb, rxq->vport->netdev);
+
+	iecm_rx_singleq_csum(rxq, skb, rx_desc, ptype);
 }
 
 /**
@@ -176,7 +598,44 @@ iecm_rx_singleq_process_skb_fields(struct iecm_queue *rxq, struct sk_buff *skb,
 bool iecm_rx_singleq_buf_hw_alloc_all(struct iecm_queue *rx_q,
 				      u16 cleaned_count)
 {
-	/* stub */
+	struct iecm_singleq_rx_buf_desc *singleq_rx_desc = NULL;
+	u16 nta = rx_q->next_to_alloc;
+	struct iecm_rx_buf *buf;
+
+	/* do nothing if no valid netdev defined */
+	if (!rx_q->vport->netdev || !cleaned_count)
+		return false;
+
+	singleq_rx_desc = IECM_SINGLEQ_RX_BUF_DESC(rx_q, nta);
+	buf = &rx_q->rx_buf.buf[nta];
+
+	do {
+		if (!iecm_rx_buf_hw_alloc(rx_q, buf))
+			break;
+
+		/* Refresh the desc even if buffer_addrs didn't change
+		 * because each write-back erases this info.
+		 */
+		singleq_rx_desc->pkt_addr =
+			cpu_to_le64(buf->dma + buf->page_offset);
+		singleq_rx_desc->hdr_addr = 0;
+		singleq_rx_desc++;
+
+		buf++;
+		nta++;
+		if (unlikely(nta == rx_q->desc_count)) {
+			singleq_rx_desc = IECM_SINGLEQ_RX_BUF_DESC(rx_q, 0);
+			buf = rx_q->rx_buf.buf;
+			nta = 0;
+		}
+
+		cleaned_count--;
+	} while (cleaned_count);
+
+	if (rx_q->next_to_alloc != nta)
+		iecm_rx_buf_hw_update(rx_q, nta);
+
+	return !!cleaned_count;
 }
 
 /**
@@ -190,7 +649,16 @@ bool iecm_rx_singleq_buf_hw_alloc_all(struct iecm_queue *rx_q,
 static void iecm_singleq_rx_put_buf(struct iecm_queue *rx_bufq,
 				    struct iecm_rx_buf *rx_buf)
 {
-	/* stub */
+	u16 ntu = rx_bufq->next_to_use;
+	bool recycled = false;
+
+	recycled = iecm_rx_recycle_buf(rx_bufq, false, rx_buf);
+
+	/* update and store next_to_use if the buffer was recycled */
+	if (recycled) {
+		ntu++;
+		rx_bufq->next_to_use = (ntu < rx_bufq->desc_count) ? ntu : 0;
+	}
 }
 
 /**
@@ -199,7 +667,12 @@ static void iecm_singleq_rx_put_buf(struct iecm_queue *rx_bufq,
  */
 static void iecm_singleq_rx_bump_ntc(struct iecm_queue *q)
 {
-	/* stub */
+	u16 ntc = q->next_to_clean + 1;
+	/* fetch, update, and store next to clean */
+	if (ntc < q->desc_count)
+		q->next_to_clean = ntc;
+	else
+		q->next_to_clean = 0;
 }
 
 /**
@@ -214,7 +687,17 @@ static struct sk_buff *
 iecm_singleq_rx_get_buf_page(struct device *dev, struct iecm_rx_buf *rx_buf,
 			     const unsigned int size)
 {
-	/* stub */
+	prefetch(rx_buf->page);
+
+	/* we are reusing so sync this buffer for CPU use */
+	dma_sync_single_range_for_cpu(dev, rx_buf->dma,
+				      rx_buf->page_offset, size,
+				      DMA_FROM_DEVICE);
+
+	/* We have pulled a buffer for use, so decrement pagecnt_bias */
+	rx_buf->pagecnt_bias--;
+
+	return rx_buf->skb;
 }
 
 /**
@@ -226,7 +709,116 @@ iecm_singleq_rx_get_buf_page(struct device *dev, struct iecm_rx_buf *rx_buf,
  */
 static int iecm_rx_singleq_clean(struct iecm_queue *rx_q, int budget)
 {
-	/* stub */
+	struct iecm_singleq_base_rx_desc *singleq_base_rx_desc;
+	unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
+	u16 cleaned_count = 0;
+	bool failure = false;
+
+	/* Process Rx packets bounded by budget */
+	while (likely(total_rx_pkts < (unsigned int)budget)) {
+		union iecm_rx_desc *rx_desc;
+		struct sk_buff *skb = NULL;
+		struct iecm_rx_buf *rx_buf;
+		unsigned int size;
+		u8 rx_ptype;
+		u64 qword;
+
+		/* get the Rx desc from Rx queue based on 'next_to_clean' */
+		rx_desc = IECM_RX_DESC(rx_q, rx_q->next_to_clean);
+		singleq_base_rx_desc = (struct iecm_singleq_base_rx_desc *)
+					rx_desc;
+		/* status_error_ptype_len will always be zero for unused
+		 * descriptors because it's cleared in cleanup, and overlaps
+		 * with hdr_addr which is always zero because packet split
+		 * isn't used, if the hardware wrote DD then the length will be
+		 * non-zero
+		 */
+		qword =
+		le64_to_cpu(rx_desc->base_wb.qword1.status_error_ptype_len);
+
+		/* This memory barrier is needed to keep us from reading
+		 * any other fields out of the rx_desc
+		 */
+		dma_rmb();
+#define IECM_RXD_DD BIT(IECM_RX_BASE_DESC_STATUS_DD_S)
+		if (!iecm_rx_singleq_test_staterr(singleq_base_rx_desc,
+						  IECM_RXD_DD))
+			break;
+
+		size = (qword & IECM_RXD_QW1_LEN_PBUF_M) >>
+		       IECM_RXD_QW1_LEN_PBUF_S;
+		if (!size)
+			break;
+
+		rx_buf = &rx_q->rx_buf.buf[rx_q->next_to_clean];
+		skb = iecm_singleq_rx_get_buf_page(rx_q->dev, rx_buf, size);
+
+		if (skb)
+			iecm_rx_add_frag(rx_buf, skb, size);
+		else
+			skb = iecm_rx_construct_skb(rx_q, rx_buf, size);
+
+		/* exit if we failed to retrieve a buffer */
+		if (!skb) {
+			rx_buf->pagecnt_bias++;
+			break;
+		}
+
+		iecm_singleq_rx_put_buf(rx_q, rx_buf);
+		iecm_singleq_rx_bump_ntc(rx_q);
+
+		cleaned_count++;
+
+		/* skip if it is non EOP desc */
+		if (iecm_rx_singleq_is_non_eop(rx_q, singleq_base_rx_desc,
+					       skb))
+			continue;
+
+#define IECM_RXD_ERR_S BIT(IECM_RXD_QW1_ERROR_S)
+		if (unlikely(iecm_rx_singleq_test_staterr(singleq_base_rx_desc,
+							  IECM_RXD_ERR_S))) {
+			dev_kfree_skb_any(skb);
+			skb = NULL;
+			continue;
+		}
+
+		/* correct empty headers and pad skb if needed (to make valid
+		 * Ethernet frame)
+		 */
+		if (iecm_rx_cleanup_headers(skb)) {
+			skb = NULL;
+			continue;
+		}
+
+		/* probably a little skewed due to removing CRC */
+		total_rx_bytes += skb->len;
+
+		rx_ptype = (qword & IECM_RXD_QW1_PTYPE_M) >>
+				IECM_RXD_QW1_PTYPE_S;
+
+		/* protocol */
+		iecm_rx_singleq_process_skb_fields(rx_q, skb,
+						   singleq_base_rx_desc,
+						   rx_ptype);
+
+		/* send completed skb up the stack */
+		iecm_rx_skb(rx_q, skb);
+
+		/* update budget accounting */
+		total_rx_pkts++;
+	}
+	if (cleaned_count)
+		failure = iecm_rx_singleq_buf_hw_alloc_all(rx_q, cleaned_count);
+
+	rx_q->itr.stats.rx.packets += total_rx_pkts;
+	rx_q->itr.stats.rx.bytes += total_rx_bytes;
+	u64_stats_update_begin(&rx_q->stats_sync);
+	rx_q->q_stats.rx.packets += total_rx_pkts;
+	rx_q->q_stats.rx.bytes += total_rx_bytes;
+	u64_stats_update_end(&rx_q->stats_sync);
+
+	/* guarantee a trip back through this routine if there was a failure */
+	return failure ? budget : (int)total_rx_pkts;
 }
 
 /**
@@ -241,7 +833,22 @@ static inline bool
 iecm_rx_singleq_clean_all(struct iecm_q_vector *q_vec, int budget,
 			  int *cleaned)
 {
-	/* stub */
+	bool clean_complete = true;
+	int pkts_cleaned_per_q;
+	int budget_per_q, i;
+
+	budget_per_q = max(budget / q_vec->num_rxq, 1);
+	for (i = 0; i < q_vec->num_rxq; i++) {
+		pkts_cleaned_per_q = iecm_rx_singleq_clean(q_vec->rx[i],
+							   budget_per_q);
+
+		/* if we clean as many as budgeted, we must not be done */
+		if (pkts_cleaned_per_q >= budget_per_q)
+			clean_complete = false;
+		*cleaned += pkts_cleaned_per_q;
+	}
+
+	return clean_complete;
 }
 
 /**
@@ -251,5 +858,32 @@ iecm_rx_singleq_clean_all(struct iecm_q_vector *q_vec, int budget,
  */
 int iecm_vport_singleq_napi_poll(struct napi_struct *napi, int budget)
 {
-	/* stub */
+	struct iecm_q_vector *q_vector =
+				container_of(napi, struct iecm_q_vector, napi);
+	bool clean_complete;
+	int work_done = 0;
+
+	clean_complete = iecm_tx_singleq_clean_all(q_vector, budget);
+
+	/* Handle case where we are called by netpoll with a budget of 0 */
+	if (budget <= 0)
+		return budget;
+
+	/* We attempt to distribute budget to each Rx queue fairly, but don't
+	 * allow the budget to go below 1 because that would exit polling early.
+	 */
+	clean_complete &= iecm_rx_singleq_clean_all(q_vector, budget,
+						    &work_done);
+
+	/* If work not completed, return budget and polling will return */
+	if (!clean_complete)
+		return budget;
+
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		iecm_vport_intr_update_itr_ena_irq(q_vector);
+
+	return min_t(int, work_done, budget - 1);
 }
-- 
2.26.2

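Reviewer note on the patch above: iecm_tx_singleq_clean() keeps
next_to_clean biased by -desc_count while it walks the ring, so the wrap
check reduces to a test against zero, and it removes the bias before
storing the index back. A minimal standalone sketch of that idiom
follows. The ex_ring_advance() name is hypothetical, and it uses a
32-bit signed counter where the driver uses s16.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a ring index by 'steps' slots using a biased counter that is
 * kept in the range [-count, -1] while walking.
 */
static uint16_t ex_ring_advance(uint16_t start, uint16_t count,
				unsigned int steps)
{
	int32_t ntc;

	assert(count > 0 && start < count);

	ntc = (int32_t)start - count;	/* apply the bias */
	while (steps--) {
		ntc++;
		if (!ntc)		/* stepped past the last slot */
			ntc -= count;	/* wrap back to slot 0 */
	}

	return (uint16_t)(ntc + count);	/* remove the bias */
}
```

The result matches (start + steps) % count, but the loop body needs no
comparison against the ring size.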


* [net-next 11/15] iecm: Add splitq TX/RX
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

Implement main TX/RX flows for split queue model.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
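Reviewer note, not part of the commit: iecm_buf_lifo_push() and
iecm_buf_lifo_pop() implement a fixed-capacity stack of buffer pointers,
where 'top' counts the stored entries. The standalone sketch below shows
the same behaviour outside the driver. The ex_* types and names are
hypothetical stand-ins for struct iecm_buf_lifo and struct iecm_tx_buf,
and -1 stands in for IECM_ERR_MAX_LIMIT.

```c
#include <assert.h>
#include <stddef.h>

struct ex_buf {
	int id;
};

struct ex_buf_lifo {
	size_t top;		/* number of entries currently stored */
	size_t size;		/* capacity of bufs[] */
	struct ex_buf **bufs;	/* caller-provided backing array */
};

/* Returns 0 on success, -1 if the stack is already full. */
static int ex_lifo_push(struct ex_buf_lifo *s, struct ex_buf *b)
{
	assert(s && s->bufs);

	if (s->top == s->size)
		return -1;

	s->bufs[s->top++] = b;
	return 0;
}

/* Returns the most recently pushed entry, or NULL if the stack is empty. */
static struct ex_buf *ex_lifo_pop(struct ex_buf_lifo *s)
{
	assert(s && s->bufs);

	if (!s->top)
		return NULL;

	return s->bufs[--s->top];
}
```

Both operations are O(1) and never allocate, which suits the completion
path where the stash helper pops a shadow buffer.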
 drivers/net/ethernet/intel/iecm/iecm_txrx.c | 1283 ++++++++++++++++++-
 1 file changed, 1235 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/intel/iecm/iecm_txrx.c b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
index 92dc25c10a6c..071f78858282 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_txrx.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
@@ -11,7 +11,12 @@
 static enum iecm_status iecm_buf_lifo_push(struct iecm_buf_lifo *stack,
 					   struct iecm_tx_buf *buf)
 {
-	/* stub */
+	if (stack->top == stack->size)
+		return IECM_ERR_MAX_LIMIT;
+
+	stack->bufs[stack->top++] = buf;
+
+	return 0;
 }
 
 /**
@@ -20,7 +25,10 @@ static enum iecm_status iecm_buf_lifo_push(struct iecm_buf_lifo *stack,
  **/
 static struct iecm_tx_buf *iecm_buf_lifo_pop(struct iecm_buf_lifo *stack)
 {
-	/* stub */
+	if (!stack->top)
+		return NULL;
+
+	return stack->bufs[--stack->top];
 }
 
 /**
@@ -31,7 +39,16 @@ static struct iecm_tx_buf *iecm_buf_lifo_pop(struct iecm_buf_lifo *stack)
 void iecm_get_stats64(struct net_device *netdev,
 		      struct rtnl_link_stats64 *stats)
 {
-	/* stub */
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+
+	iecm_send_get_stats_msg(vport);
+	stats->rx_packets = vport->netstats.rx_packets;
+	stats->tx_packets = vport->netstats.tx_packets;
+	stats->rx_bytes = vport->netstats.rx_bytes;
+	stats->tx_bytes = vport->netstats.tx_bytes;
+	stats->tx_errors = vport->netstats.tx_errors;
+	stats->rx_dropped = vport->netstats.rx_dropped;
+	stats->tx_dropped = vport->netstats.tx_dropped;
 }
 
 /**
@@ -1246,7 +1263,16 @@ enum iecm_status iecm_vport_queues_alloc(struct iecm_vport *vport)
 static struct iecm_queue *
 iecm_tx_find_q(struct iecm_vport *vport, int q_id)
 {
-	/* stub */
+	int i;
+
+	for (i = 0; i < vport->num_txq; i++) {
+		struct iecm_queue *tx_q = vport->txqs[i];
+
+		if (tx_q->q_id == q_id)
+			return tx_q;
+	}
+
+	return NULL;
 }
 
 /**
@@ -1255,7 +1281,22 @@ iecm_tx_find_q(struct iecm_vport *vport, int q_id)
  */
 static void iecm_tx_handle_sw_marker(struct iecm_queue *tx_q)
 {
-	/* stub */
+	struct iecm_vport *vport = tx_q->vport;
+	bool drain_complete = true;
+	int i;
+
+	clear_bit(__IECM_Q_SW_MARKER, tx_q->flags);
+	/* Hardware must write marker packets to all queues associated with
+	 * completion queues. So check if all queues received marker packets
+	 */
+	for (i = 0; i < vport->num_txq; i++) {
+		if (test_bit(__IECM_Q_SW_MARKER, vport->txqs[i]->flags))
+			drain_complete = false;
+	}
+	if (drain_complete) {
+		set_bit(__IECM_VPORT_SW_MARKER, vport->flags);
+		wake_up(&vport->sw_marker_wq);
+	}
 }
 
 /**
@@ -1270,7 +1311,30 @@ static struct iecm_tx_queue_stats
 iecm_tx_splitq_clean_buf(struct iecm_queue *tx_q, struct iecm_tx_buf *tx_buf,
 			 int napi_budget)
 {
-	/* stub */
+	struct iecm_tx_queue_stats cleaned = {0};
+	struct netdev_queue *nq;
+
+	/* update the statistics for this packet */
+	cleaned.bytes = tx_buf->bytecount;
+	cleaned.packets = tx_buf->gso_segs;
+
+	/* free the skb */
+	napi_consume_skb(tx_buf->skb, napi_budget);
+	nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+	netdev_tx_completed_queue(nq, cleaned.packets,
+				  cleaned.bytes);
+
+	/* unmap skb header data */
+	dma_unmap_single(tx_q->dev,
+			 dma_unmap_addr(tx_buf, dma),
+			 dma_unmap_len(tx_buf, len),
+			 DMA_TO_DEVICE);
+
+	/* clear tx_buf data */
+	tx_buf->skb = NULL;
+	dma_unmap_len_set(tx_buf, len, 0);
+
+	return cleaned;
 }
 
 /**
@@ -1282,7 +1346,33 @@ iecm_tx_splitq_clean_buf(struct iecm_queue *tx_q, struct iecm_tx_buf *tx_buf,
 static int
 iecm_stash_flow_sch_buffers(struct iecm_queue *txq, struct iecm_tx_buf *tx_buf)
 {
-	/* stub */
+	struct iecm_adapter *adapter = txq->vport->adapter;
+	struct iecm_tx_buf *shadow_buf;
+
+	shadow_buf = iecm_buf_lifo_pop(&txq->buf_stack);
+	if (!shadow_buf) {
+		dev_err(&adapter->pdev->dev,
+			"No out-of-order TX buffers left!\n");
+		return -ENOMEM;
+	}
+
+	/* Store buffer params in shadow buffer */
+	shadow_buf->skb = tx_buf->skb;
+	shadow_buf->bytecount = tx_buf->bytecount;
+	shadow_buf->gso_segs = tx_buf->gso_segs;
+	shadow_buf->dma = tx_buf->dma;
+	shadow_buf->len = tx_buf->len;
+	shadow_buf->compl_tag = tx_buf->compl_tag;
+
+	/* Add buffer to buf_hash table to be freed
+	 * later
+	 */
+	hash_add(txq->sched_buf_hash, &shadow_buf->hlist,
+		 shadow_buf->compl_tag);
+
+	memset(tx_buf, 0, sizeof(*tx_buf));
+
+	return 0;
 }
 
 /**
@@ -1305,7 +1395,91 @@ static struct iecm_tx_queue_stats
 iecm_tx_splitq_clean(struct iecm_queue *tx_q, u16 end, int napi_budget,
 		     bool descs_only)
 {
-	/* stub */
+	union iecm_tx_flex_desc *next_pending_desc = NULL;
+	struct iecm_tx_queue_stats cleaned_stats = {0};
+	union iecm_tx_flex_desc *tx_desc;
+	s16 ntc = tx_q->next_to_clean;
+	struct iecm_tx_buf *tx_buf;
+
+	tx_desc = IECM_FLEX_TX_DESC(tx_q, ntc);
+	next_pending_desc = IECM_FLEX_TX_DESC(tx_q, end);
+	tx_buf = &tx_q->tx_buf[ntc];
+	ntc -= tx_q->desc_count;
+
+	while (tx_desc != next_pending_desc) {
+		union iecm_tx_flex_desc *eop_desc =
+			(union iecm_tx_flex_desc *)tx_buf->next_to_watch;
+
+		/* clear next_to_watch to prevent false hangs */
+		tx_buf->next_to_watch = NULL;
+
+		if (descs_only) {
+			if (iecm_stash_flow_sch_buffers(tx_q, tx_buf))
+				goto tx_splitq_clean_out;
+
+			while (tx_desc != eop_desc) {
+				tx_buf++;
+				tx_desc++;
+				ntc++;
+				if (unlikely(!ntc)) {
+					ntc -= tx_q->desc_count;
+					tx_buf = tx_q->tx_buf;
+					tx_desc = IECM_FLEX_TX_DESC(tx_q, 0);
+				}
+
+				if (dma_unmap_len(tx_buf, len)) {
+					if (iecm_stash_flow_sch_buffers(tx_q,
+									tx_buf))
+						goto tx_splitq_clean_out;
+				}
+			}
+		} else {
+			struct iecm_tx_queue_stats buf_stats = {0};
+
+			buf_stats = iecm_tx_splitq_clean_buf(tx_q, tx_buf,
+							     napi_budget);
+
+			/* update the statistics for this packet */
+			cleaned_stats.bytes += buf_stats.bytes;
+			cleaned_stats.packets += buf_stats.packets;
+
+			/* unmap remaining buffers */
+			while (tx_desc != eop_desc) {
+				tx_buf++;
+				tx_desc++;
+				ntc++;
+				if (unlikely(!ntc)) {
+					ntc -= tx_q->desc_count;
+					tx_buf = tx_q->tx_buf;
+					tx_desc = IECM_FLEX_TX_DESC(tx_q, 0);
+				}
+
+				/* unmap any remaining paged data */
+				if (dma_unmap_len(tx_buf, len)) {
+					dma_unmap_page(tx_q->dev,
+						dma_unmap_addr(tx_buf, dma),
+						dma_unmap_len(tx_buf, len),
+						DMA_TO_DEVICE);
+					dma_unmap_len_set(tx_buf, len, 0);
+				}
+			}
+		}
+
+		tx_buf++;
+		tx_desc++;
+		ntc++;
+		if (unlikely(!ntc)) {
+			ntc -= tx_q->desc_count;
+			tx_buf = tx_q->tx_buf;
+			tx_desc = IECM_FLEX_TX_DESC(tx_q, 0);
+		}
+	}
+
+tx_splitq_clean_out:
+	ntc += tx_q->desc_count;
+	tx_q->next_to_clean = ntc;
+
+	return cleaned_stats;
 }
 
 /**
@@ -1315,7 +1489,18 @@ iecm_tx_splitq_clean(struct iecm_queue *tx_q, u16 end, int napi_budget,
  */
 static inline void iecm_tx_hw_tstamp(struct sk_buff *skb, u8 *desc_ts)
 {
-	/* stub */
+	struct skb_shared_hwtstamps hwtstamps;
+	u64 tstamp;
+
+	/* Only report timestamp to stack if requested */
+	if (unlikely(!(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)))
+		return;
+
+	tstamp = (desc_ts[0] | (desc_ts[1] << 8) | (desc_ts[2] & 0x3F) << 16);
+	hwtstamps.hwtstamp =
+		ns_to_ktime(tstamp << IECM_TW_TIME_STAMP_GRAN_512_DIV_S);
+
+	skb_tstamp_tx(skb, &hwtstamps);
 }
 
 /**
@@ -1330,7 +1515,39 @@ static struct iecm_tx_queue_stats
 iecm_tx_clean_flow_sch_bufs(struct iecm_queue *txq, u16 compl_tag,
 			    u8 *desc_ts, int budget)
 {
-	/* stub */
+	struct iecm_tx_queue_stats cleaned_stats = {0};
+	struct hlist_node *tmp_buf = NULL;
+	struct iecm_tx_buf *tx_buf = NULL;
+
+	/* Buffer completion */
+	hash_for_each_possible_safe(txq->sched_buf_hash, tx_buf, tmp_buf,
+				    hlist, compl_tag) {
+		if (tx_buf->compl_tag != compl_tag)
+			continue;
+
+		if (likely(tx_buf->skb)) {
+			/* fetch timestamp from completion
+			 * descriptor to report to stack
+			 */
+			iecm_tx_hw_tstamp(tx_buf->skb, desc_ts);
+
+			cleaned_stats = iecm_tx_splitq_clean_buf(txq, tx_buf,
+								 budget);
+		} else if (dma_unmap_len(tx_buf, len)) {
+			dma_unmap_page(txq->dev,
+				       dma_unmap_addr(tx_buf, dma),
+				       dma_unmap_len(tx_buf, len),
+				       DMA_TO_DEVICE);
+			dma_unmap_len_set(tx_buf, len, 0);
+		}
+
+		/* Push shadow buf back onto stack */
+		iecm_buf_lifo_push(&txq->buf_stack, tx_buf);
+
+		hash_del(&tx_buf->hlist);
+	}
+
+	return cleaned_stats;
 }
 
 /**
@@ -1343,7 +1560,109 @@ iecm_tx_clean_flow_sch_bufs(struct iecm_queue *txq, u16 compl_tag,
 static bool
 iecm_tx_clean_complq(struct iecm_queue *complq, int budget)
 {
-	/* stub */
+	struct iecm_splitq_tx_compl_desc *tx_desc;
+	struct iecm_vport *vport = complq->vport;
+	s16 ntc = complq->next_to_clean;
+	unsigned int complq_budget;
+
+	complq_budget = vport->compln_clean_budget;
+	tx_desc = IECM_SPLITQ_TX_COMPLQ_DESC(complq, ntc);
+	ntc -= complq->desc_count;
+
+	do {
+		struct iecm_tx_queue_stats cleaned_stats = {0};
+		bool descs_only = false;
+		struct iecm_queue *tx_q;
+		u16 compl_tag, hw_head;
+		int tx_qid;
+		u8 ctype;	/* completion type */
+		u16 gen;
+
+		/* if the descriptor isn't done, no work yet to do */
+		gen = (le16_to_cpu(tx_desc->qid_comptype_gen) &
+		      IECM_TXD_COMPLQ_GEN_M) >> IECM_TXD_COMPLQ_GEN_S;
+		if (test_bit(__IECM_Q_GEN_CHK, complq->flags) != gen)
+			break;
+
+		/* Find necessary info of TX queue to clean buffers */
+		tx_qid = (le16_to_cpu(tx_desc->qid_comptype_gen) &
+			 IECM_TXD_COMPLQ_QID_M) >> IECM_TXD_COMPLQ_QID_S;
+		tx_q = iecm_tx_find_q(vport, tx_qid);
+		if (!tx_q) {
+			dev_err(&complq->vport->adapter->pdev->dev,
+				"TxQ #%d not found\n", tx_qid);
+			goto fetch_next_desc;
+		}
+
+		/* Determine completion type */
+		ctype = (le16_to_cpu(tx_desc->qid_comptype_gen) &
+			IECM_TXD_COMPLQ_COMPL_TYPE_M) >>
+			IECM_TXD_COMPLQ_COMPL_TYPE_S;
+		switch (ctype) {
+		case IECM_TXD_COMPLT_RE:
+			hw_head = le16_to_cpu(tx_desc->q_head_compl_tag.q_head);
+
+			cleaned_stats = iecm_tx_splitq_clean(tx_q, hw_head,
+							     budget,
+							     descs_only);
+			break;
+		case IECM_TXD_COMPLT_RS:
+			if (test_bit(__IECM_Q_FLOW_SCH_EN, tx_q->flags)) {
+				compl_tag =
+				le16_to_cpu(tx_desc->q_head_compl_tag.compl_tag);
+
+				cleaned_stats =
+					iecm_tx_clean_flow_sch_bufs(tx_q,
+								    compl_tag,
+								    tx_desc->ts,
+								    budget);
+			} else {
+				hw_head =
+				le16_to_cpu(tx_desc->q_head_compl_tag.q_head);
+
+				cleaned_stats = iecm_tx_splitq_clean(tx_q,
+								     hw_head,
+								     budget,
+								     false);
+			}
+
+			break;
+		case IECM_TXD_COMPLT_SW_MARKER:
+			iecm_tx_handle_sw_marker(tx_q);
+			break;
+		default:
+			dev_err(&tx_q->vport->adapter->pdev->dev,
+				"Unknown TX completion type: %d\n",
+				ctype);
+			goto fetch_next_desc;
+		}
+
+		tx_q->itr.stats.tx.packets += cleaned_stats.packets;
+		tx_q->itr.stats.tx.bytes += cleaned_stats.bytes;
+		u64_stats_update_begin(&tx_q->stats_sync);
+		tx_q->q_stats.tx.packets += cleaned_stats.packets;
+		tx_q->q_stats.tx.bytes += cleaned_stats.bytes;
+		u64_stats_update_end(&tx_q->stats_sync);
+
+fetch_next_desc:
+		tx_desc++;
+		ntc++;
+		if (unlikely(!ntc)) {
+			ntc -= complq->desc_count;
+			tx_desc = IECM_SPLITQ_TX_COMPLQ_DESC(complq, 0);
+			change_bit(__IECM_Q_GEN_CHK, complq->flags);
+		}
+
+		prefetch(tx_desc);
+
+		/* update budget accounting */
+		complq_budget--;
+	} while (likely(complq_budget));
+
+	ntc += complq->desc_count;
+	complq->next_to_clean = ntc;
+
+	return !!complq_budget;
 }
 
 /**
@@ -1359,7 +1678,12 @@ iecm_tx_splitq_build_ctb(union iecm_tx_flex_desc *desc,
 			 struct iecm_tx_splitq_params *parms,
 			 u16 td_cmd, u16 size)
 {
-	/* stub */
+	desc->q.qw1.cmd_dtype =
+		cpu_to_le16(parms->dtype & IECM_FLEX_TXD_QW1_DTYPE_M);
+	desc->q.qw1.cmd_dtype |=
+		cpu_to_le16((td_cmd << IECM_FLEX_TXD_QW1_CMD_S) &
+			    IECM_FLEX_TXD_QW1_CMD_M);
+	desc->q.qw1.buf_size = cpu_to_le16((u16)size);
 }
 
 /**
@@ -1375,7 +1699,13 @@ iecm_tx_splitq_build_flow_desc(union iecm_tx_flex_desc *desc,
 			       struct iecm_tx_splitq_params *parms,
 			       u16 td_cmd, u16 size)
 {
-	/* stub */
+	desc->flow.qw1.cmd_dtype = cpu_to_le16((u16)parms->dtype | td_cmd);
+	desc->flow.qw1.rxr_bufsize = cpu_to_le16((u16)size);
+	desc->flow.qw1.compl_tag = cpu_to_le16(parms->compl_tag);
+
+	desc->flow.qw1.ts[0] = parms->offload.desc_ts & 0xff;
+	desc->flow.qw1.ts[1] = (parms->offload.desc_ts >> 8) & 0xff;
+	desc->flow.qw1.ts[2] = (parms->offload.desc_ts >> 16) & 0xff;
 }
 
 /**
@@ -1388,7 +1718,19 @@ iecm_tx_splitq_build_flow_desc(union iecm_tx_flex_desc *desc,
 static int
 __iecm_tx_maybe_stop(struct iecm_queue *tx_q, unsigned int size)
 {
-	/* stub */
+	netif_stop_subqueue(tx_q->vport->netdev, tx_q->idx);
+
+	/* Memory barrier before checking head and tail */
+	smp_mb();
+
+	/* Check again in case another CPU has just made room available. */
+	if (likely(IECM_DESC_UNUSED(tx_q) < size))
+		return -EBUSY;
+
+	/* A reprieve! - use start_subqueue because it doesn't call schedule */
+	netif_start_subqueue(tx_q->vport->netdev, tx_q->idx);
+
+	return 0;
 }
 
 /**
@@ -1400,7 +1742,10 @@ __iecm_tx_maybe_stop(struct iecm_queue *tx_q, unsigned int size)
  */
 int iecm_tx_maybe_stop(struct iecm_queue *tx_q, unsigned int size)
 {
-	/* stub */
+	if (likely(IECM_DESC_UNUSED(tx_q) >= size))
+		return 0;
+
+	return __iecm_tx_maybe_stop(tx_q, size);
 }
 
 /**
@@ -1412,7 +1757,23 @@ int iecm_tx_maybe_stop(struct iecm_queue *tx_q, unsigned int size)
 void iecm_tx_buf_hw_update(struct iecm_queue *tx_q, u32 val,
 			   struct sk_buff *skb)
 {
-	/* stub */
+	struct netdev_queue *nq;
+
+	nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+	tx_q->next_to_use = val;
+
+	iecm_tx_maybe_stop(tx_q, IECM_TX_DESC_NEEDED);
+
+	/* Force memory writes to complete before letting h/w
+	 * know there are new descriptors to fetch.  (Only
+	 * applicable for weak-ordered memory model archs,
+	 * such as IA-64).
+	 */
+	wmb();
+
+	/* notify HW of packet */
+	if (netif_xmit_stopped(nq) || !netdev_xmit_more())
+		writel_relaxed(val, tx_q->tail);
 }
 
 /**
@@ -1445,7 +1806,7 @@ void iecm_tx_buf_hw_update(struct iecm_queue *tx_q, u32 val,
  */
 static unsigned int __iecm_tx_desc_count_required(unsigned int size)
 {
-	/* stub */
+	return ((size * 85) >> 20) + IECM_TX_DESCS_FOR_SKB_DATA_PTR;
 }
 
 /**
@@ -1456,13 +1817,26 @@ static unsigned int __iecm_tx_desc_count_required(unsigned int size)
  */
 unsigned int iecm_tx_desc_count_required(struct sk_buff *skb)
 {
-	/* stub */
+	const skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+	unsigned int nr_frags = skb_shinfo(skb)->nr_frags;
+	unsigned int count = 0, size = skb_headlen(skb);
+
+	for (;;) {
+		count += __iecm_tx_desc_count_required(size);
+
+		if (!nr_frags--)
+			break;
+
+		size = skb_frag_size(frag++);
+	}
+
+	return count;
 }
 
 /**
  * iecm_tx_splitq_map - Build the Tx flex descriptor
  * @tx_q: queue to send buffer on
- * @off: pointer to offload params struct
+ * @parms: pointer to splitq params struct
  * @first: first buffer info buffer to use
  *
  * This function loops over the skb data pointed to by *first
@@ -1471,10 +1845,130 @@ unsigned int iecm_tx_desc_count_required(struct sk_buff *skb)
  */
 static void
 iecm_tx_splitq_map(struct iecm_queue *tx_q,
-		   struct iecm_tx_offload_params *off,
+		   struct iecm_tx_splitq_params *parms,
 		   struct iecm_tx_buf *first)
 {
-	/* stub */
+	union iecm_tx_flex_desc *tx_desc;
+	unsigned int data_len, size;
+	struct iecm_tx_buf *tx_buf;
+	u16 i = tx_q->next_to_use;
+	struct netdev_queue *nq;
+	struct sk_buff *skb;
+	skb_frag_t *frag;
+	u16 td_cmd = 0;
+	dma_addr_t dma;
+
+	skb = first->skb;
+
+	td_cmd = parms->offload.td_cmd;
+	parms->compl_tag = tx_q->tx_buf_key;
+
+	data_len = skb->data_len;
+	size = skb_headlen(skb);
+
+	tx_desc = IECM_FLEX_TX_DESC(tx_q, i);
+
+	dma = dma_map_single(tx_q->dev, skb->data, size, DMA_TO_DEVICE);
+
+	tx_buf = first;
+
+	for (frag = &skb_shinfo(skb)->frags[0];; frag++) {
+		unsigned int max_data = IECM_TX_MAX_DESC_DATA_ALIGNED;
+
+		if (dma_mapping_error(tx_q->dev, dma))
+			goto dma_error;
+
+		/* record length, and DMA address */
+		dma_unmap_len_set(tx_buf, len, size);
+		dma_unmap_addr_set(tx_buf, dma, dma);
+
+		/* align size to end of page */
+		max_data += -dma & (IECM_TX_MAX_READ_REQ_SIZE - 1);
+
+		/* buf_addr is in same location for both desc types */
+		tx_desc->q.buf_addr = cpu_to_le64(dma);
+
+		/* account for data chunks larger than the hardware
+		 * can handle
+		 */
+		while (unlikely(size > IECM_TX_MAX_DESC_DATA)) {
+			parms->splitq_build_ctb(tx_desc, parms, td_cmd, size);
+
+			tx_desc++;
+			i++;
+
+			if (i == tx_q->desc_count) {
+				tx_desc = IECM_FLEX_TX_DESC(tx_q, 0);
+				i = 0;
+			}
+
+			dma += max_data;
+			size -= max_data;
+
+			max_data = IECM_TX_MAX_DESC_DATA_ALIGNED;
+			/* buf_addr is in same location for both desc types */
+			tx_desc->q.buf_addr = cpu_to_le64(dma);
+		}
+
+		if (likely(!data_len))
+			break;
+		parms->splitq_build_ctb(tx_desc, parms, td_cmd, size);
+		tx_desc++;
+		i++;
+
+		if (i == tx_q->desc_count) {
+			tx_desc = IECM_FLEX_TX_DESC(tx_q, 0);
+			i = 0;
+		}
+
+		size = skb_frag_size(frag);
+		data_len -= size;
+
+		dma = skb_frag_dma_map(tx_q->dev, frag, 0, size,
+				       DMA_TO_DEVICE);
+
+		tx_buf->compl_tag = parms->compl_tag;
+		tx_buf = &tx_q->tx_buf[i];
+	}
+
+	/* record bytecount for BQL */
+	nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+	netdev_tx_sent_queue(nq, first->bytecount);
+
+	/* record SW timestamp if HW timestamp is not available */
+	skb_tx_timestamp(first->skb);
+
+	/* write last descriptor with RS and EOP bits */
+	td_cmd |= parms->eop_cmd;
+	parms->splitq_build_ctb(tx_desc, parms, td_cmd, size);
+	i++;
+	if (i == tx_q->desc_count)
+		i = 0;
+
+	/* set next_to_watch value indicating a packet is present */
+	first->next_to_watch = tx_desc;
+	tx_buf->compl_tag = parms->compl_tag++;
+
+	iecm_tx_buf_hw_update(tx_q, i, skb);
+
+	/* Update TXQ Completion Tag key for next buffer */
+	tx_q->tx_buf_key = parms->compl_tag;
+
+	return;
+
+dma_error:
+	/* clear DMA mappings for failed tx_buf map */
+	for (;;) {
+		tx_buf = &tx_q->tx_buf[i];
+		iecm_tx_buf_rel(tx_q, tx_buf);
+		if (tx_buf == first)
+			break;
+		if (i == 0)
+			i = tx_q->desc_count;
+		i--;
+	}
+
+	tx_q->next_to_use = i;
 }
 
 /**
@@ -1490,7 +1984,79 @@ iecm_tx_splitq_map(struct iecm_queue *tx_q,
 static int iecm_tso(struct iecm_tx_buf *first,
 		    struct iecm_tx_offload_params *off)
 {
-	/* stub */
+	struct sk_buff *skb = first->skb;
+	union {
+		struct iphdr *v4;
+		struct ipv6hdr *v6;
+		unsigned char *hdr;
+	} ip;
+	union {
+		struct tcphdr *tcp;
+		struct udphdr *udp;
+		unsigned char *hdr;
+	} l4;
+	u32 paylen, l4_start;
+	int err;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return 0;
+
+	if (!skb_is_gso(skb))
+		return 0;
+
+	err = skb_cow_head(skb, 0);
+	if (err < 0)
+		return err;
+
+	ip.hdr = skb_network_header(skb);
+	l4.hdr = skb_transport_header(skb);
+
+	/* initialize outer IP header fields */
+	if (ip.v4->version == 4) {
+		ip.v4->tot_len = 0;
+		ip.v4->check = 0;
+	} else {
+		ip.v6->payload_len = 0;
+	}
+
+	/* determine offset of transport header */
+	l4_start = l4.hdr - skb->data;
+
+	/* remove payload length from checksum */
+	paylen = skb->len - l4_start;
+
+	switch (skb_shinfo(skb)->gso_type) {
+	case SKB_GSO_TCPV4:
+	case SKB_GSO_TCPV6:
+		csum_replace_by_diff(&l4.tcp->check,
+				     (__force __wsum)htonl(paylen));
+
+		/* compute length of segmentation header */
+		off->tso_hdr_len = (l4.tcp->doff * 4) + l4_start;
+		break;
+	case SKB_GSO_UDP_L4:
+		csum_replace_by_diff(&l4.udp->check,
+				     (__force __wsum)htonl(paylen));
+		/* compute length of segmentation header */
+		off->tso_hdr_len = sizeof(struct udphdr) + l4_start;
+		l4.udp->len =
+			htons(skb_shinfo(skb)->gso_size +
+			      sizeof(struct udphdr));
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	off->tso_len = skb->len - off->tso_hdr_len;
+	off->mss = skb_shinfo(skb)->gso_size;
+
+	/* update gso_segs and bytecount */
+	first->gso_segs = skb_shinfo(skb)->gso_segs;
+	first->bytecount += (first->gso_segs - 1) * off->tso_hdr_len;
+
+	first->tx_flags |= IECM_TX_FLAGS_TSO;
+
+	return 0;
 }
 
 /**
@@ -1503,7 +2069,84 @@ static int iecm_tso(struct iecm_tx_buf *first,
 static netdev_tx_t
 iecm_tx_splitq_frame(struct sk_buff *skb, struct iecm_queue *tx_q)
 {
-	/* stub */
+	struct iecm_tx_splitq_params tx_parms = {0};
+	struct iecm_tx_buf *first;
+	unsigned int count;
+
+	count = iecm_tx_desc_count_required(skb);
+
+	/* need: 1 descriptor per page * PAGE_SIZE/IECM_MAX_DATA_PER_TXD,
+	 *       + 1 desc for skb_head_len/IECM_MAX_DATA_PER_TXD,
+	 *       + 4 desc gap to avoid the cache line where head is,
+	 *       + 1 desc for context descriptor,
+	 * otherwise try next time
+	 */
+	if (iecm_tx_maybe_stop(tx_q, count + IECM_TX_DESCS_PER_CACHE_LINE +
+			       IECM_TX_DESCS_FOR_CTX)) {
+		return NETDEV_TX_BUSY;
+	}
+
+	/* record the location of the first descriptor for this packet */
+	first = &tx_q->tx_buf[tx_q->next_to_use];
+	first->skb = skb;
+	first->bytecount = max_t(unsigned int, skb->len, ETH_ZLEN);
+	first->gso_segs = 1;
+	first->tx_flags = 0;
+
+	if (iecm_tso(first, &tx_parms.offload) < 0) {
+		/* If tso returns an error, drop the packet */
+		dev_kfree_skb_any(skb);
+		return NETDEV_TX_OK;
+	}
+
+	if (first->tx_flags & IECM_TX_FLAGS_TSO) {
+		/* If TSO is needed, set up context desc */
+		union iecm_flex_tx_ctx_desc *ctx_desc;
+		int i = tx_q->next_to_use;
+
+		/* grab the next descriptor */
+		ctx_desc = IECM_FLEX_TX_CTX_DESC(tx_q, i);
+		i++;
+		tx_q->next_to_use = (i < tx_q->desc_count) ? i : 0;
+
+		ctx_desc->tso.qw1.cmd_dtype |=
+				cpu_to_le16(IECM_TX_DESC_DTYPE_FLEX_TSO_CTX |
+					    IECM_TX_FLEX_CTX_DESC_CMD_TSO);
+		ctx_desc->tso.qw0.flex_tlen =
+				cpu_to_le32(tx_parms.offload.tso_len &
+					    IECM_TXD_FLEX_CTX_TLEN_M);
+		ctx_desc->tso.qw0.mss_rt =
+				cpu_to_le16(tx_parms.offload.mss &
+					    IECM_TXD_FLEX_CTX_MSS_RT_M);
+		ctx_desc->tso.qw0.hdr_len = tx_parms.offload.tso_hdr_len;
+	}
+
+	if (test_bit(__IECM_Q_FLOW_SCH_EN, tx_q->flags)) {
+		s64 ts_ns = first->skb->skb_mstamp_ns;
+
+		tx_parms.offload.desc_ts =
+			ts_ns >> IECM_TW_TIME_STAMP_GRAN_512_DIV_S;
+
+		tx_parms.dtype = IECM_TX_DESC_DTYPE_FLEX_FLOW_SCHE;
+		tx_parms.splitq_build_ctb = iecm_tx_splitq_build_flow_desc;
+		tx_parms.eop_cmd =
+			IECM_TXD_FLEX_FLOW_CMD_EOP | IECM_TXD_FLEX_FLOW_CMD_RE;
+
+		if (skb->ip_summed == CHECKSUM_PARTIAL)
+			tx_parms.offload.td_cmd |= IECM_TXD_FLEX_FLOW_CMD_CS_EN;
+
+	} else {
+		tx_parms.dtype = IECM_TX_DESC_DTYPE_FLEX_DATA;
+		tx_parms.splitq_build_ctb = iecm_tx_splitq_build_ctb;
+		tx_parms.eop_cmd = IECM_TX_DESC_CMD_EOP | IECM_TX_DESC_CMD_RS;
+
+		if (skb->ip_summed == CHECKSUM_PARTIAL)
+			tx_parms.offload.td_cmd |= IECM_TX_FLEX_DESC_CMD_CS_EN;
+	}
+
+	iecm_tx_splitq_map(tx_q, &tx_parms, first);
+
+	return NETDEV_TX_OK;
 }
 
 /**
@@ -1516,7 +2159,18 @@ iecm_tx_splitq_frame(struct sk_buff *skb, struct iecm_queue *tx_q)
 netdev_tx_t iecm_tx_splitq_start(struct sk_buff *skb,
 				 struct net_device *netdev)
 {
-	/* stub */
+	struct iecm_vport *vport = iecm_netdev_to_vport(netdev);
+	struct iecm_queue *tx_q;
+
+	tx_q = vport->txqs[skb->queue_mapping];
+
+	/* hardware can't handle really short frames; hardware padding works
+	 * beyond this point
+	 */
+	if (skb_put_padto(skb, IECM_TX_MIN_LEN))
+		return NETDEV_TX_OK;
+
+	return iecm_tx_splitq_frame(skb, tx_q);
 }
 
 /**
@@ -1531,7 +2185,18 @@ netdev_tx_t iecm_tx_splitq_start(struct sk_buff *skb,
 static enum pkt_hash_types iecm_ptype_to_htype(struct iecm_vport *vport,
 					       u16 ptype)
 {
-	/* stub */
+	struct iecm_rx_ptype_decoded decoded = vport->rx_ptype_lkup[ptype];
+
+	if (!decoded.known)
+		return PKT_HASH_TYPE_NONE;
+	if (decoded.payload_layer == IECM_RX_PTYPE_PAYLOAD_LAYER_PAY4)
+		return PKT_HASH_TYPE_L4;
+	if (decoded.payload_layer == IECM_RX_PTYPE_PAYLOAD_LAYER_PAY3)
+		return PKT_HASH_TYPE_L3;
+	if (decoded.outer_ip == IECM_RX_PTYPE_OUTER_L2)
+		return PKT_HASH_TYPE_L2;
+
+	return PKT_HASH_TYPE_NONE;
 }
 
 /**
@@ -1545,7 +2210,17 @@ static void
 iecm_rx_hash(struct iecm_queue *rxq, struct sk_buff *skb,
 	     struct iecm_flex_rx_desc *rx_desc, u16 ptype)
 {
-	/* stub */
+	u32 hash;
+
+	if (!iecm_is_feature_ena(rxq->vport, NETIF_F_RXHASH))
+		return;
+
+	hash = rx_desc->status_err1 |
+	       (rx_desc->fflags1 << 8) |
+	       (rx_desc->ts_low << 16) |
+	       (rx_desc->ff2_mirrid_hash2.hash2 << 24);
+
+	skb_set_hash(skb, hash, iecm_ptype_to_htype(rxq->vport, ptype));
 }
 
 /**
@@ -1561,7 +2236,63 @@ static void
 iecm_rx_csum(struct iecm_queue *rxq, struct sk_buff *skb,
 	     struct iecm_flex_rx_desc *rx_desc, u16 ptype)
 {
-	/* stub */
+	struct iecm_rx_ptype_decoded decoded;
+	u8 rx_status_0_qw1, rx_status_0_qw0;
+	bool ipv4, ipv6;
+
+	/* Start with CHECKSUM_NONE and by default csum_level = 0 */
+	skb->ip_summed = CHECKSUM_NONE;
+
+	/* check if Rx checksum is enabled */
+	if (!iecm_is_feature_ena(rxq->vport, NETIF_F_RXCSUM))
+		return;
+
+	rx_status_0_qw1 = rx_desc->status_err0_qw1;
+	/* check if HW has decoded the packet and checksum */
+	if (!(rx_status_0_qw1 & BIT(IECM_RX_FLEX_DESC_STATUS0_L3L4P_S)))
+		return;
+
+	decoded = rxq->vport->rx_ptype_lkup[ptype];
+	if (!(decoded.known && decoded.outer_ip))
+		return;
+
+	ipv4 = (decoded.outer_ip == IECM_RX_PTYPE_OUTER_IP) &&
+	       (decoded.outer_ip_ver == IECM_RX_PTYPE_OUTER_IPV4);
+	ipv6 = (decoded.outer_ip == IECM_RX_PTYPE_OUTER_IP) &&
+	       (decoded.outer_ip_ver == IECM_RX_PTYPE_OUTER_IPV6);
+
+	if (ipv4 && (rx_status_0_qw1 &
+		     (BIT(IECM_RX_FLEX_DESC_STATUS0_XSUM_IPE_S) |
+		      BIT(IECM_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S))))
+		goto checksum_fail;
+
+	rx_status_0_qw0 = rx_desc->status_err0_qw0;
+	if (ipv6 && (rx_status_0_qw0 &
+		     (BIT(IECM_RX_FLEX_DESC_STATUS0_IPV6EXADD_S))))
+		return;
+
+	/* check for L4 errors and handle packets that were not able to be
+	 * checksummed
+	 */
+	if (rx_status_0_qw1 & BIT(IECM_RX_FLEX_DESC_STATUS0_XSUM_L4E_S))
+		goto checksum_fail;
+
+	/* Only report checksum unnecessary for ICMP, TCP, UDP, or SCTP */
+	switch (decoded.inner_prot) {
+	case IECM_RX_PTYPE_INNER_PROT_ICMP:
+	case IECM_RX_PTYPE_INNER_PROT_TCP:
+	case IECM_RX_PTYPE_INNER_PROT_UDP:
+	case IECM_RX_PTYPE_INNER_PROT_SCTP:
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+		rxq->q_stats.rx.basic_csum++;
+	default:
+		break;
+	}
+	return;
+
+checksum_fail:
+	rxq->q_stats.rx.csum_err++;
+	dev_dbg(rxq->dev, "RX Checksum not available\n");
 }
 
 /**
@@ -1577,7 +2308,74 @@ iecm_rx_csum(struct iecm_queue *rxq, struct sk_buff *skb,
 static bool iecm_rx_rsc(struct iecm_queue *rxq, struct sk_buff *skb,
 			struct iecm_flex_rx_desc *rx_desc, u16 ptype)
 {
-	/* stub */
+	struct iecm_rx_ptype_decoded decoded;
+	u16 rsc_segments, rsc_payload_len;
+	struct ipv6hdr *ipv6h;
+	struct tcphdr *tcph;
+	struct iphdr *ipv4h;
+	bool ipv4, ipv6;
+	u16 hdr_len;
+
+	rsc_payload_len = le32_to_cpu(rx_desc->fmd1_misc.rscseglen);
+	if (!rsc_payload_len)
+		goto rsc_err;
+
+	decoded = rxq->vport->rx_ptype_lkup[ptype];
+	if (!(decoded.known && decoded.outer_ip))
+		goto rsc_err;
+
+	ipv4 = (decoded.outer_ip == IECM_RX_PTYPE_OUTER_IP) &&
+		(decoded.outer_ip_ver == IECM_RX_PTYPE_OUTER_IPV4);
+	ipv6 = (decoded.outer_ip == IECM_RX_PTYPE_OUTER_IP) &&
+		(decoded.outer_ip_ver == IECM_RX_PTYPE_OUTER_IPV6);
+
+	if (!(ipv4 ^ ipv6))
+		goto rsc_err;
+
+	if (ipv4)
+		hdr_len = ETH_HLEN + sizeof(struct tcphdr) +
+			  sizeof(struct iphdr);
+	else
+		hdr_len = ETH_HLEN + sizeof(struct tcphdr) +
+			  sizeof(struct ipv6hdr);
+
+	rsc_segments = DIV_ROUND_UP(skb->len - hdr_len, rsc_payload_len);
+
+	NAPI_GRO_CB(skb)->count = rsc_segments;
+	skb_shinfo(skb)->gso_size = rsc_payload_len;
+
+	skb_reset_network_header(skb);
+
+	if (ipv4) {
+		ipv4h = ip_hdr(skb);
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+
+		/* Reset and set transport header offset in skb */
+		skb_set_transport_header(skb, sizeof(struct iphdr));
+		tcph = tcp_hdr(skb);
+
+		/* Compute the TCP pseudo header checksum */
+		tcph->check =
+			~tcp_v4_check(skb->len - skb_transport_offset(skb),
+				      ipv4h->saddr, ipv4h->daddr, 0);
+	} else {
+		ipv6h = ipv6_hdr(skb);
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+		skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+		tcph = tcp_hdr(skb);
+		tcph->check =
+			~tcp_v6_check(skb->len - skb_transport_offset(skb),
+				      &ipv6h->saddr, &ipv6h->daddr, 0);
+	}
+
+	tcp_gro_complete(skb);
+
+	/* Map Rx qid to the skb */
+	skb_record_rx_queue(skb, rxq->q_id);
+
+	return true;
+rsc_err:
+	return false;
 }
 
 /**
@@ -1589,7 +2387,19 @@ static bool iecm_rx_rsc(struct iecm_queue *rxq, struct sk_buff *skb,
 static void iecm_rx_hwtstamp(struct iecm_flex_rx_desc *rx_desc,
 			     struct sk_buff __maybe_unused *skb)
 {
-	/* stub */
+	u8 ts_lo = rx_desc->ts_low;
+	u32 ts_hi = 0;
+	u64 ts_ns = 0;
+
+	ts_hi = le32_to_cpu(rx_desc->flex_ts.ts_high);
+
+	ts_ns |= ts_lo | ((u64)ts_hi << 8);
+
+	if (ts_ns) {
+		memset(skb_hwtstamps(skb), 0,
+		       sizeof(struct skb_shared_hwtstamps));
+		skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(ts_ns);
+	}
 }
 
 /**
@@ -1606,7 +2416,26 @@ static bool
 iecm_rx_process_skb_fields(struct iecm_queue *rxq, struct sk_buff *skb,
 			   struct iecm_flex_rx_desc *rx_desc)
 {
-	/* stub */
+	bool err = false;
+	u16 rx_ptype;
+	bool rsc;
+
+	rx_ptype = le16_to_cpu(rx_desc->ptype_err_fflags0) &
+		   IECM_RXD_FLEX_PTYPE_M;
+
+	/* modifies the skb - consumes the enet header */
+	skb->protocol = eth_type_trans(skb, rxq->vport->netdev);
+	iecm_rx_csum(rxq, skb, rx_desc, rx_ptype);
+	/* process RSS/hash */
+	iecm_rx_hash(rxq, skb, rx_desc, rx_ptype);
+
+	rsc = le16_to_cpu(rx_desc->hdrlen_flags) & IECM_RXD_FLEX_RSC_M;
+	if (rsc)
+		err = iecm_rx_rsc(rxq, skb, rx_desc, rx_ptype);
+
+	iecm_rx_hwtstamp(rx_desc, skb);
+
+	return err;
 }
 
 /**
@@ -1619,7 +2448,7 @@ iecm_rx_process_skb_fields(struct iecm_queue *rxq, struct sk_buff *skb,
  */
 void iecm_rx_skb(struct iecm_queue *rxq, struct sk_buff *skb)
 {
-	/* stub */
+	napi_gro_receive(&rxq->q_vector->napi, skb);
 }
 
 /**
@@ -1628,7 +2457,7 @@ void iecm_rx_skb(struct iecm_queue *rxq, struct sk_buff *skb)
  */
 static bool iecm_rx_page_is_reserved(struct page *page)
 {
-	/* stub */
+	return (page_to_nid(page) != numa_mem_id()) || page_is_pfmemalloc(page);
 }
 
 /**
@@ -1644,7 +2473,13 @@ static bool iecm_rx_page_is_reserved(struct page *page)
 static void
 iecm_rx_buf_adjust_pg_offset(struct iecm_rx_buf *rx_buf, unsigned int size)
 {
-	/* stub */
+#if (PAGE_SIZE < 8192)
+	/* flip page offset to other buffer */
+	rx_buf->page_offset ^= size;
+#else
+	/* move offset up to the next cache line */
+	rx_buf->page_offset += size;
+#endif
 }
 
 /**
@@ -1658,7 +2493,34 @@ iecm_rx_buf_adjust_pg_offset(struct iecm_rx_buf *rx_buf, unsigned int size)
  */
 static bool iecm_rx_can_reuse_page(struct iecm_rx_buf *rx_buf)
 {
-	/* stub */
+#if (PAGE_SIZE >= 8192)
+#endif
+	unsigned int pagecnt_bias = rx_buf->pagecnt_bias;
+	struct page *page = rx_buf->page;
+
+	/* avoid re-using remote pages */
+	if (unlikely(iecm_rx_page_is_reserved(page)))
+		return false;
+
+#if (PAGE_SIZE < 8192)
+	/* if we are the only owner of the page we can reuse it */
+	if (unlikely((page_count(page) - pagecnt_bias) > 1))
+		return false;
+#else
+	if (rx_buf->page_offset > last_offset)
+		return false;
+#endif /* PAGE_SIZE < 8192 */
+
+	/* If we have drained the page fragment pool we need to update
+	 * the pagecnt_bias and page count so that we fully restock the
+	 * number of references the driver holds.
+	 */
+	if (unlikely(pagecnt_bias == 1)) {
+		page_ref_add(page, USHRT_MAX - 1);
+		rx_buf->pagecnt_bias = USHRT_MAX;
+	}
+
+	return true;
 }
 
 /**
@@ -1674,7 +2536,17 @@ static bool iecm_rx_can_reuse_page(struct iecm_rx_buf *rx_buf)
 void iecm_rx_add_frag(struct iecm_rx_buf *rx_buf, struct sk_buff *skb,
 		      unsigned int size)
 {
-	/* stub */
+#if (PAGE_SIZE >= 8192)
+	unsigned int truesize = SKB_DATA_ALIGN(size);
+#else
+	unsigned int truesize = IECM_RX_BUF_2048;
+#endif
+
+	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buf->page,
+			rx_buf->page_offset, size, truesize);
+
+	/* page is being used so we must update the page offset */
+	iecm_rx_buf_adjust_pg_offset(rx_buf, truesize);
 }
 
 /**
@@ -1689,7 +2561,22 @@ void iecm_rx_reuse_page(struct iecm_queue *rx_bufq,
 			bool hsplit,
 			struct iecm_rx_buf *old_buf)
 {
-	/* stub */
+	u16 ntu = rx_bufq->next_to_use;
+	struct iecm_rx_buf *new_buf;
+
+	if (hsplit)
+		new_buf = &rx_bufq->rx_buf.hdr_buf[ntu];
+	else
+		new_buf = &rx_bufq->rx_buf.buf[ntu];
+
+	/* Transfer page from old buffer to new buffer.
+	 * Move each member individually to avoid possible store
+	 * forwarding stalls and unnecessary copy of skb.
+	 */
+	new_buf->dma = old_buf->dma;
+	new_buf->page = old_buf->page;
+	new_buf->page_offset = old_buf->page_offset;
+	new_buf->pagecnt_bias = old_buf->pagecnt_bias;
 }
 
 /**
@@ -1704,7 +2591,15 @@ static void
 iecm_rx_get_buf_page(struct device *dev, struct iecm_rx_buf *rx_buf,
 		     const unsigned int size)
 {
-	/* stub */
+	prefetch(rx_buf->page);
+
+	/* we are reusing so sync this buffer for CPU use */
+	dma_sync_single_range_for_cpu(dev, rx_buf->dma,
+				      rx_buf->page_offset, size,
+				      DMA_FROM_DEVICE);
+
+	/* We have pulled a buffer for use, so decrement pagecnt_bias */
+	rx_buf->pagecnt_bias--;
 }
 
 /**
@@ -1721,7 +2616,52 @@ struct sk_buff *
 iecm_rx_construct_skb(struct iecm_queue *rxq, struct iecm_rx_buf *rx_buf,
 		      unsigned int size)
 {
-	/* stub */
+	void *va = page_address(rx_buf->page) + rx_buf->page_offset;
+	unsigned int headlen;
+	struct sk_buff *skb;
+
+	/* prefetch first cache line of first page */
+	prefetch(va);
+#if L1_CACHE_BYTES < 128
+	prefetch((u8 *)va + L1_CACHE_BYTES);
+#endif /* L1_CACHE_BYTES */
+	/* allocate a skb to store the frags */
+	skb = __napi_alloc_skb(&rxq->q_vector->napi, IECM_RX_HDR_SIZE,
+			       GFP_ATOMIC | __GFP_NOWARN);
+	if (unlikely(!skb))
+		return NULL;
+
+	skb_record_rx_queue(skb, rxq->idx);
+
+	/* Determine available headroom for copy */
+	headlen = size;
+	if (headlen > IECM_RX_HDR_SIZE)
+		headlen = eth_get_headlen(skb->dev, va, IECM_RX_HDR_SIZE);
+
+	/* align pull length to size of long to optimize memcpy performance */
+	memcpy(__skb_put(skb, headlen), va, ALIGN(headlen, sizeof(long)));
+
+	/* if we exhaust the linear part then add what is left as a frag */
+	size -= headlen;
+	if (size) {
+#if (PAGE_SIZE >= 8192)
+		unsigned int truesize = SKB_DATA_ALIGN(size);
+#else
+		unsigned int truesize = IECM_RX_BUF_2048;
+#endif
+		skb_add_rx_frag(skb, 0, rx_buf->page,
+				rx_buf->page_offset + headlen, size, truesize);
+		/* buffer is used by skb, update page_offset */
+		iecm_rx_buf_adjust_pg_offset(rx_buf, truesize);
+	} else {
+		/* buffer is unused, reset bias back to rx_buf; data was copied
+		 * onto skb's linear part so there's no need for adjusting
+		 * page offset and we can reuse this buffer as-is
+		 */
+		rx_buf->pagecnt_bias++;
+	}
+
+	return skb;
 }
 
 /**
@@ -1738,7 +2678,11 @@ iecm_rx_construct_skb(struct iecm_queue *rxq, struct iecm_rx_buf *rx_buf,
  */
 bool iecm_rx_cleanup_headers(struct sk_buff *skb)
 {
-	/* stub */
+	/* if eth_skb_pad returns an error the skb was freed */
+	if (eth_skb_pad(skb))
+		return true;
+
+	return false;
 }
 
 /**
@@ -1751,7 +2695,7 @@ bool iecm_rx_cleanup_headers(struct sk_buff *skb)
 static bool
 iecm_rx_splitq_test_staterr(u8 stat_err_field, const u8 stat_err_bits)
 {
-	/* stub */
+	return !!(stat_err_field & stat_err_bits);
 }
 
 /**
@@ -1764,7 +2708,13 @@ iecm_rx_splitq_test_staterr(u8 stat_err_field, const u8 stat_err_bits)
 static bool
 iecm_rx_splitq_is_non_eop(struct iecm_flex_rx_desc *rx_desc)
 {
-	/* stub */
+	/* if we are the last buffer then there is nothing else to do */
+#define IECM_RXD_EOF BIT(IECM_RX_FLEX_DESC_STATUS0_EOF_S)
+	if (likely(iecm_rx_splitq_test_staterr(rx_desc->status_err0_qw1,
+					       IECM_RXD_EOF)))
+		return false;
+
+	return true;
 }
 
 /**
@@ -1781,7 +2731,24 @@ iecm_rx_splitq_is_non_eop(struct iecm_flex_rx_desc *rx_desc)
 bool iecm_rx_recycle_buf(struct iecm_queue *rx_bufq, bool hsplit,
 			 struct iecm_rx_buf *rx_buf)
 {
-	/* stub */
+	bool recycled = false;
+
+	if (iecm_rx_can_reuse_page(rx_buf)) {
+		/* hand second half of page back to the queue */
+		iecm_rx_reuse_page(rx_bufq, hsplit, rx_buf);
+		recycled = true;
+	} else {
+		/* we are not reusing the buffer so unmap it */
+		dma_unmap_page_attrs(rx_bufq->dev, rx_buf->dma, PAGE_SIZE,
+				     DMA_FROM_DEVICE, IECM_RX_DMA_ATTR);
+		__page_frag_cache_drain(rx_buf->page, rx_buf->pagecnt_bias);
+	}
+
+	/* clear contents of buffer_info */
+	rx_buf->page = NULL;
+	rx_buf->skb = NULL;
+
+	return recycled;
 }
 
 /**
@@ -1797,7 +2764,19 @@ static void iecm_rx_splitq_put_bufs(struct iecm_queue *rx_bufq,
 				    struct iecm_rx_buf *hdr_buf,
 				    struct iecm_rx_buf *rx_buf)
 {
-	/* stub */
+	u16 ntu = rx_bufq->next_to_use;
+	bool recycled = false;
+
+	if (likely(hdr_buf))
+		recycled = iecm_rx_recycle_buf(rx_bufq, true, hdr_buf);
+	if (likely(rx_buf))
+		recycled = iecm_rx_recycle_buf(rx_bufq, false, rx_buf);
+
+	/* update and store next_to_use if the buffer was recycled */
+	if (recycled) {
+		ntu++;
+		rx_bufq->next_to_use = (ntu < rx_bufq->desc_count) ? ntu : 0;
+	}
 }
 
 /**
@@ -1806,7 +2785,14 @@ static void iecm_rx_splitq_put_bufs(struct iecm_queue *rx_bufq,
  */
 static void iecm_rx_bump_ntc(struct iecm_queue *q)
 {
-	/* stub */
+	u16 ntc = q->next_to_clean + 1;
+	/* fetch, update, and store next to clean */
+	if (ntc < q->desc_count) {
+		q->next_to_clean = ntc;
+	} else {
+		q->next_to_clean = 0;
+		change_bit(__IECM_Q_GEN_CHK, q->flags);
+	}
 }
 
 /**
@@ -1823,7 +2809,158 @@ static void iecm_rx_bump_ntc(struct iecm_queue *q)
  */
 static int iecm_rx_splitq_clean(struct iecm_queue *rxq, int budget)
 {
-	/* stub */
+	unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
+	u16 cleaned_count[IECM_BUFQS_PER_RXQ_SET] = {0};
+	struct iecm_queue *rx_bufq = NULL;
+	struct sk_buff *skb = rxq->skb;
+	bool failure = false;
+	int i;
+
+	/* Process Rx packets bounded by budget */
+	while (likely(total_rx_pkts < (unsigned int)budget)) {
+		struct iecm_flex_rx_desc *splitq_flex_rx_desc;
+		union iecm_rx_desc *rx_desc;
+		struct iecm_rx_buf *hdr_buf = NULL;
+		struct iecm_rx_buf *rx_buf = NULL;
+		unsigned int pkt_len = 0;
+		unsigned int hdr_len = 0;
+		u16 gen_id, buf_id;
+		u8 stat_err0_qw0;
+		u8 stat_err_bits;
+		/* Header buffer overflow is only valid for header split */
+		bool hbo = false;
+		int bufq_id;
+
+		/* get the Rx desc from Rx queue based on 'next_to_clean' */
+		rx_desc = IECM_RX_DESC(rxq, rxq->next_to_clean);
+		splitq_flex_rx_desc = (struct iecm_flex_rx_desc *)rx_desc;
+
+		/* This memory barrier is needed to keep us from reading
+		 * any other fields out of the rx_desc
+		 */
+		dma_rmb();
+
+		/* if the descriptor isn't done, no work yet to do */
+		gen_id = le16_to_cpu(splitq_flex_rx_desc->pktlen_gen_bufq_id);
+		gen_id = (gen_id & IECM_RXD_FLEX_GEN_M) >> IECM_RXD_FLEX_GEN_S;
+		if (test_bit(__IECM_Q_GEN_CHK, rxq->flags) != gen_id)
+			break;
+
+		pkt_len = le16_to_cpu(splitq_flex_rx_desc->pktlen_gen_bufq_id) &
+			  IECM_RXD_FLEX_LEN_PBUF_M;
+
+		hbo = le16_to_cpu(splitq_flex_rx_desc->status_err0_qw1) &
+		      BIT(IECM_RX_FLEX_DESC_STATUS0_HBO_S);
+
+		if (unlikely(hbo)) {
+			rxq->q_stats.rx.hsplit_hbo++;
+			goto bypass_hsplit;
+		}
+
+		hdr_len =
+			le16_to_cpu(splitq_flex_rx_desc->hdrlen_flags) &
+			IECM_RXD_FLEX_LEN_HDR_M;
+
+bypass_hsplit:
+		bufq_id = le16_to_cpu(splitq_flex_rx_desc->pktlen_gen_bufq_id);
+		bufq_id = (bufq_id & IECM_RXD_FLEX_BUFQ_ID_M) >>
+			  IECM_RXD_FLEX_BUFQ_ID_S;
+		/* retrieve buffer from the rxq */
+		rx_bufq = &rxq->rxq_grp->splitq.bufq_sets[bufq_id].bufq;
+
+		buf_id = le16_to_cpu(splitq_flex_rx_desc->fmd0_bufid.buf_id);
+
+		if (pkt_len) {
+			rx_buf = &rx_bufq->rx_buf.buf[buf_id];
+			iecm_rx_get_buf_page(rx_bufq->dev, rx_buf, pkt_len);
+		}
+
+		if (hdr_len) {
+			hdr_buf = &rx_bufq->rx_buf.hdr_buf[buf_id];
+			iecm_rx_get_buf_page(rx_bufq->dev, hdr_buf,
+					     hdr_len);
+
+			skb = iecm_rx_construct_skb(rxq, hdr_buf, hdr_len);
+		}
+
+		if (skb && pkt_len)
+			iecm_rx_add_frag(rx_buf, skb, pkt_len);
+		else if (pkt_len)
+			skb = iecm_rx_construct_skb(rxq, rx_buf, pkt_len);
+
+		/* exit if we failed to retrieve a buffer */
+		if (!skb) {
+			/* If we fetched a buffer but didn't use it,
+			 * undo the pagecnt_bias decrement
+			 */
+			if (rx_buf)
+				rx_buf->pagecnt_bias++;
+			break;
+		}
+
+		iecm_rx_splitq_put_bufs(rx_bufq, hdr_buf, rx_buf);
+		iecm_rx_bump_ntc(rxq);
+		cleaned_count[bufq_id]++;
+
+		/* skip if it is non EOP desc */
+		if (iecm_rx_splitq_is_non_eop(splitq_flex_rx_desc))
+			continue;
+
+		stat_err_bits = BIT(IECM_RX_FLEX_DESC_STATUS0_RXE_S);
+		stat_err0_qw0 = splitq_flex_rx_desc->status_err0_qw0;
+		if (unlikely(iecm_rx_splitq_test_staterr(stat_err0_qw0,
+							 stat_err_bits))) {
+			dev_kfree_skb_any(skb);
+			skb = NULL;
+			continue;
+		}
+
+		/* correct empty headers and pad skb if needed (to make a valid
+		 * Ethernet frame)
+		 */
+		if (iecm_rx_cleanup_headers(skb)) {
+			skb = NULL;
+			continue;
+		}
+
+		/* probably a little skewed due to removing CRC */
+		total_rx_bytes += skb->len;
+
+		/* protocol */
+		if (unlikely(iecm_rx_process_skb_fields(rxq, skb,
+							splitq_flex_rx_desc))) {
+			dev_kfree_skb_any(skb);
+			skb = NULL;
+			continue;
+		}
+
+		/* send completed skb up the stack */
+		iecm_rx_skb(rxq, skb);
+		skb = NULL;
+
+		/* update budget accounting */
+		total_rx_pkts++;
+	}
+	for (i = 0; i < IECM_BUFQS_PER_RXQ_SET; i++) {
+		if (cleaned_count[i]) {
+			rx_bufq = &rxq->rxq_grp->splitq.bufq_sets[i].bufq;
+			failure = iecm_rx_buf_hw_alloc_all(rx_bufq,
+							   cleaned_count[i]) ||
+				  failure;
+		}
+	}
+
+	rxq->skb = skb;
+	u64_stats_update_begin(&rxq->stats_sync);
+	rxq->q_stats.rx.packets += total_rx_pkts;
+	rxq->q_stats.rx.bytes += total_rx_bytes;
+	u64_stats_update_end(&rxq->stats_sync);
+
+	rxq->itr.stats.rx.packets += total_rx_pkts;
+	rxq->itr.stats.rx.bytes += total_rx_bytes;
+
+	/* guarantee a trip back through this routine if there was a failure */
+	return failure ? budget : (int)total_rx_pkts;
 }
 
 /**
@@ -2379,7 +3516,15 @@ iecm_vport_intr_napi_ena_all(struct iecm_vport *vport)
 static inline bool
 iecm_tx_splitq_clean_all(struct iecm_q_vector *q_vec, int budget)
 {
-	/* stub */
+	bool clean_complete = true;
+	int i, budget_per_q;
+
+	budget_per_q = max(budget / q_vec->num_txq, 1);
+	for (i = 0; i < q_vec->num_txq; i++) {
+		if (!iecm_tx_clean_complq(q_vec->tx[i], budget_per_q))
+			clean_complete = false;
+	}
+	return clean_complete;
 }
 
 /**
@@ -2394,7 +3539,22 @@ static inline bool
 iecm_rx_splitq_clean_all(struct iecm_q_vector *q_vec, int budget,
 			 int *cleaned)
 {
-	/* stub */
+	bool clean_complete = true;
+	int pkts_cleaned_per_q;
+	int i, budget_per_q;
+
+	budget_per_q = max(budget / q_vec->num_rxq, 1);
+	for (i = 0; i < q_vec->num_rxq; i++) {
+		pkts_cleaned_per_q = iecm_rx_splitq_clean(q_vec->rx[i],
+							  budget_per_q);
+		/* if we clean as many as budgeted, we must not
+		 * be done
+		 */
+		if (pkts_cleaned_per_q >= budget_per_q)
+			clean_complete = false;
+		*cleaned += pkts_cleaned_per_q;
+	}
+	return clean_complete;
 }
 
 /**
@@ -2404,7 +3564,34 @@ iecm_rx_splitq_clean_all(struct iecm_q_vector *q_vec, int budget,
  */
 int iecm_vport_splitq_napi_poll(struct napi_struct *napi, int budget)
 {
-	/* stub */
+	struct iecm_q_vector *q_vector =
+				container_of(napi, struct iecm_q_vector, napi);
+	bool clean_complete;
+	int work_done = 0;
+
+	clean_complete = iecm_tx_splitq_clean_all(q_vector, budget);
+
+	/* Handle case where we are called by netpoll with a budget of 0 */
+	if (budget <= 0)
+		return budget;
+
+	/* We attempt to distribute budget to each Rx queue fairly, but don't
+	 * allow the budget to go below 1 because that would exit polling early.
+	 */
+	clean_complete &= iecm_rx_splitq_clean_all(q_vector, budget,
+						   &work_done);
+
+	/* If work is not complete, return the full budget so NAPI polls again */
+	if (!clean_complete)
+		return budget;
+
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		iecm_vport_intr_update_itr_ena_irq(q_vector);
+
+	return min_t(int, work_done, budget - 1);
 }
 
 /**
-- 
2.26.2
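
A note for readers following the Rx cleaning loop in the patch above: the
completion ring is consumed through a next-to-clean index plus a generation
flag that flips on every wrap, so a freshly written descriptor can be told
apart from a stale one left over from the previous pass. The following is a
minimal userspace sketch of that idea only. `struct ring`, `desc_is_done()`,
`ring_bump_ntc()` and `ring_clean()` are hypothetical names for illustration,
not driver API, and the initial generation value and comparison polarity are
assumptions of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4 /* hypothetical, tiny ring for illustration */

struct desc {
	uint8_t gen; /* generation value written by the producer */
};

struct ring {
	struct desc desc[RING_SIZE];
	uint16_t next_to_clean;
	bool gen; /* generation the consumer currently expects */
};

/* A descriptor is ready only if its generation matches the expected one. */
bool desc_is_done(const struct ring *r)
{
	return r->desc[r->next_to_clean].gen == (uint8_t)r->gen;
}

/* Advance next_to_clean and flip the expected generation on wrap. */
void ring_bump_ntc(struct ring *r)
{
	uint16_t ntc = (uint16_t)(r->next_to_clean + 1);

	if (ntc < RING_SIZE) {
		r->next_to_clean = ntc;
	} else {
		r->next_to_clean = 0;
		r->gen = !r->gen;
	}
}

/* Consume ready descriptors up to budget; return how many were cleaned. */
int ring_clean(struct ring *r, int budget)
{
	int cleaned = 0;

	assert(budget >= 0);
	while (cleaned < budget && desc_is_done(r)) {
		ring_bump_ntc(r);
		cleaned++;
	}
	return cleaned;
}
```

In this sketch, once the consumer has wrapped, entries still carrying the old
generation value are skipped until the producer rewrites them with the new
one, which is why no separate "done" bit needs to be cleared by software.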



* [net-next 01/15] virtchnl: Extend AVF ops
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

This implements the next generation of virtchnl ops which
enable greater functionality and capabilities.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 include/linux/avf/virtchnl.h | 592 +++++++++++++++++++++++++++++++++++
 1 file changed, 592 insertions(+)

diff --git a/include/linux/avf/virtchnl.h b/include/linux/avf/virtchnl.h
index 40bad71865ea..a967ea2c1248 100644
--- a/include/linux/avf/virtchnl.h
+++ b/include/linux/avf/virtchnl.h
@@ -136,6 +136,34 @@ enum virtchnl_ops {
 	VIRTCHNL_OP_DISABLE_CHANNELS = 31,
 	VIRTCHNL_OP_ADD_CLOUD_FILTER = 32,
 	VIRTCHNL_OP_DEL_CLOUD_FILTER = 33,
+	/* A new major set of opcodes starts at 100, leaving room for
+	 * older miscellaneous opcodes to be added in the future. These
+	 * opcodes may only be used if both the PF and VF have successfully
+	 * negotiated the VIRTCHNL_VF_CAP_EXT_FEATURES capability during the
+	 * initial capabilities exchange.
+	 */
+	VIRTCHNL_OP_GET_CAPS = 100,
+	VIRTCHNL_OP_CREATE_VPORT = 101,
+	VIRTCHNL_OP_DESTROY_VPORT = 102,
+	VIRTCHNL_OP_ENABLE_VPORT = 103,
+	VIRTCHNL_OP_DISABLE_VPORT = 104,
+	VIRTCHNL_OP_CONFIG_TX_QUEUES = 105,
+	VIRTCHNL_OP_CONFIG_RX_QUEUES = 106,
+	VIRTCHNL_OP_ENABLE_QUEUES_V2 = 107,
+	VIRTCHNL_OP_DISABLE_QUEUES_V2 = 108,
+	VIRTCHNL_OP_ADD_QUEUES = 109,
+	VIRTCHNL_OP_DEL_QUEUES = 110,
+	VIRTCHNL_OP_MAP_QUEUE_VECTOR = 111,
+	VIRTCHNL_OP_UNMAP_QUEUE_VECTOR = 112,
+	VIRTCHNL_OP_GET_RSS_KEY = 113,
+	VIRTCHNL_OP_GET_RSS_LUT = 114,
+	VIRTCHNL_OP_SET_RSS_LUT = 115,
+	VIRTCHNL_OP_GET_RSS_HASH = 116,
+	VIRTCHNL_OP_SET_RSS_HASH = 117,
+	VIRTCHNL_OP_CREATE_VFS = 118,
+	VIRTCHNL_OP_DESTROY_VFS = 119,
+	VIRTCHNL_OP_ALLOC_VECTORS = 120,
+	VIRTCHNL_OP_DEALLOC_VECTORS = 121,
 };
 
 /* These macros are used to generate compilation errors if a structure/union
@@ -463,6 +491,21 @@ VIRTCHNL_CHECK_STRUCT_LEN(4, virtchnl_promisc_info);
  * PF replies with struct eth_stats in an external buffer.
  */
 
+struct virtchnl_eth_stats {
+	u64 rx_bytes;                   /* received bytes */
+	u64 rx_unicast;                 /* received unicast pkts */
+	u64 rx_multicast;               /* received multicast pkts */
+	u64 rx_broadcast;               /* received broadcast pkts */
+	u64 rx_discards;
+	u64 rx_unknown_protocol;
+	u64 tx_bytes;                   /* transmitted bytes */
+	u64 tx_unicast;                 /* transmitted unicast pkts */
+	u64 tx_multicast;               /* transmitted multicast pkts */
+	u64 tx_broadcast;               /* transmitted broadcast pkts */
+	u64 tx_discards;
+	u64 tx_errors;
+};
+
 /* VIRTCHNL_OP_CONFIG_RSS_KEY
  * VIRTCHNL_OP_CONFIG_RSS_LUT
  * VF sends these messages to configure RSS. Only supported if both PF
@@ -668,6 +711,397 @@ enum virtchnl_vfr_states {
 	VIRTCHNL_VFR_VFACTIVE,
 };
 
+/* PF capability flags
+ * The VIRTCHNL_CAP_STATELESS_OFFLOADS flag indicates stateless offloads
+ * such as TX/RX checksum offloading and TSO for non-tunneled packets. Note
+ * that old and new capabilities are mutually exclusive and must not be
+ * mixed.
+ */
+#define VIRTCHNL_CAP_STATELESS_OFFLOADS	BIT(1)
+#define VIRTCHNL_CAP_UDP_SEG_OFFLOAD		BIT(2)
+#define VIRTCHNL_CAP_RSS			BIT(3)
+#define VIRTCHNL_CAP_TCP_RSC			BIT(4)
+#define VIRTCHNL_CAP_HEADER_SPLIT		BIT(5)
+#define VIRTCHNL_CAP_RDMA			BIT(6)
+#define VIRTCHNL_CAP_SRIOV			BIT(7)
+/* Earliest Departure Time capability used for Timing Wheel */
+#define VIRTCHNL_CAP_EDT			BIT(8)
+
+/* Type of virtual port */
+enum virtchnl_vport_type {
+	VIRTCHNL_VPORT_TYPE_DEFAULT	= 0,
+};
+
+/* Type of queue model */
+enum virtchnl_queue_model {
+	VIRTCHNL_QUEUE_MODEL_SINGLE	= 0,
+	VIRTCHNL_QUEUE_MODEL_SPLIT	= 1,
+};
+
+/* TX and RX queue types are valid in legacy as well as split queue models.
+ * With Split Queue model, 2 additional types are introduced - TX_COMPLETION
+ * and RX_BUFFER. In split queue model, RX corresponds to the queue where HW
+ * posts completions.
+ */
+enum virtchnl_queue_type {
+	VIRTCHNL_QUEUE_TYPE_TX			= 0,
+	VIRTCHNL_QUEUE_TYPE_RX			= 1,
+	VIRTCHNL_QUEUE_TYPE_TX_COMPLETION	= 2,
+	VIRTCHNL_QUEUE_TYPE_RX_BUFFER		= 3,
+};
+
+/* RX Queue Feature bits */
+#define VIRTCHNL_RXQ_RSC			BIT(1)
+#define VIRTCHNL_RXQ_HDR_SPLIT			BIT(2)
+#define VIRTCHNL_RXQ_IMMEDIATE_WRITE_BACK	BIT(4)
+
+/* RX Queue Descriptor Sizes */
+enum virtchnl_rxq_desc_size {
+	VIRTCHNL_RXQ_DESC_SIZE_16BYTE	= 0,
+	VIRTCHNL_RXQ_DESC_SIZE_32BYTE	= 1,
+};
+
+/* TX Queue Scheduling Modes: Queue mode is the legacy type, i.e. in-order
+ * packet processing, and Flow mode is out-of-order packet processing.
+ */
+enum virtchnl_txq_sched_mode {
+	VIRTCHNL_TXQ_SCHED_MODE_QUEUE	= 0,
+	VIRTCHNL_TXQ_SCHED_MODE_FLOW	= 1,
+};
+
+/* Queue Descriptor Profiles: Base mode is the legacy descriptor format and
+ * Native mode uses the flex descriptors.
+ */
+enum virtchnl_desc_profile {
+	VIRTCHNL_TXQ_DESC_PROFILE_BASE		= 0,
+	VIRTCHNL_TXQ_DESC_PROFILE_NATIVE	= 1,
+};
+
+/* Type of RSS algorithm */
+enum virtchnl_rss_algorithm {
+	VIRTCHNL_RSS_ALG_TOEPLITZ_ASYMMETRIC	= 0,
+	VIRTCHNL_RSS_ALG_R_ASYMMETRIC		= 1,
+	VIRTCHNL_RSS_ALG_TOEPLITZ_SYMMETRIC	= 2,
+	VIRTCHNL_RSS_ALG_XOR_SYMMETRIC		= 3,
+};
+
+/* VIRTCHNL_OP_GET_CAPS
+ * PF sends this message to CP to negotiate capabilities by filling
+ * in the u64 bitmap of its desired capabilities, max_num_vfs and
+ * num_allocated_vectors.
+ * CP responds with an updated virtchnl_get_capabilities structure
+ * with allowed capabilities and the other fields as below.
+ * If PF sets max_num_vfs as 0, CP will respond with max number of VFs
+ * that can be created by this PF. For any other value 'n', CP responds
+ * with max_num_vfs set to min(n, x) where x is the max number of VFs
+ * allowed by CP's policy.
+ * If PF sets num_allocated_vectors as 0, CP will respond with 1 which
+ * is default vector associated with the default mailbox. For any other
+ * value 'n', CP responds with a value <= n based on the CP's policy of
+ * max number of vectors for a PF.
+ */
+struct virtchnl_get_capabilities {
+	u64 cap_flags;
+	u16 max_num_vfs;
+	u16 num_allocated_vectors;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(16, virtchnl_get_capabilities);
+
+/* structure to specify a chunk of contiguous queues */
+struct virtchnl_queue_chunk {
+	enum virtchnl_queue_type type;
+	u16 start_queue_id;
+	u16 num_queues;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(8, virtchnl_queue_chunk);
+
+/* structure to specify several chunks of contiguous queues */
+struct virtchnl_queue_chunks {
+	u16 num_chunks;
+	u16 rsvd;
+	struct virtchnl_queue_chunk chunks[1];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(12, virtchnl_queue_chunks);
+
+/* VIRTCHNL_OP_CREATE_VPORT
+ * PF sends this message to CP to create a vport by filling in the first 7
+ * fields of virtchnl_create_vport structure (vport type, Tx, Rx queue models
+ * and desired number of queues and vectors). CP responds with the updated
+ * virtchnl_create_vport structure containing the number of assigned queues,
+ * vectors, vport id, max mtu, default mac addr followed by chunks which in turn
+ * will have an array of num_chunks entries of virtchnl_queue_chunk structures.
+ */
+struct virtchnl_create_vport {
+	enum virtchnl_vport_type vport_type;
+	/* single or split */
+	enum virtchnl_queue_model txq_model;
+	/* single or split */
+	enum virtchnl_queue_model rxq_model;
+	u16 num_tx_q;
+	/* valid only if txq_model is split Q */
+	u16 num_tx_complq;
+	u16 num_rx_q;
+	/* valid only if rxq_model is split Q */
+	u16 num_rx_bufq;
+	u16 vport_id;
+	u16 max_mtu;
+	u8 default_mac_addr[ETH_ALEN];
+	enum virtchnl_rss_algorithm rss_algorithm;
+	u16 rss_key_size;
+	u16 rss_lut_size;
+	u16 qset_handle;
+	struct virtchnl_queue_chunks chunks;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(56, virtchnl_create_vport);
+
+/* VIRTCHNL_OP_DESTROY_VPORT
+ * VIRTCHNL_OP_ENABLE_VPORT
+ * VIRTCHNL_OP_DISABLE_VPORT
+ * PF sends this message to CP to destroy, enable or disable a vport by filling
+ * in the vport_id in virtchnl_vport structure.
+ * CP responds with the status of the requested operation.
+ */
+struct virtchnl_vport {
+	u16 vport_id;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(2, virtchnl_vport);
+
+/* Tx queue config info */
+struct virtchnl_txq_info_v2 {
+	u16 queue_id;
+	/* single or split */
+	enum virtchnl_queue_model model;
+	/* Tx or tx_completion */
+	enum virtchnl_queue_type type;
+	/* queue or flow based */
+	enum virtchnl_txq_sched_mode sched_mode;
+	/* base or native */
+	enum virtchnl_desc_profile desc_profile;
+	u16 ring_len;
+	u64 dma_ring_addr;
+	/* valid only if queue model is split and type is Tx */
+	u16 tx_compl_queue_id;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(40, virtchnl_txq_info_v2);
+
+/* VIRTCHNL_OP_CONFIG_TX_QUEUES
+ * PF sends this message to set up parameters for one or more TX queues.
+ * This message contains an array of num_qinfo instances of virtchnl_txq_info_v2
+ * structures. CP configures requested queues and returns a status code. If
+ * num_qinfo specified is greater than the number of queues associated with the
+ * vport, an error is returned and no queues are configured.
+ */
+struct virtchnl_config_tx_queues {
+	u16 vport_id;
+	u16 num_qinfo;
+	u32 rsvd;
+	struct virtchnl_txq_info_v2 qinfo[1];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(48, virtchnl_config_tx_queues);
+
+/* Rx queue config info */
+struct virtchnl_rxq_info_v2 {
+	u16 queue_id;
+	/* single or split */
+	enum virtchnl_queue_model model;
+	/* Rx or Rx buffer */
+	enum virtchnl_queue_type type;
+	/* base or native */
+	enum virtchnl_desc_profile desc_profile;
+	/* rsc, header-split, immediate write back */
+	u16 queue_flags;
+	/* 16 or 32 byte */
+	enum virtchnl_rxq_desc_size desc_size;
+	u16 ring_len;
+	u16 hdr_buffer_size;
+	u32 data_buffer_size;
+	u32 max_pkt_size;
+	u64 dma_ring_addr;
+	u64 dma_head_wb_addr;
+	u16 rsc_low_watermark;
+	u8 buffer_notif_stride;
+	enum virtchnl_rx_hsplit rx_split_pos;
+	/* valid only if queue model is split and type is Rx buffer */
+	u16 rx_bufq1_id;
+	/* valid only if queue model is split and type is Rx buffer */
+	u16 rx_bufq2_id;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(72, virtchnl_rxq_info_v2);
+
+/* VIRTCHNL_OP_CONFIG_RX_QUEUES
+ * PF sends this message to set up parameters for one or more RX queues.
+ * This message contains an array of num_qinfo instances of virtchnl_rxq_info_v2
+ * structures. CP configures requested queues and returns a status code.
+ * If the number of queues specified is greater than the number of queues
+ * associated with the vport, an error is returned and no queues are configured.
+ */
+struct virtchnl_config_rx_queues {
+	u16 vport_id;
+	u16 num_qinfo;
+	struct virtchnl_rxq_info_v2 qinfo[1];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(80, virtchnl_config_rx_queues);
+
+/* VIRTCHNL_OP_ADD_QUEUES
+ * PF sends this message to request additional TX/RX queues beyond the ones
+ * that were assigned via CREATE_VPORT request. virtchnl_add_queues structure is
+ * used to specify the number of each type of queues.
+ * CP responds with the same structure with the actual number of queues assigned
+ * followed by num_chunks of virtchnl_queue_chunk structures.
+ */
+struct virtchnl_add_queues {
+	u16 vport_id;
+	u16 num_tx_q;
+	u16 num_tx_complq;
+	u16 num_rx_q;
+	u16 num_rx_bufq;
+	struct virtchnl_queue_chunks chunks;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(24, virtchnl_add_queues);
+
+/* VIRTCHNL_OP_ENABLE_QUEUES_V2
+ * VIRTCHNL_OP_DISABLE_QUEUES_V2
+ * VIRTCHNL_OP_DEL_QUEUES
+ * PF sends these messages to enable, disable or delete queues specified in
+ * chunks. PF sends virtchnl_del_ena_dis_queues struct to specify the queues
+ * to be enabled/disabled/deleted. Also applicable to single queue RX or
+ * TX. CP performs requested action and returns status.
+ */
+struct virtchnl_del_ena_dis_queues {
+	u16 vport_id;
+	struct virtchnl_queue_chunks chunks;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(16, virtchnl_del_ena_dis_queues);
+
+/* Virtchannel interrupt throttling rate index */
+enum virtchnl_itr_idx {
+	VIRTCHNL_ITR_IDX_0	= 0,
+	VIRTCHNL_ITR_IDX_1	= 1,
+	VIRTCHNL_ITR_IDX_NO_ITR	= 3,
+};
+
+/* Queue to vector mapping */
+struct virtchnl_queue_vector {
+	u16 queue_id;
+	u16 vector_id;
+	enum virtchnl_itr_idx itr_idx;
+	enum virtchnl_queue_type queue_type;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(12, virtchnl_queue_vector);
+
+/* VIRTCHNL_OP_MAP_QUEUE_VECTOR
+ * VIRTCHNL_OP_UNMAP_QUEUE_VECTOR
+ * PF sends this message to map or unmap queues to vectors and ITR index
+ * registers. External data buffer contains virtchnl_queue_vector_maps structure
+ * that contains num_maps of virtchnl_queue_vector structures.
+ * CP maps the requested queue vector maps after validating the queue and vector
+ * ids and returns a status code.
+ */
+struct virtchnl_queue_vector_maps {
+	u16 vport_id;
+	u16 num_maps;
+	struct virtchnl_queue_vector qv_maps[1];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(16, virtchnl_queue_vector_maps);
+
+/* Structure to specify a chunk of contiguous interrupt vectors */
+struct virtchnl_vector_chunk {
+	u16 start_vector_id;
+	u16 num_vectors;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(4, virtchnl_vector_chunk);
+
+/* Structure to specify several chunks of contiguous interrupt vectors */
+struct virtchnl_vector_chunks {
+	u16 num_vector_chunks;
+	struct virtchnl_vector_chunk num_vchunk[1];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(6, virtchnl_vector_chunks);
+
+/* VIRTCHNL_OP_ALLOC_VECTORS
+ * PF sends this message to request additional interrupt vectors beyond the
+ * ones that were assigned via GET_CAPS request. virtchnl_alloc_vectors
+ * structure is used to specify the number of vectors requested. CP responds
+ * with the same structure with the actual number of vectors assigned followed
+ * by virtchnl_vector_chunks structure identifying the vector ids.
+ */
+struct virtchnl_alloc_vectors {
+	u16 num_vectors;
+	struct virtchnl_vector_chunks vchunks;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(8, virtchnl_alloc_vectors);
+
+/* VIRTCHNL_OP_DEALLOC_VECTORS
+ * PF sends this message to release the vectors.
+ * PF sends virtchnl_vector_chunks struct to specify the vectors it is giving
+ * away. CP performs requested action and returns status.
+ */
+
+/* VIRTCHNL_OP_GET_RSS_LUT
+ * VIRTCHNL_OP_SET_RSS_LUT
+ * PF sends this message to get or set RSS lookup table. Only supported if
+ * both PF and CP drivers set the VIRTCHNL_CAP_RSS bit during configuration
+ * negotiation. Uses the virtchnl_rss_lut_v2 structure
+ */
+struct virtchnl_rss_lut_v2 {
+	u16 vport_id;
+	u16 lut_entries;
+	u16 lut[1]; /* RSS lookup table */
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(6, virtchnl_rss_lut_v2);
+
+/* VIRTCHNL_OP_GET_RSS_KEY
+ * PF sends this message to get RSS key. Only supported if
+ * both PF and CP drivers set the VIRTCHNL_CAP_RSS bit during configuration
+ * negotiation. Uses the virtchnl_rss_key structure
+ */
+
+/* VIRTCHNL_OP_GET_RSS_HASH
+ * VIRTCHNL_OP_SET_RSS_HASH
+ * PF sends these messages to get and set the hash filter enable bits for RSS.
+ * By default, the CP sets these to all possible traffic types that the
+ * hardware supports. The PF can query this value if it wants to change the
+ * traffic types that are hashed by the hardware.
+ * Only supported if both PF and CP drivers set the VIRTCHNL_CAP_RSS bit
+ * during configuration negotiation.
+ */
+struct virtchnl_rss_hash {
+	u64 hash;
+	u16 vport_id;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(16, virtchnl_rss_hash);
+
+/* VIRTCHNL_OP_CREATE_VFS
+ * VIRTCHNL_OP_DESTROY_VFS
+ * This message is used to let the CP know how many SRIOV VFs need to be
+ * created. The actual allocation of resources for the VFs in terms of VSI,
+ * Queues and Interrupts is done by CP. When this call completes, the APF driver
+ * calls pci_enable_sriov to let the OS instantiate the SRIOV PCIe devices.
+ */
+struct virtchnl_sriov_vfs_info {
+	u16 num_vfs;
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(2, virtchnl_sriov_vfs_info);
+
 /**
  * virtchnl_vc_validate_vf_msg
  * @ver: Virtchnl version info
@@ -828,6 +1262,164 @@ virtchnl_vc_validate_vf_msg(struct virtchnl_version_info *ver, u32 v_opcode,
 	case VIRTCHNL_OP_DEL_CLOUD_FILTER:
 		valid_len = sizeof(struct virtchnl_filter);
 		break;
+	case VIRTCHNL_OP_GET_CAPS:
+		valid_len = sizeof(struct virtchnl_get_capabilities);
+		break;
+	case VIRTCHNL_OP_CREATE_VPORT:
+		valid_len = sizeof(struct virtchnl_create_vport);
+		if (msglen >= valid_len) {
+			struct virtchnl_create_vport *cvport =
+				(struct virtchnl_create_vport *)msg;
+
+			if (cvport->chunks.num_chunks == 0) {
+				/* zero chunks is allowed as input */
+				break;
+			}
+
+			valid_len += (cvport->chunks.num_chunks - 1) *
+				sizeof(struct virtchnl_queue_chunk);
+		}
+		break;
+	case VIRTCHNL_OP_DESTROY_VPORT:
+	case VIRTCHNL_OP_ENABLE_VPORT:
+	case VIRTCHNL_OP_DISABLE_VPORT:
+		valid_len = sizeof(struct virtchnl_vport);
+		break;
+	case VIRTCHNL_OP_CONFIG_TX_QUEUES:
+		valid_len = sizeof(struct virtchnl_config_tx_queues);
+		if (msglen >= valid_len) {
+			struct virtchnl_config_tx_queues *ctq =
+				(struct virtchnl_config_tx_queues *)msg;
+			if (ctq->num_qinfo == 0) {
+				err_msg_format = true;
+				break;
+			}
+			valid_len += (ctq->num_qinfo - 1) *
+				sizeof(struct virtchnl_txq_info_v2);
+		}
+		break;
+	case VIRTCHNL_OP_CONFIG_RX_QUEUES:
+		valid_len = sizeof(struct virtchnl_config_rx_queues);
+		if (msglen >= valid_len) {
+			struct virtchnl_config_rx_queues *crq =
+				(struct virtchnl_config_rx_queues *)msg;
+			if (crq->num_qinfo == 0) {
+				err_msg_format = true;
+				break;
+			}
+			valid_len += (crq->num_qinfo - 1) *
+				sizeof(struct virtchnl_rxq_info_v2);
+		}
+		break;
+	case VIRTCHNL_OP_ADD_QUEUES:
+		valid_len = sizeof(struct virtchnl_add_queues);
+		if (msglen >= valid_len) {
+			struct virtchnl_add_queues *add_q =
+				(struct virtchnl_add_queues *)msg;
+
+			if (add_q->chunks.num_chunks == 0) {
+				/* zero chunks is allowed as input */
+				break;
+			}
+
+			valid_len += (add_q->chunks.num_chunks - 1) *
+				sizeof(struct virtchnl_queue_chunk);
+		}
+		break;
+	case VIRTCHNL_OP_ENABLE_QUEUES_V2:
+	case VIRTCHNL_OP_DISABLE_QUEUES_V2:
+	case VIRTCHNL_OP_DEL_QUEUES:
+		valid_len = sizeof(struct virtchnl_del_ena_dis_queues);
+		if (msglen >= valid_len) {
+			struct virtchnl_del_ena_dis_queues *qs =
+				(struct virtchnl_del_ena_dis_queues *)msg;
+			if (qs->chunks.num_chunks == 0) {
+				err_msg_format = true;
+				break;
+			}
+			valid_len += (qs->chunks.num_chunks - 1) *
+				sizeof(struct virtchnl_queue_chunk);
+		}
+		break;
+	case VIRTCHNL_OP_MAP_QUEUE_VECTOR:
+	case VIRTCHNL_OP_UNMAP_QUEUE_VECTOR:
+		valid_len = sizeof(struct virtchnl_queue_vector_maps);
+		if (msglen >= valid_len) {
+			struct virtchnl_queue_vector_maps *v_qp =
+				(struct virtchnl_queue_vector_maps *)msg;
+			if (v_qp->num_maps == 0) {
+				err_msg_format = true;
+				break;
+			}
+			valid_len += (v_qp->num_maps - 1) *
+				sizeof(struct virtchnl_queue_vector);
+		}
+		break;
+	case VIRTCHNL_OP_ALLOC_VECTORS:
+		valid_len = sizeof(struct virtchnl_alloc_vectors);
+		if (msglen >= valid_len) {
+			struct virtchnl_alloc_vectors *v_av =
+				(struct virtchnl_alloc_vectors *)msg;
+
+			if (v_av->vchunks.num_vector_chunks == 0) {
+				/* zero chunks is allowed as input */
+				break;
+			}
+
+			valid_len += (v_av->vchunks.num_vector_chunks - 1) *
+				sizeof(struct virtchnl_vector_chunk);
+		}
+		break;
+	case VIRTCHNL_OP_DEALLOC_VECTORS:
+		valid_len = sizeof(struct virtchnl_vector_chunks);
+		if (msglen >= valid_len) {
+			struct virtchnl_vector_chunks *v_chunks =
+				(struct virtchnl_vector_chunks *)msg;
+			if (v_chunks->num_vector_chunks == 0) {
+				err_msg_format = true;
+				break;
+			}
+			valid_len += (v_chunks->num_vector_chunks - 1) *
+				sizeof(struct virtchnl_vector_chunk);
+		}
+		break;
+	case VIRTCHNL_OP_GET_RSS_KEY:
+		valid_len = sizeof(struct virtchnl_rss_key);
+		if (msglen >= valid_len) {
+			struct virtchnl_rss_key *vrk =
+				(struct virtchnl_rss_key *)msg;
+
+			if (vrk->key_len == 0) {
+				/* zero length is allowed as input */
+				break;
+			}
+
+			valid_len += vrk->key_len - 1;
+		}
+		break;
+	case VIRTCHNL_OP_GET_RSS_LUT:
+	case VIRTCHNL_OP_SET_RSS_LUT:
+		valid_len = sizeof(struct virtchnl_rss_lut_v2);
+		if (msglen >= valid_len) {
+			struct virtchnl_rss_lut_v2 *vrl =
+				(struct virtchnl_rss_lut_v2 *)msg;
+
+			if (vrl->lut_entries == 0) {
+				/* zero entries is allowed as input */
+				break;
+			}
+
+			valid_len += (vrl->lut_entries - 1) * sizeof(u16);
+		}
+		break;
+	case VIRTCHNL_OP_GET_RSS_HASH:
+	case VIRTCHNL_OP_SET_RSS_HASH:
+		valid_len = sizeof(struct virtchnl_rss_hash);
+		break;
+	case VIRTCHNL_OP_CREATE_VFS:
+	case VIRTCHNL_OP_DESTROY_VFS:
+		valid_len = sizeof(struct virtchnl_sriov_vfs_info);
+		break;
 	/* These are always errors coming from the VF. */
 	case VIRTCHNL_OP_EVENT:
 	case VIRTCHNL_OP_UNKNOWN:
-- 
2.26.2



* [net-next 08/15] iecm: Implement vector allocation
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

This allocates PCI vectors and maps them to interrupt
routines.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
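Note for reviewers: the chunk-to-ID expansion done by
iecm_vport_get_vec_ids() can be tried in userspace with the minimal
sketch below. It is an illustration only, not the driver code:
struct demo_chunk and demo_flatten_vec_ids() are hypothetical names,
and unlike the patch the sketch returns as soon as the output array is
full, which should give the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct virtchnl_vector_chunk. */
struct demo_chunk {
	unsigned short start_vector_id;
	unsigned short num_vectors;
};

/* Expand each (start, count) chunk into consecutive vector IDs.
 * Returns the number of IDs written, never more than max_ids.
 */
static int demo_flatten_vec_ids(unsigned short *ids, int max_ids,
				const struct demo_chunk *chunks,
				int num_chunks)
{
	int filled = 0;
	int i, j;

	assert(ids != NULL && chunks != NULL);

	for (j = 0; j < num_chunks; j++) {
		for (i = 0; i < chunks[j].num_vectors; i++) {
			if (filled >= max_ids)
				return filled;
			ids[filled++] = (unsigned short)
					(chunks[j].start_vector_id + i);
		}
	}

	return filled;
}
```

For example, the chunks {8, 3} and {32, 2} expand to the IDs
8, 9, 10, 32, 33.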
 drivers/net/ethernet/intel/iecm/iecm_lib.c    |  63 +-
 drivers/net/ethernet/intel/iecm/iecm_txrx.c   | 606 +++++++++++++++++-
 .../net/ethernet/intel/iecm/iecm_virtchnl.c   |  24 +-
 3 files changed, 669 insertions(+), 24 deletions(-)
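
Note for reviewers: the arithmetic at the end of
iecm_vport_intr_set_new_itr() can be checked with the userspace sketch
below. It copies the piecewise mapping from the patch, but the function
names are hypothetical and min_inc is a parameter standing in for
IECM_ITR_ADAPTIVE_MIN_INC, whose value is not defined in this patch.
The low-latency halving step is left out for brevity.

```c
#include <assert.h>

/* Map the average frame size (bytes) to the scaled value used by the
 * patch, which is 256 times larger than the final ITR contribution.
 */
static unsigned int demo_scaled_wire_size(unsigned int avg_wire_size)
{
	if (avg_wire_size <= 60)
		return 4096;
	if (avg_wire_size <= 380)
		return avg_wire_size * 40 + 1696;
	if (avg_wire_size <= 1084)
		return avg_wire_size * 15 + 11452;
	if (avg_wire_size <= 1980)
		return avg_wire_size * 5 + 22420;
	return 32256;
}

/* Equivalent of DIV_ROUND_UP(scaled, divisor) * min_inc. */
static unsigned int demo_itr_increment(unsigned int scaled,
				       unsigned int divisor,
				       unsigned int min_inc)
{
	assert(divisor != 0);
	return ((scaled + divisor - 1) / divisor) * min_inc;
}
```

Assuming min_inc = 2 and the 10G divisor of min_inc * 256 = 512, an
average frame of 1500 bytes scales to 29920 and yields an increment of
118, while a 64-byte average scales to 4256 and yields 18.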

diff --git a/drivers/net/ethernet/intel/iecm/iecm_lib.c b/drivers/net/ethernet/intel/iecm/iecm_lib.c
index 3f6878704b3e..a4fd04fd0500 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_lib.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_lib.c
@@ -15,7 +15,11 @@ extern int debug;
  */
 static void iecm_mb_intr_rel_irq(struct iecm_adapter *adapter)
 {
-	/* stub */
+	int irq_num;
+
+	irq_num = adapter->msix_entries[0].vector;
+	synchronize_irq(irq_num);
+	free_irq(irq_num, adapter);
 }
 
 /**
@@ -44,7 +48,12 @@ static void iecm_intr_rel(struct iecm_adapter *adapter)
  */
 irqreturn_t iecm_mb_intr_clean(int __always_unused irq, void *data)
 {
-	/* stub */
+	struct iecm_adapter *adapter = (struct iecm_adapter *)data;
+
+	set_bit(__IECM_MB_INTR_TRIGGER, adapter->flags);
+	queue_delayed_work(adapter->serv_wq, &adapter->serv_task,
+			   msecs_to_jiffies(0));
+	return IRQ_HANDLED;
 }
 
 /**
@@ -53,7 +62,12 @@ irqreturn_t iecm_mb_intr_clean(int __always_unused irq, void *data)
  */
 void iecm_mb_irq_enable(struct iecm_adapter *adapter)
 {
-	/* stub */
+	struct iecm_hw *hw = &adapter->hw;
+	struct iecm_intr_reg *intr = &adapter->mb_vector.intr_reg;
+	u32 val;
+
+	val = intr->dyn_ctl_intena_m | intr->dyn_ctl_itridx_m;
+	writel_relaxed(val, (u8 *)(hw->hw_addr + intr->dyn_ctl));
 }
 
 /**
@@ -62,7 +76,22 @@ void iecm_mb_irq_enable(struct iecm_adapter *adapter)
  */
 int iecm_mb_intr_req_irq(struct iecm_adapter *adapter)
 {
-	/* stub */
+	struct iecm_q_vector *mb_vector = &adapter->mb_vector;
+	int irq_num, mb_vidx = 0, err;
+
+	irq_num = adapter->msix_entries[mb_vidx].vector;
+	snprintf(mb_vector->name, sizeof(mb_vector->name) - 1,
+		 "%s-%s-%d", dev_driver_string(&adapter->pdev->dev),
+		 "Mailbox", mb_vidx);
+	err = request_irq(irq_num, adapter->irq_mb_handler, 0,
+			  mb_vector->name, adapter);
+	if (err) {
+		dev_err(&adapter->pdev->dev,
+			"Request_irq for mailbox failed, error: %d\n", err);
+		return err;
+	}
+	set_bit(__IECM_MB_INTR_MODE, adapter->flags);
+	return 0;
 }
 
 /**
@@ -74,7 +103,16 @@ int iecm_mb_intr_req_irq(struct iecm_adapter *adapter)
  */
 void iecm_get_mb_vec_id(struct iecm_adapter *adapter)
 {
-	/* stub */
+	struct virtchnl_vector_chunks *vchunks;
+	struct virtchnl_vector_chunk *chunk;
+
+	if (adapter->req_vec_chunks) {
+		vchunks = &adapter->req_vec_chunks->vchunks;
+		chunk = &vchunks->num_vchunk[0];
+		adapter->mb_vector.v_idx = chunk->start_vector_id;
+	} else {
+		adapter->mb_vector.v_idx = 0;
+	}
 }
 
 /**
@@ -83,7 +121,13 @@ void iecm_get_mb_vec_id(struct iecm_adapter *adapter)
  */
 int iecm_mb_intr_init(struct iecm_adapter *adapter)
 {
-	/* stub */
+	int err = 0;
+
+	iecm_get_mb_vec_id(adapter);
+	adapter->dev_ops.reg_ops.mb_intr_reg_init(adapter);
+	adapter->irq_mb_handler = iecm_mb_intr_clean;
+	err = iecm_mb_intr_req_irq(adapter);
+	return err;
 }
 
 /**
@@ -95,7 +139,12 @@ int iecm_mb_intr_init(struct iecm_adapter *adapter)
  */
 void iecm_intr_distribute(struct iecm_adapter *adapter)
 {
-	/* stub */
+	struct iecm_vport *vport;
+
+	vport = adapter->vports[0];
+	if (adapter->num_msix_entries != adapter->num_req_msix)
+		vport->num_q_vectors = adapter->num_msix_entries -
+				       IECM_MAX_NONQ_VEC - IECM_MIN_RDMA_VEC;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/iecm/iecm_txrx.c b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
index 0d684adc15e5..da3065a87c2c 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_txrx.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
@@ -1002,7 +1002,16 @@ iecm_vport_intr_clean_queues(int __always_unused irq, void *data)
  */
 static void iecm_vport_intr_napi_dis_all(struct iecm_vport *vport)
 {
-	/* stub */
+	int q_idx;
+
+	if (!vport->netdev)
+		return;
+
+	for (q_idx = 0; q_idx < vport->num_q_vectors; q_idx++) {
+		struct iecm_q_vector *q_vector = &vport->q_vectors[q_idx];
+
+		napi_disable(&q_vector->napi);
+	}
 }
 
 /**
@@ -1013,7 +1022,44 @@ static void iecm_vport_intr_napi_dis_all(struct iecm_vport *vport)
  */
 static void iecm_vport_intr_rel(struct iecm_vport *vport)
 {
-	/* stub */
+	int i, j, v_idx;
+
+	if (!vport->netdev)
+		return;
+
+	for (v_idx = 0; v_idx < vport->num_q_vectors; v_idx++) {
+		struct iecm_q_vector *q_vector = &vport->q_vectors[v_idx];
+
+		if (q_vector)
+			netif_napi_del(&q_vector->napi);
+	}
+
+	/* Clean up the mapping of queues to vectors */
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		struct iecm_rxq_group *rx_qgrp = &vport->rxq_grps[i];
+
+		if (iecm_is_queue_model_split(vport->rxq_model)) {
+			for (j = 0; j < rx_qgrp->splitq.num_rxq_sets; j++)
+				rx_qgrp->splitq.rxq_sets[j].rxq.q_vector =
+									   NULL;
+		} else {
+			for (j = 0; j < rx_qgrp->singleq.num_rxq; j++)
+				rx_qgrp->singleq.rxqs[j].q_vector = NULL;
+		}
+	}
+
+	if (iecm_is_queue_model_split(vport->txq_model)) {
+		for (i = 0; i < vport->num_txq_grp; i++)
+			vport->txq_grps[i].complq->q_vector = NULL;
+	} else {
+		for (i = 0; i < vport->num_txq_grp; i++) {
+			for (j = 0; j < vport->txq_grps[i].num_txq; j++)
+				vport->txq_grps[i].txqs[j].q_vector = NULL;
+		}
+	}
+
+	kfree(vport->q_vectors);
+	vport->q_vectors = NULL;
 }
 
 /**
@@ -1022,7 +1068,25 @@ static void iecm_vport_intr_rel(struct iecm_vport *vport)
  */
 static void iecm_vport_intr_rel_irq(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	int vector;
+
+	for (vector = 0; vector < vport->num_q_vectors; vector++) {
+		struct iecm_q_vector *q_vector = &vport->q_vectors[vector];
+		int irq_num, vidx;
+
+		/* free only the IRQs that were actually requested */
+		if (!q_vector)
+			continue;
+
+		vidx = vector + vport->q_vector_base;
+		irq_num = adapter->msix_entries[vidx].vector;
+
+		/* clear the affinity_mask in the IRQ descriptor */
+		irq_set_affinity_hint(irq_num, NULL);
+		synchronize_irq(irq_num);
+		free_irq(irq_num, q_vector);
+	}
 }
 
 /**
@@ -1031,7 +1095,13 @@ static void iecm_vport_intr_rel_irq(struct iecm_vport *vport)
  */
 void iecm_vport_intr_dis_irq_all(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_q_vector *q_vector = vport->q_vectors;
+	struct iecm_hw *hw = &vport->adapter->hw;
+	int q_idx;
+
+	for (q_idx = 0; q_idx < vport->num_q_vectors; q_idx++)
+		writel_relaxed(0, (u8 *)(hw->hw_addr +
+					 q_vector[q_idx].intr_reg.dyn_ctl));
 }
 
 /**
@@ -1043,12 +1113,42 @@ void iecm_vport_intr_dis_irq_all(struct iecm_vport *vport)
 static u32 iecm_vport_intr_buildreg_itr(struct iecm_q_vector *q_vector,
 					const int type, u16 itr)
 {
-	/* stub */
+	u32 itr_val;
+
+	itr &= IECM_ITR_MASK;
+	/* Don't clear PBA because that can cause lost interrupts that
+	 * came in while we were cleaning/polling
+	 */
+	itr_val = q_vector->intr_reg.dyn_ctl_intena_m |
+		  (type << q_vector->intr_reg.dyn_ctl_itridx_s) |
+		  (itr << (q_vector->intr_reg.dyn_ctl_intrvl_s - 1));
+
+	return itr_val;
 }
 
 static inline unsigned int iecm_itr_divisor(struct iecm_q_vector *q_vector)
 {
-	/* stub */
+	unsigned int divisor;
+
+	switch (q_vector->vport->adapter->link_speed) {
+	case VIRTCHNL_LINK_SPEED_40GB:
+		divisor = IECM_ITR_ADAPTIVE_MIN_INC * 1024;
+		break;
+	case VIRTCHNL_LINK_SPEED_25GB:
+	case VIRTCHNL_LINK_SPEED_20GB:
+		divisor = IECM_ITR_ADAPTIVE_MIN_INC * 512;
+		break;
+	default:
+	case VIRTCHNL_LINK_SPEED_10GB:
+		divisor = IECM_ITR_ADAPTIVE_MIN_INC * 256;
+		break;
+	case VIRTCHNL_LINK_SPEED_1GB:
+	case VIRTCHNL_LINK_SPEED_100MB:
+		divisor = IECM_ITR_ADAPTIVE_MIN_INC * 32;
+		break;
+	}
+
+	return divisor;
 }
 
 /**
@@ -1069,7 +1169,206 @@ static void iecm_vport_intr_set_new_itr(struct iecm_q_vector *q_vector,
 					struct iecm_itr *itr,
 					enum virtchnl_queue_type q_type)
 {
-	/* stub */
+	unsigned int avg_wire_size, packets = 0, bytes = 0, new_itr;
+	unsigned long next_update = jiffies;
+
+	/* If we don't have any queues just leave ourselves set for maximum
+	 * possible latency so we take ourselves out of the equation.
+	 */
+	if (!IECM_ITR_IS_DYNAMIC(itr->target_itr))
+		return;
+
+	/* For Rx we want to push the delay up and default to low latency.
+	 * For Tx we want to pull the delay down and default to high latency.
+	 */
+	new_itr = q_type == VIRTCHNL_QUEUE_TYPE_RX ?
+	      IECM_ITR_ADAPTIVE_MIN_USECS | IECM_ITR_ADAPTIVE_LATENCY :
+	      IECM_ITR_ADAPTIVE_MAX_USECS | IECM_ITR_ADAPTIVE_LATENCY;
+
+	/* If we didn't update within up to 1 - 2 jiffies we can assume
+	 * that either packets are coming in so slow there hasn't been
+	 * any work, or that there is so much work that NAPI is dealing
+	 * with interrupt moderation and we don't need to do anything.
+	 */
+	if (time_after(next_update, itr->next_update))
+		goto clear_counts;
+
+	/* If itr_countdown is set it means we programmed an ITR within
+	 * the last 4 interrupt cycles. This has a side effect of us
+	 * potentially firing an early interrupt. In order to work around
+	 * this we need to throw out any data received for a few
+	 * interrupts following the update.
+	 */
+	if (q_vector->itr_countdown) {
+		new_itr = itr->target_itr;
+		goto clear_counts;
+	}
+
+	if (q_type == VIRTCHNL_QUEUE_TYPE_TX) {
+		packets = itr->stats.tx.packets;
+		bytes = itr->stats.tx.bytes;
+	}
+
+	if (q_type == VIRTCHNL_QUEUE_TYPE_RX) {
+		packets = itr->stats.rx.packets;
+		bytes = itr->stats.rx.bytes;
+
+		/* If there are 1 to 3 Rx packets and bytes are less than
+		 * 9000 assume insufficient data to use bulk rate limiting
+		 * approach unless Tx is already in bulk rate limiting. We
+		 * are likely latency driven.
+		 */
+		if (packets && packets < 4 && bytes < 9000 &&
+		    (q_vector->tx[0]->itr.target_itr &
+		     IECM_ITR_ADAPTIVE_LATENCY)) {
+			new_itr = IECM_ITR_ADAPTIVE_LATENCY;
+			goto adjust_by_size;
+		}
+	} else if (packets < 4) {
+		/* If we have Tx and Rx ITR maxed and Tx ITR is running in
+		 * bulk mode and we are receiving 4 or fewer packets just
+		 * reset the ITR_ADAPTIVE_LATENCY bit for latency mode so
+		 * that the Rx can relax.
+		 */
+		if (itr->target_itr == IECM_ITR_ADAPTIVE_MAX_USECS &&
+		    ((q_vector->rx[0]->itr.target_itr & IECM_ITR_MASK) ==
+		     IECM_ITR_ADAPTIVE_MAX_USECS))
+			goto clear_counts;
+	} else if (packets > 32) {
+		/* If we have processed over 32 packets in a single interrupt
+		 * for Tx assume we need to switch over to "bulk" mode.
+		 */
+		itr->target_itr &= ~IECM_ITR_ADAPTIVE_LATENCY;
+	}
+
+	/* We have no packets to actually measure against. This means
+	 * either one of the other queues on this vector is active or
+	 * we are a Tx queue doing TSO with too high of an interrupt rate.
+	 *
+	 * Between 4 and 56 we can assume that our current interrupt delay
+	 * is only slightly too low. As such we should increase it by a small
+	 * fixed amount.
+	 */
+	if (packets < 56) {
+		new_itr = itr->target_itr + IECM_ITR_ADAPTIVE_MIN_INC;
+		if ((new_itr & IECM_ITR_MASK) > IECM_ITR_ADAPTIVE_MAX_USECS) {
+			new_itr &= IECM_ITR_ADAPTIVE_LATENCY;
+			new_itr += IECM_ITR_ADAPTIVE_MAX_USECS;
+		}
+		goto clear_counts;
+	}
+
+	if (packets <= 256) {
+		new_itr = min(q_vector->tx[0]->itr.current_itr,
+			      q_vector->rx[0]->itr.current_itr);
+		new_itr &= IECM_ITR_MASK;
+
+		/* Between 56 and 112 is our "goldilocks" zone where we are
+		 * working out "just right". Just report that our current
+		 * ITR is good for us.
+		 */
+		if (packets <= 112)
+			goto clear_counts;
+
+		/* If packet count is 128 or greater we are likely looking
+		 * at a slight overrun of the delay we want. Try halving
+		 * our delay to see if that will cut the number of packets
+		 * in half per interrupt.
+		 */
+		new_itr /= 2;
+		new_itr &= IECM_ITR_MASK;
+		if (new_itr < IECM_ITR_ADAPTIVE_MIN_USECS)
+			new_itr = IECM_ITR_ADAPTIVE_MIN_USECS;
+
+		goto clear_counts;
+	}
+
+	/* The paths below assume we are dealing with a bulk ITR since
+	 * number of packets is greater than 256. We are just going to have
+	 * to compute a value and try to bring the count under control,
+	 * though for smaller packet sizes there isn't much we can do as
+	 * NAPI polling will likely be kicking in sooner rather than later.
+	 */
+	new_itr = IECM_ITR_ADAPTIVE_BULK;
+
+adjust_by_size:
+	/* If packet counts are 256 or greater we can assume we have a gross
+	 * overestimation of what the rate should be. Instead of trying to fine
+	 * tune it just use the formula below to try and dial in an exact value
+	 * given the current packet size of the frame.
+	 */
+	avg_wire_size = bytes / packets;
+
+	/* The following is a crude approximation of:
+	 *  wmem_default / (size + overhead) = desired_pkts_per_int
+	 *  rate / bits_per_byte / (size + Ethernet overhead) = pkt_rate
+	 *  (desired_pkt_rate / pkt_rate) * usecs_per_sec = ITR value
+	 *
+	 * Assuming wmem_default is 212992 and overhead is 640 bytes per
+	 * packet, (256 skb, 64 headroom, 320 shared info), we can reduce the
+	 * formula down to
+	 *
+	 *  (170 * (size + 24)) / (size + 640) = ITR
+	 *
+	 * We first do some math on the packet size and then finally bit shift
+	 * by 8 after rounding up. We also have to account for PCIe link speed
+	 * difference as ITR scales based on this.
+	 */
+	if (avg_wire_size <= 60) {
+		/* Start at 250k ints/sec */
+		avg_wire_size = 4096;
+	} else if (avg_wire_size <= 380) {
+		/* 250K ints/sec to 60K ints/sec */
+		avg_wire_size *= 40;
+		avg_wire_size += 1696;
+	} else if (avg_wire_size <= 1084) {
+		/* 60K ints/sec to 36K ints/sec */
+		avg_wire_size *= 15;
+		avg_wire_size += 11452;
+	} else if (avg_wire_size <= 1980) {
+		/* 36K ints/sec to 30K ints/sec */
+		avg_wire_size *= 5;
+		avg_wire_size += 22420;
+	} else {
+		/* plateau at a limit of 30K ints/sec */
+		avg_wire_size = 32256;
+	}
+
+	/* If we are in low latency mode halve our delay which doubles the
+	 * rate to somewhere between 100K to 16K ints/sec
+	 */
+	if (new_itr & IECM_ITR_ADAPTIVE_LATENCY)
+		avg_wire_size /= 2;
+
+	/* Resultant value is 256 times larger than it needs to be. This
+	 * gives us room to adjust the value as needed to either increase
+	 * or decrease the value based on link speeds of 10G, 2.5G, 1G, etc.
+	 *
+	 * Use addition as we have already recorded the new latency flag
+	 * for the ITR value.
+	 */
+	new_itr += DIV_ROUND_UP(avg_wire_size, iecm_itr_divisor(q_vector)) *
+		   IECM_ITR_ADAPTIVE_MIN_INC;
+
+	if ((new_itr & IECM_ITR_MASK) > IECM_ITR_ADAPTIVE_MAX_USECS) {
+		new_itr &= IECM_ITR_ADAPTIVE_LATENCY;
+		new_itr += IECM_ITR_ADAPTIVE_MAX_USECS;
+	}
+
+clear_counts:
+	/* write back value */
+	itr->target_itr = new_itr;
+
+	/* next update should occur within next jiffy */
+	itr->next_update = next_update + 1;
+
+	if (q_type == VIRTCHNL_QUEUE_TYPE_RX) {
+		itr->stats.rx.bytes = 0;
+		itr->stats.rx.packets = 0;
+	} else if (q_type == VIRTCHNL_QUEUE_TYPE_TX) {
+		itr->stats.tx.bytes = 0;
+		itr->stats.tx.packets = 0;
+	}
 }
 
 /**
@@ -1078,7 +1377,59 @@ static void iecm_vport_intr_set_new_itr(struct iecm_q_vector *q_vector,
  */
 void iecm_vport_intr_update_itr_ena_irq(struct iecm_q_vector *q_vector)
 {
-	/* stub */
+	struct iecm_hw *hw = &q_vector->vport->adapter->hw;
+	struct iecm_itr *tx_itr = &q_vector->tx[0]->itr;
+	struct iecm_itr *rx_itr = &q_vector->rx[0]->itr;
+	u32 intval;
+
+	/* These will do nothing if dynamic updates are not enabled */
+	iecm_vport_intr_set_new_itr(q_vector, tx_itr, q_vector->tx[0]->q_type);
+	iecm_vport_intr_set_new_itr(q_vector, rx_itr, q_vector->rx[0]->q_type);
+
+	/* This block of logic allows us to get away with only updating
+	 * one ITR value with each interrupt. The idea is to perform a
+	 * pseudo-lazy update with the following criteria.
+	 *
+	 * 1. Rx is given higher priority than Tx if both are in same state
+	 * 2. If we must reduce an ITR that is given highest priority.
+	 * 3. We then give priority to increasing ITR based on amount.
+	 */
+	if (rx_itr->target_itr < rx_itr->current_itr) {
+		/* Rx ITR needs to be reduced, this is highest priority */
+		intval = iecm_vport_intr_buildreg_itr(q_vector,
+						      rx_itr->itr_idx,
+						      rx_itr->target_itr);
+		rx_itr->current_itr = rx_itr->target_itr;
+		q_vector->itr_countdown = ITR_COUNTDOWN_START;
+	} else if ((tx_itr->target_itr < tx_itr->current_itr) ||
+		   ((rx_itr->target_itr - rx_itr->current_itr) <
+		    (tx_itr->target_itr - tx_itr->current_itr))) {
+		/* Tx ITR needs to be reduced, this is second priority
+		 * Tx ITR needs to be increased more than Rx, fourth priority
+		 */
+		intval = iecm_vport_intr_buildreg_itr(q_vector,
+						      tx_itr->itr_idx,
+						      tx_itr->target_itr);
+		tx_itr->current_itr = tx_itr->target_itr;
+		q_vector->itr_countdown = ITR_COUNTDOWN_START;
+	} else if (rx_itr->current_itr != rx_itr->target_itr) {
+		/* Rx ITR needs to be increased, third priority */
+		intval = iecm_vport_intr_buildreg_itr(q_vector,
+						      rx_itr->itr_idx,
+						      rx_itr->target_itr);
+		rx_itr->current_itr = rx_itr->target_itr;
+		q_vector->itr_countdown = ITR_COUNTDOWN_START;
+	} else {
+		/* No ITR update, lowest priority */
+		intval = iecm_vport_intr_buildreg_itr(q_vector,
+						      VIRTCHNL_ITR_IDX_NO_ITR,
+						      0);
+		if (q_vector->itr_countdown)
+			q_vector->itr_countdown--;
+	}
+
+	writel_relaxed(intval, (u8 *)(hw->hw_addr +
+				      q_vector->intr_reg.dyn_ctl));
 }
 
 /**
@@ -1089,7 +1440,40 @@ void iecm_vport_intr_update_itr_ena_irq(struct iecm_q_vector *q_vector)
 static int
 iecm_vport_intr_req_irq(struct iecm_vport *vport, char *basename)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	int vector, err, irq_num, vidx;
+
+	for (vector = 0; vector < vport->num_q_vectors; vector++) {
+		struct iecm_q_vector *q_vector = &vport->q_vectors[vector];
+
+		vidx = vector + vport->q_vector_base;
+		irq_num = adapter->msix_entries[vidx].vector;
+
+		snprintf(q_vector->name, sizeof(q_vector->name) - 1,
+			 "%s-%s-%d", basename, "TxRx", vidx);
+
+		err = request_irq(irq_num, vport->irq_q_handler, 0,
+				  q_vector->name, q_vector);
+		if (err) {
+			netdev_err(vport->netdev,
+				   "Request_irq failed, error: %d\n", err);
+			goto free_q_irqs;
+		}
+		/* assign the mask for this IRQ */
+		irq_set_affinity_hint(irq_num, &q_vector->affinity_mask);
+	}
+
+	return 0;
+
+free_q_irqs:
+	while (vector) {
+		vector--;
+		vidx = vector + vport->q_vector_base;
+		irq_num = adapter->msix_entries[vidx].vector;
+		free_irq(irq_num,
+			 &vport->q_vectors[vector]);
+	}
+	return err;
 }
 
 /**
@@ -1098,7 +1482,14 @@ iecm_vport_intr_req_irq(struct iecm_vport *vport, char *basename)
  */
 void iecm_vport_intr_ena_irq_all(struct iecm_vport *vport)
 {
-	/* stub */
+	int q_idx;
+
+	for (q_idx = 0; q_idx < vport->num_q_vectors; q_idx++) {
+		struct iecm_q_vector *q_vector = &vport->q_vectors[q_idx];
+
+		if (q_vector->num_txq || q_vector->num_rxq)
+			iecm_vport_intr_update_itr_ena_irq(q_vector);
+	}
 }
 
 /**
@@ -1107,7 +1498,10 @@ void iecm_vport_intr_ena_irq_all(struct iecm_vport *vport)
  */
 void iecm_vport_intr_deinit(struct iecm_vport *vport)
 {
-	/* stub */
+	iecm_vport_intr_napi_dis_all(vport);
+	iecm_vport_intr_dis_irq_all(vport);
+	iecm_vport_intr_rel_irq(vport);
+	iecm_vport_intr_rel(vport);
 }
 
 /**
@@ -1117,7 +1511,16 @@ void iecm_vport_intr_deinit(struct iecm_vport *vport)
 static void
 iecm_vport_intr_napi_ena_all(struct iecm_vport *vport)
 {
-	/* stub */
+	int q_idx;
+
+	if (!vport->netdev)
+		return;
+
+	for (q_idx = 0; q_idx < vport->num_q_vectors; q_idx++) {
+		struct iecm_q_vector *q_vector = &vport->q_vectors[q_idx];
+
+		napi_enable(&q_vector->napi);
+	}
 }
 
 /**
@@ -1166,7 +1569,65 @@ int iecm_vport_splitq_napi_poll(struct napi_struct *napi, int budget)
  */
 void iecm_vport_intr_map_vector_to_qs(struct iecm_vport *vport)
 {
-	/* stub */
+	int i, j, k = 0, num_rxq, num_txq;
+	struct iecm_rxq_group *rx_qgrp;
+	struct iecm_txq_group *tx_qgrp;
+	struct iecm_queue *q;
+	int q_index;
+
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		rx_qgrp = &vport->rxq_grps[i];
+		if (iecm_is_queue_model_split(vport->rxq_model))
+			num_rxq = rx_qgrp->splitq.num_rxq_sets;
+		else
+			num_rxq = rx_qgrp->singleq.num_rxq;
+
+		for (j = 0; j < num_rxq; j++) {
+			if (k >= vport->num_q_vectors)
+				k = k % vport->num_q_vectors;
+
+			if (iecm_is_queue_model_split(vport->rxq_model))
+				q = &rx_qgrp->splitq.rxq_sets[j].rxq;
+			else
+				q = &rx_qgrp->singleq.rxqs[j];
+			q->q_vector = &vport->q_vectors[k];
+			q_index = q->q_vector->num_rxq;
+			q->q_vector->rx[q_index] = q;
+			q->q_vector->num_rxq++;
+
+			k++;
+		}
+	}
+	k = 0;
+	for (i = 0; i < vport->num_txq_grp; i++) {
+		tx_qgrp = &vport->txq_grps[i];
+		num_txq = tx_qgrp->num_txq;
+
+		if (iecm_is_queue_model_split(vport->txq_model)) {
+			if (k >= vport->num_q_vectors)
+				k = k % vport->num_q_vectors;
+
+			q = tx_qgrp->complq;
+			q->q_vector = &vport->q_vectors[k];
+			q_index = q->q_vector->num_txq;
+			q->q_vector->tx[q_index] = q;
+			q->q_vector->num_txq++;
+			k++;
+		} else {
+			for (j = 0; j < num_txq; j++) {
+				if (k >= vport->num_q_vectors)
+					k = k % vport->num_q_vectors;
+
+				q = &tx_qgrp->txqs[j];
+				q->q_vector = &vport->q_vectors[k];
+				q_index = q->q_vector->num_txq;
+				q->q_vector->tx[q_index] = q;
+				q->q_vector->num_txq++;
+
+				k++;
+			}
+		}
+	}
 }
 
 /**
@@ -1177,7 +1638,38 @@ void iecm_vport_intr_map_vector_to_qs(struct iecm_vport *vport)
  */
 static int iecm_vport_intr_init_vec_idx(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	struct iecm_q_vector *q_vector;
+	int i;
+
+	if (adapter->req_vec_chunks) {
+		struct virtchnl_vector_chunks *vchunks;
+		struct virtchnl_alloc_vectors *ac;
+		/* We may never deal with more than 256 same type of vectors */
+#define IECM_MAX_VECIDS	256
+		u16 vecids[IECM_MAX_VECIDS];
+		int num_ids;
+
+		ac = adapter->req_vec_chunks;
+		vchunks = &ac->vchunks;
+
+		num_ids = iecm_vport_get_vec_ids(vecids, IECM_MAX_VECIDS,
+						 vchunks);
+		if (num_ids != adapter->num_msix_entries)
+			return -EFAULT;
+
+		for (i = 0; i < vport->num_q_vectors; i++) {
+			q_vector = &vport->q_vectors[i];
+			q_vector->v_idx = vecids[i + vport->q_vector_base];
+		}
+	} else {
+		for (i = 0; i < vport->num_q_vectors; i++) {
+			q_vector = &vport->q_vectors[i];
+			q_vector->v_idx = i + vport->q_vector_base;
+		}
+	}
+
+	return 0;
 }
 
 /**
@@ -1189,7 +1681,65 @@ static int iecm_vport_intr_init_vec_idx(struct iecm_vport *vport)
  */
 int iecm_vport_intr_alloc(struct iecm_vport *vport)
 {
-	/* stub */
+	int txqs_per_vector, rxqs_per_vector;
+	struct iecm_q_vector *q_vector;
+	int v_idx, err = 0;
+
+	vport->q_vectors = kcalloc(vport->num_q_vectors,
+				   sizeof(struct iecm_q_vector), GFP_KERNEL);
+
+	if (!vport->q_vectors)
+		return -ENOMEM;
+
+	txqs_per_vector = DIV_ROUND_UP(vport->num_txq, vport->num_q_vectors);
+	rxqs_per_vector = DIV_ROUND_UP(vport->num_rxq, vport->num_q_vectors);
+
+	for (v_idx = 0; v_idx < vport->num_q_vectors; v_idx++) {
+		q_vector = &vport->q_vectors[v_idx];
+		q_vector->vport = vport;
+		q_vector->itr_countdown = ITR_COUNTDOWN_START;
+
+		q_vector->tx = kcalloc(txqs_per_vector,
+				       sizeof(struct iecm_queue *),
+				       GFP_KERNEL);
+		if (!q_vector->tx) {
+			err = -ENOMEM;
+			goto free_vport_q_vec;
+		}
+
+		q_vector->rx = kcalloc(rxqs_per_vector,
+				       sizeof(struct iecm_queue *),
+				       GFP_KERNEL);
+		if (!q_vector->rx) {
+			err = -ENOMEM;
+			goto free_vport_q_vec_tx;
+		}
+
+		/* only set affinity_mask if the CPU is online */
+		if (cpu_online(v_idx))
+			cpumask_set_cpu(v_idx, &q_vector->affinity_mask);
+
+		/* Register the NAPI handler */
+		if (vport->netdev) {
+			if (iecm_is_queue_model_split(vport->txq_model))
+				netif_napi_add(vport->netdev, &q_vector->napi,
+					       iecm_vport_splitq_napi_poll,
+					       NAPI_POLL_WEIGHT);
+			else
+				netif_napi_add(vport->netdev, &q_vector->napi,
+					       iecm_vport_singleq_napi_poll,
+					       NAPI_POLL_WEIGHT);
+		}
+	}
+
+	err = iecm_vport_intr_init_vec_idx(vport);
+	goto handle_err;
+free_vport_q_vec_tx:
+	kfree(q_vector->tx);
+free_vport_q_vec:
+	kfree(vport->q_vectors);
+handle_err:
+	return err;
 }
 
 /**
@@ -1200,7 +1750,31 @@ int iecm_vport_intr_alloc(struct iecm_vport *vport)
  */
 int iecm_vport_intr_init(struct iecm_vport *vport)
 {
-	/* stub */
+	char int_name[IECM_INT_NAME_STR_LEN];
+	int err = 0;
+
+	if (iecm_vport_intr_alloc(vport))
+		return -ENOMEM;
+
+	iecm_vport_intr_map_vector_to_qs(vport);
+	iecm_vport_intr_napi_ena_all(vport);
+
+	vport->adapter->dev_ops.reg_ops.intr_reg_init(vport);
+
+	snprintf(int_name, sizeof(int_name) - 1, "%s-%s",
+		 dev_driver_string(&vport->adapter->pdev->dev),
+		 vport->netdev->name);
+
+	err = iecm_vport_intr_req_irq(vport, int_name);
+	if (err)
+		goto unroll_vectors_alloc;
+
+	iecm_vport_intr_ena_irq_all(vport);
+	goto handle_err;
+unroll_vectors_alloc:
+	iecm_vport_intr_rel(vport);
+handle_err:
+	return err;
 }
 EXPORT_SYMBOL(iecm_vport_calc_num_q_vec);
 
diff --git a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
index 57862fbfdb9b..b1775cc38924 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
@@ -1874,7 +1874,29 @@ int
 iecm_vport_get_vec_ids(u16 *vecids, int num_vecids,
 		       struct virtchnl_vector_chunks *chunks)
 {
-	/* stub */
+	int num_chunks = chunks->num_vector_chunks;
+	struct virtchnl_vector_chunk *chunk;
+	int num_vecid_filled = 0;
+	int start_vecid;
+	int num_vec;
+	int i, j;
+
+	for (j = 0; j < num_chunks; j++) {
+		chunk = &chunks->num_vchunk[j];
+		num_vec = chunk->num_vectors;
+		start_vecid = chunk->start_vector_id;
+		for (i = 0; i < num_vec; i++) {
+			if ((num_vecid_filled + i) < num_vecids) {
+				vecids[num_vecid_filled + i] = start_vecid;
+				start_vecid++;
+			} else {
+				break;
+			}
+		}
+		num_vecid_filled = num_vecid_filled + i;
+	}
+
+	return num_vecid_filled;
 }
 
 /**
-- 
2.26.2



* [net-next 10/15] iecm: Deinit vport
From: Jeff Kirsher @ 2020-06-18  5:13 UTC (permalink / raw)
  To: davem
  Cc: Alice Michael, netdev, nhorman, sassmann, Alan Brady, Phani Burra,
	Joshua Hay, Madhu Chittim, Pavan Kumar Linga, Donald Skidmore,
	Jesse Brandeburg, Sridhar Samudrala, Jeff Kirsher
In-Reply-To: <20200618051344.516587-1-jeffrey.t.kirsher@intel.com>

From: Alice Michael <alice.michael@intel.com>

Implement vport teardown and release its queue
resources.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Donald Skidmore <donald.c.skidmore@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/iecm/iecm_lib.c    |  29 ++-
 drivers/net/ethernet/intel/iecm/iecm_txrx.c   | 218 ++++++++++++++++--
 .../net/ethernet/intel/iecm/iecm_virtchnl.c   |  15 +-
 3 files changed, 246 insertions(+), 16 deletions(-)
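
Note for reviewers: the release helpers in this patch return early when
the buffer array is already gone and set the pointer to NULL after
freeing it, so calling them a second time is harmless. A minimal
userspace sketch of that pattern follows. struct demo_queue and both
functions are hypothetical and only illustrate the idea; they are not
the driver's types.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue that owns a buffer array. */
struct demo_queue {
	int *bufs;		/* NULL once released */
	unsigned int count;
};

/* Returns 0 on success, -1 on allocation failure. */
static int demo_queue_alloc(struct demo_queue *q, unsigned int count)
{
	assert(q != NULL);

	q->bufs = calloc(count, sizeof(*q->bufs));
	if (!q->bufs)
		return -1;
	q->count = count;
	return 0;
}

/* Safe to call repeatedly: the second call sees NULL and returns. */
static void demo_queue_rel(struct demo_queue *q)
{
	assert(q != NULL);

	/* Buffers already cleared, nothing to do */
	if (!q->bufs)
		return;

	free(q->bufs);
	q->bufs = NULL;
	q->count = 0;
}
```

Clearing the pointer after the free is what makes a repeated teardown
path (for example stop followed by remove) safe in this sketch.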

diff --git a/drivers/net/ethernet/intel/iecm/iecm_lib.c b/drivers/net/ethernet/intel/iecm/iecm_lib.c
index d855d6238740..707520553912 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_lib.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_lib.c
@@ -366,7 +366,27 @@ struct iecm_adapter *iecm_netdev_to_adapter(struct net_device *netdev)
  */
 static void iecm_vport_stop(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+
+	if (adapter->state <= __IECM_DOWN)
+		return;
+	adapter->dev_ops.vc_ops.irq_map_unmap(vport, false);
+	adapter->dev_ops.vc_ops.disable_queues(vport);
+	/* Normally we ask for queues in create_vport, but if we're changing
+	 * number of requested queues we do a delete then add instead of
+	 * deleting and reallocating the vport.
+	 */
+	if (test_and_clear_bit(__IECM_DEL_QUEUES,
+			       vport->adapter->flags))
+		iecm_send_delete_queues_msg(vport);
+	netif_carrier_off(vport->netdev);
+	netif_tx_disable(vport->netdev);
+	adapter->link_up = false;
+	iecm_vport_intr_deinit(vport);
+	iecm_vport_queues_rel(vport);
+	if (adapter->dev_ops.vc_ops.disable_vport)
+		adapter->dev_ops.vc_ops.disable_vport(vport);
+	adapter->state = __IECM_DOWN;
 }
 
 /**
@@ -381,7 +401,11 @@ static void iecm_vport_stop(struct iecm_vport *vport)
  */
 static int iecm_stop(struct net_device *netdev)
 {
-	/* stub */
+	struct iecm_netdev_priv *np = netdev_priv(netdev);
+
+	iecm_vport_stop(np->vport);
+
+	return 0;
 }
 
 /**
@@ -488,6 +512,7 @@ iecm_vport_alloc(struct iecm_adapter *adapter, int vport_id)
 
 	/* fill vport slot in the adapter struct */
 	adapter->vports[adapter->next_vport] = vport;
+
 	if (iecm_cfg_netdev(vport))
 		goto cfg_netdev_fail;
 
diff --git a/drivers/net/ethernet/intel/iecm/iecm_txrx.c b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
index 16fea9ad6545..92dc25c10a6c 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_txrx.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_txrx.c
@@ -41,7 +41,23 @@ void iecm_get_stats64(struct net_device *netdev,
  */
 void iecm_tx_buf_rel(struct iecm_queue *tx_q, struct iecm_tx_buf *tx_buf)
 {
-	/* stub */
+	if (tx_buf->skb) {
+		dev_kfree_skb_any(tx_buf->skb);
+		if (dma_unmap_len(tx_buf, len))
+			dma_unmap_single(tx_q->dev,
+					 dma_unmap_addr(tx_buf, dma),
+					 dma_unmap_len(tx_buf, len),
+					 DMA_TO_DEVICE);
+	} else if (dma_unmap_len(tx_buf, len)) {
+		dma_unmap_page(tx_q->dev,
+			       dma_unmap_addr(tx_buf, dma),
+			       dma_unmap_len(tx_buf, len),
+			       DMA_TO_DEVICE);
+	}
+
+	tx_buf->next_to_watch = NULL;
+	tx_buf->skb = NULL;
+	dma_unmap_len_set(tx_buf, len, 0);
 }
 
 /**
@@ -50,7 +66,26 @@ void iecm_tx_buf_rel(struct iecm_queue *tx_q, struct iecm_tx_buf *tx_buf)
  */
 void iecm_tx_buf_rel_all(struct iecm_queue *txq)
 {
-	/* stub */
+	u16 i;
+
+	/* Buffers already cleared, nothing to do */
+	if (!txq->tx_buf)
+		return;
+
+	/* Free all the Tx buffer sk_buffs */
+	for (i = 0; i < txq->desc_count; i++)
+		iecm_tx_buf_rel(txq, &txq->tx_buf[i]);
+
+	kfree(txq->tx_buf);
+	txq->tx_buf = NULL;
+
+	if (txq->buf_stack.bufs) {
+		for (i = 0; i < txq->buf_stack.size; i++) {
+			iecm_tx_buf_rel(txq, txq->buf_stack.bufs[i]);
+			kfree(txq->buf_stack.bufs[i]);
+		}
+		kfree(txq->buf_stack.bufs);
+	}
 }
 
 /**
@@ -62,7 +97,17 @@ void iecm_tx_buf_rel_all(struct iecm_queue *txq)
  */
 void iecm_tx_desc_rel(struct iecm_queue *txq, bool bufq)
 {
-	/* stub */
+	if (bufq)
+		iecm_tx_buf_rel_all(txq);
+
+	if (txq->desc_ring) {
+		dmam_free_coherent(txq->dev, txq->size,
+				   txq->desc_ring, txq->dma);
+		txq->desc_ring = NULL;
+		txq->next_to_alloc = 0;
+		txq->next_to_use = 0;
+		txq->next_to_clean = 0;
+	}
 }
 
 /**
@@ -73,7 +118,24 @@ void iecm_tx_desc_rel(struct iecm_queue *txq, bool bufq)
  */
 void iecm_tx_desc_rel_all(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_queue *txq;
+	int i, j;
+
+	if (!vport->txq_grps)
+		return;
+
+	for (i = 0; i < vport->num_txq_grp; i++) {
+		for (j = 0; j < vport->txq_grps[i].num_txq; j++) {
+			if (vport->txq_grps[i].txqs) {
+				txq = &vport->txq_grps[i].txqs[j];
+				iecm_tx_desc_rel(txq, true);
+			}
+		}
+		if (iecm_is_queue_model_split(vport->txq_model)) {
+			txq = vport->txq_grps[i].complq;
+			iecm_tx_desc_rel(txq, false);
+		}
+	}
 }
 
 /**
@@ -209,7 +271,21 @@ static enum iecm_status iecm_tx_desc_alloc_all(struct iecm_vport *vport)
 static void iecm_rx_buf_rel(struct iecm_queue *rxq,
 			    struct iecm_rx_buf *rx_buf)
 {
-	/* stub */
+	struct device *dev = rxq->dev;
+
+	if (!rx_buf->page)
+		return;
+
+	if (rx_buf->skb) {
+		dev_kfree_skb_any(rx_buf->skb);
+		rx_buf->skb = NULL;
+	}
+
+	dma_unmap_page(dev, rx_buf->dma, PAGE_SIZE, DMA_FROM_DEVICE);
+	__free_pages(rx_buf->page, 0);
+
+	rx_buf->page = NULL;
+	rx_buf->page_offset = 0;
 }
 
 /**
@@ -218,7 +294,23 @@ static void iecm_rx_buf_rel(struct iecm_queue *rxq,
  */
 void iecm_rx_buf_rel_all(struct iecm_queue *rxq)
 {
-	/* stub */
+	u16 i;
+
+	/* Queue already cleared, nothing to do */
+	if (!rxq->rx_buf.buf)
+		return;
+
+	/* Free all the bufs allocated and given to HW on Rx queue */
+	for (i = 0; i < rxq->desc_count; i++) {
+		iecm_rx_buf_rel(rxq, &rxq->rx_buf.buf[i]);
+		if (rxq->rx_hsplit_en)
+			iecm_rx_buf_rel(rxq, &rxq->rx_buf.hdr_buf[i]);
+	}
+
+	kfree(rxq->rx_buf.buf);
+	rxq->rx_buf.buf = NULL;
+	kfree(rxq->rx_buf.hdr_buf);
+	rxq->rx_buf.hdr_buf = NULL;
 }
 
 /**
@@ -232,7 +324,25 @@ void iecm_rx_buf_rel_all(struct iecm_queue *rxq)
 void iecm_rx_desc_rel(struct iecm_queue *rxq, bool bufq,
 		      enum virtchnl_queue_model q_model)
 {
-	/* stub */
+	if (!rxq)
+		return;
+
+	if (!bufq && iecm_is_queue_model_split(q_model) && rxq->skb) {
+		dev_kfree_skb_any(rxq->skb);
+		rxq->skb = NULL;
+	}
+
+	if (bufq || !iecm_is_queue_model_split(q_model))
+		iecm_rx_buf_rel_all(rxq);
+
+	if (rxq->desc_ring) {
+		dmam_free_coherent(rxq->dev, rxq->size,
+				   rxq->desc_ring, rxq->dma);
+		rxq->desc_ring = NULL;
+		rxq->next_to_alloc = 0;
+		rxq->next_to_clean = 0;
+		rxq->next_to_use = 0;
+	}
 }
 
 /**
@@ -243,7 +353,49 @@ void iecm_rx_desc_rel(struct iecm_queue *rxq, bool bufq,
  */
 void iecm_rx_desc_rel_all(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_rxq_group *rx_qgrp;
+	struct iecm_queue *q;
+	int i, j, num_rxq;
+
+	if (!vport->rxq_grps)
+		return;
+
+	for (i = 0; i < vport->num_rxq_grp; i++) {
+		rx_qgrp = &vport->rxq_grps[i];
+
+		if (iecm_is_queue_model_split(vport->rxq_model)) {
+			if (rx_qgrp->splitq.rxq_sets) {
+				num_rxq = rx_qgrp->splitq.num_rxq_sets;
+				for (j = 0; j < num_rxq; j++) {
+					q = &rx_qgrp->splitq.rxq_sets[j].rxq;
+					iecm_rx_desc_rel(q, false,
+							 vport->rxq_model);
+				}
+			}
+
+			if (!rx_qgrp->splitq.bufq_sets)
+				continue;
+			for (j = 0; j < IECM_BUFQS_PER_RXQ_SET; j++) {
+				struct iecm_bufq_set *bufq_set =
+					&rx_qgrp->splitq.bufq_sets[j];
+
+				q = &bufq_set->bufq;
+				iecm_rx_desc_rel(q, true, vport->rxq_model);
+				if (!bufq_set->refillqs)
+					continue;
+				kfree(bufq_set->refillqs);
+				bufq_set->refillqs = NULL;
+			}
+		} else {
+			if (rx_qgrp->singleq.rxqs) {
+				for (j = 0; j < rx_qgrp->singleq.num_rxq; j++) {
+					q = &rx_qgrp->singleq.rxqs[j];
+					iecm_rx_desc_rel(q, false,
+							 vport->rxq_model);
+				}
+			}
+		}
+	}
 }
 
 /**
@@ -570,7 +722,18 @@ static enum iecm_status iecm_rx_desc_alloc_all(struct iecm_vport *vport)
  */
 static void iecm_txq_group_rel(struct iecm_vport *vport)
 {
-	/* stub */
+	if (vport->txq_grps) {
+		int i;
+
+		for (i = 0; i < vport->num_txq_grp; i++) {
+			kfree(vport->txq_grps[i].txqs);
+			vport->txq_grps[i].txqs = NULL;
+			kfree(vport->txq_grps[i].complq);
+			vport->txq_grps[i].complq = NULL;
+		}
+		kfree(vport->txq_grps);
+		vport->txq_grps = NULL;
+	}
 }
 
 /**
@@ -579,7 +742,25 @@ static void iecm_txq_group_rel(struct iecm_vport *vport)
  */
 static void iecm_rxq_group_rel(struct iecm_vport *vport)
 {
-	/* stub */
+	if (vport->rxq_grps) {
+		int i;
+
+		for (i = 0; i < vport->num_rxq_grp; i++) {
+			struct iecm_rxq_group *rx_qgrp = &vport->rxq_grps[i];
+
+			if (iecm_is_queue_model_split(vport->rxq_model)) {
+				kfree(rx_qgrp->splitq.rxq_sets);
+				rx_qgrp->splitq.rxq_sets = NULL;
+				kfree(rx_qgrp->splitq.bufq_sets);
+				rx_qgrp->splitq.bufq_sets = NULL;
+			} else {
+				kfree(rx_qgrp->singleq.rxqs);
+				rx_qgrp->singleq.rxqs = NULL;
+			}
+		}
+		kfree(vport->rxq_grps);
+		vport->rxq_grps = NULL;
+	}
 }
 
 /**
@@ -588,7 +769,8 @@ static void iecm_rxq_group_rel(struct iecm_vport *vport)
  */
 static void iecm_vport_queue_grp_rel_all(struct iecm_vport *vport)
 {
-	/* stub */
+	iecm_txq_group_rel(vport);
+	iecm_rxq_group_rel(vport);
 }
 
 /**
@@ -599,7 +781,12 @@ static void iecm_vport_queue_grp_rel_all(struct iecm_vport *vport)
  */
 void iecm_vport_queues_rel(struct iecm_vport *vport)
 {
-	/* stub */
+	iecm_tx_desc_rel_all(vport);
+	iecm_rx_desc_rel_all(vport);
+	iecm_vport_queue_grp_rel_all(vport);
+
+	kfree(vport->txqs);
+	vport->txqs = NULL;
 }
 
 /**
@@ -2577,5 +2764,10 @@ int iecm_init_rss(struct iecm_vport *vport)
  */
 void iecm_deinit_rss(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+
+	kfree(adapter->rss_data.rss_key);
+	adapter->rss_data.rss_key = NULL;
+	kfree(adapter->rss_data.rss_lut);
+	adapter->rss_data.rss_lut = NULL;
 }
diff --git a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
index d56f8126521a..2aeeb8ca10af 100644
--- a/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
+++ b/drivers/net/ethernet/intel/iecm/iecm_virtchnl.c
@@ -689,7 +689,20 @@ iecm_send_enable_vport_msg(struct iecm_vport *vport)
 enum iecm_status
 iecm_send_disable_vport_msg(struct iecm_vport *vport)
 {
-	/* stub */
+	struct iecm_adapter *adapter = vport->adapter;
+	struct virtchnl_vport v_id = {};
+	enum iecm_status err;
+
+	v_id.vport_id = vport->vport_id;
+
+	err = iecm_send_mb_msg(adapter, VIRTCHNL_OP_DISABLE_VPORT,
+			       sizeof(v_id), (u8 *)&v_id);
+
+	if (!err)
+		err = iecm_wait_for_event(adapter, IECM_VC_DIS_VPORT,
+					  IECM_VC_DIS_VPORT_ERR);
+
+	return err;
 }
 
 /**
-- 
2.26.2
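
The release helpers in this patch share one pattern: return early if the resource is already gone, free each element, free the container, then reset the pointer to NULL so a second call is harmless. Below is a minimal userspace sketch of that pattern, assuming plain malloc/free instead of the kernel allocators and DMA API. Every demo_* name is hypothetical and is not part of the iecm driver.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the driver's real types. */
struct demo_buf {
	void *data;		/* owned allocation, NULL once released */
};

struct demo_queue {
	struct demo_buf *bufs;	/* array of desc_count entries, or NULL */
	unsigned int desc_count;
};

/* Release one buffer; safe to call on an already released buffer. */
static void demo_buf_rel(struct demo_buf *buf)
{
	free(buf->data);	/* free(NULL) is a no-op */
	buf->data = NULL;
}

/* Release every buffer, then the array itself; safe to call twice. */
static void demo_buf_rel_all(struct demo_queue *q)
{
	unsigned int i;

	/* Array already released, nothing to do */
	if (!q->bufs)
		return;

	for (i = 0; i < q->desc_count; i++)
		demo_buf_rel(&q->bufs[i]);

	free(q->bufs);
	q->bufs = NULL;
}

/* Allocate the array and one payload per entry; returns 0 on success. */
static int demo_buf_alloc_all(struct demo_queue *q, unsigned int count)
{
	unsigned int i;

	q->bufs = calloc(count, sizeof(*q->bufs));
	if (!q->bufs)
		return -1;
	q->desc_count = count;

	for (i = 0; i < count; i++) {
		q->bufs[i].data = malloc(64);
		if (!q->bufs[i].data) {
			/* Unwind the partial allocation */
			demo_buf_rel_all(q);
			return -1;
		}
	}

	return 0;
}
```

Because the pointer is reset after the free, the same release function can serve both the normal teardown path and the error unwind in the allocator without risking a double free.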

