* [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor
@ 2026-01-30 9:34 Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 1/6] eea: introduce PCI framework Xuan Zhuo
` (6 more replies)
0 siblings, 7 replies; 14+ messages in thread
From: Xuan Zhuo @ 2026-01-30 9:34 UTC (permalink / raw)
To: netdev
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xuan Zhuo, Wen Gu, Philo Lu, Lorenzo Bianconi,
Vadim Fedorenko, Dong Yibo, Ethan Nelson-Moore, Heiner Kallweit,
Lukas Bulwahn, Dust Li
Add a driver framework for the EEA, a device that will be available in the future.
This driver is currently quite minimal, implementing only fundamental
core functionalities. Key features include: I/O queue management via
adminq, basic PCI-layer operations, and essential RX/TX data
communication capabilities. It also supports the creation,
initialization, and management of network devices (netdev). Furthermore,
the ring structures for both I/O queues and adminq have been abstracted
into a simple, unified, and reusable library implementation,
facilitating future extension and maintenance.
v24:
1. Add NULL checks for enet->rx and enet->tx in eea_get_ethtool_stat to
prevent a NULL dereference when reading rx = enet->rx[i] while enet->rx
is NULL. The tx case is similar. With rtnl protection in place, this
check is sufficient.
2. Use 'received' as the return value in eea_poll.
v23:
I have moved netif_set_real_num_queues() out of eea_start_rxtx(), so
eea_start_rxtx() is now a void function. I believe enet_bind_new_q_and_cfg()
is a more suitable place to include netif_set_real_num_queues(). In
eea_active_ring_and_irq(), I first execute request_irq() before interacting
with the hardware to create queues. Therefore, during the NIC setup process,
all driver-internal operations (memory allocation, IRQ initialization, sysfs
configuration, etc.) will be completed before the final notification to the
hardware.
v22:
1. Use the budget from the NAPI poll function as the parameter for
napi_consume_skb.
2. Stop the TX queue when the remaining ring slots cannot hold an SKB.
v21:
Fix two issues from the previous version:
1. A DMA unmap operation was missing.
2. RCU APIs were not used in eea_stats. Although the standard practice when
using RCU would require adding the __rcu annotation to both the rx and
tx fields, in many cases these fields are read without needing RCU
protection. Therefore, I do not want to add the __rcu annotation.
Instead, I use a spin lock to protect modifications to rx and tx.
v20:
Fix the partially initialized structure passed to db. @Jakub
http://lore.kernel.org/all/20260113172353.2ae6ef81@kernel.org
v19:
address the review comments from @Simon Horman
v18:
v17 with [PATCH] prefix.
v17:
1. In `eea_adminq_dev_status`, uniformly use `enet->cfg.rx_ring_num`.
2. Add a `struct eea_net_cfg *cfg` parameter to `eea_free_rx` and
`eea_free_tx`. When called in the normal path, pass `enet->cfg` as
the argument; when called during initialization, pass the temporary
`cfg` instead.
3. Move the `.ndo_get_stats64` callback into `eea_net.c`.
4. In the `.ndo_get_stats64` callback, add a comment explaining how the TX
and RX statistics are protected by RCU.
/* This function is protected by RCU. It uses enet->tx and enet->rx
* to check whether the TX and RX structures are safe to access. In
* eea_free_rxtx_q_mem, before freeing the TX and RX resources, enet->rx
* and enet->tx are set to NULL, and synchronize_net is called.
*/
v16:
1. follow the advice from @ALOK TIWARI
http://lore.kernel.org/all/5ff95a71-69e5-4cb6-9b2a-5224c983bdc2@oracle.com
v15:
1. remove 'default m' from the eea Kconfig
2. free the resources when open fails.
v14:
1. some tiny fixes
v13:
1. apply some tiny fixes from @Simon
v12:
I encountered some issues with sending the v11 patches, as they were quite
messy. Therefore, I'm resending them as v12.
v11:
1. remove the auto-cleanup __free(kfree) usage
2. some tiny fixes
v10:
1. name the jump labels after the target @Jakub
2. remove __GFP_ZERO from dma_alloc_coherent @Jakub
v9:
1. some fixes for ethtool from http://lore.kernel.org/all/20251027183754.52fe2a2c@kernel.org
v8: 1. rename eea_net_tmp to eea_net_init_ctx
2. remove the code that allocates memory to destroy queues
3. some other minor changes
v7: 1. remove the unrelated code from the ethtool commit
2. build every commit with W=12
v6: Split the big one commit to five commits
v5: Thanks for the comments from Kalesh Anakkur Purayil, ALOK TIWARI
v4: Thanks for the comments from Troy Mitchell, Przemek Kitszel, Andrew Lunn, Kalesh Anakkur Purayil
v3: Thanks for the comments from Paolo Abeni
v2: Thanks for the comments from Simon Horman and Andrew Lunn
v1: Thanks for the comments from Simon Horman and Andrew Lunn
Xuan Zhuo (6):
eea: introduce PCI framework
eea: introduce ring and descriptor structures
eea: probe the netdevice and create adminq
eea: create/destroy rx,tx queues for netdevice open and stop
eea: introduce ethtool support
eea: introduce callback for ndo_get_stats64
MAINTAINERS | 8 +
drivers/net/ethernet/Kconfig | 1 +
drivers/net/ethernet/Makefile | 1 +
drivers/net/ethernet/alibaba/Kconfig | 28 +
drivers/net/ethernet/alibaba/Makefile | 5 +
drivers/net/ethernet/alibaba/eea/Makefile | 9 +
drivers/net/ethernet/alibaba/eea/eea_adminq.c | 421 ++++++++++
drivers/net/ethernet/alibaba/eea/eea_adminq.h | 70 ++
drivers/net/ethernet/alibaba/eea/eea_desc.h | 156 ++++
.../net/ethernet/alibaba/eea/eea_ethtool.c | 243 ++++++
.../net/ethernet/alibaba/eea/eea_ethtool.h | 49 ++
drivers/net/ethernet/alibaba/eea/eea_net.c | 651 +++++++++++++++
drivers/net/ethernet/alibaba/eea/eea_net.h | 201 +++++
drivers/net/ethernet/alibaba/eea/eea_pci.c | 587 +++++++++++++
drivers/net/ethernet/alibaba/eea/eea_pci.h | 67 ++
drivers/net/ethernet/alibaba/eea/eea_ring.c | 266 ++++++
drivers/net/ethernet/alibaba/eea/eea_ring.h | 91 ++
drivers/net/ethernet/alibaba/eea/eea_rx.c | 785 ++++++++++++++++++
drivers/net/ethernet/alibaba/eea/eea_tx.c | 399 +++++++++
19 files changed, 4038 insertions(+)
create mode 100644 drivers/net/ethernet/alibaba/Kconfig
create mode 100644 drivers/net/ethernet/alibaba/Makefile
create mode 100644 drivers/net/ethernet/alibaba/eea/Makefile
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_adminq.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_adminq.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_desc.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ethtool.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ethtool.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_net.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_net.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_pci.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_pci.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ring.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ring.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_rx.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_tx.c
--
2.32.0.3.g01195cf9f
* [PATCH net-next v24 1/6] eea: introduce PCI framework
2026-01-30 9:34 [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
@ 2026-01-30 9:34 ` Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,1/6] " Jakub Kicinski
2026-01-30 9:34 ` [PATCH net-next v24 2/6] eea: introduce ring and descriptor structures Xuan Zhuo
` (5 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Xuan Zhuo @ 2026-01-30 9:34 UTC (permalink / raw)
To: netdev
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xuan Zhuo, Wen Gu, Philo Lu, Lorenzo Bianconi,
Vadim Fedorenko, Dong Yibo, Ethan Nelson-Moore, Heiner Kallweit,
Lukas Bulwahn, Dust Li
Add basic driver framework for the Alibaba Elastic Ethernet Adapter (EEA).
This commit implements the EEA PCI probe functionality.
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
MAINTAINERS | 8 +
drivers/net/ethernet/Kconfig | 1 +
drivers/net/ethernet/Makefile | 1 +
drivers/net/ethernet/alibaba/Kconfig | 28 ++
drivers/net/ethernet/alibaba/Makefile | 5 +
drivers/net/ethernet/alibaba/eea/Makefile | 3 +
drivers/net/ethernet/alibaba/eea/eea_pci.c | 387 +++++++++++++++++++++
drivers/net/ethernet/alibaba/eea/eea_pci.h | 50 +++
8 files changed, 483 insertions(+)
create mode 100644 drivers/net/ethernet/alibaba/Kconfig
create mode 100644 drivers/net/ethernet/alibaba/Makefile
create mode 100644 drivers/net/ethernet/alibaba/eea/Makefile
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_pci.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_pci.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 0caa8aee5840..84ae1b0e7b48 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -808,6 +808,14 @@ S: Maintained
F: Documentation/i2c/busses/i2c-ali1563.rst
F: drivers/i2c/busses/i2c-ali1563.c
+ALIBABA ELASTIC ETHERNET ADAPTER DRIVER
+M: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
+M: Wen Gu <guwen@linux.alibaba.com>
+R: Philo Lu <lulie@linux.alibaba.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: drivers/net/ethernet/alibaba/eea
+
ALIBABA ELASTIC RDMA DRIVER
M: Cheng Xu <chengyou@linux.alibaba.com>
M: Kai Shen <kaishen@linux.alibaba.com>
diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index aa7103e7f47f..9ead9c49e6c6 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -22,6 +22,7 @@ source "drivers/net/ethernet/aeroflex/Kconfig"
source "drivers/net/ethernet/agere/Kconfig"
source "drivers/net/ethernet/airoha/Kconfig"
source "drivers/net/ethernet/alacritech/Kconfig"
+source "drivers/net/ethernet/alibaba/Kconfig"
source "drivers/net/ethernet/allwinner/Kconfig"
source "drivers/net/ethernet/alteon/Kconfig"
source "drivers/net/ethernet/altera/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index 6615a67a63d5..9e6d740f4cf7 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_NET_VENDOR_ADI) += adi/
obj-$(CONFIG_NET_VENDOR_AGERE) += agere/
obj-$(CONFIG_NET_VENDOR_AIROHA) += airoha/
obj-$(CONFIG_NET_VENDOR_ALACRITECH) += alacritech/
+obj-$(CONFIG_NET_VENDOR_ALIBABA) += alibaba/
obj-$(CONFIG_NET_VENDOR_ALLWINNER) += allwinner/
obj-$(CONFIG_NET_VENDOR_ALTEON) += alteon/
obj-$(CONFIG_ALTERA_TSE) += altera/
diff --git a/drivers/net/ethernet/alibaba/Kconfig b/drivers/net/ethernet/alibaba/Kconfig
new file mode 100644
index 000000000000..85cf5aeb2aa3
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/Kconfig
@@ -0,0 +1,28 @@
+#
+# Alibaba network device configuration
+#
+
+config NET_VENDOR_ALIBABA
+ bool "Alibaba Devices"
+ default y
+ help
+ If you have a network (Ethernet) device belonging to this class, say Y.
+
+ Note that the answer to this question doesn't directly affect the
+ kernel: saying N will just cause the configurator to skip all
+ the questions about Alibaba devices. If you say Y, you will be asked
+ for your specific device in the following questions.
+
+if NET_VENDOR_ALIBABA
+
+config EEA
+ tristate "Alibaba Elastic Ethernet Adapter support"
+ depends on PCI_MSI
+ depends on 64BIT
+ select PAGE_POOL
+ help
+ This driver supports the Alibaba Elastic Ethernet Adapter.
+
+ To compile this driver as a module, choose M here.
+
+endif #NET_VENDOR_ALIBABA
diff --git a/drivers/net/ethernet/alibaba/Makefile b/drivers/net/ethernet/alibaba/Makefile
new file mode 100644
index 000000000000..7980525cb086
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the Alibaba network device drivers.
+#
+
+obj-$(CONFIG_EEA) += eea/
diff --git a/drivers/net/ethernet/alibaba/eea/Makefile b/drivers/net/ethernet/alibaba/eea/Makefile
new file mode 100644
index 000000000000..cf2acf1733fd
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/Makefile
@@ -0,0 +1,3 @@
+
+obj-$(CONFIG_EEA) += eea.o
+eea-y := eea_pci.o
diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.c b/drivers/net/ethernet/alibaba/eea/eea_pci.c
new file mode 100644
index 000000000000..da9d34ae9454
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_pci.c
@@ -0,0 +1,387 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/iopoll.h>
+
+#include "eea_pci.h"
+
+#define EEA_PCI_DB_OFFSET 4096
+
+struct eea_pci_cfg {
+ __le32 reserve0;
+ __le32 reserve1;
+ __le32 drv_f_idx;
+ __le32 drv_f;
+
+#define EEA_S_OK BIT(2)
+#define EEA_S_FEATURE_DONE BIT(3)
+#define EEA_S_FAILED BIT(7)
+ u8 device_status;
+ u8 reserved[7];
+
+ __le32 rx_num_max;
+ __le32 tx_num_max;
+ __le32 db_blk_size;
+
+ /* admin queue cfg */
+ __le16 aq_size;
+ __le16 aq_msix_vector;
+ __le32 aq_db_off;
+
+ __le32 aq_sq_addr;
+ __le32 aq_sq_addr_hi;
+ __le32 aq_cq_addr;
+ __le32 aq_cq_addr_hi;
+
+ __le64 hw_ts;
+};
+
+struct eea_pci_device {
+ struct eea_device edev;
+ struct pci_dev *pci_dev;
+
+ u32 msix_vec_n;
+
+ void __iomem *reg;
+ void __iomem *db_base;
+
+ char ha_irq_name[32];
+ u8 reset_pos;
+};
+
+#define cfg_pointer(reg, item) \
+ ((void __iomem *)((reg) + offsetof(struct eea_pci_cfg, item)))
+
+#define cfg_write8(reg, item, val) iowrite8(val, cfg_pointer(reg, item))
+#define cfg_write32(reg, item, val) iowrite32(val, cfg_pointer(reg, item))
+
+#define cfg_read8(reg, item) ioread8(cfg_pointer(reg, item))
+#define cfg_read32(reg, item) ioread32(cfg_pointer(reg, item))
+#define cfg_readq(reg, item) readq(cfg_pointer(reg, item))
+
+const char *eea_pci_name(struct eea_device *edev)
+{
+ return pci_name(edev->ep_dev->pci_dev);
+}
+
+int eea_pci_domain_nr(struct eea_device *edev)
+{
+ return pci_domain_nr(edev->ep_dev->pci_dev->bus);
+}
+
+u16 eea_pci_dev_id(struct eea_device *edev)
+{
+ return pci_dev_id(edev->ep_dev->pci_dev);
+}
+
+static void eea_pci_io_set_status(struct eea_device *edev, u8 status)
+{
+ struct eea_pci_device *ep_dev = edev->ep_dev;
+
+ cfg_write8(ep_dev->reg, device_status, status);
+}
+
+static u8 eea_pci_io_get_status(struct eea_device *edev)
+{
+ struct eea_pci_device *ep_dev = edev->ep_dev;
+
+ return cfg_read8(ep_dev->reg, device_status);
+}
+
+static void eea_add_status(struct eea_device *dev, u32 status)
+{
+ eea_pci_io_set_status(dev, eea_pci_io_get_status(dev) | status);
+}
+
+#define EEA_RESET_TIMEOUT_US (1000 * 1000 * 1000)
+
+int eea_device_reset(struct eea_device *edev)
+{
+ struct eea_pci_device *ep_dev = edev->ep_dev;
+ int i, err;
+ u8 val;
+
+ eea_pci_io_set_status(edev, 0);
+
+ err = read_poll_timeout(cfg_read8, val, !val, 20, EEA_RESET_TIMEOUT_US,
+ false, ep_dev->reg, device_status);
+ if (err)
+ return -EBUSY;
+
+ for (i = 0; i < ep_dev->msix_vec_n; ++i)
+ synchronize_irq(pci_irq_vector(ep_dev->pci_dev, i));
+
+ return 0;
+}
+
+void eea_device_ready(struct eea_device *dev)
+{
+ u8 status = eea_pci_io_get_status(dev);
+
+ WARN_ON(status & EEA_S_OK);
+
+ eea_pci_io_set_status(dev, status | EEA_S_OK);
+}
+
+static int eea_negotiate(struct eea_device *edev)
+{
+ struct eea_pci_device *ep_dev;
+ u32 status;
+
+ ep_dev = edev->ep_dev;
+
+ edev->features = 0;
+
+ cfg_write32(ep_dev->reg, drv_f_idx, 0);
+ cfg_write32(ep_dev->reg, drv_f, (u32)edev->features);
+ cfg_write32(ep_dev->reg, drv_f_idx, 1);
+ cfg_write32(ep_dev->reg, drv_f, edev->features >> 32);
+
+ eea_add_status(edev, EEA_S_FEATURE_DONE);
+ status = eea_pci_io_get_status(edev);
+ if (!(status & EEA_S_FEATURE_DONE))
+ return -ENODEV;
+
+ return 0;
+}
+
+static void eea_pci_release_resource(struct eea_pci_device *ep_dev)
+{
+ struct pci_dev *pci_dev = ep_dev->pci_dev;
+
+ if (ep_dev->reg) {
+ pci_iounmap(pci_dev, ep_dev->reg);
+ ep_dev->reg = NULL;
+ }
+
+ if (ep_dev->msix_vec_n) {
+ ep_dev->msix_vec_n = 0;
+ pci_free_irq_vectors(ep_dev->pci_dev);
+ }
+
+ pci_release_regions(pci_dev);
+ pci_disable_device(pci_dev);
+}
+
+static int eea_pci_setup(struct pci_dev *pci_dev, struct eea_pci_device *ep_dev)
+{
+ int err, n, ret;
+
+ ep_dev->pci_dev = pci_dev;
+
+ err = pci_enable_device(pci_dev);
+ if (err)
+ return err;
+
+ err = pci_request_regions(pci_dev, "EEA");
+ if (err)
+ goto err_disable_dev;
+
+ pci_set_master(pci_dev);
+
+ err = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(64));
+ if (err) {
+ dev_warn(&pci_dev->dev, "Failed to enable 64-bit DMA.\n");
+ goto err_release_regions;
+ }
+
+ ep_dev->reg = pci_iomap(pci_dev, 0, 0);
+ if (!ep_dev->reg) {
+ dev_err(&pci_dev->dev, "Failed to map pci bar!\n");
+ err = -ENOMEM;
+ goto err_release_regions;
+ }
+
+ ep_dev->edev.rx_num = cfg_read32(ep_dev->reg, rx_num_max);
+ ep_dev->edev.tx_num = cfg_read32(ep_dev->reg, tx_num_max);
+
+ /* 2: adminq, error handling */
+ n = ep_dev->edev.rx_num + ep_dev->edev.tx_num + 2;
+ ret = pci_alloc_irq_vectors(ep_dev->pci_dev, n, n, PCI_IRQ_MSIX);
+ if (ret != n) {
+ err = ret;
+ goto err_unmap_reg;
+ }
+
+ ep_dev->msix_vec_n = ret;
+
+ ep_dev->db_base = ep_dev->reg + EEA_PCI_DB_OFFSET;
+ ep_dev->edev.db_blk_size = cfg_read32(ep_dev->reg, db_blk_size);
+
+ return 0;
+
+err_unmap_reg:
+ pci_iounmap(pci_dev, ep_dev->reg);
+ ep_dev->reg = NULL;
+
+err_release_regions:
+ pci_release_regions(pci_dev);
+
+err_disable_dev:
+ pci_disable_device(pci_dev);
+
+ return err;
+}
+
+void __iomem *eea_pci_db_addr(struct eea_device *edev, u32 off)
+{
+ return edev->ep_dev->db_base + off;
+}
+
+u64 eea_pci_device_ts(struct eea_device *edev)
+{
+ struct eea_pci_device *ep_dev = edev->ep_dev;
+
+ return cfg_readq(ep_dev->reg, hw_ts);
+}
+
+static int eea_init_device(struct eea_device *edev)
+{
+ int err;
+
+ err = eea_device_reset(edev);
+ if (err)
+ return err;
+
+ eea_pci_io_set_status(edev, BIT(0) | BIT(1));
+
+ err = eea_negotiate(edev);
+ if (err)
+ goto err;
+
+ /* do net device probe ... */
+
+ return 0;
+err:
+ eea_add_status(edev, EEA_S_FAILED);
+ return err;
+}
+
+static int __eea_pci_probe(struct pci_dev *pci_dev,
+ struct eea_pci_device *ep_dev)
+{
+ int err;
+
+ pci_set_drvdata(pci_dev, ep_dev);
+
+ err = eea_pci_setup(pci_dev, ep_dev);
+ if (err)
+ return err;
+
+ err = eea_init_device(&ep_dev->edev);
+ if (err)
+ goto err_pci_rel;
+
+ return 0;
+
+err_pci_rel:
+ eea_pci_release_resource(ep_dev);
+ return err;
+}
+
+static void __eea_pci_remove(struct pci_dev *pci_dev)
+{
+ struct eea_pci_device *ep_dev = pci_get_drvdata(pci_dev);
+ struct device *dev = get_device(&ep_dev->pci_dev->dev);
+
+ pci_disable_sriov(pci_dev);
+
+ eea_pci_release_resource(ep_dev);
+
+ put_device(dev);
+}
+
+static int eea_pci_probe(struct pci_dev *pci_dev,
+ const struct pci_device_id *id)
+{
+ struct eea_pci_device *ep_dev;
+ struct eea_device *edev;
+ int err;
+
+ ep_dev = kzalloc(sizeof(*ep_dev), GFP_KERNEL);
+ if (!ep_dev)
+ return -ENOMEM;
+
+ edev = &ep_dev->edev;
+
+ edev->ep_dev = ep_dev;
+ edev->dma_dev = &pci_dev->dev;
+
+ ep_dev->pci_dev = pci_dev;
+
+ err = __eea_pci_probe(pci_dev, ep_dev);
+ if (err)
+ kfree(ep_dev);
+
+ return err;
+}
+
+static void eea_pci_remove(struct pci_dev *pci_dev)
+{
+ struct eea_pci_device *ep_dev = pci_get_drvdata(pci_dev);
+
+ __eea_pci_remove(pci_dev);
+
+ kfree(ep_dev);
+}
+
+static int eea_pci_sriov_configure(struct pci_dev *pci_dev, int num_vfs)
+{
+ struct eea_pci_device *ep_dev = pci_get_drvdata(pci_dev);
+ struct eea_device *edev = &ep_dev->edev;
+ int ret;
+
+ if (!(eea_pci_io_get_status(edev) & EEA_S_OK))
+ return -EBUSY;
+
+ if (pci_vfs_assigned(pci_dev))
+ return -EPERM;
+
+ if (num_vfs == 0) {
+ pci_disable_sriov(pci_dev);
+ return 0;
+ }
+
+ ret = pci_enable_sriov(pci_dev, num_vfs);
+ if (ret < 0)
+ return ret;
+
+ return num_vfs;
+}
+
+static const struct pci_device_id eea_pci_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_ALIBABA, 0x500B) },
+ { 0 }
+};
+
+MODULE_DEVICE_TABLE(pci, eea_pci_id_table);
+
+static struct pci_driver eea_pci_driver = {
+ .name = "eea",
+ .id_table = eea_pci_id_table,
+ .probe = eea_pci_probe,
+ .remove = eea_pci_remove,
+ .sriov_configure = eea_pci_sriov_configure,
+};
+
+static __init int eea_pci_init(void)
+{
+ return pci_register_driver(&eea_pci_driver);
+}
+
+static __exit void eea_pci_exit(void)
+{
+ pci_unregister_driver(&eea_pci_driver);
+}
+
+module_init(eea_pci_init);
+module_exit(eea_pci_exit);
+
+MODULE_DESCRIPTION("Driver for Alibaba Elastic Ethernet Adapter");
+MODULE_AUTHOR("Xuan Zhuo <xuanzhuo@linux.alibaba.com>");
+MODULE_LICENSE("GPL");
diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.h b/drivers/net/ethernet/alibaba/eea/eea_pci.h
new file mode 100644
index 000000000000..126704a207d5
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_pci.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#ifndef __EEA_PCI_H__
+#define __EEA_PCI_H__
+
+#include <linux/pci.h>
+
+struct eea_pci_cap {
+ __u8 cap_vndr;
+ __u8 cap_next;
+ __u8 cap_len;
+ __u8 cfg_type;
+};
+
+struct eea_pci_reset_reg {
+ struct eea_pci_cap cap;
+ __le16 driver;
+ __le16 device;
+};
+
+struct eea_pci_device;
+
+struct eea_device {
+ struct eea_pci_device *ep_dev;
+ struct device *dma_dev;
+ struct eea_net *enet;
+
+ u64 features;
+
+ u32 rx_num;
+ u32 tx_num;
+ u32 db_blk_size;
+};
+
+const char *eea_pci_name(struct eea_device *edev);
+int eea_pci_domain_nr(struct eea_device *edev);
+u16 eea_pci_dev_id(struct eea_device *edev);
+
+int eea_device_reset(struct eea_device *dev);
+void eea_device_ready(struct eea_device *dev);
+
+u64 eea_pci_device_ts(struct eea_device *edev);
+
+void __iomem *eea_pci_db_addr(struct eea_device *edev, u32 off);
+#endif
--
2.32.0.3.g01195cf9f
* [PATCH net-next v24 2/6] eea: introduce ring and descriptor structures
2026-01-30 9:34 [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 1/6] eea: introduce PCI framework Xuan Zhuo
@ 2026-01-30 9:34 ` Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,2/6] " Jakub Kicinski
2026-01-30 9:34 ` [PATCH net-next v24 3/6] eea: probe the netdevice and create adminq Xuan Zhuo
` (4 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Xuan Zhuo @ 2026-01-30 9:34 UTC (permalink / raw)
To: netdev
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xuan Zhuo, Wen Gu, Philo Lu, Lorenzo Bianconi,
Vadim Fedorenko, Dong Yibo, Ethan Nelson-Moore, Heiner Kallweit,
Lukas Bulwahn, Dust Li
Add basic driver framework for the Alibaba Elastic Ethernet Adapter (EEA).
This commit introduces the ring and descriptor implementations.
These structures and ring APIs are used by the RX, TX, and admin queues.
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
drivers/net/ethernet/alibaba/eea/Makefile | 3 +-
drivers/net/ethernet/alibaba/eea/eea_desc.h | 156 ++++++++++++
drivers/net/ethernet/alibaba/eea/eea_ring.c | 266 ++++++++++++++++++++
drivers/net/ethernet/alibaba/eea/eea_ring.h | 91 +++++++
4 files changed, 515 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_desc.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ring.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ring.h
diff --git a/drivers/net/ethernet/alibaba/eea/Makefile b/drivers/net/ethernet/alibaba/eea/Makefile
index cf2acf1733fd..e5e4007810a6 100644
--- a/drivers/net/ethernet/alibaba/eea/Makefile
+++ b/drivers/net/ethernet/alibaba/eea/Makefile
@@ -1,3 +1,4 @@
obj-$(CONFIG_EEA) += eea.o
-eea-y := eea_pci.o
+eea-y := eea_ring.o \
+ eea_pci.o
diff --git a/drivers/net/ethernet/alibaba/eea/eea_desc.h b/drivers/net/ethernet/alibaba/eea/eea_desc.h
new file mode 100644
index 000000000000..541346a03375
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_desc.h
@@ -0,0 +1,156 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#ifndef __EEA_DESC_H__
+#define __EEA_DESC_H__
+
+#define EEA_DESC_TS_MASK GENMASK(47, 0)
+#define EEA_DESC_TS(desc) (le64_to_cpu((desc)->ts) & EEA_DESC_TS_MASK)
+
+struct eea_aq_desc {
+ __le16 flags;
+ __le16 id;
+ __le16 reserved;
+ u8 classid;
+ u8 command;
+ __le64 data_addr;
+ __le64 reply_addr;
+ __le32 data_len;
+ __le32 reply_len;
+};
+
+struct eea_aq_cdesc {
+ __le16 flags;
+ __le16 id;
+#define EEA_OK 0
+#define EEA_ERR 0xffffffff
+ __le32 status;
+ __le32 reply_len;
+ __le32 reserved1;
+
+ __le64 reserved2;
+ __le64 reserved3;
+};
+
+struct eea_rx_desc {
+ __le16 flags;
+ __le16 id;
+ __le16 len;
+ __le16 reserved1;
+
+ __le64 addr;
+
+ __le64 hdr_addr;
+ __le32 reserved2;
+ __le32 reserved3;
+};
+
+#define EEA_RX_CDESC_HDR_LEN_MASK GENMASK(9, 0)
+
+struct eea_rx_cdesc {
+#define EEA_DESC_F_DATA_VALID BIT(6)
+#define EEA_DESC_F_SPLIT_HDR BIT(5)
+ __le16 flags;
+ __le16 id;
+ __le16 len;
+#define EEA_NET_PT_NONE 0
+#define EEA_NET_PT_IPv4 1
+#define EEA_NET_PT_TCPv4 2
+#define EEA_NET_PT_UDPv4 3
+#define EEA_NET_PT_IPv6 4
+#define EEA_NET_PT_TCPv6 5
+#define EEA_NET_PT_UDPv6 6
+#define EEA_NET_PT_IPv6_EX 7
+#define EEA_NET_PT_TCPv6_EX 8
+#define EEA_NET_PT_UDPv6_EX 9
+ /* [9:0] is packet type. */
+ __le16 type;
+
+ /* hw timestamp [0:47]: ts */
+ __le64 ts;
+
+ __le32 hash;
+
+ /* 0-9: hdr_len split header
+ * 10-15: reserved1
+ */
+ __le16 len_ex;
+ __le16 reserved2;
+
+ __le32 reserved3;
+ __le32 reserved4;
+};
+
+#define EEA_TX_GSO_NONE 0
+#define EEA_TX_GSO_TCPV4 1
+#define EEA_TX_GSO_TCPV6 4
+#define EEA_TX_GSO_UDP_L4 5
+#define EEA_TX_GSO_ECN 0x80
+
+struct eea_tx_desc {
+#define EEA_DESC_F_DO_CSUM BIT(6)
+ __le16 flags;
+ __le16 id;
+ __le16 len;
+ __le16 reserved1;
+
+ __le64 addr;
+
+ __le16 csum_start;
+ __le16 csum_offset;
+ u8 gso_type;
+ u8 reserved2;
+ __le16 gso_size;
+ __le64 reserved3;
+};
+
+struct eea_tx_cdesc {
+ __le16 flags;
+ __le16 id;
+ __le16 len;
+ __le16 reserved1;
+
+ /* hw timestamp [0:47]: ts */
+ __le64 ts;
+ __le64 reserved2;
+ __le64 reserved3;
+};
+
+struct eea_db {
+#define EEA_IDX_PRESENT BIT(0)
+#define EEA_IRQ_MASK BIT(1)
+#define EEA_IRQ_UNMASK BIT(2)
+#define EEA_DIRECT_INLINE BIT(3)
+#define EEA_DIRECT_DESC BIT(4)
+ u8 kick_flags;
+ u8 reserved;
+ __le16 idx;
+
+ __le16 tx_cq_head;
+ __le16 rx_cq_head;
+};
+
+struct eea_db_direct {
+ u8 kick_flags;
+ u8 reserved;
+ __le16 idx;
+
+ __le16 tx_cq_head;
+ __le16 rx_cq_head;
+
+ u8 desc[24];
+};
+
+static_assert(sizeof(struct eea_rx_desc) == 32, "rx desc size does not match");
+static_assert(sizeof(struct eea_rx_cdesc) == 32,
+ "rx cdesc size does not match");
+static_assert(sizeof(struct eea_tx_desc) == 32, "tx desc size does not match");
+static_assert(sizeof(struct eea_tx_cdesc) == 32,
+ "tx cdesc size does not match");
+static_assert(sizeof(struct eea_db_direct) == 32,
+ "db direct size does not match");
+#endif
diff --git a/drivers/net/ethernet/alibaba/eea/eea_ring.c b/drivers/net/ethernet/alibaba/eea/eea_ring.c
new file mode 100644
index 000000000000..10cc2c8f4458
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_ring.c
@@ -0,0 +1,266 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#include "eea_pci.h"
+#include "eea_ring.h"
+
+void ering_irq_unactive(struct eea_ring *ering)
+{
+ union {
+ u64 data;
+ struct eea_db db;
+ } val;
+
+ if (ering->mask == EEA_IRQ_MASK)
+ return;
+
+ ering->mask = EEA_IRQ_MASK;
+
+ val.data = 0;
+
+ val.db.kick_flags = EEA_IRQ_MASK;
+
+ writeq(val.data, (void __iomem *)ering->db);
+}
+
+void ering_irq_active(struct eea_ring *ering, struct eea_ring *tx_ering)
+{
+ union {
+ u64 data;
+ struct eea_db db;
+ } val;
+
+ if (ering->mask == EEA_IRQ_UNMASK)
+ return;
+
+ ering->mask = EEA_IRQ_UNMASK;
+
+ val.data = 0;
+
+ val.db.kick_flags = EEA_IRQ_UNMASK;
+
+ val.db.tx_cq_head = cpu_to_le16(tx_ering->cq.hw_idx);
+ val.db.rx_cq_head = cpu_to_le16(ering->cq.hw_idx);
+
+ writeq(val.data, (void __iomem *)ering->db);
+}
+
+void *ering_cq_get_desc(const struct eea_ring *ering)
+{
+ u8 phase;
+ u8 *desc;
+
+ desc = ering->cq.desc + (ering->cq.head << ering->cq.desc_size_shift);
+
+ phase = *(u8 *)(desc + ering->cq.desc_size - 1);
+
+ if ((phase & EEA_RING_DESC_F_CQ_PHASE) == ering->cq.phase) {
+ dma_rmb();
+ return desc;
+ }
+
+ return NULL;
+}
+
+/* sq api */
+void *ering_sq_alloc_desc(struct eea_ring *ering, u16 id, bool is_last,
+ u16 flags)
+{
+ struct eea_ring_sq *sq = &ering->sq;
+ struct eea_common_desc *desc;
+
+ if (!sq->shadow_num) {
+ sq->shadow_idx = sq->head;
+ sq->shadow_id = cpu_to_le16(id);
+ }
+
+ if (!is_last)
+ flags |= EEA_RING_DESC_F_MORE;
+
+ desc = sq->desc + (sq->shadow_idx << sq->desc_size_shift);
+
+ desc->flags = cpu_to_le16(flags);
+ desc->id = sq->shadow_id;
+
+ if (unlikely(++sq->shadow_idx >= ering->num))
+ sq->shadow_idx = 0;
+
+ ++sq->shadow_num;
+
+ return desc;
+}
+
+/* alloc desc for adminq */
+void *ering_aq_alloc_desc(struct eea_ring *ering)
+{
+ struct eea_ring_sq *sq = &ering->sq;
+ struct eea_common_desc *desc;
+
+ sq->shadow_idx = sq->head;
+
+ desc = sq->desc + (sq->shadow_idx << sq->desc_size_shift);
+
+ if (unlikely(++sq->shadow_idx >= ering->num))
+ sq->shadow_idx = 0;
+
+ ++sq->shadow_num;
+
+ return desc;
+}
+
+void ering_sq_commit_desc(struct eea_ring *ering)
+{
+ struct eea_ring_sq *sq = &ering->sq;
+ int num;
+
+ num = sq->shadow_num;
+
+ ering->num_free -= num;
+
+ sq->head = sq->shadow_idx;
+ sq->hw_idx += num;
+ sq->shadow_num = 0;
+}
+
+void ering_sq_cancel(struct eea_ring *ering)
+{
+ ering->sq.shadow_num = 0;
+}
+
+/* cq api */
+void ering_cq_ack_desc(struct eea_ring *ering, u32 num)
+{
+ struct eea_ring_cq *cq = &ering->cq;
+
+ cq->head += num;
+ cq->hw_idx += num;
+
+ if (unlikely(cq->head >= ering->num)) {
+ cq->head -= ering->num;
+ cq->phase ^= EEA_RING_DESC_F_CQ_PHASE;
+ }
+
+ ering->num_free += num;
+}
+
+/* notify */
+bool ering_kick(struct eea_ring *ering)
+{
+ union {
+ struct eea_db db;
+ u64 data;
+ } val;
+
+ val.data = 0;
+
+ val.db.kick_flags = EEA_IDX_PRESENT;
+ val.db.idx = cpu_to_le16(ering->sq.hw_idx);
+
+ writeq(val.data, (void __iomem *)ering->db);
+
+ return true;
+}
+
+/* ering alloc/free */
+static void ering_free_queue(struct eea_device *edev, size_t size,
+ void *queue, dma_addr_t dma_handle)
+{
+ dma_free_coherent(edev->dma_dev, size, queue, dma_handle);
+}
+
+static void *ering_alloc_queue(struct eea_device *edev, size_t size,
+ dma_addr_t *dma_handle)
+{
+ gfp_t flags = GFP_KERNEL | __GFP_NOWARN;
+
+ return dma_alloc_coherent(edev->dma_dev, size, dma_handle, flags);
+}
+
+static int ering_alloc_queues(struct eea_ring *ering, struct eea_device *edev,
+ u32 num, u8 sq_desc_size, u8 cq_desc_size)
+{
+ dma_addr_t addr;
+ size_t size;
+ void *ring;
+
+ size = num * sq_desc_size;
+
+ ring = ering_alloc_queue(edev, size, &addr);
+ if (!ring)
+ return -ENOMEM;
+
+ ering->sq.desc = ring;
+ ering->sq.dma_addr = addr;
+ ering->sq.dma_size = size;
+ ering->sq.desc_size = sq_desc_size;
+ ering->sq.desc_size_shift = fls(sq_desc_size) - 1;
+
+ size = num * cq_desc_size;
+
+ ring = ering_alloc_queue(edev, size, &addr);
+ if (!ring)
+ goto err_free_sq;
+
+ ering->cq.desc = ring;
+ ering->cq.dma_addr = addr;
+ ering->cq.dma_size = size;
+ ering->cq.desc_size = cq_desc_size;
+ ering->cq.desc_size_shift = fls(cq_desc_size) - 1;
+
+ ering->num = num;
+
+ return 0;
+
+err_free_sq:
+ ering_free_queue(ering->edev, ering->sq.dma_size,
+ ering->sq.desc, ering->sq.dma_addr);
+ return -ENOMEM;
+}
+
+static void ering_init(struct eea_ring *ering)
+{
+ ering->cq.phase = EEA_RING_DESC_F_CQ_PHASE;
+ ering->num_free = ering->num;
+}
+
+struct eea_ring *ering_alloc(u32 index, u32 num, struct eea_device *edev,
+ u8 sq_desc_size, u8 cq_desc_size,
+ const char *name)
+{
+ struct eea_ring *ering;
+
+ ering = kzalloc(sizeof(*ering), GFP_KERNEL);
+ if (!ering)
+ return NULL;
+
+ ering->edev = edev;
+ ering->name = name;
+ ering->index = index;
+ ering->msix_vec = index / 2 + 1; /* vector 0 is reserved for error notification. */
+
+ if (ering_alloc_queues(ering, edev, num, sq_desc_size, cq_desc_size))
+ goto err_free;
+
+ ering_init(ering);
+
+ return ering;
+
+err_free:
+ kfree(ering);
+ return NULL;
+}
+
+void ering_free(struct eea_ring *ering)
+{
+ ering_free_queue(ering->edev, ering->cq.dma_size,
+ ering->cq.desc, ering->cq.dma_addr);
+
+ ering_free_queue(ering->edev, ering->sq.dma_size,
+ ering->sq.desc, ering->sq.dma_addr);
+
+ kfree(ering);
+}
diff --git a/drivers/net/ethernet/alibaba/eea/eea_ring.h b/drivers/net/ethernet/alibaba/eea/eea_ring.h
new file mode 100644
index 000000000000..ea7adc32bb23
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_ring.h
@@ -0,0 +1,91 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#ifndef __EEA_RING_H__
+#define __EEA_RING_H__
+
+#include <linux/dma-mapping.h>
+#include "eea_desc.h"
+
+#define EEA_RING_DESC_F_MORE BIT(0)
+#define EEA_RING_DESC_F_CQ_PHASE BIT(7)
+
+struct eea_common_desc {
+ __le16 flags;
+ __le16 id;
+};
+
+struct eea_device;
+
+struct eea_ring_sq {
+ void *desc;
+
+ u16 head;
+ u16 hw_idx;
+
+ u16 shadow_idx;
+ __le16 shadow_id;
+ u16 shadow_num;
+
+ u8 desc_size;
+ u8 desc_size_shift;
+
+ dma_addr_t dma_addr;
+ u32 dma_size;
+};
+
+struct eea_ring_cq {
+ void *desc;
+
+ u16 head;
+ u16 hw_idx;
+
+ u8 phase;
+ u8 desc_size_shift;
+ u8 desc_size;
+
+ dma_addr_t dma_addr;
+ u32 dma_size;
+};
+
+struct eea_ring {
+ const char *name;
+ struct eea_device *edev;
+ u32 index;
+ void __iomem *db;
+ u16 msix_vec;
+
+ u8 mask;
+
+ u32 num;
+
+ u32 num_free;
+
+ struct eea_ring_sq sq;
+ struct eea_ring_cq cq;
+
+ char irq_name[32];
+};
+
+struct eea_ring *ering_alloc(u32 index, u32 num, struct eea_device *edev,
+ u8 sq_desc_size, u8 cq_desc_size,
+ const char *name);
+void ering_free(struct eea_ring *ering);
+bool ering_kick(struct eea_ring *ering);
+
+void *ering_sq_alloc_desc(struct eea_ring *ering, u16 id,
+ bool is_last, u16 flags);
+void *ering_aq_alloc_desc(struct eea_ring *ering);
+void ering_sq_commit_desc(struct eea_ring *ering);
+void ering_sq_cancel(struct eea_ring *ering);
+
+void ering_cq_ack_desc(struct eea_ring *ering, u32 num);
+
+void ering_irq_unactive(struct eea_ring *ering);
+void ering_irq_active(struct eea_ring *ering, struct eea_ring *tx_ering);
+void *ering_cq_get_desc(const struct eea_ring *ering);
+#endif
--
2.32.0.3.g01195cf9f
* [PATCH net-next v24 3/6] eea: probe the netdevice and create adminq
From: Xuan Zhuo @ 2026-01-30 9:34 UTC (permalink / raw)
Add basic driver framework for the Alibaba Elastic Ethernet Adapter (EEA).
This commit creates and registers the netdevice after PCI probe,
and initializes the admin queue to send commands to the device.
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
drivers/net/ethernet/alibaba/eea/Makefile | 6 +-
drivers/net/ethernet/alibaba/eea/eea_adminq.c | 421 ++++++++++++++++++
drivers/net/ethernet/alibaba/eea/eea_adminq.h | 70 +++
drivers/net/ethernet/alibaba/eea/eea_net.c | 193 ++++++++
drivers/net/ethernet/alibaba/eea/eea_net.h | 143 ++++++
drivers/net/ethernet/alibaba/eea/eea_pci.c | 24 +-
drivers/net/ethernet/alibaba/eea/eea_pci.h | 3 +
7 files changed, 857 insertions(+), 3 deletions(-)
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_adminq.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_adminq.h
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_net.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_net.h
diff --git a/drivers/net/ethernet/alibaba/eea/Makefile b/drivers/net/ethernet/alibaba/eea/Makefile
index e5e4007810a6..91f318e8e046 100644
--- a/drivers/net/ethernet/alibaba/eea/Makefile
+++ b/drivers/net/ethernet/alibaba/eea/Makefile
@@ -1,4 +1,6 @@
obj-$(CONFIG_EEA) += eea.o
-eea-y := eea_ring.o \
- eea_pci.o
+eea-y := eea_ring.o \
+ eea_net.o \
+ eea_pci.o \
+ eea_adminq.o
diff --git a/drivers/net/ethernet/alibaba/eea/eea_adminq.c b/drivers/net/ethernet/alibaba/eea/eea_adminq.c
new file mode 100644
index 000000000000..035de4efbe30
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_adminq.c
@@ -0,0 +1,421 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/iopoll.h>
+#include <linux/utsname.h>
+#include <linux/version.h>
+
+#include "eea_adminq.h"
+#include "eea_net.h"
+#include "eea_pci.h"
+#include "eea_ring.h"
+
+#define EEA_AQ_CMD_CFG_QUERY ((0 << 8) | 0)
+
+#define EEA_AQ_CMD_QUEUE_CREATE ((1 << 8) | 0)
+#define EEA_AQ_CMD_QUEUE_DESTROY_ALL ((1 << 8) | 1)
+
+#define EEA_AQ_CMD_HOST_INFO ((2 << 8) | 0)
+
+#define EEA_AQ_CMD_DEV_STATUS ((3 << 8) | 0)
+
+#define EEA_RING_DESC_F_AQ_PHASE (BIT(15) | BIT(7))
+
+#define EEA_QUEUE_FLAGS_HW_SPLIT_HDR BIT(0)
+#define EEA_QUEUE_FLAGS_SQCQ BIT(1)
+#define EEA_QUEUE_FLAGS_HWTS BIT(2)
+
+struct eea_aq_create {
+ __le32 flags;
+ /* queue index.
+ * rx: 0 == qidx % 2
+ * tx: 1 == qidx % 2
+ */
+ __le16 qidx;
+ /* the depth of the queue */
+ __le16 depth;
+ /* 0: without SPLIT HDR
+ * 1: 128B
+ * 2: 256B
+ * 3: 512B
+ */
+ u8 hdr_buf_size;
+ u8 sq_desc_size;
+ u8 cq_desc_size;
+ u8 reserve0;
+ /* The vector for the irq. rx,tx share the same vector */
+ __le16 msix_vector;
+ __le16 reserve;
+ /* sq ring cfg. */
+ __le32 sq_addr_low;
+ __le32 sq_addr_high;
+ /* cq ring cfg. Just valid when flags include EEA_QUEUE_FLAGS_SQCQ. */
+ __le32 cq_addr_low;
+ __le32 cq_addr_high;
+};
+
+struct eea_aq_queue_drv_status {
+ __le16 qidx;
+
+ __le16 sq_head;
+ __le16 cq_head;
+ __le16 reserved;
+};
+
+#define EEA_OS_DISTRO 0
+#define EEA_DRV_TYPE 0
+#define EEA_OS_LINUX 1
+#define EEA_SPEC_VER_MAJOR 1
+#define EEA_SPEC_VER_MINOR 0
+
+struct eea_aq_host_info_cfg {
+ __le16 os_type;
+ __le16 os_dist;
+ __le16 drv_type;
+
+ __le16 kern_ver_major;
+ __le16 kern_ver_minor;
+ __le16 kern_ver_sub_minor;
+
+ __le16 drv_ver_major;
+ __le16 drv_ver_minor;
+ __le16 drv_ver_sub_minor;
+
+ __le16 spec_ver_major;
+ __le16 spec_ver_minor;
+ __le16 pci_bdf;
+ __le32 pci_domain;
+
+ u8 os_ver_str[64];
+ u8 isa_str[64];
+};
+
+#define EEA_HINFO_MAX_REP_LEN 1024
+#define EEA_HINFO_REP_REJECT 2
+
+struct eea_aq_host_info_rep {
+ u8 op_code;
+ u8 has_reply;
+ u8 reply_str[EEA_HINFO_MAX_REP_LEN];
+};
+
+static struct eea_ring *qid_to_ering(struct eea_net *enet, u32 qid)
+{
+ struct eea_ring *ering;
+
+ if (qid % 2 == 0)
+ ering = enet->rx[qid / 2]->ering;
+ else
+ ering = enet->tx[qid / 2].ering;
+
+ return ering;
+}
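The qid numbering interleaves each queue pair: even qids are rx, odd qids are tx, and `qid / 2` selects the pair. Together with the MSI-X rule from `ering_alloc()` (a pair shares vector `index / 2 + 1`, vector 0 being reserved for error notification), the mapping can be sketched as:

```c
#include <assert.h>
#include <stdint.h>

/* Even qid -> rx ring, odd qid -> tx ring; qid / 2 is the pair index,
 * matching qid_to_ering(). */
static int qid_is_rx(uint32_t qid)
{
	return (qid % 2) == 0;
}

static uint32_t qid_pair(uint32_t qid)
{
	return qid / 2;
}

/* Vector 0 is reserved for error notification; the rx and tx rings of
 * a pair share the next vector, as assigned in ering_alloc(). */
static uint16_t qid_msix_vec(uint32_t qid)
{
	return qid / 2 + 1;
}
```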
+
+#define EEA_AQ_TIMEOUT_US (60 * 1000 * 1000)
+
+static int eea_adminq_submit(struct eea_net *enet, u16 cmd,
+ dma_addr_t req_addr, dma_addr_t res_addr,
+ u32 req_size, u32 res_size)
+{
+ struct eea_aq_cdesc *cdesc;
+ struct eea_aq_desc *desc;
+ int ret;
+
+ desc = ering_aq_alloc_desc(enet->adminq.ring);
+
+ desc->classid = cmd >> 8;
+ desc->command = cmd & 0xff;
+
+ desc->data_addr = cpu_to_le64(req_addr);
+ desc->data_len = cpu_to_le32(req_size);
+
+ desc->reply_addr = cpu_to_le64(res_addr);
+ desc->reply_len = cpu_to_le32(res_size);
+
+ /* Ensure the descriptor is fully written before the flags update publishes it. */
+ wmb();
+
+ desc->flags = cpu_to_le16(enet->adminq.phase);
+
+ ering_sq_commit_desc(enet->adminq.ring);
+
+ ering_kick(enet->adminq.ring);
+
+ ++enet->adminq.num;
+
+ if ((enet->adminq.num % enet->adminq.ring->num) == 0)
+ enet->adminq.phase ^= EEA_RING_DESC_F_AQ_PHASE;
+
+ ret = read_poll_timeout(ering_cq_get_desc, cdesc, cdesc, 0,
+ EEA_AQ_TIMEOUT_US, false, enet->adminq.ring);
+ if (ret)
+ return ret;
+
+ ret = le32_to_cpu(cdesc->status);
+
+ ering_cq_ack_desc(enet->adminq.ring, 1);
+
+ if (ret)
+ netdev_err(enet->netdev,
+ "adminq exec failed. cmd: %d ret %d\n", cmd, ret);
+
+ return ret;
+}
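Unlike the completion-side phase, the adminq submission phase flips once per full ring of submitted commands: each descriptor is stamped with the current phase, and after every `ring->num` submissions the phase is XORed with `EEA_RING_DESC_F_AQ_PHASE`. A small model of that counter logic (names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define AQ_PHASE 0x8080u /* mirrors EEA_RING_DESC_F_AQ_PHASE (BIT(15) | BIT(7)) */

struct model_aq {
	uint32_t num_submitted;
	uint32_t ring_num;
	uint16_t phase;
};

/* Return the flags value stamped into the descriptor for this
 * submission; flip the phase after every full ring of commands,
 * as eea_adminq_submit() does. */
static uint16_t model_aq_submit(struct model_aq *aq)
{
	uint16_t flags = aq->phase;

	aq->num_submitted++;
	if ((aq->num_submitted % aq->ring_num) == 0)
		aq->phase ^= AQ_PHASE;

	return flags;
}
```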
+
+static int eea_adminq_exec(struct eea_net *enet, u16 cmd,
+ void *req, u32 req_size, void *res, u32 res_size)
+{
+ dma_addr_t req_addr = 0, res_addr = 0;
+ struct device *dma;
+ int ret;
+
+ dma = enet->edev->dma_dev;
+
+ if (req) {
+ req_addr = dma_map_single(dma, req, req_size, DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(dma, req_addr)))
+ return -ENOMEM;
+ }
+
+ if (res) {
+ res_addr = dma_map_single(dma, res, res_size, DMA_FROM_DEVICE);
+ if (unlikely(dma_mapping_error(dma, res_addr))) {
+ ret = -ENOMEM;
+ goto err_unmap_req;
+ }
+ }
+
+ ret = eea_adminq_submit(enet, cmd, req_addr, res_addr,
+ req_size, res_size);
+ if (res)
+ dma_unmap_single(dma, res_addr, res_size, DMA_FROM_DEVICE);
+
+err_unmap_req:
+ if (req)
+ dma_unmap_single(dma, req_addr, req_size, DMA_TO_DEVICE);
+
+ return ret;
+}
+
+void eea_destroy_adminq(struct eea_net *enet)
+{
+ if (enet->adminq.ring) {
+ ering_free(enet->adminq.ring);
+ enet->adminq.ring = NULL;
+ enet->adminq.phase = 0;
+ }
+}
+
+int eea_create_adminq(struct eea_net *enet, u32 qid)
+{
+ struct eea_ring *ering;
+
+ ering = ering_alloc(qid, 64, enet->edev, sizeof(struct eea_aq_desc),
+ sizeof(struct eea_aq_desc), "adminq");
+ if (!ering)
+ return -ENOMEM;
+
+ eea_pci_active_aq(ering);
+
+ enet->adminq.ring = ering;
+ enet->adminq.phase = BIT(7);
+ enet->adminq.num = 0;
+
+ /* set device ready to activate the adminq */
+ eea_device_ready(enet->edev);
+
+ return 0;
+}
+
+int eea_adminq_query_cfg(struct eea_net *enet, struct eea_aq_cfg *cfg)
+{
+ return eea_adminq_exec(enet, EEA_AQ_CMD_CFG_QUERY, NULL, 0, cfg,
+ sizeof(*cfg));
+}
+
+static void qcfg_fill(struct eea_aq_create *qcfg, struct eea_ring *ering,
+ u32 flags)
+{
+ qcfg->flags = cpu_to_le32(flags);
+ qcfg->qidx = cpu_to_le16(ering->index);
+ qcfg->depth = cpu_to_le16(ering->num);
+
+ qcfg->hdr_buf_size = flags & EEA_QUEUE_FLAGS_HW_SPLIT_HDR ? 1 : 0;
+ qcfg->sq_desc_size = ering->sq.desc_size;
+ qcfg->cq_desc_size = ering->cq.desc_size;
+ qcfg->msix_vector = cpu_to_le16(ering->msix_vec);
+
+ qcfg->sq_addr_low = cpu_to_le32(ering->sq.dma_addr);
+ qcfg->sq_addr_high = cpu_to_le32(ering->sq.dma_addr >> 32);
+
+ qcfg->cq_addr_low = cpu_to_le32(ering->cq.dma_addr);
+ qcfg->cq_addr_high = cpu_to_le32(ering->cq.dma_addr >> 32);
+}
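The 64-bit DMA addresses are passed to the device as explicit low/high 32-bit words (each little-endian on the wire). A quick model of that split and its reassembly, with endian conversion omitted:

```c
#include <assert.h>
#include <stdint.h>

struct addr_pair {
	uint32_t low;
	uint32_t high;
};

/* Split a 64-bit DMA address into 32-bit halves as qcfg_fill() does
 * (the driver additionally converts each half with cpu_to_le32()). */
static struct addr_pair split_addr(uint64_t dma)
{
	struct addr_pair p = {
		.low = (uint32_t)dma,
		.high = (uint32_t)(dma >> 32),
	};

	return p;
}

static uint64_t join_addr(struct addr_pair p)
{
	return ((uint64_t)p.high << 32) | p.low;
}
```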
+
+int eea_adminq_create_q(struct eea_net *enet, u32 qidx, u32 num, u32 flags)
+{
+ int i, db_size, q_size, qid, err = -ENOMEM;
+ struct device *dev = enet->edev->dma_dev;
+ struct eea_aq_create *q_buf;
+ dma_addr_t db_dma, q_dma;
+ struct eea_net_cfg *cfg;
+ struct eea_ring *ering;
+ __le32 *db_buf;
+
+ cfg = &enet->cfg;
+
+ if (cfg->split_hdr)
+ flags |= EEA_QUEUE_FLAGS_HW_SPLIT_HDR;
+
+ flags |= EEA_QUEUE_FLAGS_SQCQ;
+ flags |= EEA_QUEUE_FLAGS_HWTS;
+
+ db_size = sizeof(int) * num;
+ q_size = sizeof(struct eea_aq_create) * num;
+
+ db_buf = dma_alloc_coherent(dev, db_size, &db_dma, GFP_KERNEL);
+ if (!db_buf)
+ return err;
+
+ q_buf = dma_alloc_coherent(dev, q_size, &q_dma, GFP_KERNEL);
+ if (!q_buf)
+ goto err_free_db_buf;
+
+ qid = qidx;
+ for (i = 0; i < num; i++, qid++)
+ qcfg_fill(q_buf + i, qid_to_ering(enet, qid), flags);
+
+ err = eea_adminq_exec(enet, EEA_AQ_CMD_QUEUE_CREATE,
+ q_buf, q_size, db_buf, db_size);
+ if (err)
+ goto err_free_q_buf;
+
+ qid = qidx;
+ for (i = 0; i < num; i++, qid++) {
+ ering = qid_to_ering(enet, qid);
+ ering->db = eea_pci_db_addr(ering->edev,
+ le32_to_cpu(db_buf[i]));
+ }
+
+err_free_q_buf:
+ dma_free_coherent(dev, q_size, q_buf, q_dma);
+
+err_free_db_buf:
+ dma_free_coherent(dev, db_size, db_buf, db_dma);
+
+ return err;
+}
+
+int eea_adminq_destroy_all_q(struct eea_net *enet)
+{
+ return eea_adminq_exec(enet, EEA_AQ_CMD_QUEUE_DESTROY_ALL, NULL, 0,
+ NULL, 0);
+}
+
+struct eea_aq_dev_status *eea_adminq_dev_status(struct eea_net *enet)
+{
+ struct eea_aq_queue_drv_status *drv_status;
+ struct eea_aq_dev_status *dev_status;
+ struct eea_ring *ering;
+ int err, i, num, size;
+ void *rep, *req;
+
+ num = enet->cfg.rx_ring_num * 2 + 1;
+
+ req = kcalloc(num, sizeof(struct eea_aq_queue_drv_status), GFP_KERNEL);
+ if (!req)
+ return NULL;
+
+ size = struct_size(dev_status, q_status, num);
+
+ rep = kmalloc(size, GFP_KERNEL);
+ if (!rep) {
+ kfree(req);
+ return NULL;
+ }
+
+ drv_status = req;
+ for (i = 0; i < enet->cfg.rx_ring_num * 2; ++i, ++drv_status) {
+ ering = qid_to_ering(enet, i);
+ drv_status->qidx = cpu_to_le16(i);
+ drv_status->cq_head = cpu_to_le16(ering->cq.head);
+ drv_status->sq_head = cpu_to_le16(ering->sq.head);
+ }
+
+ drv_status->qidx = cpu_to_le16(i);
+ drv_status->cq_head = cpu_to_le16(enet->adminq.ring->cq.head);
+ drv_status->sq_head = cpu_to_le16(enet->adminq.ring->sq.head);
+
+ err = eea_adminq_exec(enet, EEA_AQ_CMD_DEV_STATUS,
+ req, num * sizeof(struct eea_aq_queue_drv_status),
+ rep, size);
+ kfree(req);
+ if (err) {
+ kfree(rep);
+ return NULL;
+ }
+
+ return rep;
+}
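`struct_size(dev_status, q_status, num)` computes the size of the fixed header plus `num` flexible-array entries, with overflow saturation. A plain-C equivalent of the arithmetic for a layout shaped like `struct eea_aq_dev_status` (field names are illustrative, and the kernel's overflow handling is omitted):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_q_status {
	uint16_t qidx;
	uint16_t status;
};

struct model_dev_status {
	uint16_t link_status;
	uint16_t reserved;
	struct model_q_status q_status[]; /* flexible array member */
};

/* Equivalent of struct_size(p, q_status, n) for this layout, minus
 * the kernel's saturation on overflow. */
static size_t model_struct_size(size_t n)
{
	return offsetof(struct model_dev_status, q_status) +
	       n * sizeof(struct model_q_status);
}
```

Using `offsetof()` of the flexible array rather than `sizeof(struct model_dev_status)` avoids counting trailing padding twice.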
+
+int eea_adminq_config_host_info(struct eea_net *enet)
+{
+ struct device *dev = enet->edev->dma_dev;
+ struct eea_aq_host_info_cfg *cfg;
+ struct eea_aq_host_info_rep *rep;
+ int rc = -ENOMEM;
+
+ cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
+ if (!cfg)
+ return rc;
+
+ rep = kzalloc(sizeof(*rep), GFP_KERNEL);
+ if (!rep)
+ goto err_free_cfg;
+
+ cfg->os_type = cpu_to_le16(EEA_OS_LINUX);
+ cfg->os_dist = cpu_to_le16(EEA_OS_DISTRO);
+ cfg->drv_type = cpu_to_le16(EEA_DRV_TYPE);
+
+ cfg->kern_ver_major = cpu_to_le16(LINUX_VERSION_MAJOR);
+ cfg->kern_ver_minor = cpu_to_le16(LINUX_VERSION_PATCHLEVEL);
+ cfg->kern_ver_sub_minor = cpu_to_le16(LINUX_VERSION_SUBLEVEL);
+
+ cfg->drv_ver_major = cpu_to_le16(EEA_VER_MAJOR);
+ cfg->drv_ver_minor = cpu_to_le16(EEA_VER_MINOR);
+ cfg->drv_ver_sub_minor = cpu_to_le16(EEA_VER_SUB_MINOR);
+
+ cfg->spec_ver_major = cpu_to_le16(EEA_SPEC_VER_MAJOR);
+ cfg->spec_ver_minor = cpu_to_le16(EEA_SPEC_VER_MINOR);
+
+ cfg->pci_bdf = cpu_to_le16(eea_pci_dev_id(enet->edev));
+ cfg->pci_domain = cpu_to_le32(eea_pci_domain_nr(enet->edev));
+
+ strscpy(cfg->os_ver_str, utsname()->release, sizeof(cfg->os_ver_str));
+ strscpy(cfg->isa_str, utsname()->machine, sizeof(cfg->isa_str));
+
+ rc = eea_adminq_exec(enet, EEA_AQ_CMD_HOST_INFO,
+ cfg, sizeof(*cfg), rep, sizeof(*rep));
+
+ if (!rc) {
+ if (rep->op_code == EEA_HINFO_REP_REJECT) {
+ dev_err(dev, "Device refused initialization based on the provided host information\n");
+ rc = -ENODEV;
+ }
+ if (rep->has_reply) {
+ rep->reply_str[EEA_HINFO_MAX_REP_LEN - 1] = '\0';
+ dev_warn(dev, "Device replied: %s\n",
+ rep->reply_str);
+ }
+ }
+
+ kfree(rep);
+err_free_cfg:
+ kfree(cfg);
+ return rc;
+}
diff --git a/drivers/net/ethernet/alibaba/eea/eea_adminq.h b/drivers/net/ethernet/alibaba/eea/eea_adminq.h
new file mode 100644
index 000000000000..dce65967cc17
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_adminq.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#ifndef __EEA_ADMINQ_H__
+#define __EEA_ADMINQ_H__
+
+#include "eea_pci.h"
+
+struct eea_aq_cfg {
+ __le32 rx_depth_max;
+ __le32 rx_depth_def;
+
+ __le32 tx_depth_max;
+ __le32 tx_depth_def;
+
+ __le32 max_tso_size;
+ __le32 max_tso_segs;
+
+ u8 mac[ETH_ALEN];
+ __le16 status;
+
+ __le16 mtu;
+ __le16 reserved0;
+ __le16 reserved1;
+ u8 reserved2;
+ u8 reserved3;
+
+ __le16 reserved4;
+ __le16 reserved5;
+ __le16 reserved6;
+};
+
+struct eea_aq_queue_status {
+ __le16 qidx;
+#define EEA_QUEUE_STATUS_OK 0
+#define EEA_QUEUE_STATUS_NEED_RESET 1
+ __le16 status;
+};
+
+struct eea_aq_dev_status {
+#define EEA_LINK_DOWN_STATUS 0
+#define EEA_LINK_UP_STATUS 1
+ __le16 link_status;
+ __le16 reserved;
+
+ struct eea_aq_queue_status q_status[];
+};
+
+struct eea_aq {
+ struct eea_ring *ring;
+ u32 num;
+ u16 phase;
+};
+
+struct eea_net;
+
+int eea_create_adminq(struct eea_net *enet, u32 qid);
+void eea_destroy_adminq(struct eea_net *enet);
+
+int eea_adminq_query_cfg(struct eea_net *enet, struct eea_aq_cfg *cfg);
+
+int eea_adminq_create_q(struct eea_net *enet, u32 qidx, u32 num, u32 flags);
+int eea_adminq_destroy_all_q(struct eea_net *enet);
+struct eea_aq_dev_status *eea_adminq_dev_status(struct eea_net *enet);
+int eea_adminq_config_host_info(struct eea_net *enet);
+#endif
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.c b/drivers/net/ethernet/alibaba/eea/eea_net.c
new file mode 100644
index 000000000000..65b236c412a9
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.c
@@ -0,0 +1,193 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/rtnetlink.h>
+#include <net/netdev_queues.h>
+
+#include "eea_adminq.h"
+#include "eea_net.h"
+#include "eea_pci.h"
+#include "eea_ring.h"
+
+#define EEA_SPLIT_HDR_SIZE 128
+
+static void eea_update_cfg(struct eea_net *enet,
+ struct eea_device *edev,
+ struct eea_aq_cfg *hwcfg)
+{
+ enet->cfg_hw.rx_ring_depth = le32_to_cpu(hwcfg->rx_depth_max);
+ enet->cfg_hw.tx_ring_depth = le32_to_cpu(hwcfg->tx_depth_max);
+
+ enet->cfg_hw.rx_ring_num = edev->rx_num;
+ enet->cfg_hw.tx_ring_num = edev->tx_num;
+
+ enet->cfg.rx_ring_depth = le32_to_cpu(hwcfg->rx_depth_def);
+ enet->cfg.tx_ring_depth = le32_to_cpu(hwcfg->tx_depth_def);
+
+ enet->cfg.rx_ring_num = edev->rx_num;
+ enet->cfg.tx_ring_num = edev->tx_num;
+
+ enet->cfg_hw.split_hdr = EEA_SPLIT_HDR_SIZE;
+}
+
+static int eea_netdev_init_features(struct net_device *netdev,
+ struct eea_net *enet,
+ struct eea_device *edev)
+{
+ struct eea_aq_cfg *cfg;
+ int err;
+ u32 mtu;
+
+ cfg = kmalloc(sizeof(*cfg), GFP_KERNEL);
+ if (!cfg)
+ return -ENOMEM;
+
+ err = eea_adminq_query_cfg(enet, cfg);
+ if (err)
+ goto err_free;
+
+ mtu = le16_to_cpu(cfg->mtu);
+ if (mtu < ETH_MIN_MTU) {
+ dev_err(edev->dma_dev, "Device reported an invalid MTU; aborting initialization. %u < %u\n",
+ mtu, ETH_MIN_MTU);
+ err = -EINVAL;
+ goto err_free;
+ }
+
+ eea_update_cfg(enet, edev, cfg);
+
+ netdev->priv_flags |= IFF_UNICAST_FLT;
+ netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+
+ netdev->hw_features |= NETIF_F_HW_CSUM;
+ netdev->hw_features |= NETIF_F_GRO_HW;
+ netdev->hw_features |= NETIF_F_SG;
+ netdev->hw_features |= NETIF_F_TSO;
+ netdev->hw_features |= NETIF_F_TSO_ECN;
+ netdev->hw_features |= NETIF_F_TSO6;
+ netdev->hw_features |= NETIF_F_GSO_UDP_L4;
+
+ netdev->features |= NETIF_F_HIGHDMA;
+ netdev->features |= NETIF_F_HW_CSUM;
+ netdev->features |= NETIF_F_SG;
+ netdev->features |= NETIF_F_GSO_ROBUST;
+ netdev->features |= netdev->hw_features & NETIF_F_ALL_TSO;
+ netdev->features |= NETIF_F_RXCSUM;
+ netdev->features |= NETIF_F_GRO_HW;
+
+ netdev->vlan_features = netdev->features;
+
+ eth_hw_addr_set(netdev, cfg->mac);
+
+ enet->speed = SPEED_UNKNOWN;
+ enet->duplex = DUPLEX_UNKNOWN;
+
+ netdev->min_mtu = ETH_MIN_MTU;
+
+ netdev->mtu = mtu;
+
+ /* If jumbo frames are enabled on the device, the MTU it reports is
+ * already a jumbo MTU, so the driver supports jumbo frames by
+ * default.
+ */
+ netdev->max_mtu = mtu;
+
+ netif_carrier_on(netdev);
+
+err_free:
+ kfree(cfg);
+ return err;
+}
+
+static const struct net_device_ops eea_netdev = {
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_features_check = passthru_features_check,
+};
+
+static struct eea_net *eea_netdev_alloc(struct eea_device *edev, u32 pairs)
+{
+ struct net_device *netdev;
+ struct eea_net *enet;
+
+ netdev = alloc_etherdev_mq(sizeof(struct eea_net), pairs);
+ if (!netdev) {
+ dev_err(edev->dma_dev,
+ "alloc_etherdev_mq failed with pairs %d\n", pairs);
+ return NULL;
+ }
+
+ netdev->netdev_ops = &eea_netdev;
+ SET_NETDEV_DEV(netdev, edev->dma_dev);
+
+ enet = netdev_priv(netdev);
+ enet->netdev = netdev;
+ enet->edev = edev;
+ edev->enet = enet;
+
+ return enet;
+}
+
+int eea_net_probe(struct eea_device *edev)
+{
+ struct eea_net *enet;
+ int err = -ENOMEM;
+
+ enet = eea_netdev_alloc(edev, edev->rx_num);
+ if (!enet)
+ return -ENOMEM;
+
+ err = eea_create_adminq(enet, edev->rx_num + edev->tx_num);
+ if (err)
+ goto err_free_netdev;
+
+ err = eea_adminq_config_host_info(enet);
+ if (err)
+ goto err_reset_dev;
+
+ err = eea_netdev_init_features(enet->netdev, enet, edev);
+ if (err)
+ goto err_reset_dev;
+
+ err = register_netdev(enet->netdev);
+ if (err)
+ goto err_reset_dev;
+
+ netif_carrier_off(enet->netdev);
+
+ netdev_dbg(enet->netdev, "eea probe success.\n");
+
+ return 0;
+
+err_reset_dev:
+ eea_device_reset(edev);
+ eea_destroy_adminq(enet);
+
+err_free_netdev:
+ free_netdev(enet->netdev);
+ return err;
+}
+
+void eea_net_remove(struct eea_device *edev)
+{
+ struct net_device *netdev;
+ struct eea_net *enet;
+
+ enet = edev->enet;
+ netdev = enet->netdev;
+
+ unregister_netdev(netdev);
+ netdev_dbg(enet->netdev, "eea removed.\n");
+
+ eea_device_reset(edev);
+
+ eea_destroy_adminq(enet);
+
+ free_netdev(netdev);
+}
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.h b/drivers/net/ethernet/alibaba/eea/eea_net.h
new file mode 100644
index 000000000000..b35d7483de63
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.h
@@ -0,0 +1,143 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#ifndef __EEA_NET_H__
+#define __EEA_NET_H__
+
+#include <linux/ethtool.h>
+#include <linux/netdevice.h>
+
+#include "eea_adminq.h"
+#include "eea_ring.h"
+
+#define EEA_VER_MAJOR 1
+#define EEA_VER_MINOR 0
+#define EEA_VER_SUB_MINOR 0
+
+struct eea_net_tx {
+ struct eea_net *enet;
+
+ struct eea_ring *ering;
+
+ struct eea_tx_meta *meta;
+ struct eea_tx_meta *free;
+
+ struct device *dma_dev;
+
+ u32 index;
+
+ char name[16];
+};
+
+struct eea_rx_meta {
+ struct eea_rx_meta *next;
+
+ struct page *page;
+ dma_addr_t dma;
+ u32 offset;
+ u32 frags;
+
+ struct page *hdr_page;
+ void *hdr_addr;
+ dma_addr_t hdr_dma;
+
+ u32 id;
+
+ u32 truesize;
+ u32 headroom;
+ u32 tailroom;
+ u32 room;
+
+ u32 len;
+};
+
+struct eea_net_rx_pkt_ctx {
+ u16 idx;
+
+ bool data_valid;
+ bool do_drop;
+
+ struct sk_buff *head_skb;
+ struct sk_buff *curr_skb;
+};
+
+struct eea_net_rx {
+ struct eea_net *enet;
+
+ struct eea_ring *ering;
+
+ struct eea_rx_meta *meta;
+ struct eea_rx_meta *free;
+
+ struct device *dma_dev;
+
+ u32 index;
+
+ u32 flags;
+
+ u32 headroom;
+
+ struct napi_struct napi;
+
+ u16 irq_n;
+
+ char name[16];
+
+ struct eea_net_rx_pkt_ctx pkt;
+
+ struct page_pool *pp;
+};
+
+struct eea_net_cfg {
+ u32 rx_ring_depth;
+ u32 tx_ring_depth;
+ u32 rx_ring_num;
+ u32 tx_ring_num;
+
+ u8 rx_sq_desc_size;
+ u8 rx_cq_desc_size;
+ u8 tx_sq_desc_size;
+ u8 tx_cq_desc_size;
+
+ u32 split_hdr;
+};
+
+enum {
+ EEA_LINK_ERR_NONE,
+ EEA_LINK_ERR_HA_RESET_DEV,
+ EEA_LINK_ERR_LINK_DOWN,
+};
+
+struct eea_net {
+ struct eea_device *edev;
+ struct net_device *netdev;
+
+ struct eea_aq adminq;
+
+ struct eea_net_tx *tx;
+ struct eea_net_rx **rx;
+
+ struct eea_net_cfg cfg;
+ struct eea_net_cfg cfg_hw;
+
+ u32 link_err;
+
+ bool started;
+ bool cpu_aff_set;
+
+ u8 duplex;
+ u32 speed;
+
+ u64 hw_ts_offset;
+};
+
+int eea_net_probe(struct eea_device *edev);
+void eea_net_remove(struct eea_device *edev);
+int eea_net_freeze(struct eea_device *edev);
+int eea_net_restore(struct eea_device *edev);
+
+#endif
diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.c b/drivers/net/ethernet/alibaba/eea/eea_pci.c
index da9d34ae9454..dcae0154c7e0 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_pci.c
+++ b/drivers/net/ethernet/alibaba/eea/eea_pci.c
@@ -8,6 +8,7 @@
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/iopoll.h>
+#include "eea_net.h"
#include "eea_pci.h"
#define EEA_PCI_DB_OFFSET 4096
@@ -58,7 +59,9 @@ struct eea_pci_device {
((void __iomem *)((reg) + offsetof(struct eea_pci_cfg, item)))
#define cfg_write8(reg, item, val) iowrite8(val, cfg_pointer(reg, item))
+#define cfg_write16(reg, item, val) iowrite16(val, cfg_pointer(reg, item))
#define cfg_write32(reg, item, val) iowrite32(val, cfg_pointer(reg, item))
+#define cfg_write64(reg, item, val) iowrite64_lo_hi(val, cfg_pointer(reg, item))
#define cfg_read8(reg, item) ioread8(cfg_pointer(reg, item))
#define cfg_read32(reg, item) ioread32(cfg_pointer(reg, item))
@@ -233,6 +236,20 @@ void __iomem *eea_pci_db_addr(struct eea_device *edev, u32 off)
return edev->ep_dev->db_base + off;
}
+void eea_pci_active_aq(struct eea_ring *ering)
+{
+ struct eea_pci_device *ep_dev = ering->edev->ep_dev;
+
+ cfg_write16(ep_dev->reg, aq_size, ering->num);
+ cfg_write16(ep_dev->reg, aq_msix_vector, ering->msix_vec);
+
+ cfg_write64(ep_dev->reg, aq_sq_addr, ering->sq.dma_addr);
+ cfg_write64(ep_dev->reg, aq_cq_addr, ering->cq.dma_addr);
+
+ ering->db = eea_pci_db_addr(ering->edev,
+ cfg_read32(ep_dev->reg, aq_db_off));
+}
+
u64 eea_pci_device_ts(struct eea_device *edev)
{
struct eea_pci_device *ep_dev = edev->ep_dev;
@@ -254,7 +271,9 @@ static int eea_init_device(struct eea_device *edev)
if (err)
goto err;
- /* do net device probe ... */
+ err = eea_net_probe(edev);
+ if (err)
+ goto err;
return 0;
err:
@@ -288,6 +307,9 @@ static void __eea_pci_remove(struct pci_dev *pci_dev)
{
struct eea_pci_device *ep_dev = pci_get_drvdata(pci_dev);
struct device *dev = get_device(&ep_dev->pci_dev->dev);
+ struct eea_device *edev = &ep_dev->edev;
+
+ eea_net_remove(edev);
pci_disable_sriov(pci_dev);
diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.h b/drivers/net/ethernet/alibaba/eea/eea_pci.h
index 126704a207d5..d793128e556c 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_pci.h
+++ b/drivers/net/ethernet/alibaba/eea/eea_pci.h
@@ -10,6 +10,8 @@
#include <linux/pci.h>
+#include "eea_ring.h"
+
struct eea_pci_cap {
__u8 cap_vndr;
__u8 cap_next;
@@ -43,6 +45,7 @@ u16 eea_pci_dev_id(struct eea_device *edev);
int eea_device_reset(struct eea_device *dev);
void eea_device_ready(struct eea_device *dev);
+void eea_pci_active_aq(struct eea_ring *ering);
u64 eea_pci_device_ts(struct eea_device *edev);
--
2.32.0.3.g01195cf9f
* [PATCH net-next v24 4/6] eea: create/destroy rx,tx queues for netdevice open and stop
From: Xuan Zhuo @ 2026-01-30 9:34 UTC (permalink / raw)
Add basic driver framework for the Alibaba Elastic Ethernet Adapter (EEA).
This commit introduces the implementation for the netdevice open and
stop.
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
drivers/net/ethernet/alibaba/eea/Makefile | 4 +-
drivers/net/ethernet/alibaba/eea/eea_net.c | 412 ++++++++++-
drivers/net/ethernet/alibaba/eea/eea_net.h | 48 ++
drivers/net/ethernet/alibaba/eea/eea_pci.c | 182 ++++-
drivers/net/ethernet/alibaba/eea/eea_pci.h | 14 +
drivers/net/ethernet/alibaba/eea/eea_rx.c | 762 +++++++++++++++++++++
drivers/net/ethernet/alibaba/eea/eea_tx.c | 377 ++++++++++
7 files changed, 1793 insertions(+), 6 deletions(-)
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_rx.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_tx.c
diff --git a/drivers/net/ethernet/alibaba/eea/Makefile b/drivers/net/ethernet/alibaba/eea/Makefile
index 91f318e8e046..fa34a005fa01 100644
--- a/drivers/net/ethernet/alibaba/eea/Makefile
+++ b/drivers/net/ethernet/alibaba/eea/Makefile
@@ -3,4 +3,6 @@ obj-$(CONFIG_EEA) += eea.o
eea-y := eea_ring.o \
eea_net.o \
eea_pci.o \
- eea_adminq.o
+ eea_adminq.o \
+ eea_tx.o \
+ eea_rx.o
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.c b/drivers/net/ethernet/alibaba/eea/eea_net.c
index 65b236c412a9..4897c07a25ae 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_net.c
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.c
@@ -18,6 +18,357 @@
#define EEA_SPLIT_HDR_SIZE 128
+static int enet_bind_new_q_and_cfg(struct eea_net *enet,
+ struct eea_net_init_ctx *ctx)
+{
+ struct eea_net_rx *rx;
+ struct eea_net_tx *tx;
+ int i, err;
+
+ err = netif_set_real_num_queues(enet->netdev, ctx->cfg.tx_ring_num,
+ ctx->cfg.rx_ring_num);
+ if (err)
+ return err;
+
+ enet->cfg = ctx->cfg;
+
+ enet->rx = ctx->rx;
+ enet->tx = ctx->tx;
+
+ for (i = 0; i < ctx->cfg.rx_ring_num; i++) {
+ rx = ctx->rx[i];
+ tx = &ctx->tx[i];
+
+ rx->enet = enet;
+ tx->enet = enet;
+ }
+
+ return 0;
+}
+
+void enet_init_ctx(struct eea_net *enet, struct eea_net_init_ctx *ctx)
+{
+ memset(ctx, 0, sizeof(*ctx));
+
+ ctx->netdev = enet->netdev;
+ ctx->edev = enet->edev;
+ ctx->cfg = enet->cfg;
+}
+
+static void eea_free_rxtx_q_mem(struct eea_net *enet)
+{
+ struct eea_net_rx *rx, **rx_array;
+ struct eea_net_tx *tx, *tx_array;
+ int i;
+
+ rx_array = enet->rx;
+ tx_array = enet->tx;
+
+ enet->rx = NULL;
+ enet->tx = NULL;
+
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ rx = rx_array[i];
+ tx = &tx_array[i];
+
+ eea_free_rx(rx, &enet->cfg);
+ eea_free_tx(tx, &enet->cfg);
+ }
+
+ kvfree(rx_array);
+ kvfree(tx_array);
+}
+
+/* alloc tx/rx: struct, ring, meta, pp, napi */
+static int eea_alloc_rxtx_q_mem(struct eea_net_init_ctx *ctx)
+{
+ struct eea_net_rx *rx;
+ struct eea_net_tx *tx;
+ int err, i;
+
+ ctx->tx = kvcalloc(ctx->cfg.tx_ring_num, sizeof(*ctx->tx), GFP_KERNEL);
+ if (!ctx->tx)
+ return -ENOMEM;
+
+ ctx->rx = kvcalloc(ctx->cfg.rx_ring_num, sizeof(*ctx->rx), GFP_KERNEL);
+ if (!ctx->rx)
+ goto err_free_tx;
+
+ ctx->cfg.rx_sq_desc_size = sizeof(struct eea_rx_desc);
+ ctx->cfg.rx_cq_desc_size = sizeof(struct eea_rx_cdesc);
+ ctx->cfg.tx_sq_desc_size = sizeof(struct eea_tx_desc);
+ ctx->cfg.tx_cq_desc_size = sizeof(struct eea_tx_cdesc);
+
+ ctx->cfg.tx_cq_desc_size /= 2;
+
+ if (!ctx->cfg.split_hdr)
+ ctx->cfg.rx_sq_desc_size /= 2;
+
+ for (i = 0; i < ctx->cfg.rx_ring_num; i++) {
+ rx = eea_alloc_rx(ctx, i);
+ if (!rx)
+ goto err_free;
+
+ ctx->rx[i] = rx;
+
+ tx = ctx->tx + i;
+ err = eea_alloc_tx(ctx, tx, i);
+ if (err)
+ goto err_free;
+ }
+
+ return 0;
+
+err_free:
+ for (i = 0; i < ctx->cfg.rx_ring_num; i++) {
+ rx = ctx->rx[i];
+ tx = ctx->tx + i;
+
+ eea_free_rx(rx, &ctx->cfg);
+ eea_free_tx(tx, &ctx->cfg);
+ }
+
+ kvfree(ctx->rx);
+
+err_free_tx:
+ kvfree(ctx->tx);
+ return -ENOMEM;
+}
+
+static int eea_active_ring_and_irq(struct eea_net *enet)
+{
+ struct eea_net_rx *rx;
+ int err, i;
+
+ err = enet_rxtx_irq_setup(enet, 0, enet->cfg.rx_ring_num);
+ if (err)
+ return err;
+
+ err = eea_adminq_create_q(enet, /* qidx = */ 0,
+ enet->cfg.rx_ring_num +
+ enet->cfg.tx_ring_num, 0);
+ if (err) {
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ rx = enet->rx[i];
+ eea_irq_free(rx);
+ }
+
+ return err;
+ }
+
+ return 0;
+}
+
+static int eea_unactive_ring_and_irq(struct eea_net *enet)
+{
+ struct eea_net_rx *rx;
+ int err, i;
+
+ err = eea_adminq_destroy_all_q(enet);
+ if (err)
+ netdev_warn(enet->netdev, "deactivating rxtx rings failed.\n");
+
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ rx = enet->rx[i];
+ eea_irq_free(rx);
+ }
+
+ return err;
+}
+
+/* stop rx napi, stop tx queue. */
+static void eea_stop_rxtx(struct net_device *netdev)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ int i;
+
+ netif_tx_disable(netdev);
+
+ for (i = 0; i < enet->cfg.rx_ring_num; i++)
+ enet_rx_stop(enet->rx[i]);
+
+ netif_carrier_off(netdev);
+}
+
+static void eea_start_rxtx(struct net_device *netdev)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ int i;
+
+ for (i = 0; i < enet->cfg.rx_ring_num; i++)
+ enet_rx_start(enet->rx[i]);
+
+ netif_tx_start_all_queues(netdev);
+ netif_carrier_on(netdev);
+
+ enet->started = true;
+}
+
+static int eea_netdev_stop(struct net_device *netdev)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+
+ /* This function may be called during device error recovery. To
+ * prevent the device from being stopped twice, check the `started`
+ * flag first.
+ */
+
+ if (!enet->started) {
+ netdev_warn(netdev, "eea netdev stop: device is not started.\n");
+ return 0;
+ }
+
+ eea_stop_rxtx(netdev);
+ eea_unactive_ring_and_irq(enet);
+ eea_free_rxtx_q_mem(enet);
+
+ enet->started = false;
+
+ return 0;
+}
+
+static int eea_netdev_open(struct net_device *netdev)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ struct eea_net_init_ctx ctx;
+ int err;
+
+ if (enet->link_err) {
+ netdev_err(netdev, "netdev open failed due to link error: %d\n",
+ enet->link_err);
+ return -EBUSY;
+ }
+
+ enet_init_ctx(enet, &ctx);
+
+ err = eea_alloc_rxtx_q_mem(&ctx);
+ if (err)
+ goto err_done;
+
+ err = enet_bind_new_q_and_cfg(enet, &ctx);
+ if (err)
+ goto err_free_q;
+
+ err = eea_active_ring_and_irq(enet);
+ if (err)
+ goto err_free_q;
+
+ eea_start_rxtx(netdev);
+
+ return 0;
+
+err_free_q:
+ eea_free_rxtx_q_mem(enet);
+
+err_done:
+ return err;
+}
+
+/* resources: ring, buffers, irq */
+int eea_reset_hw_resources(struct eea_net *enet, struct eea_net_init_ctx *ctx)
+{
+ int err;
+
+ if (!netif_running(enet->netdev)) {
+ enet->cfg = ctx->cfg;
+ return 0;
+ }
+
+ err = eea_alloc_rxtx_q_mem(ctx);
+ if (err) {
+ netdev_warn(enet->netdev,
+ "eea reset: alloc q failed. stop reset. err %d\n",
+ err);
+ return err;
+ }
+
+ eea_netdev_stop(enet->netdev);
+
+ /* If an error occurs here (which is, of course, a very
+ * low-probability event), we do not free the queue resources
+ * immediately. Since the device is still running, their release is
+ * deferred until the normal NIC cleanup, or until the user or the
+ * hardware triggers a reset operation.
+ */
+
+ err = enet_bind_new_q_and_cfg(enet, ctx);
+ if (err) {
+ netdev_err(enet->netdev,
+ "eea reset: bind new queues failed. err %d\n",
+ err);
+
+ return err;
+ }
+
+ err = eea_active_ring_and_irq(enet);
+ if (err) {
+ netdev_err(enet->netdev,
+ "eea reset: active new ring and irq failed. err %d\n",
+ err);
+ return err;
+ }
+
+ eea_start_rxtx(enet->netdev);
+
+ return err;
+}
+
+int eea_queues_check_and_reset(struct eea_device *edev)
+{
+ struct eea_aq_queue_status *qstatus;
+ struct eea_aq_dev_status *dstatus;
+ struct eea_aq_queue_status *qs;
+ struct eea_net_init_ctx ctx;
+ bool need_reset = false;
+ int num, i, err = 0;
+
+ rtnl_lock();
+
+ if (!netif_running(edev->enet->netdev))
+ goto err_unlock;
+
+ num = edev->enet->cfg.tx_ring_num * 2 + 1;
+
+ dstatus = eea_adminq_dev_status(edev->enet);
+ if (!dstatus) {
+ netdev_warn(edev->enet->netdev, "query queue status failed.\n");
+ err = -ENOMEM;
+ goto err_unlock;
+ }
+
+ if (le16_to_cpu(dstatus->link_status) == EEA_LINK_DOWN_STATUS) {
+ eea_netdev_stop(edev->enet->netdev);
+ edev->enet->link_err = EEA_LINK_ERR_LINK_DOWN;
+ netdev_warn(edev->enet->netdev, "device link is down. stop device.\n");
+ goto err_free;
+ }
+
+ qstatus = dstatus->q_status;
+
+ for (i = 0; i < num; ++i) {
+ qs = &qstatus[i];
+
+ if (le16_to_cpu(qs->status) == EEA_QUEUE_STATUS_NEED_RESET) {
+ netdev_warn(edev->enet->netdev,
+ "queue status: queue %u needs to be reset\n",
+ le16_to_cpu(qs->qidx));
+ need_reset = true;
+ }
+ }
+
+ if (need_reset) {
+ enet_init_ctx(edev->enet, &ctx);
+ err = eea_reset_hw_resources(edev->enet, &ctx);
+ }
+
+err_free:
+ kfree(dstatus);
+
+err_unlock:
+ rtnl_unlock();
+ return err;
+}
+
static void eea_update_cfg(struct eea_net *enet,
struct eea_device *edev,
struct eea_aq_cfg *hwcfg)
@@ -107,8 +458,12 @@ static int eea_netdev_init_features(struct net_device *netdev,
}
static const struct net_device_ops eea_netdev = {
+ .ndo_open = eea_netdev_open,
+ .ndo_stop = eea_netdev_stop,
+ .ndo_start_xmit = eea_tx_xmit,
.ndo_validate_addr = eth_validate_addr,
.ndo_features_check = passthru_features_check,
+ .ndo_tx_timeout = eea_tx_timeout,
};
static struct eea_net *eea_netdev_alloc(struct eea_device *edev, u32 pairs)
@@ -134,11 +489,48 @@ static struct eea_net *eea_netdev_alloc(struct eea_device *edev, u32 pairs)
return enet;
}
+static void eea_update_ts_off(struct eea_device *edev, struct eea_net *enet)
+{
+ u64 ts;
+
+ ts = eea_pci_device_ts(edev);
+
+ enet->hw_ts_offset = ktime_get_real() - ts;
+}
+
+static int eea_net_reprobe(struct eea_device *edev)
+{
+ struct eea_net *enet = edev->enet;
+ int err = 0;
+
+ enet->edev = edev;
+
+ if (!enet->adminq.ring) {
+ err = eea_create_adminq(enet, edev->rx_num + edev->tx_num);
+ if (err)
+ return err;
+ }
+
+ eea_update_ts_off(edev, enet);
+
+ if (edev->ha_reset_netdev_running) {
+ rtnl_lock();
+ enet->link_err = 0;
+ err = eea_netdev_open(enet->netdev);
+ rtnl_unlock();
+ }
+
+ return err;
+}
+
int eea_net_probe(struct eea_device *edev)
{
struct eea_net *enet;
int err = -ENOMEM;
+ if (edev->ha_reset)
+ return eea_net_reprobe(edev);
+
enet = eea_netdev_alloc(edev, edev->rx_num);
if (!enet)
return -ENOMEM;
@@ -159,6 +551,7 @@ int eea_net_probe(struct eea_device *edev)
if (err)
goto err_reset_dev;
+ eea_update_ts_off(edev, enet);
netif_carrier_off(enet->netdev);
netdev_dbg(enet->netdev, "eea probe success.\n");
@@ -182,12 +575,25 @@ void eea_net_remove(struct eea_device *edev)
enet = edev->enet;
netdev = enet->netdev;
- unregister_netdev(netdev);
- netdev_dbg(enet->netdev, "eea removed.\n");
+ if (edev->ha_reset) {
+ edev->ha_reset_netdev_running = false;
+ if (netif_running(enet->netdev)) {
+ rtnl_lock();
+ eea_netdev_stop(enet->netdev);
+ enet->link_err = EEA_LINK_ERR_HA_RESET_DEV;
+ enet->edev = NULL;
+ rtnl_unlock();
+ edev->ha_reset_netdev_running = true;
+ }
+ } else {
+ unregister_netdev(netdev);
+ netdev_dbg(enet->netdev, "eea removed.\n");
+ }
eea_device_reset(edev);
eea_destroy_adminq(enet);
- free_netdev(netdev);
+ if (!edev->ha_reset)
+ free_netdev(netdev);
}
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.h b/drivers/net/ethernet/alibaba/eea/eea_net.h
index b35d7483de63..9d7965acdcb2 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_net.h
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.h
@@ -18,6 +18,13 @@
#define EEA_VER_MINOR 0
#define EEA_VER_SUB_MINOR 0
+struct eea_tx_meta;
+
+struct eea_reprobe {
+ struct eea_net *enet;
+ bool running_before_reprobe;
+};
+
struct eea_net_tx {
struct eea_net *enet;
@@ -104,6 +111,18 @@ struct eea_net_cfg {
u8 tx_cq_desc_size;
u32 split_hdr;
+
+ struct hwtstamp_config ts_cfg;
+};
+
+struct eea_net_init_ctx {
+ struct eea_net_cfg cfg;
+
+ struct eea_net_tx *tx;
+ struct eea_net_rx **rx;
+
+ struct net_device *netdev;
+ struct eea_device *edev;
};
enum {
@@ -135,9 +154,38 @@ struct eea_net {
u64 hw_ts_offset;
};
+int eea_tx_resize(struct eea_net *enet, struct eea_net_tx *tx, u32 ring_num);
+
int eea_net_probe(struct eea_device *edev);
void eea_net_remove(struct eea_device *edev);
int eea_net_freeze(struct eea_device *edev);
int eea_net_restore(struct eea_device *edev);
+int eea_reset_hw_resources(struct eea_net *enet, struct eea_net_init_ctx *ctx);
+void enet_init_ctx(struct eea_net *enet, struct eea_net_init_ctx *ctx);
+int eea_queues_check_and_reset(struct eea_device *edev);
+
+/* rx apis */
+int eea_poll(struct napi_struct *napi, int budget);
+
+void enet_rx_stop(struct eea_net_rx *rx);
+void enet_rx_start(struct eea_net_rx *rx);
+
+void eea_free_rx(struct eea_net_rx *rx, struct eea_net_cfg *cfg);
+struct eea_net_rx *eea_alloc_rx(struct eea_net_init_ctx *ctx, u32 idx);
+
+void eea_irq_free(struct eea_net_rx *rx);
+
+int enet_rxtx_irq_setup(struct eea_net *enet, u32 qid, u32 num);
+
+/* tx apis */
+int eea_poll_tx(struct eea_net_tx *tx, int budget);
+void eea_poll_cleantx(struct eea_net_rx *rx);
+netdev_tx_t eea_tx_xmit(struct sk_buff *skb, struct net_device *netdev);
+
+void eea_tx_timeout(struct net_device *netdev, u32 txqueue);
+
+void eea_free_tx(struct eea_net_tx *tx, struct eea_net_cfg *cfg);
+int eea_alloc_tx(struct eea_net_init_ctx *ctx, struct eea_net_tx *tx, u32 idx);
+
#endif
diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.c b/drivers/net/ethernet/alibaba/eea/eea_pci.c
index dcae0154c7e0..9abd8f0b5f62 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_pci.c
+++ b/drivers/net/ethernet/alibaba/eea/eea_pci.c
@@ -13,6 +13,9 @@
#define EEA_PCI_DB_OFFSET 4096
+#define EEA_PCI_CAP_RESET_DEVICE 0xFA
+#define EEA_PCI_CAP_RESET_FLAG BIT(1)
+
struct eea_pci_cfg {
__le32 reserve0;
__le32 reserve1;
@@ -51,6 +54,7 @@ struct eea_pci_device {
void __iomem *reg;
void __iomem *db_base;
+ struct work_struct ha_handle_work;
char ha_irq_name[32];
u8 reset_pos;
};
@@ -67,6 +71,11 @@ struct eea_pci_device {
#define cfg_read32(reg, item) ioread32(cfg_pointer(reg, item))
#define cfg_readq(reg, item) readq(cfg_pointer(reg, item))
+/* Forward declarations are needed here due to the circular call chain. */
+static int __eea_pci_probe(struct pci_dev *pci_dev,
+ struct eea_pci_device *ep_dev);
+static void __eea_pci_remove(struct pci_dev *pci_dev, bool flush_ha_work);
+
const char *eea_pci_name(struct eea_device *edev)
{
return pci_name(edev->ep_dev->pci_dev);
@@ -250,6 +259,153 @@ void eea_pci_active_aq(struct eea_ring *ering)
cfg_read32(ep_dev->reg, aq_db_off));
}
+void eea_pci_free_irq(struct eea_ring *ering, void *data)
+{
+ struct eea_pci_device *ep_dev = ering->edev->ep_dev;
+ int irq;
+
+ irq = pci_irq_vector(ep_dev->pci_dev, ering->msix_vec);
+ irq_update_affinity_hint(irq, NULL);
+ free_irq(irq, data);
+}
+
+int eea_pci_request_irq(struct eea_ring *ering,
+ irqreturn_t (*callback)(int irq, void *data),
+ void *data)
+{
+ struct eea_pci_device *ep_dev = ering->edev->ep_dev;
+ int irq;
+
+ snprintf(ering->irq_name, sizeof(ering->irq_name), "eea-q%d@%s",
+ ering->index / 2, pci_name(ep_dev->pci_dev));
+
+ irq = pci_irq_vector(ep_dev->pci_dev, ering->msix_vec);
+
+ return request_irq(irq, callback, 0, ering->irq_name, data);
+}
+
+static int eea_ha_handle_reset(struct eea_pci_device *ep_dev)
+{
+ struct eea_device *edev;
+ struct pci_dev *pci_dev;
+ u16 reset;
+
+ if (!ep_dev->reset_pos)
+ return 0;
+
+ edev = &ep_dev->edev;
+
+ pci_read_config_word(ep_dev->pci_dev, ep_dev->reset_pos, &reset);
+
+ /* clear the pending bits by writing all ones */
+ pci_write_config_word(ep_dev->pci_dev, ep_dev->reset_pos, 0xFFFF);
+
+ if (reset & EEA_PCI_CAP_RESET_FLAG) {
+ dev_warn(&ep_dev->pci_dev->dev, "recv device reset request.\n");
+
+ pci_dev = ep_dev->pci_dev;
+
+ /* The pci remove callback may hold this lock. If the remove
+ * callback is already running, the ha interrupt can safely be
+ * ignored.
+ */
+ if (mutex_trylock(&edev->ha_lock)) {
+ edev->ha_reset = true;
+
+ __eea_pci_remove(pci_dev, false);
+ __eea_pci_probe(pci_dev, ep_dev);
+
+ edev->ha_reset = false;
+ mutex_unlock(&edev->ha_lock);
+ } else {
+ dev_warn(&ep_dev->pci_dev->dev,
+ "ha device reset: trylock failed.\n");
+ }
+
+ return 1;
+ }
+
+ return 0;
+}
+
+/* ha handle code */
+static void eea_ha_handle_work(struct work_struct *work)
+{
+ struct eea_pci_device *ep_dev;
+ int done;
+
+ ep_dev = container_of(work, struct eea_pci_device, ha_handle_work);
+
+ /* An ha interrupt indicates a possible error; we may need to
+ * reset the device or some of its queues.
+ */
+ dev_warn(&ep_dev->pci_dev->dev, "recv ha interrupt.\n");
+
+ done = eea_ha_handle_reset(ep_dev);
+ if (done)
+ return;
+
+ eea_queues_check_and_reset(&ep_dev->edev);
+}
+
+static irqreturn_t eea_pci_ha_handle(int irq, void *data)
+{
+ struct eea_device *edev = data;
+
+ schedule_work(&edev->ep_dev->ha_handle_work);
+
+ return IRQ_HANDLED;
+}
+
+static void eea_pci_free_ha_irq(struct eea_device *edev)
+{
+ struct eea_pci_device *ep_dev = edev->ep_dev;
+ int irq = pci_irq_vector(ep_dev->pci_dev, 0);
+
+ free_irq(irq, edev);
+}
+
+static int eea_pci_ha_init(struct eea_device *edev, struct pci_dev *pci_dev)
+{
+ u8 pos, cfg_type_off, type, cfg_drv_off, cfg_dev_off;
+ struct eea_pci_device *ep_dev = edev->ep_dev;
+ int irq;
+
+ cfg_type_off = offsetof(struct eea_pci_cap, cfg_type);
+ cfg_drv_off = offsetof(struct eea_pci_reset_reg, driver);
+ cfg_dev_off = offsetof(struct eea_pci_reset_reg, device);
+
+ for (pos = pci_find_capability(pci_dev, PCI_CAP_ID_VNDR);
+ pos > 0;
+ pos = pci_find_next_capability(pci_dev, pos, PCI_CAP_ID_VNDR)) {
+ pci_read_config_byte(pci_dev, pos + cfg_type_off, &type);
+
+ if (type == EEA_PCI_CAP_RESET_DEVICE) {
+ /* notify the device that the driver supports this feature. */
+ pci_write_config_word(pci_dev, pos + cfg_drv_off,
+ EEA_PCI_CAP_RESET_FLAG);
+ pci_write_config_word(pci_dev, pos + cfg_dev_off,
+ 0xFFFF);
+
+ edev->ep_dev->reset_pos = pos + cfg_dev_off;
+ goto found;
+ }
+ }
+
+ dev_warn(&edev->ep_dev->pci_dev->dev, "reset capability not found.\n");
+
+found:
+ snprintf(ep_dev->ha_irq_name, sizeof(ep_dev->ha_irq_name), "eea-ha@%s",
+ pci_name(ep_dev->pci_dev));
+
+ irq = pci_irq_vector(ep_dev->pci_dev, 0);
+
+ INIT_WORK(&ep_dev->ha_handle_work, eea_ha_handle_work);
+
+ return request_irq(irq, eea_pci_ha_handle, 0,
+ ep_dev->ha_irq_name, edev);
+}
+
u64 eea_pci_device_ts(struct eea_device *edev)
{
struct eea_pci_device *ep_dev = edev->ep_dev;
@@ -284,10 +440,13 @@ static int eea_init_device(struct eea_device *edev)
static int __eea_pci_probe(struct pci_dev *pci_dev,
struct eea_pci_device *ep_dev)
{
+ struct eea_device *edev;
int err;
pci_set_drvdata(pci_dev, ep_dev);
+ edev = &ep_dev->edev;
+
err = eea_pci_setup(pci_dev, ep_dev);
if (err)
return err;
@@ -296,19 +455,31 @@ static int __eea_pci_probe(struct pci_dev *pci_dev,
if (err)
goto err_pci_rel;
+ err = eea_pci_ha_init(edev, pci_dev);
+ if (err)
+ goto err_net_rm;
+
return 0;
+err_net_rm:
+ eea_net_remove(edev);
+
err_pci_rel:
eea_pci_release_resource(ep_dev);
return err;
}
-static void __eea_pci_remove(struct pci_dev *pci_dev)
+static void __eea_pci_remove(struct pci_dev *pci_dev, bool flush_ha_work)
{
struct eea_pci_device *ep_dev = pci_get_drvdata(pci_dev);
struct device *dev = get_device(&ep_dev->pci_dev->dev);
struct eea_device *edev = &ep_dev->edev;
+ eea_pci_free_ha_irq(edev);
+
+ if (flush_ha_work)
+ flush_work(&ep_dev->ha_handle_work);
+
eea_net_remove(edev);
pci_disable_sriov(pci_dev);
@@ -336,6 +507,8 @@ static int eea_pci_probe(struct pci_dev *pci_dev,
ep_dev->pci_dev = pci_dev;
+ mutex_init(&edev->ha_lock);
+
err = __eea_pci_probe(pci_dev, ep_dev);
if (err)
kfree(ep_dev);
@@ -346,8 +519,13 @@ static int eea_pci_probe(struct pci_dev *pci_dev,
static void eea_pci_remove(struct pci_dev *pci_dev)
{
struct eea_pci_device *ep_dev = pci_get_drvdata(pci_dev);
+ struct eea_device *edev;
+
+ edev = &ep_dev->edev;
- __eea_pci_remove(pci_dev);
+ mutex_lock(&edev->ha_lock);
+ __eea_pci_remove(pci_dev, true);
+ mutex_unlock(&edev->ha_lock);
kfree(ep_dev);
}
diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.h b/drivers/net/ethernet/alibaba/eea/eea_pci.h
index d793128e556c..cdddb465d956 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_pci.h
+++ b/drivers/net/ethernet/alibaba/eea/eea_pci.h
@@ -10,6 +10,7 @@
#include <linux/pci.h>
+#include "eea_net.h"
#include "eea_ring.h"
struct eea_pci_cap {
@@ -34,6 +35,12 @@ struct eea_device {
u64 features;
+ bool ha_reset;
+ bool ha_reset_netdev_running;
+
+ /* ha lock for the race between ha work and pci remove */
+ struct mutex ha_lock;
+
u32 rx_num;
u32 tx_num;
u32 db_blk_size;
@@ -47,7 +54,14 @@ int eea_device_reset(struct eea_device *dev);
void eea_device_ready(struct eea_device *dev);
void eea_pci_active_aq(struct eea_ring *ering);
+int eea_pci_request_irq(struct eea_ring *ering,
+ irqreturn_t (*callback)(int irq, void *data),
+ void *data);
+void eea_pci_free_irq(struct eea_ring *ering, void *data);
+
u64 eea_pci_device_ts(struct eea_device *edev);
+int eea_pci_set_affinity(struct eea_ring *ering,
+ const struct cpumask *cpu_mask);
void __iomem *eea_pci_db_addr(struct eea_device *edev, u32 off);
#endif
diff --git a/drivers/net/ethernet/alibaba/eea/eea_rx.c b/drivers/net/ethernet/alibaba/eea/eea_rx.c
new file mode 100644
index 000000000000..663f0b0c8b0e
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_rx.c
@@ -0,0 +1,762 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#include <net/netdev_rx_queue.h>
+#include <net/page_pool/helpers.h>
+
+#include "eea_adminq.h"
+#include "eea_net.h"
+#include "eea_ring.h"
+
+#define EEA_SETUP_F_NAPI BIT(0)
+#define EEA_SETUP_F_IRQ BIT(1)
+#define EEA_ENABLE_F_NAPI BIT(2)
+
+#define EEA_PAGE_FRAGS_NUM 1024
+
+#define EEA_RX_BUF_ALIGN 128
+
+struct eea_rx_ctx {
+ void *buf;
+
+ u32 len;
+ u32 hdr_len;
+
+ u16 flags;
+ bool more;
+
+ u32 frame_sz;
+
+ struct eea_rx_meta *meta;
+};
+
+static struct eea_rx_meta *eea_rx_meta_get(struct eea_net_rx *rx)
+{
+ struct eea_rx_meta *meta;
+
+ if (!rx->free)
+ return NULL;
+
+ meta = rx->free;
+ rx->free = meta->next;
+
+ return meta;
+}
+
+static void eea_rx_meta_put(struct eea_net_rx *rx, struct eea_rx_meta *meta)
+{
+ meta->next = rx->free;
+ rx->free = meta;
+}
+
+static void eea_free_rx_buffer(struct eea_net_rx *rx, struct eea_rx_meta *meta)
+{
+ u32 drain_count;
+
+ drain_count = EEA_PAGE_FRAGS_NUM - meta->frags;
+
+ if (page_pool_unref_page(meta->page, drain_count) == 0)
+ page_pool_put_unrefed_page(rx->pp, meta->page, -1, true);
+
+ meta->page = NULL;
+}
+
+static void meta_align_offset(struct eea_net_rx *rx, struct eea_rx_meta *meta)
+{
+ int h, b;
+
+ h = rx->headroom;
+ b = meta->offset + h;
+
+ /* For better performance, we align the buffer address to
+ * EEA_RX_BUF_ALIGN, as required by the device design.
+ */
+ b = ALIGN(b, EEA_RX_BUF_ALIGN);
+
+ meta->offset = b - h;
+}
+
+static int eea_alloc_rx_buffer(struct eea_net_rx *rx, struct eea_rx_meta *meta)
+{
+ struct page *page;
+
+ if (meta->page)
+ return 0;
+
+ page = page_pool_dev_alloc_pages(rx->pp);
+ if (!page)
+ return -ENOMEM;
+
+ page_pool_fragment_page(page, EEA_PAGE_FRAGS_NUM);
+
+ meta->page = page;
+ meta->dma = page_pool_get_dma_addr(page);
+ meta->offset = 0;
+ meta->frags = 0;
+
+ meta_align_offset(rx, meta);
+
+ return 0;
+}
+
+static void eea_consume_rx_buffer(struct eea_net_rx *rx,
+ struct eea_rx_meta *meta,
+ u32 consumed)
+{
+ int min;
+
+ meta->offset += consumed;
+ ++meta->frags;
+
+ min = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+ min += rx->headroom;
+ min += ETH_DATA_LEN;
+
+ meta_align_offset(rx, meta);
+
+ if (min + meta->offset > PAGE_SIZE)
+ eea_free_rx_buffer(rx, meta);
+}
+
+static void eea_free_rx_hdr(struct eea_net_rx *rx, struct eea_net_cfg *cfg)
+{
+ struct eea_rx_meta *meta;
+ int i;
+
+ for (i = 0; i < cfg->rx_ring_depth; ++i) {
+ meta = &rx->meta[i];
+ meta->hdr_addr = NULL;
+
+ if (!meta->hdr_page)
+ continue;
+
+ dma_unmap_page(rx->dma_dev, meta->hdr_dma, PAGE_SIZE,
+ DMA_FROM_DEVICE);
+ put_page(meta->hdr_page);
+
+ meta->hdr_page = NULL;
+ }
+}
+
+static int eea_alloc_rx_hdr(struct eea_net_init_ctx *ctx, struct eea_net_rx *rx)
+{
+ struct page *hdr_page = NULL;
+ struct eea_rx_meta *meta;
+ u32 offset = 0, hdrsize;
+ struct device *dmadev;
+ dma_addr_t dma;
+ int i;
+
+ dmadev = ctx->edev->dma_dev;
+ hdrsize = ctx->cfg.split_hdr;
+
+ for (i = 0; i < ctx->cfg.rx_ring_depth; ++i) {
+ meta = &rx->meta[i];
+
+ if (!hdr_page || offset + hdrsize > PAGE_SIZE) {
+ hdr_page = dev_alloc_page();
+ if (!hdr_page)
+ return -ENOMEM;
+
+ dma = dma_map_page(dmadev, hdr_page, 0, PAGE_SIZE,
+ DMA_FROM_DEVICE);
+
+ if (unlikely(dma_mapping_error(dmadev, dma))) {
+ put_page(hdr_page);
+ return -ENOMEM;
+ }
+
+ offset = 0;
+ meta->hdr_page = hdr_page;
+ meta->dma = dma;
+ }
+
+ meta->hdr_dma = dma + offset;
+ meta->hdr_addr = page_address(hdr_page) + offset;
+ offset += hdrsize;
+ }
+
+ return 0;
+}
+
+static void eea_rx_meta_dma_sync_for_cpu(struct eea_net_rx *rx,
+ struct eea_rx_meta *meta, u32 len)
+{
+ dma_sync_single_for_cpu(rx->enet->edev->dma_dev,
+ meta->dma + meta->offset + meta->headroom,
+ len, DMA_FROM_DEVICE);
+}
+
+static int eea_harden_check_overflow(struct eea_rx_ctx *ctx,
+ struct eea_net *enet)
+{
+ if (unlikely(ctx->len > ctx->meta->truesize - ctx->meta->room)) {
+ pr_debug("%s: rx error: len %u exceeds truesize %u\n",
+ enet->netdev->name, ctx->len,
+ ctx->meta->truesize - ctx->meta->room);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int eea_harden_check_size(struct eea_rx_ctx *ctx, struct eea_net *enet)
+{
+ int err;
+
+ err = eea_harden_check_overflow(ctx, enet);
+ if (err)
+ return err;
+
+ if (unlikely(ctx->hdr_len + ctx->len < ETH_HLEN)) {
+ pr_debug("%s: short packet %u\n", enet->netdev->name, ctx->len);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static struct sk_buff *eea_build_skb(void *buf, u32 buflen, u32 headroom,
+ u32 len)
+{
+ struct sk_buff *skb;
+
+ skb = build_skb(buf, buflen);
+ if (unlikely(!skb))
+ return NULL;
+
+ skb_reserve(skb, headroom);
+ skb_put(skb, len);
+
+ return skb;
+}
+
+static struct sk_buff *eea_rx_build_split_hdr_skb(struct eea_net_rx *rx,
+ struct eea_rx_ctx *ctx)
+{
+ struct eea_rx_meta *meta = ctx->meta;
+ struct sk_buff *skb;
+ u32 truesize;
+
+ dma_sync_single_for_cpu(rx->enet->edev->dma_dev, meta->hdr_dma,
+ ctx->hdr_len, DMA_FROM_DEVICE);
+
+ skb = napi_alloc_skb(&rx->napi, ctx->hdr_len);
+ if (unlikely(!skb))
+ return NULL;
+
+ truesize = meta->headroom + ctx->len;
+
+ skb_put_data(skb, ctx->meta->hdr_addr, ctx->hdr_len);
+
+ if (ctx->len) {
+ skb_add_rx_frag(skb, 0, meta->page,
+ meta->offset + meta->headroom,
+ ctx->len, truesize);
+
+ eea_consume_rx_buffer(rx, meta, truesize);
+ }
+
+ skb_mark_for_recycle(skb);
+
+ return skb;
+}
+
+static struct sk_buff *eea_rx_build_skb(struct eea_net_rx *rx,
+ struct eea_rx_ctx *ctx)
+{
+ struct eea_rx_meta *meta = ctx->meta;
+ u32 len, shinfo_size, truesize;
+ struct sk_buff *skb;
+ struct page *page;
+ void *buf, *pkt;
+
+ page = meta->page;
+ if (!page)
+ return NULL;
+
+ shinfo_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+
+ buf = page_address(page) + meta->offset;
+ pkt = buf + meta->headroom;
+ len = ctx->len;
+ truesize = meta->headroom + ctx->len + shinfo_size;
+
+ skb = eea_build_skb(buf, truesize, pkt - buf, len);
+ if (unlikely(!skb))
+ return NULL;
+
+ eea_consume_rx_buffer(rx, meta, truesize);
+ skb_mark_for_recycle(skb);
+
+ return skb;
+}
+
+static int eea_skb_append_buf(struct eea_net_rx *rx, struct eea_rx_ctx *ctx)
+{
+ struct sk_buff *curr_skb = rx->pkt.curr_skb;
+ struct sk_buff *head_skb = rx->pkt.head_skb;
+ int num_skb_frags;
+ int offset;
+
+ if (!curr_skb)
+ curr_skb = head_skb;
+
+ num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
+ if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
+ struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC);
+
+ if (unlikely(!nskb))
+ return -ENOMEM;
+
+ if (curr_skb == head_skb)
+ skb_shinfo(curr_skb)->frag_list = nskb;
+ else
+ curr_skb->next = nskb;
+
+ curr_skb = nskb;
+ head_skb->truesize += nskb->truesize;
+ num_skb_frags = 0;
+
+ rx->pkt.curr_skb = curr_skb;
+ }
+
+ if (curr_skb != head_skb) {
+ head_skb->data_len += ctx->len;
+ head_skb->len += ctx->len;
+ head_skb->truesize += ctx->meta->truesize;
+ }
+
+ offset = ctx->meta->offset + ctx->meta->headroom;
+
+ skb_add_rx_frag(curr_skb, num_skb_frags, ctx->meta->page,
+ offset, ctx->len, ctx->meta->truesize);
+
+ eea_consume_rx_buffer(rx, ctx->meta, ctx->meta->headroom + ctx->len);
+
+ return 0;
+}
+
+static int process_remain_buf(struct eea_net_rx *rx, struct eea_rx_ctx *ctx)
+{
+ struct eea_net *enet = rx->enet;
+
+ if (eea_harden_check_overflow(ctx, enet))
+ goto err;
+
+ if (eea_skb_append_buf(rx, ctx))
+ goto err;
+
+ return 0;
+
+err:
+ dev_kfree_skb(rx->pkt.head_skb);
+ rx->pkt.do_drop = true;
+ rx->pkt.head_skb = NULL;
+ return 0;
+}
+
+static int process_first_buf(struct eea_net_rx *rx, struct eea_rx_ctx *ctx)
+{
+ struct eea_net *enet = rx->enet;
+ struct sk_buff *skb = NULL;
+
+ if (eea_harden_check_size(ctx, enet))
+ goto err;
+
+ rx->pkt.data_valid = ctx->flags & EEA_DESC_F_DATA_VALID;
+
+ if (ctx->hdr_len)
+ skb = eea_rx_build_split_hdr_skb(rx, ctx);
+ else
+ skb = eea_rx_build_skb(rx, ctx);
+
+ if (unlikely(!skb))
+ goto err;
+
+ rx->pkt.head_skb = skb;
+
+ return 0;
+
+err:
+ rx->pkt.do_drop = true;
+ return 0;
+}
+
+static void eea_submit_skb(struct eea_net_rx *rx, struct sk_buff *skb,
+ struct eea_rx_cdesc *desc)
+{
+ struct eea_net *enet = rx->enet;
+
+ if (rx->pkt.data_valid)
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+ if (enet->cfg.ts_cfg.rx_filter == HWTSTAMP_FILTER_ALL)
+ skb_hwtstamps(skb)->hwtstamp = EEA_DESC_TS(desc) +
+ enet->hw_ts_offset;
+
+ skb_record_rx_queue(skb, rx->index);
+ skb->protocol = eth_type_trans(skb, enet->netdev);
+
+ napi_gro_receive(&rx->napi, skb);
+}
+
+static void eea_rx_desc_to_ctx(struct eea_net_rx *rx,
+ struct eea_rx_ctx *ctx,
+ struct eea_rx_cdesc *desc)
+{
+ ctx->meta = &rx->meta[le16_to_cpu(desc->id)];
+ ctx->len = le16_to_cpu(desc->len);
+ ctx->flags = le16_to_cpu(desc->flags);
+
+ ctx->hdr_len = 0;
+ if (ctx->flags & EEA_DESC_F_SPLIT_HDR)
+ ctx->hdr_len = le16_to_cpu(desc->len_ex) &
+ EEA_RX_CDESC_HDR_LEN_MASK;
+
+ ctx->more = ctx->flags & EEA_RING_DESC_F_MORE;
+}
+
+static int eea_cleanrx(struct eea_net_rx *rx, int budget,
+ struct eea_rx_ctx *ctx)
+{
+ struct eea_rx_cdesc *desc;
+ struct eea_rx_meta *meta;
+ int packets;
+
+ for (packets = 0; packets < budget; ) {
+ desc = ering_cq_get_desc(rx->ering);
+ if (!desc)
+ break;
+
+ eea_rx_desc_to_ctx(rx, ctx, desc);
+
+ meta = ctx->meta;
+ ctx->buf = page_address(meta->page) + meta->offset +
+ meta->headroom;
+
+ if (unlikely(rx->pkt.do_drop))
+ goto skip;
+
+ eea_rx_meta_dma_sync_for_cpu(rx, meta, ctx->len);
+
+ if (!rx->pkt.idx)
+ process_first_buf(rx, ctx);
+ else
+ process_remain_buf(rx, ctx);
+
+ ++rx->pkt.idx;
+
+ if (!ctx->more) {
+ if (likely(rx->pkt.head_skb))
+ eea_submit_skb(rx, rx->pkt.head_skb, desc);
+
+ ++packets;
+ }
+
+skip:
+ eea_rx_meta_put(rx, meta);
+ ering_cq_ack_desc(rx->ering, 1);
+
+ if (!ctx->more)
+ memset(&rx->pkt, 0, sizeof(rx->pkt));
+ }
+
+ return packets;
+}
+
+static bool eea_rx_post(struct eea_net *enet, struct eea_net_rx *rx)
+{
+ u32 tailroom, headroom, room, len;
+ struct eea_rx_meta *meta;
+ struct eea_rx_desc *desc;
+ int err = 0, num = 0;
+ dma_addr_t addr;
+
+ tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+ headroom = rx->headroom;
+ room = headroom + tailroom;
+
+ while (true) {
+ meta = eea_rx_meta_get(rx);
+ if (!meta)
+ break;
+
+ err = eea_alloc_rx_buffer(rx, meta);
+ if (err) {
+ eea_rx_meta_put(rx, meta);
+ break;
+ }
+
+ len = PAGE_SIZE - meta->offset - room;
+ addr = meta->dma + meta->offset + headroom;
+
+ desc = ering_sq_alloc_desc(rx->ering, meta->id, true, 0);
+ desc->addr = cpu_to_le64(addr);
+ desc->len = cpu_to_le16(len);
+
+ if (meta->hdr_addr)
+ desc->hdr_addr = cpu_to_le64(meta->hdr_dma);
+
+ ering_sq_commit_desc(rx->ering);
+
+ meta->truesize = len + room;
+ meta->headroom = headroom;
+ meta->tailroom = tailroom;
+ meta->len = len;
+ ++num;
+ }
+
+ if (num)
+ ering_kick(rx->ering);
+
+ /* true means busy, napi should be called again. */
+ return !!err;
+}
+
+int eea_poll(struct napi_struct *napi, int budget)
+{
+ struct eea_net_rx *rx = container_of(napi, struct eea_net_rx, napi);
+ struct eea_net_tx *tx = &rx->enet->tx[rx->index];
+ struct eea_net *enet = rx->enet;
+ struct eea_rx_ctx ctx = {};
+ bool busy = false;
+ u32 received;
+
+ eea_poll_tx(tx, budget);
+
+ received = eea_cleanrx(rx, budget, &ctx);
+
+ if (rx->ering->num_free > budget)
+ busy |= eea_rx_post(enet, rx);
+
+ busy |= received >= budget;
+
+ if (busy)
+ return budget;
+
+ if (napi_complete_done(napi, received))
+ ering_irq_active(rx->ering, tx->ering);
+
+ return received;
+}
+
+static void eea_free_rx_buffers(struct eea_net_rx *rx, struct eea_net_cfg *cfg)
+{
+ struct eea_rx_meta *meta;
+ u32 i;
+
+ for (i = 0; i < cfg->rx_ring_depth; ++i) {
+ meta = &rx->meta[i];
+ if (!meta->page)
+ continue;
+
+ eea_free_rx_buffer(rx, meta);
+ }
+}
+
+static struct page_pool *eea_create_pp(struct eea_net_rx *rx,
+ struct eea_net_init_ctx *ctx, u32 idx)
+{
+ struct page_pool_params pp_params = {0};
+
+ pp_params.order = 0;
+ pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV;
+ pp_params.pool_size = ctx->cfg.rx_ring_depth;
+ pp_params.nid = dev_to_node(ctx->edev->dma_dev);
+ pp_params.dev = ctx->edev->dma_dev;
+ pp_params.napi = &rx->napi;
+ pp_params.netdev = ctx->netdev;
+ pp_params.dma_dir = DMA_FROM_DEVICE;
+ pp_params.max_len = PAGE_SIZE;
+
+ return page_pool_create(&pp_params);
+}
+
+static void eea_destroy_page_pool(struct eea_net_rx *rx)
+{
+ if (rx->pp)
+ page_pool_destroy(rx->pp);
+}
+
+static irqreturn_t irq_handler(int irq, void *data)
+{
+ struct eea_net_rx *rx = data;
+
+ rx->irq_n++;
+
+ napi_schedule_irqoff(&rx->napi);
+
+ return IRQ_HANDLED;
+}
+
+void enet_rx_stop(struct eea_net_rx *rx)
+{
+ if (rx->flags & EEA_ENABLE_F_NAPI) {
+ rx->flags &= ~EEA_ENABLE_F_NAPI;
+ napi_disable(&rx->napi);
+ }
+}
+
+void enet_rx_start(struct eea_net_rx *rx)
+{
+ napi_enable(&rx->napi);
+ rx->flags |= EEA_ENABLE_F_NAPI;
+
+ local_bh_disable();
+ napi_schedule(&rx->napi);
+ local_bh_enable();
+}
+
+static int enet_irq_setup_for_q(struct eea_net_rx *rx)
+{
+ int err;
+
+ err = eea_pci_request_irq(rx->ering, irq_handler, rx);
+ if (err)
+ return err;
+
+ rx->flags |= EEA_SETUP_F_IRQ;
+
+ return 0;
+}
+
+void eea_irq_free(struct eea_net_rx *rx)
+{
+ if (rx->flags & EEA_SETUP_F_IRQ) {
+ eea_pci_free_irq(rx->ering, rx);
+ rx->flags &= ~EEA_SETUP_F_IRQ;
+ }
+}
+
+int enet_rxtx_irq_setup(struct eea_net *enet, u32 qid, u32 num)
+{
+ struct eea_net_rx *rx;
+ int err, i;
+
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ rx = enet->rx[i];
+
+ err = enet_irq_setup_for_q(rx);
+ if (err)
+ goto err_free_irq;
+ }
+
+ return 0;
+
+err_free_irq:
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ rx = enet->rx[i];
+
+ eea_irq_free(rx);
+ }
+ return err;
+}
+
+/* This may be called before enet_bind_new_q_and_cfg(), so the cfg
+ * must be passed in explicitly.
+ */
+void eea_free_rx(struct eea_net_rx *rx, struct eea_net_cfg *cfg)
+{
+ if (!rx)
+ return;
+
+ if (rx->ering) {
+ ering_free(rx->ering);
+ rx->ering = NULL;
+ }
+
+ if (rx->meta) {
+ eea_free_rx_buffers(rx, cfg);
+ eea_free_rx_hdr(rx, cfg);
+ kvfree(rx->meta);
+ rx->meta = NULL;
+ }
+
+ if (rx->pp) {
+ eea_destroy_page_pool(rx);
+ rx->pp = NULL;
+ }
+
+ if (rx->flags & EEA_SETUP_F_NAPI) {
+ rx->flags &= ~EEA_SETUP_F_NAPI;
+ netif_napi_del(&rx->napi);
+ }
+
+ kfree(rx);
+}
+
+static void eea_rx_meta_init(struct eea_net_rx *rx, u32 num)
+{
+ struct eea_rx_meta *meta;
+ int i;
+
+ rx->free = NULL;
+
+ for (i = 0; i < num; ++i) {
+ meta = &rx->meta[i];
+ meta->id = i;
+ meta->next = rx->free;
+ rx->free = meta;
+ }
+}
+
+struct eea_net_rx *eea_alloc_rx(struct eea_net_init_ctx *ctx, u32 idx)
+{
+ struct eea_ring *ering;
+ struct eea_net_rx *rx;
+ int err;
+
+ rx = kzalloc(sizeof(*rx), GFP_KERNEL);
+ if (!rx)
+ return rx;
+
+ rx->index = idx;
+ snprintf(rx->name, sizeof(rx->name), "rx.%u", idx);
+
+ /* ering */
+ ering = ering_alloc(idx * 2, ctx->cfg.rx_ring_depth, ctx->edev,
+ ctx->cfg.rx_sq_desc_size,
+ ctx->cfg.rx_cq_desc_size,
+ rx->name);
+ if (!ering)
+ goto err_free_rx;
+
+ rx->ering = ering;
+
+ rx->dma_dev = ctx->edev->dma_dev;
+
+ /* meta */
+ rx->meta = kvcalloc(ctx->cfg.rx_ring_depth,
+ sizeof(*rx->meta), GFP_KERNEL);
+ if (!rx->meta)
+ goto err_free_rx;
+
+ eea_rx_meta_init(rx, ctx->cfg.rx_ring_depth);
+
+ if (ctx->cfg.split_hdr) {
+ err = eea_alloc_rx_hdr(ctx, rx);
+ if (err)
+ goto err_free_rx;
+ }
+
+ rx->pp = eea_create_pp(rx, ctx, idx);
+ if (IS_ERR(rx->pp)) {
+ err = PTR_ERR(rx->pp);
+ rx->pp = NULL;
+ goto err_free_rx;
+ }
+
+ netif_napi_add(ctx->netdev, &rx->napi, eea_poll);
+ rx->flags |= EEA_SETUP_F_NAPI;
+
+ return rx;
+
+err_free_rx:
+ eea_free_rx(rx, &ctx->cfg);
+ return NULL;
+}
diff --git a/drivers/net/ethernet/alibaba/eea/eea_tx.c b/drivers/net/ethernet/alibaba/eea/eea_tx.c
new file mode 100644
index 000000000000..138be057611e
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_tx.c
@@ -0,0 +1,377 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#include <net/netdev_queues.h>
+
+#include "eea_net.h"
+#include "eea_pci.h"
+#include "eea_ring.h"
+
+struct eea_sq_free_stats {
+ u64 packets;
+ u64 bytes;
+};
+
+struct eea_tx_meta {
+ struct eea_tx_meta *next;
+
+ u32 id;
+
+ union {
+ struct sk_buff *skb;
+ void *data;
+ };
+
+ u32 num;
+
+ dma_addr_t dma_addr;
+ struct eea_tx_desc *desc;
+ u16 dma_len;
+};
+
+static struct eea_tx_meta *eea_tx_meta_get(struct eea_net_tx *tx)
+{
+ struct eea_tx_meta *meta;
+
+ if (!tx->free)
+ return NULL;
+
+ meta = tx->free;
+ tx->free = meta->next;
+
+ return meta;
+}
+
+static void eea_tx_meta_put_and_unmap(struct eea_net_tx *tx,
+ struct eea_tx_meta *meta)
+{
+ struct eea_tx_meta *head;
+
+ head = meta;
+
+ while (true) {
+ dma_unmap_single(tx->dma_dev, meta->dma_addr,
+ meta->dma_len, DMA_TO_DEVICE);
+
+ if (meta->next) {
+ meta = meta->next;
+ continue;
+ }
+
+ break;
+ }
+
+ meta->next = tx->free;
+ tx->free = head;
+}
+
+static void eea_meta_free_xmit(struct eea_net_tx *tx,
+ struct eea_tx_meta *meta,
+ int budget,
+ struct eea_tx_cdesc *desc,
+ struct eea_sq_free_stats *stats)
+{
+ struct sk_buff *skb = meta->skb;
+
+ if (unlikely((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) && desc)) {
+ struct skb_shared_hwtstamps ts = {};
+
+ ts.hwtstamp = EEA_DESC_TS(desc) + tx->enet->hw_ts_offset;
+ skb_tstamp_tx(skb, &ts);
+ }
+
+ ++stats->packets;
+ stats->bytes += meta->skb->len;
+ napi_consume_skb(meta->skb, budget);
+
+ meta->data = NULL;
+}
+
+static u32 eea_clean_tx(struct eea_net_tx *tx, int budget)
+{
+ struct eea_sq_free_stats stats = {0};
+ struct eea_tx_cdesc *desc;
+ struct eea_tx_meta *meta;
+
+ while ((desc = ering_cq_get_desc(tx->ering))) {
+ meta = &tx->meta[le16_to_cpu(desc->id)];
+
+ if (meta->data) {
+ eea_tx_meta_put_and_unmap(tx, meta);
+ eea_meta_free_xmit(tx, meta, budget, desc, &stats);
+ } else {
+ netdev_err(tx->enet->netdev,
+ "tx meta->data is null. id %d num: %d\n",
+ meta->id, meta->num);
+ }
+
+ ering_cq_ack_desc(tx->ering, meta->num);
+ }
+
+ return stats.packets;
+}
+
+int eea_poll_tx(struct eea_net_tx *tx, int budget)
+{
+ struct eea_net *enet = tx->enet;
+ u32 index = tx - enet->tx;
+ struct netdev_queue *txq;
+ u32 cleaned;
+
+ txq = netdev_get_tx_queue(enet->netdev, index);
+
+ __netif_tx_lock(txq, smp_processor_id());
+
+ cleaned = eea_clean_tx(tx, budget);
+
+ if (netif_tx_queue_stopped(txq) && cleaned > 0)
+ netif_tx_wake_queue(txq);
+
+ __netif_tx_unlock(txq);
+
+ return 0;
+}
+
+static int eea_fill_desc_from_skb(const struct sk_buff *skb,
+ struct eea_ring *ering,
+ struct eea_tx_desc *desc)
+{
+ if (skb_is_gso(skb)) {
+ struct skb_shared_info *sinfo = skb_shinfo(skb);
+
+ desc->gso_size = cpu_to_le16(sinfo->gso_size);
+ if (sinfo->gso_type & SKB_GSO_TCPV4)
+ desc->gso_type = EEA_TX_GSO_TCPV4;
+
+ else if (sinfo->gso_type & SKB_GSO_TCPV6)
+ desc->gso_type = EEA_TX_GSO_TCPV6;
+
+ else if (sinfo->gso_type & SKB_GSO_UDP_L4)
+ desc->gso_type = EEA_TX_GSO_UDP_L4;
+
+ else
+ return -EINVAL;
+
+ if (sinfo->gso_type & SKB_GSO_TCP_ECN)
+ desc->gso_type |= EEA_TX_GSO_ECN;
+ } else {
+ desc->gso_type = EEA_TX_GSO_NONE;
+ }
+
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ desc->csum_start = cpu_to_le16(skb_checksum_start_offset(skb));
+ desc->csum_offset = cpu_to_le16(skb->csum_offset);
+ }
+
+ return 0;
+}
+
+static struct eea_tx_meta *eea_tx_desc_fill(struct eea_net_tx *tx,
+ dma_addr_t addr, u32 len,
+ bool is_last, void *data, u16 flags)
+{
+ struct eea_tx_meta *meta;
+ struct eea_tx_desc *desc;
+
+ meta = eea_tx_meta_get(tx);
+
+ desc = ering_sq_alloc_desc(tx->ering, meta->id, is_last, flags);
+ desc->addr = cpu_to_le64(addr);
+ desc->len = cpu_to_le16(len);
+
+ meta->next = NULL;
+ meta->dma_len = len;
+ meta->dma_addr = addr;
+ meta->data = data;
+ meta->num = 1;
+ meta->desc = desc;
+
+ return meta;
+}
+
+static int eea_tx_add_skb_frag(struct eea_net_tx *tx,
+ struct eea_tx_meta *head_meta,
+ const skb_frag_t *frag, bool is_last)
+{
+ u32 len = skb_frag_size(frag);
+ struct eea_tx_meta *meta;
+ dma_addr_t addr;
+
+ addr = skb_frag_dma_map(tx->dma_dev, frag, 0, len, DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx->dma_dev, addr)))
+ return -ENOMEM;
+
+ meta = eea_tx_desc_fill(tx, addr, len, is_last, NULL, 0);
+
+ meta->next = head_meta->next;
+ head_meta->next = meta;
+
+ return 0;
+}
+
+static int eea_tx_post_skb(struct eea_net_tx *tx, struct sk_buff *skb)
+{
+ const struct skb_shared_info *shinfo = skb_shinfo(skb);
+ u32 hlen = skb_headlen(skb);
+ struct eea_tx_meta *meta;
+ dma_addr_t addr;
+ int i, err;
+ u16 flags;
+
+ addr = dma_map_single(tx->dma_dev, skb->data, hlen, DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx->dma_dev, addr)))
+ return -ENOMEM;
+
+ flags = skb->ip_summed == CHECKSUM_PARTIAL ? EEA_DESC_F_DO_CSUM : 0;
+
+ meta = eea_tx_desc_fill(tx, addr, hlen, !shinfo->nr_frags, skb, flags);
+
+ if (eea_fill_desc_from_skb(skb, tx->ering, meta->desc))
+ goto err_cancel;
+
+ for (i = 0; i < shinfo->nr_frags; i++) {
+ const skb_frag_t *frag = &shinfo->frags[i];
+ bool is_last = i == (shinfo->nr_frags - 1);
+
+ err = eea_tx_add_skb_frag(tx, meta, frag, is_last);
+ if (err)
+ goto err_cancel;
+ }
+
+ meta->num = shinfo->nr_frags + 1;
+ ering_sq_commit_desc(tx->ering);
+
+ return 0;
+
+err_cancel:
+ ering_sq_cancel(tx->ering);
+ eea_tx_meta_put_and_unmap(tx, meta);
+ meta->data = NULL;
+ return -ENOMEM;
+}
+
+static void eea_tx_kick(struct eea_net_tx *tx)
+{
+ ering_kick(tx->ering);
+}
+
+netdev_tx_t eea_tx_xmit(struct sk_buff *skb, struct net_device *netdev)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ int qnum = skb_get_queue_mapping(skb);
+ struct eea_net_tx *tx = &enet->tx[qnum];
+ struct netdev_queue *txq;
+ int err, n;
+
+ txq = netdev_get_tx_queue(netdev, qnum);
+
+ skb_tx_timestamp(skb);
+
+ err = eea_tx_post_skb(tx, skb);
+ if (unlikely(err))
+ dev_kfree_skb_any(skb);
+
+ /* NETDEV_TX_BUSY is expensive. So stop advancing the TX queue. */
+ n = MAX_SKB_FRAGS + 1;
+ netif_txq_maybe_stop(txq, tx->ering->num_free, n, n);
+
+ if (!netdev_xmit_more() || netif_xmit_stopped(txq))
+ eea_tx_kick(tx);
+
+ return NETDEV_TX_OK;
+}
+
+static void eea_free_meta(struct eea_net_tx *tx, struct eea_net_cfg *cfg)
+{
+ struct eea_sq_free_stats stats = {0};
+ struct eea_tx_meta *meta;
+ int i;
+
+ while ((meta = eea_tx_meta_get(tx)))
+ meta->skb = NULL;
+
+ for (i = 0; i < cfg->tx_ring_depth; i++) {
+ meta = &tx->meta[i];
+
+ if (!meta->skb)
+ continue;
+
+ eea_tx_meta_put_and_unmap(tx, meta);
+
+ eea_meta_free_xmit(tx, meta, 0, NULL, &stats);
+ }
+
+ kvfree(tx->meta);
+ tx->meta = NULL;
+}
+
+void eea_tx_timeout(struct net_device *netdev, unsigned int txqueue)
+{
+ struct netdev_queue *txq = netdev_get_tx_queue(netdev, txqueue);
+ struct eea_net *priv = netdev_priv(netdev);
+ struct eea_net_tx *tx = &priv->tx[txqueue];
+
+ netdev_err(netdev, "TX timeout on queue: %u, tx: %s, ering: 0x%x, %u usecs ago\n",
+ txqueue, tx->name, tx->ering->index,
+ jiffies_to_usecs(jiffies - READ_ONCE(txq->trans_start)));
+}
+
+/* May be called before enet_bind_new_q_and_cfg(), so the cfg must be
+ * passed explicitly.
+ */
+void eea_free_tx(struct eea_net_tx *tx, struct eea_net_cfg *cfg)
+{
+ if (!tx)
+ return;
+
+ if (tx->ering) {
+ ering_free(tx->ering);
+ tx->ering = NULL;
+ }
+
+ if (tx->meta)
+ eea_free_meta(tx, cfg);
+}
+
+int eea_alloc_tx(struct eea_net_init_ctx *ctx, struct eea_net_tx *tx, u32 idx)
+{
+ struct eea_tx_meta *meta;
+ struct eea_ring *ering;
+ u32 i;
+
+ sprintf(tx->name, "tx.%u", idx);
+
+ ering = ering_alloc(idx * 2 + 1, ctx->cfg.tx_ring_depth, ctx->edev,
+ ctx->cfg.tx_sq_desc_size,
+ ctx->cfg.tx_cq_desc_size,
+ tx->name);
+ if (!ering)
+ goto err_free_tx;
+
+ tx->ering = ering;
+ tx->index = idx;
+ tx->dma_dev = ctx->edev->dma_dev;
+
+ /* meta */
+ tx->meta = kvcalloc(ctx->cfg.tx_ring_depth,
+ sizeof(*tx->meta), GFP_KERNEL);
+ if (!tx->meta)
+ goto err_free_tx;
+
+ for (i = 0; i < ctx->cfg.tx_ring_depth; ++i) {
+ meta = &tx->meta[i];
+ meta->id = i;
+ meta->next = tx->free;
+ tx->free = meta;
+ }
+
+ return 0;
+
+err_free_tx:
+ eea_free_tx(tx, &ctx->cfg);
+ return -ENOMEM;
+}
--
2.32.0.3.g01195cf9f
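The TX path in eea_tx.c recycles descriptor metadata through an intrusive singly-linked free list: eea_tx_meta_get() pops one entry and eea_tx_meta_put_and_unmap() walks a per-packet chain and pushes it back. A stand-alone userspace sketch of that pattern (hypothetical names, DMA unmapping omitted):

```c
#include <assert.h>
#include <stddef.h>

struct meta {
	struct meta *next;
	int id;
};

/* Pop one entry, as eea_tx_meta_get() does. */
static struct meta *meta_get(struct meta **free_list)
{
	struct meta *m = *free_list;

	if (m)
		*free_list = m->next;
	return m;
}

/* Push a whole chain back, as eea_tx_meta_put_and_unmap() does
 * (minus the dma_unmap_single() done per element).
 */
static void meta_put_chain(struct meta **free_list, struct meta *head)
{
	struct meta *tail = head;

	while (tail->next)
		tail = tail->next;

	tail->next = *free_list;
	*free_list = head;
}

/* Seed the list the way eea_alloc_tx() does: push in index order,
 * so pops come back LIFO starting from the highest id.
 */
static struct meta *meta_init(struct meta *pool, int num)
{
	struct meta *free_list = NULL;
	int i;

	for (i = 0; i < num; i++) {
		pool[i].id = i;
		pool[i].next = free_list;
		free_list = &pool[i];
	}
	return free_list;
}
```

Because the ids double as descriptor indices, the completion path can go straight from the hardware's `desc->id` to `tx->meta[id]` without searching the list.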
* [PATCH net-next v24 5/6] eea: introduce ethtool support
2026-01-30 9:34 [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
` (3 preceding siblings ...)
2026-01-30 9:34 ` [PATCH net-next v24 4/6] eea: create/destroy rx,tx queues for netdevice open and stop Xuan Zhuo
@ 2026-01-30 9:34 ` Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 6/6] eea: introduce callback for ndo_get_stats64 Xuan Zhuo
2026-01-31 0:20 ` [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Ethan Nelson-Moore
6 siblings, 0 replies; 14+ messages in thread
From: Xuan Zhuo @ 2026-01-30 9:34 UTC (permalink / raw)
To: netdev
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xuan Zhuo, Wen Gu, Philo Lu, Lorenzo Bianconi,
Vadim Fedorenko, Dong Yibo, Ethan Nelson-Moore, Heiner Kallweit,
Lukas Bulwahn, Dust Li, Andrew Lunn
Add basic driver framework for the Alibaba Elastic Ethernet Adapter (EEA).
This commit introduces ethtool support.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
drivers/net/ethernet/alibaba/eea/Makefile | 1 +
.../net/ethernet/alibaba/eea/eea_ethtool.c | 243 ++++++++++++++++++
.../net/ethernet/alibaba/eea/eea_ethtool.h | 49 ++++
drivers/net/ethernet/alibaba/eea/eea_net.c | 1 +
drivers/net/ethernet/alibaba/eea/eea_net.h | 5 +
drivers/net/ethernet/alibaba/eea/eea_rx.c | 29 ++-
drivers/net/ethernet/alibaba/eea/eea_tx.c | 24 +-
7 files changed, 348 insertions(+), 4 deletions(-)
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ethtool.c
create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ethtool.h
diff --git a/drivers/net/ethernet/alibaba/eea/Makefile b/drivers/net/ethernet/alibaba/eea/Makefile
index fa34a005fa01..8f8fbb8d2d9a 100644
--- a/drivers/net/ethernet/alibaba/eea/Makefile
+++ b/drivers/net/ethernet/alibaba/eea/Makefile
@@ -4,5 +4,6 @@ eea-y := eea_ring.o \
eea_net.o \
eea_pci.o \
eea_adminq.o \
+ eea_ethtool.o \
eea_tx.o \
eea_rx.o
diff --git a/drivers/net/ethernet/alibaba/eea/eea_ethtool.c b/drivers/net/ethernet/alibaba/eea/eea_ethtool.c
new file mode 100644
index 000000000000..fcf31569f77b
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_ethtool.c
@@ -0,0 +1,243 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#include <linux/ethtool.h>
+#include <linux/ethtool_netlink.h>
+#include <linux/rtnetlink.h>
+
+#include "eea_adminq.h"
+
+struct eea_stat_desc {
+ char desc[ETH_GSTRING_LEN];
+ size_t offset;
+};
+
+#define EEA_TX_STAT(m) {#m, offsetof(struct eea_tx_stats, m)}
+#define EEA_RX_STAT(m) {#m, offsetof(struct eea_rx_stats, m)}
+
+static const struct eea_stat_desc eea_rx_stats_desc[] = {
+ EEA_RX_STAT(descs),
+ EEA_RX_STAT(kicks),
+};
+
+static const struct eea_stat_desc eea_tx_stats_desc[] = {
+ EEA_TX_STAT(descs),
+ EEA_TX_STAT(kicks),
+};
+
+#define EEA_TX_STATS_LEN ARRAY_SIZE(eea_tx_stats_desc)
+#define EEA_RX_STATS_LEN ARRAY_SIZE(eea_rx_stats_desc)
+
+static void eea_get_drvinfo(struct net_device *netdev,
+ struct ethtool_drvinfo *info)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ struct eea_device *edev = enet->edev;
+
+ strscpy(info->driver, KBUILD_MODNAME, sizeof(info->driver));
+ strscpy(info->bus_info, eea_pci_name(edev), sizeof(info->bus_info));
+}
+
+static void eea_get_ringparam(struct net_device *netdev,
+ struct ethtool_ringparam *ring,
+ struct kernel_ethtool_ringparam *kernel_ring,
+ struct netlink_ext_ack *extack)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+
+ ring->rx_max_pending = enet->cfg_hw.rx_ring_depth;
+ ring->tx_max_pending = enet->cfg_hw.tx_ring_depth;
+ ring->rx_pending = enet->cfg.rx_ring_depth;
+ ring->tx_pending = enet->cfg.tx_ring_depth;
+
+ kernel_ring->tcp_data_split = enet->cfg.split_hdr ?
+ ETHTOOL_TCP_DATA_SPLIT_ENABLED :
+ ETHTOOL_TCP_DATA_SPLIT_DISABLED;
+}
+
+static int eea_set_ringparam(struct net_device *netdev,
+ struct ethtool_ringparam *ring,
+ struct kernel_ethtool_ringparam *kernel_ring,
+ struct netlink_ext_ack *extack)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ struct eea_net_init_ctx ctx;
+ bool need_update = false;
+ struct eea_net_cfg *cfg;
+ bool sh;
+
+ enet_init_ctx(enet, &ctx);
+
+ cfg = &ctx.cfg;
+
+ if (ring->rx_pending != cfg->rx_ring_depth)
+ need_update = true;
+
+ if (ring->tx_pending != cfg->tx_ring_depth)
+ need_update = true;
+
+ sh = kernel_ring->tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED;
+ if (sh != !!(cfg->split_hdr))
+ need_update = true;
+
+ if (!need_update)
+ return 0;
+
+ cfg->rx_ring_depth = ring->rx_pending;
+ cfg->tx_ring_depth = ring->tx_pending;
+
+ cfg->split_hdr = sh ? enet->cfg_hw.split_hdr : 0;
+
+ return eea_reset_hw_resources(enet, &ctx);
+}
+
+static int eea_set_channels(struct net_device *netdev,
+ struct ethtool_channels *channels)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ u16 queue_pairs = channels->combined_count;
+ struct eea_net_init_ctx ctx;
+ struct eea_net_cfg *cfg;
+
+ enet_init_ctx(enet, &ctx);
+
+ cfg = &ctx.cfg;
+
+ cfg->rx_ring_num = queue_pairs;
+ cfg->tx_ring_num = queue_pairs;
+
+ return eea_reset_hw_resources(enet, &ctx);
+}
+
+static void eea_get_channels(struct net_device *netdev,
+ struct ethtool_channels *channels)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+
+ channels->combined_count = enet->cfg.rx_ring_num;
+ channels->max_combined = enet->cfg_hw.rx_ring_num;
+}
+
+static void eea_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ u8 *p = data;
+ u32 i, j;
+
+ if (stringset != ETH_SS_STATS)
+ return;
+
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ for (j = 0; j < EEA_RX_STATS_LEN; j++)
+ ethtool_sprintf(&p, "rx%u_%s", i,
+ eea_rx_stats_desc[j].desc);
+ }
+
+ for (i = 0; i < enet->cfg.tx_ring_num; i++) {
+ for (j = 0; j < EEA_TX_STATS_LEN; j++)
+ ethtool_sprintf(&p, "tx%u_%s", i,
+ eea_tx_stats_desc[j].desc);
+ }
+}
+
+static int eea_get_sset_count(struct net_device *netdev, int sset)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+
+ if (sset != ETH_SS_STATS)
+ return -EOPNOTSUPP;
+
+ return enet->cfg.rx_ring_num * EEA_RX_STATS_LEN +
+ enet->cfg.tx_ring_num * EEA_TX_STATS_LEN;
+}
+
+static void eea_stats_fill_for_q(struct u64_stats_sync *syncp, u32 num,
+ const struct eea_stat_desc *desc,
+ u64 *data, u32 idx)
+{
+ void *stats_base = syncp;
+ u32 start, i;
+
+ do {
+ start = u64_stats_fetch_begin(syncp);
+ for (i = 0; i < num; i++)
+ data[idx + i] =
+ u64_stats_read(stats_base + desc[i].offset);
+
+ } while (u64_stats_fetch_retry(syncp, start));
+}
+
+static void eea_get_ethtool_stats(struct net_device *netdev,
+ struct ethtool_stats *stats, u64 *data)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ u32 i, idx = 0;
+
+ ASSERT_RTNL();
+
+ if (enet->rx) {
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ struct eea_net_rx *rx = enet->rx[i];
+
+ eea_stats_fill_for_q(&rx->stats.syncp, EEA_RX_STATS_LEN,
+ eea_rx_stats_desc, data, idx);
+
+ idx += EEA_RX_STATS_LEN;
+ }
+ }
+
+ if (enet->tx) {
+ for (i = 0; i < enet->cfg.tx_ring_num; i++) {
+ struct eea_net_tx *tx = &enet->tx[i];
+
+ eea_stats_fill_for_q(&tx->stats.syncp, EEA_TX_STATS_LEN,
+ eea_tx_stats_desc, data, idx);
+
+ idx += EEA_TX_STATS_LEN;
+ }
+ }
+}
+
+void eea_update_rx_stats(struct eea_rx_stats *rx_stats,
+ struct eea_rx_ctx_stats *stats)
+{
+ u64_stats_update_begin(&rx_stats->syncp);
+ u64_stats_add(&rx_stats->descs, stats->descs);
+ u64_stats_add(&rx_stats->packets, stats->packets);
+ u64_stats_add(&rx_stats->bytes, stats->bytes);
+ u64_stats_add(&rx_stats->drops, stats->drops);
+ u64_stats_add(&rx_stats->split_hdr_bytes, stats->split_hdr_bytes);
+ u64_stats_add(&rx_stats->split_hdr_packets, stats->split_hdr_packets);
+ u64_stats_add(&rx_stats->length_errors, stats->length_errors);
+ u64_stats_update_end(&rx_stats->syncp);
+}
+
+static int eea_get_link_ksettings(struct net_device *netdev,
+ struct ethtool_link_ksettings *cmd)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+
+ cmd->base.speed = enet->speed;
+ cmd->base.duplex = enet->duplex;
+ cmd->base.port = PORT_OTHER;
+
+ return 0;
+}
+
+const struct ethtool_ops eea_ethtool_ops = {
+ .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT,
+ .get_drvinfo = eea_get_drvinfo,
+ .get_link = ethtool_op_get_link,
+ .get_ringparam = eea_get_ringparam,
+ .set_ringparam = eea_set_ringparam,
+ .set_channels = eea_set_channels,
+ .get_channels = eea_get_channels,
+ .get_strings = eea_get_strings,
+ .get_sset_count = eea_get_sset_count,
+ .get_ethtool_stats = eea_get_ethtool_stats,
+ .get_link_ksettings = eea_get_link_ksettings,
+};
diff --git a/drivers/net/ethernet/alibaba/eea/eea_ethtool.h b/drivers/net/ethernet/alibaba/eea/eea_ethtool.h
new file mode 100644
index 000000000000..a437065d1cab
--- /dev/null
+++ b/drivers/net/ethernet/alibaba/eea/eea_ethtool.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Driver for Alibaba Elastic Ethernet Adapter.
+ *
+ * Copyright (C) 2025 Alibaba Inc.
+ */
+
+#ifndef __EEA_ETHTOOL_H__
+#define __EEA_ETHTOOL_H__
+
+struct eea_tx_stats {
+ struct u64_stats_sync syncp;
+ u64_stats_t descs;
+ u64_stats_t packets;
+ u64_stats_t bytes;
+ u64_stats_t drops;
+ u64_stats_t kicks;
+};
+
+struct eea_rx_ctx_stats {
+ u64 descs;
+ u64 packets;
+ u64 bytes;
+ u64 drops;
+ u64 split_hdr_bytes;
+ u64 split_hdr_packets;
+
+ u64 length_errors;
+};
+
+struct eea_rx_stats {
+ struct u64_stats_sync syncp;
+ u64_stats_t descs;
+ u64_stats_t packets;
+ u64_stats_t bytes;
+ u64_stats_t drops;
+ u64_stats_t kicks;
+ u64_stats_t split_hdr_bytes;
+ u64_stats_t split_hdr_packets;
+
+ u64_stats_t length_errors;
+};
+
+void eea_update_rx_stats(struct eea_rx_stats *rx_stats,
+ struct eea_rx_ctx_stats *stats);
+
+extern const struct ethtool_ops eea_ethtool_ops;
+
+#endif
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.c b/drivers/net/ethernet/alibaba/eea/eea_net.c
index 4897c07a25ae..bc9b384a61df 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_net.c
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.c
@@ -479,6 +479,7 @@ static struct eea_net *eea_netdev_alloc(struct eea_device *edev, u32 pairs)
}
netdev->netdev_ops = &eea_netdev;
+ netdev->ethtool_ops = &eea_ethtool_ops;
SET_NETDEV_DEV(netdev, edev->dma_dev);
enet = netdev_priv(netdev);
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.h b/drivers/net/ethernet/alibaba/eea/eea_net.h
index 9d7965acdcb2..78a621c2ce4c 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_net.h
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.h
@@ -12,6 +12,7 @@
#include <linux/netdevice.h>
#include "eea_adminq.h"
+#include "eea_ethtool.h"
#include "eea_ring.h"
#define EEA_VER_MAJOR 1
@@ -38,6 +39,8 @@ struct eea_net_tx {
u32 index;
char name[16];
+
+ struct eea_tx_stats stats;
};
struct eea_rx_meta {
@@ -90,6 +93,8 @@ struct eea_net_rx {
struct napi_struct napi;
+ struct eea_rx_stats stats;
+
u16 irq_n;
char name[16];
diff --git a/drivers/net/ethernet/alibaba/eea/eea_rx.c b/drivers/net/ethernet/alibaba/eea/eea_rx.c
index 663f0b0c8b0e..e343c9122dae 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_rx.c
+++ b/drivers/net/ethernet/alibaba/eea/eea_rx.c
@@ -32,6 +32,8 @@ struct eea_rx_ctx {
u32 frame_sz;
struct eea_rx_meta *meta;
+
+ struct eea_rx_ctx_stats stats;
};
static struct eea_rx_meta *eea_rx_meta_get(struct eea_net_rx *rx)
@@ -198,6 +200,7 @@ static int eea_harden_check_overflow(struct eea_rx_ctx *ctx,
pr_debug("%s: rx error: len %u exceeds truesize %u\n",
enet->netdev->name, ctx->len,
ctx->meta->truesize - ctx->meta->room);
+ ++ctx->stats.length_errors;
return -EINVAL;
}
@@ -214,6 +217,7 @@ static int eea_harden_check_size(struct eea_rx_ctx *ctx, struct eea_net *enet)
if (unlikely(ctx->hdr_len + ctx->len < ETH_HLEN)) {
pr_debug("%s: short packet %u\n", enet->netdev->name, ctx->len);
+ ++ctx->stats.length_errors;
return -EINVAL;
}
@@ -355,6 +359,7 @@ static int process_remain_buf(struct eea_net_rx *rx, struct eea_rx_ctx *ctx)
err:
dev_kfree_skb(rx->pkt.head_skb);
+ ++ctx->stats.drops;
rx->pkt.do_drop = true;
rx->pkt.head_skb = NULL;
return 0;
@@ -383,6 +388,7 @@ static int process_first_buf(struct eea_net_rx *rx, struct eea_rx_ctx *ctx)
return 0;
err:
+ ++ctx->stats.drops;
rx->pkt.do_drop = true;
return 0;
}
@@ -414,9 +420,12 @@ static void eea_rx_desc_to_ctx(struct eea_net_rx *rx,
ctx->flags = le16_to_cpu(desc->flags);
ctx->hdr_len = 0;
- if (ctx->flags & EEA_DESC_F_SPLIT_HDR)
+ if (ctx->flags & EEA_DESC_F_SPLIT_HDR) {
ctx->hdr_len = le16_to_cpu(desc->len_ex) &
EEA_RX_CDESC_HDR_LEN_MASK;
+ ctx->stats.split_hdr_bytes += ctx->hdr_len;
+ ++ctx->stats.split_hdr_packets;
+ }
ctx->more = ctx->flags & EEA_RING_DESC_F_MORE;
}
@@ -444,6 +453,8 @@ static int eea_cleanrx(struct eea_net_rx *rx, int budget,
eea_rx_meta_dma_sync_for_cpu(rx, meta, ctx->len);
+ ctx->stats.bytes += ctx->len;
+
if (!rx->pkt.idx)
process_first_buf(rx, ctx);
else
@@ -461,17 +472,20 @@ static int eea_cleanrx(struct eea_net_rx *rx, int budget,
skip:
eea_rx_meta_put(rx, meta);
ering_cq_ack_desc(rx->ering, 1);
+ ++ctx->stats.descs;
if (!ctx->more)
memset(&rx->pkt, 0, sizeof(rx->pkt));
}
+ ctx->stats.packets = packets;
+
return packets;
}
static bool eea_rx_post(struct eea_net *enet, struct eea_net_rx *rx)
{
- u32 tailroom, headroom, room, len;
+ u32 tailroom, headroom, room, flags, len;
struct eea_rx_meta *meta;
struct eea_rx_desc *desc;
int err = 0, num = 0;
@@ -511,9 +525,14 @@ static bool eea_rx_post(struct eea_net *enet, struct eea_net_rx *rx)
++num;
}
- if (num)
+ if (num) {
ering_kick(rx->ering);
+ flags = u64_stats_update_begin_irqsave(&rx->stats.syncp);
+ u64_stats_inc(&rx->stats.kicks);
+ u64_stats_update_end_irqrestore(&rx->stats.syncp, flags);
+ }
+
/* true means busy, napi should be called again. */
return !!err;
}
@@ -534,6 +553,8 @@ int eea_poll(struct napi_struct *napi, int budget)
if (rx->ering->num_free > budget)
busy |= eea_rx_post(enet, rx);
+ eea_update_rx_stats(&rx->stats, &ctx.stats);
+
busy |= received >= budget;
if (busy)
@@ -718,6 +739,8 @@ struct eea_net_rx *eea_alloc_rx(struct eea_net_init_ctx *ctx, u32 idx)
rx->index = idx;
sprintf(rx->name, "rx.%u", idx);
+ u64_stats_init(&rx->stats.syncp);
+
/* ering */
ering = ering_alloc(idx * 2, ctx->cfg.rx_ring_depth, ctx->edev,
ctx->cfg.rx_sq_desc_size,
diff --git a/drivers/net/ethernet/alibaba/eea/eea_tx.c b/drivers/net/ethernet/alibaba/eea/eea_tx.c
index 138be057611e..136a244e6304 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_tx.c
+++ b/drivers/net/ethernet/alibaba/eea/eea_tx.c
@@ -112,6 +112,13 @@ static u32 eea_clean_tx(struct eea_net_tx *tx, int budget)
ering_cq_ack_desc(tx->ering, meta->num);
}
+ if (stats.packets) {
+ u64_stats_update_begin(&tx->stats.syncp);
+ u64_stats_add(&tx->stats.bytes, stats.bytes);
+ u64_stats_add(&tx->stats.packets, stats.packets);
+ u64_stats_update_end(&tx->stats.syncp);
+ }
+
return stats.packets;
}
@@ -245,6 +252,10 @@ static int eea_tx_post_skb(struct eea_net_tx *tx, struct sk_buff *skb)
meta->num = shinfo->nr_frags + 1;
ering_sq_commit_desc(tx->ering);
+ u64_stats_update_begin(&tx->stats.syncp);
+ u64_stats_add(&tx->stats.descs, meta->num);
+ u64_stats_update_end(&tx->stats.syncp);
+
return 0;
err_cancel:
@@ -257,6 +268,10 @@ static int eea_tx_post_skb(struct eea_net_tx *tx, struct sk_buff *skb)
static void eea_tx_kick(struct eea_net_tx *tx)
{
ering_kick(tx->ering);
+
+ u64_stats_update_begin(&tx->stats.syncp);
+ u64_stats_inc(&tx->stats.kicks);
+ u64_stats_update_end(&tx->stats.syncp);
}
netdev_tx_t eea_tx_xmit(struct sk_buff *skb, struct net_device *netdev)
@@ -272,8 +287,13 @@ netdev_tx_t eea_tx_xmit(struct sk_buff *skb, struct net_device *netdev)
skb_tx_timestamp(skb);
err = eea_tx_post_skb(tx, skb);
- if (unlikely(err))
+ if (unlikely(err)) {
+ u64_stats_update_begin(&tx->stats.syncp);
+ u64_stats_inc(&tx->stats.drops);
+ u64_stats_update_end(&tx->stats.syncp);
+
dev_kfree_skb_any(skb);
+ }
/* NETDEV_TX_BUSY is expensive. So stop advancing the TX queue. */
n = MAX_SKB_FRAGS + 1;
@@ -343,6 +363,8 @@ int eea_alloc_tx(struct eea_net_init_ctx *ctx, struct eea_net_tx *tx, u32 idx)
struct eea_ring *ering;
u32 i;
+ u64_stats_init(&tx->stats.syncp);
+
sprintf(tx->name, "tx.%u", idx);
ering = ering_alloc(idx * 2 + 1, ctx->cfg.tx_ring_depth, ctx->edev,
--
2.32.0.3.g01195cf9f
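The three ethtool callbacks in this patch (get_sset_count, get_strings, get_ethtool_stats) must agree on one flat layout: all RX rings first, then all TX rings, each contributing a fixed number of u64 slots. A small sketch of that index arithmetic (counts are hypothetical, not the driver's):

```c
#include <assert.h>

#define RX_STATS_LEN 2	/* descs, kicks -- as in eea_rx_stats_desc */
#define TX_STATS_LEN 2	/* descs, kicks -- as in eea_tx_stats_desc */

/* Total u64 slots, mirroring eea_get_sset_count(). */
static int sset_count(int rx_rings, int tx_rings)
{
	return rx_rings * RX_STATS_LEN + tx_rings * TX_STATS_LEN;
}

/* Flat index of TX ring t's stat s: the whole RX block comes first,
 * matching the loop order in eea_get_ethtool_stats().
 */
static int tx_stat_index(int rx_rings, int t, int s)
{
	return rx_rings * RX_STATS_LEN + t * TX_STATS_LEN + s;
}
```

If get_strings and get_ethtool_stats ever disagree on this ordering, the names and values shown by `ethtool -S` silently pair up wrong, which is why all three derive from the same ring counts in enet->cfg.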
* [PATCH net-next v24 6/6] eea: introduce callback for ndo_get_stats64
2026-01-30 9:34 [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
` (4 preceding siblings ...)
2026-01-30 9:34 ` [PATCH net-next v24 5/6] eea: introduce ethtool support Xuan Zhuo
@ 2026-01-30 9:34 ` Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,6/6] " Jakub Kicinski
2026-01-31 0:20 ` [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Ethan Nelson-Moore
6 siblings, 1 reply; 14+ messages in thread
From: Xuan Zhuo @ 2026-01-30 9:34 UTC (permalink / raw)
To: netdev
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xuan Zhuo, Wen Gu, Philo Lu, Lorenzo Bianconi,
Vadim Fedorenko, Dong Yibo, Ethan Nelson-Moore, Heiner Kallweit,
Lukas Bulwahn, Dust Li
Add basic driver framework for the Alibaba Elastic Ethernet Adapter (EEA).
This commit introduces ndo_get_stats64 support.
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
drivers/net/ethernet/alibaba/eea/eea_net.c | 51 ++++++++++++++++++++++
drivers/net/ethernet/alibaba/eea/eea_net.h | 5 +++
2 files changed, 56 insertions(+)
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.c b/drivers/net/ethernet/alibaba/eea/eea_net.c
index bc9b384a61df..a918feaf8412 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_net.c
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.c
@@ -32,8 +32,10 @@ static int enet_bind_new_q_and_cfg(struct eea_net *enet,
enet->cfg = ctx->cfg;
+ spin_lock(&enet->stats_lock);
enet->rx = ctx->rx;
enet->tx = ctx->tx;
+ spin_unlock(&enet->stats_lock);
for (i = 0; i < ctx->cfg.rx_ring_num; i++) {
rx = ctx->rx[i];
@@ -61,11 +63,13 @@ static void eea_free_rxtx_q_mem(struct eea_net *enet)
struct eea_net_tx *tx, *tx_array;
int i;
+ spin_lock(&enet->stats_lock);
rx_array = enet->rx;
tx_array = enet->tx;
enet->rx = NULL;
enet->tx = NULL;
+ spin_unlock(&enet->stats_lock);
for (i = 0; i < enet->cfg.rx_ring_num; i++) {
rx = rx_array[i];
@@ -264,6 +268,50 @@ static int eea_netdev_open(struct net_device *netdev)
return err;
}
+static void eea_stats(struct net_device *netdev, struct rtnl_link_stats64 *tot)
+{
+ struct eea_net *enet = netdev_priv(netdev);
+ u64 packets, bytes;
+ u32 start;
+ int i;
+
+ spin_lock_bh(&enet->stats_lock);
+
+ if (enet->rx) {
+ for (i = 0; i < enet->cfg.rx_ring_num; i++) {
+ struct eea_net_rx *rx = enet->rx[i];
+
+ do {
+ start = u64_stats_fetch_begin(&rx->stats.syncp);
+ packets = u64_stats_read(&rx->stats.packets);
+ bytes = u64_stats_read(&rx->stats.bytes);
+ } while (u64_stats_fetch_retry(&rx->stats.syncp,
+ start));
+
+ tot->rx_packets += packets;
+ tot->rx_bytes += bytes;
+ }
+ }
+
+ if (enet->tx) {
+ for (i = 0; i < enet->cfg.tx_ring_num; i++) {
+ struct eea_net_tx *tx = &enet->tx[i];
+
+ do {
+ start = u64_stats_fetch_begin(&tx->stats.syncp);
+ packets = u64_stats_read(&tx->stats.packets);
+ bytes = u64_stats_read(&tx->stats.bytes);
+ } while (u64_stats_fetch_retry(&tx->stats.syncp,
+ start));
+
+ tot->tx_packets += packets;
+ tot->tx_bytes += bytes;
+ }
+ }
+
+ spin_unlock_bh(&enet->stats_lock);
+}
+
/* resources: ring, buffers, irq */
int eea_reset_hw_resources(struct eea_net *enet, struct eea_net_init_ctx *ctx)
{
@@ -462,6 +510,7 @@ static const struct net_device_ops eea_netdev = {
.ndo_stop = eea_netdev_stop,
.ndo_start_xmit = eea_tx_xmit,
.ndo_validate_addr = eth_validate_addr,
+ .ndo_get_stats64 = eea_stats,
.ndo_features_check = passthru_features_check,
.ndo_tx_timeout = eea_tx_timeout,
};
@@ -487,6 +536,8 @@ static struct eea_net *eea_netdev_alloc(struct eea_device *edev, u32 pairs)
enet->edev = edev;
edev->enet = enet;
+ spin_lock_init(&enet->stats_lock);
+
return enet;
}
diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.h b/drivers/net/ethernet/alibaba/eea/eea_net.h
index 78a621c2ce4c..f32d26e156ab 100644
--- a/drivers/net/ethernet/alibaba/eea/eea_net.h
+++ b/drivers/net/ethernet/alibaba/eea/eea_net.h
@@ -157,6 +157,11 @@ struct eea_net {
u32 speed;
u64 hw_ts_offset;
+
+ /* Protects the rx and tx pointers of struct eea_net while eea_stats()
+ * reads the per-queue stats.
+ */
+ spinlock_t stats_lock;
};
int eea_tx_resize(struct eea_net *enet, struct eea_net_tx *tx, u32 ring_num);
--
2.32.0.3.g01195cf9f
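eea_stats() reads each queue's counters under the u64_stats begin/retry protocol, so a concurrent writer can never hand the reader a torn 64-bit value on 32-bit hosts. A simplified userspace analogue of that retry loop (names are stand-ins; the real u64_stats API also inserts memory barriers and compiles the sequence counter away entirely on 64-bit SMP):

```c
#include <assert.h>
#include <stdint.h>

struct stats {
	unsigned int seq;	/* even = stable, odd = write in progress */
	uint64_t packets;
	uint64_t bytes;
};

static void write_begin(struct stats *s) { s->seq++; }
static void write_end(struct stats *s)   { s->seq++; }

static unsigned int fetch_begin(const struct stats *s)
{
	return s->seq;
}

/* Retry if a write was in flight when we started, or completed since. */
static int fetch_retry(const struct stats *s, unsigned int start)
{
	return (start & 1) || s->seq != start;
}

/* The shape of the do/while loops in eea_stats(). */
static void snapshot(const struct stats *s, uint64_t *p, uint64_t *b)
{
	unsigned int start;

	do {
		start = fetch_begin(s);
		*p = s->packets;
		*b = s->bytes;
	} while (fetch_retry(s, start));
}
```

The separate spinlock the patch adds protects something different: the lifetime of the enet->rx/enet->tx arrays themselves, which the seqcount cannot cover.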
* Re: [net-next,v24,1/6] eea: introduce PCI framework
2026-01-30 9:34 ` [PATCH net-next v24 1/6] eea: introduce PCI framework Xuan Zhuo
@ 2026-01-30 23:32 ` Jakub Kicinski
0 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2026-01-30 23:32 UTC (permalink / raw)
To: xuanzhuo
Cc: Jakub Kicinski, lorenzo, andrew+netdev, hkallweit1, dust.li,
edumazet, davem, pabeni, lulie, andrew, netdev, guwen, dong100,
lukas.bulwahn, vadim.fedorenko, enelsonmoore
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
eea: introduce PCI framework
Add basic driver framework for the Alibaba Elastic Ethernet Adapter (EEA).
This commit implements the EEA PCI probe functionality.
> diff --git a/drivers/net/ethernet/alibaba/Kconfig b/drivers/net/ethernet/alibaba/Kconfig
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/Kconfig
> @@ -0,0 +1,28 @@
> +#
> +# Alibaba network device configuration
> +#
> +
> +config NET_VENDOR_ALIBABA
> + bool "Alibaba Devices"
> + default y
> + help
> + If you have a network (Ethernet) device belonging to this class, say Y.
> +
> + Note that the answer to this question doesn't directly affect the
> + kernel: saying N will just cause the configurator to skip all
> + the questions about Alibaba devices. If you say Y, you will be asked
> + for your specific device in the following questions.
> +
> +if NET_VENDOR_ALIBABA
> +
> +config EEA
> + tristate "Alibaba Elastic Ethernet Adapter support"
> + depends on PCI_MSI
> + depends on 64BIT
> + select PAGE_POOL
> + help
> + This driver supports Alibaba Elastic Ethernet Adapter"
^
This isn't a bug, but the help text ends with an unmatched trailing quote.
This should likely be a period instead.
[ ... ]
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.c b/drivers/net/ethernet/alibaba/eea/eea_pci.c
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/eea_pci.c
[ ... ]
> +static int eea_init_device(struct eea_device *edev)
> +{
> + int err;
> +
> + err = eea_device_reset(edev);
> + if (err)
> + return err;
> +
> + eea_pci_io_set_status(edev, BIT(0) | BIT(1));
> +
> + err = eea_negotiate(edev);
> + if (err)
> + goto err;
> +
> + /* do net device probe ... */
> +
> + return 0;
> +err:
> + eea_add_status(edev, EEA_S_FAILED);
> + return err;
> +}
> +
> +static int __eea_pci_probe(struct pci_dev *pci_dev,
> + struct eea_pci_device *ep_dev)
> +{
> + int err;
> +
> + pci_set_drvdata(pci_dev, ep_dev);
> +
> + err = eea_pci_setup(pci_dev, ep_dev);
> + if (err)
> + return err;
> +
> + err = eea_init_device(&ep_dev->edev);
> + if (err)
> + goto err_pci_rel;
If eea_init_device() fails here, can this create a use-after-free issue?
The function calls pci_set_drvdata() at the beginning; if eea_init_device()
then fails, the error path jumps to err_pci_rel, which calls
eea_pci_release_resource() but doesn't clear the driver data.
Control then returns to eea_pci_probe():
eea_pci_probe()->__eea_pci_probe():
pci_set_drvdata(pci_dev, ep_dev); // sets driver data
err = eea_init_device(&ep_dev->edev);
if (err)
goto err_pci_rel; // jumps on failure
err_pci_rel:
eea_pci_release_resource(ep_dev); // doesn't clear driver data
return err; // returns to eea_pci_probe
Back in eea_pci_probe:
err = __eea_pci_probe(pci_dev, ep_dev);
if (err)
kfree(ep_dev); // frees memory, driver data
// still points to freed memory
Would it make sense to either move pci_set_drvdata() after successful
initialization, or add pci_set_drvdata(pci_dev, NULL) in the error path?
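A minimal userspace sketch of the second suggestion (clearing the driver
data on the error path), with pci_set_drvdata() stubbed by a static
pointer — all names here are illustrative stand-ins, not the driver's
real code:

```c
#include <assert.h>
#include <stdlib.h>

static void *drvdata; /* stands in for pci_dev->dev.driver_data */

static void set_drvdata(void *p) { drvdata = p; }

/* Models eea_init_device() failing; -1 stands in for a negative errno. */
static int init_device_fail(void) { return -1; }

static int probe(void)
{
    void *ep_dev = malloc(64);

    if (!ep_dev)
        return -1;

    set_drvdata(ep_dev);

    if (init_device_fail()) {
        set_drvdata(NULL); /* the proposed fix: clear before freeing */
        free(ep_dev);
        return -1;
    }
    return 0;
}
```

With the drvdata cleared before the free, nothing retains a pointer to the
freed ep_dev after the failed probe.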
> +
> + return 0;
> +
> +err_pci_rel:
> + eea_pci_release_resource(ep_dev);
> + return err;
> +}
[ ... ]
> +static int eea_pci_probe(struct pci_dev *pci_dev,
> + const struct pci_device_id *id)
> +{
> + struct eea_pci_device *ep_dev;
> + struct eea_device *edev;
> + int err;
> +
> + ep_dev = kzalloc(sizeof(*ep_dev), GFP_KERNEL);
> + if (!ep_dev)
> + return -ENOMEM;
> +
> + edev = &ep_dev->edev;
> +
> + edev->ep_dev = ep_dev;
> + edev->dma_dev = &pci_dev->dev;
> +
> + ep_dev->pci_dev = pci_dev;
> +
> + err = __eea_pci_probe(pci_dev, ep_dev);
> + if (err)
> + kfree(ep_dev);
> +
> + return err;
> +}
--
pw-bot: cr
* Re: [net-next,v24,2/6] eea: introduce ring and descriptor structures
2026-01-30 9:34 ` [PATCH net-next v24 2/6] eea: introduce ring and descriptor structures Xuan Zhuo
@ 2026-01-30 23:32 ` Jakub Kicinski
0 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2026-01-30 23:32 UTC (permalink / raw)
To: xuanzhuo
Cc: Jakub Kicinski, lorenzo, andrew+netdev, hkallweit1, dust.li,
edumazet, davem, pabeni, lulie, andrew, netdev, guwen, dong100,
lukas.bulwahn, vadim.fedorenko, enelsonmoore
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
eea: introduce ring and descriptor structures
This commit introduces the ring and descriptor implementations for the
Alibaba Elastic Ethernet Adapter (EEA). These structures and ring APIs
are used by the RX, TX, and admin queues.
> diff --git a/drivers/net/ethernet/alibaba/eea/Makefile b/drivers/net/ethernet/alibaba/eea/Makefile
> index cf2acf1733fd..e5e4007810a6 100644
> --- a/drivers/net/ethernet/alibaba/eea/Makefile
> +++ b/drivers/net/ethernet/alibaba/eea/Makefile
> @@ -1,3 +1,4 @@
>
> obj-$(CONFIG_EEA) += eea.o
> -eea-y := eea_pci.o
> +eea-y := eea_ring.o \
> + eea_pci.o
[ ... ]
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_desc.h b/drivers/net/ethernet/alibaba/eea/eea_desc.h
> new file mode 100644
> index 000000000000..541346a03375
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/eea_desc.h
> @@ -0,0 +1,156 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Driver for Alibaba Elastic Ethernet Adapter.
> + *
> + * Copyright (C) 2025 Alibaba Inc.
> + */
> +
> +#ifndef __EEA_DESC_H__
> +#define __EEA_DESC_H__
> +
> +#define EEA_DESC_TS_MASK GENMASK(47, 0)
> +#define EEA_DESC_TS(desc) (le64_to_cpu((desc)->ts) & EEA_DESC_TS_MASK)
> +
> +struct eea_aq_desc {
> + __le16 flags;
> + __le16 id;
> + __le16 reserved;
> + u8 classid;
> + u8 command;
> + __le64 data_addr;
> + __le64 reply_addr;
> + __le32 data_len;
> + __le32 reply_len;
> +};
> +
> +struct eea_aq_cdesc {
> + __le16 flags;
> + __le16 id;
> +#define EEA_OK 0
> +#define EEA_ERR 0xffffffff
> + __le32 status;
> + __le32 reply_len;
> + __le32 reserved1;
> +
> + __le64 reserved2;
> + __le64 reserved3;
> +};
[ ... ]
> +struct eea_rx_desc {
> + __le16 flags;
> + __le16 id;
> + __le16 len;
> + __le16 reserved1;
> +
> + __le64 addr;
> +
> + __le64 hdr_addr;
> + __le32 reserved2;
> + __le32 reserved3;
> +};
[ ... ]
> +struct eea_rx_cdesc {
> +#define EEA_DESC_F_DATA_VALID BIT(6)
> +#define EEA_DESC_F_SPLIT_HDR BIT(5)
> + __le16 flags;
> + __le16 id;
> + __le16 len;
> +#define EEA_NET_PT_NONE 0
> +#define EEA_NET_PT_IPv4 1
> +#define EEA_NET_PT_TCPv4 2
> +#define EEA_NET_PT_UDPv4 3
> +#define EEA_NET_PT_IPv6 4
> +#define EEA_NET_PT_TCPv6 5
> +#define EEA_NET_PT_UDPv6 6
> +#define EEA_NET_PT_IPv6_EX 7
> +#define EEA_NET_PT_TCPv6_EX 8
> +#define EEA_NET_PT_UDPv6_EX 9
> + /* [9:0] is packet type. */
> + __le16 type;
> +
> + /* hw timestamp [0:47]: ts */
> + __le64 ts;
> +
> + __le32 hash;
> +
> + /* 0-9: hdr_len split header
> + * 10-15: reserved1
> + */
> + __le16 len_ex;
> + __le16 reserved2;
> +
> + __le32 reserved3;
> + __le32 reserved4;
> +};
[ ... ]
> +struct eea_tx_desc {
> +#define EEA_DESC_F_DO_CSUM BIT(6)
> + __le16 flags;
> + __le16 id;
> + __le16 len;
> + __le16 reserved1;
> +
> + __le64 addr;
> +
> + __le16 csum_start;
> + __le16 csum_offset;
> + u8 gso_type;
> + u8 reserved2;
> + __le16 gso_size;
> + __le64 reserved3;
> +};
> +
> +struct eea_tx_cdesc {
> + __le16 flags;
> + __le16 id;
> + __le16 len;
> + __le16 reserved1;
> +
> + /* hw timestamp [0:47]: ts */
> + __le64 ts;
> + __le64 reserved2;
> + __le64 reserved3;
> +};
[ ... ]
> +struct eea_db_direct {
> + u8 kick_flags;
> + u8 reserved;
> + __le16 idx;
> +
> + __le16 tx_cq_head;
> + __le16 rx_cq_head;
> +
> + u8 desc[24];
> +};
> +
> +static_assert(sizeof(struct eea_rx_desc) == 32, "rx desc size does not match");
> +static_assert(sizeof(struct eea_rx_cdesc) == 32,
> + "rx cdesc size does not match");
> +static_assert(sizeof(struct eea_tx_desc) == 32, "tx desc size does not match");
> +static_assert(sizeof(struct eea_tx_cdesc) == 32,
> + "tx cdesc size does not match");
> +static_assert(sizeof(struct eea_db_direct) == 32,
> + "db direct size does not match");
> +#endif
This isn't a bug, but should there be similar static_assert checks for
struct eea_aq_desc and struct eea_aq_cdesc? The other DMA descriptor
structures (eea_rx_desc, eea_rx_cdesc, eea_tx_desc, eea_tx_cdesc,
eea_db_direct) all have static_assert checks to verify their 32-byte
sizes. The admin queue descriptors are also 32-byte DMA structures with
hardware-defined layouts, so adding the checks would be consistent with
the pattern used for the other descriptors.
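A sketch of the suggested asserts: uint16_t/uint32_t/uint64_t stand in for
the kernel's __le16/__le32/__le64 (same sizes), and the field layouts are
copied from the quoted patch; the real code would assert on the structs in
eea_desc.h directly:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the admin queue descriptors from eea_desc.h. */
struct eea_aq_desc {
    uint16_t flags;
    uint16_t id;
    uint16_t reserved;
    uint8_t classid;
    uint8_t command;
    uint64_t data_addr;
    uint64_t reply_addr;
    uint32_t data_len;
    uint32_t reply_len;
};

struct eea_aq_cdesc {
    uint16_t flags;
    uint16_t id;
    uint32_t status;
    uint32_t reply_len;
    uint32_t reserved1;
    uint64_t reserved2;
    uint64_t reserved3;
};

/* The consistency checks the review suggests adding. */
static_assert(sizeof(struct eea_aq_desc) == 32, "aq desc size does not match");
static_assert(sizeof(struct eea_aq_cdesc) == 32,
              "aq cdesc size does not match");
```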
[ ... ]
* Re: [net-next,v24,4/6] eea: create/destroy rx,tx queues for netdevice open and stop
2026-01-30 9:34 ` [PATCH net-next v24 4/6] eea: create/destroy rx,tx queues for netdevice open and stop Xuan Zhuo
@ 2026-01-30 23:32 ` Jakub Kicinski
2026-02-02 3:23 ` Xuan Zhuo
0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2026-01-30 23:32 UTC (permalink / raw)
To: xuanzhuo
Cc: Jakub Kicinski, lorenzo, andrew+netdev, hkallweit1, dust.li,
edumazet, davem, pabeni, lulie, andrew, netdev, guwen, dong100,
lukas.bulwahn, vadim.fedorenko, enelsonmoore
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
eea: create/destroy rx,tx queues for netdevice open and stop
This commit introduces the implementation for netdevice open and stop in
the Alibaba Elastic Ethernet Adapter (EEA) driver, including RX/TX queue
allocation, NAPI setup, and resource management.
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.c b/drivers/net/ethernet/alibaba/eea/eea_net.c
> index 65b236c412a9..4897c07a25ae 100644
> --- a/drivers/net/ethernet/alibaba/eea/eea_net.c
> +++ b/drivers/net/ethernet/alibaba/eea/eea_net.c
[ ... ]
> +static int enet_bind_new_q_and_cfg(struct eea_net *enet,
> + struct eea_net_init_ctx *ctx)
> +{
> + struct eea_net_rx *rx;
> + struct eea_net_tx *tx;
> + int i, err;
> +
> + err = netif_set_real_num_queues(enet->netdev, ctx->cfg.tx_ring_num,
> + ctx->cfg.rx_ring_num);
> + if (err)
> + return err;
> +
> + enet->cfg = ctx->cfg;
> +
> + enet->rx = ctx->rx;
> + enet->tx = ctx->tx;
> +
> + for (i = 0; i < ctx->cfg.rx_ring_num; i++) {
> + rx = ctx->rx[i];
> + tx = &ctx->tx[i];
Does this code overflow ctx->tx when rx_ring_num > tx_ring_num?
The driver reads rx_num and tx_num from separate hardware registers
(eea_pci.c:212-213), but assumes they're always equal without validation.
The ctx->tx array is allocated with tx_ring_num elements at
eea_alloc_rxtx_q_mem:89, while ctx->rx is allocated with rx_ring_num
elements at line 93. Then this loop iterates to rx_ring_num and accesses
ctx->tx[i].
If the hardware returns rx_num=4 and tx_num=2, the driver allocates ctx->tx
with 2 elements but this loop accesses ctx->tx[2] and ctx->tx[3], which are
out-of-bounds.
The same issue appears in:
- eea_alloc_rxtx_q_mem:114 accessing ctx->tx[i]
- eea_alloc_rxtx_q_mem:123 in the error path
- eea_free_rxtx_q_mem:70
- eea_queues_check_and_reset in error handling
The driver uses a "queue pairs" model, but fails to validate or enforce
this constraint on hardware-provided values.
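One hedged way to enforce the queue-pairs constraint is to clamp to the
smaller of the two hardware-provided counts before sizing and iterating
both arrays — a sketch of the idea, not the driver's actual code:

```c
#include <assert.h>

/* For a queue-pairs driver, neither loop may index past the smaller of
 * the two hardware-reported ring counts. */
static unsigned int paired_ring_num(unsigned int rx_num, unsigned int tx_num)
{
    return rx_num < tx_num ? rx_num : tx_num;
}
```

Alternatively the driver could reject mismatched counts outright at probe
time; either way the hardware-provided values get validated once instead
of being trusted in every loop.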
> +
> + rx->enet = enet;
> + tx->enet = enet;
> + }
> +
> + return 0;
> +}
[ ... ]
> +/* alloc tx/rx: struct, ring, meta, pp, napi */
> +static int eea_alloc_rxtx_q_mem(struct eea_net_init_ctx *ctx)
> +{
> + struct eea_net_cfg *cfg = &ctx->cfg;
> + struct eea_net_rx *rx;
> + struct eea_net_tx *tx;
> + int err, i;
> +
> + ctx->tx = kvcalloc(cfg->tx_ring_num, sizeof(*ctx->tx), GFP_KERNEL);
> + if (!ctx->tx)
> + return -ENOMEM;
> +
> + ctx->rx = kvcalloc(cfg->rx_ring_num, sizeof(*ctx->rx), GFP_KERNEL);
> + if (!ctx->rx) {
> + err = -ENOMEM;
> + goto err_free;
> + }
> +
> + /* modify cfg for alloc */
> + ctx->cfg.tx_cq_desc_size /= 2;
> +
> + if (!ctx->cfg.split_hdr)
> + ctx->cfg.rx_sq_desc_size /= 2;
> +
> + for (i = 0; i < ctx->cfg.rx_ring_num; i++) {
> + rx = eea_alloc_rx(ctx, i);
> + if (!rx)
> + goto err_free;
> +
> + ctx->rx[i] = rx;
> +
> + tx = ctx->tx + i;
> + err = eea_alloc_tx(ctx, tx, i);
> + if (err)
> + goto err_free;
> + }
[ ... ]
> +static int eea_netdev_open(struct net_device *dev)
> +{
> + struct eea_net *enet = netdev_priv(dev);
> + struct eea_net_init_ctx ctx;
> + int err;
> +
> + enet_init_ctx(enet, &ctx);
> +
> + err = eea_alloc_rxtx_q_mem(&ctx);
> + if (err)
> + return err;
> +
> + err = enet_bind_new_q_and_cfg(enet, &ctx);
> + if (err)
> + goto err_free_q;
> +
> + err = eea_active_ring_and_irq(enet);
> + if (err)
> + goto err_free_q;
> +
> + eea_start_rxtx(enet);
> +
> + return 0;
> +
> +err_free_q:
> + eea_free_rxtx_q_mem(enet);
Does this code leak the newly allocated queues when enet_bind_new_q_and_cfg
fails?
The function allocates new queue resources in ctx at line 244 via
eea_alloc_rxtx_q_mem(&ctx). When enet_bind_new_q_and_cfg fails at line 248,
it goes to err_free_q which calls eea_free_rxtx_q_mem(enet).
However, eea_free_rxtx_q_mem operates on enet->rx and enet->tx, not on
ctx->rx and ctx->tx. When bind fails, the resources remain in ctx and are
never transferred to enet. The function eea_free_rxtx_q_mem will either free
old queues if the device was previously opened, or do nothing if enet->rx
and enet->tx are NULL (first open). In either case, the newly allocated
ctx->rx and ctx->tx are leaked.
The allocated resources include ring buffers, DMA allocations, page pools,
and NAPI structures.
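The ownership issue can be modeled in a few lines of userspace C, with the
queue allocations tracked by a counter (all names are illustrative): when
bind fails, the free must go through ctx, which still owns the queues:

```c
#include <assert.h>

/* Counts live queue-memory allocations; stands in for the rings, page
 * pools and NAPI structures allocated into ctx. */
static int live_allocs;

static void alloc_rxtx_q_mem(void) { live_allocs++; }
static void free_rxtx_q_mem(void)  { live_allocs--; }

/* bind_fails models enet_bind_new_q_and_cfg() returning an error before
 * ownership of ctx->rx/ctx->tx has transferred to enet. */
static int netdev_open(int bind_fails)
{
    alloc_rxtx_q_mem();        /* eea_alloc_rxtx_q_mem(&ctx) */

    if (bind_fails) {
        free_rxtx_q_mem();     /* fix: free via ctx, the current owner */
        return -1;
    }
    return 0;                  /* ownership now belongs to enet */
}
```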
> +
> + return err;
> +}
[ ... ]
> +static int eea_reset_hw_resources(struct eea_net *enet)
> +{
> + struct eea_net_init_ctx ctx;
> + int err;
> +
> + enet_init_ctx(enet, &ctx);
> +
> + err = eea_alloc_rxtx_q_mem(&ctx);
> + if (err)
> + return err;
> +
> + eea_netdev_stop(enet->netdev);
> +
> + /* We skip resource cleanup on error paths in this function. Resources
> + * are deferred for later teardown when the netdev is closed or the
> + * driver unloads. This approach is deliberate: it simplifies the error
> + * handling and avoids the complexity of undoing partial operations
> + * during recovery.
> + */
> +
> + err = enet_bind_new_q_and_cfg(enet, &ctx);
> + if (err) {
> + netdev_err(enet->netdev,
> + "eea reset: bind new queues failed. err %d\n",
> + err);
> +
> + return err;
Does this code leak the newly allocated queues when enet_bind_new_q_and_cfg
fails?
The function allocates new queue resources in ctx at line 277 via
eea_alloc_rxtx_q_mem(&ctx), then calls eea_netdev_stop at line 285 which
frees the OLD enet->rx and enet->tx and sets them to NULL. When
enet_bind_new_q_and_cfg fails at line 294, it returns early without
transferring ownership of ctx->rx and ctx->tx to enet.
The error path at line 300 returns without freeing the newly allocated
ctx->rx and ctx->tx, causing a memory leak. These resources include ring
buffers, DMA allocations, page pools, and NAPI structures.
The ctx variable is stack-allocated, so when the function returns, the
pointers ctx->rx and ctx->tx are lost forever, orphaning the allocated
memory.
> + }
> +
> + err = eea_active_ring_and_irq(enet);
> + if (err) {
> + netdev_err(enet->netdev,
> + "eea reset: active new ring and irq failed. err %d\n",
> + err);
> + return err;
Does this code leak the newly allocated queues when eea_active_ring_and_irq
fails?
In this scenario:
1. New queues allocated in ctx at line 277
2. Old queues freed by eea_netdev_stop at line 285
3. enet_bind_new_q_and_cfg succeeds at line 294, transferring ctx->rx and
ctx->tx to enet->rx and enet->tx
4. eea_active_ring_and_irq fails at line 303
5. Function returns at line 308 without freeing enet->rx and enet->tx
The newly allocated and transferred resources remain in enet->rx and
enet->tx but are never freed. The comment at lines 287-292 claims resources
are deferred for cleanup, but there is no mechanism to track these orphaned
resources for later cleanup.
> + }
> +
> + eea_start_rxtx(enet);
> +
> + return 0;
> +}
[ ... ]
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.c b/drivers/net/ethernet/alibaba/eea/eea_pci.c
> index 8e9dd0f06f64..bf1b7e68fa45 100644
> --- a/drivers/net/ethernet/alibaba/eea/eea_pci.c
> +++ b/drivers/net/ethernet/alibaba/eea/eea_pci.c
[ ... ]
> @@ -336,6 +507,8 @@ static int eea_pci_probe(struct pci_dev *pci_dev, const struct pci_device_id *e
> +
> + ep_dev->pci_dev = pci_dev;
> + mutex_init(&edev->ha_lock);
> + err = __eea_pci_probe(pci_dev, ep_dev);
This isn't a bug, but should mutex_destroy() be called for ha_lock?
The mutex is initialized in eea_pci_probe() at line 510 with
mutex_init(&edev->ha_lock), but it is never destroyed with mutex_destroy()
before the structure is freed.
In the success path, eea_pci_remove() calls kfree(ep_dev) at line 530
without destroying the mutex.
In the error path of eea_pci_probe(), if __eea_pci_probe() fails at line
512, the code calls kfree(ep_dev) at line 514 without destroying the mutex.
While mutex_destroy() is a no-op when CONFIG_DEBUG_MUTEXES is disabled, it
is still considered best practice in the kernel to pair mutex_init() with
mutex_destroy(). Other networking drivers such as mlxsw consistently call
mutex_destroy() before freeing structures containing mutexes.
> + if (err) {
> + kfree(ep_dev);
> + return err;
> + }
[ ... ]
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.h b/drivers/net/ethernet/alibaba/eea/eea_pci.h
> index 696d82c3fe7d..91cb0be2d25b 100644
> --- a/drivers/net/ethernet/alibaba/eea/eea_pci.h
> +++ b/drivers/net/ethernet/alibaba/eea/eea_pci.h
[ ... ]
> @@ -47,7 +54,14 @@ struct eea_device_reset {
> void eea_pci_reset_finished_callback(struct eea_device *edev);
> void eea_pci_remove_nets(struct eea_device *edev);
> int eea_pci_probe_nets(struct eea_device *edev);
> +void eea_pci_device_update_ts_off(struct eea_device *edev, u64 ts_off);
> u64 eea_pci_device_ts(struct eea_device *edev);
> +int eea_pci_set_affinity(struct eea_ring *ering,
> + const struct cpumask *cpu_mask);
This isn't a bug, but eea_pci_set_affinity() is declared here but never
implemented or called.
The function is declared at line 64-65 of eea_pci.h, but there is no
implementation of this function in any .c file in drivers/net/ethernet/
alibaba/eea/, and the function is not called anywhere in the codebase.
This could be dead code that should be removed, or a forward declaration for
a future implementation that was forgotten. Given that this is part of a new
driver introduction, this might be intentional preparation for future
functionality. However, it's generally better practice to add function
declarations when they are actually implemented and used, rather than adding
them speculatively.
> +void eea_pci_free_nets(struct eea_device *edev);
> +
> +int eea_pci_set_adminq_irq_affinity(struct eea_device *edev,
> + const struct cpumask *cpu_mask);
> +int eea_pci_adminq_irq_num(struct eea_device *edev);
>
> #endif
[ ... ]
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_rx.c b/drivers/net/ethernet/alibaba/eea/eea_rx.c
> new file mode 100644
> index 000000000000..3a7c06cdb89c
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/eea_rx.c
[ ... ]
> +static int eea_alloc_rx_hdr(struct eea_net_rx *rx, struct eea_net_cfg *cfg)
> +{
> + struct eea_rx_meta *meta;
> + struct page *hdr_page;
> + struct device *dmadev;
> + u32 offset, i, sz;
> + dma_addr_t dma;
> + int err;
> +
> + dmadev = ering_get_dma_device(rx->ering);
> + sz = meta_align_offset(cfg);
> + hdr_page = NULL;
> + dma = 0;
> + offset = PAGE_SIZE;
> +
> + for (i = 0; i < cfg->rx_ring_depth; i++) {
> + meta = &rx->meta[i];
> +
> + if (!hdr_page || (PAGE_SIZE - offset) < sz) {
> + hdr_page = alloc_page(GFP_KERNEL);
> + if (!hdr_page) {
> + err = -ENOMEM;
> + goto err_free;
> + }
> +
> + dma = dma_map_page(dmadev, hdr_page, 0, PAGE_SIZE,
> + DMA_FROM_DEVICE);
> + if (dma_mapping_error(dmadev, dma)) {
> + __free_page(hdr_page);
> + err = -ENOMEM;
> + goto err_free;
> + }
> +
> + offset = 0;
> + meta->hdr_page = hdr_page;
> + meta->dma = dma;
This isn't a bug, but line 175 incorrectly assigns the header page DMA
address to the meta->dma field.
The meta->dma field is intended for the data page DMA address (set by
eea_alloc_rx_buffer at line 97), while meta->hdr_dma (correctly set at line
178) is for the header page DMA address. In the code:
dma = dma_map_page(dmadev, hdr_page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
...
meta->hdr_page = hdr_page;
meta->dma = dma; // WRONG: overwrites data page DMA with header page DMA
...
meta->hdr_dma = dma + offset; // CORRECT: sets header DMA
While this doesn't cause a functional bug (because eea_alloc_rx_buffer later
overwrites meta->dma with the correct value), it violates the semantic
meaning of struct fields, creates confusion about field purposes, and
temporarily leaves meta->dma with a wrong value between eea_alloc_rx_hdr and
eea_alloc_rx_buffer.
> + }
> +
> + meta->hdr_dma = dma + offset;
> + meta->hdr_offset = offset;
> +
> + offset += sz;
> + }
[ ... ]
> +static int eea_harden_check_overflow(struct eea_rx_ctx *ctx,
> + struct eea_net *enet)
> +{
> + if (unlikely(ctx->len > ctx->meta->truesize - ctx->meta->room)) {
> + pr_debug("%s: rx error: len %u exceeds truesize %u\n",
> + __func__, ctx->len, ctx->meta->truesize);
> + eea_rx_meta_put(ctx->meta);
> + return -EINVAL;
> + }
Does this code use an uninitialized field for buffer overflow validation?
The struct eea_rx_meta field 'room' is never assigned anywhere in the code.
Since meta is allocated with kvcalloc, room is always 0.
In eea_rx_post (line 480-510), the code calculates:
tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
headroom = rx->headroom;
room = headroom + tailroom;
len = PAGE_SIZE - meta->offset - room;
meta->truesize = len + room;
The local variable 'room' is used but never stored in meta->room. This
causes the validation check to be incorrect:
Current behavior (room=0):
ctx->len > ctx->meta->truesize - 0
ctx->len > (len + room) // allows receiving up to full truesize
Intended behavior (room set correctly):
ctx->len > ctx->meta->truesize - room
ctx->len > (len + room) - room
ctx->len > len // correctly validates against usable buffer size
With room=0, the hardware could send up to (len + room) bytes, potentially
overflowing into the headroom/tailroom reserved space. Since this is a
hardening check meant to catch misbehaving hardware, the incorrect
validation defeats its purpose.
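The effect of the unset field can be demonstrated with a small standalone
model of the check (the sizes below are illustrative, not taken from the
hardware spec):

```c
#include <assert.h>
#include <stdbool.h>

struct rx_meta {
    unsigned int truesize; /* usable len + room, as set in eea_rx_post */
    unsigned int room;     /* headroom + tailroom; stays 0 if never assigned */
};

/* Mirrors eea_harden_check_overflow(): reject a frame longer than the
 * usable data area (truesize - room). */
static bool len_overflows(unsigned int len, const struct rx_meta *meta)
{
    return len > meta->truesize - meta->room;
}
```

With room left at 0 the check only rejects frames longer than the full
truesize, so anything that spills into the reserved headroom/tailroom
still passes.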
> +
> + return 0;
> +}
[ ... ]
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_tx.c b/drivers/net/ethernet/alibaba/eea/eea_tx.c
> new file mode 100644
> index 000000000000..79cc9baaebe9
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/eea_tx.c
[ ... ]
> +static void eea_meta_free_xmit(struct eea_net_tx *tx, struct eea_tx_meta *meta,
> + int budget)
> +{
> + struct eea_net_stats *stats = &tx->stats;
> +
> + ++stats->packets;
> + stats->bytes += meta->skb->len;
> + napi_consume_skb(meta->skb, budget);
The second parameter to napi_consume_skb() should be a boolean indicating
NAPI context (0 or non-zero), not the actual budget value.
Jakub Kicinski raised this concern in v21 review, referencing commit
cd18e8ac030e646 which added kdoc clarifying this. The current code passes
'budget' (which could be any positive integer like 64) instead of just 1 to
indicate NAPI context. While the function still works correctly (any
non-zero value indicates NAPI context), using the budget value is
semantically incorrect.
Reference: https://lore.kernel.org/netdev/20260121170658.371e0e63@kernel.org/
> + meta->skb = NULL;
> +
> + eea_tx_meta_put_and_unmap(tx, meta);
> +}
[ ... ]
> +static void eea_clean_tx(struct eea_net_tx *tx, int budget)
> +{
> + const struct eea_tx_desc *desc;
> + struct eea_tx_meta *meta;
> +
> + while ((desc = ering_cq_get_desc(tx->ering))) {
> + meta = &tx->meta[le16_to_cpu(desc->id)];
> +
> + if (meta->data) {
> + meta->num = le16_to_cpu(desc->num);
> +
> + eea_meta_free_xmit(tx, meta, budget);
> + } else {
> + netdev_err(tx->enet->netdev,
> + "tx meta->data is null. id %d num: %d\n",
> + meta->id, meta->num);
> + }
> +
> + ering_cq_ack_desc(tx->ering, meta->num);
Can this code cause an infinite loop on invalid completion descriptor?
When hardware returns a completion descriptor with an ID pointing to a meta
that has NULL data (line 103 check fails), the code logs an error (lines
107-109) but still calls ering_cq_ack_desc() with meta->num at line 112.
If the meta was never properly posted (or was already cleaned up), meta->num
will be 0 (from kvcalloc zero-initialization at line 360 in eea_alloc_tx).
Calling ering_cq_ack_desc() with num=0 acknowledges zero descriptors, making
no progress in the completion queue.
This causes the while loop (line 100) to read the same invalid completion
descriptor on the next iteration, logging the same error and ACKing 0
descriptors again, resulting in an infinite loop that hangs the TX
completion processing:
Line 100: while ((desc = ering_cq_get_desc(tx->ering)))
Line 101: meta = &tx->meta[le16_to_cpu(desc->id)];
Line 103: if (meta->data) check fails
Line 107-109: Log error
Line 112: ering_cq_ack_desc(tx->ering, meta->num) with num=0
Loop back to line 100: Same descriptor still pending
While this error condition should not normally happen, when it does occur
due to hardware/firmware bugs or race conditions, the driver should handle
it gracefully rather than hanging.
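A toy model of the completion loop shows why ACKing a guaranteed-nonzero
count (or breaking out of the loop) on the error path restores forward
progress — illustrative only, not the driver's ring implementation:

```c
#include <assert.h>

struct cq {
    unsigned int head;
    unsigned int pending; /* completions waiting to be consumed */
};

/* Drains the queue and returns how many loop iterations were needed.
 * ACKing max(num, 1) guarantees progress even when the meta entry was
 * zero-initialized and reports num == 0. */
static unsigned int drain(struct cq *cq, unsigned int reported_num)
{
    unsigned int iters = 0;

    while (cq->pending) {
        unsigned int num = reported_num ? reported_num : 1; /* the guard */

        cq->pending -= num > cq->pending ? cq->pending : num;
        cq->head += num;
        iters++;
    }
    return iters;
}
```

Without the guard, a reported num of 0 would leave cq->pending unchanged
and the loop would spin forever on the same descriptor.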
> + }
> +}
[ ... ]
> +static struct eea_tx_meta *eea_tx_desc_fill(struct eea_net_tx *tx,
> + dma_addr_t addr, u32 size,
> + bool is_last, struct sk_buff *skb,
> + u32 flags)
> +{
> + struct eea_tx_desc *desc;
> + struct eea_tx_meta *meta;
> +
> + meta = eea_tx_meta_get(tx);
> +
> + desc = ering_sq_alloc_desc(tx->ering, meta->id, is_last, flags);
Can this code dereference a NULL pointer?
eea_tx_meta_get() can return NULL when the free list is exhausted (tx->free
is NULL, as seen in lines 40-41 of eea_tx_meta_get). When NULL is returned,
line 182 unconditionally dereferences meta->id without checking if meta is
NULL, causing a kernel crash.
The TX queue flow control check (netif_txq_maybe_stop) happens AFTER packet
posting in eea_tx_xmit (line 280), not before. When the ring is under heavy
load with fragmented packets, the meta free list can be exhausted before the
queue is stopped.
The meta array has cfg.tx_ring_depth entries (allocated at line 360 in
eea_alloc_tx). Each packet can consume 1 + nr_frags meta entries (up to
MAX_SKB_FRAGS + 1). With small ring depths and many large fragmented packets
in flight, the free list can be completely exhausted.
Call path:
Network stack -> eea_tx_xmit -> eea_tx_post_skb (line 274) ->
eea_tx_desc_fill (line 231) -> eea_tx_meta_get (line 180) returns NULL ->
line 182 crashes.
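A sketch of the suggested NULL check, with simplified stand-in types for
the driver's structures (names are illustrative):

```c
#include <assert.h>
#include <stddef.h>

struct tx_meta {
    int id;
    struct tx_meta *next;
};

static struct tx_meta *meta_get(struct tx_meta **free_list)
{
    struct tx_meta *meta = *free_list;

    if (meta)
        *free_list = meta->next;
    return meta; /* may be NULL when the free list is exhausted */
}

/* Returns 0 on success, -1 when no meta was available; the caller can
 * then stop the queue and drop or requeue the packet. */
static int desc_fill(struct tx_meta **free_list, int *out_id)
{
    struct tx_meta *meta = meta_get(free_list);

    if (!meta)
        return -1; /* the check missing from the reviewed code */
    *out_id = meta->id;
    return 0;
}
```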
> +
> + meta->data = (void *)skb;
> + meta->dma = addr;
> + meta->desc = desc;
> +
> + return meta;
> +}
[ ... ]
> +static int eea_tx_post_skb(struct sk_buff *skb, struct eea_net_tx *tx,
> + int flags)
> +{
> + dma_addr_t addr;
> + u32 hlen;
> + struct skb_shared_info *shinfo = skb_shinfo(skb);
> + struct eea_tx_meta *meta;
> + int err, last = 0;
> + struct device *dmadev;
> +
> + dmadev = ering_get_dma_device(tx->ering);
> +
> + hlen = skb_headlen(skb);
> +
> + addr = dma_map_single(dmadev, skb->data, hlen, DMA_TO_DEVICE);
> + if (dma_mapping_error(dmadev, addr))
> + return -ENOMEM;
> +
> + meta = eea_tx_desc_fill(tx, addr, hlen, !shinfo->nr_frags, skb, flags);
> +
> + if (eea_fill_desc_from_skb(skb, tx->ering, meta->desc))
Can this code dereference a NULL pointer?
eea_tx_desc_fill() at line 231 can return NULL when the meta free list is
exhausted (root cause explained earlier). Line 233 unconditionally
dereferences meta->desc without checking if meta is NULL.
This is a consequence of the same underlying issue: the TX queue flow
control happens after packet posting rather than before.
Call path:
eea_tx_xmit (line 274) -> eea_tx_post_skb -> eea_tx_desc_fill (line 231)
returns NULL -> line 233 crashes accessing meta->desc.
> + goto err_cancel;
[ ... ]
* Re: [net-next,v24,6/6] eea: introduce callback for ndo_get_stats64
2026-01-30 9:34 ` [PATCH net-next v24 6/6] eea: introduce callback for ndo_get_stats64 Xuan Zhuo
@ 2026-01-30 23:32 ` Jakub Kicinski
2026-02-02 2:09 ` Xuan Zhuo
0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2026-01-30 23:32 UTC (permalink / raw)
To: xuanzhuo
Cc: Jakub Kicinski, lorenzo, andrew+netdev, hkallweit1, dust.li,
edumazet, davem, pabeni, lulie, andrew, netdev, guwen, dong100,
lukas.bulwahn, vadim.fedorenko, enelsonmoore
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
eea: introduce callback for ndo_get_stats64
This commit adds ndo_get_stats64 support to the Alibaba Elastic Ethernet
Adapter driver and introduces a spinlock to protect concurrent access to
rx and tx pointers when gathering statistics.
The commit message says "Add basic driver framework for the Alibaba
Elastic Ethernet Adapter(EEA)." But shouldn't this say it's adding stats
support to an existing framework rather than adding the basic framework
itself? The other commits in the series appear to have already introduced
the framework.
This commit introduces a spinlock to protect against races between
eea_stats() and the queue setup/teardown functions. The race being fixed
was introduced in commit cf49df362e30 ("eea: create/destroy rx,tx queues
for netdevice open and stop"). Should this commit include a Fixes: tag?
Fixes: cf49df362e30 ("eea: create/destroy rx,tx queues for netdevice open and stop")
* Re: [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor
2026-01-30 9:34 [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
` (5 preceding siblings ...)
2026-01-30 9:34 ` [PATCH net-next v24 6/6] eea: introduce callback for ndo_get_stats64 Xuan Zhuo
@ 2026-01-31 0:20 ` Ethan Nelson-Moore
6 siblings, 0 replies; 14+ messages in thread
From: Ethan Nelson-Moore @ 2026-01-31 0:20 UTC (permalink / raw)
To: Xuan Zhuo
Cc: netdev, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Wen Gu, Philo Lu, Lorenzo Bianconi,
Vadim Fedorenko, Dong Yibo, Heiner Kallweit, Lukas Bulwahn,
Dust Li
On Fri, Jan 30, 2026 at 1:34 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> Add a driver framework for EEA that will be available in the future.
Hi, Xuan,
Would you please remove me from the CC list on future patches? I think
I am only getting CC'd because I recently edited
drivers/net/ethernet/Kconfig.
Ethan
* Re: [net-next,v24,6/6] eea: introduce callback for ndo_get_stats64
2026-01-30 23:32 ` [net-next,v24,6/6] " Jakub Kicinski
@ 2026-02-02 2:09 ` Xuan Zhuo
0 siblings, 0 replies; 14+ messages in thread
From: Xuan Zhuo @ 2026-02-02 2:09 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Jakub Kicinski, lorenzo, andrew+netdev, hkallweit1, dust.li,
edumazet, davem, pabeni, lulie, andrew, netdev, guwen, dong100,
lukas.bulwahn, vadim.fedorenko, enelsonmoore
On Fri, 30 Jan 2026 15:32:45 -0800, Jakub Kicinski <kuba@kernel.org> wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> eea: introduce callback for ndo_get_stats64
>
> This commit adds ndo_get_stats64 support to the Alibaba Elastic Ethernet
> Adapter driver and introduces a spinlock to protect concurrent access to
> rx and tx pointers when gathering statistics.
>
> The commit message says "Add basic driver framework for the Alibaba
> Elastic Ethernet Adapter(EEA)." But shouldn't this say it's adding stats
> support to an existing framework rather than adding the basic framework
> itself? The other commits in the series appear to have already introduced
> the framework.
Will update the commit log.
>
> This commit introduces a spinlock to protect against races between
> eea_stats() and the queue setup/teardown functions. The race being fixed
> was introduced in commit cf49df362e30 ("eea: create/destroy rx,tx queues
> for netdevice open and stop"). Should this commit include a Fixes: tag?
This is not a fix; before this commit there was no race problem.
These accesses are protected by rtnl.
Thanks.
>
> Fixes: cf49df362e30 ("eea: create/destroy rx,tx queues for netdevice open and stop")
* Re: [net-next,v24,4/6] eea: create/destroy rx,tx queues for netdevice open and stop
2026-01-30 23:32 ` [net-next,v24,4/6] " Jakub Kicinski
@ 2026-02-02 3:23 ` Xuan Zhuo
0 siblings, 0 replies; 14+ messages in thread
From: Xuan Zhuo @ 2026-02-02 3:23 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Jakub Kicinski, lorenzo, andrew+netdev, hkallweit1, dust.li,
edumazet, davem, pabeni, lulie, andrew, netdev, guwen, dong100,
lukas.bulwahn, vadim.fedorenko, enelsonmoore
On Fri, 30 Jan 2026 15:32:43 -0800, Jakub Kicinski <kuba@kernel.org> wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> eea: create/destroy rx,tx queues for netdevice open and stop
>
> This commit introduces the implementation for netdevice open and stop in
> the Alibaba Elastic Ethernet Adapter (EEA) driver, including RX/TX queue
> allocation, NAPI setup, and resource management.
[...]
>
>
> > + }
> > +
> > + err = eea_active_ring_and_irq(enet);
> > + if (err) {
> > + netdev_err(enet->netdev,
> > + "eea reset: active new ring and irq failed. err %d\n",
> > + err);
> > + return err;
>
> Does this code leak the newly allocated queues when eea_active_ring_and_irq
> fails?
>
> In this scenario:
> 1. New queues allocated in ctx at line 277
> 2. Old queues freed by eea_netdev_stop at line 285
> 3. enet_bind_new_q_and_cfg succeeds at line 294, transferring ctx->rx and
> ctx->tx to enet->rx and enet->tx
> 4. eea_active_ring_and_irq fails at line 303
> 5. Function returns at line 308 without freeing enet->rx and enet->tx
>
> The newly allocated and transferred resources remain in enet->rx and
> enet->tx but are never freed. The comment at lines 287-292 claims resources
> are deferred for cleanup, but there is no mechanism to track these orphaned
> resources for later cleanup.
The netdev close path handles freeing these resources.
>
> [ ... ]
>
> > +static void eea_meta_free_xmit(struct eea_net_tx *tx, struct eea_tx_meta *meta,
> > + int budget)
> > +{
> > + struct eea_net_stats *stats = &tx->stats;
> > +
> > + ++stats->packets;
> > + stats->bytes += meta->skb->len;
> > + napi_consume_skb(meta->skb, budget);
>
> The second parameter to napi_consume_skb() should be a boolean indicating
> NAPI context (0 or non-zero), not the actual budget value.
>
> Jakub Kicinski raised this concern in v21 review, referencing commit
> cd18e8ac030e646 which added kdoc clarifying this. The current code passes
> 'budget' (which could be any positive integer like 64) instead of just 1 to
> indicate NAPI context. While the function still works correctly (any
> non-zero value indicates NAPI context), using the budget value is
> semantically incorrect.
>
> Reference: https://lore.kernel.org/netdev/20260121170658.371e0e63@kernel.org/
I think the budget parameter should be passed as the int value from the NAPI context.
>
> > + meta->skb = NULL;
> > +
> > + eea_tx_meta_put_and_unmap(tx, meta);
> > +}
>
> [ ... ]
>
> [ ... ]
>
> > +static struct eea_tx_meta *eea_tx_desc_fill(struct eea_net_tx *tx,
> > + dma_addr_t addr, u32 size,
> > + bool is_last, struct sk_buff *skb,
> > + u32 flags)
> > +{
> > + struct eea_tx_desc *desc;
> > + struct eea_tx_meta *meta;
> > +
> > + meta = eea_tx_meta_get(tx);
> > +
> > + desc = ering_sq_alloc_desc(tx->ering, meta->id, is_last, flags);
>
> Can this code dereference a NULL pointer?
The validation performed after packet posting is sufficient, as large fragmented
packets do not affect this check. The case below follows the same principle.
Thanks.
>
> eea_tx_meta_get() can return NULL when the free list is exhausted (tx->free
> is NULL, as seen in lines 40-41 of eea_tx_meta_get). When NULL is returned,
> line 182 unconditionally dereferences meta->id without checking if meta is
> NULL, causing a kernel crash.
>
> The TX queue flow control check (netif_txq_maybe_stop) happens AFTER packet
> posting in eea_tx_xmit (line 280), not before. When the ring is under heavy
> load with fragmented packets, the meta free list can be exhausted before the
> queue is stopped.
>
> The meta array has cfg.tx_ring_depth entries (allocated at line 360 in
> eea_alloc_tx). Each packet can consume 1 + nr_frags meta entries (up to
> MAX_SKB_FRAGS + 1). With small ring depths and many large fragmented packets
> in flight, the free list can be completely exhausted.
>
> Call path:
> Network stack -> eea_tx_xmit -> eea_tx_post_skb (line 274) ->
> eea_tx_desc_fill (line 231) -> eea_tx_meta_get (line 180) returns NULL ->
> line 182 crashes.
>
> > +
> > + meta->data = (void *)skb;
> > + meta->dma = addr;
> > + meta->desc = desc;
> > +
> > + return meta;
> > +}
>
> [ ... ]
>
> > +static int eea_tx_post_skb(struct sk_buff *skb, struct eea_net_tx *tx,
> > + int flags)
> > +{
> > + dma_addr_t addr;
> > + u32 hlen;
> > + struct skb_shared_info *shinfo = skb_shinfo(skb);
> > + struct eea_tx_meta *meta;
> > + int err, last = 0;
> > + struct device *dmadev;
> > +
> > + dmadev = ering_get_dma_device(tx->ering);
> > +
> > + hlen = skb_headlen(skb);
> > +
> > + addr = dma_map_single(dmadev, skb->data, hlen, DMA_TO_DEVICE);
> > + if (dma_mapping_error(dmadev, addr))
> > + return -ENOMEM;
> > +
> > + meta = eea_tx_desc_fill(tx, addr, hlen, !shinfo->nr_frags, skb, flags);
> > +
> > + if (eea_fill_desc_from_skb(skb, tx->ering, meta->desc))
>
> Can this code dereference a NULL pointer?
>
> eea_tx_desc_fill() at line 231 can return NULL when the meta free list is
> exhausted (root cause explained earlier). Line 233 unconditionally
> dereferences meta->desc without checking if meta is NULL.
>
> This is a consequence of the same underlying issue: the TX queue flow
> control happens after packet posting rather than before.
>
> Call path:
> eea_tx_xmit (line 274) -> eea_tx_post_skb -> eea_tx_desc_fill (line 231)
> returns NULL -> line 233 crashes accessing meta->desc.
>
> > + goto err_cancel;
>
> [ ... ]
end of thread, other threads:[~2026-02-02 3:31 UTC | newest]
Thread overview: 14+ messages
2026-01-30 9:34 [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 1/6] eea: introduce PCI framework Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,1/6] " Jakub Kicinski
2026-01-30 9:34 ` [PATCH net-next v24 2/6] eea: introduce ring and descriptor structures Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,2/6] " Jakub Kicinski
2026-01-30 9:34 ` [PATCH net-next v24 3/6] eea: probe the netdevice and create adminq Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 4/6] eea: create/destroy rx,tx queues for netdevice open and stop Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,4/6] " Jakub Kicinski
2026-02-02 3:23 ` Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 5/6] eea: introduce ethtool support Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 6/6] eea: introduce callback for ndo_get_stats64 Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,6/6] " Jakub Kicinski
2026-02-02 2:09 ` Xuan Zhuo
2026-01-31 0:20 ` [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Ethan Nelson-Moore