From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93BD5CD3439 for ; Thu, 7 May 2026 00:23:52 +0000 (UTC) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id C066E4025A; Thu, 7 May 2026 02:23:51 +0200 (CEST) Received: from mail-dy1-f173.google.com (mail-dy1-f173.google.com [74.125.82.173]) by mails.dpdk.org (Postfix) with ESMTP id E6D8B400D7 for ; Thu, 7 May 2026 02:23:49 +0200 (CEST) Received: by mail-dy1-f173.google.com with SMTP id 5a478bee46e88-2bdcf5970cdso250316eec.0 for ; Wed, 06 May 2026 17:23:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20251104.gappssmtp.com; s=20251104; t=1778113429; x=1778718229; darn=dpdk.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=C5XSiQarlgwB3o4//jMhIhNdFRe/n5DjJyzGnZJmIoQ=; b=ZcmxtUSvMggO/VUnforG4PGUzeUJb50ibqvotdxHMOC3C49VP0yip4OoH96JfSUo2x aoDNmlbGbx9hryhRJjHCA6v3bUzl2GPrymOG9RXQe/PXNbi6VfDbg6uuPXMZ6TiP3CUh gTILj9YKrOH6cSmPCwvd9qtOZYS3CU3QHOmZrv7B1/C1LkIJo1RyHhBYzivzOS2F3FtR ucXlU/kAitZLb22He8BGCVpwIAuibEX4apqtTV1KbywpfzPqm0dU5o8ef7/oTclsUhq3 LCEwEceT0ApGpaNVrxCgubhROhbCK37aqUtFraZmI0A4zGYk0HQahg7hugmqvROF4Ir4 vNWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1778113429; x=1778718229; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=C5XSiQarlgwB3o4//jMhIhNdFRe/n5DjJyzGnZJmIoQ=; b=BNF/J1T3KBFz/2SHyQBY/wU41VX5Yrj5sOCVR9dG8mwTu+8fOICffhyCtQ6zcHTGa1 X2xcFr7helMOSYtaEcQCMUwkr+vN5685ORp4lO9NZVylIzJEwPftu0WcgGfMjhyiUIvz VOQbKt5Vb/KimgMn6ogOQrdiP7w6Tv/FcAs0/EJjgWKiAvRji6LZyfhS7wZ8BP5J7g12 1PozxRO+ADLxLd4svM24TB7Fq6X3XF8nwJDVaILYZ+tQajGTWqP8K3ENQu2Rw67bnQY7 eVTeR4Z4PS6CHk7Fwuqy3SHavBQ7veFKIQmpwcjuBLn0oiZb3L3WsOMcRv7pUoOCJoch h/gQ== X-Gm-Message-State: AOJu0YypFJziEz7R7Qfft7vq2NRQBDyi29e0nkZOVeHUHO8qwf8AMDDG Nd8Q74euRbg5pjqrMr8uoIRVBFzdEK5KKB1Mcq+JFtGoxqQ0fpeINipeEec4wkwlHBouxeJUkE8 KBe2d X-Gm-Gg: AeBDievEIAvuAgUcfRTweO9IcNJisyq3c6asIbjRawtIrNn+c+iwz6A7V/hpiVRIcKE TtCe0bGrkenkO1iVWk2qsLRAkndOn7esBXIZOJVJYMbAc99X23aLZOrKG5xnWNybjXgXgx475hz EzHMGmQ+luMBKRdknKEMyIzSIIz+LNmW2IB0Jpw/qEAua/VE9umtODpyZh46E7XLUjmPqwb7bbZ uYmNDLwydvhbZtWmEBe29Br/1W0swXcfWgRSqmH3dXC2ryI5P31d+WWC5STfXuXYEimYjGgN07f 20D2jZ0vMNHSvDyAUdNhO/Ob7qaFa5Z1mdJj54EvBVEsM1TsR/zxPTHaeVzhJRwT3ribPr7oB+5 Nc9JaU6NqC9Zu04KyLaB+o5tFHoUS87bu+v73nqvLDV4Rbsuwt5jvmXXx9wRgr2iDhhqQ8N0q1i GpTjYzaYYTudzjNqJgf6W8wrRGmumHDDzAX0kpFQHgpcvoaQ== X-Received: by 2002:a05:7301:19a4:b0:2ed:a58c:942 with SMTP id 5a478bee46e88-2f6e28f9a48mr379785eec.8.1778113428378; Wed, 06 May 2026 17:23:48 -0700 (PDT) Received: from phoenix.local ([104.202.41.210]) by smtp.gmail.com with ESMTPSA id 5a478bee46e88-2f56f88f4a4sm5361890eec.17.2026.05.06.17.23.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 May 2026 17:23:48 -0700 (PDT) Date: Wed, 6 May 2026 17:23:45 -0700 From: Stephen Hemminger To: liujie5@linkdatatechnology.com Cc: dev@dpdk.org Subject: Re: [PATCH v10 00/10] Add Linkdata sxe2 driver Message-ID: <20260506172345.2b8934e6@phoenix.local> In-Reply-To: <20260506113557.1151250-1-liujie5@linkdatatechnology.com> References: <20260506095704.1056840-11-liujie5@linkdatatechnology.com> <20260506113557.1151250-1-liujie5@linkdatatechnology.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Wed, 6 May 2026 19:35:46 +0800 liujie5@linkdatatechnology.com wrote: > From: Jie Liu >=20 > V10: > - Addressed AI comments >=20 > Jie Liu (10): > mailmap: add Jie Liu > doc: add sxe2 guide and release notes > drivers: add sxe2 basic structures > common/sxe2: add base driver skeleton > drivers: add base driver probe skeleton > drivers: support PCI BAR mapping > common/sxe2: add ioctl interface for DMA map and unmap > net/sxe2: support queue setup and control > drivers: add data path for Rx and Tx > net/sxe2: add vectorized Rx and Tx My comments first: - drivers going upstream are expected to be production quality. Having #ifdef for driver specific stuff indicates to me either that the code is not ready, existing DPDK debug infrastructure is insufficient, or that the author is unwilling to "kill his darlings". You may not get the last reference, but in writing "kill your darlings" refers to the problem where author becomes too attached to characters or scenes and is unwilling to make the deep edits necessary to be good w= riting. - slicing up out of tree code base leads to messy code. Don't have all the other OS code still in base. As usual AI review has lots, lots more to say. Don't treat it all as absolute, but it looks like it found lots of real bugs and things like mismatch of doc's and features. If you want to run AI review yourself see AGENTS.md file pending in patchwo= rk. Now for the long AI review: Review of v10 sxe2 PMD series =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D Overall comments ---------------- The series adds a new PMD for the Linkdata sxe2 family. Several patches in this series contain serious correctness bugs that would prevent the driver from working with real hardware, including inverted error checks that take the failure path on success, an uninitialised mbuf->next dereference in an error path, a use-after-free / double-free in the buffer-split scatter Rx, and a vector-mode selection that is unconditionally overwritten by the scalar selection that follows it. These need to be fixed before the driver can be considered ready for merge. Beyond those correctness issues, two structural problems in this series need addressing before a merge can be considered. Both are fundamental enough that I want to call them out at the top. 1. The driver opens its own file under /var/log and routes its log lines through a private FILE *. drivers/common/sxe2/sxe2_common_log.c does fopen() on /var/log/sxe2pmd.log. and then every PMD_LOG_* call goes through a wrapper that does rte_openlog_stream(g_sxe2_common_log_fp) before the log line and rte_openlog_stream(NULL) after. This is not acceptable for a DPDK driver. rte_openlog_stream is a process-global setting; flipping it on every log call means any other PMD or application code in the same process that has set its own log stream gets clobbered every time sxe2 logs anything. /var/log is a privileged path that ordinary DPDK applications cannot write to. The FILE * is opened once and never closed, so it leaks across the lifetime of the process. The whole code path is also gated on SXE2_DPDK_DEBUG, which the driver's meson.build defines unconditionally - so the debug-only path is the production path. DPDK drivers log via RTE_LOG / RTE_LOG_LINE / RTE_LOG_LINE_PREFIX into the rte_log infrastructure. The application owns the log stream, not the PMD. Please remove the file-logging path entirely and use rte_log directly; if a per-thread/per-port prefix is wanted, use RTE_LOG_LINE_PREFIX. The 350+ lines of PMD_LOG_*/LOG_*/LOG_DEV_*/LOG_MSG_* macros in sxe2_common_log.h can collapse to a handful of one-line wrappers over RTE_LOG_LINE. 2. The driver is being submitted as a slice of a larger out-of-tree codebase, with the slicing done at compile time via #ifdef. sxe2_drv_cmd.h, sxe2_ioctl_chnl.h and others contain the pattern #ifdef SXE2_DPDK_DRIVER #include "sxe2_type.h" ... #endif #ifdef SXE2_LINUX_DRIVER #ifdef __KERNEL__ #include #include #endif #endif and sxe2_osal.h carries a #ifndef ladder around BIT_ULL, BITS_PER_LONG, DIV_ROUND_UP, TAILQ_FOREACH_SAFE and friends so the same header can be included from both sides. On top of that there is a SXE2_DPDK_DEBUG knob, a SXE2_DPDK_DEBUG_RXTX_LOG knob, a SXE2_TEST knob, an FPGA_VER_ASIC knob (defined unconditionally in meson.build, so it is dead - but its existence implies multiple build flavours), a RTE_LIBRTE_SXE2_16BYTE_RX_DESC knob that selects descriptor format at compile time, and a PCLINT knob. The result is that nobody upstream can audit "what gets built" without picking a specific combination of flags, and the variants that aren't built by default will rot. DPDK is not a kernel-driver-sharing host. The expectation is that what is submitted is the DPDK driver, written in DPDK style, building exactly one binary per arch. Please: - Remove the SXE2_LINUX_DRIVER / __KERNEL__ branches from every header. The kernel driver lives elsewhere; ship its headers there. - Remove the SXE2_DPDK_DRIVER guard - it is the only thing being built here. - Remove FPGA_VER_ASIC (unused) and PCLINT (lint annotations should not be in shipping source). - Make RTE_LIBRTE_SXE2_16BYTE_RX_DESC a runtime decision or just pick one format. Compile-time descriptor format selection is a pattern other PMDs have moved away from. - SXE2_DPDK_DEBUG / SXE2_DPDK_DEBUG_RXTX_LOG / SXE2_TEST should be removed; if a debug counter or extra log line is genuinely useful, gate it on the existing RTE_ETHDEV_DEBUG_RX/TX or on a runtime devarg, not on a build-time flag the driver defines for itself. With those gone, the same set of files will be cleaner, smaller, and a lot easier to review. The driver also reinvents a fair amount of infrastructure that already exists in DPDK - logging (above), the entire linux-kernel-style bit/bitmap/list layer in sxe2_osal.h (BIT/GENMASK/set_bit/test_bit/LIST_FOR_EACH_ENTRY/COMPILER_BARRIER/ sxe2_lock/etc.), and a parallel sxe2_errno.h that maps every errno to a SXE2_ERR_* alias. Please use the DPDK equivalents (rte_bitops.h, rte_bitmap, sys/queue.h or rte_tailq, rte_compiler_barrier, rte_spinlock_t, plain -errno) directly. Severity legend: Error =3D correctness/build; Warning =3D should fix; Info =3D consider. Patch 02/10 doc: add sxe2 guide and release notes -------------------------------------------------- Warning: doc/guides/nics/features/sxe2.ini lists only "Queue start/stop" and "Linux", but sxe2_dev_infos_get in patch 5 advertises VLAN_STRIP, KEEP_CRC, SCATTER, RSS_HASH, LRO, BUFFER_SPLIT, multiple checksum offloads, MBUF_FAST_FREE, TSO, tunnel TSO, etc. The features matrix needs to be updated to match what dev_info reports (and to drop entries the v10 patches do not yet implement). Warning: doc/guides/nics/sxe2.rst states "this driver only deals with virtual memory addresses", but sxe2_drv_dev_dma_map() in patch 7 has an explicit RTE_IOVA_PA branch and supports PA mode when IOMMU is absent. Either remove the PA path or update the doc. Info: sxe2.ini ends without a newline ("\ No newline at end of file" in the diff). Info: in sxe2.rst, "are supported" sentence is missing a trailing period; reST renders the next blank line as a section break which is probably not intended. Patch 03/10 drivers: add sxe2 basic structures ----------------------------------------------- Error: drivers/common/sxe2/meson.build enables `-DSXE2_DPDK_DEBUG` unconditionally, which turns on the fopen()/private-stream logging path in sxe2_common_log.c. See "Overall comments" - that path needs to go away entirely, and the build flag needs to go with it. Error: sxe2_common_log.h's PMD_LOG_NOTICE / PMD_LOG_WARN / PMD_LOG_ERR / PMD_LOG_CRIT / PMD_LOG_ALERT / PMD_LOG_EMERG (and the matching PMD_DEV_LOG_* variants) emit the same log line twice when SXE2_DPDK_DEBUG is on: #define PMD_LOG_ERR(logtype, fmt, ...) \ do { \ SXE2_PMD_LOG(ERR, SXE2_##logtype, fmt, ##__VA_ARGS__); \ sxe2_common_log_stream_open();\ SXE2_PMD_LOG(ERR, SXE2_##logtype, fmt, ##__VA_ARGS__); \ sxe2_common_log_stream_close();\ } while (0) The first SXE2_PMD_LOG goes to whatever rte_log stream was in effect; the second goes to the private file. This is almost certainly a bug independent of (1) above. When the file-logging path is removed, just call SXE2_PMD_LOG once. Warning: drivers/common/sxe2/meson.build uses `sources =3D files(...)` instead of `sources +=3D files(...)`. This works today because nothing else sets `sources` for this subdir, but the rest of DPDK uses `+=3D` and any future infra change that sets a source from outside will be silently dropped. Warning: drivers/common/sxe2/sxe2_type.h does `typedef char s8`. Plain `char` is signed on x86 and unsigned on arm64/ppc64 by default. s8 is then used as a string buffer (e.g. `s8 g_sxe2_common_log_filename[]`, `s8 stime[40]`, `s8 drv_name[32]`) and passed to snprintf/fopen/strerror. Under -Werror with -Wpointer-sign this will fail to build on platforms where char is unsigned. Use `int8_t` for s8 (and only use it where you truly want a signed 8-bit integer), and use `char` for text buffers. Warning: sxe2_type.h also defines unused S8/S16/S32 typedefs. Drop them. Warning: sxe2_osal.h defines GENMASK as `(((~0UL) - (1UL << (l)) + 1) & (~0UL >> (__BITS_PER_LONG - 1 - (h))))`. When h =3D=3D __BITS_PER_LONG - 1 the right operand becomes a shift by 0 which is fine, but the macro also relies on `1UL` being at least __BITS_PER_LONG bits; on a 32-bit build any GENMASK with l >=3D 32 invokes UB. Use RTE_GENMASK64 from rte_bitops.h. Warning: sxe2_osal.h sxe2_swap_u16() uses an arithmetic swap (a +=3D b; b =3D a - b; a -=3D b). This is harder to read than a tmp variable, computes the same result, and offers no benefit. Use a tmp. Info: sxe2_common_log.h `STIME` macro is defined but unused. Info: sxe2_internal_ver.h SXE2_MK_VER_MAJOR/MINOR rely on operator precedence inside macro expansion - parenthesise the macro arguments. Patch 04/10 common/sxe2: add base driver skeleton -------------------------------------------------- Error: drivers/common/sxe2/sxe2_common.c sxe2_common_pci_id_table_update() silently returns SXE2_SUCCESS when calloc() fails: s32 ret =3D SXE2_SUCCESS; ... updated_table =3D calloc(num_ids, sizeof(*updated_table)); if (!updated_table) { PMD_LOG_ERR(COM, "Failed to allocate memory for PCI ID table"); goto l_end; } ... l_end: return ret; ret is never set to an error value before the goto, so the caller (sxe2_common_pci_init / sxe2_common_driver_on_register_pci) proceeds as if the PCI ID table were updated. Set ret =3D SXE2_ERR_NO_MEMORY (or -ENOMEM) before the goto. Warning: sxe2_ioctl_chnl.c first include line is indented with a leading space (` #include `), the rest of the includes are not indented. Fixes the diff indentation noise as well. Warning: sxe2_drv_dev_close() uses `if (fd > 0)` to decide whether to close. fd 0 is a legitimate file descriptor (stdin in the host, but also a valid value returned by open() in unusual cases). Use `if (fd >=3D 0)`. Warning: sxe2_common_pci_remove() frees cdev->kvargs with `free()`, but if the kvargs were never allocated (rte_dev->devargs->args was NULL), cdev->kvargs is also NULL and the free() is fine. However the comment "Fail to get remove device" in sxe2_common_pci_dma_unmap (patch 7) is the unmap path, not remove - copy/paste from the remove function. Info: sxe2_kvargs_process() uses `(*handler)(...)` syntax; plain `handler(...)` reads better. Patch 05/10 drivers: add base driver probe skeleton ---------------------------------------------------- Error: drivers/net/sxe2/sxe2_ethdev.c registers the Tx logtype with the wrong suffix (twice): RTE_LOG_REGISTER_SUFFIX(sxe2_log_tx, rx, DEBUG); ... RTE_LOG_REGISTER_SUFFIX(sxe2_log_tx, rx, NOTICE); Both should be `tx`. As written, all Tx logs flow into the rx logtype name, and rte_log_register_pattern won't be able to address them separately. Error: sxe2_eth_pmd_probe_pf() does not set `adapter->cdev` for secondary processes, but sxe2_dev_init() unconditionally calls sxe2_hw_init() -> sxe2_dev_caps_get() -> sxe2_func_caps_get() -> sxe2_drv_dev_caps_get() which dereferences `adapter->cdev`. NULL-deref on secondary attach. sxe2_dev_init() should early-return after setting up ops/burst pointers when rte_eal_process_type() !=3D RTE_PROC_PRIMARY. Error: sxe2_dev_infos_get() assigns `dev_info->tx_queue_offload_capa` twice; the second assignment unconditionally overwrites the first with just RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE. Either drop the second assignment or change it to `|=3D`. As written, the per-queue Tx offload caps reported to the application are just MBUF_FAST_FREE, contradicting the much longer list a few lines above. Error: sxe2_eth_pmd_remove() leaks the eth_dev port when sxe2_dev_uninit() fails. rte_eth_dev_release_port() is only called on the success path. Either always release or report the leak. Warning: drivers/net/sxe2/meson.build contains a non-ASCII comment ("#=E6=89=A7=E8=A1=8C=E5=AD=90=E7=9B=AE=E5=BD=95base=EF=BC=8C=E5=B9= =B6=E8=8E=B7=E5=8F=96=E7=9B=AE=E6=A0=87=E5=AF=B9=E8=B1=A1") and uncondition= ally adds `-DFPGA_VER_ASIC`, `-DSXE2_DPDK_DRIVER`, `-g` and `-Werror` to cflags. -Werror in particular should be controlled via `-Dwerror=3Dtrue` at the project level, not hard-coded per-driver. -g should come from the buildtype. FPGA_VER_ASIC is not used in any of the patches in this series; if it is dead, drop it. Warning: sxe2_dev_infos_get() reports a long list of offloads (RSS_HASH, BUFFER_SPLIT, QINQ_STRIP, VLAN_EXTEND, TCP_LRO, several tunnel TSOs, PTP timestamp, etc.) that are not implemented anywhere in the v10 series. Either implement them, or trim dev_info to what actually works. Currently testpmd will accept and silently ignore flags the driver claims to support. Warning: sxe2_dev_infos_get() sets nb_rx_queues and nb_tx_queues from `dev->data->nb_rx_queues`/nb_tx_queues - those are the currently-configured queue counts, not the driver capability. rte_eth_dev_info_get callers expect maxima. Warning: sxe2_main_vsi_create() error path does `sxe2_vsi_node_free(adapter->vsi_ctxt.main_vsi)` but sxe2_vsi_node_free() rte_free()s the pointer and then does `vsi =3D NULL` on its local variable, leaving `adapter->vsi_ctxt.main_vsi` dangling. sxe2_vsi_uninit() and sxe2_dev_close() later read this field. Set adapter->vsi_ctxt.main_vsi =3D NULL in the caller after free. Warning: SXE2_ETH_OVERHEAD in sxe2_ethdev.h hardcodes `+ SXE2_VLAN_TAG_SIZE * 2` (8 bytes for QinQ) but the driver does not actually advertise/implement QINQ_STRIP/QINQ_INSERT in v10. This is the "hardcoded VLAN overhead in a PMD that does not support VLAN" pattern - either reflect the driver's actual capability or implement QinQ. Info: sxe2_vsi.h defines an `enum sxe2_vsi_type` and a parallel set of `sxe2_VSI_DOWN`/`sxe2_VSI_CLOSE` enums with lower-case prefixes - normalise to upper-case SXE2_ prefix. Info: sxe2_vsi.c sxe2_vsi_node_alloc() logs "Failed to malloc vf vsi struct" even when the caller is creating a PF VSI - generic message would be clearer. Info: sxe2_dev_close() does not yet release queues; this is added piecemeal in later patches but the dev_ops table in patch 8 still does not register .rx_queue_release / .tx_queue_release. See note on patch 8. Patch 06/10 drivers: support PCI BAR mapping --------------------------------------------- Error (critical): sxe2_dev_pci_map_init() in drivers/net/sxe2/sxe2_ethdev.c uses inverted error checks for every sxe2_dev_pci_res_seg_map() call. sxe2_dev_pci_res_seg_map() returns SXE2_SUCCESS (0) on success, but the caller does: ret =3D sxe2_dev_pci_res_seg_map(adapter, SXE2_PCI_MAP_RES_DOORBELL_TX, txq_cnt, txq_base); if (!ret) { PMD_LOG_ERR(INIT, "Failed to map txq doorbell addr, ret=3D%d", ret); goto l_free_seg1; } `!ret` is true on success. All five mapping calls (DOORBELL_TX, DOORBELL_RX_TAIL, IRQ_DYN, IRQ_ITR, IRQ_MSIX) have the same inversion. The result is: - First successful map -> goto l_free_seg1 -> frees bar_info[1].seg_info, bar_info[0].seg_info, bar_info. - But `map_ctxt->bar_info =3D bar_info;` was set just above the goto, so map_ctxt->bar_info is now a dangling pointer. - ret stays 0, so the function returns SUCCESS. - Caller (sxe2_dev_init) proceeds. - Subsequent access to map_ctxt->bar_info from sxe2_dev_get_bar_info() / sxe2_pci_map_addr_get() is use-after-free. - sxe2_dev_pci_map_uinit() will also UAF on map_ctxt->bar_info. This bug means the driver cannot have been exercised on real hardware in this form. Fix all five checks to `if (ret)`. Warning: sxe2_dev_pci_seg_map() uses size_t for org_len in the printf format `%zu` in patch 6 but the prototype takes `u64`. Patch 9 then changes the format to PRIu64. Use PRIu64 from the start to avoid the noise commit. Patch 07/10 common/sxe2: add ioctl interface for DMA map and IRQ ----------------------------------------------------------------- Warning: sxe2_drv_dev_dma_unmap() does `if (!cdev->config.support_iommu) return SXE2_SUCCESS;` mid-function, while every other path in the same file uses `goto l_end`. Make the style consistent. Info: sxe2_common_pci_dma_map()/dma_unmap() error logs both say "Fail to get remove device" - copy/paste from sxe2_common_pci_remove(). Use a message that matches the operation. Patch 08/10 net/sxe2: support queue setup and control ------------------------------------------------------ Error: sxe2_rx_queue_mbufs_alloc() error path uses mbuf->next which was never initialised: for (i =3D 0; i < rxq->ring_depth; i++) { mbuf =3D sxe2_mbuf_raw_alloc(rxq->mb_pool); if (mbuf =3D=3D NULL) { ... goto l_err_free_mbuf; } buf_ring[i] =3D mbuf; ... } else { mbuf_pay =3D rte_mbuf_raw_alloc(rxq->rx_seg[1].mp); if (unlikely(!mbuf_pay)) { ... goto l_err_free_mbuf; } ... mbuf->next =3D mbuf_pay; } } l_err_free_mbuf: for (j =3D 0; j <=3D i; j++) { if (buf_ring[j] !=3D NULL && buf_ring[j]->next !=3D NULL) { rte_pktmbuf_free(buf_ring[j]->next); ... } ... } When mbuf_pay allocation fails on iteration i, buf_ring[i] is set to mbuf but mbuf->next has not yet been assigned =E2=80=94 rte_mbuf_raw_alloc() does not zero ->next. The cleanup loop then reads `buf_ring[i]->next` (uninitialised pointer) and may pass it to rte_pktmbuf_free(). Use rte_pktmbuf_alloc() (which initialises ->next), or set mbuf->next =3D NULL right after the raw_alloc. Error: the dev_ops table in sxe2_ethdev.c is updated to add .rx_queue_setup, .tx_queue_setup, .rxq_info_get, .txq_info_get, but not .rx_queue_release / .tx_queue_release. All Rx/Tx queue memory (descriptor ring memzone, buffer ring, queue struct) is leaked when ports are reconfigured or closed. Wire up sxe2_rx_queue_release() and sxe2_tx_queue_release() into dev_ops. Warning: sxe2_txqs_all_start() / sxe2_txqs_all_stop() / sxe2_rxqs_all_start() / sxe2_rxqs_all_stop() declare their `dev` parameter as `__rte_unused` but then read `dev->data`. Drop the __rte_unused annotation; otherwise readers will assume the parameter really is unused. Warning: sxe2_tx.c sxe2_txqs_all_start() opens with s32 __rte_cold sxe2_txqs_all_start(struct rte_eth_dev *dev __rte_unused) { struct rte_eth_dev_data *data =3D dev->data; The first line of the function body has no leading tab, breaking the indent. Warning: sxe2_rx_queue_alloc() and sxe2_tx_queue_alloc() call sxe2_rx_queue_release(dev, queue_idx) / sxe2_tx_queue_release(dev, queue_idx) on the error path right after a fresh rte_zmalloc_socket() failure, even though dev->data->rx_queues[queue_idx] / dev->data->tx_queues[queue_idx] have not been written yet. These calls are no-ops at runtime but they read oddly - drop them and free the just-allocated objects directly. Warning: sxe2_rx_queue_setup() validates `rx_nseg > 1 && !(offloads & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT)` and `(offloads & BUFFER_SPLIT) && !(rx_nseg > 1)` =E2=80=94 these are validations that ethdev already performs in rte_eth_rx_queue_setup(). The PMD does not need to duplicate them. Info: sxe2_mbuf_raw_alloc() is a one-line wrapper around rte_mbuf_raw_alloc() that adds nothing. Drop it and call rte_mbuf_raw_alloc() directly. Patch 09/10 drivers: add data path for Rx and Tx ------------------------------------------------- Error: sxe2_rx_pkts_scattered_split() in sxe2_txrx_poll.c has a use-after-free / double-free in the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT path: new_mbuf =3D NULL; while (done_num < nb_pkts) { ... if ((rxq->offloads & BUFFER_SPLIT) =3D=3D 0 || first_seg =3D=3D NULL) { new_mbuf =3D rte_mbuf_raw_alloc(rxq->mb_pool); if (unlikely(new_mbuf =3D=3D NULL)) { ... break; } } if (rxq->offloads & BUFFER_SPLIT) { new_mbuf_pay =3D rte_mbuf_raw_alloc(rxq->rx_seg[1].mp); if (unlikely(new_mbuf_pay =3D=3D NULL)) { ... if (new_mbuf !=3D NULL) rte_pktmbuf_free(new_mbuf); new_mbuf =3D NULL; break; } } ... When BUFFER_SPLIT is enabled and first_seg !=3D NULL (continuing a multi-descriptor packet), the first `if` block is skipped, so `new_mbuf` keeps the value it had on the previous iteration =E2=80=94 which has already been stored in *cur_buffer and is now part of the buffer ring. If new_mbuf_pay alloc fails on this iteration, the `if (new_mbuf !=3D NULL) rte_pktmbuf_free(new_mbuf);` frees an mbuf that is still owned by the buffer ring. The next Rx burst will hand a freed mbuf back to the application (or the NIC will DMA into freed memory once the descriptor is rearmed). Set new_mbuf =3D NULL at the start of every loop iteration, or only free new_mbuf when this iteration actually allocated it. Warning: sxe2_rx_pkts_scattered_split() also writes `cur_mbuf->next =3D new_mbuf_pay` (in the first_seg !=3D NULL branch) before any of the subsequent failure checks. If a later step in the same iteration would fail, the mbuf chain is left in an inconsistent state. Reorder so the chain is updated only after all allocations succeed. Warning: sxe2_tx_pkts_prepare() declares its tx_queue parameter as `__rte_unused void *tx_queue` but uses `txq->ring_depth`. Drop the __rte_unused annotation. Warning: sxe2_set_common_function() assigns `dev->rx_pkt_burst =3D sxe2_rx_pkts_scattered;` but sxe2_rx_mode_func_set() (called later from dev_start) unconditionally overwrites it. The first assignment is either dead code or needs to gate on something =E2=80=94 pick one. Info: sxe2_rx_mode_func_set() always selects a scattered Rx burst regardless of `dev->data->scattered_rx`, and without checking whether MTU exceeds rx_buf_len. The non-scattered fast path is never reachable. Either add a non-scattered burst function and dispatch on `scattered_rx`, or document that this PMD always uses scattered Rx and remove the unused split between "scattered" and "single-buffer" code paths. Info: sxe2_txrx.c is hit by a large reformatting of already-correct code in patch 10 (whitespace removal). Either keep the blank lines from patch 9, or do the reflow as a single dedicated cleanup patch. Patch 10/10 net/sxe2: add vectorized Rx and Tx ----------------------------------------------- Error (critical): sxe2_rx_mode_func_set() in sxe2_txrx.c selects the SSE vector Rx burst and then immediately overwrites it: #ifdef RTE_ARCH_X86 if (rx_mode_flags & SXE2_RX_MODE_VEC_SET_MASK) dev->rx_pkt_burst =3D sxe2_rx_pkts_scattered_vec_sse_offload; #endif if (sxe2_rx_offload_en_check(dev, RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT)) dev->rx_pkt_burst =3D sxe2_rx_pkts_scattered_split; else dev->rx_pkt_burst =3D sxe2_rx_pkts_scattered; The vector path is never used in practice. Restructure to either an `if/else if/else`, or gate the scalar selection on `!vec_path`. Error: sxe2_rx_queue_vec_init() in sxe2_txrx_vec.c constructs an mbuf on the stack, sets only buf_addr, nb_segs, data_off, port and refcnt, then reads `*(u64 *)&mbuf_def.rearm_data`. This is correct only as long as rearm_data exactly aliases those four fields and nothing else; this is fragile across DPDK releases. memset() the struct to zero first, then assign the fields. Warning: meson.build adds a Windows-only branch inside `if arch_subdir =3D=3D 'x86'`: if arch_subdir =3D=3D 'x86' sources +=3D files('sxe2_txrx_vec_sse.c') if is_windows and cc.get_id() !=3D 'clang' cflags +=3D ['-fno-asynchronous-unwind-tables'] endif endif But the top of meson.build for net/sxe2 already does `if is_windows: build =3D false; subdir_done()`, so is_windows is always false here. Dead branch =E2=80=94 drop it. Warning: the global static `sxe2_net_map_addr_info_pf` table loses its trailing comments in this patch (the SXE2_PCI_MAP_RES_* identifiers no longer label each row). Either keep the labels or convert the table to designated initialisers `[SXE2_PCI_MAP_RES_DOORBELL_TX] =3D { ... }`. Warning: sxe2_tx_queue_mbufs_release_vec() does `rte_pktmbuf_free_seg(buffer[i].mbuf)` without checking buffer[i].mbuf for NULL. rte_pktmbuf_free_seg() handles NULL on current DPDK, but the rest of this driver guards explicitly; do the same here for consistency. Warning: sxe2_rx_mode_func_set() and sxe2_tx_mode_func_set() set `rx_mode_flags =3D 0` / `tx_mode_flags =3D 0`, then check `(rx_mode_flags & SXE2_RX_MODE_VEC_SET_MASK) =3D=3D 0` / similar inside an `if` block before any flag has been set =E2=80=94 those conditions are tautologically true. Either seed the flags from the caps result first, or drop the redundant guard. Info: sxe2_rx_mode_func_set() ends with a `goto l_end;\nl_end:` that immediately follows =E2=80=94 the goto is a no-op, drop it. Info: sxe2_tx_burst_mode_get() / sxe2_rx_burst_mode_get() list only the scalar and SSE variants in their infos[] table, but the AVX2/AVX512 selection in sxe2_tx_mode_func_set() can in principle assign a non-SSE burst function to dev->tx_pkt_burst. Either include the AVX paths or drop the AVX selection logic.