From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFF06CD4F3D for ; Sun, 17 May 2026 23:56:53 +0000 (UTC) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6F05A4065F; Mon, 18 May 2026 01:56:39 +0200 (CEST) Received: from mail-dy1-f170.google.com (mail-dy1-f170.google.com [74.125.82.170]) by mails.dpdk.org (Postfix) with ESMTP id A6EBA4066B for ; Mon, 18 May 2026 01:56:37 +0200 (CEST) Received: by mail-dy1-f170.google.com with SMTP id 5a478bee46e88-2f0d3e07e30so7047779eec.0 for ; Sun, 17 May 2026 16:56:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20251104.gappssmtp.com; s=20251104; t=1779062197; x=1779666997; darn=dpdk.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=ZOmYym/v+NIBkCfezptFJR8AnxUGWw8TvGAPv4K8DFg=; b=NlcjkKeA/8nDXQWV+BLJkSxOt5erJWgIdRrgWpcdDCut0Gfa3JZYtB7rDQvpnTJ+5o M0kTzIAh1l7/SQKCoJpLj8wOCcnarCorSeEvCpvNOJGR8NXburNxaUnVzJ4E3YoNDoGK MW4Kpy8u0x6s3oEzL3p+SrkJ0IbxASkGJFCccIs3KljaoewYOT6O+xX7+PQ7unI9R4ge ntqUWN2xFLKf3JcYGXwDBU3Zpq+0qV9WH4RoAso29ZsPqXxFhfZJskBzyrlCq1jN5n8r lL9Io+pqW14BsQAnOnA++cVWRdvJH+aP+FI7A6bWUIj3+Et/zNH0EOvKyRfxIefK4tfh /2pg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779062197; x=1779666997; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=ZOmYym/v+NIBkCfezptFJR8AnxUGWw8TvGAPv4K8DFg=; b=Qjo0i8Dd5wGADQxk/z63go5P3t1yaHB79pej1viSK/GiDRKpCTUYdeZkfRA4b91A3d qqA6pJSyfAPWwOvc7sSvJ5wKb549K6Isp1RX2gwYVgAOcPTXGlch/cmcP8pEtmG/p8ba PqE8Yt9KWeYf/isfYSWUBzxpndA0HBZAsGE2U0aGtEEqiGEOOhi0ExYBRljzGUYsye/Q EgxbkAljwE2t1WZ5kHh6G4nAO3nSy9HSvi7WFQkLXCWOe4hBrmIhNxG40kzneJhR4os2 ZYSHEuDt6Xwa3MkU+t3eBrl+5KQbuzjuABHA21UiL/sN6QQdZHXeQTC0eO8NxHRTs0B9 iFeQ== X-Gm-Message-State: AOJu0YwAR/1xGfpwiYOPy/43751fAOgPbaGiZaiRwWKH5QVEpBWI+UaU Hj1NDlfc7g4b5GvYrMFKMaVcui2QJQfjGpXjfbzBi4Egxad8nP5mLsHgSUoSoTPYWGw60kRNNnW b8Jn3VDU= X-Gm-Gg: Acq92OGfiaKVnN1MeQtn+o4K56dc67VLTF8ZYD5BNBgs5jhNf9I45IzaQpR8AD0qZJq tSFP5KSNaBu84tOrap+I9k0YpPNxUVCHDp/ooFTUowI1kP/U95QqopMEJrgpnwPSTfo/2HN+Btk tWODiI+zhTqoi7f341D1EPd3LNdtQAUETiroNN1xPDft52JtZ0Kaoi7mIP15Xh5N4U15xXMU+2x sRlXIcTElYWHn1lbjYBSCGDHKxoI7RAbdZiELXbOfcMV8zNA5wv5gg1gFSJ7dyqCaa11+JsHD5t ZSM1mswXYFBqSgTUFNgeppmcU0+uT3ZZt4LwYf4Xlj1oIizuFOZXMY9IckreapvYhxL3L3SfHSg 8ujUDsyMuilYTGKmIVLDSsf90/vp6qltMQMhlUmoIvbGAt1we+itDDP/o5or8j55mTbW4iP/ldb M93Eg2j7F3afGS5jK8JzCtQid6QYdDNIps4xVNoLQpq1Fx0nivXOkm79/h X-Received: by 2002:a05:7301:2901:b0:2c0:e404:8b08 with SMTP id 5a478bee46e88-303986b4529mr5454834eec.29.1779062196543; Sun, 17 May 2026 16:56:36 -0700 (PDT) Received: from phoenix.local ([104.202.41.210]) by smtp.gmail.com with ESMTPSA id 5a478bee46e88-302973bc8ddsm12189379eec.21.2026.05.17.16.56.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 17 May 2026 16:56:36 -0700 (PDT) Date: Sun, 17 May 2026 16:44:44 -0700 From: Stephen Hemminger To: Zaiyu Wang Cc: dev@dpdk.org, stable@dpdk.org, Jiawen Wu Subject: Re: [PATCH v4 07/20] net/txgbe: fix Tx desc free logic Message-ID: <20260517164444.7c0cb0e0@phoenix.local> In-Reply-To: <20260511103604.19724-8-zaiyuwang@trustnetic.com> References: <20260423034024.14404-1-zaiyuwang@trustnetic.com> <20260511103604.19724-1-zaiyuwang@trustnetic.com> <20260511103604.19724-8-zaiyuwang@trustnetic.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Mon, 11 May 2026 18:35:49 +0800 Zaiyu Wang wrote: > On some server environments, this driver caused TDM non-fatal errors > or PCIe request errors during Tx operation > > In Amber-Lite NIC's Tx head write-back mode, the hardware periodically > writes back a head index pointing to the next descriptor it is adout > to process in Tx ring. All descriptors before the head are considered > processed by hardware and can be safely freed by the driver. > > The root cause is that the driver can safely free a batch of descriptors > only when the hardware's write-back head pointer has advanced beyond all > descriptors in that batch, meaning they have all been processed by the > hardware. If the driver frees a descriptor before the hardware has > finished processing it, invalid memory access may occur, leading to the > observed bug. > > To fix the issue, correct the boundary check in all three Tx cleanup > functions, each of which was missing the proper condition to prevent > freeing unprocessed descriptors. > > Fixes: 8ada71d0bb7f ("net/txgbe: add Tx head write-back mode for Amber-Lite") > Cc: stable@dpdk.org > > Signed-off-by: Zaiyu Wang > --- Lots of AI feedback on this one. # Review of DPDK Patch: net/txgbe: fix Tx desc free logic ## Errors ### 1. Shared variable access without atomics (3 locations) **Lines affected:** - `drivers/net/txgbe/txgbe_rxtx.c:101` (txgbe_tx_free_bufs) - `drivers/net/txgbe/txgbe_rxtx.c:655` (txgbe_xmit_cleanup) - `drivers/net/txgbe/txgbe_rxtx_vec_common.h:98` (txgbe_tx_free_bufs) ```c volatile uint16_t head = (uint16_t)*txq->headwb_mem; ``` **Problem:** The code reads a hardware-updated value (`*txq->headwb_mem`) using a `volatile` qualifier. The `volatile` keyword does NOT provide atomicity or memory ordering guarantees between the hardware and CPU. On some architectures, reading a `uint16_t` from DMA-accessible memory without proper barriers or atomic operations can result in torn reads, stale cached values, or reordering issues. **Fix:** Use atomic load with appropriate memory ordering: ```c uint16_t head = rte_atomic_load_explicit( (uint16_t *)txq->headwb_mem, rte_memory_order_acquire); ``` The `acquire` ordering ensures that all descriptor writes by the hardware are visible to the CPU after reading the head pointer. This is critical for correctness in hardware write-back scenarios. Note: The `headwb_mem` field type may need adjustment to be an `rte_atomic` type or cast appropriately. --- ### 2. Whitespace/indentation error in vec_common.h **Line:** `drivers/net/txgbe/txgbe_rxtx_vec_common.h:97-98` ```c tx_last_dd -= txq->nb_tx_desc; volatile uint16_t head = (uint16_t)*txq->headwb_mem; ``` **Problem:** Inconsistent indentation. The `volatile uint16_t head` declaration is indented with tabs-only (correct), but it visually appears to be part of a removed block based on the diff context showing the previous line had different indentation. **Fix:** Verify the indentation matches the surrounding code block. Based on the diff, this appears correct after the change, but confirm the tabs align with the function's indentation level. --- ## Warnings ### 1. Logic change correctness verification needed The patch adds three new early-return conditions checking if `head == tx_next_dd`. While this appears correct (hardware hasn't advanced past the point we want to free to), the logic for wrap-around cases should be verified: ```c if (txq->tx_next_dd == head) return 0; else if (txq->tx_next_dd > head && head > tx_last_dd) return 0; else if (tx_last_dd > txq->tx_next_dd && (head > tx_last_dd || head < txq->tx_next_dd)) return 0; ``` **Concern:** The wrap-around logic assumes the ring wraps at `nb_tx_desc`. The condition `tx_last_dd > txq->tx_next_dd` handles the wrap case, but verify that: - `tx_last_dd` calculation correctly wraps (it does: `tx_last_dd -= txq->nb_tx_desc`) - The inequality checks are strict `>` not `>=` consistently The existing code appears correct, but this is complex boundary logic that warrants testing, especially the second `else if` condition in wrap-around scenarios. --- ### 2. Missing const qualifier on read-only atomic load result **All three locations:** The `head` variable is read from hardware but never modified: ```c volatile uint16_t head = (uint16_t)*txq->headwb_mem; ``` **Suggestion:** Once converted to atomic operations, declare as `const`: ```c const uint16_t head = rte_atomic_load_explicit(...); ``` This documents that `head` is never modified after the initial read and prevents accidental writes. --- ## Info ### 1. Consider helper function for repeated logic The same boundary-check logic appears in three separate functions with only minor variations. Consider extracting it to a static inline helper: ```c static inline bool txgbe_hw_head_in_range(uint16_t hw_head, uint16_t start, uint16_t end, uint16_t ring_size) { if (start == hw_head) return false; if (start > hw_head) return hw_head <= end; else /* wrap case */ return hw_head > end || hw_head < start; } ``` This would reduce code duplication and make the logic easier to test and maintain. --- ### 2. Comment explaining write-back head semantics The commit message mentions "Tx head write-back mode" but the code lacks comments explaining the hardware behavior. Consider adding: ```c /* * In Amber-Lite NIC Tx head write-back mode, hardware periodically * updates *headwb_mem with the index of the next descriptor to process. * Descriptors before this head pointer are safe to free. */ const uint16_t head = rte_atomic_load_explicit(...); ``` --- ## Summary **Must fix (Errors):** 1. Replace `volatile` read with `rte_atomic_load_explicit()` with `rte_memory_order_acquire` ordering in all three locations **Should fix (Warnings):** 1. Add `const` qualifier to the loaded `head` value after converting to atomics 2. Verify wrap-around boundary logic with testing (appears correct but is complex) **Consider (Info):** 1. Extract repeated boundary-check logic to a helper function 2. Add comments documenting hardware write-back semantics The core fix (adding the `head == tx_next_dd` check) appears correct and addresses the described bug. The primary issue is the use of `volatile` instead of proper atomic operations for accessing hardware-updated shared memory.