From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by smtp.lore.kernel.org (Postfix) with ESMTP id 814D9CD4F3D for ; Sun, 17 May 2026 23:56:41 +0000 (UTC) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id EA9A74065E; Mon, 18 May 2026 01:56:31 +0200 (CEST) Received: from mail-dy1-f170.google.com (mail-dy1-f170.google.com [74.125.82.170]) by mails.dpdk.org (Postfix) with ESMTP id 65AC54065B for ; Mon, 18 May 2026 01:56:30 +0200 (CEST) Received: by mail-dy1-f170.google.com with SMTP id 5a478bee46e88-2f7020a928eso2232672eec.1 for ; Sun, 17 May 2026 16:56:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20251104.gappssmtp.com; s=20251104; t=1779062189; x=1779666989; darn=dpdk.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=W8vLxMWF1CX0mm1XCKTzQAUS1bXUiNmKYZiGHQq2uK4=; b=yMo7vnzzCwTTAOVU2OnNPZf1+w7Q6P6xdVYKB9c34xao60t7t6TEjC7ZiXdIYRoXGj vQ/IQwhg41QgNqejNpvodbPAzdBxD8p7uB5iql7o4OrJyL2k8oL2sWBp7ZcWK1xI6gjk zSc97RA7RA3y3gdKYck0tbP/N70WDvASqe1fa0IrSyrrjfLwY4DVmZKyrrsD67fuQ+mF 3L9LO7o1Xe8OY+ljS5VpE3ZZEMI6Of1r4YcSLIKzjYjrFJ8C4LS5h3IAo8q/YeTSWJFb GZRqSsl0z1GnkcPKotGdJPuxC4s6wCTp1tP7Hm4sSqWDD8ACQI+aS3tmGnR+ffZ6mNkY Lavw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779062189; x=1779666989; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=W8vLxMWF1CX0mm1XCKTzQAUS1bXUiNmKYZiGHQq2uK4=; b=db+/GGIIibMeD5JWLqNNPwvqjpO6Vi+U6/BzV0+zjFsBrIsZCd4YILl0E7El5n2Uib bVI75zQ8dkOE+1TbvRsjV/5MTKjI0+3eoLJMb+Zpek4BvosXNgaWD/HWlv8M683O5vSZ NThB5XQ+4Ube6Hq0MPXFsOFqI9w1DkVhpUlSTnH+nTPj4SlYeXi4cJYVrfMpbRH9heKe FaJbxpOu3hxH/+x8TaS/P4cSvozDV4XAFDU2O+eNfHsX3GW9YEyNvl9BrrldZU/1RmtI xP1vqPcwj8D//N8EVDU84EoGoo9KDUdki/p1SrYiHVUC+uaL4rQoOtUKInmxwZzsTgim JFug== X-Gm-Message-State: AOJu0Yzun+ryMJob+G7E+/2RJ86gI/utBU6nThaT8g0YSIEGGcuBdUSu bhOFYjrske0Bmf67flJoWew8DjrnnDt4aDq7vqS/oMbD9Vwazls2A89GEVhcBEtZB1c= X-Gm-Gg: Acq92OF3CUfBRF5/sd/KMF562r5Cki3uY7pRUkNzGnnUy+hgZVTONehA28+oWtCHFRp qAsR/wGzYtubrrF6LxBG0lhVnAflvBwiGqAP288JZMWJFM8FazrFmgBisc+aCYSfFzgNB4rtHZ3 KUpx5mvZqpfit/oxaFS7h/Wsm1BmhFeoo24aYWV7KYGuSGRVXJTr57Ttpz8Brj6KtbG+CVOLkJB u990SRiX1n+dmmRF5aW+0M1BT2Qi370XQAw92Mfuusfn9tDiPkenwREJyZQFbfYp6TGT9DIN3Am 6vJXTLHcCfa9tDlg5N+hCE90KPO0Ro/6awpGN3GOP7pMLa2eky7R+MYLV/mP5yaE6XgAkAIuRMS 9F8n6Btss6DMFRzGtwVHL9SSzUqIJb3mjWw/RUIinoZXUP2MG5hL9ho3IupGhIKTbOeyNHP9PTK kd8NkUbYE9NPdFTvNSctyvpdYsU2GfPnfh1/0= X-Received: by 2002:a05:7300:7255:b0:2d0:239a:23c9 with SMTP id 5a478bee46e88-30398677161mr5764781eec.16.1779062189284; Sun, 17 May 2026 16:56:29 -0700 (PDT) Received: from phoenix.local ([104.202.41.210]) by smtp.gmail.com with ESMTPSA id 5a478bee46e88-302973bc8ddsm12189379eec.21.2026.05.17.16.56.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 17 May 2026 16:56:29 -0700 (PDT) Date: Sun, 17 May 2026 16:49:36 -0700 From: Stephen Hemminger To: Zaiyu Wang Cc: dev@dpdk.org, stable@dpdk.org, Jiawen Wu Subject: Re: [PATCH v4 12/20] net/txgbe: fix link stability for 25G NIC Message-ID: <20260517164936.1c83977e@phoenix.local> In-Reply-To: <20260511103604.19724-13-zaiyuwang@trustnetic.com> References: <20260423034024.14404-1-zaiyuwang@trustnetic.com> <20260511103604.19724-1-zaiyuwang@trustnetic.com> <20260511103604.19724-13-zaiyuwang@trustnetic.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Mon, 11 May 2026 18:35:54 +0800 Zaiyu Wang wrote: > The link was previously configured via firmware, but this approach > resulted in unstable link behavior. To resolve the issue, re-add the > PHY configuration flow directly into the driver. > > Fixes: ead3616f630d ("net/txgbe: support PHY configuration via SW-FW mailbox") > Cc: stable@dpdk.org > > Signed-off-by: Zaiyu Wang > --- Lots of AI review feedback here: # Review of Patch 12/20: net/txgbe: fix link stability for 25G NIC ## Errors ### 1. Use-after-free potential in txgbe_e56_rxs_calib_adapt_seq (CORRECTNESS) **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Lines ~1600-1650 The function calls `txgbe_e56_rxs_osc_init_for_temp_track_range(hw, speed)` which can return `TXGBE_ERR_TIMEOUT`. When the timeout path at line ~1115 is taken, it writes to `E56PHY_PMD_CFG_0_ADDR` to disable RX and then returns `-1`. However, the caller `txgbe_e56_rxs_calib_adapt_seq` continues execution and accesses PHY registers without verifying the hardware state is valid. If the initialization timed out, subsequent register accesses may be unreliable. **Suggested fix:** ```c status = txgbe_e56_rxs_osc_init_for_temp_track_range(hw, speed); if (status != 0) return status; /* Propagate timeout or error immediately */ ``` ### 2. Resource leak: lock not released on error path (CORRECTNESS) **File:** `drivers/net/txgbe/base/txgbe_aml.c` **Location:** Lines 241-247 In `txgbe_setup_phy_link_aml`, `rte_spinlock_lock(&hw->phy_lock)` is acquired at line 241. If `txgbe_set_link_to_amlite` returns `TXGBE_ERR_PHY_INIT_NOT_DONE` at line 246, the code jumps to `out:` at line 264 without releasing the spinlock. The lock is only released on the success path or `TXGBE_ERR_TIMEOUT`. **Suggested fix:** ```c rte_spinlock_lock(&hw->phy_lock); ret_status = txgbe_set_link_to_amlite(hw, speed); rte_spinlock_unlock(&hw->phy_lock); /* Always release before checking status */ if (ret_status == TXGBE_ERR_PHY_INIT_NOT_DONE) goto out; ``` ### 3. Missing error propagation to caller (CORRECTNESS) **File:** `drivers/net/txgbe/base/txgbe_aml.c` **Location:** Lines 280-283 In `txgbe_setup_phy_link_aml`, the function returns `status` which is initialized to 0 and never assigned a failure value, even when `txgbe_set_link_to_amlite` fails or link is not established. The function always returns success regardless of actual outcome. **Suggested fix:** ```c if (!link_up) { *need_reset = true; return TXGBE_ERR_TIMEOUT; /* Return error when link fails */ } ``` ### 4. Double error return value overwrite (CORRECTNESS) **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Lines 1090-1095, 1157-1162, 1197-1202, 1242-1247 In `txgbe_e56_rxs_osc_init_for_temp_track_range`, the pattern: ```c if (timer++ > PHYINIT_TIMEOUT) { DEBUGOUT("ERROR: ..."); break; return -1; /* Unreachable: break exits the loop */ } ``` The `return -1` is unreachable. After `break`, execution continues to the next loop iteration check and then proceeds past the loop. The error is not propagated. **Suggested fix:** ```c if (timer++ > PHYINIT_TIMEOUT) { DEBUGOUT("ERROR: Wait timeout"); return TXGBE_ERR_TIMEOUT; /* Remove break, return immediately */ } ``` ### 5. Error code dropped without propagation (CORRECTNESS) **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Lines 1828-1830 In `txgbe_temp_track_seq`: ```c status = txgbe_e56_get_temp(hw, &temperature); if (status) return 0; /* Returns success when get_temp failed */ ``` Temperature reading failure is silently converted to success. The caller cannot distinguish between a successful temperature track and a failed temperature read. **Suggested fix:** ```c status = txgbe_e56_get_temp(hw, &temperature); if (status) return status; /* Propagate the error */ ``` ### 6. Integer overflow in bit shift (CORRECTNESS) **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Lines 19-31 in `set_fields_e56()` The function performs `(1 << bit_low)` where `bit_low` can be up to 31 (since bit fields in the code span 0-31). The literal `1` is `int` (signed 32-bit). When `bit_low >= 31`, `1 << 31` invokes undefined behavior (signed overflow). **Suggested fix:** ```c if (bit_high == bit_low) { if (set_value == 0) *src_data &= ~(1U << bit_low); /* Use 1U for unsigned */ else *src_data |= (1U << bit_low); } else { for (i = bit_low; i <= bit_high; i++) *src_data &= ~(1U << i); *src_data |= (set_value << bit_low); } ``` ### 7. Missing bounds check on array size calculation (CORRECTNESS) **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Line 1498 in `txgbe_e56_rx_rd_second_code()` The code calls `qsort(RXS_BBCDR_SECOND_ORDER_ST, array_size, sizeof(int), ...)` where `array_size = ARRAY_SIZE(RXS_BBCDR_SECOND_ORDER_ST)`. The array has 5 elements, but the calculation `array_size = ARRAY_SIZE(...)` trusts the macro. If N changes in the loop initialization (line 1489: `N = 5`), the array could overflow. **Suggested fix:** ```c /* Ensure loop count matches array size */ BUILD_ASSERT(N <= ARRAY_SIZE(RXS_BBCDR_SECOND_ORDER_ST)); ``` ## Warnings ### 1. Variables declared but may be read before initialization (STYLE) **File:** `drivers/net/txgbe/base/txgbe_aml.c` **Location:** Lines 325-327, 355-357 Variables `need_reset` are declared immediately before use, which is acceptable C99 style. However, they are declared in the middle of a conditional block scope, which can reduce clarity when the same variable name is reused in different branches. **Suggested improvement:** Declare `need_reset` once at function scope for better readability, though the current usage is technically correct. ### 2. Missing documentation for complex algorithm (STYLE) **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Lines 902-1264 (txgbe_e56_rxs_osc_init_for_temp_track_range) This 360-line function implements a complex PHY oscillator calibration sequence with numbered steps in comments (1, 2, 3...). The algorithm is critical for 25G link stability but lacks a high-level overview comment explaining the purpose, temperature dependency, and why specific register sequences are needed. **Suggested improvement:** Add a function-level Doxygen comment explaining the calibration procedure and referencing the hardware specification section. ### 3. Magic numbers without named constants (STYLE) **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Multiple locations (e.g., lines 505, 732, 1000) Numerous magic numbers appear in register writes (e.g., `0x68c1`, `0x3321`, `0x973e`, `0xccde` at lines 2281-2284) without explanation. These appear to be RS-FEC configuration magic values but are not documented. **Suggested improvement:** Define named constants or add inline comments explaining what these values configure: ```c #define RS_FEC_CONFIG_VAL1 0x68c1 /* RS-FEC configuration register value */ ``` ## Info ### 1. Deep nesting in calibration sequence **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Lines 1550-1750 (txgbe_e56_rxs_calib_adapt_seq) The ADC calibration loop (16 iterations) has 5 levels of nesting. Consider extracting the calibration step into a helper function to improve readability. ### 2. Possible candidate for helper function **File:** `drivers/net/txgbe/base/txgbe_e56.c` **Location:** Lines 1950-2050 (repeated CTLE bypass pattern) The code to override CTLE bypass for multiple lanes (RXS1, RXS2, RXS3) at lines 1999-2030 is repetitive. This pattern could be extracted into a helper function. --- ## Summary **Critical Issues:** 7 Errors identified, primarily correctness bugs affecting resource management and error handling. **Style Issues:** 3 Warnings - documentation and code organization improvements. **Observations:** 2 Info items - code structure suggestions. The most serious issues are: 1. Lock leak on error path (Error #2) 2. Missing error propagation causing silent failures (Errors #3, #5) 3. Undefined behavior in bit operations (Error #6) 4. Unreachable error returns masking timeouts (Error #4) ============================================================