Netdev List
* TCP default settings (bugzilla)
From: Stephen Hemminger @ 2026-04-15 14:14 UTC (permalink / raw)
  To: netdev

A pair of TCP configuration related bug reports just showed up in bugzilla.
Getting the right time values here seems like a trade off between fast
failover and not dropping crappy connections.

Given how well formatted the bugs are, they look AI-generated.

https://bugzilla.kernel.org/show_bug.cgi?id=221366

The default value of net.ipv4.tcp_retries2 (15 retries, resulting in
~924 seconds / ~15.4 minutes before TCP abandons a dead connection) is
far too high for modern data center environments. When a remote host
becomes unreachable (server crash, failover, network partition),
applications are stuck for up to 16 minutes before receiving an error
and taking recovery action. This causes cascading failures, connection
pool exhaustion, and prolonged service outages.
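
The ~924 second figure quoted above can be reproduced with a small back-of-the-envelope calculation. The sketch below is not kernel code: `retries2_timeout_ms` is a hypothetical helper, and it assumes the retransmission timeout starts at 200 ms, doubles after every unanswered transmission, and is capped at 120 s. A connection with a larger measured RTT would start from a higher RTO and time out differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: worst-case time (in ms) before TCP gives up on a
 * dead peer. Assumes the RTO starts at rto_min_ms, doubles after each
 * unanswered transmission, and is capped at rto_max_ms.
 */
static uint64_t retries2_timeout_ms(int retries, uint64_t rto_min_ms,
                                    uint64_t rto_max_ms)
{
    uint64_t rto = rto_min_ms;
    uint64_t total = 0;
    int i;

    assert(retries >= 0 && rto_min_ms > 0 && rto_min_ms <= rto_max_ms);

    /* One initial transmission plus `retries` retransmissions. */
    for (i = 0; i <= retries; i++) {
        total += rto;
        /* Exponential backoff, clamped to the maximum RTO. */
        rto = (rto > rto_max_ms / 2) ? rto_max_ms : rto * 2;
    }
    return total;
}
```

Under these assumptions, `retries2_timeout_ms(15, 200, 120000)` gives 924600 ms, which matches the ~924 seconds in the report, while a value of 8 gives 102200 ms (about 102 seconds).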

https://bugzilla.kernel.org/show_bug.cgi?id=221365

The default value of net.ipv4.tcp_keepalive_time (7200 seconds / 2
hours) is incompatible with virtually all modern network
infrastructure, causing silent connection failures. Intermediate
stateful devices (load balancers, firewalls, NAT gateways) routinely
expire idle TCP connections after 300-1800 seconds — long before the
first keepalive probe is ever sent.
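
Independent of what the system-wide default ends up being, an application can override the keepalive timing per socket. The sketch below uses the Linux socket options `SO_KEEPALIVE`, `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT`; the helper name `set_keepalive` and the choice of values are illustrative assumptions, not something proposed in the bug report.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: enable keepalive on one TCP socket and override
 * the system-wide timing for that socket only.
 *   idle_s  - idle seconds before the first probe is sent
 *   intvl_s - seconds between unanswered probes
 *   probes  - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 */
static int set_keepalive(int fd, int idle_s, int intvl_s, int probes)
{
    int on = 1;

    assert(fd >= 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

For example, `set_keepalive(fd, 120, 30, 4)` would send the first probe after two idle minutes, which is below the 300 second lower bound the report cites for middlebox idle timeouts.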

^ permalink raw reply

* Re: Re: Re: [PATCH,net-next] tcp: Add TCP ROCCET congestion control module.
From: Lukas Prause @ 2026-04-15 14:15 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Tim Fuechsel, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Kuniyuki Iwashima,
	linux-kernel, netdev
In-Reply-To: <CADVnQy=+j3-Nkjn7hHax=GUcPrfJiM2iYbwgPa38aAZgL1BuQw@mail.gmail.com>

Thank you for your quick reply!

>>> Please reference figures in the paper and mention specific concrete
>>> numerical examples of latency reductions to quantify these statements.
>> Figures 5 and 6 show the performance of ROCCET in stationary and mobile
>> scenarios (https://arxiv.org/pdf/2510.25281). In the analyzed scenario,
>> we have observed a lower sRTT with ROCCET than with BBRv3 and CUBIC. The
>> observed throughput was marginally lower than that of BBRv3, but still
>> on a similar level. A detailed quantitative evaluation can be found in
>> the paper in sections VI and VII.
> In https://arxiv.org/pdf/2510.25281 zooming into the Figure 6 sRTT
> box-and-whisker-plot seems to show that BBRv3 actually has a lower
> median sRTT value than ROCCET. So that statement seems misleading?
>
> I would recommend using numerical examples in the commit message to
> quantify the gains from ROCCET and avoid potential issues from visual
> interpretation of graphs.
Thanks for pointing this out. We created new figures that include the
numerical values for Figures 5 [1] and 6 [2]. In Figure 5, i.e., our
stationary measurements, it can be seen that ROCCET obtains lower sRTTs
while maintaining a similar throughput to BBRv3. In Figure 6, i.e., our
mobile measurements, ROCCET and BBRv3 have an overall similar performance.
We will adjust this statement.
[1] https://seafile.cloud.uni-hannover.de/f/cc39263dad6b45ca9952/
[2] https://seafile.cloud.uni-hannover.de/f/9556ec768c084fe2ae40/

>>> Can you please elaborate on this statement here? AFAICT from figures 7
>>> and 8 in https://arxiv.org/pdf/2510.25281 it seems ROCCET is
>>> essentially starved by CUBIC when sharing a bottleneck with CUBIC when
>>> the bottleneck has 2*BDP or more of buffering. AFAICT it sounds like
>>> ROCCET does have "fairness issues when sharing a link with TCP CUBIC"?
>> Our main use case is a connection where the bottleneck link is in the
>> cellular network, where the bottleneck queue is typically not shared
>> between flows. "Fairness" between flows is being implemented by the base
>> station's scheduler. In this scenario, ROCCET achieves its objective to
>> not "bloat" its own queue.
>>
>> We have performed additional fairness experiments in non-cellular
>> networks (figures 7 and 8). Here we show that even when used in other
>> types of networks, ROCCET does not cause harm (see
>> https://dl.acm.org/doi/10.1145/3365609.3365855) to other congestion
>> control.
> I do not see you objecting to my statement, "it seems ROCCET is
> essentially starved by CUBIC when sharing a bottleneck with CUBIC when
> the bottleneck has 2*BDP or more of buffering." So I guess you agree.
>
> IMHO it's important to keep in mind that a congestion control that
> starves in the presence of CUBIC may have limited deployment. This is
> a key reason why Vegas was never deployed at scale.
We see the main use case for deploying ROCCET in cellular networks, but
we agree that in other types of networks, it might be starved by other
congestion control. We argue that this makes ROCCET different from
Vegas, in that there is a specific environment where its deployment can
be advantageous.

>>> Please specify what side effect or side effects ROCCET is claiming to
>>> solve (presumably bufferbloat?).
>> The side effect we observe in cellular networks is that, in particular,
>> for loss-based congestion control, the cwnd often gets 'frozen' at a
>> size that is too large for the BDP of the current link. This effect is
>> caused by the TCP cwnd validation, which at some point stops increasing
>> the cwnd because it assumes that the sender is application-limited.
>> However, this often leads to a cwnd size that is too large for the link,
>> but too small to cause a congestion event by overfilling the buffer. The
>> result is a standing queue that causes permanently high RTTs. Figure 2
>> in the paper (https://arxiv.org/pdf/2510.25281) shows the described
>> behaviour for a single TCP CUBIC flow.
> OK, so that sounds like you are describing the standard bufferbloat
> problem. So you could replace the phrase "solves an unwanted side
> effects of CUBIC’s implementation"  in your comment with something
> like: "avoids the bufferbloat problems inherent in CUBIC."
With this statement, we wanted to describe the specific mechanism in the
TCP CUBIC implementation that can lead to bufferbloat, particularly in
cellular networks. But you are right, the result of this mechanism is
still the standard bufferbloat problem.

Thanks,
Lukas

^ permalink raw reply

* Re: rust: net: phy: intent for MAE0621A (out-of-tree C -> Rust), request for target guidance
From: wenzhaoliao @ 2026-04-15 14:18 UTC (permalink / raw)
  To: andrew, hkallweit1, fujita.tomonori
  Cc: linux, tmgross, ojeda, netdev, rust-for-linux
In-Reply-To: <AFkAcAA-KLHz8L2oAyS3qqrb.1.1776247198201.Hmail.2023000929@ruc.edu.cn>



Hello Andrew,


Thank you for the concrete questions. They were very helpful.


We agree with your main concern: using a low-quality out-of-tree C driver as a
mechanical translation baseline would not produce something acceptable
upstream. If we continue with MAE0621A, the existing out-of-tree C code would
only be one input for behavior discovery, not the quality baseline for the
submitted driver.


After your reply, we reviewed the public `maxio.c` driver you pointed to, and
we agree that it has a number of issues that would need to be fixed before any
upstreamable Rust driver could be justified. For example:


- paged register access is open-coded and does not robustly propagate or
  restore errors;
- several vendor sequences use magic page/register values with no documented
  rationale in the driver;
- there are unconditional resets and fixed `mdelay`/`msleep` delays without a
  clear completion check or justification;
- debugging uses raw `printk()` calls;
- some helper return values are ignored, and `ret |= ...` is not a good fit
  for mainline-style error handling;
- the MMD / EEE handling looks narrowly special-cased and would need to be
  re-checked against phylib conventions and proper documentation.


So our plan, if we continue with this direction, would be to treat it as a
clean-room, mainline-quality driver effort informed by documentation, phylib
conventions, and board-level requirements, and only secondarily by the
existing out-of-tree code.


At the same time, we should also be explicit that we do not currently have
MAE0621A hardware in hand, nor sufficient public documentation to claim that
it is already a well-grounded first target. Our current local setup is useful
for Rust-for-Linux build/tooling validation and limited non-hardware checks,
but not for real hardware-backed PHY validation.


That is exactly why we wanted to ask early, before investing too much work in
the wrong target.


If there is a more suitable out-of-tree PHY target, PHY variant, or board/PHY
setup for an initial Rust PHY submission, we would greatly appreciate your
guidance. We are specifically trying to avoid duplicating an existing in-tree
C PHY driver, and if a realistic and useful target requires buying suitable
hardware, we are willing to do that.


Likewise, if there is some PHY-side Rust task that would be more useful than a
device-specific Rust port, we would also appreciate that guidance. We do not
want to force a C-to-Rust migration target where maintainers would prefer a
different kind of contribution.


If recommending a target is not practical, our fallback would be to stay at
the analysis stage for now:


- write up a short review of the public MAE0621A C driver and the issues that
  would need to be fixed for mainline quality; and
- map those requirements against the current Rust PHY abstraction surface,
  without claiming hardware-backed validation yet.


Would that limited analysis-first step still be useful, or would you prefer
that we pause the driver-RFC direction until we have a better target and
hardware setup?


Thank you again for the guidance.


Best regards,
Liao Wenzhao






From: wenzhaoliao <wenzhaoliao@ruc.edu.cn>
Sent: 2026-04-15 17:59:58
To: andrew@lunn.ch, hkallweit1@gmail.com, fujita.tomonori@gmail.com
Cc: linux@armlinux.org.uk, tmgross@umich.edu, ojeda@kernel.org, netdev@vger.kernel.org, rust-for-linux@vger.kernel.org
Subject: rust: net: phy: intent for MAE0621A (out-of-tree C -> Rust), request for target guidance
>Hello PHY and Rust maintainers,
>
>
>I am a PhD student working on a C-to-Rust migration tool for systems code.
>We would like to validate it in Linux with one concrete PHY target and would
>like to confirm direction before posting a larger RFC series.
>
>
>Scope of this intent:
>- Initial target: MAE0621A (currently out-of-tree C driver).
>- We do NOT intend to submit a duplicate Rust rewrite of an existing in-tree C PHY driver.
>- Goal: evaluate a semi-automatic abstraction completion workflow:
>  reuse existing Rust PHY abstractions where possible, and add only minimal missing abstractions.
>
>
>Planned deliverables:
>- A gap analysis between MAE0621A C callbacks and current rust/kernel/net/phy.rs coverage.
>- A small RFC patch series with minimal abstraction additions (if needed).
>- A MAE0621A Rust driver prototype on top of those abstractions for linux-next/rust-next evaluation.
>
>
>Quality and process commitments:
>- Full human review by submitters; we can explain all submitted code.
>- Transparent disclosure of tool assistance in cover letters/changelogs.
>- Hardware-backed test results and explicit limitations in each posting.
>
>
>Questions:
>1. Is MAE0621A an acceptable first target for this direction?
>2. If MAE0621A is not suitable, could you recommend one or two better out-of-tree PHY drivers for a first Rust submission?
>3. For review flow, do you prefer:
>   (a) abstractions-first RFC, then driver, or
>   (b) minimal abstractions + concrete driver in one RFC series?
>
>
>If there are no objections, we plan to post an RFC 0/N in about 2 weeks.
>
>
>Thanks for your guidance.
>
>
>Best regards,
>Liao Wenzhao
>Renmin University of China
>

^ permalink raw reply

* Re: [PATCH net-next 2/3] net/ethernet/zte/dinghai: add logging infrastructure
From: Andrew Lunn @ 2026-04-15 14:19 UTC (permalink / raw)
  To: Junyang Han
  Cc: netdev, davem, andrew+netdev, edumazet, kuba, pabeni, ran.ming,
	han.chengfei, zhang.yanze
In-Reply-To: <20260415015334.2018453-2-han.junyang@zte.com.cn>

> +#define dh_dbg(__dev, format, ...)                    \
> +    dev_dbg((__dev)->device, "%s:%d:(pid %d): " format,        \
> +         __func__, __LINE__, current->pid,            \
> +         ##__VA_ARGS__)
> +
> +#define dh_dbg_once(__dev, format, ...)                \
> +    dev_dbg_once((__dev)->device,                \
> +             "%s:%d:(pid %d): " format,            \
> +             __func__, __LINE__, current->pid,        \
> +             ##__VA_ARGS__)

There is a dislike for adding wrappers around the existing kernel log
functions. What value does this add?

> +int debug_print;
> +module_param(debug_print, int, 0644);

Module parameters are not liked.

> +module_param_named(debug_mask, dh_debug_mask, uint, 0644);
> +MODULE_PARM_DESC(debug_mask, "debug mask: 1 = dump cmd data, 2 =
>  dump cmd exec time, 3 = both. Default=0");
> +static bool probe_vf = 1;
> +module_param(probe_vf, bool, 0644);
> +MODULE_PARM_DESC(probe_vf, "probe_vf: 0 = N, 1 = Y");

I suggest you remove all these.

Debug code is reasonable to have during development. But once the
driver actually works, it should not be needed.

>  const struct pci_device_id dh_pf_pci_table[] = {
> @@ -39,26 +47,34 @@ static int dh_pf_pci_init(struct dh_core_dev *dev)
>      pci_set_drvdata(dev->pdev, dev);
>  
>      ret = pci_enable_device(dev->pdev);
> -    if (ret)
> +    if (ret) {
> +        LOG_ERR("pci_enable_device failed: %d\n", ret);

dev_err(dev, ..)

>          return -ENOMEM;
> +    }
>  
>      ret = dma_set_mask_and_coherent(dev->device, DMA_BIT_MASK(64));
>      if (ret) {
>          ret = dma_set_mask_and_coherent(dev->device, DMA_BIT_MASK(32));
> -        if (ret)
> +        if (ret) {
> +            LOG_ERR("dma_set_mask_and_coherent failed: %d\n", ret);

dev_err(dev, ...

And maybe it should actually be dev_err_probe() if this is a probe
function, and you need to handle -EPROBE_DEFER.

Debug logging should be limited to actual debug stuff. Please use the
standard logging primitives for anything which is not debug, so
dev_err(), dev_warn(), dev_info(). netdev_err(), netdev_warn() etc.

	Andrew

^ permalink raw reply

* Re: [PATCH] net/sched: act_mirred: Fix blockcast recursion bypass leading to stack overflow
From: Simon Horman @ 2026-04-15 14:19 UTC (permalink / raw)
  To: hxzene
  Cc: 'Simon Horman', jhs, jiri, davem, edumazet, kuba, pabeni,
	victor, pctammela, netdev, linux-kernel
In-Reply-To: <20260413082027.2244884-1-hxzene@gmail.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
net/sched: act_mirred: Fix blockcast recursion bypass leading to stack overflow

This patch fixes a stack overflow bug where mirred blockcast actions would
bypass the recursion limit check because the recursion counter was not
incremented for blockcast actions.

> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> index 05e0b14b57731..5928fcf3e6514 100644
> --- a/net/sched/act_mirred.c
> +++ b/net/sched/act_mirred.c
> @@ -444,8 +444,12 @@ TC_INDIRECT_SCOPE int tcf_mirred_act(struct sk_buff *skb,
>  	tcf_action_update_bstats(&m->common, skb);
>  
>  	blockid = READ_ONCE(m->tcfm_blockid);
> -	if (blockid)
> -		return tcf_blockcast(skb, m, blockid, res, retval);
> +	if (blockid) {
> +		xmit->sched_mirred_nest++;

Could this leave a stale pointer in sched_mirred_dev since netdev_xmit is
persistent per-CPU (or per-task in PREEMPT_RT) memory?

If a nested mirred action later iterates over sched_mirred_dev up to the new
nesting depth to detect routing loops, could it match a stale pointer from
whatever unrelated packet previously reached this nesting level?

This might cause a false-positive loop detection and incorrectly drop a valid
packet. Should we set xmit->sched_mirred_dev[xmit->sched_mirred_nest] = NULL
before incrementing the counter?

> +		retval = tcf_blockcast(skb, m, blockid, res, retval);

This isn't a regression introduced by this patch, but it exists in the early
error returns of tcf_mirred_act() (e.g., when the target device is gone, or a
routing loop is detected) and in tcf_blockcast() (e.g., when
xa_empty(&block->ports) is true).

In these error paths, the functions simply return retval. For mirred redirect
actions, retval is initialized to m->tcf_action, which is typically
TC_ACT_STOLEN.

When TC_ACT_STOLEN is returned, the TC framework assumes the action has taken
ownership of the packet and will not free the sk_buff. Since the code does
not free the sk_buff before returning in these error paths, does this leak
the sk_buff?

Could an unprivileged user inside a network namespace trigger this
intentionally by creating a routing loop with mirred redirect rules, leading
to continuous sk_buff leaks and system memory exhaustion?

Should these error paths ensure the sk_buff is freed or force the return value
to TC_ACT_SHOT if the action is a redirect, mirroring the existing behavior in
tcf_mirred_to_dev()?

> +		xmit->sched_mirred_nest--;
> +		return retval;
> +	}

^ permalink raw reply

* Re: [PATCH] net/sched: act_mirred: fix wrong device for mac_header_xmit check in tcf_blockcast_redir
From: Simon Horman @ 2026-04-15 14:27 UTC (permalink / raw)
  To: Dudu Lu; +Cc: netdev, jhs, jiri
In-Reply-To: <20260413084927.71353-1-phx0fer@gmail.com>

On Mon, Apr 13, 2026 at 04:49:27PM +0800, Dudu Lu wrote:
> In tcf_blockcast_redir(), when iterating block ports to redirect
> packets to multiple devices, the mac_header_xmit flag is queried
> from the wrong device. The loop sends to dev_prev but queries
> dev_is_mac_header_xmit(dev) — which is the NEXT device in the
> iteration, not the one being sent to.
> 
> This causes tcf_mirred_to_dev() to make incorrect decisions about
> whether to push or pull the MAC header. When the block contains
> mixed device types (e.g., an ethernet veth and a tunnel device),
> intermediate devices get the wrong mac_header_xmit flag, leading to
> skb header corruption. In the worst case, skb_push_rcsum with an
> incorrect mac_len can exhaust headroom and panic.
> 
> The last device in the loop is handled correctly (line 365-366 uses
> dev_is_mac_header_xmit(dev_prev)), confirming this is a copy-paste
> oversight for the intermediate devices.
> 
> Fix by using dev_prev instead of dev for the mac_header_xmit query,
> consistent with the device actually being sent to.
> 
> Fixes: 42f39036cda8 ("net/sched: act_mirred: Allow mirred to block")
> Signed-off-by: Dudu Lu <phx0fer@gmail.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* [PATCH v7 0/5] netem: bug fixes
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger

These bugs were found when doing AI-assisted review of sch_netem.c
during investigation of the packet duplication recursion problem
addressed in Jamal's series.

The fixes cover:

 - probability gaps in the 4-state Markov loss model
 - queue limit not accounting for reordered packets
 - PRNG reseeded on every tc change, breaking reproducibility
 - slot delay configuration not validated for inverted ranges
 - slot delay arithmetic overflow for ranges above ~2.1 seconds

v7 - queue limit check Fixes: goes back further to earlier change
   - use NL_SET_ERR_MSG_ATTR

Stephen Hemminger (5):
  net/sched: netem: fix probability gaps in 4-state loss model
  net/sched: netem: fix queue limit check to include reordered packets
  net/sched: netem: only reseed PRNG when seed is explicitly provided
  net/sched: netem: check for invalid slot range
  net/sched: netem: fix slot delay calculation overflow

 net/sched/sch_netem.c | 44 +++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 12 deletions(-)

-- 
2.53.0


^ permalink raw reply

* [PATCH v7 1/5] net/sched: netem: fix probability gaps in 4-state loss model
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list
In-Reply-To: <20260415142822.133241-1-stephen@networkplumber.org>

The 4-state Markov chain in loss_4state() has gaps at the boundaries
between transition probability ranges. The comparisons use:

  if (rnd < a4)
  else if (a4 < rnd && rnd < a1 + a4)

When rnd equals a boundary value exactly, neither branch matches and
no state transition occurs. The redundant lower-bound check (a4 < rnd)
is already implied by being in the else branch.

Remove the unnecessary lower-bound comparisons so the ranges are
contiguous and every random value produces a transition, matching
the GI (General and Intuitive) loss model specification.

This bug goes back to the original implementation of this model.

Fixes: 661b79725fea ("netem: revised correlated loss generator")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 20df1c08b1e9..8ee72cac1faf 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -227,10 +227,10 @@ static bool loss_4state(struct netem_sched_data *q)
 		if (rnd < clg->a4) {
 			clg->state = LOST_IN_GAP_PERIOD;
 			return true;
-		} else if (clg->a4 < rnd && rnd < clg->a1 + clg->a4) {
+		} else if (rnd < clg->a1 + clg->a4) {
 			clg->state = LOST_IN_BURST_PERIOD;
 			return true;
-		} else if (clg->a1 + clg->a4 < rnd) {
+		} else {
 			clg->state = TX_IN_GAP_PERIOD;
 		}
 
@@ -247,9 +247,9 @@ static bool loss_4state(struct netem_sched_data *q)
 	case LOST_IN_BURST_PERIOD:
 		if (rnd < clg->a3)
 			clg->state = TX_IN_BURST_PERIOD;
-		else if (clg->a3 < rnd && rnd < clg->a2 + clg->a3) {
+		else if (rnd < clg->a2 + clg->a3) {
 			clg->state = TX_IN_GAP_PERIOD;
-		} else if (clg->a2 + clg->a3 < rnd) {
+		} else {
 			clg->state = LOST_IN_BURST_PERIOD;
 			return true;
 		}
-- 
2.53.0


^ permalink raw reply related
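
The boundary gap described in patch 1/5 can be shown outside the kernel with a small user-space sketch. The functions below only mirror the shape of the comparisons in the first hunk; `pick_old`, `pick_new` and the `transition` enum are illustrative names, not kernel identifiers, and the values passed in are assumed to be small enough that `a1 + a4` does not wrap.

```c
#include <assert.h>
#include <stdint.h>

enum transition {
    TO_LOST_IN_GAP,
    TO_LOST_IN_BURST,
    TO_TX_IN_GAP,
    NO_TRANSITION,          /* none of the branches matched */
};

/* Comparisons as they were before the patch: strict on both sides. */
static enum transition pick_old(uint32_t rnd, uint32_t a1, uint32_t a4)
{
    assert(a1 <= UINT32_MAX - a4);      /* keep a1 + a4 from wrapping */

    if (rnd < a4)
        return TO_LOST_IN_GAP;
    else if (a4 < rnd && rnd < a1 + a4)
        return TO_LOST_IN_BURST;
    else if (a1 + a4 < rnd)
        return TO_TX_IN_GAP;
    return NO_TRANSITION;               /* rnd == a4 or rnd == a1 + a4 */
}

/* Comparisons after the patch: contiguous ranges, final plain else. */
static enum transition pick_new(uint32_t rnd, uint32_t a1, uint32_t a4)
{
    assert(a1 <= UINT32_MAX - a4);

    if (rnd < a4)
        return TO_LOST_IN_GAP;
    else if (rnd < a1 + a4)
        return TO_LOST_IN_BURST;
    else
        return TO_TX_IN_GAP;
}
```

With `a1 = 100` and `a4 = 50`, the old form returns `NO_TRANSITION` for `rnd` equal to 50 or 150, while the new form maps those values to the burst-loss and gap-transmit states respectively. Every other input gives the same answer in both versions.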

* [PATCH v7 2/5] net/sched: netem: fix queue limit check to include reordered packets
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Martin Ottens, open list
In-Reply-To: <20260415142822.133241-1-stephen@networkplumber.org>

The queue limit check in netem_enqueue() uses q->t_len which only
counts packets in the internal tfifo. Packets placed in sch->q by
the reorder path (__qdisc_enqueue_head) are not counted, allowing
the total queue occupancy to exceed sch->limit under reordering.

Include sch->q.qlen in the limit check.

Fixes: f8d4bc455047 ("net/sched: netem: account for backlog updates from child qdisc")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 8ee72cac1faf..d400a730eadd 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -524,7 +524,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 				1 << get_random_u32_below(8);
 	}
 
-	if (unlikely(q->t_len >= sch->limit)) {
+	if (unlikely(sch->q.qlen >= sch->limit)) {
 		/* re-link segs, so that qdisc_drop_all() frees them all */
 		skb->next = segs;
 		qdisc_drop_all(skb, sch, to_free);
-- 
2.53.0


^ permalink raw reply related

* [PATCH v7 3/5] net/sched: netem: only reseed PRNG when seed is explicitly provided
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	François Michel, open list
In-Reply-To: <20260415142822.133241-1-stephen@networkplumber.org>

netem_change() unconditionally reseeds the PRNG on every tc change
command. If TCA_NETEM_PRNG_SEED is not specified, a new random seed
is generated, destroying reproducibility for users who set a
deterministic seed on a previous change.

Move the initial random seed generation to netem_init() and only
reseed in netem_change() when TCA_NETEM_PRNG_SEED is explicitly
provided by the user.

Fixes: 4072d97ddc44 ("netem: add prng attribute to netem_sched_data")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index d400a730eadd..556f9747f0e7 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -1112,11 +1112,10 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
 	/* capping jitter to the range acceptable by tabledist() */
 	q->jitter = min_t(s64, abs(q->jitter), INT_MAX);
 
-	if (tb[TCA_NETEM_PRNG_SEED])
+	if (tb[TCA_NETEM_PRNG_SEED]) {
 		q->prng.seed = nla_get_u64(tb[TCA_NETEM_PRNG_SEED]);
-	else
-		q->prng.seed = get_random_u64();
-	prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+		prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+	}
 
 unlock:
 	sch_tree_unlock(sch);
@@ -1139,6 +1138,9 @@ static int netem_init(struct Qdisc *sch, struct nlattr *opt,
 		return -EINVAL;
 
 	q->loss_model = CLG_RANDOM;
+	q->prng.seed = get_random_u64();
+	prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+
 	ret = netem_change(sch, opt, extack);
 	if (ret)
 		pr_info("netem: change failed\n");
-- 
2.53.0


^ permalink raw reply related

* [PATCH v7 4/5] net/sched: netem: check for invalid slot range
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Yousuk Seung, Neal Cardwell, open list
In-Reply-To: <20260415142822.133241-1-stephen@networkplumber.org>

Reject slot configuration where min_delay exceeds max_delay.
The delay range computation in get_slot_next() underflows in
this case, producing bogus results.

Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 556f9747f0e7..8593e62f3c6a 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -827,6 +827,19 @@ static int get_dist_table(struct disttable **tbl, const struct nlattr *attr)
 	return 0;
 }
 
+static int validate_slot(const struct nlattr *attr,
+			 struct netlink_ext_ack *extack)
+{
+	const struct tc_netem_slot *c = nla_data(attr);
+
+	if (c->min_delay > c->max_delay) {
+		NL_SET_ERR_MSG_ATTR(extack, attr,
+				    "slot min delay greater than max delay");
+		return -EINVAL;
+	}
+	return 0;
+}
+
 static void get_slot(struct netem_sched_data *q, const struct nlattr *attr)
 {
 	const struct tc_netem_slot *c = nla_data(attr);
@@ -1040,6 +1053,12 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
 			goto table_free;
 	}
 
+	if (tb[TCA_NETEM_SLOT]) {
+		ret = validate_slot(tb[TCA_NETEM_SLOT], extack);
+		if (ret)
+			goto table_free;
+	}
+
 	sch_tree_lock(sch);
 	/* backup q->clg and q->loss_model */
 	old_clg = q->clg;
-- 
2.53.0


^ permalink raw reply related

* [PATCH v7 5/5] net/sched: netem: fix slot delay calculation overflow
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Yousuk Seung, Neal Cardwell, open list
In-Reply-To: <20260415142822.133241-1-stephen@networkplumber.org>

get_slot_next() computes a random delay between min_delay and
max_delay using:

  get_random_u32() * (max_delay - min_delay) >> 32

This overflows signed 64-bit arithmetic when the delay range exceeds
approximately 2.1 seconds (2^31 nanoseconds), producing a negative
result that effectively disables slot-based pacing. This is a
realistic configuration for WAN emulation (e.g., slot 1s 5s).

Use mul_u64_u32_shr() which handles the widening multiply without
overflow.

Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 8593e62f3c6a..41e56908ab0c 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -659,9 +659,8 @@ static void get_slot_next(struct netem_sched_data *q, u64 now)
 
 	if (!q->slot_dist)
 		next_delay = q->slot_config.min_delay +
-				(get_random_u32() *
-				 (q->slot_config.max_delay -
-				  q->slot_config.min_delay) >> 32);
+			mul_u64_u32_shr(q->slot_config.max_delay - q->slot_config.min_delay,
+					get_random_u32(), 32);
 	else
 		next_delay = tabledist(q->slot_config.dist_delay,
 				       (s32)(q->slot_config.dist_jitter),
-- 
2.53.0


^ permalink raw reply related
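
The overflow fixed in patch 5/5 can be reproduced in user space. This is a sketch under stated assumptions: `slot_delay_old` emulates the wrapping 64-bit multiply through unsigned arithmetic (signed overflow is undefined in portable C), and `slot_delay_new` follows the split-into-halves approach that a generic `mul_u64_u32_shr()` with a shift of 32 can use. Both function names are hypothetical, and the conversion and right shift of a negative value rely on the usual GCC/Clang behaviour.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Old computation: (rnd * range) >> 32 evaluated in 64 bits.
 * The product is formed in uint64_t to emulate wrap-around, then
 * reinterpreted as signed, which is how the negative result arises.
 */
static int64_t slot_delay_old(uint32_t rnd, int64_t range)
{
    uint64_t prod = (uint64_t)rnd * (uint64_t)range;

    return (int64_t)prod >> 32;
}

/*
 * Widening multiply followed by a shift of 32, done without a 128-bit
 * type: range * rnd = hi * rnd * 2^32 + lo * rnd, so the result is
 * hi * rnd + ((lo * rnd) >> 32).
 */
static uint64_t slot_delay_new(uint32_t rnd, uint64_t range)
{
    uint64_t lo = (range & 0xffffffffULL) * rnd;
    uint64_t hi = (range >> 32) * rnd;

    assert(hi <= UINT64_MAX - (lo >> 32));  /* result must fit in 64 bits */
    return (lo >> 32) + hi;
}
```

For a 4 second range (as in `slot 1s 5s`) and the largest random value, the old form yields a negative delay while the new form yields 3999999999 ns, just under the full range. For small ranges the two agree.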

* [PATCH iwl-net v3 0/6] ixgbe: six bug fixes
From: Aleksandr Loktionov @ 2026-04-15 14:28 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov; +Cc: netdev

Six fixes for the ixgbe driver, covering a SWFW semaphore timeout
miscalculation, a security-relevant debugfs out-of-bounds, a broken
flow-control NVM-reset path, a false-success return in the cls_u32
nexthdr path, an adaptive-ITR u8 overflow, and wrong bit positions in
the UP-to-TC register normalisation.

Patches 1-3 fix issues that could result in functional regressions
(FW update failures, OOB MMIO, traffic stall after NVM update).
Patches 4-6 fix correctness bugs with user-visible effects.

Patch 3 guards against calling setup_fc() on 82599 backplane links:
on those interfaces setup_fc() resolves to prot_autoc_write() ->
ixgbe_reset_pipeline_82599(), which toggles IXGBE_AUTOC_AN_RESTART
and causes an infinite link-flap loop.  setup_fc() is now skipped for
ixgbe_media_type_backplane; fc_enable() is still called.  The failure-
path guard introduced in v2 (skip fc_enable when setup_fc fails) is
preserved.

Patch 5 reworks the ITR write-back to keep the mode flag
(IXGBE_ITR_ADAPTIVE_LATENCY, bit 7) and the usec delay in separate
operands until the final store, and clamps the delay to
[IXGBE_ITR_ADAPTIVE_MIN_USECS, IXGBE_ITR_ADAPTIVE_MAX_USECS] via
clamp_val().

Patch 6 corrects the Fixes: tag to 8b1c0b24d9af ("ixgbe: configure
minimal packet buffers to support TC") per Simon Horman.

Changes in v3:
 - cover: removed Patch 1 squash-history description (v1->v2 background
          no longer needed in the cover letter).
 - 1/6: add Reviewed-by: Simon Horman, Reviewed-by: Jacob Keller;
        no code change (Jacob suggested read_poll_timeout() but
        accepted as-is for net).
 - 2/6: add Reviewed-by: Simon Horman; no code change.
 - 3/6: add backplane-link guard in ixgbe_watchdog_update_link();
        skip setup_fc() when media type is ixgbe_media_type_backplane
        to prevent infinite link-flap on 82599 backplane interfaces.
 - 4/6: add Reviewed-by: Simon Horman; no code change.
 - 5/6: rework clamping -- use clamp_val() with mode and delay as
        separate operands; clamp to [IXGBE_ITR_ADAPTIVE_MIN_USECS,
        IXGBE_ITR_ADAPTIVE_MAX_USECS] instead of LATENCY-1.
 - 6/6: correct Fixes: tag to 8b1c0b24d9af; add Reviewed-by:
        Simon Horman.

Changes in v2:
 - 1/6: Squash two patches; fix commit msg ("200ms" -> "1s"); three
        explicit mac.type == comparisons instead of range check.
 - 2/6: Add Fixes: tag; reroute from iwl-next to iwl-net.
 - 3/6: Add Fixes: tag; reroute to iwl-net; skip fc_enable() when
        setup_fc() fails to avoid committing stale FC state.
 - 4/6: Add Fixes: tag; reroute from iwl-next to iwl-net.
 - 5/6: Add proper [N/M] patch numbering.
 - 6/6: Reroute to iwl-net; swap to (expr >> ..) & MASK operand order.

---

Aleksandr Loktionov (5):
  ixgbe: fix SWFW semaphore timeout for X550 family
  ixgbe: call ixgbe_setup_fc() before fc_enable() after NVM update
  ixgbe: fix cls_u32 nexthdr path returning success when no entry installed
  ixgbe: fix ITR value overflow in adaptive interrupt throttling
  ixgbe: fix integer overflow and wrong bit position in ixgbe_validate_rtr()

Paul Greenwalt (1):
  ixgbe: add bounds check for debugfs register access

 drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c |  4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 18 ++++++++++++------
 drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c    |  8 ++++++++
 3 files changed, 22 insertions(+), 8 deletions(-)
-- 
2.52.0

^ permalink raw reply

* [PATCH iwl-net v3 1/6] ixgbe: fix SWFW semaphore timeout for X550 family
From: Aleksandr Loktionov @ 2026-04-15 14:28 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov
  Cc: netdev, Simon Horman, Jacob Keller
In-Reply-To: <20260415142841.3222399-1-aleksandr.loktionov@intel.com>

According to FW documentation, the most time-consuming FW operation is
Shadow RAM (SR) dump which takes up to 3.2 seconds.  For X550 family
devices the module-update FW command can take over 4.5 s.  The default
semaphore loop runs 200 iterations with a 5 ms sleep each, giving a
maximum wait of 1 s -- not "200 ms" as previously stated in error.
This is insufficient for X550 family FW update operations and causes
spurious EBUSY failures.

Extend the SW/FW semaphore timeout from 1 s to 5 s (1000 iterations x
5 ms) for all three X550 variants: ixgbe_mac_X550, ixgbe_mac_X550EM_x,
and ixgbe_mac_x550em_a.  All three share the same FW and exhibit the
same worst-case latency.  Use three explicit mac.type comparisons rather
than a range check so future MAC additions are not inadvertently
captured.

The timeout variable is set immediately before the loop so the intent
is clear, with an inline comment stating the resulting maximum delay.
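
For readers checking the arithmetic, here is a minimal userspace sketch of
the timeout selection described above.  It is not the driver code: the enum
and helper names are hypothetical stand-ins, and only the iteration counts,
the 5 ms sleep, and the three X550 MAC variants come from this commit
message.

```c
/* Hypothetical stand-ins for the driver's MAC type enum. */
enum mac_type { MAC_82599, MAC_X540, MAC_X550, MAC_X550EM_X, MAC_X550EM_A };

#define SWFW_POLL_INTERVAL_MS	5	/* sleep per loop iteration */
#define SWFW_DEFAULT_ITERS	200	/* 200 x 5 ms = 1 s */
#define SWFW_X550_ITERS		1000	/* 1000 x 5 ms = 5 s */

/* Three explicit comparisons, so a MAC type added later is not captured. */
static unsigned int swfw_timeout_iters(enum mac_type type)
{
	if (type == MAC_X550 ||
	    type == MAC_X550EM_X ||
	    type == MAC_X550EM_A)
		return SWFW_X550_ITERS;

	return SWFW_DEFAULT_ITERS;
}

/* Worst-case wait in milliseconds for the given MAC type. */
static unsigned int swfw_max_wait_ms(enum mac_type type)
{
	return swfw_timeout_iters(type) * SWFW_POLL_INTERVAL_MS;
}
```

With these numbers the X550 family gets 5000 ms, which covers both the
3.2 s SR dump and the 4.5 s module update; other MACs keep 1000 ms.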

Fixes: 030eaece2d77 ("ixgbe: Add x550 SW/FW semaphore support")
Suggested-by: Soumen Karmakar <soumen.karmakar@intel.com>
Suggested-by: Marta Plantykow <marta.a.plantykow@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v2 -> v3:
 - Add Reviewed-by: Simon Horman, Reviewed-by: Jacob Keller; no code
   change (Jacob suggested read_poll_timeout() but accepted as-is).

v1 -> v2:
 - Squash with 0015 (X550EM extension); fix commit message ("200ms" was
   wrong, actual default is 1 s); replace >= / <= range check with three
   explicit mac.type == comparisons per Tony Nguyen.

 drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
index e67e2fe..a3c8f51 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
@@ -577,6 +577,15 @@ int ixgbe_acquire_swfw_sync_X540(struct ixgbe_hw *hw, u32 mask)
 
 	swmask |= swi2c_mask;
 	fwmask |= swi2c_mask << 2;
+	/* Extend to 5 s (1000 x 5 ms) for X550 family; default is 1 s
+	 * (200 x 5 ms).  FW SR-dump takes up to 3.2 s; module-update up
+	 * to 4.5 s.
+	 */
+	if (hw->mac.type == ixgbe_mac_X550 ||
+	    hw->mac.type == ixgbe_mac_X550EM_x ||
+	    hw->mac.type == ixgbe_mac_x550em_a)
+		timeout = 1000;
+
 	for (i = 0; i < timeout; i++) {
 		/* SW NVM semaphore bit is used for access to all
 		 * SW_FW_SYNC bits (not just NVM)
-- 
2.52.0

^ permalink raw reply related

* [PATCH iwl-net v3 2/6] ixgbe: add bounds check for debugfs register access
From: Aleksandr Loktionov @ 2026-04-15 14:28 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov
  Cc: netdev, Paul Greenwalt, Simon Horman
In-Reply-To: <20260415142841.3222399-1-aleksandr.loktionov@intel.com>

From: Paul Greenwalt <paul.greenwalt@intel.com>

Prevent out-of-bounds MMIO accesses triggered through user-controlled
register offsets.  IXGBE_HFDR (0x15FE8) is the highest valid MMIO
register in the ixgbe register map; any offset beyond it would address
unmapped memory.

Add a defense-in-depth check at two levels:

1. ixgbe_read_reg() -- the noinline register read accessor.  A
   WARN_ON_ONCE() guard here catches any future code path (including
   ioctl extensions) that might inadvertently pass an out-of-range
   offset without relying on higher layers to catch it first.
   ixgbe_write_reg() is a static inline called from the TX/RX hot path;
   adding WARN_ON_ONCE there would inline the check at every call site,
   so only the read path gets this guard.

2. ixgbe_dbg_reg_ops_write() -- the debugfs 'reg_ops' interface is the
   only current path where a raw, user-supplied offset enters the driver.
   Gating it before invoking the register accessors provides a clean,
   user-visible failure (silent ignore with no kernel splat) for
   deliberately malformed debugfs writes.

Add a reg <= IXGBE_HFDR guard to both the read and write paths in
ixgbe_dbg_reg_ops_write(), and a WARN_ON_ONCE + early-return guard to
ixgbe_read_reg().
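
The debugfs-level gate can be illustrated with a small userspace sketch.
This is an approximation under stated assumptions, not the driver code:
parse_write_cmd() and reg_offset_in_range() are hypothetical names, and
only the "%x %x" parse and the 0x15FE8 upper bound come from this patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define HFDR_OFFSET 0x15FE8u	/* IXGBE_HFDR value quoted above */

/* True when the offset does not exceed the highest valid register. */
static bool reg_offset_in_range(uint32_t reg)
{
	return reg <= HFDR_OFFSET;
}

/* Parse "<reg> <value>" in hex, as the write command does, and reject
 * malformed input or out-of-range offsets before any register access.
 */
static bool parse_write_cmd(const char *args, uint32_t *reg, uint32_t *value)
{
	unsigned int r, v;

	if (sscanf(args, "%x %x", &r, &v) != 2)
		return false;
	if (!reg_offset_in_range(r))
		return false;

	*reg = r;
	*value = v;
	return true;
}
```

An offset of exactly 0x15FE8 is accepted; anything above it is rejected
without touching the output parameters.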

Fixes: 91fbd8f081e2 ("ixgbe: added reg_ops file to debugfs")
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
v2 -> v3:
 - Add Reviewed-by: Simon Horman; no code change.

v1 -> v2:
 - Add Fixes: tag; reroute from iwl-next to iwl-net (security-relevant
   hardening for user-controllable out-of-bounds MMIO).

 drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c | 6 ++++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c
index 5b1cf49d..a6a19c0 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c
@@ -86,7 +86,8 @@ static ssize_t ixgbe_dbg_reg_ops_write(struct file *filp,
 		u32 reg, value;
 		int cnt;
 		cnt = sscanf(&ixgbe_dbg_reg_ops_buf[5], "%x %x", &reg, &value);
-		if (cnt == 2) {
+		/* bounds-check register offset */
+		if (cnt == 2 && reg <= IXGBE_HFDR) {
 			IXGBE_WRITE_REG(&adapter->hw, reg, value);
 			value = IXGBE_READ_REG(&adapter->hw, reg);
 			e_dev_info("write: 0x%08x = 0x%08x\n", reg, value);
@@ -97,7 +98,8 @@ static ssize_t ixgbe_dbg_reg_ops_write(struct file *filp,
 		u32 reg, value;
 		int cnt;
 		cnt = sscanf(&ixgbe_dbg_reg_ops_buf[4], "%x", &reg);
-		if (cnt == 1) {
+		/* bounds-check register offset */
+		if (cnt == 1 && reg <= IXGBE_HFDR) {
 			value = IXGBE_READ_REG(&adapter->hw, reg);
 			e_dev_info("read 0x%08x = 0x%08x\n", reg, value);
 		} else {

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 210c7b9..4a1f3c2 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -354,4 +354,6 @@ u32 ixgbe_read_reg(struct ixgbe_hw *hw, u32 reg)
 	if (ixgbe_removed(reg_addr))
 		return IXGBE_FAILED_READ_REG;
+	if (WARN_ON_ONCE(reg > IXGBE_HFDR))
+		return IXGBE_FAILED_READ_REG;
 	if (unlikely(hw->phy.nw_mng_if_sel &
 		     IXGBE_NW_MNG_IF_SEL_SGMII_ENABLE)) {
-- 
2.52.0

^ permalink raw reply related

* [PATCH iwl-net v3 3/6] ixgbe: call ixgbe_setup_fc() before fc_enable() after NVM update
From: Aleksandr Loktionov @ 2026-04-15 14:28 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov; +Cc: netdev
In-Reply-To: <20260415142841.3222399-1-aleksandr.loktionov@intel.com>

During an NVM update the PHY reset clears the Technology Ability Field
(IEEE 802.3 clause 37 register 7.10) back to hardware defaults.  When
the driver subsequently calls only hw->mac.ops.fc_enable() the SRRCTL
register is recalculated from stale autonegotiated capability bits,
which the MDD (Malicious Driver Detect) logic treats as an invalid
change and halts traffic on the PF.

Fix by calling ixgbe_setup_fc() immediately before fc_enable() in
ixgbe_watchdog_update_link() so that flow-control autoneg and the PHY
registers are re-programmed in the correct order after any reset.

Skip setup_fc() on backplane links: on 82599 backplane interfaces
setup_fc() resolves to prot_autoc_write() ->
ixgbe_reset_pipeline_82599() which toggles IXGBE_AUTOC_AN_RESTART.
Calling it unconditionally on link-up creates an infinite link-flap
loop because each AN-restart triggers another link-up event.  Guard
with a get_media_type() check and skip setup_fc() when the media type
is ixgbe_media_type_backplane; fc_enable() is still called.

Also handle the failure path: if setup_fc() returns an error its output
is invalid and calling fc_enable() on the unchanged hardware state would
repeat the exact MDD-triggering condition the fix is meant to prevent.
Skip fc_enable() in that case while still calling
ixgbe_set_rx_drop_en() which configures the independent RX-drop
behaviour.
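
The resulting decision logic can be sketched in userspace as follows.  The
struct and function names are hypothetical; the model only counts calls and
mirrors the three cases described above (backplane, setup_fc() failure,
setup_fc() success).

```c
#include <stdbool.h>

enum media_type { MEDIA_FIBER, MEDIA_COPPER, MEDIA_BACKPLANE };

/* Hypothetical model of the hardware ops; counters record what was called. */
struct hw_model {
	enum media_type media;
	bool has_setup_fc;	/* models hw->mac.ops.setup_fc != NULL */
	int setup_fc_ret;	/* value the modelled setup_fc() returns */
	int setup_fc_calls;
	int fc_enable_calls;
};

static void update_link_fc(struct hw_model *hw)
{
	/* setup_fc() is skipped entirely on backplane links. */
	if (hw->has_setup_fc && hw->media != MEDIA_BACKPLANE) {
		hw->setup_fc_calls++;
		if (hw->setup_fc_ret)
			return;	/* setup failed: do not commit stale FC state */
	}

	hw->fc_enable_calls++;
}
```

In this model a backplane link reaches fc_enable() without ever calling
setup_fc(), and a failing setup_fc() suppresses fc_enable().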

Fixes: 93c52dd0033b ("ixgbe: Merge watchdog functionality into service task")
Suggested-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
v2 -> v3:
 - Skip setup_fc() for ixgbe_media_type_backplane: unconditional call on
   82599 backplane links triggers prot_autoc_write() ->
   ixgbe_reset_pipeline_82599() -> IXGBE_AUTOC_AN_RESTART, causing an
   infinite link-flap loop (Simon Horman).

v1 -> v2:
 - Add Fixes: tag; reroute to iwl-net; handle setup_fc() failure by
   skipping fc_enable() so stale FC state is never committed to hardware.

 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 13 +++++++++++++
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 210c7b9..fc3bae9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8029,6 +8029,18 @@ static void ixgbe_watchdog_update_link(struct ixgbe_adapter *adapter)
 		pfc_en |= !!(adapter->ixgbe_ieee_pfc->pfc_en);
 
 	if (link_up && !((adapter->flags & IXGBE_FLAG_DCB_ENABLED) && pfc_en)) {
-		hw->mac.ops.fc_enable(hw);
+		/* Skip setup_fc() on backplane links: it resolves to
+		 * prot_autoc_write() -> ixgbe_reset_pipeline_82599() and
+		 * toggles IXGBE_AUTOC_AN_RESTART, causing infinite link-flap
+		 * on 82599 backplane interfaces.
+		 * If setup_fc() fails its output is invalid; skip fc_enable()
+		 * to avoid committing stale capability bits that trigger MDD.
+		 */
+		if (hw->mac.ops.setup_fc &&
+		    hw->mac.ops.get_media_type(hw) != ixgbe_media_type_backplane &&
+		    hw->mac.ops.setup_fc(hw))
+			e_warn(drv, "setup_fc failed, skipping fc_enable\n");
+		else
+			hw->mac.ops.fc_enable(hw);
 		ixgbe_set_rx_drop_en(adapter);
 	}
-- 
2.52.0

^ permalink raw reply related

* [PATCH iwl-net v3 4/6] ixgbe: fix cls_u32 nexthdr path returning success when no entry installed
From: Aleksandr Loktionov @ 2026-04-15 14:28 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov
  Cc: netdev, Simon Horman, Marcin Szycik
In-Reply-To: <20260415142841.3222399-1-aleksandr.loktionov@intel.com>

ixgbe_configure_clsu32() returns 0 (success) after the nexthdr loop
even when ixgbe_clsu32_build_input() fails for every candidate entry
and no jump-table slot is actually programmed.  Callers that test the
return value would then falsely believe the filter was installed.

The variable 'err' already tracks the last ixgbe_clsu32_build_input()
return value; if the loop completes with a successful break, err is 0.
If all attempts failed, err holds the last failure code.  Change the
unconditional 'return 0' to 'return err' so errors are propagated
correctly.
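
A simplified sketch of the pattern, for illustration only.  The function
names and the builder are hypothetical, and the initial value of err is an
assumption; the real loop in ixgbe_configure_clsu32() has more conditions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical builder: returns 0 on success or a negative errno. */
typedef int (*build_fn)(int candidate);

/* Example builder for illustration: only even candidates can be built. */
static int build_even_only(int candidate)
{
	return (candidate % 2 == 0) ? 0 : -EINVAL;
}

/* Try each candidate; stop at the first that builds.  Returning err
 * instead of a constant 0 reports failure when nothing was installed.
 */
static int install_first_match(const int *candidates, size_t n, build_fn build)
{
	int err = -EINVAL;	/* assumption: failure if no candidate is tried */
	size_t i;

	for (i = 0; i < n; i++) {
		err = build(candidates[i]);
		if (!err)
			break;
	}

	return err;
}
```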

Fixes: 1cdaaf5405ba ("ixgbe: Match on multiple headers for cls_u32 offloads")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
---
v2 -> v3:
 - Add Reviewed-by: Simon Horman; no code change.

v1 -> v2:
 - Add Fixes: tag; reroute from iwl-next to iwl-net (false-success
   return is a user-visible correctness bug, not a cleanup).

 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 210c7b9..6e7f8a9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10311,7 +10311,7 @@ static int ixgbe_configure_clsu32(struct ixgbe_adapter *adapter,
 				kfree(jump);
 			}
 		}
-		return 0;
+		return err;
 	}
 
 	input = kzalloc_obj(*input);
-- 
2.52.0

^ permalink raw reply related

* [PATCH iwl-net v3 5/6] ixgbe: fix ITR value overflow in adaptive interrupt throttling
From: Aleksandr Loktionov @ 2026-04-15 14:28 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov; +Cc: netdev
In-Reply-To: <20260415142841.3222399-1-aleksandr.loktionov@intel.com>

ixgbe_update_itr() packs a mode flag (IXGBE_ITR_ADAPTIVE_LATENCY,
bit 7) and a usecs delay (bits [6:0]) into an unsigned int, then
stores the combined value in ring_container->itr which is declared as
u8.  Values above 0xFF wrap on truncation, corrupting both the delay
and the mode flag on the next readback.

Keep the mode bit (IXGBE_ITR_ADAPTIVE_LATENCY) and the usec delay as
separate operands in the final store expression.  Clamp only the usecs
portion to [IXGBE_ITR_ADAPTIVE_MIN_USECS, IXGBE_ITR_ADAPTIVE_MAX_USECS]
using clamp_val() so that:
 - overflow cannot bleed into the mode bit (bit 7),
 - the delay cannot exceed 126 us (IXGBE_ITR_ADAPTIVE_MAX_USECS),
 - the delay cannot drop below 10 us (IXGBE_ITR_ADAPTIVE_MIN_USECS).
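
The store expression can be tried out in userspace with the sketch below.
The macro values (bit 7, 10 us, 126 us) come from the text above;
clamp_uint() is a local stand-in for the kernel's clamp_val(), and the
function names are hypothetical.  Note that the mode bit stored is whatever
bit 7 of the unsigned int holds at store time.

```c
#include <stdint.h>

#define ITR_ADAPTIVE_LATENCY	0x80u	/* mode flag, bit 7 */
#define ITR_ADAPTIVE_MIN_USECS	10u
#define ITR_ADAPTIVE_MAX_USECS	126u

/* Local stand-in for clamp_val(). */
static unsigned int clamp_uint(unsigned int v, unsigned int lo, unsigned int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Store as in the patch: mode and delay stay separate until the final OR. */
static uint8_t itr_pack(unsigned int itr)
{
	unsigned int mode = itr & ITR_ADAPTIVE_LATENCY;
	unsigned int usecs = itr & ~ITR_ADAPTIVE_LATENCY;

	return (uint8_t)(mode | clamp_uint(usecs, ITR_ADAPTIVE_MIN_USECS,
					   ITR_ADAPTIVE_MAX_USECS));
}

/* Old behaviour for comparison: plain truncation to u8. */
static uint8_t itr_pack_naive(unsigned int itr)
{
	return (uint8_t)itr;
}
```

For an input of 300 the naive store keeps only the low byte (44), while
the clamped store yields 126.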

Fixes: b4ded8327fea ("ixgbe: Update adaptive ITR algorithm")
Cc: stable@vger.kernel.org
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
v2 -> v3:
 - Use clamp_val() instead of min_t() to also guard the lower bound
   (IXGBE_ITR_ADAPTIVE_MIN_USECS); keep mode and delay as separate
   operands until final store; use IXGBE_ITR_ADAPTIVE_MAX_USECS (126)
   as upper bound instead of IXGBE_ITR_ADAPTIVE_LATENCY - 1 (127)
   (Simon Horman).

v1 -> v2:
 - Add proper [N/M] numbering so patchwork tracks it as part of the set;
   no code change.

 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 10 +++++++---
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 210c7b9..9f3ae21 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2886,11 +2886,17 @@ static void ixgbe_update_itr(struct ixgbe_q_vector *q_vector,
 				    IXGBE_ITR_ADAPTIVE_MIN_INC * 64) *
 		       IXGBE_ITR_ADAPTIVE_MIN_INC;
 		break;
 	}
 
 clear_counts:
-	/* write back value */
-	ring_container->itr = itr;
+	/* Separate mode bit (IXGBE_ITR_ADAPTIVE_LATENCY) from usec delay;
+	 * clamp delay to [MIN_USECS, MAX_USECS] before storing to prevent
+	 * u8 truncation from corrupting the mode flag or delay on readback.
+	 */
+	ring_container->itr = (itr & IXGBE_ITR_ADAPTIVE_LATENCY) |
+		clamp_val(itr & ~IXGBE_ITR_ADAPTIVE_LATENCY,
+			  IXGBE_ITR_ADAPTIVE_MIN_USECS,
+			  IXGBE_ITR_ADAPTIVE_MAX_USECS);
 
 	/* next update should occur within next jiffy */
 	ring_container->next_update = next_update + 1;
-- 
2.52.0

^ permalink raw reply related

* [PATCH iwl-net v3 6/6] ixgbe: fix integer overflow and wrong bit position in ixgbe_validate_rtr()
From: Aleksandr Loktionov @ 2026-04-15 14:28 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov
  Cc: netdev, Simon Horman
In-Reply-To: <20260415142841.3222399-1-aleksandr.loktionov@intel.com>

Two bugs in the same loop in ixgbe_validate_rtr():

1. The 3-bit traffic-class field was extracted by shifting a u32 and
   assigning the result directly to a u8.  For user priority 0 this is
   harmless; for UP[5..7] the shift leaves bits [15..21] in the u32
   which are then silently truncated when stored in u8.  Mask with
   IXGBE_RTRUP2TC_UP_MASK before the assignment so only the intended
   3 bits are kept.

2. When clearing an out-of-bounds entry the mask was always shifted by
   the fixed constant IXGBE_RTRUP2TC_UP_SHIFT (== 3), regardless of
   which loop iteration was being processed.  This means only UP1 (bit
   position 3) was ever cleared; UP0,2..7 (positions 0, 6, 9, ..., 21)
   were left unreset, so invalid TC mappings persisted in hardware and
   could mis-steer received packets to the wrong traffic class.
   Use i * IXGBE_RTRUP2TC_UP_SHIFT to target the correct 3-bit field
   for each iteration.
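
Both behaviours can be compared in a small userspace sketch.  The shift of
3 is stated above and the 0x7 mask matches the old code; MAX_TC of 8 is
inferred from UP0..UP7, and the function names are hypothetical.

```c
#include <stdint.h>

#define UP_SHIFT	3	/* IXGBE_RTRUP2TC_UP_SHIFT */
#define UP_MASK		0x7u	/* 3-bit traffic-class field */
#define MAX_TC		8	/* one field per user priority UP0..UP7 */

/* Fixed loop: mask the extracted field and clear the field for index i. */
static uint32_t validate_rtr(uint32_t reg, uint8_t tc)
{
	int i;

	for (i = 0; i < MAX_TC; i++) {
		uint8_t up2tc = (reg >> (i * UP_SHIFT)) & UP_MASK;

		if (up2tc > tc)
			reg &= ~(UP_MASK << (i * UP_SHIFT));
	}

	return reg;
}

/* Old loop for comparison: unmasked extract, fixed clear position. */
static uint32_t validate_rtr_old(uint32_t reg, uint8_t tc)
{
	int i;

	for (i = 0; i < MAX_TC; i++) {
		uint8_t up2tc = (uint8_t)(reg >> (i * UP_SHIFT));

		if (up2tc > tc)
			reg &= ~(0x7u << UP_SHIFT);
	}

	return reg;
}
```

With UP0 = 1 and UP2 = 5 (reg 0x141) and tc = 3, the fixed loop clears only
UP2, whereas the old loop leaves the invalid UP2 mapping in place.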

Swap the operand order in the mask expression to place the constant
on the right per kernel coding style (noted by David Laight).

Fixes: 8b1c0b24d9af ("ixgbe: configure minimal packet buffers to support TC")
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
v2 -> v3:
 - Correct Fixes: tag to 8b1c0b24d9af ("ixgbe: configure minimal packet
   buffers to support TC") -- the previously used e7589eab9291 predates
   the buggy code path (Simon Horman); add Reviewed-by: Simon Horman.

v1 -> v2:
 - Add Fixes: tag; reroute to iwl-net (wrong bit positions cause packet
   mis-steering); swap to (reg >> ...) & MASK operand order per David
   Laight.

 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 210c7b9..c9e4f12 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -9772,11 +9772,12 @@ static void ixgbe_validate_rtr(struct ixgbe_adapter *adapter, u8 tc)
 	rsave = reg;
 
 	for (i = 0; i < MAX_TRAFFIC_CLASS; i++) {
-		u8 up2tc = reg >> (i * IXGBE_RTRUP2TC_UP_SHIFT);
+		u8 up2tc = (reg >> (i * IXGBE_RTRUP2TC_UP_SHIFT)) &
+			   IXGBE_RTRUP2TC_UP_MASK;
 
 		/* If up2tc is out of bounds default to zero */
 		if (up2tc > tc)
-			reg &= ~(0x7 << IXGBE_RTRUP2TC_UP_SHIFT);
+			reg &= ~(IXGBE_RTRUP2TC_UP_MASK << (i * IXGBE_RTRUP2TC_UP_SHIFT));
 	}
 
 	if (reg != rsave)
-- 
2.52.0

^ permalink raw reply related

* Re: [PATCH net-next 3/3] net/ethernet/zte/dinghai: add hardware register access and PCI capability scanning
From: Andrew Lunn @ 2026-04-15 14:31 UTC (permalink / raw)
  To: Junyang Han
  Cc: netdev, davem, andrew+netdev, edumazet, kuba, pabeni, ran.ming,
	han.chengfei, zhang.yanze
In-Reply-To: <20260415015334.2018453-3-han.junyang@zte.com.cn>

> +int32_t zxdh_pf_pci_find_capability(struct pci_dev *pdev, uint8_t cfg_type,
> +                    uint32_t ioresource_types, int32_t *bars)
> +{
> +    int32_t pos = 0;
> +    uint8_t type = 0;
> +    uint8_t bar = 0;
> +
> +    for (pos = pci_find_capability(pdev, PCI_CAP_ID_VNDR); pos > 0;
> +         pos = pci_find_next_capability(pdev, pos, PCI_CAP_ID_VNDR)) {
> +        pci_read_config_byte(pdev, pos + offsetof
> (struct zxdh_pf_pci_cap, cfg_type), &type);
> +        pci_read_config_byte(pdev, pos + offsetof
> (struct zxdh_pf_pci_cap, bar), &bar);

Something odd going on with indentation? Has the mailer corrupted it?

> +
> +        /* ignore structures with reserved BAR values */
> +        if (bar > ZXDH_PF_MAX_BAR_VAL)
> +            continue;
> +
> +        if (type == cfg_type) {
> +            if (pci_resource_len(pdev, bar) &&
> +                pci_resource_flags(pdev, bar) & ioresource_types) {
> +                *bars |= (1 << bar);
> +                return pos;
> +            }
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +void __iomem *zxdh_pf_map_capability(struct dh_core_dev *dh_dev, int32_t off,
> +                     size_t minlen, uint32_t align,
> +                     uint32_t start, uint32_t size,
> +                     size_t *len, resource_size_t *pa,
> +                     uint32_t *bar_off)
> +    p = pci_iomap_range(pdev, bar, offset, length);
> +    if (unlikely(!p)) {

Is this hot path? Please only use unlikely() when dealing with frames
in the hot path.

> +int32_t zxdh_pf_common_cfg_init(struct dh_core_dev *dh_dev)
> +{
> +    int32_t common = 0;
> +    struct zxdh_pf_device *pf_dev = dh_core_priv(dh_dev);
> +    struct pci_dev *pdev = dh_dev->pdev;
> +
> +    /* check for a common config: if not, use legacy mode (bar 0). */
> +    common = zxdh_pf_pci_find_capability(pdev, ZXDH_PCI_CAP_COMMON_CFG,
> +                         IORESOURCE_IO | IORESOURCE_MEM,
> +                         &pf_dev->modern_bars);
> +    if (common == 0) {
> +        LOG_ERR("missing capabilities %i, leaving for legacy driver\
> n", common);
> +        return -ENODEV;
> +    }
> +
> +    pf_dev->common = zxdh_pf_map_capability(dh_dev, common,
> +                        sizeof(struct zxdh_pf_pci_common_cfg),
> +                        ZXDH_PF_ALIGN4, 0,
> +                        sizeof(struct zxdh_pf_pci_common_cfg),
> +                        NULL, NULL, NULL);
> +    if (unlikely(!pf_dev->common)) {
> +        LOG_ERR("pf_dev->common is null\n");
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}

> +int32_t zxdh_pf_notify_cfg_init(struct dh_core_dev *dh_dev)
> +{
> +    /* We don't know how many VQs we'll map, ahead of the time.
> +     * If notify length is small, map it all now. Otherwise, map each VQ individually later.
> +     */
> +    if ((uint64_t)notify_length + (notify_offset % PAGE_SIZE) <= PAGE_SIZE) {

Please try to avoid casts. They suggest the types are wrong. You will
probably have better code if you don't need the cast.

> +int32_t zxdh_pf_modern_cfg_init(struct dh_core_dev *dh_dev)
> +{
> +    int32_t ret = 0;
> +    struct zxdh_pf_device *pf_dev = dh_core_priv(dh_dev);
> +    struct pci_dev *pdev = dh_dev->pdev;
> +
> +    ret = zxdh_pf_common_cfg_init(dh_dev);
> +    if (ret != 0) {

if (ret)

would be more normal.

> +void zxdh_pf_get_vf_mac
> (struct dh_core_dev *dh_dev, uint8_t *mac, int32_t vf_id)
> +{
> +    uint32_t DEV_MAC_L = 0;
> +    uint16_t DEV_MAC_H = 0;
> +    struct zxdh_pf_device *pf_dev = dh_core_priv(dh_dev);
> +
> +    if (pf_dev->pf_sriov_cap_base) {
> +        DEV_MAC_L = ioread32((void __iomem *)(pf_dev->pf_sriov_cap_base +
> +                     (pf_dev->sriov_bar_size) * vf_id +
> +                     pf_dev->dev_cfg_bar_off));

Is the cast needed? pf_dev->pf_sriov_cap_base should already be
void __iomem *.

	Andrew

^ permalink raw reply

* Re: [PATCH] xfrm: iptfs: fix deadlock in iptfs_destroy_state
From: Simon Horman @ 2026-04-15 14:35 UTC (permalink / raw)
  To: phx0fer; +Cc: 'Simon Horman', netdev, steffen.klassert, herbert, davem
In-Reply-To: <20260413085138.72623-1-phx0fer@gmail.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
xfrm: iptfs: fix deadlock in iptfs_destroy_state

This commit resolves an ABBA deadlock in iptfs_destroy_state() by moving
the cancellation of iptfs_timer and drop_timer outside of their
respective spinlocks.

> diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
> index 97bc979e55baf..11291b87158cb 100644
> --- a/net/xfrm/xfrm_iptfs.c
> +++ b/net/xfrm/xfrm_iptfs.c
> @@ -2708,8 +2708,10 @@ static void iptfs_destroy_state(struct xfrm_state *x)
>  	if (!xtfs)
>  		return;
>  
> -	spin_lock_bh(&xtfs->x->lock);
>  	hrtimer_cancel(&xtfs->iptfs_timer);
> +	hrtimer_cancel(&xtfs->drop_timer);
> +
> +	spin_lock_bh(&xtfs->x->lock);
>  	__skb_queue_head_init(&list);
>  	skb_queue_splice_init(&xtfs->queue, &list);
>  	spin_unlock_bh(&xtfs->x->lock);

Is it safe to call hrtimer_cancel() and skb_queue_splice_init() on xtfs
here if it was created via iptfs_clone_state() but failed initialization?

If iptfs_clone_state() uses kmemdup() to shallow-copy the xtfs structure
from the original state, the copy includes xtfs->queue and the hrtimer
structs (iptfs_timer and drop_timer). Reinitialization only happens later
in iptfs_init_state() -> __iptfs_init_state().

If x->type->init_state() fails during __xfrm_init_state(), the state is
destroyed via xfrm_state_put() before iptfs_init_state() is called.

On this destruction path, does calling hrtimer_cancel() result in a
regression where it operates on the copied timers? If the original timer
was queued, the copied timer has the HRTIMER_STATE_ENQUEUED bit set, but
its rb_node is not natively in the rbtree. Calling hrtimer_cancel() might
call rb_erase_cached() using the cloned node's pointers (which point to
the original node's parent/children).

Additionally, since xtfs->queue is a shallow copy, its next/prev pointers
would point to the original state's queue. Could splicing it here lead to
a regression by modifying the original state's skb queue? If the original
queue was empty, it points to itself, and this code might dequeue it and
call kfree_skb() on an address inside orig->mode_data.
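
The empty-queue concern can be illustrated with a userspace sketch of a
circular list head.  This is a simplified model, not the kernel's
sk_buff_head, and it assumes the clone is a plain byte copy as the question
above supposes; all names here are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Minimal circular doubly linked list head. */
struct list_node {
	struct list_node *next, *prev;
};

struct state {
	int id;
	struct list_node queue;
};

static void list_init(struct list_node *h)
{
	h->next = h;
	h->prev = h;
}

/* An empty head points at itself. */
static bool list_is_empty(const struct list_node *h)
{
	return h->next == h;
}

/* Byte-for-byte copy: the embedded pointers still reference the source. */
static void shallow_clone(struct state *dst, const struct state *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```

After shallow_clone() of a state whose queue is empty, the copy's queue
does not look empty because its next pointer refers to the original head;
it only becomes a valid empty list once list_init() is run on the copy.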

^ permalink raw reply

* Re: [PATCH iproute2] ss: force a flush in monitor mode
From: patchwork-bot+netdevbpf @ 2026-04-15 14:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: dsahern, stephen, davem, kuba, pabeni, kuniyu, netdev,
	eric.dumazet
In-Reply-To: <20260415130307.1016393-1-edumazet@google.com>

Hello:

This patch was applied to iproute2/iproute2.git (main)
by Stephen Hemminger <stephen@networkplumber.org>:

On Wed, 15 Apr 2026 13:03:07 +0000 you wrote:
> Call fflush() from generic_show_sock() in order to work
> with pipes and redirects.
> 
> After this patch, "ss -E &>log_file" works as expected.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> [...]

Here is the summary with links:
  - [iproute2] ss: force a flush in monitor mode
    https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=4d82739fda71

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [net-next v1 1/3] net: phy: motorcomm: Add yt8531_set_ds() mdio_locked bool parameter
From: Andrew Lunn @ 2026-04-15 14:40 UTC (permalink / raw)
  To: Minda Chen
  Cc: Frank, Andrew Lunn, Heiner Kallweit, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20260415092654.64907-2-minda.chen@starfivetech.com>

On Wed, Apr 15, 2026 at 05:26:52PM +0800, Minda Chen wrote:
> By default, yt8531_set_ds() writes the register while taking the MDIO
> lock, and it is only called for the YT8531 PHY. The newer YT8531s
> supports RGMII and uses the same pin drive-strength settings as the
> YT8531, so it also needs to call yt8531_set_ds(). However, its config
> init function, yt8521_config_init(), already holds the MDIO lock via
> phy_select_page().
> 
> Add a lock-free ytphy path to yt8531_set_ds() and a new bool
> parameter for the YT8531s RGMII case.

This is ugly.

Please try to modify the code so that both PHYs can call
yt8531_set_ds() in the same locking context. You then don't need the
mdio_locked parameter.

    Andrew

---
pw-bot: cr

^ permalink raw reply

* [PATCH net v4] ipvs: fix MTU check for GSO packets in tunnel mode
From: Yingnan Zhang @ 2026-04-15 14:40 UTC (permalink / raw)
  To: ja, pablo
  Cc: coreteam, davem, edumazet, fw, horms, kuba, linux-kernel,
	lvs-devel, netdev, netfilter-devel, pabeni, phil, Yingnan Zhang

Currently, IPVS skips MTU checks for GSO packets by excluding them with
the !skb_is_gso(skb) condition. This creates problems when IPVS tunnel
mode encapsulates GSO packets with IPIP headers.

The issue manifests in two ways:

1. MTU violation after encapsulation:
   When a GSO packet passes through IPVS tunnel mode, the original MTU
   check is bypassed. After adding the IPIP tunnel header, the packet
   size may exceed the outgoing interface MTU, leading to unexpected
   fragmentation at the IP layer.

2. Fragmentation with problematic IP IDs:
   When net.ipv4.vs.pmtu_disc=1 and a GSO packet with multiple segments
   is fragmented after encapsulation, each segment gets a sequentially
   incremented IP ID (0, 1, 2, ...). This happens because:

   a) The GSO packet bypasses MTU check and gets encapsulated
   b) At __ip_finish_output, the oversized GSO packet is split into
      separate SKBs (one per segment), with IP IDs incrementing
   c) Each SKB is then fragmented again based on the actual MTU

   This sequential IP ID allocation differs from the expected behavior
   and can cause issues with fragment reassembly and packet tracking.

Fix this by properly validating GSO packets using
skb_gso_validate_network_len(). This function correctly validates
whether the GSO segments will fit within the MTU after segmentation. If
validation fails, send an ICMP Fragmentation Needed message to enable
proper PMTU discovery.
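
The difference between the old and new checks can be seen in a simplified
userspace model.  The struct and gso_segs_fit() are hypothetical stand-ins
for the skb and for skb_gso_validate_network_len(); the model assumes a
single per-segment network length.

```c
#include <stdbool.h>

/* Hypothetical packet model. */
struct pkt_model {
	unsigned int len;		/* total length, like skb->len */
	bool is_gso;
	unsigned int gso_seg_len;	/* network-layer length of one segment */
};

/* Stand-in for skb_gso_validate_network_len(): true if segments fit. */
static bool gso_segs_fit(const struct pkt_model *p, unsigned int mtu)
{
	return p->gso_seg_len <= mtu;
}

/* New check: GSO packets pass only if their segments fit the MTU. */
static bool exceeds_mtu(const struct pkt_model *p, unsigned int mtu)
{
	if (p->len <= mtu)
		return false;

	if (p->is_gso && gso_segs_fit(p, mtu))
		return false;

	return true;
}

/* Old check for comparison: GSO packets were never flagged. */
static bool exceeds_mtu_old(const struct pkt_model *p, unsigned int mtu)
{
	return p->len > mtu && !p->is_gso;
}
```

For a GSO packet with 1500-byte segments and an MTU of 1480, the old check
reports no violation while the new one does.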

Fixes: 4cdd34084d53 ("netfilter: nf_conntrack_ipv6: improve fragmentation handling")
Signed-off-by: Yingnan Zhang <342144303@qq.com>

---
v4:
- Introduce a new helper function ip_vs_exceeds_mtu() to improve readability (reviewer feedback)

v3: https://lore.kernel.org/netdev/tencent_73010FBD5FA1C05C3BC23A07A50B11CEC90A@qq.com/
v2: https://lore.kernel.org/netdev/tencent_CA2C1C219C99D315086BE55E8654AF7E6009@qq.com/
v1: https://lore.kernel.org/netdev/tencent_4A3E1C339C75D359093BE4F08648AFAA6009@qq.com/
---
 net/netfilter/ipvs/ip_vs_xmit.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 0fb5162992e5..64dfdf8b00c4 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -102,6 +102,18 @@ __ip_vs_dst_check(struct ip_vs_dest *dest)
 	return dest_dst;
 }
 
+/* Based on ip_exceeds_mtu(). */
+static bool ip_vs_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
+{
+	if (skb->len <= mtu)
+		return false;
+
+	if (skb_is_gso(skb) && skb_gso_validate_network_len(skb, mtu))
+		return false;
+
+	return true;
+}
+
 static inline bool
 __mtu_check_toobig_v6(const struct sk_buff *skb, u32 mtu)
 {
@@ -112,7 +124,7 @@ __mtu_check_toobig_v6(const struct sk_buff *skb, u32 mtu)
 		if (IP6CB(skb)->frag_max_size > mtu)
 			return true; /* largest fragment violate MTU */
 	}
-	else if (skb->len > mtu && !skb_is_gso(skb)) {
+	else if (ip_vs_exceeds_mtu(skb, mtu)) {
 		return true; /* Packet size violate MTU size */
 	}
 	return false;
@@ -232,7 +244,7 @@ static inline bool ensure_mtu_is_adequate(struct netns_ipvs *ipvs, int skb_af,
 			return true;
 
 		if (unlikely(ip_hdr(skb)->frag_off & htons(IP_DF) &&
-			     skb->len > mtu && !skb_is_gso(skb) &&
+			     ip_vs_exceeds_mtu(skb, mtu) &&
 			     !ip_vs_iph_icmp(ipvsh))) {
 			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
 				  htonl(mtu));
-- 
2.51.0.windows.1


^ permalink raw reply related

* Re: [net-next v1 2/3] net: motorcomm: phy: set drive strength in 8531s RGMII case
From: Andrew Lunn @ 2026-04-15 14:42 UTC (permalink / raw)
  To: Minda Chen
  Cc: Frank, Andrew Lunn, Heiner Kallweit, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20260415092654.64907-3-minda.chen@starfivetech.com>

On Wed, Apr 15, 2026 at 05:26:53PM +0800, Minda Chen wrote:
> Set RXD and RX CLK pin drive strength while in 8531s RGMII
> case.
> 
> Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
> ---
>  drivers/net/phy/motorcomm.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c
> index 35aff1519b4b..f3129419f7c9 100644
> --- a/drivers/net/phy/motorcomm.c
> +++ b/drivers/net/phy/motorcomm.c
> @@ -1714,6 +1714,11 @@ static int yt8521_config_init(struct phy_device *phydev)
>  		if (ret < 0)
>  			goto err_restore_page;
>  	}
> +
> +	if (phydev->drv->phy_id == PHY_ID_YT8531S &&
> +	    phydev->interface != PHY_INTERFACE_MODE_SGMII)
> +		ret = yt8531_set_ds(phydev, true);

phy_interface_is_rgmii().


^ permalink raw reply

