* Re: [PATCH 08/23] arm64: topology: Use RCU to protect access to HK_TYPE_TICK cpumask
From: Chen Ridong @ 2026-04-22 9:34 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný,
Jonathan Corbet, Shuah Khan, Catalin Marinas, Will Deacon,
K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
Guenter Roeck, Frederic Weisbecker, Paul E. McKenney,
Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, Anna-Maria Behnsen, Ingo Molnar,
Thomas Gleixner, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Ben Segall, Mel Gorman, Valentin Schneider,
K Prateek Nayak, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: cgroups, linux-doc, linux-kernel, linux-arm-kernel, linux-hyperv,
linux-hwmon, rcu, netdev, linux-kselftest, Costa Shulyupin,
Qiliang Yuan
In-Reply-To: <20260421030351.281436-9-longman@redhat.com>
On 2026/4/21 11:03, Waiman Long wrote:
> As the HK_TYPE_TICK cpumask is going to be changeable at run time, we
> need to use RCU to protect access to the cpumask to prevent it from
> going away in the middle of the operation.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> arch/arm64/kernel/topology.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index b32f13358fbb..48f150801689 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -173,6 +173,7 @@ void arch_cpu_idle_enter(void)
> if (!amu_fie_cpu_supported(cpu))
> return;
>
> + guard(rcu)();
> /* Kick in AMU update but only if one has not happened already */
> if (housekeeping_cpu(cpu, HK_TYPE_TICK) &&
> time_is_before_jiffies(per_cpu(cpu_amu_samples.last_scale_update, cpu)))
> @@ -187,11 +188,16 @@ int arch_freq_get_on_cpu(int cpu)
> unsigned int start_cpu = cpu;
> unsigned long last_update;
> unsigned int freq = 0;
> + bool hk_cpu;
> u64 scale;
>
> if (!amu_fie_cpu_supported(cpu) || !arch_scale_freq_ref(cpu))
> return -EOPNOTSUPP;
>
> + scoped_guard(rcu) {
> + hk_cpu = housekeeping_cpu(cpu, HK_TYPE_TICK);
> + }
> +
Should we put this into a while loop, since cpu might be changed to ref_cpu?
> while (1) {
>
> amu_sample = per_cpu_ptr(&cpu_amu_samples, cpu);
> @@ -204,16 +210,21 @@ int arch_freq_get_on_cpu(int cpu)
> * (and thus freq scale), if available, for given policy: this boils
> * down to identifying an active cpu within the same freq domain, if any.
> */
> - if (!housekeeping_cpu(cpu, HK_TYPE_TICK) ||
> + if (!hk_cpu ||
> time_is_before_jiffies(last_update + msecs_to_jiffies(AMU_SAMPLE_EXP_MS))) {
> struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> + bool hk_intersects;
> int ref_cpu;
>
> if (!policy)
> return -EINVAL;
>
> - if (!cpumask_intersects(policy->related_cpus,
> - housekeeping_cpumask(HK_TYPE_TICK))) {
> + scoped_guard(rcu) {
> + hk_intersects = cpumask_intersects(policy->related_cpus,
> + housekeeping_cpumask(HK_TYPE_TICK));
> + }
> +
> + if (!hk_intersects) {
> cpufreq_cpu_put(policy);
> return -EOPNOTSUPP;
> }
--
Best regards,
Ridong
^ permalink raw reply
* Re: [PATCH net 00/18] Remove a number of ISA and PCMCIA Ethernet drivers
From: Daniel Palmer @ 2026-04-22 9:33 UTC (permalink / raw)
To: David Laight
Cc: Andrew Lunn, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Jonathan Corbet,
Shuah Khan, linux-kernel, netdev, linux-doc
In-Reply-To: <20260422101316.0efdcf24@pumpkin>
Hi David,
On Wed, 22 Apr 2026 at 18:13, David Laight <david.laight.linux@gmail.com> wrote:
> Is marking them EXPERT or BROKEN enough?
> (Or a similar new option.)
I think EXPERT gives the wrong impression that they are difficult to
use and BROKEN makes it seem like they don't work.
NEEDSHOBBIES or LIVINGINTHEPAST ?
Seriously though, I think we should have something to mark stuff in
MAINTAINERS and elsewhere that is in the kernel but only because a few
people are having fun with it.
^ permalink raw reply
* Re: [PATCH net 1/2] net/mlx5e: psp: Fix invalid access on PSP dev registration fail
From: Cosmin Ratiu @ 2026-04-22 9:25 UTC (permalink / raw)
To: kuba@kernel.org
Cc: Boris Pismenny, willemdebruijn.kernel@gmail.com,
andrew+netdev@lunn.ch, daniel.zahka@gmail.com,
davem@davemloft.net, leon@kernel.org,
linux-kernel@vger.kernel.org, edumazet@google.com,
linux-rdma@vger.kernel.org, Rahul Rameshbabu, Raed Salem,
Dragos Tatulea, kees@kernel.org, Mark Bloch, pabeni@redhat.com,
Tariq Toukan, Saeed Mahameed, netdev@vger.kernel.org,
Gal Pressman
In-Reply-To: <20260421113210.4f6a8eb6@kernel.org>
On Tue, 2026-04-21 at 11:32 -0700, Jakub Kicinski wrote:
> On Tue, 21 Apr 2026 17:34:32 +0000 Cosmin Ratiu wrote:
> > > No, the normal thing to do is to propagate errors.
> > > If you want to diverge from that _you_ should have a reason,
> > > a better reason than a vague "kernel can fail".
> > > I'd prefer for the driver to fail in an obvious way.
> > > Which will be immediately spotted by the operator, not 2 weeks
> > > later when 10% of the fleet is upgraded already.
> > > The only exception I'd make is to keep devlink registered in
> > > case the fix is to flash a different FW.
> >
> > In this case, PSP not working would be spotted on the next PSP dev-
> > get
> > op which produces zilch instead of working devices.
>
> When you have X vendors times Y device generations times Z FW
> versions
> in your fleet dev-get returning nothing is not a failure. It just
> means
> you're running on a machine that's not capable. Best you can do to
> spot a buggy kernel is to notice that the fraction of PSP traffic is
> decreasing over time. After significant portion of the fleet is
> already
> on the bad kernel.
>
> > But I understand what you want. You'd like the netdevice to either
> > be
> > fully initialized with all supported+configured protocols or fail
> > the
> > open operation. No intermediate/partial states. This is a non-
> > trivial
> > refactor for mlx5, because mlx5_nic_enable() returns nothing.
> > Refactoring seems possible though, its only caller is
> > mlx5e_attach_netdev(), which returns errors. It's certainly not
> > something that should be done for a net fix though.
> >
> > I have a series pending for net-next where the PSP configuration is
> > hooked to mlx5e_psp_set_config(). I will look into implementing
> > what
> > you propose there and propagate errors.
> >
> > Meanwhile, do you want to take these fixes (1 and 2) or maybe just
> > 2
> > for net or not?
>
> Can you call mlx5e_psp_cleanup() when register fails for now?
Done for the next version, currently undergoing testing.
Cosmin.
^ permalink raw reply
* Re: [PATCH net 0/4] Intel Wired LAN Driver Updates 2026-04-20 (ice)
From: Simon Horman @ 2026-04-22 9:23 UTC (permalink / raw)
To: Jacob Keller
Cc: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, netdev, Grzegorz Nitka,
Aleksandr Loktionov, Petr Oros, Sunitha Mekala, Timothy Miskell
In-Reply-To: <20260420-jk-iwl-net-2026-04-20-ptp-e825c-phy-interrupt-fixes-v1-0-bc2240f42251@intel.com>
On Mon, Apr 20, 2026 at 05:51:24PM -0700, Jacob Keller wrote:
> Since this is a set of related fixes for just the ice driver, Jake provides
> the following description for the series:
Thanks for the excellent cover letter and patch descriptions.
Reviewed-by: Simon Horman <horms@kernel.org>
For completeness:
* I have looked over the AI generated review of patch 2/4 by Sashiko.
You may wish to too. But I do not believe that feedback warrants
holding up this series. Actually, I am skeptical those issues
should be addressed at all.
* I have also looked over the AI generated review based on Chris Mason's
review prompts which is available at https://netdev-ai.bots.linux.dev
(if only it had a name!). It flags an potentially incorrect Fixes tag in
patch 4/4. However, the cover letter for the patch explains the
choice of Fixes tag, effectively rebutting the analysis generated by AI
(I guess it didn't take the commit message sufficiently into account.)
^ permalink raw reply
* Re: [PATCH] dt-bindings: Fix phandle-array constraints, again
From: Krzysztof Kozlowski @ 2026-04-22 9:19 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Maarten Lankhorst, Maxime Ripard, Krzysztof Kozlowski,
Conor Dooley, Ulf Hansson, Stephan Gerhold, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Johannes Berg, Jeff Johnson, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Manivannan Sadhasivam, Bjorn Andersson,
Mathieu Poirier, Sylwester Nawrocki, Mark Brown, Maxime Coquelin,
Greg Kroah-Hartman, Yang Xiwen, Alex Elder, Chaitanya Chundru,
Sibi Sankar, Rao Mandadapu, Patrice Chotard, Xu Yang, Peng Fan,
Thomas Zimmermann, devicetree, linux-kernel, linux-mmc,
linux-arm-msm, netdev, linux-wireless, ath10k, ath11k, linux-pci,
linux-remoteproc, linux-sound, linux-spi, linux-usb
In-Reply-To: <20260421195836.1547469-1-robh@kernel.org>
On Tue, Apr 21, 2026 at 02:55:25PM -0500, Rob Herring (Arm) wrote:
> The unfortunately named 'phandle-array' property type is really a matrix
> with phandle and fixed arg cells entries. A matrix property should have 2
> levels of items constraints.
>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> Can someone from QCom provide some descriptions for 'qcom,smem-states'
> properties.
Working on it...
Best regards,
Krzysztof
^ permalink raw reply
* Re: Bug#1130336: [regression] Network failure beyond first connection after 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped")
From: Thorsten Leemhuis @ 2026-04-22 9:18 UTC (permalink / raw)
To: Fernando Fernandez Mancera, Alejandro Oliván Alvarez,
Salvatore Bonaccorso, 1130336
Cc: Florian Westphal, Pablo Neira Ayuso, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel, regressions,
stable
In-Reply-To: <8788e351-553f-48da-a6e6-ce082adacb8d@suse.de>
Lo! Top-posting on purpose to make this easy to process.
What happened to this regression? It looks a bit like things stalled and
fell through the cracks. Or Fernando, did you post a patch like you
mentioned? I looked for one referring the commit or the reporter, but
could not find anything -- but maybe I missed it.
Ciao, Thorsten
On 3/19/26 09:59, Fernando Fernandez Mancera wrote:
> On 3/19/26 9:44 AM, Alejandro Oliván Alvarez wrote:
>> Hi folks.
>>
>> On Wed, 2026-03-18 at 13:49 +0100, Salvatore Bonaccorso wrote:
>>> Hi Alejandro,
>>>
>>> On Sun, Mar 15, 2026 at 02:09:33AM +0100, Fernando Fernandez Mancera
>>> wrote:
>>>> On 3/14/26 8:25 PM, Florian Westphal wrote:
>>>>> Fernando Fernandez Mancera <fmancera@suse.de> wrote:
>>>>>> On 3/14/26 5:13 PM, Fernando Fernandez Mancera wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 3/14/26 3:03 PM, Salvatore Bonaccorso wrote:
>>>>>>>> Control: forwarded -1
>>>>>>>> https://lore.kernel.org/
>>>>>>>> regressions/177349610461.3071718.4083978280323144323@eldama
>>>>>>>> r.lan
>>>>>>>> Control: tags -1 + upstream
>>>>>>>>
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> In Debian, in https://bugs.debian.org/1130336, Alejandro
>>>>>>>> reported that
>>>>>>>> after updates including 69894e5b4c5e ("netfilter:
>>>>>>>> nft_connlimit:
>>>>>>>> update the count if add was skipped"), when the following
>>>>>>>> rule is set
>>>>>>>>
>>>>>>>> iptables -A INPUT -p tcp -m
>>>>>>>> connlimit --connlimit-above 111 -j
>>>>>>>> REJECT --reject-with tcp-reset
>>>>>>>>
>>>>>>>> connections get stuck accordingly, it can be easily
>>>>>>>> reproduced by:
>>>>>>>>
>>>>>>>> # iptables -A INPUT -p tcp -m connlimit
>>>>>>>> --connlimit-above 111 -j REJECT
>>>>>>>> --reject-with tcp-reset
>>>>>>>> # nft list ruleset
>>>>>>>> # Warning: table ip filter is managed by iptables-nft, do
>>>>>>>> not touch!
>>>>>>>> table ip filter {
>>>>>>>> chain INPUT {
>>>>>>>> type filter hook input priority filter;
>>>>>>>> policy accept;
>>>>>>>> ip protocol tcp xt
>>>>>>>> match "connlimit" counter packets 0
>>>>>>>> bytes 0 reject with tcp reset
>>>>>>>> }
>>>>>>>> }
>>>>>>>> # wget -O /dev/null
>>>>>>>> https://git.kernel.org/torvalds/t/linux-7.0-
>>>>>>>> rc3.tar.gz
>>>>>>>> --2026-03-14 14:53:51--
>>>>>>>> https://git.kernel.org/torvalds/t/linux-7.0-
>>>>>>>> rc3.tar.gz
>>>>>>>> Resolving git.kernel.org
>>>>>>>> (git.kernel.org)... 172.105.64.184,
>>>>>>>> 2a01:7e01:e001:937:0:1991:8:25
>>>>>>>> Connecting to git.kernel.org
>>>>>>>> (git.kernel.org)|172.105.64.184|:443...
>>>>>>>> connected.
>>>>>>>> HTTP request sent, awaiting response... 301 Moved
>>>>>>>> Permanently
>>>>>>>> Location:
>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
>>>>>>>> linux.git/snapshot/linux-7.0-rc3.tar.gz
>>>>>>>> [following]
>>>>>>>> --2026-03-14 14:53:51--
>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/ git/torvalds/l
>>>>>>>> inux.git/snapshot/linux-7.0-rc3.tar.gz
>>>>>>>> Reusing existing connection to git.kernel.org:443.
>>>>>>>> HTTP request sent, awaiting response... 200 OK
>>>>>>>> Length: unspecified [application/x-gzip]
>>>>>>>> Saving to: ‘/dev/null’
>>>>>>>>
>>>>>>>> /dev/null [
>>>>>>>> <=> ] 248.03M
>>>>>>>> 51.9MB/s in 5.0s
>>>>>>>>
>>>>>>>> 2026-03-14 14:53:56 (49.3 MB/s) - ‘/dev/null’ saved
>>>>>>>> [260080129]
>>>>>>>>
>>>>>>>> # wget -O /dev/null
>>>>>>>> https://git.kernel.org/torvalds/t/linux-7.0-
>>>>>>>> rc3.tar.gz
>>>>>>>> --2026-03-14 14:53:58--
>>>>>>>> https://git.kernel.org/torvalds/t/linux-7.0-
>>>>>>>> rc3.tar.gz
>>>>>>>> Resolving git.kernel.org
>>>>>>>> (git.kernel.org)... 172.105.64.184,
>>>>>>>> 2a01:7e01:e001:937:0:1991:8:25
>>>>>>>> Connecting to git.kernel.org
>>>>>>>> (git.kernel.org)|172.105.64.184|:443...
>>>>>>>> failed: Connection timed out.
>>>>>>>> Connecting to git.kernel.org
>>>>>>>> (git.kernel.org)|
>>>>>>>> 2a01:7e01:e001:937:0:1991:8:25|:443...
>>>>>>>> failed: Network is unreachable.
>>>>>>>>
>>>>>>>> Before the 69894e5b4c5e ("netfilter: nft_connlimit: update
>>>>>>>> the count
>>>>>>>> if add was skipped") commit this worked.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks for the report. I have reproduced
>>>>>>> this on upstream kernel. I am working on it.
>>>>>>>
>>>>>>
>>>>>> This is what is happening:
>>>>>>
>>>>>> 1. The first connection is established and
>>>>>> tracked, all good. When it finishes, it goes to
>>>>>> TIME_WAIT state
>>>>>> 2. The second connection is established, ct is
>>>>>> confirmed since the beginning, skipping the
>>>>>> tracking and calling a GC.
>>>>>> 3. The previously tracked connection is cleaned
>>>>>> up during GC as TIME_WAIT is considered closed.
>>>>>
>>>>> This is stupid. The fix is to add --syn or use
>>>>> OUTPUT. Its not even clear to me what the user wants to achive
>>>>> with this rule.
>>>>>
>>>>
>>>> Yes, the ruleset shown does not make sense. Having said this, it
>>>> could
>>>> affect to a soft-limit scenario as the one described on the blamed
>>>> commit..
>>>
>>> Alejandro, can you describe what you would like to achieve with the
>>> specific rule?
>>>
>>> Regards,
>>> Salvatore
>>
>> The intended use of that rule was to prevent (limit) a single host from
>> establishing too many TCP connections to given host (Denial of
>> Service... particularly on streaming servers).
>>
>> I learnt about it in several IPtables guides/howtos (maaaany years
>> ago!), and never was an issue on itself.
>> Was it stupid? ... possibly... It 'seemed' to work, or, at least, when
>> checking iptables -L -v one could see packet counter for the rule
>> catching some traffic, without ever noticing it being troublesome, so,
>> at the very least it 'didn't hurt', and, since DoS ever happened over
>> the years...well, I tended to think it was indeed working the way I
>> read it did.
>>
>> Certainly, I never (the authors of those guides at their time indeed)
>> though about the possibility of just target the TCP syn.
>> I have given a try to adding the --syn option to the rule to see the
>> difference, and well, it is way less disruptive that way, but it still
>> breaks things (I saw postfix queues hanging, for instance).
>>
>
> The current problem with the ruleset is that it mixes both, incoming and
> outgoing connections. This should probably use --syn flag so it targets
> connections established against your host only.
>
> Anyway, I am sending a patch fixing this as it makes sense to do it IMO.
> We just want to understand what is the real use-case and how the ruleset
> can be improved.
>
> In addition, I would recommend you to transition to nftables because it
> would be ideal for your use-case. With nftables it would be easy to
> combine this with sets and probably quota expression to limit the usage.
>
> What is wrong with the current ruleset? (Even before the blammed
> commit), if you reach the connlimit limit **ALL** TCP connections will
> be rejected (including legit ones), I do not think that is what you want
> to achieve.
>
> Thanks,
> Fernando.
>
>> So, I have but screwed the idea of using connlimit anymore anyways.
>> Sorry for the noise. Lesson learned.
>>
>> Cheers!
>
>
^ permalink raw reply
* Re: [PATCH net 00/18] Remove a number of ISA and PCMCIA Ethernet drivers
From: David Laight @ 2026-04-22 9:13 UTC (permalink / raw)
To: Daniel Palmer
Cc: Andrew Lunn, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Jonathan Corbet,
Shuah Khan, linux-kernel, netdev, linux-doc
In-Reply-To: <CAFr9PXn1ixyhD42OswoyGZ=W-O-oZygUGpRNm2dcAuYBNgtmQw@mail.gmail.com>
On Wed, 22 Apr 2026 07:03:19 +0900
Daniel Palmer <daniel@0x0f.com> wrote:
...
> Maybe we could add a special thing in the maintainers for "this is
> code only crazy people use" and have a rule to ignore untested AI
> generated patches for it? :)
Is marking them EXPERT or BROKEN enough?
(Or a similar new option.)
David
^ permalink raw reply
* [syzbot] Monthly net report (Apr 2026)
From: syzbot @ 2026-04-22 9:08 UTC (permalink / raw)
To: linux-kernel, netdev, syzkaller-bugs
Hello net maintainers/developers,
This is a 31-day syzbot report for the net subsystem.
All related reports/information can be found at:
https://syzkaller.appspot.com/upstream/s/net
During the period, 9 new issues were detected and 5 were fixed.
In total, 94 issues are still open and 1720 have already been fixed.
Some of the still happening issues:
Ref Crashes Repro Title
<1> 8561 Yes KMSAN: uninit-value in eth_type_trans (2)
https://syzkaller.appspot.com/bug?extid=0901d0cc75c3d716a3a3
<2> 3513 Yes INFO: task hung in linkwatch_event (4)
https://syzkaller.appspot.com/bug?extid=2ba2d70f288cf61174e4
<3> 3170 Yes INFO: task hung in synchronize_rcu (4)
https://syzkaller.appspot.com/bug?extid=222aa26d0a5dbc2e84fe
<4> 2914 Yes WARNING in rcu_check_gp_start_stall
https://syzkaller.appspot.com/bug?extid=111bc509cd9740d7e4aa
<5> 2111 Yes KMSAN: uninit-value in bpf_prog_run_generic_xdp
https://syzkaller.appspot.com/bug?extid=0e6ddb1ef80986bdfe64
<6> 2008 Yes INFO: task hung in del_device_store
https://syzkaller.appspot.com/bug?extid=6d10ecc8a97cc10639f9
<7> 1675 Yes INFO: task hung in addrconf_dad_work (5)
https://syzkaller.appspot.com/bug?extid=82ccd564344eeaa5427d
<8> 1312 Yes possible deadlock in hsr_dev_xmit (2)
https://syzkaller.appspot.com/bug?extid=fbf74291c3b7e753b481
<9> 663 Yes INFO: task hung in tun_chr_close (5)
https://syzkaller.appspot.com/bug?extid=b0ae8f1abf7d891e0426
<10> 632 Yes INFO: rcu detected stall in tc_modify_qdisc
https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
To disable reminders for individual bugs, reply with the following command:
#syz set <Ref> no-reminders
To change bug's subsystems, reply with:
#syz set <Ref> subsystems: new-subsystem
You may send multiple commands in a single email message.
^ permalink raw reply
* Re: [PATCH net 1/1] net: nsh: handle nested NSH headers during GSO
From: Jiri Benc @ 2026-04-22 8:52 UTC (permalink / raw)
To: Ren Wei
Cc: netdev, davem, edumazet, kuba, pabeni, horms, yuantan098,
yifanwucs, tomapufckgml, bird, lx24, caoruide123
In-Reply-To: <6112cce99b4e3571444a616d0fb19e91e2fcca72.1776597598.git.caoruide123@gmail.com>
On Mon, 20 Apr 2026 11:31:32 +0800, Ren Wei wrote:
> Handle nested NSH headers iteratively in a single nsh_gso_segment()
> invocation. Unwrap consecutive NSH headers until the first non-NSH payload
> is reached, including the case where the next redispatch target is reached
> through ETH_P_TEB, segment that payload once, and then restore the full
> outer encapsulation on each output segment.
This looks fragile. If there's ever another protocol with similar logic
added, we'll be in the same situation. (And obviously, unrolling the
recursion for any combination of protocols doesn't scale well.)
What about using a mechanism similar to dev_xmit_recursion to limit the
depth and returning EINVAL if we exceed the limit? I think it's fine to
drop the packet in this pathological case.
We might even just use dev_xmit_recursion directly, since it can be
argued that such nested NSH headers are in fact nested tunnels.
Jiri
^ permalink raw reply
* Re: [PATCH net v3 1/1] net: hsr: limit node table growth
From: Sebastian Andrzej Siewior @ 2026-04-22 8:52 UTC (permalink / raw)
To: Felix Maurer
Cc: Ren Wei, netdev, davem, edumazet, kuba, pabeni, horms, kees,
kexinsun, luka.gejak, Arvid.Brodin, m-karicheri2, yuantan098,
yifanwucs, tomapufckgml, bird, xuyuqiabc, royenheart
In-Reply-To: <aeiHa7rzmSqzMIaJ@thinkpad>
On 2026-04-22 10:31:39 [+0200], Felix Maurer wrote:
> On Tue, Apr 21, 2026 at 10:50:01PM +0800, Ren Wei wrote:
> > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
> > index d09875b33588..8a5a2a54a81f 100644
> > --- a/net/hsr/hsr_framereg.c
> > +++ b/net/hsr/hsr_framereg.c
> > @@ -189,6 +195,7 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> > enum hsr_port_type rx_port)
> > {
> > struct hsr_node *new_node, *node = NULL;
> > + unsigned int node_count = 0;
> > unsigned long now;
> > size_t block_sz;
> > int i;
> > @@ -226,20 +233,31 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> > spin_lock_bh(&hsr->list_lock);
> > list_for_each_entry_rcu(node, node_db, mac_list,
> > lockdep_is_held(&hsr->list_lock)) {
> > + node_count++;
>
> I'm not sure if this on-the-fly node counting is the best solution here.
> My concern is that it comes quite late in the process, i.e., after we
> already allocated a bunch of memory, etc. As we are discussing a
> scenario where a lot of entries are created, maybe we shouldn't even
> allocate a new_node if the table is already full? For example by storing
> the node_count in hsr_priv and checking it early in the function?
The node is allocated upfront. Then it iterates here and we only end up
counting through the full list if there is no match. This is under a
lock so "many clients" are serialized. If we allocate the node later
then we need to do it under the lock.
I don't think the node count exceeds 100 in production. So having a
counter which is incremented while adding to the list and decremented
while removing items from the list would optimize the "worst case". So
instead traversing the list with 1000 we would just give up.
The "oom block" works regardless. This does not affect the common case
where we have far less nodes.
> > if (ether_addr_equal(node->macaddress_A, addr))
> > - goto out;
> > + goto out_found;
> > if (ether_addr_equal(node->macaddress_B, addr))
> > - goto out;
> > + goto out_found;
> > }
> > +
> > + if (hsr_node_table_size && node_count >= hsr_node_table_size)
> > + goto out_drop;
>
> I think it would be good to somehow make this situation transparent to
> the user, so they can react if this an undesired behavior (for example,
> because they simply have a large network and need a large node table).
netdev_warn_once() probably.
> > list_add_tail_rcu(&new_node->mac_list, node_db);
> > spin_unlock_bh(&hsr->list_lock);
> > return new_node;
> > -out:
> > +out_found:
> > spin_unlock_bh(&hsr->list_lock);
> > + xa_destroy(&new_node->seq_blocks);
> > kfree(new_node->block_buf);
> > -free:
> > kfree(new_node);
> > return node;
> > +out_drop:
> > + spin_unlock_bh(&hsr->list_lock);
> > + xa_destroy(&new_node->seq_blocks);
> > + kfree(new_node->block_buf);
> > +free:
> > + kfree(new_node);
> > + return NULL;
> > }
>
> The two cleanup paths are almost the same now. We usually attempt to
> keep them unified to make sure that we do the correct cleanup steps in
> all situations. So please keep them unified here as well.
>
> Thanks,
> Felix
Sebastian
^ permalink raw reply
* [PATCH net-next] Documentation: net/smc: correct old value of smcr_max_recv_wr
From: Mahanta Jambigi @ 2026-04-22 8:51 UTC (permalink / raw)
To: andrew+netdev, davem, edumazet, kuba, pabeni, alibuda, dust.li,
sidraya, wenjia
Cc: pasic, horms, tonylu, guwen, netdev, linux-s390, Mahanta Jambigi
The smc-sysctl.rst documentation incorrectly stated that the previous
hardcoded maximum number of WR buffers on the receive path (smcr_max_recv_wr)
was 16. The correct historical value used before the introduction of the sysctl
control was 48. Update the documentation to reflect the accurate default value.
Fixes: aef3cdb47bbb net/smc: make wr buffer count configurable
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sidraya Jayagond <sidraya@linux.ibm.com>
Signed-off-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
---
Documentation/networking/smc-sysctl.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/smc-sysctl.rst b/Documentation/networking/smc-sysctl.rst
index 904a910f198e..279d15e61899 100644
--- a/Documentation/networking/smc-sysctl.rst
+++ b/Documentation/networking/smc-sysctl.rst
@@ -100,14 +100,14 @@ smcr_max_recv_wr - INTEGER
depending on the workload it can be a bottleneck in a sense that threads
have to wait for work request buffers to become available. Before the
introduction of this control the maximal number of work request buffers
- available on the receive path used to be hard coded to 16. With this control
+ available on the receive path used to be hard coded to 48. With this control
it becomes configurable. The acceptable range is between 2 and 2048.
Please be aware that all the buffers need to be allocated as a physically
continuous array in which each element is a single buffer and has the size
of SMC_WR_BUF_SIZE (48) bytes. If the allocation fails, we keep retrying
with half of the buffer count until it is ether successful or (unlikely)
- we dip below the old hard coded value which is 16 where we give up much
+ we dip below the old hard coded value which is 48 where we give up much
like before having this control.
Default: 48
--
2.51.0
^ permalink raw reply related
* RE: [PATCH v2 1/1] tipc: fix double-free in tipc_buf_append()
From: Tung Quang Nguyen @ 2026-04-22 8:47 UTC (permalink / raw)
To: Lee Jones
Cc: tipc-discussion@lists.sourceforge.net, Jon Maloy, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Ying Xue,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260421124528.162996-1-lee@kernel.org>
>Subject: [PATCH v2 1/1] tipc: fix double-free in tipc_buf_append()
>
>tipc_msg_validate() can potentially reallocate the skb it is validating, freeing
>the old one. In tipc_buf_append(), it was being called with a pointer to a local
>variable which was a copy of the caller's skb pointer.
>
>If the skb was reallocated and validation subsequently failed, the error
>handling path would free the original skb pointer, which had already been
>freed, leading to double-free.
>
>Fix this by checking if head now points to a newly allocated reassembled skb.
>If it does, reassign *headbuf for later freeing operations.
>
>Fixes: d618d09a68e4 ("tipc: enforce valid ratio between skb truesize and
>contents")
>Suggested-by: Tung Nguyen <tung.quang.nguyen@est.tech>
>Signed-off-by: Lee Jones <lee@kernel.org>
>---
>1v => v2: Keep the passed pointer type the same, but reassign on-change
>
> net/tipc/msg.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
>diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 76284fc538eb..b0bba0feef56
>100644
>--- a/net/tipc/msg.c
>+++ b/net/tipc/msg.c
>@@ -177,8 +177,20 @@ int tipc_buf_append(struct sk_buff **headbuf, struct
>sk_buff **buf)
>
> if (fragid == LAST_FRAGMENT) {
> TIPC_SKB_CB(head)->validated = 0;
>- if (unlikely(!tipc_msg_validate(&head)))
>+
>+ /* If the reassembled skb has been freed in
>+ * tipc_msg_validate() because of an invalid truesize,
>+ * then head will point to a newly allocated reassembled
>+ * skb, while *headbuf points to freed reassembled skb.
>+ * In such cases, correct *headbuf for freeing the newly
>+ * allocated reassembled skb later.
>+ */
>+ if (unlikely(!tipc_msg_validate(&head))) {
>+ if (head != *headbuf)
>+ *headbuf = head;
> goto err;
>+ }
>+
> *buf = head;
> TIPC_SKB_CB(head)->tail = NULL;
> *headbuf = NULL;
>--
>2.54.0.rc1.555.g9c883467ad-goog
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
^ permalink raw reply
* Re: [PATCH] net/stmmac: Fix typos: 'tx_undeflow_irq' -> 'tx_underflow_irq'
From: Jakub Raczynski @ 2026-04-22 8:42 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, linux-kernel, kuba, davem, andrew+netdev, kernel-janitors,
linux-arm-kernel, linux-stm32
In-Reply-To: <7eb9e4d4-909c-4203-833d-bd8b664fdfbc@lunn.ch>
[-- Attachment #1: Type: text/plain, Size: 963 bytes --]
On Tue, Apr 21, 2026 at 02:39:15PM +0200, Andrew Lunn wrote:
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
> > @@ -78,7 +78,7 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
> > STMMAC_STAT(rx_vlan),
> > STMMAC_STAT(rx_split_hdr_pkt_n),
> > /* Tx/Rx IRQ error info */
> > - STMMAC_STAT(tx_undeflow_irq),
> > + STMMAC_STAT(tx_underflow_irq),
>
> Please take another look at this one and think about it.
>
> Andrew
>
> ---
> pw-bot: cr
>
I don't see anything wrong with it?
- naming is correct, same as stmmac_extra_stats from common.h, as it
wouldn't compile otherwise
- string length is ok, as max name length is ETH_GSTRING_LEN=32 and it is
not close
- ethtool just polls data from driver and in my tests it is ok
- all instances of 'undeflow' are changed
- 'underflow' semantic is ok, 'undeflow' is just not correct
Please correct me if I am wrong, but imo no issues with this patch.
Regards
Jakub Raczynski
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply
* Re: [PATCH iwl-net v2] igc: fix potential skb leak in igc_fpe_xmit_smd_frame()
From: Abdul Rahim, Faizal @ 2026-04-22 8:38 UTC (permalink / raw)
To: Kohei Enju, intel-wired-lan, netdev
Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, kohei.enju, stable
In-Reply-To: <20260415025226.114115-1-kohei@enjuk.jp>
On 15/4/2026 10:52 am, Kohei Enju wrote:
> When igc_fpe_init_tx_descriptor() fails, no one takes care of an
> allocated skb, leaking it. [1]
> Use dev_kfree_skb_any() on failure.
>
> Tested on an I226 adapter with the following command, while injecting
> faults in igc_fpe_init_tx_descriptor() to trigger the error path.
> # ethtool --set-mm $DEV verify-enabled on tx-enabled on pmac-enabled on
>
> [1]
> unreferenced object 0xffff888113c6cdc0 (size 224):
> ...
> backtrace (crc be3d3fda):
> kmem_cache_alloc_node_noprof+0x3b1/0x410
> __alloc_skb+0xde/0x830
> igc_fpe_xmit_smd_frame.isra.0+0xad/0x1b0
> igc_fpe_send_mpacket+0x37/0x90
> ethtool_mmsv_verify_timer+0x15e/0x300
>
> Cc: stable@vger.kernel.org
> Fixes: 5422570c0010 ("igc: add support for frame preemption verification")
> Signed-off-by: Kohei Enju <kohei@enjuk.jp>
> ---
> Changes:
> v2:
> - change to idiomatic style with goto (Simon)
> - add Cc to stable (Alex)
> - add reprodunction steps (Alex)
> v1: https://lore.kernel.org/all/20260329145122.126040-1-kohei@enjuk.jp/
> ---
> drivers/net/ethernet/intel/igc/igc_tsn.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
> index 8a110145bfee..02dd9f0290a3 100644
> --- a/drivers/net/ethernet/intel/igc/igc_tsn.c
> +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
> @@ -109,10 +109,16 @@ static int igc_fpe_xmit_smd_frame(struct igc_adapter *adapter,
> __netif_tx_lock(nq, cpu);
>
> err = igc_fpe_init_tx_descriptor(ring, skb, type);
> - igc_flush_tx_descriptors(ring);
> + if (err)
> + goto err_free_skb_any;
>
> + igc_flush_tx_descriptors(ring);
> __netif_tx_unlock(nq);
> + return 0;
>
> +err_free_skb_any:
> + __netif_tx_unlock(nq);
> + dev_kfree_skb_any(skb);
> return err;
> }
>
Thanks for helping to fix this.
Reviewed-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
^ permalink raw reply
* [net-next v2 3/3] net: phy: motorcomm: Add YT8522 100M RMII PHY support
From: Minda Chen @ 2026-04-22 8:32 UTC (permalink / raw)
To: Frank, Andrew Lunn, Heiner Kallweit, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev
Cc: linux-kernel, Minda Chen
In-Reply-To: <20260422083255.29692-1-minda.chen@starfivetech.com>
Add YT8522 100M RMII ethernet PHY base driver support, including
PHY ID and base config init function.
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
---
drivers/net/phy/motorcomm.c | 49 ++++++++++++++++++++++++++++++++++++-
1 file changed, 48 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c
index ebc24f51e626..13d57aba5487 100644
--- a/drivers/net/phy/motorcomm.c
+++ b/drivers/net/phy/motorcomm.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0+
/*
- * Motorcomm 8511/8521/8531/8531S/8821 PHY driver.
+ * Motorcomm 8511/8521/8522/8531/8531S/8821 PHY driver.
*
* Author: Peter Geis <pgwipeout@gmail.com>
* Author: Frank <Frank.Sae@motor-comm.com>
@@ -14,6 +14,7 @@
#define PHY_ID_YT8511 0x0000010a
#define PHY_ID_YT8521 0x0000011a
+#define PHY_ID_YT8522 0x4f51e928
#define PHY_ID_YT8531 0x4f51e91b
#define PHY_ID_YT8531S 0x4f51e91a
#define PHY_ID_YT8821 0x4f51ea19
@@ -227,6 +228,13 @@
#define YT8521_LED_100_ON_EN BIT(5)
#define YT8521_LED_10_ON_EN BIT(4)
+#define YT8522_EXTREG_SLEEP_CONTROL 0x2027
+#define YT8522_EN_SLEEP_SW 15
+
+#define YT8522_EXTENDED_COMBO_CTRL 0x4000
+#define YT8522_RXDV_SEL BIT(4)
+#define YT8522_RMII_EN BIT(1)
+
#define YTPHY_MISC_CONFIG_REG 0xA006
#define YTPHY_MCR_FIBER_SPEED_MASK BIT(0)
#define YTPHY_MCR_FIBER_1000BX (0x1 << 0)
@@ -1843,6 +1851,36 @@ static int yt8531_config_init(struct phy_device *phydev)
return 0;
}
+static int yt8522_config_init(struct phy_device *phydev)
+{
+ struct device_node *node = phydev->mdio.dev.of_node;
+ int ret, val;
+
+ val = ytphy_read_ext_with_lock(phydev, YT8522_EXTENDED_COMBO_CTRL);
+ if (val < 0)
+ return val;
+
+ if (val & YT8522_RMII_EN) {
+ val |= YT8522_RXDV_SEL;
+ ret = ytphy_write_ext_with_lock(phydev,
+ YT8522_EXTENDED_COMBO_CTRL,
+ val);
+ if (ret < 0)
+ return ret;
+ }
+
+ if (of_property_read_bool(node, "motorcomm,auto-sleep-disabled")) {
+ /* disable auto sleep */
+ ret = ytphy_modify_ext_with_lock(phydev,
+ YT8522_EXTREG_SLEEP_CONTROL,
+ YT8522_EN_SLEEP_SW, 0);
+ if (ret < 0)
+ return ret;
+ }
+
+ return 0;
+}
+
/**
* yt8531_link_change_notify() - Adjust the tx clock direction according to
* the current speed and dts config.
@@ -3052,6 +3090,14 @@ static struct phy_driver motorcomm_phy_drvs[] = {
.led_hw_control_set = yt8521_led_hw_control_set,
.led_hw_control_get = yt8521_led_hw_control_get,
},
+ {
+ PHY_ID_MATCH_EXACT(PHY_ID_YT8522),
+ .name = "YT8522 100 Megabit Ethernet",
+ .config_aneg = genphy_config_aneg,
+ .config_init = yt8522_config_init,
+ .suspend = genphy_suspend,
+ .resume = genphy_resume,
+ },
{
PHY_ID_MATCH_EXACT(PHY_ID_YT8531),
.name = "YT8531 Gigabit Ethernet",
@@ -3112,6 +3158,7 @@ MODULE_LICENSE("GPL");
static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
{ PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
{ PHY_ID_MATCH_EXACT(PHY_ID_YT8521) },
+ { PHY_ID_MATCH_EXACT(PHY_ID_YT8522) },
{ PHY_ID_MATCH_EXACT(PHY_ID_YT8531) },
{ PHY_ID_MATCH_EXACT(PHY_ID_YT8531S) },
{ PHY_ID_MATCH_EXACT(PHY_ID_YT8821) },
--
2.17.1
^ permalink raw reply related
* [net-next v2 2/3] net: motorcomm: phy: set drive strength in 8531s RGMII case
From: Minda Chen @ 2026-04-22 8:32 UTC (permalink / raw)
To: Frank, Andrew Lunn, Heiner Kallweit, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev
Cc: linux-kernel, Minda Chen
In-Reply-To: <20260422083255.29692-1-minda.chen@starfivetech.com>
Set RXD and RX CLK pin drive strength while in 8531s RGMII
case.
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
---
drivers/net/phy/motorcomm.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c
index c66804537aa2..ebc24f51e626 100644
--- a/drivers/net/phy/motorcomm.c
+++ b/drivers/net/phy/motorcomm.c
@@ -1698,6 +1698,11 @@ static int yt8521_config_init(struct phy_device *phydev)
if (ret < 0)
goto err_restore_page;
}
+
+ if (phydev->drv->phy_id == PHY_ID_YT8531S &&
+ phy_interface_is_rgmii(phydev))
+ ret = yt8531_set_ds(phydev);
+
err_restore_page:
return phy_restore_page(phydev, old_page, ret);
}
--
2.17.1
^ permalink raw reply related
* [net-next v2 0/3] Add motorcomm 8531s set ds func and 8522 driver
From: Minda Chen @ 2026-04-22 8:32 UTC (permalink / raw)
To: Frank, Andrew Lunn, Heiner Kallweit, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev
Cc: linux-kernel, Minda Chen
This patch is for Starfive JHB100 EVB board. JHB100 contain
1 RGMII/RMII and 1 RMII synopsys GMAC cores. In the EVB board, RGMII
interface connect with YT8531s Ethernet PHY. RMII interface connect
with YT8522 ethernet PHY. So patch 1-2 is for RGMII interface
patch 3 is RMII is for RMII interface.
JHB100 is a Starfive new RISC-V SoC for datacenter BMC (BaseBoard
Managent Controller). Similar with Aspeed 27x0.
The JHB100 minimal system upstream is in progress:
https://patchwork.kernel.org/project/linux-riscv/cover/20260403054945.467700-1-changhuang.liang@starfivetech.com/
The patch base in V7.0-rc5
The change list:
v2:
1. patch1 move mdio lock out from yt8531_set_ds().
2. patch2 changed to phy_interface_is_rgmii().
Minda Chen (3):
net: phy: motorcomm: move mdio lock out from yt8531_set_ds()
net: motorcomm: phy: set drive strength in 8531s RGMII case
net: phy: motorcomm: Add YT8522 100M RMII PHY support
drivers/net/phy/motorcomm.c | 77 ++++++++++++++++++++++++++++++++-----
1 file changed, 67 insertions(+), 10 deletions(-)
base-commit: c369299895a591d96745d6492d4888259b004a9e
--
2.17.1
^ permalink raw reply
* [net-next v2 1/3] net: phy: motorcomm: move mdio lock out from yt8531_set_ds()
From: Minda Chen @ 2026-04-22 8:32 UTC (permalink / raw)
To: Frank, Andrew Lunn, Heiner Kallweit, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev
Cc: linux-kernel, Minda Chen
In-Reply-To: <20260422083255.29692-1-minda.chen@starfivetech.com>
yt8531_set_ds() default set register with mdio lock and only called
with YT8531 PHY. But new type YT8531s support RGMII and has the same
pin strength setting with YT8531, YT8531s need to call yt8531_set_ds()
setting pin drive strength. But YT8531s config init function
yt8521_config_init() already get the mdio lock with phy_select_page().
If calling yt8521_config_init() with mdio lock will cause dead lock.
Need to get the lock before calling yt8531_get_ds() and move mdio
lock out from it can solve this issue.
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
---
drivers/net/phy/motorcomm.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c
index 4d62f7b36212..c66804537aa2 100644
--- a/drivers/net/phy/motorcomm.c
+++ b/drivers/net/phy/motorcomm.c
@@ -974,7 +974,8 @@ static u32 yt8531_get_ldo_vol(struct phy_device *phydev)
{
u32 val;
- val = ytphy_read_ext_with_lock(phydev, YT8521_CHIP_CONFIG_REG);
+ val = ytphy_read_ext(phydev, YT8521_CHIP_CONFIG_REG);
+
val = FIELD_GET(YT8531_RGMII_LDO_VOL_MASK, val);
return val <= YT8531_LDO_VOL_1V8 ? val : YT8531_LDO_VOL_1V8;
@@ -1010,10 +1011,11 @@ static int yt8531_set_ds(struct phy_device *phydev)
ds = YT8531_RGMII_RX_DS_DEFAULT;
}
- ret = ytphy_modify_ext_with_lock(phydev,
- YTPHY_PAD_DRIVE_STRENGTH_REG,
- YT8531_RGMII_RXC_DS_MASK,
- FIELD_PREP(YT8531_RGMII_RXC_DS_MASK, ds));
+ ret = ytphy_modify_ext(phydev,
+ YTPHY_PAD_DRIVE_STRENGTH_REG,
+ YT8531_RGMII_RXC_DS_MASK,
+ FIELD_PREP(YT8531_RGMII_RXC_DS_MASK, ds));
+
if (ret < 0)
return ret;
@@ -1033,10 +1035,11 @@ static int yt8531_set_ds(struct phy_device *phydev)
ds_field_low = FIELD_GET(GENMASK(1, 0), ds);
ds_field_low = FIELD_PREP(YT8531_RGMII_RXD_DS_LOW_MASK, ds_field_low);
- ret = ytphy_modify_ext_with_lock(phydev,
- YTPHY_PAD_DRIVE_STRENGTH_REG,
- YT8531_RGMII_RXD_DS_LOW_MASK | YT8531_RGMII_RXD_DS_HI_MASK,
- ds_field_low | ds_field_hi);
+ ret = ytphy_modify_ext(phydev,
+ YTPHY_PAD_DRIVE_STRENGTH_REG,
+ YT8531_RGMII_RXD_DS_LOW_MASK | YT8531_RGMII_RXD_DS_HI_MASK,
+ ds_field_low | ds_field_hi);
+
if (ret < 0)
return ret;
@@ -1826,7 +1829,9 @@ static int yt8531_config_init(struct phy_device *phydev)
return ret;
}
+ phy_lock_mdio_bus(phydev);
ret = yt8531_set_ds(phydev);
+ phy_unlock_mdio_bus(phydev);
if (ret < 0)
return ret;
--
2.17.1
^ permalink raw reply related
* Re: [PATCH] net/intel: Replace manual array size calculation with ARRAY_SIZE
From: Jakub Raczynski @ 2026-04-22 8:32 UTC (permalink / raw)
To: Dan Carpenter
Cc: netdev, kuba, przemyslaw.kitszel, anthony.l.nguyen,
kernel-janitors
In-Reply-To: <aeeFh1zQqhVysvxI@stanley.mountain>
[-- Attachment #1: Type: text/plain, Size: 898 bytes --]
On Tue, Apr 21, 2026 at 05:11:19PM +0300, Dan Carpenter wrote:
> On Tue, Apr 21, 2026 at 01:40:29PM +0200, Jakub Raczynski wrote:
> >
> > - if (!((u32)aq_rc < (sizeof(aq_to_posix) / sizeof((aq_to_posix)[0]))))
> > + if (!((u32)aq_rc < ARRAY_SIZE(aq_to_posix)))
>
> This still isn't beautiful. There are so many parens. The !(foo < size)
> formulation is weird. The cast is unnnecessary. Better to write it as:
>
> if (aq_rc >= ARRAY_SIZE(aq_to_posix))
> return -ERANGE;
>
> > return -ERANGE;
> >
> > return aq_to_posix[aq_rc];
>
> regards,
> dan carpenter
>
Alright, will beautify it and resend soon.
I can see potential original intention of not comparing unsigned from sizeof
with int, maybe that was original compiler configuration to include that
warning.
But at this variable range it is irrelevant and it is probably most disabled
warning ever.
regards
Jakub Raczynski
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply
* Re: [PATCH net v3 1/1] net: hsr: limit node table growth
From: Felix Maurer @ 2026-04-22 8:31 UTC (permalink / raw)
To: Ren Wei
Cc: netdev, Sebastian Andrzej Siewior, davem, edumazet, kuba, pabeni,
horms, kees, kexinsun, luka.gejak, Arvid.Brodin, m-karicheri2,
yuantan098, yifanwucs, tomapufckgml, bird, xuyuqiabc, royenheart
In-Reply-To: <3bdbe54e81bd89c1443b05500368fb45bddc3191.1776754203.git.royenheart@gmail.com>
On Tue, Apr 21, 2026 at 10:50:01PM +0800, Ren Wei wrote:
> From: Haoze Xie <royenheart@gmail.com>
>
> The HSR/PRP node learning paths allocate one persistent entry per
> previously unseen source MAC. Since learned entries stay alive until the
> prune timer catches up, the node tables can otherwise grow without a
> bound under high churn of learned senders.
>
> Limit the number of learned entries in each node table and stop adding
> new ones once the configured limit is reached. This keeps node-table
> resource use bounded across the affected learning paths.
Hi,
thank you for giving this approach a try!
[snip]
> diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
> index d09875b33588..8a5a2a54a81f 100644
> --- a/net/hsr/hsr_framereg.c
> +++ b/net/hsr/hsr_framereg.c
> @@ -14,12 +14,18 @@
> #include <kunit/visibility.h>
> #include <linux/if_ether.h>
> #include <linux/etherdevice.h>
> +#include <linux/moduleparam.h>
> #include <linux/slab.h>
> #include <linux/rculist.h>
> #include "hsr_main.h"
> #include "hsr_framereg.h"
> #include "hsr_netlink.h"
>
> +static unsigned int hsr_node_table_size = 1024;
> +module_param_named(node_table_size, hsr_node_table_size, uint, 0644);
> +MODULE_PARM_DESC(node_table_size,
> + "Maximum number of learned entries in each HSR/PRP node table (0 = unlimited)");
> +
> bool hsr_addr_is_redbox(struct hsr_priv *hsr, unsigned char *addr)
> {
> if (!hsr->redbox || !is_valid_ether_addr(hsr->macaddress_redbox))
> @@ -189,6 +195,7 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> enum hsr_port_type rx_port)
> {
> struct hsr_node *new_node, *node = NULL;
> + unsigned int node_count = 0;
> unsigned long now;
> size_t block_sz;
> int i;
> @@ -226,20 +233,31 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
> spin_lock_bh(&hsr->list_lock);
> list_for_each_entry_rcu(node, node_db, mac_list,
> lockdep_is_held(&hsr->list_lock)) {
> + node_count++;
I'm not sure if this on-the-fly node counting is the best solution here.
My concern is that it comes quite late in the process, i.e., after we
already allocated a bunch of memory, etc. As we are discussing a
scenario where a lot of entries are created, maybe we shouldn't even
allocate a new_node if the table is already full? For example by storing
the node_count in hsr_priv and checking it early in the function?
> if (ether_addr_equal(node->macaddress_A, addr))
> - goto out;
> + goto out_found;
> if (ether_addr_equal(node->macaddress_B, addr))
> - goto out;
> + goto out_found;
> }
> +
> + if (hsr_node_table_size && node_count >= hsr_node_table_size)
> + goto out_drop;
I think it would be good to somehow make this situation transparent to
the user, so they can react if this an undesired behavior (for example,
because they simply have a large network and need a large node table).
> list_add_tail_rcu(&new_node->mac_list, node_db);
> spin_unlock_bh(&hsr->list_lock);
> return new_node;
> -out:
> +out_found:
> spin_unlock_bh(&hsr->list_lock);
> + xa_destroy(&new_node->seq_blocks);
> kfree(new_node->block_buf);
> -free:
> kfree(new_node);
> return node;
> +out_drop:
> + spin_unlock_bh(&hsr->list_lock);
> + xa_destroy(&new_node->seq_blocks);
> + kfree(new_node->block_buf);
> +free:
> + kfree(new_node);
> + return NULL;
> }
The two cleanup paths are almost the same now. We usually attempt to
keep them unified to make sure that we do the correct cleanup steps in
all situations. So please keep them unified here as well.
Thanks,
Felix
^ permalink raw reply
* Re: [PATCH] wifi: mac80211: check ieee80211_rx_data_set_link return in pubsta MLO path
From: Benjamin Berg @ 2026-04-22 8:17 UTC (permalink / raw)
To: Johannes Berg, Michael Bommarito, linux-wireless
Cc: Felix Fietkau, netdev, linux-kernel, Ramasamy Kaliappan
In-Reply-To: <434407f50d6b7ee85ad14dd6db757f7d9f695a96.camel@sipsolutions.net>
On Wed, 2026-04-22 at 08:27 +0200, Johannes Berg wrote:
> On Tue, 2026-04-21 at 20:06 -0400, Michael Bommarito wrote:
>
> > Benjamin Berg's 2026-02 RFC v2 "wifi: mac80211: refactor RX
> > link_id and station handling"
> > (20260223133818.9f5550ab445f.I...@changeid) touches the same
> > code and may supersede or subsume this patch; happy to fold /
> > rebase / drop.
>
> Yeah, Benjamin, what's up with that :)
Hmm, good question. I was hoping for a userspace hostap implementation
to start using the new NO_STA flag as that part is otherwise untested.
Qualcomm was going to work on that, but I have not yet heard about it
again.
That said, it should be fine to merge patches 1-4 at least. So I am
happy to submit them separately.
Benjamin
>
> OTOH we perhaps wants this patch in wireless, and then yours in
> wireless-next.
>
> johannes
^ permalink raw reply
* Re: [PATCH] smb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/
From: Stefan Metzmacher @ 2026-04-22 8:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-cifs, linux-rdma, netdev, samba-technical, Tom Talpey,
Steve French, Linus Torvalds, Namjae Jeon
In-Reply-To: <aehrPuY60VMcYGU8@infradead.org>
Hi Christoph,
>> diff --git a/fs/smb/Makefile b/fs/smb/Makefile
>> index 9a1bf59a1a65..353b1c2eefc4 100644
>> --- a/fs/smb/Makefile
>> +++ b/fs/smb/Makefile
>> @@ -1,5 +1,6 @@
>> # SPDX-License-Identifier: GPL-2.0
>>
>> obj-$(CONFIG_SMBFS) += common/
>> +obj-$(CONFIG_SMBDIRECT) += smbdirect/
>
> Why is this not in net/smbdirect/ or driver/infiniband/ulp/smdirect?
Yes, I also thought about net/smbdirect.
As IPPROTO_SMBDIRECT or PF_SMBDIRECT will be the next step,
see the open discussion here:
https://lore.kernel.org/linux-cifs/cover.1775571957.git.metze@samba.org/
(I'll follow with that discussion soon)
I was just unsure about the consequences, e.g. would
the maintainer/pull request flow have to change in that case?
Or would Steve be able to take the changes via his trees?
Any I also didn't want to offend anybody, so I just took
what Linus proposed.
Using driver/infiniband/ulp/smdirect would also work,
if everybody prefer that.
> As far as I can tell there is zero file system logic in this code.
>
>> -#include "../common/smbdirect/smbdirect_public.h"
>> +#include "../smbdirect/public.h"
>
> And all these relative includes suggest you really want a
> include/linux/smdirect/ instead.
Yes, that's my also my goal in the next steps.
> While we're at it: __SMBDIRECT_EXPORT_SYMBOL__ is really odd.
> One thing is the __ pre- and postfix that make it look weird.
Yes, the __SMBDIRECT_EXPORT_SYMBOL__ was mainly a temporary
thing, now it's useless and I'll remove it.
> The other is that EXPORT_SYMBOL_FOR_MODULES is for very specific
> symbols that really should not exported. What this warrants instead
> is a normal EXPORT_SYMBOL_NS_GPL.
I want the exported functions be minimal, as most of
of should go via the socket layer instead.
If EXPORT_SYMBOL_NS_GPL(func, "smbdirect") is better than
EXPORT_SYMBOL_FOR_MODULES() I can change that.
It means cifs.ko and ksmbd.ko would need MODULE_IMPORT_NS("smbdirect"), correct?
Thanks!
metze
^ permalink raw reply
* Re: [PATCH iproute2] ss: fix vsock port filter
From: Stefano Garzarella @ 2026-04-22 8:03 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Luigi Leonardi, stefanha, netdev, Mathieu Schroeter, David Ahern
In-Reply-To: <20260421163757.31da8751@phoenix.local>
On Wed, 22 Apr 2026 at 01:38, Stephen Hemminger <stephen@networkplumber.org> wrote:
>
> On Tue, 21 Apr 2026 14:35:12 +0200
> Luigi Leonardi <leonardi@redhat.com> wrote:
>
> > parse_hostcond() uses get_u32() to parse the vsock port into the
> > aafilter.port field, which is a long. On 64-bit systems, get_u32()
> > only writes the lower 32 bits, leaving the upper 32 bits set from
> > the -1 initialization. This causes the port comparison
> > "a->port != s->rport" in run_ssfilter() to always fail, since the
> > corrupted long value never matches the int rport.
> >
> > Fix by using get_long() instead, consistent with how AF_PACKET and
> > AF_NETLINK handle the same field.
> >
> > Fixes: c759116a0b2b ("ss: add AF_VSOCK support")
> > Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
> > ---
> > misc/ss.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/misc/ss.c b/misc/ss.c
> > index 14e9f27a..6e3321ac 100644
> > --- a/misc/ss.c
> > +++ b/misc/ss.c
> > @@ -2323,7 +2323,7 @@ void *parse_hostcond(char *addr, bool is_port)
> > port = find_port(addr, is_port);
> >
> > if (port && strcmp(port, "*") &&
> > - get_u32((__u32 *)&a.port, port, 0))
> > + get_long(&a.port, port, 0))
> > return NULL;
>
> If you use get_long() then the code could get negative values.
> Actually have port in ss as signed value seems like a mistake in original design.
>
> The port in unix domain socket is inode number.
> Originally it was int, but got changed to long back in 6.6
>
> The port in ss cache is int.
Yeah, as I mentioned I think the issue was introduced by commit
012cb515 ("ss: change aafilter port from int to long (inode support)").
After teverting it, the filtering on AF_VSOCK works correctly.
IMO (I don't know the code at all), commit 012cb515 is incomplete and we
should also move from int to long lport and rport in `struct sockstat`.
>
> The ss code is one of those legacy dog piles that needs a major
> overhaul and refactoring.
>
yeah, I see...
Thanks,
Stefano
^ permalink raw reply
* [syzbot] [atm?] general protection fault in atmtcp_ioctl
From: syzbot @ 2026-04-22 7:51 UTC (permalink / raw)
To: 3chas3, linux-atm-general, linux-kernel, netdev, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 1f5ffc672165 Fix mismerge of the arm64 / timer-core interr..
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14b72c36580000
kernel config: https://syzkaller.appspot.com/x/.config?x=95729ed00549063a
dashboard link: https://syzkaller.appspot.com/bug?extid=52bbaea57234493b6b17
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6b552538b97f/disk-1f5ffc67.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/724a3a1d69d7/vmlinux-1f5ffc67.xz
kernel image: https://storage.googleapis.com/syzbot-assets/ea684969e2c2/bzImage-1f5ffc67.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+52bbaea57234493b6b17@syzkaller.appspotmail.com
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 15453 Comm: syz.2.2407 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026
RIP: 0010:atmtcp_attach drivers/atm/atmtcp.c:408 [inline]
RIP: 0010:atmtcp_ioctl+0x860/0xdf0 drivers/atm/atmtcp.c:477
Code: ff e9 b4 fb ff ff 4c 8d 63 20 4c 89 e0 48 c1 e8 03 80 3c 28 00 74 08 4c 89 e7 e8 cb 21 0b fb 4d 8b 24 24 4c 89 e0 48 c1 e8 03 <80> 3c 28 00 74 08 4c 89 e7 e8 b2 21 0b fb 49 83 3c 24 00 0f 84 f0
RSP: 0018:ffffc9000f6b7c30 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888037f2f000 RCX: ffffffff8bb4829e
RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffc9000f6b7b80
RBP: dffffc0000000000 R08: ffffc9000f6b7b87 R09: 1ffff92001ed6f70
R10: dffffc0000000000 R11: fffff52001ed6f71 R12: 0000000000000000
R13: 1ffff11006fe5e00 R14: ffff88808ac0d440 R15: ffff88803234e000
FS: 00007fd42b7c26c0(0000) GS:ffff888125345000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000001944 CR3: 00000000358e6000 CR4: 00000000003526f0
Call Trace:
<TASK>
do_vcc_ioctl+0x36d/0x9d0 net/atm/ioctl.c:159
sock_do_ioctl+0x101/0x320 net/socket.c:1313
sock_ioctl+0x5c6/0x7f0 net/socket.c:1434
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fd42a99c819
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fd42b7c2028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fd42ac15fa0 RCX: 00007fd42a99c819
RDX: 0000000000000000 RSI: 0000000000006180 RDI: 0000000000000008
RBP: 00007fd42aa32c91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fd42ac16038 R14: 00007fd42ac15fa0 R15: 00007ffe8b9d1618
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:atmtcp_attach drivers/atm/atmtcp.c:408 [inline]
RIP: 0010:atmtcp_ioctl+0x860/0xdf0 drivers/atm/atmtcp.c:477
Code: ff e9 b4 fb ff ff 4c 8d 63 20 4c 89 e0 48 c1 e8 03 80 3c 28 00 74 08 4c 89 e7 e8 cb 21 0b fb 4d 8b 24 24 4c 89 e0 48 c1 e8 03 <80> 3c 28 00 74 08 4c 89 e7 e8 b2 21 0b fb 49 83 3c 24 00 0f 84 f0
RSP: 0018:ffffc9000f6b7c30 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888037f2f000 RCX: ffffffff8bb4829e
RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffc9000f6b7b80
RBP: dffffc0000000000 R08: ffffc9000f6b7b87 R09: 1ffff92001ed6f70
R10: dffffc0000000000 R11: fffff52001ed6f71 R12: 0000000000000000
R13: 1ffff11006fe5e00 R14: ffff88808ac0d440 R15: ffff88803234e000
FS: 00007fd42b7c26c0(0000) GS:ffff888125345000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000110c3d9593 CR3: 00000000358e6000 CR4: 00000000003526f0
----------------
Code disassembly (best guess):
0: ff ljmp (bad)
1: e9 b4 fb ff ff jmp 0xfffffbba
6: 4c 8d 63 20 lea 0x20(%rbx),%r12
a: 4c 89 e0 mov %r12,%rax
d: 48 c1 e8 03 shr $0x3,%rax
11: 80 3c 28 00 cmpb $0x0,(%rax,%rbp,1)
15: 74 08 je 0x1f
17: 4c 89 e7 mov %r12,%rdi
1a: e8 cb 21 0b fb call 0xfb0b21ea
1f: 4d 8b 24 24 mov (%r12),%r12
23: 4c 89 e0 mov %r12,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 80 3c 28 00 cmpb $0x0,(%rax,%rbp,1) <-- trapping instruction
2e: 74 08 je 0x38
30: 4c 89 e7 mov %r12,%rdi
33: e8 b2 21 0b fb call 0xfb0b21ea
38: 49 83 3c 24 00 cmpq $0x0,(%r12)
3d: 0f .byte 0xf
3e: 84 f0 test %dh,%al
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply
* RE: [RFC Patch net-next v1 9/9] r8169: add support for ethtool
From: Javen @ 2026-04-22 7:47 UTC (permalink / raw)
To: Andrew Lunn
Cc: hkallweit1@gmail.com, nic_swsd@realtek.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <c94f23f6-e353-4e88-af91-9e73d70d009d@lunn.ch>
>> +static int rtl8169_set_channels(struct net_device *dev,
>> + struct ethtool_channels *ch) {
>> + struct rtl8169_private *tp = netdev_priv(dev);
>> + bool if_running = netif_running(dev);
>> + int i;
>> +
>> + if (!tp->rss_support && (ch->rx_count > 1 || ch->tx_count > 1)) {
>> + netdev_warn(dev, "This chip does not support multiple
>channels/RSS.\n");
>> + return -EOPNOTSUPP;
>> + }
>> +
>> + if (ch->rx_count == 0 || ch->tx_count == 0)
>> + return -EINVAL;
>> + if (ch->rx_count > tp->HwSuppNumRxQueues ||
>> + ch->tx_count > tp->HwSuppNumTxQueues)
>> + return -EINVAL;
>> + if (ch->other_count || ch->combined_count)
>> + return -EINVAL;
>> +
>> + if (ch->rx_count == tp->num_rx_rings &&
>> + ch->tx_count == tp->num_tx_rings)
>> + return 0;
>> +
>> + if (if_running)
>> + rtl8169_close(dev);
>
>I assume this releases all the memory from the rings?
>
>> +
>> + tp->num_rx_rings = ch->rx_count;
>> + tp->num_tx_rings = ch->tx_count;
>> +
>> + tp->rss_enable = (tp->num_rx_rings > 1 && tp->rss_support);
>> +
>> + for (i = 0; i < tp->HwSuppIndirTblEntries; i++) {
>> + if (tp->rss_enable)
>> + tp->rss_indir_tbl[i] = ethtool_rxfh_indir_default(i, tp-
>>num_rx_rings);
>> + else
>> + tp->rss_indir_tbl[i] = 0;
>> + }
>> +
>> + if (tp->rss_enable)
>> + tp->InitRxDescType = RX_DESC_RING_TYPE_RSS;
>> + else
>> + tp->InitRxDescType = RX_DESC_RING_TYPE_DEAFULT;
>> +
>> + if (if_running)
>> + return rtl_open(dev);
>
>And this tries to allocate the memory needed for the rings? And if the system
>is under memory pressure, it fails and your network is dead?
>
>Please modify the code so that is first allocated the new rings and then frees
>the old rings, so you can fail gracefully.
>
> Andrew
Thank you for your advice. I have updated the code accordingly. This patch is
based on RFC Patch net-next 9/9.
This patch fixes issues when using ethtool -L to change the number of RX rings.
Specifically, if allocating memory for the new RX rings fails, it now gracefully rolls
back to the original state.
Additionally, it fixes a bug where ping or iperf would fail after manually setting
8 RX rings (when the default suggested RSS queue number is only 1 or 2). This
was caused by a missing call to rtl_set_irq_mask, which is now added.
---
drivers/net/ethernet/realtek/r8169_main.c | 101 +++++++++++++++++++---
1 file changed, 90 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 57087abe7d88..4b2abd3deee0 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -6116,6 +6116,8 @@ static void rtl8169_double_check_rss_support(struct rtl8169_private *tp)
if (tp->num_rx_rings >= 2) {
tp->rss_enable = 1;
tp->InitRxDescType = RX_DESC_RING_TYPE_RSS;
+ } else if (tp->num_rx_rings == 1 && tp->irq_nvecs > 1) {
+ tp->rss_enable = 0;
} else {
tp->rss_enable = 0;
if (tp->irq_nvecs > 1) {
@@ -6534,18 +6536,70 @@ static void rtl8169_get_channels(struct net_device *dev,
ch->combined_count = 0;
}
+static int rtl8169_realloc_rx(struct rtl8169_private *tp,
+ struct rtl8169_rx_ring *new_rx,
+ int new_count)
+{
+ int i, ret;
+
+ new_rx[0].rdsar_reg = RxDescAddrLow;
+ for (i = 1; i < new_count; i++)
+ new_rx[i].rdsar_reg = (u16)(RDSAR_Q1_LOW + (i - 1) * 8);
+
+ for (i = 0; i < new_count; i++)
+ new_rx[i].num_rx_desc = NUM_RX_DESC;
+
+ for (i = 0; i < new_count; i++) {
+ struct rtl8169_rx_ring *ring = &new_rx[i];
+
+ ring->RxDescAllocSize = (NUM_RX_DESC + 1) * sizeof(struct RxDesc);
+ ring->RxDescArray = dma_alloc_coherent(&tp->pci_dev->dev,
+ ring->RxDescAllocSize,
+ &ring->RxPhyAddr,
+ GFP_KERNEL);
+ if (!ring->RxDescArray) {
+ ret = -ENOMEM;
+ goto err_free;
+ }
+
+ memset(ring->Rx_databuff, 0, sizeof(ring->Rx_databuff));
+ ret = rtl8169_rx_fill(tp, ring);
+ if (ret) {
+ dma_free_coherent(&tp->pci_dev->dev, ring->RxDescAllocSize,
+ ring->RxDescArray, ring->RxPhyAddr);
+ goto err_free;
+ }
+ }
+ return 0;
+
+err_free:
+ while (--i >= 0) {
+ rtl8169_rx_clear(tp, &new_rx[i]);
+ dma_free_coherent(&tp->pci_dev->dev, new_rx[i].RxDescAllocSize,
+ new_rx[i].RxDescArray, new_rx[i].RxPhyAddr);
+ }
+ return ret;
+}
+
static int rtl8169_set_channels(struct net_device *dev,
struct ethtool_channels *ch)
{
struct rtl8169_private *tp = netdev_priv(dev);
bool if_running = netif_running(dev);
- int i;
+ struct rtl8169_rx_ring *new_rx;
+ u8 old_tx_desc_type = tp->InitRxDescType;
+ u8 new_desc_type;
+ bool new_rss_enable;
+ int i, ret;
if (!tp->rss_support && (ch->rx_count > 1 || ch->tx_count > 1)) {
netdev_warn(dev, "This chip does not support multiple channels/RSS.\n");
return -EOPNOTSUPP;
}
+ if (tp->features & RTL_FEATURE_MSI)
+ return -EINVAL;
+
if (ch->rx_count == 0 || ch->tx_count == 0)
return -EINVAL;
if (ch->rx_count > tp->HwSuppNumRxQueues ||
@@ -6558,13 +6612,39 @@ static int rtl8169_set_channels(struct net_device *dev,
ch->tx_count == tp->num_tx_rings)
return 0;
- if (if_running)
- rtl8169_close(dev);
+ new_rss_enable = (ch->rx_count > 1 && tp->rss_support);
+ new_desc_type = new_rss_enable ? RX_DESC_RING_TYPE_RSS : RX_DESC_RING_TYPE_DEAFULT;
+ tp->InitRxDescType = new_desc_type;
+
+ if (!if_running) {
+ tp->num_rx_rings = ch->rx_count;
+ tp->rss_enable = new_rss_enable;
+ return 0;
+ }
+
+ new_rx = kcalloc(R8169_MAX_RX_QUEUES, sizeof(*new_rx), GFP_KERNEL);
+ if (!new_rx)
+ return -ENOMEM;
+
+ ret = rtl8169_realloc_rx(tp, new_rx, ch->rx_count);
+ if (ret) {
+ kfree(new_rx);
+ tp->InitRxDescType = old_tx_desc_type;
+ return ret;
+ }
+
+ netif_stop_queue(dev);
+ rtl8169_down(tp);
+
+ for (i = 0; i < tp->num_rx_rings; i++)
+ rtl8169_rx_clear(tp, &tp->rx_ring[i]);
+ rtl8169_free_rx_desc(tp);
tp->num_rx_rings = ch->rx_count;
- tp->num_tx_rings = ch->tx_count;
+ tp->rss_enable = new_rss_enable;
- tp->rss_enable = (tp->num_rx_rings > 1 && tp->rss_support);
+ memset(tp->rx_ring, 0, sizeof(tp->rx_ring));
+ memcpy(tp->rx_ring, new_rx, sizeof(*new_rx) * ch->rx_count);
for (i = 0; i < tp->HwSuppIndirTblEntries; i++) {
if (tp->rss_enable)
@@ -6573,13 +6653,12 @@ static int rtl8169_set_channels(struct net_device *dev,
tp->rss_indir_tbl[i] = 0;
}
- if (tp->rss_enable)
- tp->InitRxDescType = RX_DESC_RING_TYPE_RSS;
- else
- tp->InitRxDescType = RX_DESC_RING_TYPE_DEAFULT;
+ rtl_set_irq_mask(tp);
+
+ rtl8169_up(tp);
+ netif_start_queue(dev);
- if (if_running)
- return rtl_open(dev);
+ kfree(new_rx);
return 0;
}
--
2.43.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox