* Re: [PATCH net] net: usb: cdc_ncm: reject negative chained NDP offsets
From: Greg Kroah-Hartman @ 2026-04-13 12:24 UTC (permalink / raw)
To: Oliver Neukum
Cc: linux-usb, netdev, linux-kernel, Oliver Neukum, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
stable
In-Reply-To: <198c1240-80a6-456c-8b12-25158c90c965@suse.com>
On Mon, Apr 13, 2026 at 02:11:50PM +0200, Oliver Neukum wrote:
> On 13.04.26 12:43, Greg Kroah-Hartman wrote:
> > On Mon, Apr 13, 2026 at 10:36:19AM +0200, Oliver Neukum wrote:
> > >
> > >
> > > On 11.04.26 12:53, Greg Kroah-Hartman wrote:
> > > > cdc_ncm_rx_fixup() reads dwNextNdpIndex from each NDP32 to chain to the
> > > > next one. The 32-bit value from the device is stored into the signed
> > > > int ndpoffset so that means values with the high bit set become
> > >
> > > Well, then isn't the problem rather that you should not store an
> > > unsigned value in a signed variable?
> >
> > No. well, yes. but no.
> >
> > cdc_ncm_rx_verify_nth16() returns an int, and is negative if something
> > went wrong, so we need it that way, and then we need to check it, like
> > we properly do at the top of the loop, it's just that at the bottom of
> > the loop we also need to do the same exact thing.
>
> Doesn't that suggest that cdc_ncm_rx_verify_nth16() is the problem?
> To be precise, the way it indicates errors?
> As this is an offset into a buffer and the header must be at the start
> of the buffer, isn't 0 the natural indication of an error?
Maybe? I really don't know, sorry, parsing the cdc_ncm buffer is not
something I looked too deeply into :)
greg k-h
* Re: [PATCH v11 net-next 5/7] octeontx2-af: npc: cn20k: add subbank search order control
From: Paolo Abeni @ 2026-04-13 12:56 UTC (permalink / raw)
To: Ratheesh Kannoth, netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, donald.hunter,
horms, jiri, chuck.lever, matttbe, cjubran, saeedm, leon, tariqt,
mbloch, dtatulea
In-Reply-To: <20260409025055.1664053-6-rkannoth@marvell.com>
On 4/9/26 4:50 AM, Ratheesh Kannoth wrote:
> CN20K NPC MCAM is split into 32 subbanks that are searched in a
> predefined order during allocation. Lower-numbered subbanks have
> higher priority than higher-numbered ones.
>
> Add a runtime devlink parameter "srch_order" (
> DEVLINK_PARAM_TYPE_U32_ARRAY) to control the order in which
> subbanks are searched during MCAM allocation.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> ---
> .../ethernet/marvell/octeontx2/af/cn20k/npc.c | 91 +++++++++++++++++-
> .../ethernet/marvell/octeontx2/af/cn20k/npc.h | 2 +
> .../marvell/octeontx2/af/rvu_devlink.c | 92 +++++++++++++++++--
> 3 files changed, 173 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> index e854b85ced9e..153765b3e504 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> @@ -3317,7 +3317,7 @@ rvu_mbox_handler_npc_cn20k_get_kex_cfg(struct rvu *rvu,
> return 0;
> }
>
> -static int *subbank_srch_order;
> +static u32 *subbank_srch_order;
>
> static void npc_populate_restricted_idxs(int num_subbanks)
> {
> @@ -3329,7 +3329,7 @@ static int npc_create_srch_order(int cnt)
> {
> int val = 0;
>
> - subbank_srch_order = kcalloc(cnt, sizeof(int),
> + subbank_srch_order = kcalloc(cnt, sizeof(u32),
> GFP_KERNEL);
> if (!subbank_srch_order)
> return -ENOMEM;
> @@ -3809,6 +3809,93 @@ static void npc_unlock_all_subbank(void)
> mutex_unlock(&npc_priv.sb[i].lock);
> }
>
> +int npc_cn20k_search_order_set(struct rvu *rvu,
> + u64 arr[MAX_NUM_SUB_BANKS], int cnt)
> +{
> + struct npc_mcam *mcam = &rvu->hw->mcam;
> + u32 fslots[MAX_NUM_SUB_BANKS][2];
> + u32 uslots[MAX_NUM_SUB_BANKS][2];
> + int fcnt = 0, ucnt = 0;
> + struct npc_subbank *sb;
> + int idx, val, rc = 0;
> +
> + unsigned long index;
> + void *v;
> +
> + if (cnt != npc_priv.num_subbanks) {
> + dev_err(rvu->dev, "Number of entries(%u) != %u\n",
> + cnt, npc_priv.num_subbanks);
> + return -EINVAL;
> + }
> +
> + mutex_lock(&mcam->lock);
> + npc_lock_all_subbank();
> + restrict_valid = false;
> +
> + for (int i = 0; i < cnt; i++)
> + subbank_srch_order[i] = (u32)arr[i];
> +
> + xa_for_each(&npc_priv.xa_sb_used, index, v) {
> + val = xa_to_value(v);
> + uslots[ucnt][0] = index;
> + uslots[ucnt][1] = val;
> + xa_erase(&npc_priv.xa_sb_used, index);
> + ucnt++;
> + }
> +
> + xa_for_each(&npc_priv.xa_sb_free, index, v) {
> + val = xa_to_value(v);
> + fslots[fcnt][0] = index;
> + fslots[fcnt][1] = val;
> + xa_erase(&npc_priv.xa_sb_free, index);
> + fcnt++;
> + }
> +
> + /* xa_store() is done under lock. If xa_store fails
> + * ,no rollback is planned as it might also fail.
Why do you need to go through an erase-and-add loop? Why can't you directly
xa_store() the new value? Note that xa_store() can fail due to memory
pressure.
Avoiding the previous erase will prevent deallocation and reallocation
and should avoid any realistic xa_store() failure.
AFAICS there are a few more items reported by sashiko, please have a look:
https://sashiko.dev/#/patchset/20260409025055.1664053-1-rkannoth%40marvell.com
/P
* Re: [net,PATCH v2] net: ks8851: Reinstate disabling of BHs around IRQ handler
From: Sebastian Andrzej Siewior @ 2026-04-13 12:57 UTC (permalink / raw)
To: Jakub Kicinski, Marek Vasut
Cc: netdev, stable, David S. Miller, Andrew Lunn, Eric Dumazet,
Nicolai Buchwitz, Paolo Abeni, Ronald Wahl, Yicong Hui,
linux-kernel, Thomas Gleixner
In-Reply-To: <20260412105125.48f0c58f@kernel.org>
On 2026-04-12 10:51:25 [-0700], Jakub Kicinski wrote:
> > Does the backtrace make the problem clearer, with the annotation above ?
>
> Sebastian, do you have any recommendation here? tl;dr is that the driver does
…
What about this:
--- a/drivers/net/ethernet/micrel/ks8851_par.c
+++ b/drivers/net/ethernet/micrel/ks8851_par.c
@@ -63,7 +63,7 @@ static void ks8851_lock_par(struct ks8851_net *ks, unsigned long *flags)
{
struct ks8851_net_par *ksp = to_ks8851_par(ks);
- spin_lock_irqsave(&ksp->lock, *flags);
+ spin_lock_bh(&ksp->lock);
}
/**
@@ -77,7 +77,7 @@ static void ks8851_unlock_par(struct ks8851_net *ks, unsigned long *flags)
{
struct ks8851_net_par *ksp = to_ks8851_par(ks);
- spin_unlock_irqrestore(&ksp->lock, *flags);
+ spin_unlock_bh(&ksp->lock);
}
/**
I don't see why it needs to disable interrupts. This seems to be used by
the _par driver and the _common part. The comments refer to DMA but I
see only FIFO access.
And while at it, I would recommend to
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 8048770958d60..f1c662887646c 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -378,9 +378,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (status & IRQ_LCI)
mii_check_link(&ks->mii);
- if (status & IRQ_RXI)
+ if (status & IRQ_RXI) {
+ local_bh_disable();
while ((skb = __skb_dequeue(&rxq)))
netif_rx(skb);
+ local_bh_enable();
+ }
return IRQ_HANDLED;
}
Because otherwise it will kick off backlog NAPI after every packet if
multiple packets are available.
Sebastian
* Re: [PATCH net-next 0/3] Follow-ups to nk_qlease net selftests
From: Daniel Borkmann @ 2026-04-13 13:02 UTC (permalink / raw)
To: kuba; +Cc: pabeni, dw, razor, netdev
In-Reply-To: <20260413114011.588162-1-daniel@iogearbox.net>
On 4/13/26 1:40 PM, Daniel Borkmann wrote:
> This is a set of follow-ups addressing [0]:
>
> - Split netdevsim tests from HW tests in nk_qlease and move the SW
> tests under selftests/net/
> - Remove multiple ksft_run()s to fix the recently enforced hard-fail
> - Move all the setup inside the test cases for the ones under
> selftests/net/ (I'll defer the HW ones to David)
> - Add more test coverage related to queue leasing behavior and corner
> cases, so now we have 45 tests in nk_qlease.py with netdevsim
> which does not need special HW
>
> [0] https://lore.kernel.org/netdev/20260409181950.7e099b6c@kernel.org
A few comments on the sashiko and ruff reviews [1,2]:
- re: "the socket would stay open until the cyclic garbage collector runs":
imho that is fine, since this would mean there's an error somewhere and the
test does not run as expected / would fail, and the socket is still
closed eventually
- re: "del test_ns" with a sleep to wait for cleanup_net: this was done the
same way as in already merged patches; I can think of a different/better way
with a wait loop where applicable to remove any potential for flakiness
- The other things flagged by Gemini also make sense
- I missed the ruff one, "[E741] Ambiguous variable name: `l`", will fix
I'm planning to address these in a v2 of the series, but as per the netdev rule
I will wait 24h before resending unless you'd like me to explicitly resend
earlier (given merge window timing).
Thanks,
Daniel
[1] https://sashiko.dev/#/patchset/20260413114011.588162-1-daniel%40iogearbox.net
[2] https://patchwork.kernel.org/project/netdevbpf/list/?series=1080682
* Re: [patch 14/38] slub: Use prandom instead of get_cycles()
From: hu.shengming @ 2026-04-13 13:02 UTC (permalink / raw)
To: harry
Cc: tglx, linux-kernel, vbabka, linux-mm, arnd, x86, baolu.lu, iommu,
m.grzeschik, netdev, linux-wireless, herbert, linux-crypto, dwmw2,
bernie, linux-fbdev, tytso, linux-ext4, akpm, urezki, elver,
dvyukov, kasan-dev, ryabinin.a.a, t.sailer, linux-hams, Jason,
richard.henderson, linux-alpha, linux, linux-arm-kernel,
catalin.marinas, chenhuacai, loongarch, geert, linux-m68k,
dinguyen, jonas, linux-openrisc, deller, linux-parisc, mpe,
linuxppc-dev, pjw, linux-riscv, hca, linux-s390, davem,
sparclinux, hao.li, cl, rientjes, roman.gushchin
In-Reply-To: <adyyNeVTkXQlnh_2@hyeyoo>
Harry wrote:
> [Resending after fixing broken email headers]
>
> On Fri, Apr 10, 2026 at 02:19:37PM +0200, Thomas Gleixner wrote:
> > The decision whether to scan remote nodes is based on a 'random' number
> > retrieved via get_cycles(). get_cycles() is about to be removed.
> >
> > There is already prandom state in the code, so use that instead.
> >
> > Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> > Cc: Vlastimil Babka <vbabka@kernel.org>
> > Cc: linux-mm@kvack.org
> > ---
>
> Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
>
> Is this for this merge window?
>
> This may conflict with upcoming changes on freelist shuffling [1]
> (not queued for slab/for-next yet though), but it should be easy to
> resolve.
>
Hi Harry,
Would you like me to wait for this patch to land in linux-next and then
rebase and send v6 on top?
Thanks,
--
With Best Regards,
Shengming
> [Cc'ing Shengming and SLAB ALLOCATOR folks]
> [1] https://lore.kernel.org/linux-mm/20260409204352095kKWVYKtZImN59ybO6iRNj@zte.com.cn
>
> --
> Cheers,
> Harry / Hyeonggon
>
> > mm/slub.c | 37 +++++++++++++++++++++++--------------
> > 1 file changed, 23 insertions(+), 14 deletions(-)
> >
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -3302,6 +3302,25 @@ static inline struct slab *alloc_slab_pa
> > return slab;
> > }
> >
> > +#if defined(CONFIG_SLAB_FREELIST_RANDOM) || defined(CONFIG_NUMA)
> > +static DEFINE_PER_CPU(struct rnd_state, slab_rnd_state);
> > +
> > +static unsigned int slab_get_prandom_state(unsigned int limit)
> > +{
> > + struct rnd_state *state;
> > + unsigned int res;
> > +
> > + /*
> > + * An interrupt or NMI handler might interrupt and change
> > + * the state in the middle, but that's safe.
> > + */
> > + state = &get_cpu_var(slab_rnd_state);
> > + res = prandom_u32_state(state) % limit;
> > + put_cpu_var(slab_rnd_state);
> > + return res;
> > +}
> > +#endif
> > +
> > #ifdef CONFIG_SLAB_FREELIST_RANDOM
> > /* Pre-initialize the random sequence cache */
> > static int init_cache_random_seq(struct kmem_cache *s)
> > @@ -3365,8 +3384,6 @@ static void *next_freelist_entry(struct
> > return (char *)start + idx;
> > }
> >
> > -static DEFINE_PER_CPU(struct rnd_state, slab_rnd_state);
> > -
> > /* Shuffle the single linked freelist based on a random pre-computed sequence */
> > static bool shuffle_freelist(struct kmem_cache *s, struct slab *slab,
> > bool allow_spin)
> > @@ -3383,15 +3400,7 @@ static bool shuffle_freelist(struct kmem
> > if (allow_spin) {
> > pos = get_random_u32_below(freelist_count);
> > } else {
> > - struct rnd_state *state;
> > -
> > - /*
> > - * An interrupt or NMI handler might interrupt and change
> > - * the state in the middle, but that's safe.
> > - */
> > - state = &get_cpu_var(slab_rnd_state);
> > - pos = prandom_u32_state(state) % freelist_count;
> > - put_cpu_var(slab_rnd_state);
> > + pos = slab_get_prandom_state(freelist_count);
> > }
> >
> > page_limit = slab->objects * s->size;
> > @@ -3882,7 +3891,7 @@ static void *get_from_any_partial(struct
> > * with available objects.
> > */
> > if (!s->remote_node_defrag_ratio ||
> > - get_cycles() % 1024 > s->remote_node_defrag_ratio)
> > + slab_get_prandom_state(1024) > s->remote_node_defrag_ratio)
> > return NULL;
> >
> > do {
> > @@ -7102,7 +7111,7 @@ static unsigned int
> >
> > /* see get_from_any_partial() for the defrag ratio description */
> > if (!s->remote_node_defrag_ratio ||
> > - get_cycles() % 1024 > s->remote_node_defrag_ratio)
> > + slab_get_prandom_state(1024) > s->remote_node_defrag_ratio)
> > return 0;
> >
> > do {
> > @@ -8421,7 +8430,7 @@ void __init kmem_cache_init_late(void)
> > flushwq = alloc_workqueue("slub_flushwq", WQ_MEM_RECLAIM | WQ_PERCPU,
> > 0);
> > WARN_ON(!flushwq);
> > -#ifdef CONFIG_SLAB_FREELIST_RANDOM
> > +#if defined(CONFIG_SLAB_FREELIST_RANDOM) || defined(CONFIG_NUMA)
> > prandom_init_once(&slab_rnd_state);
> > #endif
> > }
> >
> >
* Re: [PATCH net-next v7 04/10] selftests: net: Add tests for failover of team-aggregated ports
From: Paolo Abeni @ 2026-04-13 13:05 UTC (permalink / raw)
To: Marc Harvey, Jiri Pirko, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Shuah Khan, Simon Horman
Cc: netdev, linux-kernel, linux-kselftest, Kuniyuki Iwashima
In-Reply-To: <20260409-teaming-driver-internal-v7-4-f47e7589685d@google.com>
On 4/9/26 4:59 AM, Marc Harvey wrote:
> There are currently no kernel tests that verify the effect of setting
> the enabled team driver option. In a followup patch, there will be
> changes to this option, so it will be important to make sure it still
> behaves as it does now.
>
> The test verifies that tcp continues to work across two different team
> devices in separate network namespaces, even when member links are
> manually disabled.
>
> Signed-off-by: Marc Harvey <marcharvey@google.com>
> ---
> Changes in v6:
> - Use a tcp port with no associated service.
> - Make tcpdump helper function not string-replace port numbers with
> associated service names, even on Fedora, which has a tcpdump patch
> that changes the required flag.
> - Link to v5: https://lore.kernel.org/netdev/20260406-teaming-driver-internal-v5-4-e8a3f348a1c5@google.com/
>
> Changes in v5:
> - Use tcpdump for collecting traffic, rather than reading rx counters.
> - Link to v4: https://lore.kernel.org/netdev/20260403-teaming-driver-internal-v4-4-d3032f33ca25@google.com/
>
> Changes in v2:
> - Fix shellcheck failures.
> - Remove dependency on net forwarding lib and pipe viewer tools.
> - Use iperf3 for tcp instead of netcat.
> - Link to v1: https://lore.kernel.org/all/20260331053353.2504254-5-marcharvey@google.com/
> ---
> tools/testing/selftests/drivers/net/team/Makefile | 2 +
> tools/testing/selftests/drivers/net/team/config | 4 +
> .../testing/selftests/drivers/net/team/team_lib.sh | 148 +++++++++++++++++++
> .../drivers/net/team/transmit_failover.sh | 158 +++++++++++++++++++++
> tools/testing/selftests/net/forwarding/lib.sh | 9 +-
> 5 files changed, 319 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/drivers/net/team/Makefile b/tools/testing/selftests/drivers/net/team/Makefile
> index 02d6f51d5a06..777da2e0429e 100644
> --- a/tools/testing/selftests/drivers/net/team/Makefile
> +++ b/tools/testing/selftests/drivers/net/team/Makefile
> @@ -7,9 +7,11 @@ TEST_PROGS := \
> options.sh \
> propagation.sh \
> refleak.sh \
> + transmit_failover.sh \
> # end of TEST_PROGS
>
> TEST_INCLUDES := \
> + team_lib.sh \
> ../bonding/lag_lib.sh \
> ../../../net/forwarding/lib.sh \
> ../../../net/in_netns.sh \
> diff --git a/tools/testing/selftests/drivers/net/team/config b/tools/testing/selftests/drivers/net/team/config
> index 5d36a22ef080..8f04ae419c53 100644
> --- a/tools/testing/selftests/drivers/net/team/config
> +++ b/tools/testing/selftests/drivers/net/team/config
> @@ -6,4 +6,8 @@ CONFIG_NETDEVSIM=m
> CONFIG_NET_IPGRE=y
> CONFIG_NET_TEAM=y
> CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=y
> +CONFIG_NET_TEAM_MODE_BROADCAST=y
> CONFIG_NET_TEAM_MODE_LOADBALANCE=y
> +CONFIG_NET_TEAM_MODE_RANDOM=y
> +CONFIG_NET_TEAM_MODE_ROUNDROBIN=y
> +CONFIG_VETH=y
> diff --git a/tools/testing/selftests/drivers/net/team/team_lib.sh b/tools/testing/selftests/drivers/net/team/team_lib.sh
> new file mode 100644
> index 000000000000..2057f5edee79
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/team/team_lib.sh
> @@ -0,0 +1,148 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +test_dir="$(dirname "$0")"
> +export REQUIRE_MZ=no
> +export NUM_NETIFS=0
> +# shellcheck disable=SC1091
> +source "${test_dir}/../../../net/forwarding/lib.sh"
> +
> +TCP_PORT="43434"
> +
> +# Create a team interface inside of a given network namespace with a given
> +# mode, members, and IP address.
> +# Arguments:
> +# namespace - Network namespace to put the team interface into.
> +# team - The name of the team interface to setup.
> +# mode - The team mode of the interface.
> +# ip_address - The IP address to assign to the team interface.
> +# prefix_length - The prefix length for the IP address subnet.
> +# $@ - members - The member interfaces of the aggregation.
> +setup_team()
> +{
> + local namespace=$1
> + local team=$2
> + local mode=$3
> + local ip_address=$4
> + local prefix_length=$5
> + shift 5
> + local members=("$@")
> +
> + # Prerequisite: team must have no members
> + for member in "${members[@]}"; do
> + ip -n "${namespace}" link set "${member}" nomaster
> + done
> +
> + # Prerequisite: team must have no address in order to set it
> + # shellcheck disable=SC2086
> + ip -n "${namespace}" addr del "${ip_address}/${prefix_length}" \
> + ${NODAD} dev "${team}"
> +
> + echo "Setting team in ${namespace} to mode ${mode}"
> +
> + if ! ip -n "${namespace}" link set "${team}" down; then
> + echo "Failed to bring team device down"
> + return 1
> + fi
> + if ! ip netns exec "${namespace}" teamnl "${team}" setoption mode \
> + "${mode}"; then
> + echo "Failed to set ${team} mode to '${mode}'"
> + return 1
> + fi
> +
> + # Aggregate the members into teams.
> + for member in "${members[@]}"; do
> + ip -n "${namespace}" link set "${member}" master "${team}"
> + done
> +
> + # Bring team devices up and give them addresses.
> + if ! ip -n "${namespace}" link set "${team}" up; then
> + echo "Failed to set ${team} up"
> + return 1
> + fi
> +
> + # shellcheck disable=SC2086
> + if ! ip -n "${namespace}" addr add "${ip_address}/${prefix_length}" \
> + ${NODAD} dev "${team}"; then
> + echo "Failed to give ${team} IP address in ${namespace}"
> + return 1
> + fi
> +}
> +
> +# This is global used to keep track of the sender's iperf3 process, so that it
> +# can be terminated.
> +declare sender_pid
> +
> +# Start sending and receiving TCP traffic with iperf3.
> +# Globals:
> +# sender_pid - The process ID of the iperf3 sender process. Used to kill it
> +# later.
> +start_listening_and_sending()
> +{
> + ip netns exec "${NS2}" iperf3 -s -p "${TCP_PORT}" --logfile /dev/null &
> + # Wait for server to become reachable before starting client.
> + slowwait 5 ip netns exec "${NS1}" iperf3 -c "${NS2_IP}" -p \
> + "${TCP_PORT}" -t 1 --logfile /dev/null
Note for a possible follow-up: the iperf3 server is apparently never
stopped. You could use the wait_local_port_listen helper and the
`--one-off` iperf3 command line argument to avoid that (or explicitly
kill the server pid at cleanup time).
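A minimal sketch of the "kill the server pid at cleanup time" option, kept
independent of iperf3 so it can run anywhere. Everything here is an
assumption for illustration: `start_server`, `stop_server`, `server_pid` and
`saved_pid` are hypothetical names that do not exist in team_lib.sh, and
`sleep 60` only stands in for the backgrounded `iperf3 -s` process.

```shell
#!/bin/bash
# Hypothetical global, analogous to the existing sender_pid: tracks the
# background server so that cleanup can terminate it.
declare server_pid=""

# Stand-in for the long-running server. In the selftest this would be the
# backgrounded "ip netns exec ... iperf3 -s ..." command instead of sleep.
start_server()
{
	sleep 60 &
	server_pid=$!
}

# Terminate the server if it is still alive, then reap it so that no
# background job is left behind.
stop_server()
{
	if [[ -n "${server_pid}" ]] && kill -0 "${server_pid}" 2>/dev/null; then
		kill "${server_pid}"
		wait "${server_pid}" 2>/dev/null
	fi
	server_pid=""
}

# Run the cleanup on every exit path, including early failures.
trap stop_server EXIT

start_server
saved_pid="${server_pid}"
stop_server

if kill -0 "${saved_pid}" 2>/dev/null; then
	echo "server still running"
else
	echo "server stopped"
fi
```

The test already tracks the sender this way via sender_pid, so a second
global for the server would follow the same pattern. The `--one-off` route
avoids the extra bookkeeping, but then the readiness probe must not be an
iperf3 client connection, since that would consume the single accepted
connection.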
/P
* Re: [PATCH net-next v7 05/10] selftests: net: Add test for enablement of ports with teamd
From: Paolo Abeni @ 2026-04-13 13:07 UTC (permalink / raw)
To: Marc Harvey, Jiri Pirko, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Shuah Khan, Simon Horman
Cc: netdev, linux-kernel, linux-kselftest, Kuniyuki Iwashima
In-Reply-To: <20260409-teaming-driver-internal-v7-5-f47e7589685d@google.com>
On 4/9/26 4:59 AM, Marc Harvey wrote:
> There are no tests that verify enablement and disablement of team driver
> ports with teamd. This should work even with changes to the enablement
> option, so it is important to test.
>
> This test sets up an active-backup network configuration across two
> network namespaces, and tries to send traffic while changing which
> link is the active one.
>
> Also increase the team test timeout to 300 seconds, because gracefully
> killing teamd can take 30 seconds for each instance.
>
> Signed-off-by: Marc Harvey <marcharvey@google.com>
> ---
> Changes in v7:
> - Increase test timeout to 300 seconds, since terminating teamd can
> take 30 seconds during test cleanup.
> - Link to v6: https://lore.kernel.org/netdev/20260408-teaming-driver-internal-v6-5-e5bcdcf72504@google.com/
>
> Changes in v6:
> - Remove manual changing of member port states to UP, not needed.
> - Link to v5: https://lore.kernel.org/netdev/20260406-teaming-driver-internal-v5-5-e8a3f348a1c5@google.com/
>
> Changes in v5:
> - Make test wait for inactive link to stop receiving traffic after
> setting it to inactive, since there was a race condition.
> - Change test teardown to try graceful shutdown first, then use
> sigkill if needed.
> - Manually delete leftover teamd files during teardown.
> - Use tcpdump instead of checking rx counters.
> - Link to v4: https://lore.kernel.org/netdev/20260403-teaming-driver-internal-v4-5-d3032f33ca25@google.com/
>
> Changed in v3:
> - Make test cleanup kill teamd instead of terminate.
> - Link to v2: https://lore.kernel.org/netdev/20260401-teaming-driver-internal-v2-5-f80c1291727b@google.com/
>
> Changes in v2:
> - Fix shellcheck failures.
> - Remove dependency on net forwarding lib and pipe viewer tools.
> - Use iperf3 for tcp instead of netcat.
> - Link to v1: https://lore.kernel.org/all/20260331053353.2504254-6-marcharvey@google.com/
> ---
> tools/testing/selftests/drivers/net/team/Makefile | 1 +
> tools/testing/selftests/drivers/net/team/settings | 1 +
> .../testing/selftests/drivers/net/team/team_lib.sh | 26 +++
> .../drivers/net/team/teamd_activebackup.sh | 246 +++++++++++++++++++++
> tools/testing/selftests/net/lib.sh | 13 ++
> 5 files changed, 287 insertions(+)
>
> diff --git a/tools/testing/selftests/drivers/net/team/Makefile b/tools/testing/selftests/drivers/net/team/Makefile
> index 777da2e0429e..dab922d7f83d 100644
> --- a/tools/testing/selftests/drivers/net/team/Makefile
> +++ b/tools/testing/selftests/drivers/net/team/Makefile
> @@ -7,6 +7,7 @@ TEST_PROGS := \
> options.sh \
> propagation.sh \
> refleak.sh \
> + teamd_activebackup.sh \
> transmit_failover.sh \
> # end of TEST_PROGS
>
> diff --git a/tools/testing/selftests/drivers/net/team/settings b/tools/testing/selftests/drivers/net/team/settings
> new file mode 100644
> index 000000000000..694d70710ff0
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/team/settings
> @@ -0,0 +1 @@
> +timeout=300
> diff --git a/tools/testing/selftests/drivers/net/team/team_lib.sh b/tools/testing/selftests/drivers/net/team/team_lib.sh
> index 2057f5edee79..02ef0ee02d6a 100644
> --- a/tools/testing/selftests/drivers/net/team/team_lib.sh
> +++ b/tools/testing/selftests/drivers/net/team/team_lib.sh
> @@ -146,3 +146,29 @@ did_interface_receive()
> false
> fi
> }
> +
> +# Return true if the given interface in the given namespace does NOT receive
> +# traffic over a 1 second period.
> +# Arguments:
> +# interface - The name of the interface.
> +# ip_address - The destination IP address.
> +# namespace - The name of the namespace that the interface is in.
> +check_no_traffic()
> +{
> + local interface="$1"
> + local ip_address="$2"
> + local namespace="$3"
> + local rc
> +
> + save_tcpdump_outputs "${namespace}" "${interface}"
> + did_interface_receive "${interface}" "${ip_address}"
> + rc=$?
> +
> + clear_tcpdump_outputs "${interface}"
> +
> + if [[ "${rc}" -eq 0 ]]; then
> + return 1
> + else
> + return 0
> + fi
> +}
> diff --git a/tools/testing/selftests/drivers/net/team/teamd_activebackup.sh b/tools/testing/selftests/drivers/net/team/teamd_activebackup.sh
> new file mode 100755
> index 000000000000..2b26a697e179
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/team/teamd_activebackup.sh
> @@ -0,0 +1,246 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# These tests verify that teamd is able to enable and disable ports via the
> +# active backup runner.
> +#
> +# Topology:
> +#
> +# +-------------------------+ NS1
> +# | test_team1 |
> +# | + |
> +# | eth0 | eth1 |
> +# | +---+---+ |
> +# | | | |
> +# +-------------------------+
> +# | |
> +# +-------------------------+ NS2
> +# | | | |
> +# | +-------+ |
> +# | eth0 | eth1 |
> +# | + |
> +# | test_team2 |
> +# +-------------------------+
> +
> +export ALL_TESTS="teamd_test_active_backup"
> +
> +test_dir="$(dirname "$0")"
> +# shellcheck disable=SC1091
> +source "${test_dir}/../../../net/lib.sh"
> +# shellcheck disable=SC1091
> +source "${test_dir}/team_lib.sh"
> +
> +NS1=""
> +NS2=""
> +export NODAD="nodad"
> +PREFIX_LENGTH="64"
> +NS1_IP="fd00::1"
> +NS2_IP="fd00::2"
> +NS1_IP4="192.168.0.1"
> +NS2_IP4="192.168.0.2"
> +NS1_TEAMD_CONF=""
> +NS2_TEAMD_CONF=""
> +NS1_TEAMD_PID=""
> +NS2_TEAMD_PID=""
> +
> +while getopts "4" opt; do
> + case $opt in
> + 4)
> + echo "IPv4 mode selected."
> + export NODAD=
> + PREFIX_LENGTH="24"
> + NS1_IP="${NS1_IP4}"
> + NS2_IP="${NS2_IP4}"
> + ;;
> + \?)
> + echo "Invalid option: -${OPTARG}" >&2
> + exit 1
> + ;;
> + esac
> +done
> +
> +teamd_config_create()
> +{
> + local runner=$1
> + local dev=$2
> + local conf
> +
> + conf=$(mktemp)
> +
> + cat > "${conf}" <<-EOF
> + {
> + "device": "${dev}",
> + "runner": {"name": "${runner}"},
> + "ports": {
> + "eth0": {},
> + "eth1": {}
> + }
> + }
> + EOF
> + echo "${conf}"
> +}
> +
> +# Create the network namespaces, veth pair, and team devices in the specified
> +# runner.
> +# Globals:
> +# RET - Used by test infra, set by `check_err` functions.
> +# Arguments:
> +# runner - The Teamd runner to use for the Team devices.
> +environment_create()
> +{
> + local runner=$1
> +
> + echo "Setting up two-link aggregation for runner ${runner}"
> + echo "Teamd version is: $(teamd --version)"
> + trap environment_destroy EXIT
> +
> + setup_ns ns1 ns2
> + NS1="${NS_LIST[0]}"
> + NS2="${NS_LIST[1]}"
> +
> + for link in $(seq 0 1); do
> + ip -n "${NS1}" link add "eth${link}" type veth peer name \
> + "eth${link}" netns "${NS2}"
> + check_err $? "Failed to create veth pair"
> + done
> +
> + NS1_TEAMD_CONF=$(teamd_config_create "${runner}" "test_team1")
> + NS2_TEAMD_CONF=$(teamd_config_create "${runner}" "test_team2")
> + echo "Conf files are ${NS1_TEAMD_CONF} and ${NS2_TEAMD_CONF}"
> +
> + ip netns exec "${NS1}" teamd -d -f "${NS1_TEAMD_CONF}"
> + check_err $? "Failed to create team device in ${NS1}"
> + NS1_TEAMD_PID=$(pgrep -f "teamd -d -f ${NS1_TEAMD_CONF}")
> +
> + ip netns exec "${NS2}" teamd -d -f "${NS2_TEAMD_CONF}"
> + check_err $? "Failed to create team device in ${NS2}"
> + NS2_TEAMD_PID=$(pgrep -f "teamd -d -f ${NS2_TEAMD_CONF}")
> +
> + echo "Created team devices"
> + echo "Teamd PIDs are ${NS1_TEAMD_PID} and ${NS2_TEAMD_PID}"
> +
> + ip -n "${NS1}" link set test_team1 up
> + check_err $? "Failed to set test_team1 up in ${NS1}"
> + ip -n "${NS2}" link set test_team2 up
> + check_err $? "Failed to set test_team2 up in ${NS2}"
> +
> + ip -n "${NS1}" addr add "${NS1_IP}/${PREFIX_LENGTH}" "${NODAD}" dev \
> + test_team1
Note for a possible follow-up: it looks like the above will fail with:
Error: either "local" is duplicate, or "" is garbage.
when running in IPv4 mode (not invoked by the CI/self-test infra), due
to the quotes around ${NODAD}.
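The quoting effect can be reproduced without `ip` at all. The sketch below
is only an illustration: `count_args` and `nodad_args` are hypothetical
names, and the argument list merely mimics the shape of the `ip addr add`
call. It shows that a quoted empty variable still expands to one (empty)
argument, while the unquoted form used in team_lib.sh, or an array, expands
to none.

```shell
#!/bin/bash
# Hypothetical helper: print how many arguments were received.
count_args()
{
	echo "$#"
}

# IPv4 mode: the -4 option clears NODAD.
NODAD=""

# Quoted: the empty string is passed as a separate, empty argument.
quoted=$(count_args addr add "192.168.0.1/24" "${NODAD}" dev test_team1)

# Unquoted: the empty expansion disappears from the argument list.
# shellcheck disable=SC2086
unquoted=$(count_args addr add "192.168.0.1/24" ${NODAD} dev test_team1)

# Array: zero elements when NODAD is empty, one element otherwise, and no
# shellcheck suppression is needed.
nodad_args=()
if [[ -n "${NODAD}" ]]; then
	nodad_args=("${NODAD}")
fi
array=$(count_args addr add "192.168.0.1/24" "${nodad_args[@]}" dev test_team1)

echo "quoted=${quoted} unquoted=${unquoted} array=${array}"
# prints "quoted=6 unquoted=5 array=5"
```

Either dropping the quotes (with the SC2086 suppression, as setup_team()
already does) or switching to an array would keep the empty argument from
reaching `ip`.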
/P
* Re: [PATCH v3 net-next 13/15] net/sched: sch_cake: annotate data-races in cake_dump_stats()
From: Eric Dumazet @ 2026-04-13 13:11 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
Jamal Hadi Salim, Jiri Pirko, netdev, eric.dumazet
In-Reply-To: <87se8zcbcy.fsf@toke.dk>
On Mon, Apr 13, 2026 at 5:07 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Eric Dumazet <edumazet@google.com> writes:
>
> > cake_dump_stats() and cake_dump_class_stats() run without qdisc
> > spinlock being held.
> >
> > Add READ_ONCE()/WRITE_ONCE() annotations.
> >
> > Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: "Toke Høiland-Jørgensen" <toke@toke.dk>
> > ---
> > net/sched/sch_cake.c | 404 ++++++++++++++++++++++++-------------------
> > 1 file changed, 225 insertions(+), 179 deletions(-)
>
> One of these diffstats is not like the others - thanks for tackling this :)
>
> A few nits below:
>
> > diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
> > index 32e672820c00a88c6d8fe77a6308405e016525ea..f523f0aa4d830e9d3ec4d43bb123e1dc4f8f289d 100644
> > --- a/net/sched/sch_cake.c
> > +++ b/net/sched/sch_cake.c
> > @@ -399,14 +399,14 @@ static void cake_configure_rates(struct Qdisc *sch, u64 rate, bool rate_adjust);
> > * Here, invsqrt is a fixed point number (< 1.0), 32bit mantissa, aka Q0.32
> > */
> >
> > -static void cobalt_newton_step(struct cobalt_vars *vars)
> > +static void cobalt_newton_step(struct cobalt_vars *vars, u32 count)
> > {
> > u32 invsqrt, invsqrt2;
> > u64 val;
> >
> > invsqrt = vars->rec_inv_sqrt;
> > invsqrt2 = ((u64)invsqrt * invsqrt) >> 32;
> > - val = (3LL << 32) - ((u64)vars->count * invsqrt2);
> > + val = (3LL << 32) - ((u64)count * invsqrt2);
> >
> > val >>= 2; /* avoid overflow in following multiply */
> > val = (val * invsqrt) >> (32 - 2 + 1);
> > @@ -414,12 +414,12 @@ static void cobalt_newton_step(struct cobalt_vars *vars)
> > vars->rec_inv_sqrt = val;
> > }
> >
> > -static void cobalt_invsqrt(struct cobalt_vars *vars)
> > +static void cobalt_invsqrt(struct cobalt_vars *vars, u32 count)
> > {
> > - if (vars->count < REC_INV_SQRT_CACHE)
> > - vars->rec_inv_sqrt = inv_sqrt_cache[vars->count];
> > + if (count < REC_INV_SQRT_CACHE)
> > + vars->rec_inv_sqrt = inv_sqrt_cache[count];
> > else
> > - cobalt_newton_step(vars);
> > + cobalt_newton_step(vars, count);
> > }
> >
> > static void cobalt_vars_init(struct cobalt_vars *vars)
> > @@ -449,16 +449,19 @@ static bool cobalt_queue_full(struct cobalt_vars *vars,
> > bool up = false;
> >
> > if (ktime_to_ns(ktime_sub(now, vars->blue_timer)) > p->target) {
> > - up = !vars->p_drop;
> > - vars->p_drop += p->p_inc;
> > - if (vars->p_drop < p->p_inc)
> > - vars->p_drop = ~0;
> > - vars->blue_timer = now;
> > - }
> > - vars->dropping = true;
> > - vars->drop_next = now;
> > + u32 p_drop = vars->p_drop;
> > +
> > + up = !p_drop;
> > + p_drop += p->p_inc;
> > + if (p_drop < p->p_inc)
> > + p_drop = ~0;
> > + WRITE_ONCE(vars->p_drop, p_drop);
> > + WRITE_ONCE(vars->blue_timer, now);
> > + }
> > + WRITE_ONCE(vars->dropping, true);
> > + WRITE_ONCE(vars->drop_next, now);
> > if (!vars->count)
> > - vars->count = 1;
> > + WRITE_ONCE(vars->count, 1);
> >
> > return up;
> > }
> > @@ -474,21 +477,25 @@ static bool cobalt_queue_empty(struct cobalt_vars *vars,
> >
> > if (vars->p_drop &&
> > ktime_to_ns(ktime_sub(now, vars->blue_timer)) > p->target) {
> > - if (vars->p_drop < p->p_dec)
> > - vars->p_drop = 0;
> > + u32 p_drop = vars->p_drop;
> > +
> > + if (p_drop < p->p_dec)
> > + p_drop = 0;
> > else
> > - vars->p_drop -= p->p_dec;
> > - vars->blue_timer = now;
> > - down = !vars->p_drop;
> > + p_drop -= p->p_dec;
> > + WRITE_ONCE(vars->p_drop, p_drop);
> > + WRITE_ONCE(vars->blue_timer, now);
> > + down = !p_drop;
> > }
> > - vars->dropping = false;
> > + WRITE_ONCE(vars->dropping, false);
> >
> > if (vars->count && ktime_to_ns(ktime_sub(now, vars->drop_next)) >= 0) {
> > - vars->count--;
> > - cobalt_invsqrt(vars);
> > - vars->drop_next = cobalt_control(vars->drop_next,
> > - p->interval,
> > - vars->rec_inv_sqrt);
> > + WRITE_ONCE(vars->count, vars->count - 1);
> > + cobalt_invsqrt(vars, vars->count);
> > + WRITE_ONCE(vars->drop_next,
> > + cobalt_control(vars->drop_next,
> > + p->interval,
> > + vars->rec_inv_sqrt));
> > }
> >
> > return down;
> > @@ -507,6 +514,7 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars,
> > bool next_due, over_target;
> > ktime_t schedule;
> > u64 sojourn;
> > + u32 count;
> >
> > /* The 'schedule' variable records, in its sign, whether 'now' is before or
> > * after 'drop_next'. This allows 'drop_next' to be updated before the next
> > @@ -528,45 +536,50 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars,
> > over_target = sojourn > p->target &&
> > sojourn > p->mtu_time * bulk_flows * 2 &&
> > sojourn > p->mtu_time * 4;
> > - next_due = vars->count && ktime_to_ns(schedule) >= 0;
> > + count = vars->count;
> > + next_due = count && ktime_to_ns(schedule) >= 0;
> >
> > vars->ecn_marked = false;
> >
> > if (over_target) {
> > if (!vars->dropping) {
> > - vars->dropping = true;
> > - vars->drop_next = cobalt_control(now,
> > - p->interval,
> > - vars->rec_inv_sqrt);
> > + WRITE_ONCE(vars->dropping, true);
> > + WRITE_ONCE(vars->drop_next,
> > + cobalt_control(now,
> > + p->interval,
> > + vars->rec_inv_sqrt));
> > }
> > - if (!vars->count)
> > - vars->count = 1;
> > + if (!count)
> > + count = 1;
> > } else if (vars->dropping) {
> > - vars->dropping = false;
> > + WRITE_ONCE(vars->dropping, false);
> > }
> >
> > if (next_due && vars->dropping) {
> > /* Use ECN mark if possible, otherwise drop */
> > - if (!(vars->ecn_marked = INET_ECN_set_ce(skb)))
> > + vars->ecn_marked = INET_ECN_set_ce(skb);
> > + if (!vars->ecn_marked)
> > reason = QDISC_DROP_CONGESTED;
> >
> > - vars->count++;
> > - if (!vars->count)
> > - vars->count--;
> > - cobalt_invsqrt(vars);
> > - vars->drop_next = cobalt_control(vars->drop_next,
> > - p->interval,
> > - vars->rec_inv_sqrt);
> > + count++;
> > + if (!count)
> > + count--;
> > + cobalt_invsqrt(vars, count);
> > + WRITE_ONCE(vars->drop_next,
> > + cobalt_control(vars->drop_next,
> > + p->interval,
> > + vars->rec_inv_sqrt));
> > schedule = ktime_sub(now, vars->drop_next);
> > } else {
> > while (next_due) {
> > - vars->count--;
> > - cobalt_invsqrt(vars);
> > - vars->drop_next = cobalt_control(vars->drop_next,
> > - p->interval,
> > - vars->rec_inv_sqrt);
> > + count--;
> > + cobalt_invsqrt(vars, count);
> > + WRITE_ONCE(vars->drop_next,
> > + cobalt_control(vars->drop_next,
> > + p->interval,
> > + vars->rec_inv_sqrt));
> > schedule = ktime_sub(now, vars->drop_next);
> > - next_due = vars->count && ktime_to_ns(schedule) >= 0;
> > + next_due = count && ktime_to_ns(schedule) >= 0;
> > }
> > }
> >
> > @@ -575,11 +588,12 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars,
> > get_random_u32() < vars->p_drop)
> > reason = QDISC_DROP_FLOOD_PROTECTION;
> >
> > + WRITE_ONCE(vars->count, count);
> > /* Overload the drop_next field as an activity timeout */
> > - if (!vars->count)
> > - vars->drop_next = ktime_add_ns(now, p->interval);
> > + if (count)
>
> This seems to reverse the conditional?
Ah right, thanks !
>
> > + WRITE_ONCE(vars->drop_next, ktime_add_ns(now, p->interval));
> > else if (ktime_to_ns(schedule) > 0 && reason == QDISC_DROP_UNSPEC)
> > - vars->drop_next = now;
> > + WRITE_ONCE(vars->drop_next, now);
> >
> > return reason;
> > }
> > @@ -813,7 +827,7 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
> > i++, k = (k + 1) % CAKE_SET_WAYS) {
> > if (q->tags[outer_hash + k] == flow_hash) {
> > if (i)
> > - q->way_hits++;
> > + WRITE_ONCE(q->way_hits, q->way_hits + 1);
> >
> > if (!q->flows[outer_hash + k].set) {
> > /* need to increment host refcnts */
> > @@ -831,7 +845,7 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
> > for (i = 0; i < CAKE_SET_WAYS;
> > i++, k = (k + 1) % CAKE_SET_WAYS) {
> > if (!q->flows[outer_hash + k].set) {
> > - q->way_misses++;
> > + WRITE_ONCE(q->way_misses, q->way_misses + 1);
> > allocate_src = cake_dsrc(flow_mode);
> > allocate_dst = cake_ddst(flow_mode);
> > goto found;
> > @@ -841,7 +855,7 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
> > /* With no empty queues, default to the original
> > * queue, accept the collision, update the host tags.
> > */
> > - q->way_collisions++;
> > + WRITE_ONCE(q->way_collisions, q->way_collisions + 1);
> > allocate_src = cake_dsrc(flow_mode);
> > allocate_dst = cake_ddst(flow_mode);
> >
> > @@ -875,7 +889,8 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
> > q->flows[reduced_hash].srchost = srchost_idx;
> >
> > if (q->flows[reduced_hash].set == CAKE_SET_BULK)
> > - cake_inc_srchost_bulk_flow_count(q, &q->flows[reduced_hash], flow_mode);
> > + cake_inc_srchost_bulk_flow_count(q, &q->flows[reduced_hash],
> > + flow_mode);
> > }
> >
> > if (allocate_dst) {
> > @@ -899,7 +914,8 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
> > q->flows[reduced_hash].dsthost = dsthost_idx;
> >
> > if (q->flows[reduced_hash].set == CAKE_SET_BULK)
> > - cake_inc_dsthost_bulk_flow_count(q, &q->flows[reduced_hash], flow_mode);
> > + cake_inc_dsthost_bulk_flow_count(q, &q->flows[reduced_hash],
> > + flow_mode);
> > }
> > }
> >
> > @@ -1379,9 +1395,9 @@ static u32 cake_calc_overhead(struct cake_sched_data *qd, u32 len, u32 off)
> > len -= off;
> >
> > if (qd->max_netlen < len)
> > - qd->max_netlen = len;
> > + WRITE_ONCE(qd->max_netlen, len);
> > if (qd->min_netlen > len)
> > - qd->min_netlen = len;
> > + WRITE_ONCE(qd->min_netlen, len);
> >
> > len += q->rate_overhead;
> >
> > @@ -1401,9 +1417,9 @@ static u32 cake_calc_overhead(struct cake_sched_data *qd, u32 len, u32 off)
> > }
> >
> > if (qd->max_adjlen < len)
> > - qd->max_adjlen = len;
> > + WRITE_ONCE(qd->max_adjlen, len);
> > if (qd->min_adjlen > len)
> > - qd->min_adjlen = len;
> > + WRITE_ONCE(qd->min_adjlen, len);
> >
> > return len;
> > }
> > @@ -1416,7 +1432,7 @@ static u32 cake_overhead(struct cake_sched_data *q, const struct sk_buff *skb)
> > u16 segs = qdisc_pkt_segs(skb);
> > u32 len = qdisc_pkt_len(skb);
> >
> > - q->avg_netoff = cake_ewma(q->avg_netoff, off << 16, 8);
> > + WRITE_ONCE(q->avg_netoff, cake_ewma(q->avg_netoff, off << 16, 8));
> >
> > if (segs == 1)
> > return cake_calc_overhead(q, len, off);
> > @@ -1590,16 +1606,17 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
> > }
> >
> > if (cobalt_queue_full(&flow->cvars, &b->cparams, now))
> > - b->unresponsive_flow_count++;
> > + WRITE_ONCE(b->unresponsive_flow_count,
> > + b->unresponsive_flow_count + 1);
> >
> > len = qdisc_pkt_len(skb);
> > q->buffer_used -= skb->truesize;
> > - b->backlogs[idx] -= len;
> > - b->tin_backlog -= len;
> > + WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] - len);
> > + WRITE_ONCE(b->tin_backlog, b->tin_backlog - len);
> > qstats_backlog_sub(sch, len);
> >
> > - flow->dropped++;
> > - b->tin_dropped++;
> > + WRITE_ONCE(flow->dropped, flow->dropped + 1);
> > + WRITE_ONCE(b->tin_dropped, b->tin_dropped + 1);
> >
> > if (q->config->rate_flags & CAKE_FLAG_INGRESS)
> > cake_advance_shaper(q, b, skb, now, true);
> > @@ -1795,7 +1812,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > }
> >
> > if (unlikely(len > b->max_skblen))
> > - b->max_skblen = len;
> > + WRITE_ONCE(b->max_skblen, len);
> >
> > if (qdisc_pkt_segs(skb) > 1 && q->config->rate_flags & CAKE_FLAG_SPLIT_GSO) {
> > struct sk_buff *segs, *nskb;
> > @@ -1819,13 +1836,13 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > numsegs++;
> > slen += segs->len;
> > q->buffer_used += segs->truesize;
> > - b->packets++;
>
> Right above this hunk we do sch->q.qlen++; - does that need changing as
> well?
This was changed to qdisc_qlen_inc() in a prior commit in this series.
( net/sched: add qdisc_qlen_inc() and qdisc_qlen_dec() )
>
> > }
> >
> > /* stats */
> > - b->bytes += slen;
> > - b->backlogs[idx] += slen;
> > - b->tin_backlog += slen;
> > + WRITE_ONCE(b->bytes, b->bytes + slen);
> > + WRITE_ONCE(b->packets, b->packets + numsegs);
> > + WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] + slen);
> > + WRITE_ONCE(b->tin_backlog, b->tin_backlog + slen);
> > qstats_backlog_add(sch, slen);
> > q->avg_window_bytes += slen;
> >
> > @@ -1843,10 +1860,10 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > ack = cake_ack_filter(q, flow);
> >
> > if (ack) {
> > - b->ack_drops++;
> > + WRITE_ONCE(b->ack_drops, b->ack_drops + 1);
> > qdisc_qstats_drop(sch);
> > ack_pkt_len = qdisc_pkt_len(ack);
> > - b->bytes += ack_pkt_len;
> > + WRITE_ONCE(b->bytes, b->bytes + ack_pkt_len);
> > q->buffer_used += skb->truesize - ack->truesize;
> > if (q->config->rate_flags & CAKE_FLAG_INGRESS)
> > cake_advance_shaper(q, b, ack, now, true);
> > @@ -1859,10 +1876,10 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > }
> >
> > /* stats */
> > - b->packets++;
> > - b->bytes += len - ack_pkt_len;
> > - b->backlogs[idx] += len - ack_pkt_len;
> > - b->tin_backlog += len - ack_pkt_len;
> > + WRITE_ONCE(b->packets, b->packets + 1);
> > + WRITE_ONCE(b->bytes, b->bytes + len - ack_pkt_len);
> > + WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] + len - ack_pkt_len);
> > + WRITE_ONCE(b->tin_backlog, b->tin_backlog + len - ack_pkt_len);
> > qstats_backlog_add(sch, len - ack_pkt_len);
> > q->avg_window_bytes += len - ack_pkt_len;
> > }
> > @@ -1894,9 +1911,9 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > u64 b = q->avg_window_bytes * (u64)NSEC_PER_SEC;
> >
> > b = div64_u64(b, window_interval);
> > - q->avg_peak_bandwidth =
> > - cake_ewma(q->avg_peak_bandwidth, b,
> > - b > q->avg_peak_bandwidth ? 2 : 8);
> > + WRITE_ONCE(q->avg_peak_bandwidth,
> > + cake_ewma(q->avg_peak_bandwidth, b,
> > + b > q->avg_peak_bandwidth ? 2 : 8));
> > q->avg_window_bytes = 0;
> > q->avg_window_begin = now;
> >
> > @@ -1917,27 +1934,30 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > if (!flow->set) {
> > list_add_tail(&flow->flowchain, &b->new_flows);
> > } else {
> > - b->decaying_flow_count--;
> > + WRITE_ONCE(b->decaying_flow_count,
> > + b->decaying_flow_count - 1);
> > list_move_tail(&flow->flowchain, &b->new_flows);
> > }
> > flow->set = CAKE_SET_SPARSE;
> > - b->sparse_flow_count++;
> > + WRITE_ONCE(b->sparse_flow_count,
> > + b->sparse_flow_count + 1);
> >
> > - flow->deficit = cake_get_flow_quantum(b, flow, q->config->flow_mode);
> > + WRITE_ONCE(flow->deficit,
> > + cake_get_flow_quantum(b, flow, q->config->flow_mode));
> > } else if (flow->set == CAKE_SET_SPARSE_WAIT) {
> > /* this flow was empty, accounted as a sparse flow, but actually
> > * in the bulk rotation.
> > */
> > flow->set = CAKE_SET_BULK;
> > - b->sparse_flow_count--;
> > - b->bulk_flow_count++;
> > + WRITE_ONCE(b->sparse_flow_count, b->sparse_flow_count - 1);
> > + WRITE_ONCE(b->bulk_flow_count, b->bulk_flow_count + 1);
> >
> > cake_inc_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
> > cake_inc_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
> > }
> >
> > if (q->buffer_used > q->buffer_max_used)
> > - q->buffer_max_used = q->buffer_used;
> > + WRITE_ONCE(q->buffer_max_used, q->buffer_used);
> >
> > if (q->buffer_used <= q->buffer_limit)
> > return NET_XMIT_SUCCESS;
> > @@ -1976,8 +1996,8 @@ static struct sk_buff *cake_dequeue_one(struct Qdisc *sch)
> > if (flow->head) {
> > skb = dequeue_head(flow);
> > len = qdisc_pkt_len(skb);
> > - b->backlogs[q->cur_flow] -= len;
> > - b->tin_backlog -= len;
> > + WRITE_ONCE(b->backlogs[q->cur_flow], b->backlogs[q->cur_flow] - len);
> > + WRITE_ONCE(b->tin_backlog, b->tin_backlog - len);
> > qstats_backlog_sub(sch, len);
> > q->buffer_used -= skb->truesize;
> > qdisc_qlen_dec(sch);
> > @@ -2042,7 +2062,7 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
> >
> > cake_configure_rates(sch, new_rate, true);
> > q->last_checked_active = now;
> > - q->active_queues = num_active_qs;
> > + WRITE_ONCE(q->active_queues, num_active_qs);
> > }
> >
> > begin:
> > @@ -2149,8 +2169,10 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
> > */
> > if (flow->set == CAKE_SET_SPARSE) {
> > if (flow->head) {
> > - b->sparse_flow_count--;
> > - b->bulk_flow_count++;
> > + WRITE_ONCE(b->sparse_flow_count,
> > + b->sparse_flow_count - 1);
> > + WRITE_ONCE(b->bulk_flow_count,
> > + b->bulk_flow_count + 1);
> >
> > cake_inc_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
> > cake_inc_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
> > @@ -2165,7 +2187,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
> > }
> > }
> >
> > - flow->deficit += cake_get_flow_quantum(b, flow, q->config->flow_mode);
> > + WRITE_ONCE(flow->deficit,
> > + flow->deficit + cake_get_flow_quantum(b, flow, q->config->flow_mode));
> > list_move_tail(&flow->flowchain, &b->old_flows);
> >
> > goto retry;
> > @@ -2177,7 +2200,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
> > if (!skb) {
> > /* this queue was actually empty */
> > if (cobalt_queue_empty(&flow->cvars, &b->cparams, now))
> > - b->unresponsive_flow_count--;
> > + WRITE_ONCE(b->unresponsive_flow_count,
> > + b->unresponsive_flow_count - 1);
> >
> > if (flow->cvars.p_drop || flow->cvars.count ||
> > ktime_before(now, flow->cvars.drop_next)) {
> > @@ -2187,16 +2211,22 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
> > list_move_tail(&flow->flowchain,
> > &b->decaying_flows);
> > if (flow->set == CAKE_SET_BULK) {
> > - b->bulk_flow_count--;
> > + WRITE_ONCE(b->bulk_flow_count,
> > + b->bulk_flow_count - 1);
> >
> > - cake_dec_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
> > - cake_dec_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
> > + cake_dec_srchost_bulk_flow_count(b, flow,
> > + q->config->flow_mode);
> > + cake_dec_dsthost_bulk_flow_count(b, flow,
> > + q->config->flow_mode);
>
> These seem like unnecessary whitespace changes?
Line length was 105 ... a bit over the recommended limit.
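
For readers following the thread, the pattern applied to cobalt_should_drop() above (work on a local copy, publish once with WRITE_ONCE(), keep the "!count" test) can be sketched in userspace roughly as follows. This is only a minimal sketch, not the sch_cake code: toy_vars, toy_update() and toy_dump_count() are hypothetical names, and the WRITE_ONCE()/READ_ONCE() macros are simplified volatile-cast stand-ins for the kernel macros that rely on the GNU __typeof__ extension.

```c
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, reduced version of the per-flow state. */
struct toy_vars {
	uint32_t count;
	int64_t drop_next;
};

/* Writer side: in the real qdisc this would run under the qdisc lock,
 * so a plain read of vars->count is fine here.
 */
static void toy_update(struct toy_vars *vars, int64_t now,
		       int64_t interval, int over_target)
{
	uint32_t count = vars->count;	/* work on a local copy */

	if (over_target) {
		count++;
		if (!count)		/* wrapped around: saturate */
			count--;
	} else if (count) {
		count--;
	}

	/* Publish the final value once for lockless readers. */
	WRITE_ONCE(vars->count, count);

	/* Same sense as the original "if (!vars->count)" test. */
	if (!count)
		WRITE_ONCE(vars->drop_next, now + interval);
}

/* Reader side: what a lockless dump path would do. */
static uint32_t toy_dump_count(const struct toy_vars *vars)
{
	return READ_ONCE(vars->count);
}
```

The point of the local copy is that a reader never observes the intermediate steps of the update, only the final store; the reversed conditional pointed out above would show up here as drop_next being rewritten whenever count is non-zero.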
^ permalink raw reply
* [PATCH v2] vsock/virtio: fix accept queue count leak on transport mismatch
From: Dudu Lu @ 2026-04-13 13:14 UTC (permalink / raw)
To: netdev; +Cc: stefanha, sgarzare, mst, jasowang, Dudu Lu
virtio_transport_recv_listen() calls sk_acceptq_added() before
vsock_assign_transport(). If vsock_assign_transport() fails or
selects a different transport, the error path returns without
calling sk_acceptq_removed(), permanently incrementing
sk_ack_backlog.
After approximately backlog+1 such failures, sk_acceptq_is_full()
returns true, causing the listener to reject all new connections.
Fix by moving sk_acceptq_added() to after the transport validation,
matching the pattern used by vmci_transport and hyperv_transport.
Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Signed-off-by: Dudu Lu <phx0fer@gmail.com>
---
net/vmw_vsock/virtio_transport_common.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 8a9fb23c6e85..e01d983488e5 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1560,8 +1560,6 @@ virtio_transport_recv_listen(struct sock *sk, struct sk_buff *skb,
return -ENOMEM;
}
- sk_acceptq_added(sk);
-
lock_sock_nested(child, SINGLE_DEPTH_NESTING);
child->sk_state = TCP_ESTABLISHED;
@@ -1583,6 +1581,7 @@ virtio_transport_recv_listen(struct sock *sk, struct sk_buff *skb,
return ret;
}
+ sk_acceptq_added(sk);
if (virtio_transport_space_update(child, skb))
child->sk_write_space(child);
--
2.39.3 (Apple Git-145)
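
The leak described in the commit message can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the vsock code: toy_listener and the toy_* functions are hypothetical, the counter increment stands in for sk_acceptq_added(), and the "greater than" comparison is assumed to mirror sk_acceptq_is_full().

```c
#include <stdbool.h>

/* Hypothetical, reduced model of a listening socket's accept accounting. */
struct toy_listener {
	unsigned int ack_backlog;	/* stands in for sk_ack_backlog */
	unsigned int max_ack_backlog;	/* stands in for sk_max_ack_backlog */
};

static bool toy_acceptq_is_full(const struct toy_listener *l)
{
	return l->ack_backlog > l->max_ack_backlog;
}

/* Old ordering: count the child first, validate the transport later. */
static int toy_recv_listen_old(struct toy_listener *l, bool transport_ok)
{
	if (toy_acceptq_is_full(l))
		return -1;
	l->ack_backlog++;		/* models sk_acceptq_added() */
	if (!transport_ok)
		return -1;		/* error path never decrements: leak */
	return 0;
}

/* New ordering: only count the child once validation has succeeded. */
static int toy_recv_listen_new(struct toy_listener *l, bool transport_ok)
{
	if (toy_acceptq_is_full(l))
		return -1;
	if (!transport_ok)
		return -1;		/* nothing was counted, nothing leaks */
	l->ack_backlog++;		/* models sk_acceptq_added() */
	return 0;
}
```

With max_ack_backlog set to 2, three failed attempts through toy_recv_listen_old() leave the listener permanently "full", while the same sequence through toy_recv_listen_new() leaves the counter at zero.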
^ permalink raw reply related
* Re: [PATCH v3 net-next 00/15] net/sched: prepare RTNL removal from qdisc dumps
From: Eric Dumazet @ 2026-04-13 13:16 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Jamal Hadi Salim, Jiri Pirko, netdev, eric.dumazet
In-Reply-To: <20260410182257.774311-1-edumazet@google.com>
On Fri, Apr 10, 2026 at 11:23 AM Eric Dumazet <edumazet@google.com> wrote:
>
> We add annotations for data-races, so that most dump methods
> can run in parallel with data path.
>
> Then change mq and mqprio to no longer acquire each children
> qdisc spinlock.
>
> Next round of patches will wait for linux-7.2.
>
> v2/v3: addressed most sashiko.dev feedbacks.
> I think remaining problems (in red offloads) are minor
> and can be fixed later.
1) An issue was spotted in sch_cake.c (patch 13/15).
2) net-next has been closed.
Therefore, I will send a V4 in 2 weeks.
pw-bot: cr
^ permalink raw reply
* Re: [PATCH v11 net-next 4/7] devlink: Implement devlink param multi attribute nested data values
From: Paolo Abeni @ 2026-04-13 13:18 UTC (permalink / raw)
To: Ratheesh Kannoth
Cc: netdev, linux-kernel, linux-rdma, sgoutham, andrew+netdev, davem,
edumazet, kuba, donald.hunter, horms, jiri, chuck.lever, matttbe,
cjubran, saeedm, leon, tariqt, mbloch, dtatulea
In-Reply-To: <adzMvyIr7-uBtGlI@rkannoth-OptiPlex-7090>
On 4/13/26 1:00 PM, Ratheesh Kannoth wrote:
> On 2026-04-13 at 16:24:41, Paolo Abeni (pabeni@redhat.com) wrote:
>> On 4/9/26 4:50 AM, Ratheesh Kannoth wrote:
>>> @@ -441,6 +448,7 @@ union devlink_param_value {
>>> u64 vu64;
>>> char vstr[__DEVLINK_PARAM_MAX_STRING_VALUE];
>>> bool vbool;
>>> + struct devlink_param_u64_array u64arr;
>>
>> You mentioned that you intend to handle the possible CONFIG_FRAME_WARN
>> with a separate patch. IMHO such a patch needs to be part of this series,
>> or things will stay broken for an undefined amount of time until such
>> patch is merged separately.
>
> Patch no: 3 in the same series.
> https://lore.kernel.org/netdev/20260409025055.1664053-4-rkannoth@marvell.com/#t
I fear that is not enough. What about
devl_param_driverinit_value_set()? Likely devlink_param->validate is
called with enough space available in the stack to not care about the
huge argument, but the mentioned helper is called much deeper in the call chain.
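
To make the stack concern concrete, here is a rough sketch of how adding an array member grows every instance of the union, including those passed by value. All names and sizes are assumptions for illustration only: toy_u64_array is a hypothetical layout, TOY_U64_ARRAY_MAX is an invented bound, and TOY_MAX_STRING is assumed to be 32.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_STRING    32	/* assumed string value size */
#define TOY_U64_ARRAY_MAX 32	/* invented bound, for illustration only */

/* Hypothetical layout of the new array member. */
struct toy_u64_array {
	uint64_t size;
	uint64_t val[TOY_U64_ARRAY_MAX];
};

/* Union before the change: the string is the largest member. */
union toy_param_value_old {
	uint8_t vu8;
	uint16_t vu16;
	uint32_t vu32;
	uint64_t vu64;
	char vstr[TOY_MAX_STRING];
	bool vbool;
};

/* Union after the change: the array dominates the size. */
union toy_param_value_new {
	uint8_t vu8;
	uint16_t vu16;
	uint32_t vu32;
	uint64_t vu64;
	char vstr[TOY_MAX_STRING];
	bool vbool;
	struct toy_u64_array u64arr;
};

static size_t toy_old_size(void)
{
	return sizeof(union toy_param_value_old);
}

static size_t toy_new_size(void)
{
	return sizeof(union toy_param_value_new);
}
```

Every function that takes the union by value, or keeps one as a local, pays the larger size on its own stack frame, which is why helpers called deep in a call chain are the ones most likely to hit the frame-size warning.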
/P
^ permalink raw reply
* [GIT PULL] bluetooth-next 2026-04-13
From: Luiz Augusto von Dentz @ 2026-04-13 13:22 UTC (permalink / raw)
To: davem, kuba; +Cc: linux-bluetooth, netdev
The following changes since commit 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7:
tools: ynl: tests: fix leading space on Makefile target (2026-04-09 20:41:40 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git tags/for-net-next-2026-04-13
for you to fetch changes up to c347ca17d62a32c25564fee0ca3a2a7bc2d5fd6f:
Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (2026-04-13 09:19:42 -0400)
----------------------------------------------------------------
bluetooth-next pull request for net-next:
core:
- hci_core: Rate limit the logging of invalid ISO handle
- hci_sync: make hci_cmd_sync_run_once return -EEXIST if exists
- hci_event: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
- hci_event: fix potential UAF in SSP passkey handlers
- HCI: Avoid a couple -Wflex-array-member-not-at-end warnings
- L2CAP: CoC: Disconnect if received packet size exceeds MPS
- L2CAP: Add missing chan lock in l2cap_ecred_reconf_rsp
- L2CAP: Fix printing wrong information if SDU length exceeds MTU
- SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
drivers:
- btusb: MT7922: Add VID/PID 0489/e174
- btusb: Add Lite-On 04ca:3807 for MediaTek MT7921
- btusb: Add MT7927 IDs ASUS ROG Crosshair X870E Hero, Lenovo Legion Pro 7
16ARX9, Gigabyte Z790 AORUS MASTER X, MSI X870E Ace Max, TP-Link
Archer TBE550E, ASUS X870E / ProArt X870E-Creator.
- btusb: Add MT7902 IDs 13d3/3579, 13d3/3580, 13d3/3594, 13d3/3596, 0e8d/1ede
- btusb: MediaTek MT7922: Add VID 0489 & PID e11d
- btintel: Add support for Scorpious Peak2 support
- btintel: Add support for Scorpious Peak2F support
- btintel_pcie: Add device id of Scorpius Peak2, Nova Lake-PCD-H
- btintel_pcie: Add device id of Scorpious2, Nova Lake-PCD-S
- btmtk: Add reset mechanism if downloading firmware failed
- btmtk: Add MT6639 (MT7927) Bluetooth support
- btmtk: fix ISO interface setup for single alt setting
- btmtk: add MT7902 SDIO support
- Bluetooth: btmtk: add MT7902 MCU support
- btbcm: Add entry for BCM4343A2 UART Bluetooth
- qca: enable pwrseq support for wcn39xx devices
- hci_qca: Fix BT not getting powered-off on rmmod
- hci_qca: disable power control for WCN7850 when bt_en is not defined
- hci_qca: Fix missing wakeup during SSR memdump handling
- hci_ldisc: Clear HCI_UART_PROTO_INIT on error
- mmc: sdio: add MediaTek MT7902 SDIO device ID
- hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
----------------------------------------------------------------
Arnd Bergmann (1):
Bluetooth: btmtk: hide unused btmtk_mt6639_devs[] array
Chris Lu (4):
Bluetooth: btusb: MT7922: Add VID/PID 0489/e174
Bluetooth: btmtk: improve mt79xx firmware setup retry flow
Bluetooth: btmtk: add status check in mt79xx firmware setup
Bluetooth: btmtk: Add reset mechanism if downloading firmware failed
Christian Eggers (1):
Bluetooth: L2CAP: CoC: Disconnect if received packet size exceeds MPS
Dmitry Baryshkov (1):
Bluetooth: qca: enable pwrseq support for WCN39xx devices
Dongyang Jin (1):
Bluetooth: btbcm: remove done label in btbcm_patchram
Dudu Lu (1):
Bluetooth: l2cap: Add missing chan lock in l2cap_ecred_reconf_rsp
Dylan Eray (1):
Bluetooth: btusb: Add Lite-On 04ca:3807 for MediaTek MT7921
Gustavo A. R. Silva (1):
Bluetooth: hci.h: Avoid a couple -Wflex-array-member-not-at-end warnings
Hans de Goede (2):
Bluetooth: hci_qca: Fix confusing shutdown() and power_off() naming
Bluetooth: hci_qca: Fix BT not getting powered-off on rmmod
Javier Tia (8):
Bluetooth: btmtk: Add MT6639 (MT7927) Bluetooth support
Bluetooth: btmtk: fix ISO interface setup for single alt setting
Bluetooth: btusb: Add MT7927 ID for ASUS ROG Crosshair X870E Hero
Bluetooth: btusb: Add MT7927 ID for Lenovo Legion Pro 7 16ARX9
Bluetooth: btusb: Add MT7927 ID for Gigabyte Z790 AORUS MASTER X
Bluetooth: btusb: Add MT7927 ID for MSI X870E Ace Max
Bluetooth: btusb: Add MT7927 ID for TP-Link Archer TBE550E
Bluetooth: btusb: Add MT7927 ID for ASUS X870E / ProArt X870E-Creator
Johan Hovold (2):
Bluetooth: btusb: refactor endpoint lookup
Bluetooth: btmtk: refactor endpoint lookup
Jonathan Rissanen (1):
Bluetooth: hci_ldisc: Clear HCI_UART_PROTO_INIT on error
Kamiyama Chiaki (1):
Bluetooth: btusb: MediaTek MT7922: Add VID 0489 & PID e11d
Kiran K (10):
Bluetooth: btintel: Add support for hybrid signature for ScP2 onwards
Bluetooth: btintel: Replace CNVi id with hardware variant
Bluetooth: btintel: Add support for Scorpious Peak2 support
Bluetooth: btintel: Add DSBR support for ScP2 onwards
Bluetooth: btintel_pcie: Add support for exception dump for ScP2
Bluetooth: btintel: Add support for Scorpious Peak2F support
Bluetooth: btintel_pcie: Add support for exception dump for ScP2F
Bluetooth: btintel_pcie: Add device id of Scorpius Peak2, Nova Lake-PCD-H
Bluetooth: btintel_pcie: Add device id of Scorpious2, Nova Lake-PCD-S
Bluetooth: btintel_pcie: Align shared DMA memory to 128 bytes
Luiz Augusto von Dentz (2):
Bluetooth: btintel_pci: Fix btintel_pcie_read_hwexp code style
Bluetooth: L2CAP: Fix printing wrong information if SDU length exceeds MTU
Lukas Kraft (1):
bluetooth: btusb: Fix whitespace in btusb.c
Marek Vasut (1):
Bluetooth: btbcm: Add entry for BCM4343A2 UART Bluetooth
Pauli Virtanen (3):
Bluetooth: hci_core: Rate limit the logging of invalid ISO handle
Bluetooth: hci_sync: make hci_cmd_sync_run_once return -EEXIST if exists
Bluetooth: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
Sean Wang (8):
mmc: sdio: add MediaTek MT7902 SDIO device ID
Bluetooth: btmtk: add MT7902 MCU support
Bluetooth: btusb: Add new VID/PID 13d3/3579 for MT7902
Bluetooth: btusb: Add new VID/PID 13d3/3580 for MT7902
Bluetooth: btusb: Add new VID/PID 13d3/3594 for MT7902
Bluetooth: btusb: Add new VID/PID 13d3/3596 for MT7902
Bluetooth: btusb: Add new VID/PID 0e8d/1ede for MT7902
Bluetooth: btmtk: add MT7902 SDIO support
Shuai Zhang (2):
Bluetooth: hci_qca: disable power control for WCN7850 when bt_en is not defined
Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling
Shuvam Pandey (1):
Bluetooth: hci_event: fix potential UAF in SSP passkey handlers
Stefan Metzmacher (1):
Bluetooth: SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
Stefano Radaelli (1):
Bluetooth: hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
Thorsten Blum (3):
Bluetooth: btintel_pcie: Replace snprintf("%s") with strscpy
Bluetooth: btintel_pcie: Use struct_size to improve hci_drv_read_info
Bluetooth: btintel_pcie: use strscpy to copy plain strings
Vivek Sahu (1):
Bluetooth: qca: Refactor code on the basis of chipset names
drivers/bluetooth/btbcm.c | 11 ++--
drivers/bluetooth/btintel.c | 109 ++++++++++++++++++++++++++++------
drivers/bluetooth/btintel.h | 20 +++++--
drivers/bluetooth/btintel_pcie.c | 122 ++++++++++++++++++++++++---------------
drivers/bluetooth/btintel_pcie.h | 3 -
drivers/bluetooth/btmtk.c | 115 ++++++++++++++++++++++++++----------
drivers/bluetooth/btmtk.h | 9 ++-
drivers/bluetooth/btmtksdio.c | 44 +++++++++-----
drivers/bluetooth/btqca.c | 37 ++++++------
drivers/bluetooth/btusb.c | 84 +++++++++++++--------------
drivers/bluetooth/hci_ldisc.c | 3 +
drivers/bluetooth/hci_ll.c | 10 ++++
drivers/bluetooth/hci_qca.c | 84 ++++++++++++++++-----------
include/linux/mmc/sdio_ids.h | 1 +
include/net/bluetooth/hci.h | 16 +++--
net/bluetooth/hci_conn.c | 4 +-
net/bluetooth/hci_core.c | 4 +-
net/bluetooth/hci_event.c | 21 ++++---
net/bluetooth/hci_sync.c | 2 +-
net/bluetooth/l2cap_core.c | 15 ++++-
net/bluetooth/sco.c | 3 +-
21 files changed, 476 insertions(+), 241 deletions(-)
^ permalink raw reply
* Re: [PATCH net-next v7 00/10] Decouple receive and transmit enablement in team driver
From: patchwork-bot+netdevbpf @ 2026-04-13 13:30 UTC (permalink / raw)
To: Marc Harvey
Cc: jiri, andrew+netdev, davem, edumazet, kuba, pabeni, shuah, horms,
netdev, linux-kernel, linux-kselftest, kuniyu
In-Reply-To: <20260409-teaming-driver-internal-v7-0-f47e7589685d@google.com>
Hello:
This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Thu, 09 Apr 2026 02:59:22 +0000 you wrote:
> Allow independent control over receive and transmit enablement states
> for aggregated ports in the team driver.
>
> The motivation is that IEEE 802.3ad LACP "independent control" can't
> be implemented for the team driver currently. This was added to the
> bonding driver in commit 240fd405528b ("bonding: Add independent
> control state machine").
>
> [...]
Here is the summary with links:
- [net-next,v7,01/10] net: team: Annotate reads and writes for mixed lock accessed values
https://git.kernel.org/netdev/net-next/c/3faf0ce6e499
- [net-next,v7,02/10] net: team: Remove unused team_mode_op, port_enabled
https://git.kernel.org/netdev/net-next/c/014f249121d7
- [net-next,v7,03/10] net: team: Rename port_disabled team mode op to port_tx_disabled
https://git.kernel.org/netdev/net-next/c/cfa477df2cc6
- [net-next,v7,04/10] selftests: net: Add tests for failover of team-aggregated ports
https://git.kernel.org/netdev/net-next/c/05e352444b24
- [net-next,v7,05/10] selftests: net: Add test for enablement of ports with teamd
https://git.kernel.org/netdev/net-next/c/10407eebe886
- [net-next,v7,06/10] net: team: Rename enablement functions and struct members to tx
https://git.kernel.org/netdev/net-next/c/fa6ed31dd913
- [net-next,v7,07/10] net: team: Track rx enablement separately from tx enablement
https://git.kernel.org/netdev/net-next/c/68f0833f279a
- [net-next,v7,08/10] net: team: Add new rx_enabled team port option
https://git.kernel.org/netdev/net-next/c/0e47569a574d
- [net-next,v7,09/10] net: team: Add new tx_enabled team port option
https://git.kernel.org/netdev/net-next/c/bb9215a98179
- [net-next,v7,10/10] selftests: net: Add tests for team driver decoupled tx and rx control
https://git.kernel.org/netdev/net-next/c/d3870724eb16
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH v10 1/2] net: mhi: Enable Ethernet interface support
From: Paolo Abeni @ 2026-04-13 13:31 UTC (permalink / raw)
To: Vivek Pernamitta, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski
Cc: netdev, linux-kernel
In-Reply-To: <20260409-vdev_b1_eth_b1_next-20260408-v10-1-6d44ca48f189@oss.qualcomm.com>
On 4/9/26 8:08 AM, Vivek Pernamitta wrote:
> @@ -208,17 +235,20 @@ static void mhi_net_dl_callback(struct mhi_device *mhi_dev,
> skb = mhi_net_skb_agg(mhi_netdev, skb);
> mhi_netdev->skbagg_head = NULL;
> }
> -
> - switch (skb->data[0] & 0xf0) {
> - case 0x40:
> - skb->protocol = htons(ETH_P_IP);
> - break;
> - case 0x60:
> - skb->protocol = htons(ETH_P_IPV6);
> - break;
> - default:
> - skb->protocol = htons(ETH_P_MAP);
> - break;
> + if (mhi_netdev->ndev->type == ARPHRD_ETHER) {
> + skb->protocol = eth_type_trans(skb, mhi_netdev->ndev);
Sashiko says:
Is there a risk of an out-of-bounds read or kernel panic here if a malformed
or fragmented packet is received?
eth_type_trans() assumes the SKB has at least a 14-byte MAC header and calls
skb_pull_inline(). If the linear portion of the SKB is smaller than 14 bytes,
__skb_pull() will trigger a BUG_ON(skb->len < skb->data_len).
Should we call pskb_may_pull(skb, ETH_HLEN) before parsing the Ethernet
header?
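
The kind of guard being asked about can be sketched in plain C as a length check before any header field is touched. This is a userspace illustration only, not the mhi_net or skb API: toy_eth_type() and TOY_ETH_HLEN are hypothetical names, and the explicit length comparison merely stands in for what pskb_may_pull(skb, ETH_HLEN) would guarantee in the driver.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_HLEN 14	/* 6 bytes dst + 6 bytes src + 2 bytes ethertype */

/* Return the ethertype in host byte order, or -1 if the buffer is too
 * short to contain a full Ethernet header.
 */
static int toy_eth_type(const uint8_t *data, size_t len)
{
	/* Validate the length before reading any header byte. */
	if (len < TOY_ETH_HLEN)
		return -1;

	/* The ethertype is big-endian and sits at offsets 12 and 13. */
	return (data[12] << 8) | data[13];
}
```

A caller would drop the packet when this returns -1 instead of handing it to the header parser.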
> + } else {
> + switch (skb->data[0] & 0xf0) {
> + case 0x40:
> + skb->protocol = htons(ETH_P_IP);
> + break;
> + case 0x60:
> + skb->protocol = htons(ETH_P_IPV6);
> + break;
> + default:
> + skb->protocol = htons(ETH_P_MAP);
> + break;
> + }
> }
>
> u64_stats_update_begin(&mhi_netdev->stats.rx_syncp);
> @@ -306,6 +336,9 @@ static int mhi_net_newlink(struct mhi_device *mhi_dev, struct net_device *ndev)
> struct mhi_net_dev *mhi_netdev;
> int err;
>
> + if (ndev->header_ops)
> + eth_hw_addr_random(ndev);
> +
> mhi_netdev = netdev_priv(ndev);
>
> dev_set_drvdata(&mhi_dev->dev, mhi_netdev);
> @@ -356,7 +389,8 @@ static int mhi_net_probe(struct mhi_device *mhi_dev,
> int err;
>
> ndev = alloc_netdev(sizeof(struct mhi_net_dev), info->netname,
> - NET_NAME_PREDICTABLE, mhi_net_setup);
> + NET_NAME_ENUM, info->ethernet_if ?
> + mhi_ethernet_setup : mhi_net_setup);
Sashiko says:
Does changing the name assignment type from NET_NAME_PREDICTABLE to
NET_NAME_ENUM break backwards compatibility for existing legacy interfaces?
NET_NAME_PREDICTABLE instructs userspace to leave the kernel-provided
name alone, while NET_NAME_ENUM signals that the interface is a generic
enumeration and should be renamed. Applying this to existing interfaces
like mhi_hwip0 and mhi_swip0 might cause them to be unexpectedly renamed
on boot, potentially breaking existing userspace network configurations.
Please have a look at the full report:
https://sashiko.dev/#/patchset/20260409-vdev_b1_eth_b1_next-20260408-v10-0-6d44ca48f189%40oss.qualcomm.com
/P
^ permalink raw reply
* Re: [PATCH] xfrm: fix memory leak in xfrm_add_policy()
From: Sabrina Dubroca @ 2026-04-13 13:32 UTC (permalink / raw)
To: Deepanshu Kartikey
Cc: steffen.klassert, herbert, davem, edumazet, kuba, pabeni, horms,
leon, netdev, linux-kernel, syzbot+901d48e0b95aed4a2548
In-Reply-To: <20260412020809.35465-1-kartikey406@gmail.com>
2026-04-12, 07:38:09 +0530, Deepanshu Kartikey wrote:
> When xfrm_policy_insert() fails, the error path performs manual
> cleanup by calling xfrm_dev_policy_free(), security_xfrm_policy_free()
> and kfree() directly. This is incorrect because xfrm_policy_destroy()
> already handles all of these, causing a memory leak detected by
> kmemleak.
What is missing in the current code? "We have a better way to do this"
is not a bugfix, it's a cleanup. The kmemleak report says that we're
leaking the xfrm_policy struct on this codepath, which doesn't make
sense: that's covered by the existing kfree(xp).
Also, please use "PATCH ipsec" for fixes to net/xfrm and the rest of
the IPsec implementation.
--
Sabrina
^ permalink raw reply
* Re: commit 0c4f1c02d27a880b cause a deadlock issue
From: Greg KH @ 2026-04-13 12:55 UTC (permalink / raw)
To: Thorsten Leemhuis
Cc: He, Guocai (CN), Berg, Johannes, Friend,
Linux kernel regressions list, Korenblit, Miriam Rachel,
stable@vger.kernel.org
In-Reply-To: <58f6e74d-480e-4e0c-aa66-68dfc1de7421@leemhuis.info>
On Mon, Apr 13, 2026 at 01:58:56PM +0200, Thorsten Leemhuis wrote:
> On 4/3/26 15:00, Korenblit, Miriam Rachel wrote:
> >> From: Greg KH <gregkh@linuxfoundation.org>
> >> On Fri, Apr 03, 2026 at 12:44:48PM +0000, Korenblit, Miriam Rachel wrote:
> >>>> -----Original Message-----
> >>>> From: Greg KH <gregkh@linuxfoundation.org>
> >>>> On Fri, Apr 03, 2026 at 11:08:46AM +0000, He, Guocai (CN) wrote:
> >>>>> No, The mainline have no this issue.
> >>>>> The changes of 0c4f1c02d27a880b is not in mainline.
> >>>>
> >>>> That does not make sense, that commit is really commit e1696c8bd005
> >>>> ("wifi: cfg80211: stop NAN and P2P in cfg80211_leave") which is in
> >>>> all of the following releases:
> >>>> 5.10.252 5.15.202 6.1.165 6.6.128 6.12.75 6.18.14 6.19.4 7.0-rc1
> >>>> confused,
> >>> The change is indeed in mainline, but the locking situation in
> >>> mainline is totally different (that mutex does not even exist there)
> >>> Therefore, the issue is not supposed to happen in mainline.
> >>
> >> Ok, does that commit now need to be reverted from some of the stable branches?
> >> If so, which ones?
> >
> > From every version which is < 6.7.
>
> Greg, do you still have this in your todo mail queue somewhere? Just
> wondering, as last weeks 6.6.y released afics lacked a revert of
> e1696c8bd0056b ("wifi: cfg80211: stop NAN and P2P in cfg80211_leave") --
> and I cannot spot one in your public stable queue either.
>
> These are the commits that according to Miri need to be reverted if I
> understood things right:
>
> v6.6.128 (4d7a05da767e5c), v6.1.165 (0c4f1c02d27a88), v5.15.202
> (31344ffecd7a34), v5.10.252 (d91240f24e831d)
It is, yes, my queue is huge :(
It's fastest if someone sends me the reverts and I can easily apply them
that way. Otherwise it takes me a bit to do each one manually :(
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH iwl-net v2 5/6] ixgbe: fix ITR value overflow in adaptive interrupt throttling
From: Simon Horman @ 2026-04-13 13:39 UTC (permalink / raw)
To: Aleksandr Loktionov; +Cc: intel-wired-lan, anthony.l.nguyen, netdev
In-Reply-To: <20260408131154.2661818-6-aleksandr.loktionov@intel.com>
On Wed, Apr 08, 2026 at 03:11:53PM +0200, Aleksandr Loktionov wrote:
> ixgbe_update_itr() packs a mode flag (IXGBE_ITR_ADAPTIVE_LATENCY,
> bit 7) and a usecs delay (bits [6:0]) into an unsigned int, then
> stores the combined value in ring_container->itr which is declared as
> u8. Values above 0xFF wrap on truncation, corrupting both the delay
> and the mode flag on the next readback.
>
> Separate the mode bits from the usecs sub-field; clamp only the usecs
> portion to [0, IXGBE_ITR_ADAPTIVE_LATENCY - 1] (= 0x7F) using min_t()
> so overflow cannot bleed into bit 7.
>
> Fixes: b4ded8327fea ("ixgbe: Update adaptive ITR algorithm")
> Cc: stable@vger.kernel.org
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> v1 -> v2:
> - Add proper [N/M] numbering so patchwork tracks it as part of the set;
> no code change.
>
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 210c7b9..9f3ae21 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -2889,8 +2889,9 @@ static void ixgbe_update_itr(struct ixgbe_q_vector *q_vector,
> }
>
> clear_counts:
> - /* write back value */
> - ring_container->itr = itr;
> + ring_container->itr = (itr & IXGBE_ITR_ADAPTIVE_LATENCY) |
> + min_t(unsigned int, itr & ~IXGBE_ITR_ADAPTIVE_LATENCY,
> + IXGBE_ITR_ADAPTIVE_LATENCY - 1);
* It is not clear to me that the mode flag bit (IXGBE_ITR_ADAPTIVE_LATENCY)
is always set in itr when reaching this code. But with this patch that
bit will always be set in ring_container->itr.
* Perhaps no such case exists, but it's not clear to me how this handles a
case where the usec delay has overflowed into the mode flag bit.
As a hypothetical example, consider the case where the delay overflows to
exactly 0x80. The resulting delay is 0 (both with and without this
patch).
I would suggest an approach of keeping the delay and mode bits separate
during calculation - in separate local variables - and only combining
them when ring_container->itr is set.
This may turn out to be more verbose. But I expect it is easier to reason
with.
* Looking over the code, it looks like the maximum allowed usecs delay is
IXGBE_ITR_ADAPTIVE_MAX_USECS (126) rather than
IXGBE_ITR_ADAPTIVE_LATENCY - 1 (127).
* The calculation does not guard against delay values less
than IXGBE_ITR_ADAPTIVE_MIN_USECS, which, looking over the code, seems
to matter (and which occurred in the hypothetical example above).
* As itr is an unsigned int, and IXGBE_ITR_ADAPTIVE_LATENCY - 1 is a
compile time constant, I expect that min() is sufficient.
IOW, I don't think min_t is needed here.
* It looks like using FIELD_PREP is appropriate to construct
ring_container->itr. But that may be overkill if you end up with
something like:
ring_container->itr = mode | clamp(delay, IXGBE_ITR_ADAPTIVE_MIN_USECS,
IXGBE_ITR_ADAPTIVE_MAX_USECS);
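A small userspace sketch of that approach, to make the points above concrete. It is only a model: `itr_pack()` and `clamp_uint()` are made-up names, the constants are local copies with an `_M` suffix, and the value 10 for the minimum is an assumption about IXGBE_ITR_ADAPTIVE_MIN_USECS that has not been checked against the tree.

```c
#include <stdint.h>

/* Local stand-ins for the driver constants (values assumed, see above). */
#define ITR_ADAPTIVE_LATENCY_M   0x80u /* mode flag, bit 7 */
#define ITR_ADAPTIVE_MIN_USECS_M 10u   /* assumed minimum delay */
#define ITR_ADAPTIVE_MAX_USECS_M 126u  /* maximum delay, fits in bits [6:0] */

/* Plain-C equivalent of clamp(val, lo, hi): lower bound first, then upper. */
static unsigned int clamp_uint(unsigned int v, unsigned int lo, unsigned int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Keep mode and delay in separate variables and only combine them at the
 * point where the u8 is written, so an oversized delay can never spill
 * into the mode bit.
 */
static uint8_t itr_pack(unsigned int mode, unsigned int delay)
{
	mode &= ITR_ADAPTIVE_LATENCY_M;
	delay = clamp_uint(delay, ITR_ADAPTIVE_MIN_USECS_M,
			   ITR_ADAPTIVE_MAX_USECS_M);

	return (uint8_t)(mode | delay);
}
```

With this shape, a delay that has grown to 0x80 is clamped to 126 instead of collapsing to 0, and the mode bit reflects only what the caller passed as mode.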
>
> /* next update should occur within next jiffy */
> ring_container->next_update = next_update + 1;
> --
> 2.52.0
^ permalink raw reply
* Re: [PATCH net] tcp: update window_clamp when SO_RCVBUF is set
From: patchwork-bot+netdevbpf @ 2026-04-13 13:40 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, ncardwell,
kuniyu, willemb, dsahern, quic_subashab, quic_stranche
In-Reply-To: <20260408001438.129165-1-kuba@kernel.org>
Hello:
This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Tue, 7 Apr 2026 17:14:38 -0700 you wrote:
> Commit under Fixes moved recomputing the window clamp to
> tcp_measure_rcv_mss() (when scaling_ratio changes).
> I suspect it missed the fact that we don't recompute the clamp
> when rcvbuf is set. Until scaling_ratio changes we are
> stuck with the old window clamp which may be based on
> the small initial buffer. scaling_ratio may never change.
>
> [...]
Here is the summary with links:
- [net] tcp: update window_clamp when SO_RCVBUF is set
https://git.kernel.org/netdev/net-next/c/b025461303d8
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [GIT PULL] bluetooth-next 2026-04-13
From: Paolo Abeni @ 2026-04-13 13:41 UTC (permalink / raw)
To: Luiz Augusto von Dentz, davem, kuba; +Cc: linux-bluetooth, netdev
In-Reply-To: <20260413132247.320961-1-luiz.dentz@gmail.com>
On 4/13/26 3:22 PM, Luiz Augusto von Dentz wrote:
> The following changes since commit 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7:
>
> tools: ynl: tests: fix leading space on Makefile target (2026-04-09 20:41:40 -0700)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git tags/for-net-next-2026-04-13
>
> for you to fetch changes up to c347ca17d62a32c25564fee0ca3a2a7bc2d5fd6f:
>
> Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (2026-04-13 09:19:42 -0400)
>
> ----------------------------------------------------------------
> bluetooth-next pull request for net-next:
Net-next is closed for the merge window. I guess Jakub could still
consider merging this, but unless you want it very, very badly, I hope
it can just be postponed, as the PW queue is already long.
Thanks,
/P
^ permalink raw reply
* Re: [PATCH iwl-net v2 6/6] ixgbe: fix integer overflow and wrong bit position in ixgbe_validate_rtr()
From: Simon Horman @ 2026-04-13 13:43 UTC (permalink / raw)
To: Aleksandr Loktionov; +Cc: intel-wired-lan, anthony.l.nguyen, netdev
In-Reply-To: <20260408131154.2661818-7-aleksandr.loktionov@intel.com>
On Wed, Apr 08, 2026 at 03:11:54PM +0200, Aleksandr Loktionov wrote:
> Two bugs in the same loop in ixgbe_validate_rtr():
>
> 1. The 3-bit traffic-class field was extracted by shifting a u32 and
> assigning the result directly to a u8. For user priority 0 this is
> harmless; for UP[5..7] the shift leaves bits [15..21] in the u32
> which are then silently truncated when stored in u8. Mask with
> IXGBE_RTRUP2TC_UP_MASK before the assignment so only the intended
> 3 bits are kept.
>
> 2. When clearing an out-of-bounds entry the mask was always shifted by
> the fixed constant IXGBE_RTRUP2TC_UP_SHIFT (== 3), regardless of
> which loop iteration was being processed. This means only UP1 (bit
> position 3) was ever cleared; UP0,2..7 (positions 0, 6, 9, ..., 21)
> were left unreset, so invalid TC mappings persisted in hardware and
> could mis-steer received packets to the wrong traffic class.
> Use i * IXGBE_RTRUP2TC_UP_SHIFT to target the correct 3-bit field
> for each iteration.
>
> Swap the operand order in the mask expression to place the constant
> on the right per kernel coding style (noted by David Laight).
>
> Fixes: e7589eab9291 ("ixgbe: consolidate, setup for multiple traffic classes")
> Cc: stable@vger.kernel.org
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> v1 -> v2:
> - Add Fixes: tag; reroute to iwl-net (wrong bit positions cause packet
> mis-steering); swap to (reg >> ...) & MASK operand order per David
> Laight.
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply
* Re: [patch 14/38] slub: Use prandom instead of get_cycles()
From: Vlastimil Babka (SUSE) @ 2026-04-13 13:45 UTC (permalink / raw)
To: hu.shengming, harry
Cc: tglx, linux-kernel, linux-mm, arnd, x86, baolu.lu, iommu,
m.grzeschik, netdev, linux-wireless, herbert, linux-crypto, dwmw2,
bernie, linux-fbdev, tytso, linux-ext4, akpm, urezki, elver,
dvyukov, kasan-dev, ryabinin.a.a, t.sailer, linux-hams, Jason,
richard.henderson, linux-alpha, linux, linux-arm-kernel,
catalin.marinas, chenhuacai, loongarch, geert, linux-m68k,
dinguyen, jonas, linux-openrisc, deller, linux-parisc, mpe,
linuxppc-dev, pjw, linux-riscv, hca, linux-s390, davem,
sparclinux, hao.li, cl, rientjes, roman.gushchin
In-Reply-To: <20260413210252672ZfdcegJLJtyvlYdFAUBlr@zte.com.cn>
On 4/13/26 15:02, hu.shengming@zte.com.cn wrote:
> Harry wrote:
>> [Resending after fixing broken email headers]
>>
>> On Fri, Apr 10, 2026 at 02:19:37PM +0200, Thomas Gleixner wrote:
>> > The decision whether to scan remote nodes is based on a 'random' number
>> > retrieved via get_cycles(). get_cycles() is about to be removed.
>> >
>> > There is already prandom state in the code, so use that instead.
>> >
>> > Signed-off-by: Thomas Gleixner <tglx@kernel.org>
>> > Cc: Vlastimil Babka <vbabka@kernel.org>
>> > Cc: linux-mm@kvack.org
>> > ---
>>
>> Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
>>
>> Is this for this merge window?
I'd say it's not intended for 7.1 as it's not in -next and v1 was posted
just before the merge window.
>> This may conflict with upcoming changes on freelist shuffling [1]
>> (not queued for slab/for-next yet though), but it should be easy to
>> resolve.
Indeed, it's a simple conflict.
>
> Hi Harry,
>
> Would you like me to wait for this patch to land linux-next and then
> rebase and send v6 on top?
Just send it now, on the same base as before, so we can finish the reviews,
and we'll deal with the conflict after rc1.
^ permalink raw reply
* linux-next: manual merge of the net-next tree with the net tree
From: Mark Brown @ 2026-04-13 13:50 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Networking
Cc: Fernando Fernandez Mancera, Jesper Dangaard Brouer,
Linux Kernel Mailing List, Linux Next Mailing List
[-- Attachment #1: Type: text/plain, Size: 1798 bytes --]
Hi all,
Today's linux-next merge of the net-next tree got a conflict in:
include/net/sch_generic.h
between commit:
a6bd339dbb351 ("net_sched: fix skb memory leak in deferred qdisc drops")
from the net tree and commit:
ff2998f29f390 ("net: sched: introduce qdisc-specific drop reason tracing")
from the net-next tree.
I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.
diff --cc include/net/sch_generic.h
index 5fc0b1ebaf25c,5af262ec4bbd2..0000000000000
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@@ -1168,24 -1185,14 +1185,24 @@@ static inline void tcf_kfree_skb_list(s
}
static inline void qdisc_dequeue_drop(struct Qdisc *q, struct sk_buff *skb,
- enum skb_drop_reason reason)
+ enum qdisc_drop_reason reason)
{
+ struct Qdisc *root;
+
DEBUG_NET_WARN_ON_ONCE(!(q->flags & TCQ_F_DEQUEUE_DROPS));
DEBUG_NET_WARN_ON_ONCE(q->flags & TCQ_F_NOLOCK);
- tcf_set_qdisc_drop_reason(skb, reason);
- skb->next = q->to_free;
- q->to_free = skb;
+ rcu_read_lock();
+ root = qdisc_root_sleeping(q);
+
+ if (root->flags & TCQ_F_DEQUEUE_DROPS) {
- tcf_set_drop_reason(skb, reason);
++ tcf_set_qdisc_drop_reason(skb, reason);
+ skb->next = root->to_free;
+ root->to_free = skb;
+ } else {
+ kfree_skb_reason(skb, (enum skb_drop_reason)reason);
+ }
+ rcu_read_unlock();
}
/* Instead of calling kfree_skb() while root qdisc lock is held,
^ permalink raw reply
* Re: [PATCH iwl-net v2 6/6] ixgbe: fix integer overflow and wrong bit position in ixgbe_validate_rtr()
From: Simon Horman @ 2026-04-13 14:02 UTC (permalink / raw)
To: Aleksandr Loktionov; +Cc: intel-wired-lan, anthony.l.nguyen, netdev
In-Reply-To: <20260413134334.GP469338@kernel.org>
On Mon, Apr 13, 2026 at 02:43:34PM +0100, Simon Horman wrote:
> On Wed, Apr 08, 2026 at 03:11:54PM +0200, Aleksandr Loktionov wrote:
> > Two bugs in the same loop in ixgbe_validate_rtr():
> >
> > 1. The 3-bit traffic-class field was extracted by shifting a u32 and
> > assigning the result directly to a u8. For user priority 0 this is
> > harmless; for UP[5..7] the shift leaves bits [15..21] in the u32
> > which are then silently truncated when stored in u8. Mask with
> > IXGBE_RTRUP2TC_UP_MASK before the assignment so only the intended
> > 3 bits are kept.
> >
> > 2. When clearing an out-of-bounds entry the mask was always shifted by
> > the fixed constant IXGBE_RTRUP2TC_UP_SHIFT (== 3), regardless of
> > which loop iteration was being processed. This means only UP1 (bit
> > position 3) was ever cleared; UP0,2..7 (positions 0, 6, 9, ..., 21)
> > were left unreset, so invalid TC mappings persisted in hardware and
> > could mis-steer received packets to the wrong traffic class.
> > Use i * IXGBE_RTRUP2TC_UP_SHIFT to target the correct 3-bit field
> > for each iteration.
> >
> > Swap the operand order in the mask expression to place the constant
> > on the right per kernel coding style (noted by David Laight).
> >
> > Fixes: e7589eab9291 ("ixgbe: consolidate, setup for multiple traffic classes")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> > ---
> > v1 -> v2:
> > - Add Fixes: tag; reroute to iwl-net (wrong bit positions cause packet
> > mis-steering); swap to (reg >> ...) & MASK operand order per David
> > Laight.
>
> Reviewed-by: Simon Horman <horms@kernel.org>
Sorry, I was a little too hasty there.
An AI-generated code review points out that the cited commit
doesn't seem to have introduced the code being fixed,
and that perhaps the following Fixes tag would be more appropriate:
Fixes: 8b1c0b24d9af ("ixgbe: configure minimal packet buffers to support TC")
I will also forward on a review from Sashiko, although it
is for an existing bug and thus is strictly FYI.
^ permalink raw reply
* Re: [PATCH iwl-net v2 6/6] ixgbe: fix integer overflow and wrong bit position in ixgbe_validate_rtr()
From: Simon Horman @ 2026-04-13 14:03 UTC (permalink / raw)
To: aleksandr.loktionov
Cc: 'Simon Horman', intel-wired-lan, anthony.l.nguyen, netdev,
stable
In-Reply-To: <20260408131154.2661818-7-aleksandr.loktionov@intel.com>
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ixgbe: fix integer overflow and wrong bit position in ixgbe_validate_rtr()
This commit fixes two bugs in ixgbe_validate_rtr() by masking the
traffic-class field properly before assigning it to a u8, and correctly
shifting the mask for clearing out-of-bounds entries.
Cc: stable@vger.kernel.org
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 5ff462a848feb..8ebf717ce4017 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -9782,11 +9782,12 @@ static void ixgbe_validate_rtr(struct ixgbe_adapter *adapter, u8 tc)
> rsave = reg;
>
> for (i = 0; i < MAX_TRAFFIC_CLASS; i++) {
> - u8 up2tc = reg >> (i * IXGBE_RTRUP2TC_UP_SHIFT);
> + u8 up2tc = (reg >> (i * IXGBE_RTRUP2TC_UP_SHIFT)) &
> + IXGBE_RTRUP2TC_UP_MASK;
>
> /* If up2tc is out of bounds default to zero */
> if (up2tc > tc)
This wasn't introduced by this patch, but is there a pre-existing off-by-one
error in this bounds check?
If tc represents the number of enabled traffic classes (e.g., 4), the valid TC
indices would be 0 through tc - 1. If up2tc equals tc, it would be out of
bounds, but the condition up2tc > tc will evaluate to false and the invalid
mapping won't be cleared.
Could this allow an invalid UP-to-TC mapping to persist in the hardware
register, potentially causing received packets to be mis-steered to an
uninitialized or non-existent traffic class queue?
Should the condition be corrected to ensure up2tc is strictly less than tc
when tc > 0?
> - reg &= ~(0x7 << IXGBE_RTRUP2TC_UP_SHIFT);
> + reg &= ~(IXGBE_RTRUP2TC_UP_MASK << (i * IXGBE_RTRUP2TC_UP_SHIFT));
> }
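For reference, the loop as patched can be modelled in userspace roughly as follows. The `_M` constants and `validate_rtr_model()` are stand-ins invented for this sketch (shift 3 and mask 0x7 as in the patch, eight user priorities assumed). The bounds test is deliberately kept as `up2tc > tc` so the question above is easy to experiment with.

```c
#include <stdint.h>

#define RTRUP2TC_UP_SHIFT_M 3    /* width of one UP-to-TC field */
#define RTRUP2TC_UP_MASK_M  0x7u /* 3-bit field mask */
#define MAX_TRAFFIC_CLASS_M 8    /* assumed number of user priorities */

/* Return reg with every field whose value exceeds tc reset to zero. */
static uint32_t validate_rtr_model(uint32_t reg, uint8_t tc)
{
	int i;

	for (i = 0; i < MAX_TRAFFIC_CLASS_M; i++) {
		unsigned int shift = i * RTRUP2TC_UP_SHIFT_M;
		uint8_t up2tc = (reg >> shift) & RTRUP2TC_UP_MASK_M;

		/* Same comparison as the driver; note that up2tc == tc passes. */
		if (up2tc > tc)
			reg &= ~(RTRUP2TC_UP_MASK_M << shift);
	}

	return reg;
}
```

Feeding it a register with UP5 mapped to TC 7 and tc = 4 clears that field, while a field equal to 4 survives, which is the case the off-by-one question is about.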
^ permalink raw reply
* Re: [PATCH net] ice: Fix missing 1's complement negation in GCS raw checksum
From: Simon Horman @ 2026-04-13 14:11 UTC (permalink / raw)
To: Matt Fleming
Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, intel-wired-lan,
netdev, linux-kernel, kernel-team, Matt Fleming
In-Reply-To: <20260408190214.1287708-1-matt@readmodwrite.com>
On Wed, Apr 08, 2026 at 08:02:14PM +0100, Matt Fleming wrote:
> From: Matt Fleming <mfleming@cloudflare.com>
>
> Commit 905d1a220e8d ("ice: Add E830 checksum offload support") added
> Generic Checksum (GCS) support for E830 NICs but omitted the 1's
> complement negation (~) when converting the hardware raw_csum to
> skb->csum for CHECKSUM_COMPLETE.
>
> Without the negation, every CHECKSUM_COMPLETE packet fails the
> fast-path validation in nf_ip_checksum() and falls through to software
> checksumming via __skb_checksum_complete(), which triggers the
> rate-limited "hw csum failure" warning. Packets are still accepted
> (the software recheck passes) but hardware checksum offload is
> effectively disabled and the warning floods dmesg on systems running
> nf_conntrack on VLAN sub-interfaces.
>
> Multiple other drivers (idpf, ehea, iwlwifi, cassini, sunhme, enetc)
> also apply ~ for CHECKSUM_COMPLETE. The ice driver was the only in-tree
> user of csum_unfold() for CHECKSUM_COMPLETE that omitted it.
>
> Fixes: 905d1a220e8d ("ice: Add E830 checksum offload support")
> Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
Reviewed-by: Simon Horman <horms@kernel.org>
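The arithmetic behind the missing negation can be sketched in userspace. This is a generic model of ones' complement summing, not the ice code: `fold16()`, `ocsum()` and `raw_to_sum()` are illustrative names, and the sketch assumes the device hands back the bitwise complement of the sum, which is how the commit message reads.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffffu) + (sum >> 16);

	return (uint16_t)sum;
}

/* Ones' complement sum of big-endian 16-bit words; len is assumed even. */
static uint16_t ocsum(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];

	return fold16(sum);
}

/* Hypothetical conversion: the device reports ~sum, the stack wants sum. */
static uint16_t raw_to_sum(uint16_t raw_csum)
{
	return (uint16_t)~raw_csum;
}
```

Under that assumption, skipping the `~` gives the stack a value that never matches its own computation, so every packet would take the software fallback path described in the commit message.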
^ permalink raw reply