* Re: [RESEND PATCH v1] net: dsa: motorcomm: add yt92xx dsa driver
From: Andrew Lunn @ 2026-06-17 9:07 UTC (permalink / raw)
To: Kyle Switch
Cc: David Yang, olteanv, davem, edumazet, kuba, pabeni, horms, netdev,
linux-kernel, ming.xu, xiaolin.xu, jianmin.wang, de.ge
In-Reply-To: <88f726d5-1617-4d2e-8fbb-d3da9478b386@motor-comm.com>
> >> +#define CMM_PARAM_CHK(expr, err_code) \
> >> + do { \
> >> + if ((u32)(expr)) { \
> >> + return err_code; \
> >> + } \
> >> + } while (0)
> >> +
> >> +#define CMM_ERR_CHK(op, ret) \
> >> + do { \
> >> + ret = (op); \
> >> + if (ret != CMM_ERR_OK) \
> >> + return ret; \
> >> + } while (0)
> >
> > Do not use macros like this.
>
> Ans: Acknowledged, i will consider how to optimize them in the future.
It is not about optimization. Hiding a return statement in a macro is
very bad style. It leads to locking bugs and resource leaks, because
the return is invisible at the call site.
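To illustrate the problem, here is a minimal user-space sketch. It is
not code from the patch: the lock is a plain counter, and demo_lock(),
read_reg(), update_with_macro() and update_explicit() are hypothetical
names. ERR_CHK only imitates the shape of CMM_ERR_CHK.

```c
#include <assert.h>

/* Hypothetical stand-in for a mutex: counts how many times it is held. */
static int demo_lock_depth;

static void demo_lock(void)
{
	demo_lock_depth++;
}

static void demo_unlock(void)
{
	assert(demo_lock_depth > 0);	/* unlocking an unheld lock is a bug */
	demo_lock_depth--;
}

/* Hypothetical register access that fails when 'fail' is non-zero. */
static int read_reg(int fail)
{
	return fail ? -5 : 0;
}

/* Same shape as the macro under review: the return is hidden from callers. */
#define ERR_CHK(op, ret)		\
	do {				\
		ret = (op);		\
		if (ret != 0)		\
			return ret;	\
	} while (0)

/* Buggy: on failure the macro returns while the lock is still held. */
static int update_with_macro(int fail)
{
	int ret;

	demo_lock();
	ERR_CHK(read_reg(fail), ret);
	demo_unlock();
	return 0;
}

/* Explicit control flow: every exit path visibly passes through the unlock. */
static int update_explicit(int fail)
{
	int ret;

	demo_lock();
	ret = read_reg(fail);
	if (ret)
		goto out_unlock;
	/* ... further work under the lock would go here ... */
out_unlock:
	demo_unlock();
	return ret;
}
```

In this sketch update_with_macro(1) returns with the lock still held,
while update_explicit(1) releases it, because the only way out of the
function is through the out_unlock label.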
> >> +/*
> >> + * Macro Definition
> >> + */
> >> +#ifndef NULL
> >> +#define NULL 0
> >> +#endif
> >> +
> >> +#ifndef FALSE
> >> +#define FALSE 0
> >> +#endif
> >> +
> >> +#ifndef TRUE
> >> +#define TRUE 1
> >> +#endif
> >
> > Nonsense.
>
> Ans: Acknowledge, will be fixed later.
No. They will be fixed now.
> >> + /* Print chipid here since we are interested in lower 16 bits */
> >> + dev_info(dev,
> >> + "Motorcomm %s ethernet switch.\n",
> >> + info->name);
> >
> > Stop copy-n-paste.
>
> Ans: Sry for this, i will recheck the code to make sure each line of comments and code
> meaningful again.
Also, consider the comments. Do the comments add anything useful which
is not already obvious from the code? Comments should be about "Why?".
> >> --- a/include/uapi/linux/if_ether.h
> >> +++ b/include/uapi/linux/if_ether.h
> >> @@ -118,7 +118,7 @@
> >> #define ETH_P_QINQ1 0x9100 /* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
> >> #define ETH_P_QINQ2 0x9200 /* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
> >> #define ETH_P_QINQ3 0x9300 /* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
> >> -#define ETH_P_YT921X 0x9988 /* Motorcomm YT921x DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
> >> +#define ETH_P_YT92XX 0x9988 /* Motorcomm YT92xx DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
> >> #define ETH_P_EDSA 0xDADA /* Ethertype DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
> >> #define ETH_P_DSA_8021Q 0xDADB /* Fake VLAN Header for DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
> >> #define ETH_P_DSA_A5PSW 0xE001 /* A5PSW Tag Value [ NOT AN OFFICIALLY REGISTERED ID ] */
> >
> > UAPI stands for User-space API. Do not change it unless there is a
> > very very good reason.
> >
>
> Ans: The default tpid both yt921x and yt922x is 0x9988. I have modified this to
> allow for simultaneous use in both yt922x and yt921x scenarios.
As pointed out, this is UAPI. Any change to this file needs a good
explanation of how it does not change the user API. Does this break
backwards compatibility with user space applications? Maybe tcpdump or
wireshark has a dissector which expects ETH_P_YT921X and you have just
broken it?
> >> +#define YT922X_TAG_FORMAT2_NAME "yt922x-8b"
> >> +#define YT922X_FORMAT2_TAG_LEN 8
> >> +#define YT922X_PKT_TYPE GENMASK(15, 14)
> >> +#define YT922X_8B_CPUTAG_PKT_FROM_CPU 0x1
> >> +#define YT922X_8B_CPUTAG_SRC_PORT GENMASK(6, 2)
> >> +#define YT922X_8B_CPUTAG_DST_PORTMASK GENMASK(8, 0)
> >> +#define YT922X_8B_CPUTAG_DST_PORTMASK_0 BIT(15)
> >> +#define YT922X_8B_CPUTAG_DST_PORTMASK_0_EN 0x1
> >> +#define YT922X_8B_CPUTAG_FORCE_DST BIT(9)
> >> +#define YT922X_8B_CPUTAG_FORCE_DST_EN 0x1
> >
> > If yt922x tag format shares no common with yt921x, make a new tag driver.
>
> Ans: thank you for your suggestion, we will consider whether to create a new driver in the new file.
When you look at other tag drivers, you will also notice some drivers
implement two taggers in one file. So consider this if there is any
shared code.
> >> +static struct dsa_tag_driver *dsa_tag_driver_array[] = {
> >> + &DSA_TAG_DRIVER_NAME(yt921x_netdev_ops),
> >> + &DSA_TAG_DRIVER_NAME(yt922x_4b_netdev_ops),
> >> + &DSA_TAG_DRIVER_NAME(yt922x_8b_netdev_ops),
> >> +};
> >
> > If both are supported by the chip and 4b does nothing more than 8b
> > does, do not bother with it.
>
> Ans: 4b and 8b dsa tag may have different application scenarios. from my opinion,
> 1. 4b dsa tag can save 4 bytes of payload
> 2. 8b dsa tag carry more package info.
How do you plan to swap between the different formats?
The user perspective is that the machine has a collection of interfaces
which are used just as normal, using Linux tools like iproute2. If the
user enables a feature which requires the 8b tag format, will you
change the format from the DSA driver? And swap back to the 4 byte
format when the feature is no longer needed?
Andrew
* Re: [PATCH bpf-next v2 1/4] bpf: Initialize the l3mdev field for the fib lookup flow
From: Toke Høiland-Jørgensen @ 2026-06-17 9:06 UTC (permalink / raw)
To: Avinash Duduskar, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko
Cc: Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau,
Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis,
John Fastabend, Stanislav Fomichev, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
Shuah Khan, Jesper Dangaard Brouer, Mykyta Yatsenko, Leon Hwang,
KP Singh, Anton Protopopov, Amery Hung, Eyal Birger, Rong Tao,
bpf, netdev, linux-kselftest, linux-kernel
In-Reply-To: <20260616223426.3568080-2-avinash.duduskar@gmail.com>
> The helper already initializes the other flow fields the rules path
> consumes (flowi4_mark, flowi4_tun_key.tun_id, flowi4_uid and the v6
> counterparts); flowi*_l3mdev was added to that set afterwards and this
> helper was never updated to match. ip_route_input_slow() likewise zeroes
> the field before its input lookup. Do the same here.
So how about we explicitly zero-init the whole struct instead of adding
more fields ad-hoc like this? Otherwise this seems like something that
is likely to happen again if we ever add another field to the struct.
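A rough user-space sketch of the difference, to make the suggestion
concrete. struct demo_flow is a hypothetical stand-in, not the real
struct flowi4 layout, and fill_adhoc() / fill_zeroed() are made-up
names:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a flow key; not the real struct flowi4 layout. */
struct demo_flow {
	int oif;
	unsigned int mark;
	unsigned int uid;
	int l3mdev;	/* imagine this field was added to the struct later */
};

/* Ad-hoc style: each consumed field is assigned by hand, so a field added
 * later (l3mdev here) keeps whatever the caller's memory contained.
 */
static void fill_adhoc(struct demo_flow *fl, int oif)
{
	fl->oif = oif;
	fl->mark = 0;
	fl->uid = 0;
	/* l3mdev was never added to this list */
}

/* Zero the whole struct first, then set only what differs from zero. */
static void fill_zeroed(struct demo_flow *fl, int oif)
{
	memset(fl, 0, sizeof(*fl));
	fl->oif = oif;
}
```

With the second form a newly added field starts out as zero without
anyone having to remember to touch this function.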
-Toke
* [PATCH net-next] net: airoha: Make use of the helper function dev_err_probe()
From: Lei Zhu @ 2026-06-17 9:03 UTC (permalink / raw)
To: lorenzo, andrew+netdev, davem, edumazet, kuba, pabeni; +Cc: netdev, zhulei_szu
From: Lei Zhu <zhulei@kylinos.cn>
Use dev_err_probe() to reduce code size and simplify the code.
Signed-off-by: Lei Zhu <zhulei@kylinos.cn>
---
drivers/net/ethernet/airoha/airoha_eth.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 31cdb11cd78d..189f64e83a46 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -3071,10 +3071,9 @@ static int airoha_probe(struct platform_device *pdev)
eth->dev = &pdev->dev;
err = dma_set_mask_and_coherent(eth->dev, DMA_BIT_MASK(32));
- if (err) {
- dev_err(eth->dev, "failed configuring DMA mask\n");
- return err;
- }
+ if (err)
+ return dev_err_probe(eth->dev, err,
+ "failed configuring DMA mask\n");
eth->fe_regs = devm_platform_ioremap_resource_byname(pdev, "fe");
if (IS_ERR(eth->fe_regs))
@@ -3087,10 +3086,9 @@ static int airoha_probe(struct platform_device *pdev)
err = devm_reset_control_bulk_get_exclusive(eth->dev,
ARRAY_SIZE(eth->rsts),
eth->rsts);
- if (err) {
- dev_err(eth->dev, "failed to get bulk reset lines\n");
- return err;
- }
+ if (err)
+ return dev_err_probe(eth->dev, err,
+ "failed to get bulk reset lines\n");
xsi_rsts = devm_kcalloc(eth->dev,
eth->soc->num_xsi_rsts, sizeof(*xsi_rsts),
@@ -3105,10 +3103,9 @@ static int airoha_probe(struct platform_device *pdev)
err = devm_reset_control_bulk_get_exclusive(eth->dev,
eth->soc->num_xsi_rsts,
eth->xsi_rsts);
- if (err) {
- dev_err(eth->dev, "failed to get bulk xsi reset lines\n");
- return err;
- }
+ if (err)
+ return dev_err_probe(eth->dev, err,
+ "failed to get bulk xsi reset lines\n");
eth->napi_dev = alloc_netdev_dummy(0);
if (!eth->napi_dev)
--
2.25.1
* Re: [Intel-wired-lan] [PATCH iwl-next v1] ixgbe: Implement PCI reset handler
From: Paul Menzel @ 2026-06-17 9:03 UTC (permalink / raw)
To: Sergey Temerkhanov
Cc: intel-wired-lan, netdev, Aleksandr Loktionov, Bjorn Helgaas,
linux-pci
In-Reply-To: <20260617084329.199110-1-sergey.temerkhanov@intel.com>
[Cc: +Aleksandr (as in Reviewed-by:), +PCI subsystem]
Dear Sergey,
Thank you for your patch.
On 17.06.26 at 10:43, Sergey Temerkhanov wrote:
> Implement PCI device reset handler to allow the network device to
> get re-initialized and function after a PCI-level reset.
Please describe the problem in more detail. When does a PCI-level reset
occur, and what is the current problematic situation?
Also, what is ixgbe-specific compared to a general PCIe implementation?
Please share details on how to test it, and how you tested it.
> Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> drivers/net/ethernet/intel/ixgbe/ixgbe.h | 1 +
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 72 +++++++++++++++++++
> 2 files changed, 73 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 594ccb28da20..c4b0c5bb89c6 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -912,6 +912,7 @@ enum ixgbe_state_t {
> __IXGBE_PTP_TX_IN_PROGRESS,
> __IXGBE_RESET_REQUESTED,
> __IXGBE_PHY_INIT_COMPLETE,
> + __IXGBE_PCIE_RESET_IN_PROGRESS,
> };
>
> struct ixgbe_cb {
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 2ac274c73d61..a61ee5fff7be 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -12352,6 +12352,76 @@ static pci_ers_result_t ixgbe_io_slot_reset(struct pci_dev *pdev)
> return result;
> }
>
> +#define IXGBE_PCIE_RESET_RETRIES 1000
Why 1000? Isn’t there a generic PCIe macro? Please extend the commit
message.
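For what it is worth, the bound this constant implies can be worked out
from the wait loop itself. The sketch below is a user-space model that
assumes the bit never becomes free; count_sleeps() is a hypothetical
helper, not driver code.

```c
#include <assert.h>

#define RETRIES 1000	/* mirrors IXGBE_PCIE_RESET_RETRIES in the patch */

/* Model of the wait loop when the bit never becomes free: returns how many
 * times the loop body (the usleep_range() call) would run.
 * 'retries' must be at least 1, as in the patch.
 */
static unsigned int count_sleeps(unsigned int retries)
{
	unsigned int timeout = retries;
	unsigned int sleeps = 0;
	int bit_busy = 1;	/* test_and_set_bit() keeps returning true */

	assert(retries >= 1);
	while (bit_busy && --timeout)
		sleeps++;

	return sleeps;
}
```

With 1000 retries the body runs 999 times, so with
usleep_range(1000, 2000) the wait is roughly 1 to 2 seconds, plus
scheduling latency, before the timeout message is printed.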
> +
> +/**
> + * ixgbe_reset_prep - called before the pci bus is reset.
> + * @pdev: Pointer to PCI device
> + *
> + * Prepare the card for a reset, preventing the service task from running.
> + */
> +static void ixgbe_reset_prep(struct pci_dev *pdev)
> +{
> + struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
> + unsigned int timeout = IXGBE_PCIE_RESET_RETRIES;
> +
> + if (!adapter)
> + return;
> +
> + /* Prevent the service task from being requeued in the timer callback
> + * while we're resetting.
> + */
> + if (test_bit(__IXGBE_SERVICE_INITED, &adapter->state)) {
> + timer_delete_sync(&adapter->service_timer);
> + /* Prevent the service task from running while we're resetting. */
One of the two comments seems redundant.
> + cancel_work_sync(&adapter->service_task);
> + }
> +
> + pci_clear_master(pdev);
> +
> + while (test_and_set_bit(__IXGBE_RESETTING, &adapter->state) && --timeout)
> + usleep_range(1000, 2000);
> +
> + if (!timeout) {
> + e_err(drv, "Timed out waiting for __IXGBE_RESETTING to be released. Reset is needed\n");
> + pci_set_master(pdev);
> + return;
> + }
> +
> + set_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state);
> + smp_mb__after_atomic();
> +}
> +
> +/**
> + * ixgbe_reset_done - called after the pci bus has been reset.
> + * @pdev: Pointer to PCI device
> + *
> + * Allow the service task to run and schedule re-initialization.
> + */
> +static void ixgbe_reset_done(struct pci_dev *pdev)
> +{
> + struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
> +
> + smp_mb__before_atomic();
> + if (!test_and_clear_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state)) {
> + e_err(drv, "Reset done called without PCIe reset in progress\n");
How can this happen? What should the user reading this error do?
> + return;
> + }
> +
> + /* Allow the service task to run */
> + if (!test_bit(__IXGBE_REMOVING, &adapter->state)) {
> + clear_bit(__IXGBE_RESETTING, &adapter->state);
> + smp_mb__after_atomic();
> + }
> +
> + /* Schedule re-initialization */
> + if (!test_bit(__IXGBE_DOWN, &adapter->state)) {
> + set_bit(__IXGBE_RESET_REQUESTED, &adapter->state);
> + if (test_bit(__IXGBE_SERVICE_INITED, &adapter->state))
> + mod_timer(&adapter->service_timer, jiffies + 1);
> + }
> +}
> +
> /**
> * ixgbe_io_resume - called when traffic can start flowing again.
> * @pdev: Pointer to PCI device
> @@ -12384,6 +12454,8 @@ static const struct pci_error_handlers ixgbe_err_handler = {
> .error_detected = ixgbe_io_error_detected,
> .slot_reset = ixgbe_io_slot_reset,
> .resume = ixgbe_io_resume,
> + .reset_prepare = ixgbe_reset_prep,
> + .reset_done = ixgbe_reset_done,
> };
>
> static DEFINE_SIMPLE_DEV_PM_OPS(ixgbe_pm_ops, ixgbe_suspend, ixgbe_resume);
Kind regards,
Paul
* Re: [PATCH net v3] rtase: Workaround for TX hang caused by short UDP packets entering hardware PTP parsing
From: Simon Horman @ 2026-06-17 8:59 UTC (permalink / raw)
To: justinlai0215
Cc: Simon Horman, kuba, davem, edumazet, pabeni, andrew+netdev,
linux-kernel, netdev, stable, richardcochran, david.laight.linux,
aleksander.lobakin, pkshih, larry.chiu
In-Reply-To: <20260615131653.15730-1-justinlai0215@realtek.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
rtase: Workaround for TX hang caused by short UDP packets entering hardware PTP parsing
The patch adds a workaround for a hardware bug in rtase where short UDP
packets triggering PTP parsing can cause a TX hang due to out-of-bounds reads.
It pads affected packets so the transport data reaches a minimum length.
Cc: stable@vger.kernel.org
> diff --git a/drivers/net/ethernet/realtek/rtase/rtase_main.c b/drivers/net/ethernet/realtek/rtase/rtase_main.c
> index 55105d34bc797..4c295a39c7a0c 100644
> --- a/drivers/net/ethernet/realtek/rtase/rtase_main.c
> +++ b/drivers/net/ethernet/realtek/rtase/rtase_main.c
[ ... ]
> @@ -1249,6 +1250,81 @@ static u32 rtase_tx_csum(struct sk_buff *skb, const struct net_device *dev)
> return csum_cmd;
> }
>
> +static bool rtase_get_udp_offset(struct sk_buff *skb, u32 *udp_offset)
> +{
> + int no = skb_network_offset(skb);
> + struct ipv6hdr *i6h, _i6h;
> + struct iphdr *ih, _ih;
> +
> + switch (vlan_get_protocol(skb)) {
> + case htons(ETH_P_IP):
[Severity: High]
Does relying on vlan_get_protocol(skb) allow an unprivileged user to bypass
this workaround?
If an attacker uses an AF_PACKET/SOCK_RAW socket to inject a raw Ethernet
frame containing an IPv4/UDP packet, they could set skb->protocol to an
arbitrary value like htons(ETH_P_ALL). The switch statement will fall
through to the default case and return false, causing rtase_skb_pad() to
skip the padding.
Since the hardware independently parses the raw frame based on the EtherType,
could it still process the unpadded packet and trigger the TX hang?
> + ih = skb_header_pointer(skb, no, sizeof(_ih), &_ih);
> + if (!ih)
> + return false;
[ ... ]
> +static bool rtase_skb_pad(struct sk_buff *skb)
> +{
> + __be16 *dest, _dest;
> + u32 trans_data_len;
> + u32 udp_offset;
> + u16 dest_port;
> + u32 pad_len;
> +
> + if (!rtase_get_udp_offset(skb, &udp_offset))
> + return true;
> +
> + trans_data_len = skb->len - udp_offset;
> + if (trans_data_len < offsetof(struct udphdr, len) ||
> + trans_data_len >= RTASE_MIN_PAD_LEN)
> + return true;
[Severity: High]
Does this logic inadvertently skip padding for underflowed lengths and tiny
payloads?
If the IP header is malformed such that skb->len < udp_offset, the unsigned
subtraction for trans_data_len will underflow to a large positive value.
This causes trans_data_len >= RTASE_MIN_PAD_LEN to be true, bypassing the
padding.
Additionally, if the packet has fewer than 4 bytes of UDP payload, meaning
trans_data_len < offsetof(struct udphdr, len), it explicitly returns true.
Since the hardware blindly fetches the UDP destination port at
udp_offset + 2 via DMA, won't transmitting these short or malformed packets
without padding still expose the hardware to out-of-bounds reads?
> +
> + dest = skb_header_pointer(skb,
> + udp_offset + offsetof(struct udphdr, dest),
> + sizeof(_dest), &_dest);
[ ... ]
> @@ -1362,6 +1438,9 @@ static netdev_tx_t rtase_start_xmit(struct sk_buff *skb,
> opts2 |= rtase_tx_csum(skb, dev);
> }
>
> + if (!rtase_skb_pad(skb))
> + goto err_dma_0;
> +
> frags = rtase_xmit_frags(ring, skb, opts1, opts2);
> if (unlikely(frags < 0))
> goto err_dma_0;
[Severity: High]
This is a pre-existing issue, but does the err_dma_1 error path in
rtase_start_xmit() contain a double free of the skb?
When transmitting a fragmented packet, rtase_xmit_frags() maps the
fragments and assigns ring->skbuff[entry] = skb for the last fragment
descriptor. Back in rtase_start_xmit(), if dma_map_single() fails for the
linear part, it jumps to err_dma_1:
err_dma_1:
ring->skbuff[entry] = NULL;
rtase_tx_clear_range(ring, ring->cur_idx + 1, frags);
err_dma_0:
tp->stats.tx_dropped++;
dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
rtase_tx_clear_range() iterates over the mapped fragments and calls
dev_kfree_skb_any() when it finds the skb pointer. Execution then falls
through to err_dma_0, which unconditionally calls dev_kfree_skb_any(skb) a
second time on the same skb pointer.
* Re: [PATCH net] net: airoha: Fix TX scheduler queue mask loop upper bound
From: Wayen Yan @ 2026-06-17 8:55 UTC (permalink / raw)
To: lorenzo; +Cc: netdev, nbd, linux-arm-kernel, linux-mediatek
In-Reply-To: <178166704952.2212140.11002626760717132754@gmail.com>
On Tue, Jun 17, 2026, Lorenzo Bianconi wrote:
> Even if the current codebase supports just AIROHA_NUM_QOS_CHANNEL (4), the hw
> exposes 32 hw QoS channels (AIROHA_NUM_TX_RING). Here we are just clearing the
> configuration, so I guess the current implementation is correct.
Hi Lorenzo,
You are right that there is no functional impact, and I agree this
should not go to net. Let me explain the register layout I was worried
about, and you can decide whether it is worth a net-next cleanup or
should just be dropped.
The two macros are:
REG_QUEUE_CLOSE_CFG(_n) = 0x00a0 + ((_n) & 0xfc)
TXQ_DISABLE_CHAN_QUEUE_MASK(_n, _m) = BIT((_m) + (((_n) & 0x3) << 3))
REG_QUEUE_CLOSE_CFG() masks the channel with 0xfc, and the bit macro
folds the channel with & 0x3 (mod 4) shifted by 3. So one 32-bit
register holds 4 channels x 8 queues, 8 queue bits per channel:
channel 0 -> reg 0x00a0, bits 0..7
channel 1 -> reg 0x00a0, bits 8..15
channel 2 -> reg 0x00a0, bits 16..23
channel 3 -> reg 0x00a0, bits 24..31
channel 4 -> reg 0x00a4, bits 0..7
...
In airoha_qdma_set_chan_tx_sched() the loop variable 'i' is passed as
the *queue* argument _m, not as a channel:
for (i = 0; i < AIROHA_NUM_TX_RING; i++) // i = 0..31
airoha_qdma_clear(qdma, REG_QUEUE_CLOSE_CFG(channel),
TXQ_DISABLE_CHAN_QUEUE_MASK(channel, i));
Since each channel only has AIROHA_NUM_QOS_QUEUES (8) queues, the correct
logic is to clear the 8 queue bits belonging to 'channel'. With i running
up to 31 the BIT() shift instead walks past those 8 bits and into the bit
ranges of the other channels folded into the same register. For channel 0
the accumulated mask becomes 0xffffffff, i.e. it touches channels 1..3 as
well.
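The arithmetic can be checked with a small user-space sketch. The two
macros are copied from above; cleared_bits() is a hypothetical helper,
and using a 64-bit intermediate that is then truncated to 32 bits is an
assumption of this sketch, made only to keep the larger shifts well
defined in portable C.

```c
#include <assert.h>
#include <stdint.h>

/* User-space copies of the two macros quoted above. */
#define REG_QUEUE_CLOSE_CFG(n)	(0x00a0 + ((n) & 0xfc))
#define TXQ_DISABLE_CHAN_QUEUE_MASK(n, m) \
	(UINT64_C(1) << ((m) + (((n) & 0x3) << 3)))

/* Accumulate the bits the loop would clear for 'channel' when the queue
 * index runs from 0 to bound - 1, truncated to the 32-bit register width.
 */
static uint32_t cleared_bits(unsigned int channel, unsigned int bound)
{
	uint64_t mask = 0;
	unsigned int i;

	assert(bound <= 32);	/* keeps every shift below 64 */
	for (i = 0; i < bound; i++)
		mask |= TXQ_DISABLE_CHAN_QUEUE_MASK(channel, i);

	return (uint32_t)mask;
}
```

Here cleared_bits(0, 32) gives 0xffffffff, while cleared_bits(0, 8)
gives 0x000000ff and cleared_bits(1, 8) gives 0x0000ff00, i.e. exactly
the 8 queue bits of the requested channel.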
This is harmless today only because REG_QUEUE_CLOSE_CFG is written
exclusively here, via airoha_qdma_clear() (RMW clear), and the register
resets to 0 and is never set anywhere -- so clearing extra bits is a
no-op. Functionally the current code is fine, as you say.
The point is just the loop-bound semantics: 'i' is a per-channel queue
index, so the bound should be AIROHA_NUM_QOS_QUEUES (8), not
AIROHA_NUM_TX_RING (32). The two happen to be related (32 == 4 channels *
8 queues) but mean different things.
Since there is no functional change, feel free to drop this if you would
rather not carry a cosmetic patch. If you think the clarity is worth it I
can resend against net-next without the Fixes tag.
Thanks,
Wayen
* Re: [PATCH net v3] rtase: Workaround for TX hang caused by short UDP packets entering hardware PTP parsing
From: Simon Horman @ 2026-06-17 8:54 UTC (permalink / raw)
To: justinlai0215
Cc: Simon Horman, kuba, davem, edumazet, pabeni, andrew+netdev,
linux-kernel, netdev, stable, richardcochran, david.laight.linux,
aleksander.lobakin, pkshih, larry.chiu
In-Reply-To: <20260615131653.15730-1-justinlai0215@realtek.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
rtase: Workaround for TX hang caused by short UDP packets entering hardware PTP parsing
This patch adds a workaround for a hardware TX hang that occurs when
short UDP packets matching destination ports 319/320 enter a PTP parsing
path that reads beyond the packet end. The workaround pads affected
packets so the transport data reaches RTASE_MIN_PAD_LEN before
transmission.
> diff --git a/drivers/net/ethernet/realtek/rtase/rtase_main.c b/drivers/net/ethernet/realtek/rtase/rtase_main.c
> index 55105d34bc79..4c295a39c7a0 100644
> --- a/drivers/net/ethernet/realtek/rtase/rtase_main.c
> +++ b/drivers/net/ethernet/realtek/rtase/rtase_main.c
[ ... ]
> +static bool rtase_get_udp_offset(struct sk_buff *skb, u32 *udp_offset)
> +{
> + int no = skb_network_offset(skb);
> + struct ipv6hdr *i6h, _i6h;
> + struct iphdr *ih, _ih;
> +
> + switch (vlan_get_protocol(skb)) {
> + case htons(ETH_P_IP):
> + ih = skb_header_pointer(skb, no, sizeof(_ih), &_ih);
> + if (!ih)
> + return false;
> +
> + if (ih->ihl < 5)
> + return false;
> +
> + if (ih->protocol != IPPROTO_UDP)
> + return false;
> +
> + *udp_offset = no + ih->ihl * 4;
[Low]
Should ih->ihl also be checked against the actual data present in the
skb? With ihl up to 15 (60 bytes), if the skb only contains the basic
20-byte IP header, the computed udp_offset can exceed skb->len.
Downstream in rtase_skb_pad():
trans_data_len = skb->len - udp_offset;
trans_data_len is u32, so when udp_offset > skb->len this underflows to
a very large value, which trivially satisfies:
trans_data_len >= RTASE_MIN_PAD_LEN
and rtase_skb_pad() silently skips the padding. A malformed packet from
an AF_PACKET sender with CAP_NET_RAW could craft a mismatched ihl and
bypass the workaround. Would clamping or validating udp_offset against
skb->len before the subtraction be appropriate?
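A user-space sketch of the wrap-around and of one possible guard.
DEMO_MIN_PAD_LEN is a hypothetical threshold, not the driver's
RTASE_MIN_PAD_LEN value, and both helper names are made up. What the
driver should do with a packet that fails the guard (pad it or drop it)
is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MIN_PAD_LEN 47u	/* hypothetical threshold, not the driver's value */

/* Mirrors the second half of the quoted check: true means "long enough,
 * no padding needed".
 */
static bool skip_pad_unchecked(uint32_t skb_len, uint32_t udp_offset)
{
	uint32_t trans_data_len = skb_len - udp_offset;	/* wraps if offset > len */

	return trans_data_len >= DEMO_MIN_PAD_LEN;
}

/* One possible guard: validate the offset before subtracting. */
static bool skip_pad_checked(uint32_t skb_len, uint32_t udp_offset)
{
	if (udp_offset > skb_len)
		return false;	/* malformed: never report it as long enough */

	return skb_len - udp_offset >= DEMO_MIN_PAD_LEN;
}
```

For a 34-byte frame (14-byte Ethernet header plus a 20-byte IP header)
that claims ihl = 15, udp_offset is 14 + 60 = 74. The unchecked variant
then reports the packet as long enough, while the checked variant does
not.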
> +
> + return true;
> + case htons(ETH_P_IPV6):
> + i6h = skb_header_pointer(skb, no, sizeof(_i6h), &_i6h);
> + if (!i6h)
> + return false;
> +
> + if (i6h->nexthdr != IPPROTO_UDP)
> + return false;
> +
> + *udp_offset = no + sizeof(*i6h);
> +
> + return true;
> + default:
> + return false;
> + }
> +}
> +
> +static bool rtase_skb_pad(struct sk_buff *skb)
> +{
> + __be16 *dest, _dest;
> + u32 trans_data_len;
> + u32 udp_offset;
> + u16 dest_port;
> + u32 pad_len;
> +
> + if (!rtase_get_udp_offset(skb, &udp_offset))
> + return true;
> +
> + trans_data_len = skb->len - udp_offset;
> + if (trans_data_len < offsetof(struct udphdr, len) ||
> + trans_data_len >= RTASE_MIN_PAD_LEN)
> + return true;
[Low]
When trans_data_len is less than offsetof(struct udphdr, len) (i.e., the
transport region is under 4 bytes), this returns true and skips the
padding entirely.
The commit message states the hardware "may access data beyond the end
of the packet" during PTP parsing. If the hardware also reads out of
bounds while classifying the packet, an IPv4 last fragment with 1-3
trailing bytes could still be misclassified as PTP based on whatever
garbage memory the hardware happens to read, and the workaround would
be bypassed.
The implicit assumption here seems to be "if the driver cannot read the
dest port from the skb, the hardware cannot classify it either", which
appears to contradict the premise that the hardware reads beyond the
packet boundary. Should these very short transport-data cases also be
padded to be safe?
> +
> + dest = skb_header_pointer(skb,
> + udp_offset + offsetof(struct udphdr, dest),
> + sizeof(_dest), &_dest);
[ ... ]
* RE: [PATCH net v2] tipc: fix use-after-free of the discoverer in tipc_disc_rcv()
From: Tung Quang Nguyen @ 2026-06-17 8:47 UTC (permalink / raw)
To: Weiming Shi
Cc: jmaloy@redhat.com, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, davem@davemloft.net,
xmei5@asu.edu, netdev@vger.kernel.org,
tipc-discussion@lists.sourceforge.net,
linux-kernel@vger.kernel.org
In-Reply-To: <20260616122246.3136462-2-bestswngs@gmail.com>
>Subject: [PATCH net v2] tipc: fix use-after-free of the discoverer in
>tipc_disc_rcv()
>
>bearer_disable() frees b->disc with tipc_disc_delete()'s plain kfree(), but
>tipc_disc_rcv() still dereferences b->disc in RX softirq under
>rcu_read_lock() (tipc_udp_recv -> tipc_rcv -> tipc_disc_rcv).
>
>L2 bearers are safe thanks to the synchronize_net() in tipc_disable_l2_media(),
>but the UDP bearer defers that call to the
>cleanup_bearer() workqueue, so the discoverer is freed with no grace
>period:
>
> BUG: KASAN: slab-use-after-free in tipc_disc_rcv (net/tipc/discover.c:149)
> Read of size 8 at addr ffff88802348b728 by task poc_tipc/184
>
> <IRQ>
> tipc_disc_rcv (net/tipc/discover.c:149)
> tipc_rcv (net/tipc/node.c:2126)
> tipc_udp_recv (net/tipc/udp_media.c:391)
> udp_rcv (net/ipv4/udp.c:2643)
> ip_local_deliver_finish (net/ipv4/ip_input.c:241)
> </IRQ>
>
> Freed by task 181:
> kfree (mm/slub.c:6565)
> bearer_disable (net/tipc/bearer.c:418)
> tipc_nl_bearer_disable (net/tipc/bearer.c:1001)
>
>The bearer is freed with kfree_rcu(); free the discoverer the same way.
>Add an rcu_head to struct tipc_discoverer and free it and its skb from an RCU
>callback.
>
>Because the RCU callback (tipc_disc_free_rcu) lives in module text, a
>call_rcu() that is still pending when the tipc module is unloaded would invoke a
>freed function. Add an rcu_barrier() to tipc_exit() after the bearer subsystem
>has been torn down, so all pending discoverer callbacks have run before the
>module text goes away.
>
>Reachable from an unprivileged user namespace: the TIPCv2 genl family is
>netnsok and its bearer commands have no GENL_ADMIN_PERM. Needs
>CONFIG_TIPC and CONFIG_TIPC_MEDIA_UDP.
>
>Fixes: 25b0b9c4e835 ("tipc: handle collisions of 32-bit node address hash values")
>Reported-by: Xiang Mei <xmei5@asu.edu>
>Assisted-by: Claude:claude-opus-4-8
>Signed-off-by: Weiming Shi <bestswngs@gmail.com>
>---
>v2:
> - split the over-80-column container_of() line (Tung Quang Nguyen)
> - add rcu_barrier() to tipc_exit() so a pending call_rcu() cannot fire
> into freed module text after rmmod (Eric Dumazet)
>
> net/tipc/core.c | 3 +++
> net/tipc/discover.c | 14 ++++++++++++--
> 2 files changed, 15 insertions(+), 2 deletions(-)
>
>diff --git a/net/tipc/core.c b/net/tipc/core.c
>index 434e70eabe08..747328e58d30 100644
>--- a/net/tipc/core.c
>+++ b/net/tipc/core.c
>@@ -218,6 +218,9 @@ static void __exit tipc_exit(void)
> unregister_pernet_device(&tipc_net_ops);
> tipc_unregister_sysctl();
>
>+ /* Wait for tipc_disc_free_rcu() callbacks queued from module text. */
Please change the above comment to: /* TODO: Wait for all timers that called call_rcu() to finish before calling rcu_barrier() */
Note that call_rcu() is used in both discover.c and node.c, so the TODO comment will help us add more checking code later in a separate patch.
>+ rcu_barrier();
>+
> pr_info("Deactivated\n");
> }
>
>diff --git a/net/tipc/discover.c b/net/tipc/discover.c
>index 3e54d2df5683..696b7a8ed54d 100644
>--- a/net/tipc/discover.c
>+++ b/net/tipc/discover.c
>@@ -58,6 +58,7 @@
> * @skb: request message to be (repeatedly) sent
> * @timer: timer governing period between requests
> * @timer_intv: current interval between requests (in ms)
>+ * @rcu: RCU head for deferred freeing
> */
> struct tipc_discoverer {
> u32 bearer_id;
>@@ -69,6 +70,7 @@ struct tipc_discoverer {
> struct sk_buff *skb;
> struct timer_list timer;
> unsigned long timer_intv;
>+ struct rcu_head rcu;
> };
>
> /**
>@@ -382,6 +384,15 @@ int tipc_disc_create(struct net *net, struct tipc_bearer *b,
> return 0;
> }
>
>+static void tipc_disc_free_rcu(struct rcu_head *rp)
>+{
>+ struct tipc_discoverer *d =
>+ container_of(rp, struct tipc_discoverer, rcu);
>+
>+ kfree_skb(d->skb);
>+ kfree(d);
>+}
>+
> /**
> * tipc_disc_delete - destroy object sending periodic link setup requests
> * @d: ptr to link dest structure
>@@ -389,8 +400,7 @@ int tipc_disc_create(struct net *net, struct tipc_bearer *b,
> void tipc_disc_delete(struct tipc_discoverer *d)
> {
> timer_shutdown_sync(&d->timer);
>- kfree_skb(d->skb);
>- kfree(d);
>+ call_rcu(&d->rcu, tipc_disc_free_rcu);
> }
>
> /**
>--
>2.43.0
* Re: [PATCH net] ice: eswitch: fix use-after-free of metadata_dst in repr release
From: Simon Horman @ 2026-06-17 8:47 UTC (permalink / raw)
To: Doruk Tan Ozturk
Cc: anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev, davem,
edumazet, kuba, pabeni, piotr.raczynski, michal.swiatkowski,
wojciech.drewek, intel-wired-lan, netdev, linux-kernel, stable
In-Reply-To: <20260615140532.52676-1-doruk@0sec.ai>
On Mon, Jun 15, 2026 at 04:05:32PM +0200, Doruk Tan Ozturk wrote:
> ice_eswitch_release_repr() frees the port representor metadata_dst via
> metadata_dst_free(), which directly kfree()s the object and ignores the
> dst_entry refcount. The eswitch slow-path TX routine
> ice_eswitch_port_start_xmit() takes a reference on this dst with
> dst_hold() and attaches it to the skb via skb_dst_set(). If such an skb
> is still in flight (e.g. queued in a qdisc) when the representor is torn
> down, the metadata_dst is freed while the skb still points at it. When
> the skb is later freed, dst_release() operates on already-freed memory.
>
> Replace metadata_dst_free() with dst_release() so the metadata_dst is
> freed only after the last reference is dropped. The dst subsystem frees
> metadata_dst objects from dst_destroy() once the refcount reaches zero
> (DST_METADATA is set by metadata_dst_alloc()).
>
> Same class of bug and fix as commit c32b26aaa2f9 ("netfilter:
> nft_tunnel: fix use-after-free on object destroy").
I think that the commit cited above moves the code in question around
but did not introduce the call to dst_release. And I think that this
bug goes back to when switchdev support was added.
I would suggest:
Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment")
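For reference, the lifetime rule the fix depends on can be sketched in
user space as below. This is a simplified, single-threaded model with a
plain int counter; struct demo_dst and its helpers are hypothetical
names, and the real dst code uses atomic reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a dst entry. */
struct demo_dst {
	int refcnt;
};

static int demo_destroyed;	/* counts objects that have been freed */

static struct demo_dst *demo_dst_alloc(void)
{
	struct demo_dst *dst = malloc(sizeof(*dst));

	if (dst)
		dst->refcnt = 1;	/* the owner's reference */
	return dst;
}

/* An in-flight user (an skb in the real code) takes its own reference. */
static void demo_dst_hold(struct demo_dst *dst)
{
	dst->refcnt++;
}

/* Drop one reference; free only when the last holder lets go. */
static void demo_dst_release(struct demo_dst *dst)
{
	assert(dst->refcnt > 0);
	if (--dst->refcnt == 0) {
		demo_destroyed++;
		free(dst);
	}
}
```

If the owner calls demo_dst_release() while an in-flight holder still
has a reference, the object survives until that holder releases it. A
direct free() at teardown, which is what the commit message says
metadata_dst_free() amounts to, would skip that check.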
> Cc: stable@vger.kernel.org
> Signed-off-by: Doruk Tan Ozturk <doruk@0sec.ai>
Otherwise, this looks good to me.
Reviewed-by: Simon Horman <horms@kernel.org>
* [PATCH iwl-next v1] ixgbe: Implement PCI reset handler
From: Sergey Temerkhanov @ 2026-06-17 8:43 UTC (permalink / raw)
To: intel-wired-lan; +Cc: netdev
Implement PCI device reset handler to allow the network device to
get re-initialized and function after a PCI-level reset.
Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe.h | 1 +
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 72 +++++++++++++++++++
2 files changed, 73 insertions(+)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 594ccb28da20..c4b0c5bb89c6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -912,6 +912,7 @@ enum ixgbe_state_t {
__IXGBE_PTP_TX_IN_PROGRESS,
__IXGBE_RESET_REQUESTED,
__IXGBE_PHY_INIT_COMPLETE,
+ __IXGBE_PCIE_RESET_IN_PROGRESS,
};
struct ixgbe_cb {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 2ac274c73d61..a61ee5fff7be 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -12352,6 +12352,76 @@ static pci_ers_result_t ixgbe_io_slot_reset(struct pci_dev *pdev)
return result;
}
+#define IXGBE_PCIE_RESET_RETRIES 1000
+
+/**
+ * ixgbe_reset_prep - called before the pci bus is reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Prepare the card for a reset, preventing the service task from running.
+ */
+static void ixgbe_reset_prep(struct pci_dev *pdev)
+{
+ struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
+ unsigned int timeout = IXGBE_PCIE_RESET_RETRIES;
+
+ if (!adapter)
+ return;
+
+ /* Prevent the service task from being requeued in the timer callback
+ * while we're resetting.
+ */
+ if (test_bit(__IXGBE_SERVICE_INITED, &adapter->state)) {
+ timer_delete_sync(&adapter->service_timer);
+ /* Prevent the service task from running while we're resetting. */
+ cancel_work_sync(&adapter->service_task);
+ }
+
+ pci_clear_master(pdev);
+
+ while (test_and_set_bit(__IXGBE_RESETTING, &adapter->state) && --timeout)
+ usleep_range(1000, 2000);
+
+ if (!timeout) {
+ e_err(drv, "Timed out waiting for __IXGBE_RESETTING to be released. Reset is needed\n");
+ pci_set_master(pdev);
+ return;
+ }
+
+ set_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state);
+ smp_mb__after_atomic();
+}
+
+/**
+ * ixgbe_reset_done - called after the pci bus has been reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Allow the service task to run and schedule re-initialization.
+ */
+static void ixgbe_reset_done(struct pci_dev *pdev)
+{
+ struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
+
+ smp_mb__before_atomic();
+ if (!test_and_clear_bit(__IXGBE_PCIE_RESET_IN_PROGRESS, &adapter->state)) {
+ e_err(drv, "Reset done called without PCIe reset in progress\n");
+ return;
+ }
+
+ /* Allow the service task to run */
+ if (!test_bit(__IXGBE_REMOVING, &adapter->state)) {
+ clear_bit(__IXGBE_RESETTING, &adapter->state);
+ smp_mb__after_atomic();
+ }
+
+ /* Schedule re-initialization */
+ if (!test_bit(__IXGBE_DOWN, &adapter->state)) {
+ set_bit(__IXGBE_RESET_REQUESTED, &adapter->state);
+ if (test_bit(__IXGBE_SERVICE_INITED, &adapter->state))
+ mod_timer(&adapter->service_timer, jiffies + 1);
+ }
+}
+
/**
* ixgbe_io_resume - called when traffic can start flowing again.
* @pdev: Pointer to PCI device
@@ -12384,6 +12454,8 @@ static const struct pci_error_handlers ixgbe_err_handler = {
.error_detected = ixgbe_io_error_detected,
.slot_reset = ixgbe_io_slot_reset,
.resume = ixgbe_io_resume,
+ .reset_prepare = ixgbe_reset_prep,
+ .reset_done = ixgbe_reset_done,
};
static DEFINE_SIMPLE_DEV_PM_OPS(ixgbe_pm_ops, ixgbe_suspend, ixgbe_resume);
base-commit: c50bfa9768ff3a5163746c6362a8a910a0b4dca0
--
2.53.0
* Re: [GIT PULL] Networking for 7.2
From: pr-tracker-bot @ 2026-06-17 8:41 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: torvalds, kuba, davem, netdev, linux-kernel, pabeni
In-Reply-To: <20260617000705.931602-1-kuba@kernel.org>
The pull request you sent on Tue, 16 Jun 2026 17:07:05 -0700:
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git tags/net-next-7.2
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b85966adbf5de0668a815c6e3527f87e0c387fb4
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [PATCH v18 net-next 01/11] net/nebula-matrix: add minimum nbl build framework
From: Uwe Kleine-König @ 2026-06-17 8:40 UTC (permalink / raw)
To: illusion.wang
Cc: dimon.zhao, alvin.wang, sam.chen, netdev, andrew+netdev, corbet,
kuba, horms, linux-doc, pabeni, vadim.fedorenko, lukas.bulwahn,
edumazet, enelsonmoore, skhan, hkallweit1, open list
In-Reply-To: <20260611044916.2383-2-illusion.wang@nebula-matrix.com>
On Thu, Jun 11, 2026 at 12:49:00PM +0800, illusion.wang wrote:
> +static int nbl_probe(struct pci_dev *pdev,
> + const struct pci_device_id *id)
> +{
> + return 0;
> +}
> +
> +static void nbl_remove(struct pci_dev *pdev)
> +{
> +}
> [...]
> +static const struct pci_device_id nbl_id_table[] = {
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110_LX),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110_BASE_T),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110_LX_BASE_T),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110_LX_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110_BASE_T_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18110_LX_BASE_T_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000_LX),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000_BASE_T),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000_LX_BASE_T),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000_LX_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000_BASE_T_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + { PCI_DEVICE(NBL_VENDOR_ID, NBL_DEVICE_ID_M18000_LX_BASE_T_OCP),
> + .driver_data = BIT(NBL_CAP_HAS_NET_BIT) | BIT(NBL_CAP_IS_NIC_BIT) |
> + BIT(NBL_CAP_IS_LEONIS_BIT) },
> + /* required as sentinel */
> + {
> + 0,
Please drop this zero. The most usual style is `{ }`.
> + }
> +};
> +MODULE_DEVICE_TABLE(pci, nbl_id_table);
> +
> +static struct pci_driver nbl_driver = {
> + .name = NBL_DRIVER_NAME,
> + .id_table = nbl_id_table,
> + .probe = nbl_probe,
> + .remove = nbl_remove,
> +};
The pci bus probe function has (pci_device_probe() ->
__pci_device_probe()):
int error = 0;
if (drv->probe) {
...
}
return error;
So given that the probe function does nothing apart from returning zero,
you can just drop .probe(). (There is an additional check against
.id_table, but I'm pretty sure that isn't relevant because
pci_bus_match() already makes sure that there is a match.) The same is
true for .remove().
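To illustrate the point, here is a minimal userspace sketch of a bus core that treats both callbacks as optional. It is a simplified model and not the PCI core code; every demo_* name is hypothetical, and the .id_table matching mentioned above is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device: only tracks whether a driver is bound. */
struct demo_dev {
	int bound;
};

/* Hypothetical driver with optional callbacks, like a bus driver struct. */
struct demo_driver {
	const char *name;
	int (*probe)(struct demo_dev *dev);   /* may be NULL */
	void (*remove)(struct demo_dev *dev); /* may be NULL */
};

/* Bus-level probe: a missing callback counts as success. */
static int demo_bus_probe(const struct demo_driver *drv, struct demo_dev *dev)
{
	int error = 0;

	if (drv->probe)
		error = drv->probe(dev);
	if (!error)
		dev->bound = 1;
	return error;
}

/* Bus-level remove: a missing callback is simply skipped. */
static void demo_bus_remove(const struct demo_driver *drv, struct demo_dev *dev)
{
	if (drv->remove)
		drv->remove(dev);
	dev->bound = 0;
}

/* An empty stub, equivalent in effect to having no callback at all. */
static int demo_empty_probe(struct demo_dev *dev)
{
	(void)dev;
	return 0;
}

static const struct demo_driver demo_with_stub = {
	.name = "with-stub",
	.probe = demo_empty_probe,
};

static const struct demo_driver demo_without = {
	.name = "without-callbacks",
};

/* Both drivers leave the device in the same state after probing. */
static void demo_check(void)
{
	struct demo_dev a = { 0 }, b = { 0 };

	assert(demo_bus_probe(&demo_with_stub, &a) == 0);
	assert(demo_bus_probe(&demo_without, &b) == 0);
	assert(a.bound == b.bound);
}
```

In this model, the driver with an empty stub and the driver with no callbacks at all behave identically, which is why the stubs add nothing.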
Best regards
Uwe
* [PATCH net] net: rnpgbe: fix mailbox endianness handling
From: Dong Yibo @ 2026-06-17 8:35 UTC (permalink / raw)
To: andrew+netdev, davem, edumazet, kuba, pabeni, vadim.fedorenko
Cc: netdev, linux-kernel, dong100, yaojun
Mailbox data is exchanged through 32-bit MMIO accesses but the
mailbox payload is defined using little-endian FW structures with
__le16 and __le32 fields.
The mailbox read/write helpers previously operated on raw u32
buffers without performing endian conversion. On big-endian
systems this causes mailbox payload fields to be byte-swapped in
memory, resulting in corrupted FW command and reply structures.
Convert mailbox data between CPU-endian MMIO values and the
little-endian mailbox wire format using cpu_to_le32() on reads and
le32_to_cpu() on writes.
Also switch the helper interfaces to use void */const void * since
the mailbox transport layer operates on opaque payload buffers
rather than native-endian u32 arrays.
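As a rough illustration of the conversion described above, the following userspace sketch models the mailbox as an array of CPU-order 32-bit values and stores the payload as little-endian bytes. It is a model under stated assumptions, not the driver code: the demo_* helpers are hypothetical and stand in for cpu_to_le32()/le32_to_cpu() by building the bytes explicitly, so the result is the same on little- and big-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a CPU-order value as four little-endian bytes. */
static void demo_put_le32(uint8_t *dst, uint32_t cpu_val)
{
	dst[0] = (uint8_t)(cpu_val & 0xff);
	dst[1] = (uint8_t)((cpu_val >> 8) & 0xff);
	dst[2] = (uint8_t)((cpu_val >> 16) & 0xff);
	dst[3] = (uint8_t)((cpu_val >> 24) & 0xff);
}

/* Load four little-endian bytes into a CPU-order value. */
static uint32_t demo_get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] |
	       ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) |
	       ((uint32_t)src[3] << 24);
}

/* Read path: register values (CPU order) become a little-endian payload. */
static void demo_mbx_read(const uint32_t *regs, void *msg, size_t size)
{
	uint8_t *out = msg;

	assert(size % 4 == 0);
	for (size_t i = 0; i < size / 4; i++)
		demo_put_le32(out + 4 * i, regs[i]);
}

/* Write path: a little-endian payload becomes register values (CPU order). */
static void demo_mbx_write(uint32_t *regs, const void *msg, size_t size)
{
	const uint8_t *in = msg;

	assert(size % 4 == 0);
	for (size_t i = 0; i < size / 4; i++)
		regs[i] = demo_get_le32(in + 4 * i);
}
```

With this layout, a 16-bit little-endian field at payload offset 0 always occupies the low half of the first register value, regardless of host byte order.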
Fixes: 4543534c3ef5 ("net: rnpgbe: Add basic mbx ops support")
Signed-off-by: Dong Yibo <dong100@mucse.com>
---
drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.c | 16 ++++++++++------
drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.h | 5 +++--
.../net/ethernet/mucse/rnpgbe/rnpgbe_mbx_fw.c | 7 +++----
3 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.c b/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.c
index de5e29230b3c..0fccfc49ffc7 100644
--- a/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.c
+++ b/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.c
@@ -166,10 +166,12 @@ static void mucse_mbx_inc_pf_ack(struct mucse_hw *hw)
*
* Return: 0 on success, negative errno on failure
**/
-static int mucse_read_mbx_pf(struct mucse_hw *hw, u32 *msg, u16 size)
+static int mucse_read_mbx_pf(struct mucse_hw *hw, void *msg, u16 size)
{
const int size_in_words = size / sizeof(u32);
struct mucse_mbx_info *mbx = &hw->mbx;
+ int off = MUCSE_MBX_FWPF_SHM;
+ __le32 *msg_le32 = msg;
int err;
err = mucse_obtain_mbx_lock_pf(hw);
@@ -177,7 +179,7 @@ static int mucse_read_mbx_pf(struct mucse_hw *hw, u32 *msg, u16 size)
return err;
for (int i = 0; i < size_in_words; i++)
- msg[i] = mbx_data_rd32(mbx, MUCSE_MBX_FWPF_SHM + 4 * i);
+ msg_le32[i] = cpu_to_le32(mbx_data_rd32(mbx, off + 4 * i));
/* Hw needs write data_reg at last */
mbx_data_wr32(mbx, MUCSE_MBX_FWPF_SHM, 0);
/* flush reqs as we have read this request data */
@@ -236,7 +238,7 @@ static int mucse_poll_for_msg(struct mucse_hw *hw)
* Return: 0 if it successfully received a message notification and
* copied it into the receive buffer, negative errno on failure
**/
-int mucse_poll_and_read_mbx(struct mucse_hw *hw, u32 *msg, u16 size)
+int mucse_poll_and_read_mbx(struct mucse_hw *hw, void *msg, u16 size)
{
int err;
@@ -290,10 +292,11 @@ static void mucse_mbx_inc_pf_req(struct mucse_hw *hw)
* Return: 0 if it successfully copied message into the buffer,
* negative errno on failure
**/
-static int mucse_write_mbx_pf(struct mucse_hw *hw, u32 *msg, u16 size)
+static int mucse_write_mbx_pf(struct mucse_hw *hw, const void *msg, u16 size)
{
const int size_in_words = size / sizeof(u32);
struct mucse_mbx_info *mbx = &hw->mbx;
+ const __le32 *msg_le32 = msg;
int err;
err = mucse_obtain_mbx_lock_pf(hw);
@@ -301,7 +304,8 @@ static int mucse_write_mbx_pf(struct mucse_hw *hw, u32 *msg, u16 size)
return err;
for (int i = 0; i < size_in_words; i++)
- mbx_data_wr32(mbx, MUCSE_MBX_FWPF_SHM + i * 4, msg[i]);
+ mbx_data_wr32(mbx, MUCSE_MBX_FWPF_SHM + i * 4,
+ le32_to_cpu(msg_le32[i]));
/* flush acks as we are overwriting the message buffer */
hw->mbx.fw_ack = mucse_mbx_get_fwack(mbx);
@@ -360,7 +364,7 @@ static int mucse_poll_for_ack(struct mucse_hw *hw)
* Return: 0 if it successfully copied message into the buffer and
* received an ack to that message within delay * timeout_cnt period
**/
-int mucse_write_and_wait_ack_mbx(struct mucse_hw *hw, u32 *msg, u16 size)
+int mucse_write_and_wait_ack_mbx(struct mucse_hw *hw, const void *msg, u16 size)
{
int err;
diff --git a/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.h b/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.h
index e6fcc8d1d3ca..25bfc97c24c0 100644
--- a/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.h
+++ b/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx.h
@@ -14,7 +14,8 @@
#define MUCSE_MBX_REQ BIT(0) /* Request a req to mailbox */
#define MUCSE_MBX_PFU BIT(3) /* PF owns the mailbox buffer */
-int mucse_write_and_wait_ack_mbx(struct mucse_hw *hw, u32 *msg, u16 size);
+int mucse_write_and_wait_ack_mbx(struct mucse_hw *hw,
+ const void *msg, u16 size);
void mucse_init_mbx_params_pf(struct mucse_hw *hw);
-int mucse_poll_and_read_mbx(struct mucse_hw *hw, u32 *msg, u16 size);
+int mucse_poll_and_read_mbx(struct mucse_hw *hw, void *msg, u16 size);
#endif /* _RNPGBE_MBX_H */
diff --git a/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx_fw.c b/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx_fw.c
index 8c8bd5e8e1db..2ac97915a098 100644
--- a/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx_fw.c
+++ b/drivers/net/ethernet/mucse/rnpgbe/rnpgbe_mbx_fw.c
@@ -28,12 +28,11 @@ static int mucse_fw_send_cmd_wait_resp(struct mucse_hw *hw,
int err;
mutex_lock(&hw->mbx.lock);
- err = mucse_write_and_wait_ack_mbx(hw, (u32 *)req, len);
+ err = mucse_write_and_wait_ack_mbx(hw, req, len);
if (err)
goto out;
do {
- err = mucse_poll_and_read_mbx(hw, (u32 *)reply,
- sizeof(*reply));
+ err = mucse_poll_and_read_mbx(hw, reply, sizeof(*reply));
if (err)
goto out;
/* mucse_write_and_wait_ack_mbx return 0 means fw has
@@ -125,7 +124,7 @@ int mucse_mbx_powerup(struct mucse_hw *hw, bool is_powerup)
len = le16_to_cpu(req.datalen);
mutex_lock(&hw->mbx.lock);
- err = mucse_write_and_wait_ack_mbx(hw, (u32 *)&req, len);
+ err = mucse_write_and_wait_ack_mbx(hw, &req, len);
mutex_unlock(&hw->mbx.lock);
return err;
--
2.25.1
* Re: [PATCH net] ipv6: ndisc: fix NULL deref in accept_untracked_na()
From: Jiayuan Chen @ 2026-06-17 8:32 UTC (permalink / raw)
To: Weiming Shi, David S . Miller, David Ahern, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, netdev, linux-kernel, Xiang Mei
In-Reply-To: <20260617065512.2529757-2-bestswngs@gmail.com>
On 6/17/26 2:55 PM, Weiming Shi wrote:
> accept_untracked_na() re-fetches the inet6_dev with __in6_dev_get(dev)
> and dereferences idev->cnf.accept_untracked_na without a NULL check,
Does ipv6_rpl_srh_rcv() have the same problem?
* Re: [REGRESSION 6.16] r8169 RTL8168h/8111h fails to probe — "Unable to change power state from D3cold to D0" — bisected to 4d4c10f763d7
From: Thorsten Leemhuis @ 2026-06-17 8:32 UTC (permalink / raw)
To: Josh Perry, mario.limonciello, bhelgaas
Cc: hkallweit1, nic_swsd, rafael, linux-pci, netdev, regressions
In-Reply-To: <d4aaa5e8-7366-461c-94b1-ccf3631c8bf9@6bit.com>
On 6/12/26 03:07, Josh Perry wrote:
> #regzbot introduced: 4d4c10f763d7
>
> Since v6.16 one of two onboard RTL8168h/8111h NICs on this board fails
> to probe on boot; the device drops to D3cold and the driver can't bring
> it back:
FWIW, that commit is 4d4c10f763d780 ("PCI: Explicitly put devices into
D0 when initializing") [v6.16-rc1] from Mario, who is already CCed, but
it looks like he might be on holiday or something, given his inactivity on the
lists in recent days. So it might take a few days before this moves on.
Josh, this is not my area of expertise, but there are two things I guess
might be helpful:
* retry with 7.1
* upload "dmesg" and "sudo lspci -vvv" output from working and broken
kernels somewhere (like bugzilla.kernel.org).
Ciao, Thorsten
> r8169 0000:02:00.0 eth0: RTL8168h/8111h, 00:2b:67:48:40:01, XID 541,
> IRQ 137
> r8169 0000:04:00.0: Unable to change power state from D3cold to D0,
> device inaccessible
> r8169 0000:04:00.0: Mem-Wr-Inval unavailable
> r8169 0000:04:00.0: error -EIO: PCI read failed
> r8169 0000:04:00.0: probe with driver r8169 failed with error -5
>
> The board has two identical RTL8168h NICs (both XID 541): 0000:02:00.0
> and 0000:04:00.0. Only 04:00.0 fails — its sibling 02:00.0, on a
> different root port, probes and works normally on the very same kernel
> and boot. The failing NIC then does not appear (no enp4s0), taking the
> machine's WAN offline. This strongly suggests the problem is port/
> topology-specific rather than device- or driver-specific: the upstream
> port behind 04:00.0 is placed in D3cold and the endpoint cannot be
> resumed to D0.
>
> Hardware: RTL8168h/8111h, XID 541, PCI 04:00.0 (onboard 1GbE).
> Platform: Lenovo ThinkCentre M90n-1 (11AHS0B200), BIOS M2AKT49A
> (2026-03-25, latest available). Firmware is current, so this is not a
> platform-firmware issue.
>
> Bisection: v6.15 good, v6.16 bad (verified by booting both). I then
> reverted 4d4c10f763d7 ("PCI: Explicitly put devices into D0 when
> initializing") together with its follow-up 907a7a2e5bf4 ("PCI/PM: Set up
> runtime PM even for devices without PCI PM") on top of 6.16.7: the NIC
> probes and links at 1Gbps/Full normally, with no workaround:
>
> r8169 0000:04:00.0 eth1: RTL8168h/8111h, 00:2b:67:48:40:02, XID 541,
> IRQ 138
> r8169 0000:04:00.0 enp4s0: Link is Up - 1Gbps/Full - flow control rx/tx
>
> Workaround: booting an unmodified v6.16+ kernel with pcie_port_pm=off
> also restores the NIC, which is consistent with the upstream port being
> placed in D3cold and the device failing to resume to D0 after the
> explicit-D0 init change.
>
> The follow-up 907a7a2e5bf4 does not fix this resume case: v6.18.33 is
> still affected (retested today on current firmware).
>
> Happy to test patches or provide full dmesg / lspci.
>
* Re: [PATCH net-next v6 1/2] dinghai: add ZTE network driver support
From: Uwe Kleine-König @ 2026-06-17 8:30 UTC (permalink / raw)
To: han.junyang
Cc: andrew+netdev, davem, edumazet, kuba, pabeni, horms, linux-kernel,
netdev, ran.ming, han.chengfei, zhang.yanze
In-Reply-To: <20260616213057452I2KLm3mVgWYl_SUTy_YYS@zte.com.cn>
Hello,
On Tue, Jun 16, 2026 at 09:30:57PM +0800, han.junyang@zte.com.cn wrote:
> +static const struct pci_device_id dh_pf_pci_table[] = {
> + { PCI_DEVICE(ZXDH_PF_VENDOR_ID, ZXDH_PF_DEVICE_ID), 0 },
> + { PCI_DEVICE(ZXDH_PF_VENDOR_ID, ZXDH_VF_DEVICE_ID), 0 },
> + { 0, }
> +};
Please make this:
+static const struct pci_device_id dh_pf_pci_table[] = {
+ { PCI_DEVICE(ZXDH_PF_VENDOR_ID, ZXDH_PF_DEVICE_ID) },
+ { PCI_DEVICE(ZXDH_PF_VENDOR_ID, ZXDH_VF_DEVICE_ID) },
+ { }
+};
(because the assignment to .driver_data is superfluous and initializing
it using a list expression gets in the way of one of my patch quests).
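The difference between the two styles can be sketched in plain C. The struct and macro below are hypothetical stand-ins, not the kernel's struct pci_device_id or PCI_DEVICE(); the sketch only shows the C rule that a positional value following designated initializers lands on whichever member is declared next, while omitted members are zero-initialized.

```c
#include <assert.h>

/* Hypothetical device ID entry; member order is made up for the demo. */
struct demo_id {
	unsigned int vendor;
	unsigned int device;
	unsigned int class_code;
	unsigned long driver_data;
};

/* Hypothetical macro that expands to designated initializers only. */
#define DEMO_DEVICE(vend, dev) .vendor = (vend), .device = (dev)

static const struct demo_id demo_table[] = {
	/* Positional value: goes to the member declared after .device. */
	{ DEMO_DEVICE(0x1234, 0x0001), 7 },
	/* Nothing extra: all members not named are zero-initialized. */
	{ DEMO_DEVICE(0x1234, 0x0002) },
	/* Designated value: the target is explicit and layout-independent. */
	{ DEMO_DEVICE(0x1234, 0x0003), .driver_data = 7 },
};

/* Verify where each initializer value ended up. */
static void demo_check(void)
{
	assert(demo_table[0].class_code == 7);
	assert(demo_table[0].driver_data == 0);
	assert(demo_table[1].class_code == 0);
	assert(demo_table[1].driver_data == 0);
	assert(demo_table[2].class_code == 0);
	assert(demo_table[2].driver_data == 7);
}
```

The entry without a trailing value is the form requested above; if a value is ever needed, naming the member keeps the entry independent of the struct layout.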
Best regards
Uwe
* [PATCH 5.10.y] net: add missing ns_capable check for peer netns
From: Maximilian Heyne @ 2026-06-17 8:27 UTC (permalink / raw)
To: stable
Cc: Maximilian Heyne, Wolfgang Grandegger, Marc Kleine-Budde,
David S. Miller, Jakub Kicinski, Eric W. Biederman, Eric Dumazet,
linux-can, netdev, linux-kernel
The upstream commit 7b735ef81286 ("rtnetlink: add missing
netlink_ns_capable() check for peer netns") doesn't apply on older
stable kernels due to refactoring. Therefore, this patch is an attempt
to implement the same capability check just directly in the respective
interface types.
Approximate the netlink_ns_capable check with an ns_capable check. As
the newlink operation is synchronous this should result in the same
behavior.
Without this commit, for example, the following command creating a veth
device in network namespace of pid 1 succeeds:
$ unshare -U -r -n -- bash -c '
ip link add veth0 type veth peer name foobar netns 1
sleep 60' &
$ ip link show foobar
13: foobar@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 96:09:69:92:92:cc brd ff:ff:ff:ff:ff:ff link-netnsid 1
With this patch, it's returning -EPERM.
This fixes CVE-2026-31692
Cc: stable@vger.kernel.org
Fixes: 81adee47dfb6 ("net: Support specifying the network namespace upon device creation.")
Assisted-by: Kiro:claude
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---
drivers/net/can/vxcan.c | 5 +++++
drivers/net/veth.c | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index 1bfede407270d..05fcbfacc3433 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -198,6 +198,11 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
if (IS_ERR(peer_net))
return PTR_ERR(peer_net);
+ if (!ns_capable(peer_net->user_ns, CAP_NET_ADMIN)) {
+ put_net(peer_net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(peer_net, ifname, name_assign_type,
&vxcan_link_ops, tbp, extack);
if (IS_ERR(peer)) {
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 743716ebebdb9..bda3add65c76e 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1341,6 +1341,11 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
if (IS_ERR(net))
return PTR_ERR(net);
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ put_net(net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(net, ifname, name_assign_type,
&veth_link_ops, tbp, extack);
if (IS_ERR(peer)) {
--
2.50.1
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christof Hellmis, Andreas Stieger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
* [PATCH 5.15.y] net: add missing ns_capable check for peer netns
From: Maximilian Heyne @ 2026-06-17 8:27 UTC (permalink / raw)
To: stable
Cc: Maximilian Heyne, Wolfgang Grandegger, Marc Kleine-Budde,
David S. Miller, Jakub Kicinski, Eric Dumazet, Eric W. Biederman,
linux-can, netdev, linux-kernel
The upstream commit 7b735ef81286 ("rtnetlink: add missing
netlink_ns_capable() check for peer netns") doesn't apply on older
stable kernels due to refactoring. Therefore, this patch is an attempt
to implement the same capability check just directly in the respective
interface types.
Approximate the netlink_ns_capable check with an ns_capable check. As
the newlink operation is synchronous this should result in the same
behavior.
Without this commit, for example, the following command creating a veth
device in network namespace of pid 1 succeeds:
$ unshare -U -r -n -- bash -c '
ip link add veth0 type veth peer name foobar netns 1
sleep 60' &
$ ip link show foobar
13: foobar@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 96:09:69:92:92:cc brd ff:ff:ff:ff:ff:ff link-netnsid 1
With this patch, it's returning -EPERM.
This fixes CVE-2026-31692
Cc: stable@vger.kernel.org
Fixes: 81adee47dfb6 ("net: Support specifying the network namespace upon device creation.")
Assisted-by: Kiro:claude
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---
drivers/net/can/vxcan.c | 5 +++++
drivers/net/veth.c | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index afd9060c5421c..8a61011fdaeef 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -198,6 +198,11 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
if (IS_ERR(peer_net))
return PTR_ERR(peer_net);
+ if (!ns_capable(peer_net->user_ns, CAP_NET_ADMIN)) {
+ put_net(peer_net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(peer_net, ifname, name_assign_type,
&vxcan_link_ops, tbp, extack);
if (IS_ERR(peer)) {
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index cfacf8965bc59..c644d59d70900 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1664,6 +1664,11 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
if (IS_ERR(net))
return PTR_ERR(net);
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ put_net(net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(net, ifname, name_assign_type,
&veth_link_ops, tbp, extack);
if (IS_ERR(peer)) {
--
2.50.1
* [PATCH 6.1.y] net: add missing ns_capable check for peer netns
From: Maximilian Heyne @ 2026-06-17 8:27 UTC (permalink / raw)
To: stable
Cc: Maximilian Heyne, Wolfgang Grandegger, Marc Kleine-Budde,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Eric W. Biederman, linux-can, netdev, linux-kernel
The upstream commit 7b735ef81286 ("rtnetlink: add missing
netlink_ns_capable() check for peer netns") doesn't apply on older
stable kernels due to refactoring. Therefore, this patch is an attempt
to implement the same capability check just directly in the respective
interface types.
Approximate the netlink_ns_capable check with an ns_capable check. As
the newlink operation is synchronous this should result in the same
behavior.
Without this commit, for example, the following command creating a veth
device in network namespace of pid 1 succeeds:
$ unshare -U -r -n -- bash -c '
ip link add veth0 type veth peer name foobar netns 1
sleep 60' &
$ ip link show foobar
13: foobar@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 96:09:69:92:92:cc brd ff:ff:ff:ff:ff:ff link-netnsid 1
With this patch, it's returning -EPERM.
This fixes CVE-2026-31692
Cc: stable@vger.kernel.org
Fixes: 81adee47dfb6 ("net: Support specifying the network namespace upon device creation.")
Assisted-by: Kiro:claude
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---
drivers/net/can/vxcan.c | 5 +++++
drivers/net/veth.c | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index 98c669ad51414..da4affff65476 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -211,6 +211,11 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
if (IS_ERR(peer_net))
return PTR_ERR(peer_net);
+ if (!ns_capable(peer_net->user_ns, CAP_NET_ADMIN)) {
+ put_net(peer_net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(peer_net, ifname, name_assign_type,
&vxcan_link_ops, tbp, extack);
if (IS_ERR(peer)) {
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index e1e8c825483aa..dac8cc5a79f5a 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1707,6 +1707,11 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
if (IS_ERR(net))
return PTR_ERR(net);
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ put_net(net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(net, ifname, name_assign_type,
&veth_link_ops, tbp, extack);
if (IS_ERR(peer)) {
--
2.50.1
* [PATCH 6.6.y] net: add missing ns_capable check for peer netns
From: Maximilian Heyne @ 2026-06-17 8:26 UTC (permalink / raw)
To: stable
Cc: Maximilian Heyne, Wolfgang Grandegger, Marc Kleine-Budde,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Eric W. Biederman, linux-can, netdev, linux-kernel
The upstream commit 7b735ef81286 ("rtnetlink: add missing
netlink_ns_capable() check for peer netns") doesn't apply on older
stable kernels due to refactoring. Therefore, this patch is an attempt
to implement the same capability check just directly in the respective
interface types.
Approximate the netlink_ns_capable check with an ns_capable check. As
the newlink operation is synchronous this should result in the same
behavior.
Without this commit, for example, the following command creating a veth
device in network namespace of pid 1 succeeds:
$ unshare -U -r -n -- bash -c '
ip link add veth0 type veth peer name foobar netns 1
sleep 60' &
$ ip link show foobar
13: foobar@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 96:09:69:92:92:cc brd ff:ff:ff:ff:ff:ff link-netnsid 1
With this patch, it's returning -EPERM.
This fixes CVE-2026-31692
Cc: stable@vger.kernel.org
Fixes: 81adee47dfb6 ("net: Support specifying the network namespace upon device creation.")
Assisted-by: Kiro:claude
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---
drivers/net/can/vxcan.c | 5 +++++
drivers/net/veth.c | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index 98c669ad51414..da4affff65476 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -211,6 +211,11 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
if (IS_ERR(peer_net))
return PTR_ERR(peer_net);
+ if (!ns_capable(peer_net->user_ns, CAP_NET_ADMIN)) {
+ put_net(peer_net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(peer_net, ifname, name_assign_type,
&vxcan_link_ops, tbp, extack);
if (IS_ERR(peer)) {
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 2b3b0beb55c88..ba4ca6c6bc9d8 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1857,6 +1857,11 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
if (IS_ERR(net))
return PTR_ERR(net);
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ put_net(net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(net, ifname, name_assign_type,
&veth_link_ops, tbp, extack);
if (IS_ERR(peer)) {
--
2.50.1
* [PATCH 6.12.y] net: add missing ns_capable check for peer netns
From: Maximilian Heyne @ 2026-06-17 8:25 UTC (permalink / raw)
To: stable
Cc: Maximilian Heyne, Marc Kleine-Budde, Vincent Mailhol, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Daniel Borkmann, Nikolay Aleksandrov, Eric W. Biederman,
linux-can, netdev, linux-kernel, bpf
The upstream commit 7b735ef81286 ("rtnetlink: add missing
netlink_ns_capable() check for peer netns") doesn't apply on older
stable kernels due to refactoring. Therefore, this patch is an attempt
to implement the same capability check just directly in the respective
interface types.
Approximate the netlink_ns_capable check with an ns_capable check. As
the newlink operation is synchronous this should result in the same
behavior.
Without this commit, for example, the following command creating a veth
device in network namespace of pid 1 succeeds:
$ unshare -U -r -n -- bash -c '
ip link add veth0 type veth peer name foobar netns 1
sleep 60' &
$ ip link show foobar
13: foobar@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 96:09:69:92:92:cc brd ff:ff:ff:ff:ff:ff link-netnsid 1
With this patch, it's returning -EPERM.
This fixes CVE-2026-31692
Cc: stable@vger.kernel.org
Fixes: 81adee47dfb6 ("net: Support specifying the network namespace upon device creation.")
Assisted-by: Kiro:claude
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---
drivers/net/can/vxcan.c | 5 +++++
drivers/net/netkit.c | 5 +++++
drivers/net/veth.c | 5 +++++
3 files changed, 15 insertions(+)
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index 9e1b7d41005f8..851c93bf0b310 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -211,6 +211,11 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
if (IS_ERR(peer_net))
return PTR_ERR(peer_net);
+ if (!ns_capable(peer_net->user_ns, CAP_NET_ADMIN)) {
+ put_net(peer_net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(peer_net, ifname, name_assign_type,
&vxcan_link_ops, tbp, extack);
if (IS_ERR(peer)) {
diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
index fba2c734f0ec7..e0c42fa0c835c 100644
--- a/drivers/net/netkit.c
+++ b/drivers/net/netkit.c
@@ -413,6 +413,11 @@ static int netkit_new_link(struct net *src_net, struct net_device *dev,
if (IS_ERR(net))
return PTR_ERR(net);
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ put_net(net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(net, ifname, ifname_assign_type,
&netkit_link_ops, tbp, extack);
if (IS_ERR(peer)) {
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 77e4b0d1ca557..6ffde7ee2119d 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1854,6 +1854,11 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
if (IS_ERR(net))
return PTR_ERR(net);
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ put_net(net);
+ return -EPERM;
+ }
+
peer = rtnl_create_link(net, ifname, name_assign_type,
&veth_link_ops, tbp, extack);
if (IS_ERR(peer)) {
--
2.50.1
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Managing directors: Christof Hellmis, Andreas Stieger
Registered at Amtsgericht Charlottenburg under HRB 257764 B
Registered office: Berlin
VAT ID: DE 365 538 597
^ permalink raw reply related
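The three hunks in the patch above share one shape: once the peer namespace reference is held, a failed capability check must drop that reference before returning -EPERM. The following is a minimal user-space sketch of that shape only, not kernel code. The struct fake_ns type, its caller_is_admin flag, and the ns_ref_get(), ns_ref_put() and fake_newlink() helpers are hypothetical stand-ins for struct net, ns_capable(), the function that obtained the reference, put_net() and the newlink handlers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted namespace object. */
struct fake_ns {
	int refs;              /* number of references currently held */
	bool caller_is_admin;  /* models the result of a capability check */
};

/* Take one reference; models whatever lookup returned the namespace. */
struct fake_ns *ns_ref_get(struct fake_ns *ns)
{
	ns->refs++;
	return ns;
}

/* Drop one reference; models put_net(). */
void ns_ref_put(struct fake_ns *ns)
{
	assert(ns->refs > 0);
	ns->refs--;
}

/*
 * Models the newlink flow: the reference is taken first, so every
 * early return after that point has to release it again.
 */
int fake_newlink(struct fake_ns *target)
{
	struct fake_ns *peer_ns = ns_ref_get(target);

	if (!peer_ns->caller_is_admin) {
		ns_ref_put(peer_ns);   /* release before the early return */
		return -EPERM;
	}

	/* ... peer device creation would happen here ... */

	ns_ref_put(peer_ns);
	return 0;
}
```

In this sketch the reference count is the same before and after the call on both the denied and the permitted path, which is the property the put_net() call before `return -EPERM` preserves in the patch.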
* Re: [PATCH net v3] net: pch_gbe: handle TX skb allocation failure
From: Simon Horman @ 2026-06-17 8:24 UTC (permalink / raw)
To: Ruoyu Wang
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Masayuki Ohtake, netdev, linux-kernel
In-Reply-To: <20260615125043.3537046-1-ruoyuw560@gmail.com>
On Mon, Jun 15, 2026 at 08:50:42PM +0800, Ruoyu Wang wrote:
> pch_gbe_alloc_tx_buffers() allocates an skb for each TX descriptor and
> then passes the returned pointer to skb_reserve(). If netdev_alloc_skb()
> fails, skb_reserve() dereferences NULL.
>
> Make pch_gbe_alloc_tx_buffers() return an error when an skb allocation
> fails. On failure, let pch_gbe_alloc_tx_buffers() clean the partially
> allocated TX ring before returning the error. While bringing the device
> up, release the RX buffer pool through a shared cleanup helper before
> unwinding the IRQ setup.
>
> Fixes: 77555ee72282 ("net: Add Gigabit Ethernet driver of Topcliff PCH")
> Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com>
> ---
> Changes in v3:
> - Move the partial TX ring cleanup into pch_gbe_alloc_tx_buffers(), as
> suggested by Simon Horman.
>
> Changes in v2:
> - Add the kernel-doc return value description for
> pch_gbe_alloc_tx_buffers().
Thanks for the updates.
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply
* Re: [Intel-wired-lan] [PATCH v2] ice: retry reading NVM if admin queue returns EBUSY
From: Robert Malz @ 2026-06-17 8:11 UTC (permalink / raw)
To: Loktionov, Aleksandr
Cc: Nguyen, Anthony L, Kitszel, Przemyslaw,
intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org
In-Reply-To: <IA3PR11MB8986729EE79F3F3FBAAC68C9E5E42@IA3PR11MB8986.namprd11.prod.outlook.com>
(resend)
Hey Aleksandr,
Thanks for taking a look at this.
The loop exit, just like in the OOT driver, happens here:
> if (hw->adminq.sq_last_status != LIBIE_AQ_RC_EBUSY ||
> retry_cnt > ICE_SQ_SEND_MAX_EXECUTE)
> break;
By the way, I have v3 ready, which I plan to send 24 hours after
the initial submission. It doesn't change any code, but I want to keep
the netdev bots happy.
Thanks,
Robert
On Wed, Jun 17, 2026 at 9:47 AM Loktionov, Aleksandr
<aleksandr.loktionov@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> > Of Robert Malz via Intel-wired-lan
> > Sent: Wednesday, June 17, 2026 12:08 AM
> > To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
> > Przemyslaw <przemyslaw.kitszel@intel.com>
> > Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org
> > Subject: [Intel-wired-lan] [PATCH v2] ice: retry reading NVM if
> > admin queue returns EBUSY
> >
> > When the admin queue command to read NVM returns EBUSY, the driver
> > currently treats it as a fatal error and aborts the entire read
> > operation. This can cause spurious NVM read failures during periods
> > of high firmware activity.
> >
> > Add retry logic to ice_read_flat_nvm() that handles EBUSY responses
> > from the admin queue. When an EBUSY error is encountered, release
> > the NVM resource lock, wait for ICE_SQ_SEND_DELAY_TIME_MS, re-
> > acquire it, and retry the failed read. The retry is attempted up to
> > ICE_SQ_SEND_MAX_EXECUTE times before giving up.
> >
> > Code was extracted from OOT ice driver 1.15.4 release. Additional
> > change was made to reset last_cmd in case of retry to make sure that
> > all commands are retried properly.
> >
> > Fixes: e94509906d6b ("ice: create function to read a section of the
> > NVM and Shadow RAM")
> > Signed-off-by: Robert Malz <robert.malz@canonical.com>
> > ---
> > Changes in v2:
> > - change ICE_AQ_RC_EBUSY -> LIBIE_AQ_RC_EBUSY
> >
> > drivers/net/ethernet/intel/ice/ice_nvm.c | 25 +++++++++++++++++++--
> > ---
> > 1 file changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c
> > b/drivers/net/ethernet/intel/ice/ice_nvm.c
> > index 7e187a804dfa..b3120605d66f 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_nvm.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
> > @@ -67,6 +67,7 @@ ice_read_flat_nvm(struct ice_hw *hw, u32 offset,
> > u32 *length, u8 *data, {
> > u32 inlen = *length;
> > u32 bytes_read = 0;
> > + int retry_cnt = 0;
> > bool last_cmd;
> > int status;
> >
> > @@ -96,11 +97,25 @@ ice_read_flat_nvm(struct ice_hw *hw, u32 offset,
> > u32 *length, u8 *data,
> > offset, read_size,
> > data + bytes_read, last_cmd,
> > read_shadow_ram, NULL);
> > - if (status)
> > - break;
> > -
> > - bytes_read += read_size;
> > - offset += read_size;
> > + if (status) {
> > + if (hw->adminq.sq_last_status !=
> > LIBIE_AQ_RC_EBUSY ||
> > + retry_cnt > ICE_SQ_SEND_MAX_EXECUTE)
> > + break;
> > + ice_debug(hw, ICE_DBG_NVM,
> > + "NVM read EBUSY error, retry %d\n",
> > + retry_cnt + 1);
> > + last_cmd = false;
> > + ice_release_nvm(hw);
> > + msleep(ICE_SQ_SEND_DELAY_TIME_MS);
> > + status = ice_acquire_nvm(hw, ICE_RES_READ);
> > + if (status)
> > + break;
> > + retry_cnt++;
> It looks like you added the retry_cnt increment but you didn't add it into the loop exit condition.
>
>
> > + } else {
> > + bytes_read += read_size;
> > + offset += read_size;
> > + retry_cnt = 0;
> > + }
> > } while (!last_cmd);
> >
> > *length = bytes_read;
> > --
> > 2.34.1
>
^ permalink raw reply
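To make the exit condition discussed in this thread easier to follow, here is a minimal user-space sketch of a bounded retry loop with the same shape: break on any error that is not EBUSY or once the retry counter exceeds the limit, and reset the counter after every successful chunk. It is an illustration under stated assumptions, not the ice driver code. MAX_RETRIES, struct fake_dev, fake_read() and read_all() are hypothetical names, and the lock release, sleep and re-acquire steps from the patch are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_RETRIES 3   /* hypothetical stand-in for ICE_SQ_SEND_MAX_EXECUTE */

/* Hypothetical device: reports -EBUSY busy_left times, then succeeds. */
struct fake_dev {
	int busy_left;
	int calls;      /* total number of read attempts made */
};

int fake_read(struct fake_dev *dev)
{
	dev->calls++;
	if (dev->busy_left > 0) {
		dev->busy_left--;
		return -EBUSY;
	}
	return 0;
}

/* Read `total` bytes in pieces of at most `chunk` bytes, retrying on -EBUSY. */
int read_all(struct fake_dev *dev, size_t total, size_t chunk,
	     size_t *bytes_read)
{
	size_t done = 0;
	int retry_cnt = 0;
	int status = 0;

	*bytes_read = 0;
	if (chunk == 0)
		return -EINVAL;

	while (done < total) {
		status = fake_read(dev);
		if (status) {
			/* Same shape as the quoted exit condition. */
			if (status != -EBUSY || retry_cnt > MAX_RETRIES)
				break;
			retry_cnt++;
			continue;       /* retry the same chunk */
		}
		done += (total - done < chunk) ? total - done : chunk;
		retry_cnt = 0;          /* each chunk gets a fresh budget */
	}

	*bytes_read = done;
	return status;
}
```

Because the comparison in this sketch is `>` rather than `>=`, a permanently busy device is tried once and then retried MAX_RETRIES + 1 times before the loop gives up.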
* Re: [PATCH] e1000: Remove redundant else after return
From: Lovekesh Solanki @ 2026-06-17 7:58 UTC (permalink / raw)
To: andrew
Cc: andrew+netdev, anthony.l.nguyen, davem, edumazet, kuba,
lovekeshsolanki00, netdev, pabeni, przemyslaw.kitszel
In-Reply-To: <ead7bfc9-3978-4442-9cd1-23c2182b36b3@lunn.ch>
Hi Andrew,
I read the documentation you linked and understand simple standalone
cleanups are discouraged.
Thanks for the review, I will drop this patch.
Regards,
Lovekesh
^ permalink raw reply
* [PATCH net v4] tipc: fix slab-use-after-free Read in tipc_aead_decrypt_done
From: Doruk Tan Ozturk @ 2026-06-17 7:58 UTC (permalink / raw)
To: jmaloy
Cc: davem, edumazet, kuba, pabeni, horms, aleksander.lobakin,
tung.quang.nguyen, tipc-discussion, netdev, linux-kernel,
Doruk Tan Ozturk, stable
tipc_aead_decrypt() goes straight from tipc_bearer_hold(b) to
crypto_aead_decrypt(req) without taking a reference on the netns, unlike
the encrypt path. When crypto_aead_decrypt() is offloaded asynchronously
(e.g. the SIMD aead wrapper queuing to cryptd), the cryptd worker runs
tipc_aead_decrypt_done() later. If the bearer's netns is torn down in the
meantime, cleanup_net() -> tipc_exit_net() -> tipc_crypto_stop() frees the
per-netns tipc_crypto, and the completion then reads it:
tipc_aead_decrypt_done() dereferences aead->crypto->stats and
aead->crypto->net, and tipc_crypto_rcv_complete() dereferences
aead->crypto->aead[] and the node table -- reading freed memory.
Decoded KASAN splat (v7.1-rc7, CONFIG_KASAN_INLINE + TIPC + TIPC_CRYPTO):
BUG: KASAN: slab-use-after-free in tipc_aead_decrypt_done (net/tipc/crypto.c:999)
Read of size 8 at addr ffff8881056258a8 by task kworker/u16:2/51
Workqueue: events_unbound
Call Trace:
tipc_aead_decrypt_done (net/tipc/crypto.c:999)
process_one_work (kernel/workqueue.c:3314)
worker_thread (kernel/workqueue.c:3397 kernel/workqueue.c:3478)
kthread (kernel/kthread.c:436)
ret_from_fork (arch/x86/kernel/process.c:158)
ret_from_fork_asm (arch/x86/entry/entry_64.S:245)
Allocated by task 169:
__kasan_kmalloc (mm/kasan/common.c:398 mm/kasan/common.c:415)
tipc_crypto_start (net/tipc/crypto.c:1502)
tipc_init_net (net/tipc/core.c:72)
ops_init (net/core/net_namespace.c:137)
setup_net (net/core/net_namespace.c:446)
copy_net_ns (net/core/net_namespace.c:579)
create_new_namespaces (kernel/nsproxy.c:132)
__x64_sys_unshare (kernel/fork.c:3316)
do_syscall_64 (arch/x86/entry/syscall_64.c:63)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
Freed by task 8:
kfree (mm/slub.c:6566)
tipc_exit_net (net/tipc/core.c:119)
cleanup_net (net/core/net_namespace.c:704)
process_one_work (kernel/workqueue.c:3314)
kthread (kernel/kthread.c:436)
This is the same class of bug that commit e279024617134 ("net/tipc: fix
slab-use-after-free Read in tipc_aead_encrypt_done") fixed for the encrypt
side. The encrypt path takes maybe_get_net(aead->crypto->net) before
crypto_aead_encrypt() and drops it with put_net() on the synchronous
return paths and in tipc_aead_encrypt_done(); the -EINPROGRESS/-EBUSY
return keeps the reference for the async callback to release. The decrypt
path was left without the equivalent guard.
Mirror the encrypt-side fix on the decrypt path: take a net reference
before crypto_aead_decrypt() (failing with -ENODEV and the matching
bearer put if it cannot be acquired), keep it across the
-EINPROGRESS/-EBUSY async return, and drop it with put_net() on the
synchronous success/error return and at the end of
tipc_aead_decrypt_done().
Reproduced under KASAN on v7.1-rc7: a UDP bearer with a cluster key is
flooded with crafted encrypted frames from an unknown peer (driving the
cluster-key decrypt path) while the bearer's netns is repeatedly torn
down. The completion must run asynchronously to outlive
tipc_crypto_stop(); on x86 the stock aesni gcm(aes) now decrypts
synchronously, so the async path was exercised via cryptd offload. The
unguarded aead->crypto dereference in tipc_aead_decrypt_done() is the
unpatched upstream path; tipc_aead_decrypt() still lacks
maybe_get_net(aead->crypto->net), so the completion can outlive the free
on any config where crypto_aead_decrypt() goes async.
Found by 0sec automated security-research tooling (https://0sec.ai).
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Cc: stable@vger.kernel.org
Signed-off-by: Doruk Tan Ozturk <doruk@0sec.ai>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
---
v4:
- Use the net parameter for maybe_get_net()/put_net() instead of
dereferencing aead->crypto->net, which is the per-netns structure at
risk during teardown (per the automated review forwarded by Simon
Horman). net == aead->crypto->net here; no functional change.
v3:
- Rewrite the changelog with the decoded stack trace and frame the
reproduction on the current tree (v7.1-rc7); drop the v6.12.92
references (Tung Quang Nguyen).
v2:
- Add Cc: stable@vger.kernel.org and Alexander Lobakin's Reviewed-by.
No functional change.
net/tipc/crypto.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index 6d3b6b89b1d1..16f1ed1f6b1b 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -941,12 +941,20 @@ static int tipc_aead_decrypt(struct net *net, struct tipc_aead *aead,
goto exit;
}
+ /* Get net to avoid freed tipc_crypto when delete namespace */
+ if (!maybe_get_net(net)) {
+ tipc_bearer_put(b);
+ rc = -ENODEV;
+ goto exit;
+ }
+
/* Now, do decrypt */
rc = crypto_aead_decrypt(req);
if (rc == -EINPROGRESS || rc == -EBUSY)
return rc;
tipc_bearer_put(b);
+ put_net(net);
exit:
kfree(ctx);
@@ -984,6 +992,7 @@ static void tipc_aead_decrypt_done(void *data, int err)
}
tipc_bearer_put(b);
+ put_net(net);
}
static inline int tipc_ehdr_size(struct tipc_ehdr *ehdr)
--
2.43.0
^ permalink raw reply related
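The reference handling described in the changelog above can be modelled in user space as follows. This is a sketch under stated assumptions, not the TIPC code. struct fake_net, net_ref_maybe_get(), net_ref_put(), fake_decrypt() and fake_decrypt_done() are hypothetical stand-ins, the go_async flag replaces the real crypto call, and maybe_get_net() is modelled as "increment unless the count is already zero".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical refcounted namespace; refs == 0 means it is being torn down. */
struct fake_net {
	int refs;
};

/* Take a reference only if the object is still alive. */
bool net_ref_maybe_get(struct fake_net *net)
{
	if (net->refs == 0)
		return false;
	net->refs++;
	return true;
}

/* Drop one reference; models put_net(). */
void net_ref_put(struct fake_net *net)
{
	assert(net->refs > 0);
	net->refs--;
}

/*
 * Submission path. go_async chooses whether the simulated crypto call
 * completes inline (0) or later through the callback (-EINPROGRESS).
 */
int fake_decrypt(struct fake_net *net, bool go_async)
{
	int rc;

	if (!net_ref_maybe_get(net))
		return -ENODEV;         /* namespace is already going away */

	rc = go_async ? -EINPROGRESS : 0;
	if (rc == -EINPROGRESS || rc == -EBUSY)
		return rc;              /* reference stays held for the callback */

	net_ref_put(net);               /* synchronous completion: drop it here */
	return rc;
}

/* Completion callback for the asynchronous case. */
void fake_decrypt_done(struct fake_net *net)
{
	/* ... per-namespace state may be used here, the reference pins it ... */
	net_ref_put(net);
}
```

The point of the sketch is that exactly one path releases the reference: the submitter on a synchronous return, or the completion callback when the request went asynchronous.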