Netdev List


* Re: [PATCH net v3] net: airoha: Fix skb->priority underflow in airoha_dev_select_queue()
From: Lorenzo Bianconi @ 2026-06-18 10:03 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Wayen Yan, netdev, horms, pabeni, edumazet, andrew+netdev,
	angelogioacchino.delregno, matthias.bgg, linux-arm-kernel,
	linux-mediatek
In-Reply-To: <20260617161951.52abe413@kernel.org>


> On Sun, 14 Jun 2026 07:30:54 +0800 Wayen Yan wrote:
> > In airoha_dev_select_queue(), the expression:
> > 
> >   queue = (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES;
> > 
> > implicitly converts to unsigned arithmetic: when skb->priority is 0
> > (the default for unclassified traffic), (0u - 1u) wraps to UINT_MAX,
> > and UINT_MAX % 8 = 7, routing default best-effort packets to the
> > highest-priority QoS queue. This causes QoS inversion where the
> > majority of traffic on a PON gateway starves actual high-priority
> > flows (VoIP, gaming, etc.).
> > 
> > Fix by guarding the subtraction: when priority is 0, map to queue 0
> > (lowest priority), otherwise apply the original (priority - 1) % 8
> > mapping.
> > 
> > Fixes: 2b288b81560b ("net: airoha: Introduce ndo_select_queue callback")
> > Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > Reviewed-by: Joe Damato <joe@dama.to>
> > Signed-off-by: Wayen Yan <win847@gmail.com>
> > ---
> >  drivers/net/ethernet/airoha/airoha_eth.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index 31cdb11cd7..d476ef83c3 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> > @@ -1933,7 +1933,7 @@ static u16 airoha_dev_select_queue(struct net_device *dev, struct sk_buff *skb,
> >  	 */
> >  	channel = netdev_uses_dsa(dev) ? skb_get_queue_mapping(skb) : port->id;
> >  	channel = channel % AIROHA_NUM_QOS_CHANNELS;
> > -	queue = (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES; /* QoS queue */
> > +	queue = skb->priority ? (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES : 0;
> 
> Hi Lorenzo, is there a reason we're subtracting 1 here in the first
> place? Could be just me, but may be worth adding a comment here.
> 
> Intuitively if we are "narrowing" 16 prios to 8 queues it'd make most
> sense to group the adjacent ones -- divide by two.
> 
> Please respin with some sort of an explanation..

IIRC this is a leftover of the ETS offload support.
I agree it is right to just do:

	queue = skb->priority % AIROHA_NUM_QOS_QUEUES; /* QoS queue */

@Wayen: can you please respin with this fix? Please also add my Acked-by:

Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>

Regards,
Lorenzo

> 
> >  	queue = channel * AIROHA_NUM_QOS_QUEUES + queue;
> >  
> >  	return queue < dev->num_tx_queues ? queue : 0;
> -- 
> pw-bot: cr
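
For reference, the wraparound described in the commit message can be
reproduced in userspace. This is a minimal sketch, assuming a 32-bit
unsigned priority and 8 QoS queues per channel as stated above. The
function names and NUM_QOS_QUEUES are illustrative stand-ins and are not
part of the driver. queue_halved() is only one possible reading of the
"divide by two" suggestion.

```c
#include <stdint.h>

#define NUM_QOS_QUEUES 8u /* stand-in for AIROHA_NUM_QOS_QUEUES */

/* Original mapping: priority 0 wraps to UINT32_MAX before the modulo. */
static uint32_t queue_original(uint32_t priority)
{
	return (priority - 1) % NUM_QOS_QUEUES;
}

/* v3 mapping: guard the subtraction so that priority 0 maps to queue 0. */
static uint32_t queue_guarded(uint32_t priority)
{
	return priority ? (priority - 1) % NUM_QOS_QUEUES : 0;
}

/* Mapping suggested in the reply: drop the subtraction entirely. */
static uint32_t queue_modulo(uint32_t priority)
{
	return priority % NUM_QOS_QUEUES;
}

/* Hypothetical "group adjacent priorities" mapping for 16 priorities. */
static uint32_t queue_halved(uint32_t priority)
{
	return (priority / 2) % NUM_QOS_QUEUES;
}
```

With the original expression, priority 0 lands in queue 7. The other
three mappings all send it to queue 0.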



* Re: [RESEND PATCH v1] net: dsa: motorcomm: add yt92xx dsa driver
From: Kyle Switch @ 2026-06-18  9:59 UTC (permalink / raw)
  To: David Yang
  Cc: andrew, olteanv, davem, edumazet, kuba, pabeni, horms, netdev,
	linux-kernel, ming.xu, xiaolin.xu, jianmin.wang, de.ge
In-Reply-To: <CAAXyoMMYxRTwHD6QmpAkspCtiY853KkYuOAUR=qV0v9g5w9v+g@mail.gmail.com>



On 6/17/26 19:15, David Yang wrote:
> On Wed, Jun 17, 2026 at 10:37 AM Kyle Switch <kyle.switch@motor-comm.com> wrote:
>>>> +/* To define the from cpu tag format 8 bytes:
>>>> + *
>>>> + * 0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7
>>>> + *|<----------TPID 0x9988---------->|
>>>> + *|<--RESERVE-->|<-----DST PORT---->|
>>>> + *|-|<---------RESERVE------------->|
>>>> + *|<------------------------------->|
>>>> + */
>>>> +#define YT922X_TAG_FORMAT2_NAME "yt922x-8b"
>>>> +#define YT922X_FORMAT2_TAG_LEN                  8
>>>> +#define YT922X_PKT_TYPE          GENMASK(15, 14)
>>>> +#define YT922X_8B_CPUTAG_PKT_FROM_CPU      0x1
>>>> +#define YT922X_8B_CPUTAG_SRC_PORT          GENMASK(6, 2)
>>>> +#define YT922X_8B_CPUTAG_DST_PORTMASK      GENMASK(8, 0)
>>>> +#define YT922X_8B_CPUTAG_DST_PORTMASK_0      BIT(15)
>>>> +#define YT922X_8B_CPUTAG_DST_PORTMASK_0_EN      0x1
>>>> +#define YT922X_8B_CPUTAG_FORCE_DST         BIT(9)
>>>> +#define YT922X_8B_CPUTAG_FORCE_DST_EN      0x1
>>>
>>> If yt922x tag format shares no common with yt921x, make a new tag driver.
>>
>> Ans: thank you for your suggestion, we will consider whether to create a new driver in the new file.
> 
> I'm not an expert in this, but if yt922x tag does support cpu codes
> and priority, please consider updating yt921x tagger to support it,
> even if you don't use or test these features for now.
> 

Ans: By "updating yt921x tagger", do you mean that the yt922x tag driver
should support CPU codes and DSCP priority? We are considering implementing
this in a subsequent patch. In any case, it will be supported when we
submit the yt922x DSA driver.

>>>
>>>> +static struct dsa_tag_driver *dsa_tag_driver_array[] = {
>>>> +       &DSA_TAG_DRIVER_NAME(yt921x_netdev_ops),
>>>> +       &DSA_TAG_DRIVER_NAME(yt922x_4b_netdev_ops),
>>>> +       &DSA_TAG_DRIVER_NAME(yt922x_8b_netdev_ops),
>>>> +};
>>>
>>> If both are supported by the chip and 4b does nothing more than 8b
>>> does, do not bother with it.
>>
>> Ans: 4b and 8b dsa tag may have different application scenarios. from my opinion,
>>      1. 4b dsa tag can save 4 bytes of payload
>>      2. 8b dsa tag carry more package info.
> 
> We do not support every tag protocol. For DSA switches,
>   - the conduit interface supports jumbo frames so there is room for
> the DSA header, or
>   - you end up with MTU less than 1500 anyway.
> 4-byte reduction does not make a practical difference here. An
> alternative protocol poses 2x work to everyone else, and unnecessarily
> exposes your driver to interoperability issues, as pointed by Andrew.
> 
> As I've commented before, if there is a particular reason to add
> 4-byte protocol, leave it behind for the moment, and focus on a
> minimal yt922x_dsa_switch_ops + yt922x_netdev_ops for your first
> patchset without any offloading supports. This way, others can easily
> see your changes and move the work forward efficiently.

Ans: Thank you for your advice. The 8-byte DSA tag driver will be supported first.



* Re: [RESEND PATCH v1] net: dsa: motorcomm: add yt92xx dsa driver
From: Kyle Switch @ 2026-06-18  9:53 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David Yang, olteanv, davem, edumazet, kuba, pabeni, horms, netdev,
	linux-kernel, ming.xu, xiaolin.xu, jianmin.wang, de.ge
In-Reply-To: <a689a734-bfd2-4e8f-85dd-9ae210b3161a@lunn.ch>



On 6/17/26 17:07, Andrew Lunn wrote:
>>>> +#define CMM_PARAM_CHK(expr, err_code)    \
>>>> +       do {                             \
>>>> +               if ((u32)(expr)) {       \
>>>> +                       return err_code; \
>>>> +               }                        \
>>>> +       } while (0)
>>>> +
>>>> +#define CMM_ERR_CHK(op, ret)           \
>>>> +       do {                           \
>>>> +               ret = (op);            \
>>>> +               if (ret != CMM_ERR_OK) \
>>>> +                       return ret;    \
>>>> +       } while (0)
>>>
>>> Do not use macros like this.
>>
>> Ans: Acknowledged, i will consider how to optimize them in the future.
> 
> It is not about optimization. Hiding a return statement in a macro is
> very bad style. It will lead to locking bugs, and resource leaks,
> because nobody knows the return is there.
> 

Ans: This issue will be fixed before the next patch is sent.
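
A minimal userspace sketch of the problem described above, and of the
usual explicit alternative. Everything here is a hypothetical stand-in:
struct fake_lock, fake_hw_op() and ERR_CHK() only model the shape of the
quoted CMM_ERR_CHK() macro and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a driver lock and a hardware operation. */
struct fake_lock { bool held; };

static void fake_lock_acquire(struct fake_lock *l) { l->held = true; }
static void fake_lock_release(struct fake_lock *l) { l->held = false; }
static int fake_hw_op(int fail) { return fail ? -5 : 0; }

/* Same shape as the quoted macro: the return is hidden from the caller. */
#define ERR_CHK(op, ret)        \
	do {                    \
		ret = (op);     \
		if (ret != 0)   \
			return ret; \
	} while (0)

/* On failure the hidden return skips the release, so the lock leaks. */
static int configure_with_macro(struct fake_lock *l, int fail)
{
	int ret;

	fake_lock_acquire(l);
	ERR_CHK(fake_hw_op(fail), ret);
	fake_lock_release(l);
	return 0;
}

/* Explicit error path: every exit goes through the unlock label. */
static int configure_explicit(struct fake_lock *l, int fail)
{
	int ret;

	fake_lock_acquire(l);
	ret = fake_hw_op(fail);
	if (ret)
		goto out_unlock;
	/* further configuration steps would go here */
out_unlock:
	fake_lock_release(l);
	return ret;
}
```

In the explicit version the return path is visible at the call site,
which makes a missing unlock easy to spot in review.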

>>>> +/*
>>>> + * Macro Definition
>>>> + */
>>>> +#ifndef NULL
>>>> +#define NULL 0
>>>> +#endif
>>>> +
>>>> +#ifndef FALSE
>>>> +#define FALSE 0
>>>> +#endif
>>>> +
>>>> +#ifndef TRUE
>>>> +#define TRUE 1
>>>> +#endif
>>>
>>> Nonsense.
>>
>> Ans: Acknowledge, will be fixed later.
> 
> No. They will be fixed now.
> 

Ans: This issue will be fixed before the next patch is sent.

>>>> +       /* Print chipid here since we are interested in lower 16 bits */
>>>> +       dev_info(dev,
>>>> +                "Motorcomm %s ethernet switch.\n",
>>>> +                info->name);
>>>
>>> Stop copy-n-paste.
>>
>> Ans: Sry for this, i will recheck the code to make sure each line of comments and code
>> meaningful again.
> 
> Also, consider the comments. Do the comments add anything useful which
> is not already obvious from the code. Comments should be about "Why?".
> 
>>>> --- a/include/uapi/linux/if_ether.h
>>>> +++ b/include/uapi/linux/if_ether.h
>>>> @@ -118,7 +118,7 @@
>>>>  #define ETH_P_QINQ1    0x9100          /* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>>  #define ETH_P_QINQ2    0x9200          /* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>>  #define ETH_P_QINQ3    0x9300          /* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>> -#define ETH_P_YT921X   0x9988          /* Motorcomm YT921x DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>> +#define ETH_P_YT92XX   0x9988          /* Motorcomm YT92xx DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>>  #define ETH_P_EDSA     0xDADA          /* Ethertype DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>>  #define ETH_P_DSA_8021Q        0xDADB          /* Fake VLAN Header for DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>>  #define ETH_P_DSA_A5PSW        0xE001          /* A5PSW Tag Value [ NOT AN OFFICIALLY REGISTERED ID ] */
>>>
>>> UAPI stands for User-space API. Do not change it unless there is a
>>> very very good reason.
>>>
>>
>> Ans: The default tpid both yt921x and yt922x is 0x9988. I have modified this to 
>> allow for simultaneous use in both yt922x and yt921x scenarios.
> 
> As pointed out, this is UAPI. Any changes to this file need a good
> explanation how it does not change the user API. Do this break
> backwards compatibility with user space applications? Maybe tcpdump or
> wireshark has a dissector which expects ETH_P_YT921X and you have just
> broken it?
> 

Ans: I now have a better understanding of what UAPI means.
If a new DSA driver is added in a subsequent patch, we will consider adding
a new definition instead of modifying the existing one.

>>>> +#define YT922X_TAG_FORMAT2_NAME "yt922x-8b"
>>>> +#define YT922X_FORMAT2_TAG_LEN                  8
>>>> +#define YT922X_PKT_TYPE          GENMASK(15, 14)
>>>> +#define YT922X_8B_CPUTAG_PKT_FROM_CPU      0x1
>>>> +#define YT922X_8B_CPUTAG_SRC_PORT          GENMASK(6, 2)
>>>> +#define YT922X_8B_CPUTAG_DST_PORTMASK      GENMASK(8, 0)
>>>> +#define YT922X_8B_CPUTAG_DST_PORTMASK_0      BIT(15)
>>>> +#define YT922X_8B_CPUTAG_DST_PORTMASK_0_EN      0x1
>>>> +#define YT922X_8B_CPUTAG_FORCE_DST         BIT(9)
>>>> +#define YT922X_8B_CPUTAG_FORCE_DST_EN      0x1
>>>
>>> If yt922x tag format shares no common with yt921x, make a new tag driver.
>>
>> Ans: thank you for your suggestion, we will consider whether to create a new driver in the new file.
> 
> When you look at other tag drivers, you will also notice some drivers
> implement two taggers in one file. So consider this if there is any
> shared code.
> 

Ans: OK, the tag driver will follow the approach used by other existing tag drivers.

>>>> +static struct dsa_tag_driver *dsa_tag_driver_array[] = {
>>>> +       &DSA_TAG_DRIVER_NAME(yt921x_netdev_ops),
>>>> +       &DSA_TAG_DRIVER_NAME(yt922x_4b_netdev_ops),
>>>> +       &DSA_TAG_DRIVER_NAME(yt922x_8b_netdev_ops),
>>>> +};
>>>
>>> If both are supported by the chip and 4b does nothing more than 8b
>>> does, do not bother with it.
>>
>> Ans: 4b and 8b dsa tag may have different application scenarios. from my opinion,
>>      1. 4b dsa tag can save 4 bytes of payload
>>      2. 8b dsa tag carry more package info.
> 
> How do you plan to swap between the different formats?
> 
> The user perspective is that the machine has a collection of interface
> which are used just as normal, using Linux tools likes like
> iproute2. If the user enables a feature which requires the 8b tag
> format, will you change the format from the DSA driver? And swap back
> to the 4 byte format when the feature is no longer needed?
> 

Ans: After considering your and David's comments and suggestions, we will
break this patch into a series of small patches that include only the
8-byte tag driver for now. If the 4-byte tag driver is required later, we
will use the "change_tag_protocol" mechanism from the DSA driver.

As you mentioned in an earlier mail: "One thing i need to point out. Linux
has a long tradition of not replacing existing code with a new
implementation. You take the existing code and step by step improve it."
With that in mind, I want to explain the patch in more detail.

Step 1. We do not attempt to remove the existing driver implementation or
change the behavior of existing software. We will retain the existing
driver's software layer, but encapsulate the hardware operations behind
functional interfaces. The advantage is that this is easier to maintain and
makes it easier to support other Motorcomm switch series.

For example, the VLAN add operation in the DSA driver:

Existing code:

yt921x_vlan_add(struct yt921x_priv *priv, int port, u16 vid, bool untagged)
{
 u64 mask64;
 u64 ctrl64;

 mask64 = YT921X_VLAN_CTRL_PORTn(port) |
   YT921X_VLAN_CTRL_PORTS(priv->cpu_ports_mask);
 ctrl64 = mask64;

 mask64 |= YT921X_VLAN_CTRL_UNTAG_PORTn(port);
 if (untagged)
  ctrl64 |= YT921X_VLAN_CTRL_UNTAG_PORTn(port);

 return yt921x_reg64_update_bits(priv, YT921X_VLANn_CTRL(vid),
     mask64, ctrl64);
}

After the patch:

yt921x_vlan_add(struct yt921x_priv *priv, int port, u16 vid, bool untagged)
{
	struct yt_port_mask member = {};
	struct yt_port_mask untag = {};

	member.portbits[0] = BIT(port) | priv->cpu_ports_mask;
	if (untagged)
		untag.portbits[0] = BIT(port);

	/* Use the encapsulated interface to complete the hardware
	 * configuration. Differences between Motorcomm series are handled
	 * in drivers/net/dsa/motorcomm/switch/yt_vlan.c.
	 */
	return yt_vlan_port_set(priv->unit, vid, member, untag);
}

Step 2. If Step 1 is accepted, the plan would then be to replace the
hardware configuration in the existing DSA driver with the encapsulated
interfaces step by step, one functional module at a time (VLAN, mirror,
LAG, etc.). Finally, we will submit the yt922x DSA driver.

> 	Andrew


* Re: [PATCH net-next v9 01/10] enic: verify firmware supports V2 SR-IOV at probe time
From: Breno Leitao @ 2026-06-18  9:32 UTC (permalink / raw)
  To: Satish Kharat
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Sesidhar Baddela
In-Reply-To: <20260617-enic-sriov-v2-admin-channel-v2-v9-1-37f5f5af4c93@cisco.com>

On Wed, Jun 17, 2026 at 06:53:24PM -0700, Satish Kharat wrote:
> During PF probe, query the firmware get-supported-feature interface
> to verify that the running firmware supports V2 SR-IOV. Firmware
> version 5.3(4.72) and later report VIC_FEATURE_SRIOV via
> CMD_GET_SUPP_FEATURE_VER. If the firmware does not support the
> feature, set vf_type to ENIC_VF_TYPE_NONE and log a warning so the
> admin knows a firmware upgrade is needed.
> 
> VIC_FEATURE_SRIOV is assigned the explicit value 4 to match the
> firmware ABI.  Slot 3 (firmware's VIC_FEATURE_PTP) is reserved with
> a comment rather than a placeholder enum entry, since PTP is not
> used by the upstream driver.
> 
> Suggested-by: Breno Leitao <leitao@debian.org>
> Signed-off-by: Satish Kharat <satishkh@cisco.com>

Reviewed-by: Breno Leitao <leitao@debian.org>

FWIW: net-next is closed now.
https://lore.kernel.org/all/20260615085310.014e4e31@kernel.org/


* [PATCH net v2] net: ethernet: ti: icssg: guard PA stat lookups
From: Philippe Schenker @ 2026-06-18  9:30 UTC (permalink / raw)
  To: netdev
  Cc: Philippe Schenker, Simon Horman, danishanwar, rogerq,
	linux-arm-kernel, stable, Andrew Lunn, David Carlier,
	David S. Miller, Eric Dumazet, Jacob Keller, Jakub Kicinski,
	Kevin Hao, Meghana Malladi, Paolo Abeni, Vadim Fedorenko,
	linux-kernel

From: Philippe Schenker <philippe.schenker@impulsing.ch>

icssg_ndo_get_stats64() unconditionally calls emac_get_stat_by_name()
with FW PA stat names regardless of whether the PA stats block is
present on the hardware.  emac_get_stat_by_name() already guards the
PA stats lookup with `if (emac->prueth->pa_stats)`; when that pointer
is NULL the lookup falls through to netdev_err() and returns -EINVAL.
Because ndo_get_stats64 is polled regularly by the networking stack
this produces thousands of log entries of the form:

  icssg-prueth icssg1-eth end0: Invalid stats FW_RX_ERROR

A secondary consequence is that the int(-EINVAL) return value is
implicitly widened to a near-ULLONG_MAX unsigned value when accumulated
into the __u64 fields of rtnl_link_stats64, silently corrupting the
rx_errors, rx_dropped and tx_dropped counters reported by `ip -s link`.

Every other PA-aware code path in the driver is already guarded with
the same `if (emac->prueth->pa_stats)` check.  Apply the same guard
here.

Fixes: 0d15a26b247d ("net: ti: icssg-prueth: Add ICSSG FW Stats")
Signed-off-by: Philippe Schenker <philippe.schenker@impulsing.ch>
Reviewed-by: Simon Horman <horms@kernel.org>

Cc: danishanwar@ti.com
Cc: rogerq@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: stable@vger.kernel.org

---

Changes in v2:
- Removed newline between Fixes tag and Signed-off-by
- Use return in if statement to guard so we get rid
  of the 80 char warnings.
- Added Simon's Reviewed-by. Thanks!

 drivers/net/ethernet/ti/icssg/icssg_common.c | 49 +++++++++++---------
 1 file changed, 28 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
index a28a608f9bf4..d9af6419e032 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_common.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
@@ -1628,28 +1628,35 @@ void icssg_ndo_get_stats64(struct net_device *ndev,
 	stats->rx_over_errors = emac_get_stat_by_name(emac, "rx_over_errors");
 	stats->multicast      = emac_get_stat_by_name(emac, "rx_multicast_frames");
 
-	stats->rx_errors  = ndev->stats.rx_errors +
-			    emac_get_stat_by_name(emac, "FW_RX_ERROR") +
-			    emac_get_stat_by_name(emac, "FW_RX_EOF_SHORT_FRMERR") +
-			    emac_get_stat_by_name(emac, "FW_RX_B0_DROP_EARLY_EOF") +
-			    emac_get_stat_by_name(emac, "FW_RX_EXP_FRAG_Q_DROP") +
-			    emac_get_stat_by_name(emac, "FW_RX_FIFO_OVERRUN");
-	stats->rx_dropped = ndev->stats.rx_dropped +
-			    emac_get_stat_by_name(emac, "FW_DROPPED_PKT") +
-			    emac_get_stat_by_name(emac, "FW_INF_PORT_DISABLED") +
-			    emac_get_stat_by_name(emac, "FW_INF_SAV") +
-			    emac_get_stat_by_name(emac, "FW_INF_SA_DL") +
-			    emac_get_stat_by_name(emac, "FW_INF_PORT_BLOCKED") +
-			    emac_get_stat_by_name(emac, "FW_INF_DROP_TAGGED") +
-			    emac_get_stat_by_name(emac, "FW_INF_DROP_PRIOTAGGED") +
-			    emac_get_stat_by_name(emac, "FW_INF_DROP_NOTAG") +
-			    emac_get_stat_by_name(emac, "FW_INF_DROP_NOTMEMBER");
+	stats->rx_errors  = ndev->stats.rx_errors;
+	stats->rx_dropped = ndev->stats.rx_dropped;
 	stats->tx_errors  = ndev->stats.tx_errors;
-	stats->tx_dropped = ndev->stats.tx_dropped +
-			    emac_get_stat_by_name(emac, "FW_RTU_PKT_DROP") +
-			    emac_get_stat_by_name(emac, "FW_TX_DROPPED_PACKET") +
-			    emac_get_stat_by_name(emac, "FW_TX_TS_DROPPED_PACKET") +
-			    emac_get_stat_by_name(emac, "FW_TX_JUMBO_FRM_CUTOFF");
+	stats->tx_dropped = ndev->stats.tx_dropped;
+
+	if (!emac->prueth->pa_stats)
+		return;
+
+	stats->rx_errors  +=
+			emac_get_stat_by_name(emac, "FW_RX_ERROR") +
+			emac_get_stat_by_name(emac, "FW_RX_EOF_SHORT_FRMERR") +
+			emac_get_stat_by_name(emac, "FW_RX_B0_DROP_EARLY_EOF") +
+			emac_get_stat_by_name(emac, "FW_RX_EXP_FRAG_Q_DROP") +
+			emac_get_stat_by_name(emac, "FW_RX_FIFO_OVERRUN");
+	stats->rx_dropped +=
+			emac_get_stat_by_name(emac, "FW_DROPPED_PKT") +
+			emac_get_stat_by_name(emac, "FW_INF_PORT_DISABLED") +
+			emac_get_stat_by_name(emac, "FW_INF_SAV") +
+			emac_get_stat_by_name(emac, "FW_INF_SA_DL") +
+			emac_get_stat_by_name(emac, "FW_INF_PORT_BLOCKED") +
+			emac_get_stat_by_name(emac, "FW_INF_DROP_TAGGED") +
+			emac_get_stat_by_name(emac, "FW_INF_DROP_PRIOTAGGED") +
+			emac_get_stat_by_name(emac, "FW_INF_DROP_NOTAG") +
+			emac_get_stat_by_name(emac, "FW_INF_DROP_NOTMEMBER");
+	stats->tx_dropped +=
+			emac_get_stat_by_name(emac, "FW_RTU_PKT_DROP") +
+			emac_get_stat_by_name(emac, "FW_TX_DROPPED_PACKET") +
+			emac_get_stat_by_name(emac, "FW_TX_TS_DROPPED_PACKET") +
+			emac_get_stat_by_name(emac, "FW_TX_JUMBO_FRM_CUTOFF");
 }
 EXPORT_SYMBOL_GPL(icssg_ndo_get_stats64);
 
-- 
2.54.0

base-commit: 8cd9520d35a6c38db6567e97dd93b1f11f185dc6
branch: fix-icssg_common-pa-stats-errors__master-7-1
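
The counter corruption described in the commit message can be illustrated
in userspace. This is a minimal sketch under stated assumptions:
lookup_stat_fails() is a hypothetical stand-in for a lookup that returns
-EINVAL, and the accumulate helpers are illustrative names that do not
exist in the driver.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a stat lookup that fails with -EINVAL. */
static int lookup_stat_fails(void)
{
	return -EINVAL;
}

/*
 * Unguarded accumulation: the negative int is converted to uint64_t
 * before the addition, so it becomes a value close to UINT64_MAX.
 */
static uint64_t accumulate_unguarded(uint64_t base)
{
	return base + lookup_stat_fails();
}

/* Guarded accumulation: skip the lookup when the stats block is absent. */
static uint64_t accumulate_guarded(uint64_t base, int have_pa_stats)
{
	if (!have_pa_stats)
		return base;

	return base + lookup_stat_fails();
}
```

Starting from a counter of 0, the unguarded helper returns
UINT64_MAX - EINVAL + 1, while the guarded helper leaves the counter
unchanged when the stats block is not present.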


* Re: [PATCH net] net: ethernet: ti: icssg: guard PA stat lookups
From: Philippe Schenker @ 2026-06-18  9:29 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, danishanwar, rogerq, linux-arm-kernel, stable,
	Andrew Lunn, David Carlier, David S. Miller, Eric Dumazet,
	Jacob Keller, Jakub Kicinski, Kevin Hao, Meghana Malladi,
	Paolo Abeni, Vadim Fedorenko, linux-kernel
In-Reply-To: <20260618091004.GG827683@horms.kernel.org>


Hi Simon

Thanks for the review and I'll send a v2 with that blank line removed.
Saw it right after sending the patch.

Philippe

On Thu, 2026-06-18 at 10:10 +0100, Simon Horman wrote:
> On Tue, Jun 16, 2026 at 04:35:34PM +0200, Philippe Schenker wrote:
> > From: Philippe Schenker <philippe.schenker@impulsing.ch>
> > 
> > icssg_ndo_get_stats64() unconditionally calls emac_get_stat_by_name()
> > with FW PA stat names regardless of whether the PA stats block is
> > present on the hardware.  emac_get_stat_by_name() already guards the
> > PA stats lookup with `if (emac->prueth->pa_stats)`; when that pointer
> > is NULL the lookup falls through to netdev_err() and returns -EINVAL.
> > Because ndo_get_stats64 is polled regularly by the networking stack
> > this produces thousands of log entries of the form:
> > 
> >   icssg-prueth icssg1-eth end0: Invalid stats FW_RX_ERROR
> > 
> > A secondary consequence is that the int(-EINVAL) return value is
> > implicitly widened to a near-ULLONG_MAX unsigned value when accumulated
> > into the __u64 fields of rtnl_link_stats64, silently corrupting the
> > rx_errors, rx_dropped and tx_dropped counters reported by `ip -s link`.
> > 
> > Every other PA-aware code path in the driver is already guarded with
> > the same `if (emac->prueth->pa_stats)` check.  Apply the same guard
> > here.
> > 
> > Fixes: 0d15a26b247d ("net: ti: icssg-prueth: Add ICSSG FW Stats")
> 
> nit: no blank line between tags
> 
> > 
> > Signed-off-by: Philippe Schenker <philippe.schenker@impulsing.ch>
> > 
> > Cc: danishanwar@ti.com
> > Cc: rogerq@kernel.org
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: stable@vger.kernel.org
> 
> Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net 5/5] afs: Fix uncancelled rxrpc OOB message handler
From: Simon Horman @ 2026-06-18  9:29 UTC (permalink / raw)
  To: dhowells
  Cc: Simon Horman, netdev, marc.dionne, kuba, davem, edumazet, pabeni,
	linux-afs, linux-kernel, d4n.for.sec, n05ec, jaltman, stable
In-Reply-To: <20260616155749.2125907-6-dhowells@redhat.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
afs: Fix uncancelled rxrpc OOB message handler

This commit cancels AFS OOB message processing during shutdown and moves it
to afs_wq. It also prevents queuing if the net namespace is no longer live.

Link: https://sashiko.dev/#/patchset/20260609140911.838677-1-dhowells%40redhat.com

> diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
> index fd2d260fb25f9..6241f9349f6b6 100644
> --- a/fs/afs/rxrpc.c
> +++ b/fs/afs/rxrpc.c
[ ... ]
> @@ -128,6 +128,7 @@ void afs_close_socket(struct afs_net *net)
>  	_enter("");
>  
>  	cancel_work_sync(&net->charge_preallocation_work);
> +	cancel_work_sync(&net->rx_oob_work);
>  	kernel_listen(net->socket, 0);
>  	flush_workqueue(afs_async_calls);
>  	flush_workqueue(afs_wq);

[Severity: High]
This is a pre-existing issue, but does placing the cancellation and flush
before kernel_sock_shutdown() leave a window open for the work to be queued
afterwards?

If kernel_sock_shutdown() is what synchronously disables the notify_oob
callback from firing, couldn't a delayed callback still queue work on afs_wq
after we've already flushed it here?

[ ... ]
> @@ -985,5 +986,6 @@ static void afs_rx_notify_oob(struct sock *sk, struct sk_buff *oob)
>  {
>  	struct afs_net *net = sk->sk_user_data;
>  
> -	schedule_work(&net->rx_oob_work);
> +	if (net->live)
> +		queue_work(afs_wq, &net->rx_oob_work);
>  }

[Severity: High]
This is a pre-existing issue, but does this lockless check of net->live allow
a race condition that leads to a use-after-free?

Consider this sequence:

CPU1 (teardown)
afs_net_exit()
    net->live = false;
    afs_close_socket()
        cancel_work_sync(&net->rx_oob_work);
        flush_workqueue(afs_wq);
        kernel_sock_shutdown()  // blocks waiting for recvmsg_lock

CPU2 (incoming OOB)
rxrpc_notify_socket_oob()
    spin_lock_irq(&rx->recvmsg_lock);
    afs_rx_notify_oob()
        // Reads net->live == true (stale read, or delayed execution)
        queue_work(afs_wq, &net->rx_oob_work);
    spin_unlock_irq(&rx->recvmsg_lock);

CPU1 then finishes kernel_sock_shutdown() and frees the net namespace, but
afs_wq is never flushed again.

Would moving cancel_work_sync() and flush_workqueue() to the end of
afs_close_socket(), after kernel_sock_shutdown() completes, prevent this race?


* Re: [PATCH net 4/5] afs: Fix further netns teardown to cancel the preallocation charger
From: Simon Horman @ 2026-06-18  9:29 UTC (permalink / raw)
  To: dhowells
  Cc: Simon Horman, netdev, marc.dionne, kuba, davem, edumazet, pabeni,
	linux-afs, linux-kernel, d4n.for.sec, n05ec, jaltman, stable
In-Reply-To: <20260616155749.2125907-5-dhowells@redhat.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
afs: Fix further netns teardown to cancel the preallocation charger

This commit modifies the afs network namespace teardown process to wait for
the preallocation charger work item. It addresses a small window where the
work item could be requeued after being cancelled by flushing the workqueue
on which the charger runs.

Link: https://sashiko.dev/#/patchset/20260609140911.838677-1-dhowells%40redhat.com

> diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
> --- a/fs/afs/rxrpc.c
> +++ b/fs/afs/rxrpc.c
> @@ -130,6 +130,7 @@ void afs_close_socket(struct afs_net *net)
>  	cancel_work_sync(&net->charge_preallocation_work);
>  	kernel_listen(net->socket, 0);
>  	flush_workqueue(afs_async_calls);
> +	flush_workqueue(afs_wq);

[Severity: High]
Will flushing the system-wide global workqueue here cause cross-namespace
stalls during teardown?

It appears afs_wq is allocated once in afs_init() as a system-wide workqueue
shared by all AFS network namespaces:

fs/afs/main.c:afs_init() {
    ...
    afs_wq = alloc_workqueue("afs", WQ_PERCPU, 0);
    ...
}

Other namespaces use afs_wq for operations like cell management, which can
block for tens of seconds waiting for userspace DNS lookups via the
request_key mechanism.

Since afs_close_socket() runs in the context of the global cleanup_net
worker thread, does this stall the destruction of all network namespaces
system-wide until unrelated AFS work across the entire system finishes?

Would it be safer to explicitly call
cancel_work_sync(&net->charge_preallocation_work) a second time here instead
of flushing the shared queue?


* Re: [PATCH net 1/5] rxrpc: input: reject ACKALL outside transmit phase
From: Simon Horman @ 2026-06-18  9:27 UTC (permalink / raw)
  To: dhowells
  Cc: Simon Horman, netdev, marc.dionne, kuba, davem, edumazet, pabeni,
	linux-afs, linux-kernel, d4n.for.sec, n05ec, jaltman, stable
In-Reply-To: <20260616155749.2125907-2-dhowells@redhat.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
rxrpc: input: reject ACKALL outside transmit phase

rxrpc_input_ackall() accepts ACKALL packets without checking whether
the call is in a state that can legitimately have outstanding transmit
buffers. A forged ACKALL can reach a new service call and trigger a null
pointer dereference.

> diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
> index ce761466b02d6..37881dffa898b 100644
> --- a/net/rxrpc/input.c
> +++ b/net/rxrpc/input.c

[ ... ]

> @@ -1214,8 +1214,22 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
>  static void rxrpc_input_ackall(struct rxrpc_call *call, struct sk_buff *skb)
>  {
>  	struct rxrpc_ack_summary summary = { 0 };
> +	rxrpc_seq_t top = READ_ONCE(call->tx_top);
> +
> +	switch (__rxrpc_call_state(call)) {
> +	case RXRPC_CALL_CLIENT_SEND_REQUEST:
> +	case RXRPC_CALL_CLIENT_AWAIT_REPLY:
> +	case RXRPC_CALL_SERVER_SEND_REPLY:
> +	case RXRPC_CALL_SERVER_AWAIT_ACK:
> +		break;
> +	default:
> +		return;
> +	}
> +
> +	if (call->tx_bottom == top)
> +		return;
>  
> -	if (rxrpc_rotate_tx_window(call, call->tx_top, &summary))
> +	if (rxrpc_rotate_tx_window(call, top, &summary))

[Severity: High]
This isn't a bug introduced by this patch, but it appears the fix might be
incomplete. The patch prevents a NULL pointer dereference in
rxrpc_input_ackall() by checking if call->tx_bottom == top before calling
rxrpc_rotate_tx_window().

However, does the same vulnerability remain in rxrpc_receiving_reply()?

If a client in the RXRPC_CALL_CLIENT_SEND_REQUEST state receives a forged or
early DATA packet (acting as a reply) before it has queued any transmit data,
call->tx_top and call->tx_bottom will both be 0, and call->tx_queue will be
NULL.

rxrpc_receiving_reply() will then unconditionally call:

    if (!rxrpc_rotate_tx_window(call, top, &summary)) {

Because rxrpc_rotate_tx_window() uses a do { ... } while (before_eq(seq, to))
loop, it executes exactly once for seq = 1 and to = 0, which dereferences
the NULL tq:

    if (tq->bufs[ix]->flags & RXRPC_LAST_PACKET) {

Should the empty window check be placed at the top of rxrpc_rotate_tx_window()
instead, or mirrored in rxrpc_receiving_reply() to protect against early
replies?

>  		rxrpc_end_tx_phase(call, false, rxrpc_eproto_unexpected_ackall);
>  }
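
The loop behaviour described above can be checked with a simplified
userspace model. This is a sketch only: it assumes before_eq() is the
usual wrapping sequence-number comparison and that the rotation starts at
bottom + 1, as the review describes. rotate_iterations() is a hypothetical
helper that counts loop passes, not the kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrapping "seq1 <= seq2" comparison on 32-bit sequence numbers. */
static bool before_eq(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) <= 0;
}

/*
 * Count how often a do/while body runs when rotating from bottom + 1
 * up to and including 'to'. A do/while always runs its body at least
 * once, even when the window is empty (bottom == to).
 */
static unsigned int rotate_iterations(uint32_t bottom, uint32_t to)
{
	uint32_t seq = bottom + 1;
	unsigned int n = 0;

	do {
		n++;	/* the real loop would touch the Tx queue here */
		seq++;
	} while (before_eq(seq, to));

	return n;
}
```

For an empty window (bottom == to == 0) the body still runs once, which
is the single pass that would touch an unallocated queue. An explicit
empty-window check before the loop avoids that pass.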


* [PATCH net v2 2/2] dpaa2-switch: fix VLAN upper check not rejecting bridge join
From: Ioana Ciornei @ 2026-06-18  9:28 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev
  Cc: f.fainelli, vladimir.oltean, linux-kernel
In-Reply-To: <20260618092813.432535-1-ioana.ciornei@nxp.com>

The blamed commit refactored the prechangeupper event handling but
failed to actually return an error in case
dpaa2_switch_prevent_bridging_with_8021q_upper() detected an 802.1q upper
on a port which tries to join a bridge. Fix this by returning err
instead of 0.

Fixes: 45035febc495 ("net: dpaa2-switch: refactor prechangeupper sanity checks")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
- none

 drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 83ccefdac59f..858ba844ac51 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2212,7 +2212,7 @@ dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
 	if (err) {
 		NL_SET_ERR_MSG_MOD(extack,
 				   "Cannot join a bridge while VLAN uppers are present");
-		return 0;
+		return err;
 	}
 
 	netdev_for_each_lower_dev(upper_dev, other_dev, iter) {
-- 
2.25.1


^ permalink raw reply related

* [PATCH net v2 1/2] dpaa2-switch: do not accept VLAN uppers while bridged
From: Ioana Ciornei @ 2026-06-18  9:28 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev
  Cc: f.fainelli, vladimir.oltean, linux-kernel
In-Reply-To: <20260618092813.432535-1-ioana.ciornei@nxp.com>

The dpaa2-switch driver does not support VLAN uppers while its ports are
bridged. The driver tried to prevent this scenario by rejecting a bridge
join while VLAN uppers exist, but the reverse order was still possible.

This patch adds a check so that the dpaa2-switch also does not accept
VLAN uppers while bridged.

Fixes: f48298d3fbfa ("staging: dpaa2-switch: move the driver out of staging")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
- patch is new

 drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 45f276c2c3ec..83ccefdac59f 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2233,6 +2233,7 @@ dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
 static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
 					    struct netdev_notifier_changeupper_info *info)
 {
+	struct ethsw_port_priv *port_priv;
 	struct netlink_ext_ack *extack;
 	struct net_device *upper_dev;
 	int err;
@@ -2251,6 +2252,13 @@ static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
 
 		if (!info->linking)
 			dpaa2_switch_port_pre_bridge_leave(netdev);
+	} else if (is_vlan_dev(upper_dev)) {
+		port_priv = netdev_priv(netdev);
+		if (port_priv->fdb->bridge_dev) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "Cannot accept VLAN uppers while bridged");
+			return -EOPNOTSUPP;
+		}
 	}
 
 	return 0;
-- 
2.25.1


^ permalink raw reply related

* [PATCH net v2 0/2] dpaa2-switch: reject VLAN uppers while bridged
From: Ioana Ciornei @ 2026-06-18  9:28 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, netdev
  Cc: f.fainelli, vladimir.oltean, linux-kernel

The dpaa2-switch driver does not support VLAN uppers on its ports while
they are bridged. The check which should have prevented a port with a
VLAN upper from joining a bridge was poorly refactored and didn't actually
return an error. Patch 2/2 fixes that.

On the other hand, the driver didn't reject the addition of a VLAN upper
while bridged. Patch 1/2 fixes that.

Changes in v2:
- added patch 1/2

Ioana Ciornei (2):
  dpaa2-switch: do not accept VLAN uppers while bridged
  dpaa2-switch: fix VLAN upper check not rejecting bridge join

 drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.25.1

^ permalink raw reply

* Re: [PATCH 1/1] selftests: net: fix file owner for broadcast_ether_dst test
From: Simon Horman @ 2026-06-18  9:21 UTC (permalink / raw)
  To: Ross Porter
  Cc: linux-kselftest, netdev, stable, Edoardo Canepa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Shuah Khan, Oscar Maes,
	Brett A C Sheffield, linux-kernel
In-Reply-To: <20260610062230.71573-2-ross.porter@canonical.com>

On Wed, Jun 10, 2026 at 06:22:29PM +1200, Ross Porter wrote:
> Ensure the output file is always owned by root (even if tcpdump was 
> compiled with `--with-user`), by passing the `-Z root` argument when 
> invoking it.

Hi Ross,

I think that the motivation, described in the cover letter,
belongs here so it can be found more easily using git.

Also, as there is only one patch in the series, the cover letter
could be dropped.

And lastly, this should be targeted at net as it's a bug fix
for code present there.

Subject: [PATCH net] ...

For more information on the Networking development workflow see
https://docs.kernel.org/process/maintainer-netdev.html

> 
> Cc: stable@vger.kernel.org
> Reported-by: Edoardo Canepa <edoardo.canepa@canonical.com>
> Closes: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2129815
> Fixes: bf59028ea8d4 ("selftests: net: add test for destination in broadcast packets")
> Suggested-by: Edoardo Canepa <edoardo.canepa@canonical.com>
> Tested-by: Ross Porter <ross.porter@canonical.com>
> Signed-off-by: Ross Porter <ross.porter@canonical.com>

...

-- 
pw-bot: changes-requested

^ permalink raw reply

* [PATCH net v2] sfc: Use acquire/release for irq_soft_enabled
From: Gui-Dong Han @ 2026-06-18  9:16 UTC (permalink / raw)
  To: netdev, linux-net-drivers, ecree.xilinx
  Cc: linux-kernel, andrew+netdev, davem, edumazet, kuba, pabeni, horms,
	baijiaju1990, Gui-Dong Han

irq_soft_enabled is a lockless gate for interrupt handlers. When it is
false, handlers acknowledge interrupts but must not touch channel state.

Channel reallocation disables the gate, swaps and initializes channel
pointers, frees old channels, and then enables the gate again. Once a
handler observes irq_soft_enabled as true, it can dereference
efx->channel[] and other channel state. That observation must therefore
be ordered after the channel state was published.

READ_ONCE() does not provide that acquire ordering. The existing
smp_wmb() in the soft-enable paths also cannot provide it because it is
after the irq_soft_enabled=true store, so it cannot publish prior channel
state before the gate becomes visible.

Use a release store only when opening the software IRQ gate, and use
acquire loads in interrupt handlers before touching channels. Use
WRITE_ONCE() when closing the gate; handlers that observe false do not
touch channel state.

Keep the existing smp_wmb() after gate updates. It preserves the
previous ordering between the software IRQ gate and subsequent event
queue setup, start and stop operations, which is separate from the
release/acquire ordering added here.

Fixes: d829118705f8 ("sfc: Rework IRQ enable/disable")
Fixes: 8127d661e77f ("sfc: Add support for Solarflare SFC9100 family")
Fixes: 5a6681e22c14 ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Fixes: 51b35a454efd ("sfc: skeleton EF100 PF driver")
Fixes: 6e173d3b4af9 ("sfc: Copy shared files needed for Siena (part 1)")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
---
v2:
- Use release ordering only when enabling the software IRQ gate.
- Use WRITE_ONCE() when disabling it.
- Expand the commit message to address review comments from Jakub
  Kicinski and Edward Cree about the release pairing and the existing
  smp_wmb().
v1: https://lore.kernel.org/netdev/20260528092838.2099352-1-hanguidong02@gmail.com/
---
 drivers/net/ethernet/sfc/ef10.c               |  4 ++--
 drivers/net/ethernet/sfc/ef100_nic.c          |  2 +-
 drivers/net/ethernet/sfc/efx_channels.c       |  4 ++--
 drivers/net/ethernet/sfc/falcon/efx.c         |  4 ++--
 drivers/net/ethernet/sfc/falcon/falcon.c      |  2 +-
 drivers/net/ethernet/sfc/falcon/farch.c       |  4 ++--
 drivers/net/ethernet/sfc/falcon/net_driver.h  | 17 +++++++++++++++++
 drivers/net/ethernet/sfc/net_driver.h         | 17 +++++++++++++++++
 drivers/net/ethernet/sfc/siena/efx_channels.c |  4 ++--
 drivers/net/ethernet/sfc/siena/farch.c        |  4 ++--
 drivers/net/ethernet/sfc/siena/net_driver.h   | 17 +++++++++++++++++
 11 files changed, 65 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 7e04f115bbaa..a907303497f9 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -2143,7 +2143,7 @@ static irqreturn_t efx_ef10_msi_interrupt(int irq, void *dev_id)
 	netif_vdbg(efx, intr, efx->net_dev,
 		   "IRQ %d on CPU %d\n", irq, raw_smp_processor_id());
 
-	if (likely(READ_ONCE(efx->irq_soft_enabled))) {
+	if (likely(efx_irq_soft_enabled(efx))) {
 		/* Note test interrupts */
 		if (context->index == efx->irq_level)
 			efx->last_irq_cpu = raw_smp_processor_id();
@@ -2158,7 +2158,7 @@ static irqreturn_t efx_ef10_msi_interrupt(int irq, void *dev_id)
 static irqreturn_t efx_ef10_legacy_interrupt(int irq, void *dev_id)
 {
 	struct efx_nic *efx = dev_id;
-	bool soft_enabled = READ_ONCE(efx->irq_soft_enabled);
+	bool soft_enabled = efx_irq_soft_enabled(efx);
 	struct efx_channel *channel;
 	efx_dword_t reg;
 	u32 queues;
diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index 00050f786cae..7885b3a5a398 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -333,7 +333,7 @@ static irqreturn_t ef100_msi_interrupt(int irq, void *dev_id)
 	netif_vdbg(efx, intr, efx->net_dev,
 		   "IRQ %d on CPU %d\n", irq, raw_smp_processor_id());
 
-	if (likely(READ_ONCE(efx->irq_soft_enabled))) {
+	if (likely(efx_irq_soft_enabled(efx))) {
 		/* Note test interrupts */
 		if (context->index == efx->irq_level)
 			efx->last_irq_cpu = raw_smp_processor_id();
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c
index f4dc3f3f4416..103d2a02bf5f 100644
--- a/drivers/net/ethernet/sfc/efx_channels.c
+++ b/drivers/net/ethernet/sfc/efx_channels.c
@@ -972,7 +972,7 @@ int efx_soft_enable_interrupts(struct efx_nic *efx)
 
 	BUG_ON(efx->state == STATE_DISABLED);
 
-	efx->irq_soft_enabled = true;
+	efx_irq_soft_enable(efx);
 	smp_wmb();
 
 	efx_for_each_channel(channel, efx) {
@@ -1009,7 +1009,7 @@ void efx_soft_disable_interrupts(struct efx_nic *efx)
 
 	efx_mcdi_mode_poll(efx);
 
-	efx->irq_soft_enabled = false;
+	efx_irq_soft_disable(efx);
 	smp_wmb();
 
 	if (efx->legacy_irq)
diff --git a/drivers/net/ethernet/sfc/falcon/efx.c b/drivers/net/ethernet/sfc/falcon/efx.c
index 0c197b448645..a61cc2c84b78 100644
--- a/drivers/net/ethernet/sfc/falcon/efx.c
+++ b/drivers/net/ethernet/sfc/falcon/efx.c
@@ -1460,7 +1460,7 @@ static int ef4_soft_enable_interrupts(struct ef4_nic *efx)
 
 	BUG_ON(efx->state == STATE_DISABLED);
 
-	efx->irq_soft_enabled = true;
+	ef4_irq_soft_enable(efx);
 	smp_wmb();
 
 	ef4_for_each_channel(channel, efx) {
@@ -1493,7 +1493,7 @@ static void ef4_soft_disable_interrupts(struct ef4_nic *efx)
 	if (efx->state == STATE_DISABLED)
 		return;
 
-	efx->irq_soft_enabled = false;
+	ef4_irq_soft_disable(efx);
 	smp_wmb();
 
 	if (efx->legacy_irq)
diff --git a/drivers/net/ethernet/sfc/falcon/falcon.c b/drivers/net/ethernet/sfc/falcon/falcon.c
index fb1d19b7c419..0c0e00412689 100644
--- a/drivers/net/ethernet/sfc/falcon/falcon.c
+++ b/drivers/net/ethernet/sfc/falcon/falcon.c
@@ -449,7 +449,7 @@ static irqreturn_t falcon_legacy_interrupt_a1(int irq, void *dev_id)
 		   "IRQ %d on CPU %d status " EF4_OWORD_FMT "\n",
 		   irq, raw_smp_processor_id(), EF4_OWORD_VAL(*int_ker));
 
-	if (!likely(READ_ONCE(efx->irq_soft_enabled)))
+	if (!likely(ef4_irq_soft_enabled(efx)))
 		return IRQ_HANDLED;
 
 	/* Check to see if we have a serious error condition */
diff --git a/drivers/net/ethernet/sfc/falcon/farch.c b/drivers/net/ethernet/sfc/falcon/farch.c
index 23d507a3820d..291165db7933 100644
--- a/drivers/net/ethernet/sfc/falcon/farch.c
+++ b/drivers/net/ethernet/sfc/falcon/farch.c
@@ -1500,7 +1500,7 @@ irqreturn_t ef4_farch_fatal_interrupt(struct ef4_nic *efx)
 irqreturn_t ef4_farch_legacy_interrupt(int irq, void *dev_id)
 {
 	struct ef4_nic *efx = dev_id;
-	bool soft_enabled = READ_ONCE(efx->irq_soft_enabled);
+	bool soft_enabled = ef4_irq_soft_enabled(efx);
 	ef4_oword_t *int_ker = efx->irq_status.addr;
 	irqreturn_t result = IRQ_NONE;
 	struct ef4_channel *channel;
@@ -1592,7 +1592,7 @@ irqreturn_t ef4_farch_msi_interrupt(int irq, void *dev_id)
 		   "IRQ %d on CPU %d status " EF4_OWORD_FMT "\n",
 		   irq, raw_smp_processor_id(), EF4_OWORD_VAL(*int_ker));
 
-	if (!likely(READ_ONCE(efx->irq_soft_enabled)))
+	if (!likely(ef4_irq_soft_enabled(efx)))
 		return IRQ_HANDLED;
 
 	/* Handle non-event-queue sources */
diff --git a/drivers/net/ethernet/sfc/falcon/net_driver.h b/drivers/net/ethernet/sfc/falcon/net_driver.h
index 7ab0db44720d..9880fff59f9d 100644
--- a/drivers/net/ethernet/sfc/falcon/net_driver.h
+++ b/drivers/net/ethernet/sfc/falcon/net_driver.h
@@ -1305,6 +1305,23 @@ static inline netdev_features_t ef4_supported_features(const struct ef4_nic *efx
 	return net_dev->features | net_dev->hw_features;
 }
 
+static inline void ef4_irq_soft_enable(struct ef4_nic *efx)
+{
+	/* Publish channel state before opening the IRQ handler gate. */
+	smp_store_release(&efx->irq_soft_enabled, true);
+}
+
+static inline void ef4_irq_soft_disable(struct ef4_nic *efx)
+{
+	WRITE_ONCE(efx->irq_soft_enabled, false);
+}
+
+static inline bool ef4_irq_soft_enabled(struct ef4_nic *efx)
+{
+	/* Pair with ef4_irq_soft_enable() before touching channels. */
+	return smp_load_acquire(&efx->irq_soft_enabled);
+}
+
 /* Get the current TX queue insert index. */
 static inline unsigned int
 ef4_tx_queue_get_insert_index(const struct ef4_tx_queue *tx_queue)
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index b98c259f672d..c172b3504e61 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -1731,6 +1731,23 @@ static inline void efx_xmit_hwtstamp_pending(struct sk_buff *skb)
 	skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 }
 
+static inline void efx_irq_soft_enable(struct efx_nic *efx)
+{
+	/* Publish channel state before opening the IRQ handler gate. */
+	smp_store_release(&efx->irq_soft_enabled, true);
+}
+
+static inline void efx_irq_soft_disable(struct efx_nic *efx)
+{
+	WRITE_ONCE(efx->irq_soft_enabled, false);
+}
+
+static inline bool efx_irq_soft_enabled(struct efx_nic *efx)
+{
+	/* Pair with efx_irq_soft_enable() before touching channels. */
+	return smp_load_acquire(&efx->irq_soft_enabled);
+}
+
 /* Get the max fill level of the TX queues on this channel */
 static inline unsigned int
 efx_channel_tx_fill_level(struct efx_channel *channel)
diff --git a/drivers/net/ethernet/sfc/siena/efx_channels.c b/drivers/net/ethernet/sfc/siena/efx_channels.c
index 1fc343598771..55123f8322a7 100644
--- a/drivers/net/ethernet/sfc/siena/efx_channels.c
+++ b/drivers/net/ethernet/sfc/siena/efx_channels.c
@@ -1004,7 +1004,7 @@ static int efx_soft_enable_interrupts(struct efx_nic *efx)
 
 	BUG_ON(efx->state == STATE_DISABLED);
 
-	efx->irq_soft_enabled = true;
+	efx_irq_soft_enable(efx);
 	smp_wmb();
 
 	efx_for_each_channel(channel, efx) {
@@ -1041,7 +1041,7 @@ static void efx_soft_disable_interrupts(struct efx_nic *efx)
 
 	efx_siena_mcdi_mode_poll(efx);
 
-	efx->irq_soft_enabled = false;
+	efx_irq_soft_disable(efx);
 	smp_wmb();
 
 	if (efx->legacy_irq)
diff --git a/drivers/net/ethernet/sfc/siena/farch.c b/drivers/net/ethernet/sfc/siena/farch.c
index 7613d7988894..208cc499c747 100644
--- a/drivers/net/ethernet/sfc/siena/farch.c
+++ b/drivers/net/ethernet/sfc/siena/farch.c
@@ -1514,7 +1514,7 @@ irqreturn_t efx_farch_fatal_interrupt(struct efx_nic *efx)
 irqreturn_t efx_farch_legacy_interrupt(int irq, void *dev_id)
 {
 	struct efx_nic *efx = dev_id;
-	bool soft_enabled = READ_ONCE(efx->irq_soft_enabled);
+	bool soft_enabled = efx_irq_soft_enabled(efx);
 	efx_oword_t *int_ker = efx->irq_status.addr;
 	irqreturn_t result = IRQ_NONE;
 	struct efx_channel *channel;
@@ -1606,7 +1606,7 @@ irqreturn_t efx_farch_msi_interrupt(int irq, void *dev_id)
 		   "IRQ %d on CPU %d status " EFX_OWORD_FMT "\n",
 		   irq, raw_smp_processor_id(), EFX_OWORD_VAL(*int_ker));
 
-	if (!likely(READ_ONCE(efx->irq_soft_enabled)))
+	if (!likely(efx_irq_soft_enabled(efx)))
 		return IRQ_HANDLED;
 
 	/* Handle non-event-queue sources */
diff --git a/drivers/net/ethernet/sfc/siena/net_driver.h b/drivers/net/ethernet/sfc/siena/net_driver.h
index 4cf556782133..73bc42a854e2 100644
--- a/drivers/net/ethernet/sfc/siena/net_driver.h
+++ b/drivers/net/ethernet/sfc/siena/net_driver.h
@@ -1624,6 +1624,23 @@ static inline void efx_xmit_hwtstamp_pending(struct sk_buff *skb)
 	skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 }
 
+static inline void efx_irq_soft_enable(struct efx_nic *efx)
+{
+	/* Publish channel state before opening the IRQ handler gate. */
+	smp_store_release(&efx->irq_soft_enabled, true);
+}
+
+static inline void efx_irq_soft_disable(struct efx_nic *efx)
+{
+	WRITE_ONCE(efx->irq_soft_enabled, false);
+}
+
+static inline bool efx_irq_soft_enabled(struct efx_nic *efx)
+{
+	/* Pair with efx_irq_soft_enable() before touching channels. */
+	return smp_load_acquire(&efx->irq_soft_enabled);
+}
+
 /* Get the max fill level of the TX queues on this channel */
 static inline unsigned int
 efx_channel_tx_fill_level(struct efx_channel *channel)
-- 
2.34.1

^ permalink raw reply related

* Re: [PATCH v2] net: mvneta: free/request IRQ across suspend/resume
From: Zhou, Yun @ 2026-06-18  9:14 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: marcin.s.wojtas, andrew+netdev, davem, edumazet, kuba, pabeni,
	clrkwllms, rostedt, netdev, linux-kernel, linux-rt-devel
In-Reply-To: <20260618083952.IbGzrvJL@linutronix.de>


On 6/18/26 16:39, Sebastian Andrzej Siewior wrote:
> On 2026-06-17 17:20:28 [+0800], Yun Zhou wrote:
>> On PREEMPT_RT, the mvneta IRQ handler is force-threaded. Under high
> There is also the `threadirqs' option.
>
>> network traffic, the IRQ can enter suspend with desc->depth == 1
>> (masked by the oneshot mechanism between handler invocations).
> That would be irq_desc::depth.
>
>> During suspend, the kernel increments depth to 2 and masks the
>> interrupt at the MPIC level (clearing the SRC_CTL CPU routing bit,
>> due to IRQCHIP_MASK_ON_SUSPEND).
> The interrupt should be masked while the depth counter goes 0->1, no?
>
>>                                   On resume, depth is decremented
>> back to 1, but since it does not reach 0, the unmask is never
>> called. The MPIC CPU routing remains cleared, permanently disabling
>> interrupt delivery.
> But why not? In my naive assumption, we get into suspend with
> irq_desc::depth = 2 and the threaded handler should be woken up. Once the
> threaded handler is done the counter should decrement by one. Then again
> during resume reaching 0 leading to the unmask. If the thread handler is
> frozen and defrosted on resume then it should still happen but in
> different order.
>
> Something is missing here based on my naive assumption.
>
>> Fix by freeing the IRQ in suspend and re-requesting it in resume.
>> This ensures a clean IRQ state (depth=0, proper hardware routing)
>> on every resume cycle, regardless of the pre-suspend depth. This
>> follows the approach used by other drivers (e.g. igb).
> igb shuts down the device entirely, not just freeing the IRQ.
You are right. The original analysis was wrong — mvneta uses
request_percpu_irq() which sets IRQF_NO_SUSPEND, so the PM framework
never touches this IRQ. The depth never changes from 1.

The actual root cause is simpler: mvneta_percpu_isr() calls
disable_percpu_irq() before scheduling NAPI, and enable_percpu_irq()
is called in napi_complete_done(). If suspend hits during active NAPI
polling, the MPIC percpu IRQ stays masked after resume because
mvneta_start_dev() doesn't restore it.

Will send a v3 with the correct one-liner fix (enable_percpu_irq in
the resume path). Apologies for the incorrect analysis.

BR,
Yun

^ permalink raw reply

* Re: [PATCH bpf] bpf: zero-initialize the fib lookup flow struct
From: Toke Høiland-Jørgensen @ 2026-06-18  9:13 UTC (permalink / raw)
  To: Avinash Duduskar, ast, daniel, andrii
  Cc: bpf, davem, dsahern, eddyz87, edumazet, emil, horms,
	john.fastabend, jolsa, kuba, linux-kernel, martin.lau, memxor,
	netdev, pabeni, sdf, song, yonghong.song
In-Reply-To: <20260617224719.1428599-1-avinash.duduskar@gmail.com>

Avinash Duduskar <avinash.duduskar@gmail.com> writes:

> bpf_ipv4_fib_lookup() and bpf_ipv6_fib_lookup() build the flow key on
> the stack with a bare "struct flowi4 fl4;" / "struct flowi6 fl6;" and
> fill it field by field, but never set flowi4_l3mdev / flowi6_l3mdev.
>
> On the non-DIRECT path the lookup goes through the fib rules whenever the
> netns has custom rules, which a VRF installs:
>
> 	bpf_ipv4_fib_lookup() -> fib_lookup() -> __fib_lookup()
> 	  -> l3mdev_update_flow()   reads !fl->flowi_l3mdev
> 	  -> fib_rules_lookup() -> fib_rule_match()
> 	       -> l3mdev_fib_rule_match()   uses fl->flowi_l3mdev
>
> l3mdev_update_flow() resolves the l3mdev master from the ingress device
> only while the field is still zero. Left at a nonzero stack value the
> resolution is skipped, and l3mdev_fib_rule_match() then tests that value
> as an ifindex, so the VRF master is not resolved and the rule fails to
> match: an ingress enslaved to a VRF can fail to select its table. FIB
> rules matching on an L3 master device (l3mdev_fib_rule_iif_match()/
> _oif_match()) read the same value, so an "ip rule iif/oif <vrf>"
> mismatches the same way.
>
> Zero-initialize the whole flow struct rather than adding one more
> field assignment, so any flowi field added later is covered too.
> ip_route_input_slow() likewise zeroes the field before its input lookup.
>
> CONFIG_INIT_STACK_ALL_ZERO masks this by default, but it depends on
> compiler support (CC_HAS_AUTO_VAR_INIT_ZERO), so INIT_STACK_NONE builds,
> including older toolchains that fall back to it, are exposed. Built with
> INIT_STACK_ALL_PATTERN, a plain bpf_fib_lookup (no VLAN, no DIRECT) over a
> VRF slave whose destination is routed only in the VRF table returns
> BPF_FIB_LKUP_RET_NOT_FWDED, and resolves with this patch. On the default
> config the lookup succeeds either way, so ordinary testing does not catch
> the bug.
>
> Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
> Signed-off-by: Avinash Duduskar <avinash.duduskar@gmail.com>

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply

* Re: [PATCH net] netconsole: don't drop the last byte of a full-sized message
From: Simon Horman @ 2026-06-18  9:13 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, asantostc, gustavold,
	kernel-team
In-Reply-To: <20260616-max_print_chunk-v1-1-8dc125d67083@debian.org>

On Tue, Jun 16, 2026 at 09:09:52AM -0700, Breno Leitao wrote:
> nt->buf is exactly MAX_PRINT_CHUNK bytes, but scnprintf() reserves one
> byte for its NUL terminator, so a non-fragmented payload of exactly
> MAX_PRINT_CHUNK loses its last byte (emitted as a stray NUL in the
> release path). Grow nt->buf to MAX_PRINT_CHUNK + 1 and bound the
> scnprintf() calls with sizeof(nt->buf); the transmitted length stays
> capped at MAX_PRINT_CHUNK.
> 
> Alternatively, nt->buf could be left at MAX_PRINT_CHUNK and the NUL byte
> reserved by routing exactly-MAX_PRINT_CHUNK payloads to fragmentation
> ('len < MAX_PRINT_CHUNK'), at the cost of fragmenting those messages.
> But it would look less sane, thus the current approach.
> 
> Fixes: c62c0a17f9b7 ("netconsole: Append kernel version to message")
> Signed-off-by: Breno Leitao <leitao@debian.org>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH net] net: ethernet: ti: icssg: guard PA stat lookups
From: Simon Horman @ 2026-06-18  9:10 UTC (permalink / raw)
  To: Philippe Schenker
  Cc: netdev, Philippe Schenker, danishanwar, rogerq, linux-arm-kernel,
	stable, Andrew Lunn, David Carlier, David S. Miller, Eric Dumazet,
	Jacob Keller, Jakub Kicinski, Kevin Hao, Meghana Malladi,
	Paolo Abeni, Vadim Fedorenko, linux-kernel
In-Reply-To: <20260616143642.1972071-1-dev@pschenker.ch>

On Tue, Jun 16, 2026 at 04:35:34PM +0200, Philippe Schenker wrote:
> From: Philippe Schenker <philippe.schenker@impulsing.ch>
> 
> icssg_ndo_get_stats64() unconditionally calls emac_get_stat_by_name()
> with FW PA stat names regardless of whether the PA stats block is
> present on the hardware.  emac_get_stat_by_name() already guards the
> PA stats lookup with `if (emac->prueth->pa_stats)`; when that pointer
> is NULL the lookup falls through to netdev_err() and returns -EINVAL.
> Because ndo_get_stats64 is polled regularly by the networking stack
> this produces thousands of log entries of the form:
> 
>   icssg-prueth icssg1-eth end0: Invalid stats FW_RX_ERROR
> 
> A secondary consequence is that the int(-EINVAL) return value is
> implicitly widened to a near-ULLONG_MAX unsigned value when accumulated
> into the __u64 fields of rtnl_link_stats64, silently corrupting the
> rx_errors, rx_dropped and tx_dropped counters reported by `ip -s link`.
> 
> Every other PA-aware code path in the driver is already guarded with
> the same `if (emac->prueth->pa_stats)` check.  Apply the same guard
> here.
> 
> Fixes: 0d15a26b247d ("net: ti: icssg-prueth: Add ICSSG FW Stats")

nit: no blank line between tags

> 
> Signed-off-by: Philippe Schenker <philippe.schenker@impulsing.ch>
> 
> Cc: danishanwar@ti.com
> Cc: rogerq@kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: stable@vger.kernel.org

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH v2] net: mvneta: free/request IRQ across suspend/resume
From: Zhou, Yun @ 2026-06-18  9:03 UTC (permalink / raw)
  To: Maxime Chevallier, marcin.s.wojtas, andrew+netdev, davem,
	edumazet, kuba, pabeni, bigeasy, clrkwllms, rostedt
  Cc: netdev, linux-kernel, linux-rt-devel
In-Reply-To: <95249596-5f05-421c-9c8a-693c7b26c4f6@bootlin.com>

On 6/17/26 20:49, Maxime Chevallier wrote:
> Hi,
>
> On 6/17/26 11:20, Yun Zhou wrote:
>> On PREEMPT_RT, the mvneta IRQ handler is force-threaded. Under high
>> network traffic, the IRQ can enter suspend with desc->depth == 1
>> (masked by the oneshot mechanism between handler invocations).
>>
>> During suspend, the kernel increments depth to 2 and masks the
>> interrupt at the MPIC level (clearing the SRC_CTL CPU routing bit,
>> due to IRQCHIP_MASK_ON_SUSPEND). On resume, depth is decremented
>> back to 1, but since it does not reach 0, the unmask is never
>> called. The MPIC CPU routing remains cleared, permanently disabling
>> interrupt delivery.
>>
>> Fix by freeing the IRQ in suspend and re-requesting it in resume.
>> This ensures a clean IRQ state (depth=0, proper hardware routing)
>> on every resume cycle, regardless of the pre-suspend depth. This
>> follows the approach used by other drivers (e.g. igb).
> This description makes it sound like it's not really a mvneta problem,
> but rather a broader effect from  preempt-rt / irq management / suspend
> interactions.
>
> Is this the expected way to deal with that ?
>
You were right to question this. After deeper investigation, I found
that the original analysis was incorrect.

The real root cause is entirely within the mvneta driver:

mvneta_percpu_isr() calls disable_percpu_irq() to mask the MPIC percpu
IRQ before scheduling NAPI. The corresponding enable_percpu_irq() is
called in napi_complete_done(). If suspend occurs during active NAPI
polling (between disable and enable), the MPIC percpu IRQ remains
masked after resume — mvneta_start_dev() only restores the NIC-level
INTR_NEW_MASK register, not the irqchip-level per-CPU mask.

The fix is a one-liner: call on_each_cpu(mvneta_percpu_enable) in the
resume path to ensure the MPIC percpu IRQ is unmasked. I will send a
v3 with the correct fix and updated description.

The previous free_irq/request_irq approach happened to work as a
side-effect (request_percpu_irq → enable_percpu_irq restores the mask),
but it was fixing the symptom rather than the actual cause.

Thank you very much for your rigorous review,
Yun

^ permalink raw reply

* [PATCH v6.6-v6.1] netfilter: nf_tables: always walk all pending catchall elements
From: Shivani Agarwal @ 2026-06-18  8:34 UTC (permalink / raw)
  To: stable, gregkh
  Cc: pablo, fw, phil, davem, edumazet, kuba, pabeni, horms,
	netfilter-devel, coreteam, netdev, linux-kernel, ajay.kaher,
	alexey.makhalov, vamsi-krishna.brahmajosyula, yin.ding,
	tapas.kundu, Yiming Qian, Sasha Levin, Shivani Agarwal

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 7cb9a23d7ae40a702577d3d8bacb7026f04ac2a9 ]

During transaction processing we might have more than one catchall element:
1 live catchall element and 1 pending element that is coming as part of the
new batch.

If the map holding the catchall elements is also going away, it's
required to toggle all catchall elements and not just the first viable
candidate.

Otherwise, we get:
 WARNING: ./include/net/netfilter/nf_tables.h:1281 at nft_data_release+0xb7/0xe0 [nf_tables], CPU#2: nft/1404
 RIP: 0010:nft_data_release+0xb7/0xe0 [nf_tables]
 [..]
 __nft_set_elem_destroy+0x106/0x380 [nf_tables]
 nf_tables_abort_release+0x348/0x8d0 [nf_tables]
 nf_tables_abort+0xcf2/0x3ac0 [nf_tables]
 nfnetlink_rcv_batch+0x9c9/0x20e0 [..]

Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Shivani: Modified to apply on v6.6.y-v6.1.y ]
Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com>
---
 net/netfilter/nf_tables_api.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 196ac4e76..0581f6479 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -620,7 +620,6 @@ static void nft_map_catchall_deactivate(const struct nft_ctx *ctx,
 
 		elem.priv = catchall->elem;
 		nft_setelem_data_deactivate(ctx->net, set, &elem);
-		break;
 	}
 }
 
@@ -5241,7 +5240,6 @@ static void nft_map_catchall_activate(const struct nft_ctx *ctx,
 
 		elem.priv = catchall->elem;
 		nft_setelem_data_activate(ctx->net, set, &elem);
-		break;
 	}
 }
 
-- 
2.53.0


^ permalink raw reply related

* RE: [EXTERNAL] [PATCH net v2] net: marvell: prestera: initialize err in prestera_port_sfp_bind
From: Elad Nachman @ 2026-06-18  8:55 UTC (permalink / raw)
  To: Ruoyu Wang, Taras Chornyi, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Russell King,
	Oleksandr Mazur, Yevhen Orlov, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20260617193228.1653582-1-ruoyuw560@gmail.com>

> 
> 
> From: Ruoyu Wang <ruoyuw560@gmail.com>
> Sent: Wednesday, June 17, 2026 10:32 PM
> To: Taras Chornyi <taras.chornyi@plvision.eu>; Andrew Lunn <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Russell King <linux@armlinux.org.uk>; Oleksandr Mazur <oleksandr.mazur@plvision.eu>; Yevhen Orlov <yevhen.orlov@plvision.eu>; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [EXTERNAL] [PATCH net v2] net: marvell: prestera: initialize err in prestera_port_sfp_bind
> 
> 
> prestera_port_sfp_bind() returns err after walking the ports node. If no
> child node matches the port's front-panel id, err is never assigned.
> 
> Initialize err to 0 because absence of a matching optional port device
> tree node is not an error. In that case no phylink is created and port
> creation should continue with port->phy_link left NULL. Errors from
> malformed matched nodes and phylink_create() still propagate.
> 
> Fixes: 52323ef75414 ("net: marvell: prestera: add phylink support")
> Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com>
> ---
> v2:
> - Add net tree target to the subject.
> - Explain why the no-match path returns 0 instead of -ENODEV.
> 
>  drivers/net/ethernet/marvell/prestera/prestera_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> index 41e19e9ad28d4..a82e7a8029851 100644
> --- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
> +++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> @@ -373,7 +373,7 @@ static int prestera_port_sfp_bind(struct prestera_port *port)
>  	struct device_node *ports, *node;
>  	struct fwnode_handle *fwnode;
>  	struct phylink *phy_link;
> -	int err;
> +	int err = 0;
> 
>  	if (!sw->np)
>  		return 0;
> --
> 2.51.0
> 

prestera_port_sfp_bind() iterates only over SFP ports.
Although all currently existing switch boards have at least one SFP uplink port,
in theory a manufacturer might produce a switch board without any SFP ports,
which would make this function call fail unnecessarily. To handle this case,
err should indeed be initialized to zero so that the function returns 0 and not an error.
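
To illustrate the no-match case, here is a minimal userspace sketch. It is
not the driver code: struct fake_node and sfp_bind_sketch() are hypothetical
stand-ins for the device tree walk, and only the "err = 0" initialization
mirrors the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a child node of the ports node. */
struct fake_node {
	int fp_id;		/* front-panel id advertised by the node */
	int bind_result;	/* what a real bind attempt would return */
};

/*
 * Walk the nodes and "bind" the one matching fp_id. Because err starts
 * at 0, a walk that finds no matching node returns success instead of
 * an indeterminate value. An error from a matched node still propagates.
 */
static int sfp_bind_sketch(const struct fake_node *nodes, size_t n, int fp_id)
{
	int err = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (nodes[i].fp_id != fp_id)
			continue;
		err = nodes[i].bind_result;
		break;
	}

	return err;
}
```

With an empty node list, or a list with no matching id, the sketch returns 0;
a matched node whose bind fails still returns its negative error code.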

Acked-by: Elad Nachman <enachman@marvell.com>

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH 1/2] igc: Wait for MAC passthrough after reset
From: Ruinskiy, Dima @ 2026-06-18  8:51 UTC (permalink / raw)
  To: Loktionov, Aleksandr, kao, acelan, Nguyen, Anthony L,
	Kitszel, Przemyslaw
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, intel-wired-lan@lists.osuosl.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <IA3PR11MB8986B77F49DF672178FEE4BCE5E32@IA3PR11MB8986.namprd11.prod.outlook.com>

On 18/06/2026 10:55, Loktionov, Aleksandr wrote:
> 
> 
>> -----Original Message-----
>> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
>> Of Chia-Lin Kao (AceLan) via Intel-wired-lan
>> Sent: Thursday, June 18, 2026 9:33 AM
>> To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
>> Przemyslaw <przemyslaw.kitszel@intel.com>
>> Cc: Andrew Lunn <andrew+netdev@lunn.ch>; David S. Miller
>> <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub
>> Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; intel-
>> wired-lan@lists.osuosl.org; netdev@vger.kernel.org; linux-
>> kernel@vger.kernel.org
>> Subject: [Intel-wired-lan] [PATCH 1/2] igc: Wait for MAC passthrough
>> after reset
>>
>> Some systems support MAC passthrough for dock Ethernet controllers by
>> having firmware rewrite the receive address registers after the
>> controller reset completes.
>>
>> igc resets the controller before reading RAL0/RAH0, so that reset can
>> restore the controller native MAC address temporarily. If the driver
>> reads the registers immediately, it can race the firmware rewrite and
>> keep the native dock MAC instead of the host passthrough MAC.
>>
>> For LMVP devices, poll RAL0/RAH0 after reset and before reading the
>> MAC address. Stop once the address registers change to another valid
>> Ethernet address, allowing firmware a bounded window to complete the
>> passthrough update.
>>
> Good day, Chia-Lin
> 
> It'd be great if you could share more details on how to reproduce the issue.
> 
> What exact hardware setup is affected (dock model, NIC, system)?
> Which firmware/BIOS version?
> How often does the race trigger?
> Do you have a way to reliably reproduce it?
> 
> Also, what is the observed behavior vs. expected behavior? For example,
> which MAC address is seen and which one should be used?
> 
In addition to that, I would ask: when the race triggers, how much
wait time do you need to reliably resolve it (i.e., for the FW to have
completed the MAC update)?

100 iterations of 100 msec each translates to up to 10 seconds, no?
The weak spot here is an LMvP system where MAC passthrough has not been
enabled. There you will always wait for the full 10 seconds after every
reset until you give up and just continue with the default MAC. Hardly
desirable behavior.

We've implemented something like this in another driver at one point, 
and the default polling timeout there is 1 second (which does not affect 
the UX too much).

A better way may be using a FW interrupt to notify the driver when the 
MAC address has been updated. The usability of this approach depends on 
whether it is possible to update the MAC address up the stack after the 
device has already been initialized. Does the framework support this?
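
To make the timing argument concrete, here is a minimal userspace sketch of
a poll loop with an explicit time budget. All names (poll_for_change,
struct fake_fw, fake_read) are hypothetical and not from igc or any other
driver; the sleep hook is optional so the loop can be exercised without
actually waiting.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader returning the current address register value. */
typedef uint64_t (*read_addr_fn)(void *ctx);

struct poll_result {
	bool changed;		/* address differed from its initial value */
	unsigned int waited_ms;	/* total time spent waiting */
};

/* Simulated firmware: the value changes once change_at reads have happened. */
struct fake_fw {
	unsigned int reads;
	unsigned int change_at;
};

static uint64_t fake_read(void *ctx)
{
	struct fake_fw *fw = ctx;

	return fw->reads++ >= fw->change_at ? 2 : 1;
}

/*
 * Poll every interval_ms until the value changes or budget_ms is spent.
 * sleep_ms may be NULL, in which case time is only accounted, not waited.
 */
static struct poll_result poll_for_change(read_addr_fn read, void *ctx,
					  void (*sleep_ms)(unsigned int),
					  unsigned int interval_ms,
					  unsigned int budget_ms)
{
	struct poll_result res = { false, 0 };
	uint64_t orig;

	if (!interval_ms)
		return res;

	orig = read(ctx);
	while (res.waited_ms + interval_ms <= budget_ms) {
		if (sleep_ms)
			sleep_ms(interval_ms);
		res.waited_ms += interval_ms;
		if (read(ctx) != orig) {
			res.changed = true;
			break;
		}
	}

	return res;
}
```

If the value never changes, a 100 msec interval with a 10000 msec budget
spends the full 10 seconds, while a 1000 msec budget gives up after 1 second.
If the value changes early, both budgets return at the same point.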

Thanks,
Dima.

> 
>> Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
>> ---
>>   drivers/net/ethernet/intel/igc/igc_main.c | 48 +++++++++++++++++++++++
>>   1 file changed, 48 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
>> index 2c9e2dfd8499..fa9752ed8bc5 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_main.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
>> @@ -11,6 +11,7 @@
>>   #include <net/pkt_sched.h>
>>   #include <linux/bpf_trace.h>
>>   #include <net/xdp_sock_drv.h>
>> +#include <linux/etherdevice.h>
>>   #include <linux/pci.h>
>>   #include <linux/mdio.h>
>>
>> @@ -69,6 +70,52 @@ static const struct pci_device_id igc_pci_tbl[] = {
>>
>>   MODULE_DEVICE_TABLE(pci, igc_pci_tbl);
>>
>> +static void igc_read_rar0(struct igc_hw *hw, u8 *addr, u32 *ral, u32 *rah)
>> +{
>> +	*ral = rd32(IGC_RAL(0));
>> +	*rah = rd32(IGC_RAH(0));
>> +
>> +	addr[0] = *ral & 0xff;
>> +	addr[1] = (*ral >> 8) & 0xff;
>> +	addr[2] = (*ral >> 16) & 0xff;
>> +	addr[3] = (*ral >> 24) & 0xff;
>> +	addr[4] = *rah & 0xff;
>> +	addr[5] = (*rah >> 8) & 0xff;
>> +}
>> +
>> +static bool igc_is_lmvp_device(struct pci_dev *pdev)
>> +{
>> +	switch (pdev->device) {
>> +	case IGC_DEV_ID_I225_LMVP:
>> +	case IGC_DEV_ID_I226_LMVP:
>> +		return true;
>> +	default:
>> +		return false;
>> +	}
>> +}
>> +
>> +static void igc_wait_for_lmvp_mac_passthrough(struct pci_dev *pdev,
>> +					      struct igc_hw *hw)
>> +{
>> +	u8 addr[ETH_ALEN] __aligned(2);
>> +	u32 orig_ral, orig_rah;
>> +	u32 ral, rah;
>> +	int i;
>> +
>> +	if (!igc_is_lmvp_device(pdev))
>> +		return;
>> +
>> +	igc_read_rar0(hw, addr, &orig_ral, &orig_rah);
>> +
>> +	for (i = 0; i < 100; i++) {
>> +		msleep(100);
>> +		igc_read_rar0(hw, addr, &ral, &rah);
>> +		if ((ral != orig_ral || rah != orig_rah) &&
>> +		    is_valid_ether_addr(addr))
>> +			return;
>> +	}
>> +}
>> +
>>   enum latency_range {
>>   	lowest_latency = 0,
>>   	low_latency = 1,
>> @@ -7259,6 +7306,7 @@ static int igc_probe(struct pci_dev *pdev,
>>   	 * known good starting state
>>   	 */
>>   	hw->mac.ops.reset_hw(hw);
>> +	igc_wait_for_lmvp_mac_passthrough(pdev, hw);
>>
>>   	if (igc_get_flash_presence_i225(hw)) {
>>   		if (hw->nvm.ops.validate(hw) < 0) {
>> --
>> 2.53.0
> 


^ permalink raw reply

* Re: [PATCH net] netpoll: run NAPI poll in softirq context to avoid rq->lock self-deadlock
From: Sebastian Andrzej Siewior @ 2026-06-18  8:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Petr Mladek, Jakub Kicinski, John Ogness, Sergey Senozhatsky,
	Vlad Poenaru, Thomas Gleixner, netdev, David S . Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, Breno Leitao,
	Clark Williams, Steven Rostedt, linux-rt-devel, linux-kernel,
	stable, Frederic Weisbecker, Ingo Molnar, Vincent Guittot,
	Dietmar Eggemann, K Prateek Nayak
In-Reply-To: <20260617111504.GK49951@noisy.programming.kicks-ass.net>

On 2026-06-17 13:15:04 [+0200], Peter Zijlstra wrote:
> 
> Can't we push all the legacy consoles into a single legacy kthread? I
> mean, converting all consoles is of course awesome, but should we really
> wait for that?

That would be 

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 85fbf1801cbe0..c72f8d7027aee 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -27,11 +27,7 @@ int devkmsg_sysctl_set_loglvl(const struct ctl_table *table, int write,
  * nbcon consoles have had their chance to print the panic messages
  * first.
  */
-#ifdef CONFIG_PREEMPT_RT
 # define force_legacy_kthread()	(true)
-#else
-# define force_legacy_kthread()	(false)
-#endif
 
 #ifdef CONFIG_PRINTK
 
and, if I remember correctly, it was limited to RT because of delayed CI
output. But this does not fix stable kernels down to 5.10 LTS.

Sebastian

^ permalink raw reply related

* RE: [Intel-wired-lan] [PATCH 1/2] igc: Wait for MAC passthrough after reset
From: Kwapulinski, Piotr @ 2026-06-18  8:49 UTC (permalink / raw)
  To: kao, acelan, Nguyen, Anthony L, Kitszel, Przemyslaw
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, intel-wired-lan@lists.osuosl.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260618073324.1843310-1-acelan.kao@canonical.com>

>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Chia-Lin Kao (AceLan) via Intel-wired-lan
>Sent: Thursday, June 18, 2026 9:33 AM
>To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>
>Cc: Andrew Lunn <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
>Subject: [Intel-wired-lan] [PATCH 1/2] igc: Wait for MAC passthrough after reset
>
>Some systems support MAC passthrough for dock Ethernet controllers by having firmware rewrite the receive address registers after the controller reset completes.
>
>igc resets the controller before reading RAL0/RAH0, so that reset can restore the controller native MAC address temporarily. If the driver reads the registers immediately, it can race the firmware rewrite and keep the native dock MAC instead of the host passthrough MAC.
>
>For LMVP devices, poll RAL0/RAH0 after reset and before reading the MAC address. Stop once the address registers change to another valid Ethernet address, allowing firmware a bounded window to complete the passthrough update.
>
>Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
>---
> drivers/net/ethernet/intel/igc/igc_main.c | 48 +++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
>diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
>index 2c9e2dfd8499..fa9752ed8bc5 100644
>--- a/drivers/net/ethernet/intel/igc/igc_main.c
>+++ b/drivers/net/ethernet/intel/igc/igc_main.c
>@@ -11,6 +11,7 @@
> #include <net/pkt_sched.h>
> #include <linux/bpf_trace.h>
> #include <net/xdp_sock_drv.h>
>+#include <linux/etherdevice.h>
> #include <linux/pci.h>
> #include <linux/mdio.h>
> 
>@@ -69,6 +70,52 @@ static const struct pci_device_id igc_pci_tbl[] = {
> 
> MODULE_DEVICE_TABLE(pci, igc_pci_tbl);
> 
>+static void igc_read_rar0(struct igc_hw *hw, u8 *addr, u32 *ral, u32 *rah)
>+{
>+	*ral = rd32(IGC_RAL(0));
>+	*rah = rd32(IGC_RAH(0));
>+
>+	addr[0] = *ral & 0xff;
>+	addr[1] = (*ral >> 8) & 0xff;
>+	addr[2] = (*ral >> 16) & 0xff;
>+	addr[3] = (*ral >> 24) & 0xff;
>+	addr[4] = *rah & 0xff;
>+	addr[5] = (*rah >> 8) & 0xff;
>+}
>+
>+static bool igc_is_lmvp_device(struct pci_dev *pdev)
>+{
>+	switch (pdev->device) {
>+	case IGC_DEV_ID_I225_LMVP:
>+	case IGC_DEV_ID_I226_LMVP:
>+		return true;
>+	default:
>+		return false;
>+	}
>+}
>+
>+static void igc_wait_for_lmvp_mac_passthrough(struct pci_dev *pdev,
>+					      struct igc_hw *hw)
>+{
>+	u8 addr[ETH_ALEN] __aligned(2);
>+	u32 orig_ral, orig_rah;
>+	u32 ral, rah;
>+	int i;
Hello AceLan
Please move ral, rah and 'i' right into the loop.
Thank you.
Piotr
>+
>+	if (!igc_is_lmvp_device(pdev))
>+		return;
>+
>+	igc_read_rar0(hw, addr, &orig_ral, &orig_rah);
>+
>+	for (i = 0; i < 100; i++) {
>+		msleep(100);
>+		igc_read_rar0(hw, addr, &ral, &rah);
>+		if ((ral != orig_ral || rah != orig_rah) &&
>+		    is_valid_ether_addr(addr))
>+			return;
>+	}
>+}
>+
> enum latency_range {
> 	lowest_latency = 0,
> 	low_latency = 1,
>@@ -7259,6 +7306,7 @@ static int igc_probe(struct pci_dev *pdev,
> 	 * known good starting state
> 	 */
> 	hw->mac.ops.reset_hw(hw);
>+	igc_wait_for_lmvp_mac_passthrough(pdev, hw);
> 
> 	if (igc_get_flash_presence_i225(hw)) {
> 		if (hw->nvm.ops.validate(hw) < 0) {
>--
>2.53.0
>
>

^ permalink raw reply

* Re: [PATCH bpf v3 1/2] bpf: Fix partial copy of non-linear test_run output
From: Paul Chaignon @ 2026-06-18  8:46 UTC (permalink / raw)
  To: Sun Jian
  Cc: bpf, netdev, linux-kselftest, linux-kernel, ast, daniel, andrii,
	martin.lau, eddyz87, memxor, song, yonghong.song, jolsa, davem,
	edumazet, kuba, pabeni, horms, shuah, hawk, john.fastabend, sdf,
	toke, lorenzo
In-Reply-To: <20260617093557.63880-2-sun.jian.kdev@gmail.com>

On Wed, Jun 17, 2026 at 05:35:56PM +0800, Sun Jian wrote:
> For non-linear test_run output, bpf_test_finish() derives the linear
> data copy length from copy_size - frag_size. This only matches the
> linear data length when copy_size is the full packet size.
> 
> When userspace provides a short data_out buffer, copy_size is clamped to
> that buffer size. If copy_size is smaller than frag_size, the computed
> length becomes negative and bpf_test_finish() returns -ENOSPC before
> copying the packet prefix or updating data_size_out.
> 
> Compute the linear data length from the packet layout instead, and clamp
> the linear copy length to copy_size. This preserves the expected
> partial-copy semantics: return -ENOSPC, copy the packet prefix that fits
> in data_out, and report the full packet length through data_size_out.
> 
> Fixes: 7855e0db150ad ("bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature")
> Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
> ---
>  net/bpf/test_run.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2bc04feadfab..f15c613aaa4e 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -453,12 +453,8 @@ static int bpf_test_finish(const union bpf_attr *kattr,
>  	}
>  
>  	if (data_out) {
> -		int len = sinfo ? copy_size - frag_size : copy_size;
> -
> -		if (len < 0) {
> -			err = -ENOSPC;
> -			goto out;
> -		}
> +		u32 head_len = size - frag_size;
> +		u32 len = min(copy_size, head_len);
>  
>  		if (copy_to_user(data_out, data, len))
>  			goto out;
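
For reference, the arithmetic can be checked with a small userspace sketch.
The helper names old_linear_len() and new_linear_len() are hypothetical and
only mirror the two computations in the hunk above; the sketch assumes
frag_size <= size, as for a well-formed non-linear packet.

```c
#include <stdint.h>

/*
 * Old computation: linear length derived from the (possibly clamped)
 * copy_size. It goes negative as soon as copy_size < frag_size.
 */
static int old_linear_len(uint32_t copy_size, uint32_t frag_size)
{
	return (int)copy_size - (int)frag_size;
}

/*
 * New computation: linear length derived from the packet layout
 * (size - frag_size), then clamped to what fits in the user buffer.
 */
static uint32_t new_linear_len(uint32_t size, uint32_t frag_size,
			       uint32_t copy_size)
{
	uint32_t head_len = size - frag_size;

	return copy_size < head_len ? copy_size : head_len;
}
```

For a 5000-byte packet with 4000 bytes in frags, a full-size buffer yields a
1000-byte linear copy either way. With a 500-byte buffer the old expression
is negative, while the new one copies the 500-byte prefix that fits.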

Acked-by: Paul Chaignon <paul.chaignon@gmail.com>


^ permalink raw reply
