Netdev List
* Re: [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx wake up
From: Xuan Zhuo @ 2026-06-15  2:48 UTC (permalink / raw)
  To: menglong8.dong
  Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni,
	minhquangbui99, kerneljasonxing, netdev, virtualization,
	linux-kernel, eperezma
In-Reply-To: <20260611025644.2431148-2-dongml2@chinatelecom.cn>

On Thu, 11 Jun 2026 10:56:43 +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dongml2@chinatelecom.cn>
>
> During packet receiving in virtio-net, the rq can be empty, which means
> "rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in
> virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring
> can be empty too, which means we can't allocate anything from
> xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag.
>
> However, if the user clean all the data in rx ring and fill the
> "fill ring" and check the XDP_RING_NEED_WAKEUP flag after
> xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx
> napi will never be scheduled: the rx ring is empty, which means we will
> never receive a packet to trigger the further recv fill. The rx ring is
> empty now, so the user will not check the flag too.
>
> Fix this by set the XDP_RING_NEED_WAKEUP flag before
> xsk_buff_alloc_batch() if both rq->vq and fill ring are empty.
>
> Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in
> rq->vq.
>
> Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side")
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
>  drivers/net/virtio_net.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f4adcfee7a80..4b5b3fa62008 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>  				   struct xsk_buff_pool *pool, gfp_t gfp)
>  {
>  	struct xdp_buff **xsk_buffs;
> +	bool need_wakeup;
>  	dma_addr_t addr;
>  	int err = 0;
>  	u32 len, i;
>  	int num;
>
> +	need_wakeup = xsk_uses_need_wakeup(pool);
>  	xsk_buffs = rq->xsk_buffs;
>
> +	/* If both rq->vq and fill ring are empty, and then the user submit
> +	 * all the chunks to the fill ring and check the wake up flag
> +	 * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(),
> +	 * we will lose the chance to wake up the rx napi, so we have to
> +	 * set the need_wakeup flag here.
> +	 */
> +	if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free)
> +		xsk_set_rx_need_wakeup(pool);

Is this condition (rq->vq being completely empty) too strict? We should trigger
the wakeup under a wider range of scenarios.

> +
>  	num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free);
>  	if (!num) {
> -		if (xsk_uses_need_wakeup(pool)) {
> +		if (need_wakeup) {
>  			xsk_set_rx_need_wakeup(pool);
>  			/* Return 0 instead of -ENOMEM so that NAPI is
>  			 * descheduled.
> @@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>  		}
>
>  		return -ENOMEM;
> -	} else {
> -		xsk_clear_rx_need_wakeup(pool);
>  	}
>
>  	len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len;
> @@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>  			goto err;
>  	}
>
> +	if (need_wakeup) {
> +		if (rq->vq->num_free)
> +			/* We have free buffers, so we'd better wake up the
> +			 * rx napi as soon as possible.
> +			 */
> +			xsk_set_rx_need_wakeup(pool);

Is the purpose of waking up the RX NAPI to invoke try_fill_recv? If so, note
that virtnet_poll does not always call try_fill_recv; the call is
conditional.

Thanks.


> +		else
> +			xsk_clear_rx_need_wakeup(pool);
> +	}
> +
>  	return num;
>
>  err:
> --
> 2.54.0
>
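
The lost-wakeup window described in the commit message can be modeled in
userspace. The following is a minimal single-threaded sketch, not driver code:
struct sim, user_refill() and driver_refill() are hypothetical names, and the
user-side step is simply called inline at the point where the race window
would open.

```c
#include <stdbool.h>

/* Hypothetical model of the state shared between the driver and user space. */
struct sim {
	int fill_ring;        /* chunks the user has posted to the fill ring */
	bool need_wakeup;     /* stands in for XDP_RING_NEED_WAKEUP */
	bool napi_scheduled;  /* set when the user kicks the rx NAPI */
};

/* User side: post chunks, then kick the driver only if the flag is set. */
static void user_refill(struct sim *s, int n)
{
	s->fill_ring += n;
	if (s->need_wakeup)
		s->napi_scheduled = true;
}

/*
 * Driver side with an empty vq. The user runs in the window between the
 * allocation attempt and the late flag update. Returns true if the rx
 * NAPI ends up scheduled, false if the wakeup was lost.
 */
static bool driver_refill(struct sim *s, bool set_flag_first)
{
	int num;

	if (set_flag_first)
		s->need_wakeup = true;   /* ordering proposed by the patch */

	num = s->fill_ring;              /* models xsk_buff_alloc_batch() */
	s->fill_ring = 0;

	user_refill(s, 8);               /* user races in here */

	if (!num)
		s->need_wakeup = true;   /* original ordering: set too late */

	return s->napi_scheduled;
}
```

With set_flag_first == false the flag is set only after the user has already
checked it, so nobody schedules the NAPI; with set_flag_first == true the user
sees the flag and kicks the driver.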


* Re: [net-next v1 2/6] net: stmmac: Checking whether priv->phylink if NULL in NCSI case
From: Minda Chen @ 2026-06-15  1:25 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Jose Abreu, Maxime Coquelin, Russell King,
	Giuseppe Cavallaro, Alexandre Torgue, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-stm32@st-md-mailman.stormreply.com,
	devicetree@vger.kernel.org
In-Reply-To: <f3a32c42-27b2-496f-b236-02c33bee1773@lunn.ch>



> 
> > +static inline bool stmmac_phylink_expects_phy(struct phylink *link) {
> > +	if (link)
> > +		return phylink_expects_phy(link);
> > +
> > +	return false;
> > +}
> > +
> > +static inline int stmmac_phylink_pcs_pre_init(struct phylink *link,
> > +struct phylink_pcs *pcs) {
> > +	if (link)
> > +		return phylink_pcs_pre_init(link, pcs);
> > +
> > +	return 0;
> > +}
> > +
> > +static inline void stmmac_phylink_start(struct phylink *link) {
> > +	if (link)
> > +		phylink_start(link);
> > +}
> > +
> > +static inline void stmmac_phylink_stop(struct phylink *link) {
> > +	if (link)
> > +		phylink_stop(link);
> > +}
> 
> Please take a step back and think about the Linux big picture architecture.
> 
> What is stmmac specific here? If you were to add NCSI support to another driver
> which uses phylink, would it need to replicate all this?
> 
> When you consider how the MAC is configured, does it need to know it is
> connected to an NCSI? Can the MAC tell the difference between NSCI, fixed-link,
> a PHY or an SFP? Or does the MAC just need to know RGMII, the link is up, send
> frames?
> 
> Please look at adding generic support for NSCI in phylink, and see if the existing
> phylink mac ops covers everything needed for configuring the MAC.
> 
> 	Andrew

Thanks. I will try to use a fixed-link PHY.


* RE: [PATCH net-next v6 5/5] net: wangxun: add pcie error handler
From: Jiawen Wu @ 2026-06-15  2:13 UTC (permalink / raw)
  To: 'Simon Horman'
  Cc: netdev, 'Mengyuan Lou', 'Andrew Lunn',
	'David S. Miller', 'Eric Dumazet',
	'Jakub Kicinski', 'Paolo Abeni',
	'Richard Cochran', 'Russell King',
	'Jacob Keller', 'Michal Swiatkowski',
	'Kees Cook', 'Larysa Zaremba',
	'Joe Damato', 'Breno Leitao',
	'Aleksandr Loktionov',
	'Uwe Kleine-König (The Capable Hub)',
	'Fabio Baltieri', 'Thomas Gleixner',
	'Greg Kroah-Hartman'
In-Reply-To: <20260612154117.GC671640@horms.kernel.org>

On Fri, Jun 12, 2026 11:41 PM, Simon Horman wrote:
> On Wed, Jun 10, 2026 at 02:09:17PM +0800, Jiawen Wu wrote:
> > Support AER driver to handle the PCIe errors. Sometimes netdev watchdog
> > Tx timeout happens before the AER error report when a PCIe error occurs,
> > CPU blocking would be caused by MMIO during the reset process. To
> > prevent it, check PCIe error status in .ndo_tx_timeout. The current
> > function of ngbe is not yet fully developed, it will be completed in the
> > future.
> >
> > Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> 
> Hi Jiawen,
> 
> There is AI-generated review of this patch-set available on both
> https://sashiko.dev and https://netdev-ai.bots.linux.dev/sashiko/
> 
> I've tried to filter out only those that seem strictly relevant
> to the progress of this patch and have included them inline.
> 
> I'd appreciate it if you could take a look over this.
> 
> > diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> 
> ...
> 
> > +/**
> > + * wx_io_slot_reset - called after the pci bus has been reset.
> > + * @pdev: Pointer to PCI device
> > + *
> > + * Return: pci_ers_result_t.
> > + *
> > + * Restart the card from scratch, as if from a cold-boot.
> > + */
> > +static pci_ers_result_t wx_io_slot_reset(struct pci_dev *pdev)
> > +{
> > +	struct wx *wx = pci_get_drvdata(pdev);
> > +	pci_ers_result_t result;
> > +
> > +	if (pci_enable_device_mem(pdev)) {
> > +		wx_err(wx, "Cannot re-enable PCI device after reset.\n");
> > +		result = PCI_ERS_RESULT_DISCONNECT;
> > +	} else {
> > +		/* make all memory operations done before clearing the flag */
> > +		smp_mb__before_atomic();
> > +		clear_bit(WX_STATE_DISABLED, wx->state);
> > +		clear_bit(WX_FLAG_NEED_PCIE_RECOVERY, wx->flags);
> > +		pci_set_master(pdev);
> > +		pci_restore_state(pdev);
> > +		pci_wake_from_d3(pdev, false);
> > +
> > +		rtnl_lock();
> > +		if (netif_running(wx->netdev) && wx->down_suspend)
> > +			wx->down_suspend(wx);
> > +		if (wx->do_reset)
> > +			wx->do_reset(wx->netdev, false);
> 
> The AI-generated review on sashiko.dev flags the following.
> 
> 	"Does this properly initialize the hardware MAC during PCIe AER
> 	 slot reset recovery for the ngbe driver?
> 
> 	"For txgbe, this correctly maps to txgbe_reset(), which calls
> 	 txgbe_reset_hw().  However, for ngbe, wx->do_reset maps to
> 	 ngbe_do_reset() which calls ngbe_reset().
> 
> 	"Looking at ngbe_reset():
> 
> 	"drivers/net/ethernet/wangxun/ngbe/ngbe_main.c:ngbe_reset() {
> 	    if (test_bit(WX_FLAG_NEED_PCIE_RECOVERY, wx->flags))
> 	        return;
> 	    wx_flush_sw_mac_table(wx);
> 	    wx_mac_set_default_filter(wx, wx->mac.addr);
> 	    if (test_bit(WX_STATE_PTP_RUNNING, wx->state))
> 	        wx_ptp_reset(wx);
> 
> 	"This skips ngbe_reset_hw() and fails to fully reset the hardware.
> 	 Since a PCIe slot reset clears the device's internal state, when
> 	 wx_io_resume() subsequently calls ndo_open(), could the device fail
> 	 to function because its hardware registers and PHY state were never
> 	 reset?

As stated in the commit message, ngbe is not yet fully implemented due to some
existing issues with its reset mechanism. These will be refined in a future patch.

> > +		rtnl_unlock();
> > +		result = PCI_ERS_RESULT_RECOVERED;
> > +	}
> > +
> > +	pci_aer_clear_nonfatal_status(pdev);
> > +
> > +	return result;
> > +}
> 
> ...
> 
> > +static void wx_tx_timeout_recovery(struct wx *wx)
> > +{
> > +	/*
> > +	 * When a PCIe hardware error occurs, the driver should initiate a PCIe
> > +	 * recovery mechanism. However, this recovery flow relies on the AER
> > +	 * driver for current kernel policy. Therefore, a self-contained
> > +	 * recovery mechanism is not implemented yet.
> > +	 */
> > +	set_bit(WX_FLAG_NEED_PCIE_RECOVERY, wx->flags);
> > +	wx_err(wx, "PCIe error detected during tx timeout\n");
> > +	queue_work(wx->reset_wq, &wx->reset_task);
> > +}
> 
> The AI-generated review on https://netdev-ai.bots.linux.dev/sashiko/ flags:
> 
> 	"What clears WX_FLAG_NEED_PCIE_RECOVERY if AER never follows up?
> 	 wx_tx_timeout_recovery() sets the flag and queues reset_task.
> 	 reset_task only calls wx_soft_quiesce() and does not clear the
> 	 flag:
> 
> 	    if (test_bit(WX_FLAG_NEED_PCIE_RECOVERY, wx->flags))
> 	        wx_soft_quiesce(wx);
> 
> 	"The flag is cleared only on the success branch of
> 	 wx_io_slot_reset(). If the PCIe error never escalates to AER (AER
> 	 disabled in firmware or kernel, error not reportable, slot_reset
> 	 returns PCI_ERS_RESULT_DISCONNECT), the flag remains set
> 	 indefinitely.
> 
> 	"Both ngbe_reset() and txgbe_reset() now early-return on this flag:
> 
> 	    static void ngbe_reset(struct wx *wx)
> 	    {
> 	        if (test_bit(WX_FLAG_NEED_PCIE_RECOVERY, wx->flags))
> 	            return;
> 	        ...
> 	    }
> 
> 	"so any subsequent reset path (ethtool reset, ndo_set_features
> 	 triggered reset, watchdog timeout reset, FDIR reinit,
> 	 ngbe_reinit_locked / txgbe_reinit_locked) is silently a no-op. The
> 	 hardware reset capability is effectively disabled until the module
> 	 is reloaded.
> 
> 	"Should wx_tx_timeout_recovery() (or reset_task after the quiesce)
> 	 clear the flag once the soft-quiesce completes, so the driver's
> 	 reset machinery remains functional when AER does not follow up?

If a PCIe error happens and AER does not respond, the driver currently has no
alternative recovery mechanism. The hardware is then unusable, so any further
reset operations are disallowed.
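
The flag behaviour discussed above can be illustrated with a small userspace
model. This is only a sketch under assumptions: struct fake_wx and the fake_*
functions are hypothetical stand-ins for the driver paths named in the review,
not the libwx code itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private state. */
struct fake_wx {
	bool need_pcie_recovery;  /* models WX_FLAG_NEED_PCIE_RECOVERY */
	int hw_resets;            /* number of resets that reached hardware */
};

/* Models the early return in ngbe_reset()/txgbe_reset(). */
static void fake_reset(struct fake_wx *wx)
{
	if (wx->need_pcie_recovery)
		return;
	wx->hw_resets++;
}

/* Models wx_tx_timeout_recovery(): it only sets the flag. */
static void fake_tx_timeout_recovery(struct fake_wx *wx)
{
	wx->need_pcie_recovery = true;
}

/* Models the success branch of wx_io_slot_reset(): it clears the flag. */
static void fake_slot_reset_ok(struct fake_wx *wx)
{
	wx->need_pcie_recovery = false;
}
```

In this model every fake_reset() call between fake_tx_timeout_recovery() and
fake_slot_reset_ok() is a no-op, which is the intended behaviour described in
the reply above and also the behaviour the AI review asks about.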




* [BUG] ptp: vmclock: KASAN slab-use-after-free in vmclock_miscdev_read
From: Shuangpeng Bai @ 2026-06-15  2:15 UTC (permalink / raw)
  To: dwmw2, richardcochran, netdev, linux-kernel

Hi,

I hit the following KASAN report while testing the current upstream kernel.

The issue was reproduced by opening /dev/vmclock0, unbinding the vmclock
platform device, and then reading from the old fd.

KASAN: slab-use-after-free in vmclock_miscdev_read

I reproduced this on commit: e8c2f9fdadee7cbc75134dc463c1e0d856d6e5c7 (May 25 2026)

The reproducer and .config files are here.
https://gist.github.com/shuangpengbai/7c2d117852611448a80026f8aa4d4bc4

I'm happy to test debug patches or provide additional information.

Reported-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com>

[  148.011605][ T8390] BUG: KASAN: slab-use-after-free in vmclock_miscdev_read (drivers/ptp/ptp_vmclock.c:409)
[  148.015241][ T8390] Read of size 8 at addr ffff88811fdc7478 by task repro_vmclock_o/8390
[  148.018209][ T8390] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  148.018216][ T8390] Call Trace:
[  148.018226][ T8390]  <TASK>
[  148.018232][ T8390]  dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
[  148.018248][ T8390]  print_report (mm/kasan/report.c:378 mm/kasan/report.c:482)
[  148.018314][ T8390]  kasan_report (mm/kasan/report.c:595)
[  148.018335][ T8390]  vmclock_miscdev_read (drivers/ptp/ptp_vmclock.c:409)
[  148.018384][ T8390]  vfs_read (fs/read_write.c:572)
[  148.018453][ T8390]  __x64_sys_pread64 (fs/read_write.c:765 fs/read_write.c:773 fs/read_write.c:770 fs/read_write.c:770)
[  148.018483][ T8390]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[  148.018498][ T8390]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[  148.018604][ T8390]  </TASK>
[  148.042173][ T8390] Freed by task 8390 on cpu 1 at 147.908511s:
[  148.042791][ T8390]  kasan_save_track (mm/kasan/common.c:57 mm/kasan/common.c:78)
[  148.043265][ T8390]  kasan_save_free_info (mm/kasan/generic.c:584)
[  148.043775][ T8390]  __kasan_slab_free (mm/kasan/common.c:253 mm/kasan/common.c:285)
[  148.044256][ T8390]  kfree (include/linux/kasan.h:235 mm/slub.c:2689 mm/slub.c:6251 mm/slub.c:6566)
[  148.044668][ T8390]  devres_release_all (drivers/base/devres.c:50 drivers/base/devres.c:547 drivers/base/devres.c:576)
[  148.045171][ T8390]  device_release_driver_internal (drivers/base/dd.c:598 drivers/base/dd.c:1357 drivers/base/dd.c:1375)
[  148.045791][ T8390]  unbind_store (drivers/base/bus.c:244)
[  148.046252][ T8390]  kernfs_fop_write_iter (fs/kernfs/file.c:352)
[  148.046798][ T8390]  vfs_write (fs/read_write.c:595 fs/read_write.c:688)
[  148.047229][ T8390]  ksys_write (fs/read_write.c:740)
[  148.047678][ T8390]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[  148.048144][ T8390]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[  148.048996][ T8390] The buggy address belongs to the object at ffff88811fdc7400
[  148.048996][ T8390]  which belongs to the cache kmalloc-512 of size 512
[  148.050394][ T8390] The buggy address is located 120 bytes inside of
[  148.050394][ T8390]  freed 512-byte region [ffff88811fdc7400, ffff88811fdc7600)
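
For context, the trace shows state released by devres_release_all() during
unbind while a file descriptor is still open. One common way to handle this
class of lifetime problem is to reference-count the shared state so that it
outlives the last user. The following userspace sketch uses hypothetical names
(struct clk_state, state_get(), state_put()); it is neither the vmclock code
nor a proposed fix, and it is single-threaded, whereas kernel code would
typically use kref or refcount_t.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-device state shared by the driver and open files. */
struct clk_state {
	int refs;     /* one reference per holder */
	bool *freed;  /* test hook: set when the state is released */
};

/* Allocate the state with one reference, held by the bound driver. */
static struct clk_state *state_alloc(bool *freed)
{
	struct clk_state *st = calloc(1, sizeof(*st));

	if (!st)
		return NULL;
	st->refs = 1;
	st->freed = freed;
	*freed = false;
	return st;
}

/* Taken on open(): the file now keeps the state alive. */
static void state_get(struct clk_state *st)
{
	st->refs++;
}

/* Dropped on unbind and on the final close(); the last put frees. */
static void state_put(struct clk_state *st)
{
	if (--st->refs == 0) {
		*st->freed = true;
		free(st);
	}
}
```

With this pattern an unbind that happens while the fd is open only drops the
driver's reference; the memory is released when the file is closed.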


Best,
Shuangpeng


* Re: [PATCH net-next v1] e1000: Initialize phy_data to avoid unexpected values
From: Rongguang Wei @ 2026-06-15  1:40 UTC (permalink / raw)
  To: Jagielski, Jedrzej, Kitszel, Przemyslaw, Nguyen, Anthony L
  Cc: netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	Rongguang Wei
In-Reply-To: <PH0PR11MB590263784927AAB47EE558BAF0182@PH0PR11MB5902.namprd11.prod.outlook.com>



On 2026/6/12 16:58, Jagielski, Jedrzej wrote:
> From: Rongguang Wei <clementwei90@163.com> 
> Sent: Friday, June 12, 2026 10:04 AM
> 
>> From: Rongguang Wei <weirongguang@kylinos.cn>
>>
>> The phy_data variable is not initialized. If e1000_read_phy_reg
>> returns an error, phy_data will not point to a valid value from
>> the PHY register, which may cause the regs_buff array to be populated
>> with unexpected values.
> 
> Hi,
> 
> Sounds like a fix, but i believe we would like to have any real
> scenario when the issue occurs and how it can be reproduced.
> If such is provided please target the patch against net tree and
> add fixes tag.
> 
Hi,
I have not hit this in a real scenario. I just noticed, while reading the
driver code, that the return value of e1000_read_phy_reg is not checked.
Maybe it is better to initialize the value or to check the return value of
e1000_read_phy_reg.
>>
>> Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
>> Change-Id: I46071b3b21a566f8da650168d38d6968251b077d
> 
> 
> i doubt this is a correct kernel commit tag
> 
>> ---
>> drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
>> index 4dcbeabb3ad2..f068108c5004 100644
>> --- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
>> +++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
>> @@ -327,7 +327,7 @@ static void e1000_get_regs(struct net_device *netdev, struct ethtool_regs *regs,
>> 	struct e1000_adapter *adapter = netdev_priv(netdev);
>> 	struct e1000_hw *hw = &adapter->hw;
>> 	u32 *regs_buff = p;
>> -	u16 phy_data;
>> +	u16 phy_data = 0;
>>
>> 	memset(p, 0, E1000_REGS_LEN * sizeof(u32));
>>
>> -- 
>> 2.25.1
> 
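
The second option mentioned above, checking the return value instead of
initializing the variable, could look roughly like the sketch below. It is a
userspace illustration only: fake_read_phy_reg() and dump_phy_slot() are
hypothetical names, and the error convention (0 on success, negative on
failure) is an assumption, not the e1000 API.

```c
#include <stdint.h>

/* Hypothetical PHY read: returns 0 on success, negative on failure. */
static int fake_read_phy_reg(int fail, uint16_t *data)
{
	if (fail)
		return -1;       /* *data is left untouched on error */
	*data = 0x1234;
	return 0;
}

/*
 * Fill one slot of a register dump. On a failed read the slot keeps the
 * value it already had (for example from an earlier memset()), so no
 * uninitialized stack data is copied into the buffer.
 */
static int dump_phy_slot(uint32_t *regs_buff, int idx, int fail)
{
	uint16_t phy_data;
	int ret;

	ret = fake_read_phy_reg(fail, &phy_data);
	if (ret)
		return ret;

	regs_buff[idx] = (uint32_t)phy_data;
	return 0;
}
```

Compared with initializing phy_data to 0, this lets the caller tell a failed
read apart from a register that really reads as 0.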



* Re: [PATCH net-next v1] e1000: Initialize phy_data to avoid unexpected values
From: Rongguang Wei @ 2026-06-15  1:28 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: przemyslaw.kitszel, anthony.l.nguyen, netdev, intel-wired-lan,
	Rongguang Wei
In-Reply-To: <1578c474-ffbf-46f7-b906-49da4ea48142@lunn.ch>



On 2026/6/13 03:39, Andrew Lunn wrote:
> On Fri, Jun 12, 2026 at 04:03:31PM +0800, Rongguang Wei wrote:
>> From: Rongguang Wei <weirongguang@kylinos.cn>
>>
>> The phy_data variable is not initialized. If e1000_read_phy_reg
>> returns an error, phy_data will not point to a valid value from
>> the PHY register, which may cause the regs_buff array to be populated
>> with unexpected values.
>>
>> Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
>> Change-Id: I46071b3b21a566f8da650168d38d6968251b077d
> 
> What does this Change-Id mean?
> 
Sorry, it is just an auto-generated ID from when I pushed this patch to my own repo.
I forgot to delete it.
>> ---
>>  drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
>> index 4dcbeabb3ad2..f068108c5004 100644
>> --- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
>> +++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
>> @@ -327,7 +327,7 @@ static void e1000_get_regs(struct net_device *netdev, struct ethtool_regs *regs,
>>  	struct e1000_adapter *adapter = netdev_priv(netdev);
>>  	struct e1000_hw *hw = &adapter->hw;
>>  	u32 *regs_buff = p;
>> -	u16 phy_data;
>> +	u16 phy_data = 0;
> 
> 	if (hw->phy_type == e1000_phy_igp) {
> 		e1000_write_phy_reg(hw, IGP01E1000_PHY_PAGE_SELECT,
> 				    IGP01E1000_PHY_AGC_A);
> 		e1000_read_phy_reg(hw, IGP01E1000_PHY_AGC_A &
> 				   IGP01E1000_PHY_PAGE_SELECT, &phy_data);
> 		regs_buff[13] = (u32)phy_data; /* cable length */
> 
> Isn't a cable length of 0 also unexpected?
> 
> How does this patch actually make the situation better?
> 
Uninitialized variables may happen to be initialized to 0 by the system; the explicit
initialization is there to avoid accidents.

The value 0 follows e1000_read_phy_reg_ex in e1000_main.c and e1000_power_down_phy: where
they use e1000_read_phy_reg, the last parameter is initialized to 0. So I used this value.

Many other functions that use e1000_read_phy_reg also do not initialize the last parameter,
e.g. e1000_phy_reset_clk_and_crs. I can fix those in v2.
>     
>     Andrew
> 
> ---
> pw-bot: cr



* [BUG] netdevsim: KASAN slab-use-after-free in ref_tracker_free
From: Shuangpeng Bai @ 2026-06-15  1:16 UTC (permalink / raw)
  To: netdev
  Cc: Jakub Kicinski, Andrew Lunn, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, linux-kernel

Hi netdev maintainers,

I hit the following KASAN report while testing an upstream kernel.

The issue was reproduced with netdevsim. I have not confirmed whether this is
specific to netdevsim or whether other net devices can trigger a similar issue.

The KASAN report shows a slab-use-after-free in ref_tracker_free(), reached from
sysfs_rtnl_lock() while reading phys_port_name.

I reproduced this on commit: e8c2f9fdadee7cbc75134dc463c1e0d856d6e5c7 (May 25 2026)

To help trigger the bug more reliably, we applied a minimal diagnostic patch
that only adds delays and print statements.

The reproducer and .config files are here.
https://gist.github.com/shuangpengbai/b49765d646ec4610917015371aa1c3ca

I'm happy to test debug patches or provide additional information.

Reported-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com>

[ 3145.449971][T17497] BUG: KASAN: slab-use-after-free in ref_tracker_free (lib/ref_tracker.c:295)
[ 3145.452089][T17497] Read of size 1 at addr ffff888107678598 by task cat/17497
[ 3145.454439][T17497]
[ 3145.454977][T17497] Tainted: [W]=WARN
[ 3145.454980][T17497] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 3145.454985][T17497] Call Trace:
[ 3145.454991][T17497]  <TASK>
[ 3145.454994][T17497]  dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
[ 3145.455002][T17497]  print_report (mm/kasan/report.c:378 mm/kasan/report.c:482)
[ 3145.455028][T17497]  kasan_report (mm/kasan/report.c:595)
[ 3145.455046][T17497]  ref_tracker_free (lib/ref_tracker.c:295)
[ 3145.455083][T17497]  sysfs_rtnl_lock (include/linux/netdevice.h:4491 include/linux/netdevice.h:4508 include/linux/netdevice.h:4534 net/core/net-sysfs.c:122)
[ 3145.455091][T17497]  phys_port_name_show (net/core/net-sysfs.c:665)
[ 3145.455118][T17497]  dev_attr_show (drivers/base/core.c:2421)
[ 3145.455128][T17497]  sysfs_kf_seq_show (fs/sysfs/file.c:65)
[ 3145.455135][T17497]  seq_read_iter (fs/seq_file.c:231)
[ 3145.455144][T17497]  vfs_read (fs/read_write.c:493 fs/read_write.c:574)
[ 3145.455169][T17497]  ksys_read (fs/read_write.c:717)
[ 3145.455181][T17497]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[ 3145.455188][T17497]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[ 3145.455193][T17497] RIP: 0033:0x7fcf098c43ce
[ 3145.455200][T17497] Code: c0 e9 b6 fe ff ff 50 48 8d 3d 6e 08 0b 00 e8 69 01 02 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
[ 3145.455204][T17497] RSP: 002b:00007ffd05e76b98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 3145.455211][T17497] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fcf098c43ce
[ 3145.455214][T17497] RDX: 0000000000020000 RSI: 00007fcf095e4000 RDI: 0000000000000003
[ 3145.455217][T17497] RBP: 00007fcf095e4000 R08: 00007fcf095e3010 R09: 0000000000000000
[ 3145.455219][T17497] R10: fffffffffffffbc5 R11: 0000000000000246 R12: 0000000000000000
[ 3145.455222][T17497] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[ 3145.455227][T17497]  </TASK>
[ 3145.455229][T17497]
[ 3145.479014][T17497] Freed by task 17497 on cpu 0 at 3145.447575s:
[ 3145.479559][T17497]  kasan_save_track (mm/kasan/common.c:57 mm/kasan/common.c:78)
[ 3145.479963][T17497]  kasan_save_free_info (mm/kasan/generic.c:584)
[ 3145.480411][T17497]  __kasan_slab_free (mm/kasan/common.c:253 mm/kasan/common.c:285)
[ 3145.480813][T17497]  kfree (include/linux/kasan.h:235 mm/slub.c:2689 mm/slub.c:6251 mm/slub.c:6566)
[ 3145.481148][T17497]  device_release (drivers/base/core.c:2542)
[ 3145.481567][T17497]  kobject_put (lib/kobject.c:689 lib/kobject.c:720 include/linux/kref.h:65 lib/kobject.c:737)
[ 3145.481951][T17497]  sysfs_rtnl_lock (net/core/net-sysfs.c:121)
[ 3145.482351][T17497]  phys_port_name_show (net/core/net-sysfs.c:665)
[ 3145.482782][T17497]  dev_attr_show (drivers/base/core.c:2421)
[ 3145.483154][T17497]  sysfs_kf_seq_show (fs/sysfs/file.c:65)
[ 3145.483586][T17497]  seq_read_iter (fs/seq_file.c:231)
[ 3145.483975][T17497]  vfs_read (fs/read_write.c:493 fs/read_write.c:574)
[ 3145.484334][T17497]  ksys_read (fs/read_write.c:717)
[ 3145.484701][T17497]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[ 3145.485092][T17497]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[ 3145.485592][T17497]
[ 3145.485794][T17497] The buggy address belongs to the object at ffff888107678000
[ 3145.485794][T17497]  which belongs to the cache kmalloc-cg-8k of size 8192
[ 3145.486991][T17497] The buggy address is located 1432 bytes inside of
[ 3145.486991][T17497]  freed 8192-byte region [ffff888107678000, ffff88810767a000)
[ 3145.488159][T17497]
[ 3145.488367][T17497] The buggy address belongs to the physical page:


Best,
Shuangpeng


* Re: [PATCH net-next v7 0/4] net: rnpgbe: Add TX/RX and link status support
From: Yibo Dong @ 2026-06-15  0:58 UTC (permalink / raw)
  To: Simon Horman
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, danishanwar,
	vadim.fedorenko, u.kleine-koenig, linux-kernel, netdev, yaojun
In-Reply-To: <20260612184436.GL671640@horms.kernel.org>

On Fri, Jun 12, 2026 at 07:44:36PM +0100, Simon Horman wrote:
> On Thu, Jun 11, 2026 at 06:00:32PM +0800, Dong Yibo wrote:
> > Hi maintainers,
> > 
> > This patch series adds the packet transmission, reception, and link status
> > management features to the RNPGBE driver, building upon the previously
> > introduced mailbox communication and basic driver infrastructure.
> > 
> > The series introduces:
> > - Msix/msi interrupt handling with NAPI support
> > - TX path with scatter-gather DMA and completion handling
> > - RX path with page pool buffer management
> > - Link status monitoring and carrier management
> > 
> > These changes enable the RNPGBE driver to support basic tx/rx
> > network operations.
> > 
> > Changelog:
> > v6 -> v7:
> > [patch 2/4]:
> > 1. Fix 'frag_idx' error in rnpgbe_tx_map. (Sashiko-gemini)
> > [patch 3/4]:
> > 1. Fix skb leak in invalid size path in rnpgbe_clean_rx_irq.
> >    (Sashiko-gemini)
> > 2. Fix invalid size range check for rxdesc. (Sashiko-gemini)
> > [patch 4/4]:
> > 1. Fix 'data race on the reply payload'. (Sashiko-gemini)
> > 2. Fix 'asymmetric behaviour' when report up/down. (andrew)
> > 
> > links:
> > ---
> > v1: https://lore.kernel.org/netdev/20260325091204.94015-1-dong100@mucse.com/
> > v2: https://lore.kernel.org/netdev/20260403025713.527841-1-dong100@mucse.com/
> > v3: https://lore.kernel.org/netdev/20260507081539.171844-1-dong100@mucse.com/
> > v4: https://lore.kernel.org/netdev/20260526033539.164061-1-dong100@mucse.com/
> > v5: https://lore.kernel.org/netdev/20260528023150.239532-1-dong100@mucse.com/
> > v6: https://lore.kernel.org/netdev/20260604112750.769215-1-dong100@mucse.com/
> > 
> > Additional Notes:
> 
> Thanks for the update and the notes.
> 
> There is another round of AI-generated review of this patch-set available
> on both https://sashiko.dev and https://netdev-ai.bots.linux.dev/sashiko/
> 
> I would appreciate it if you could look over that too. With a view to
> addressing any issues that directly affect this patch.
> 

OK, I will take a look at the AI review results on both sites and fix all
relevant issues pointed out for this patch.

> ...
> 
Thanks for your feedback.


* [PATCH iproute2-next v3] rdma: display resource limits in curr/max format
From: Tao Cui @ 2026-06-15  0:53 UTC (permalink / raw)
  To: dsahern, leonro; +Cc: linux-rdma, netdev, Tao Cui

From: Tao Cui <cuitao@kylinos.cn>

Parse the new RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX netlink attribute
to show resource limits alongside current counts in curr/max format:

  Before: 0: mlx5_0: qp 123  cq 45  mr 200  pd 10
  After:  0: mlx5_0: qp 123/131072  cq 45/65536  mr 200/1000000  pd 10/32768

JSON output provides both current and max fields per resource type
(e.g. "qp": 123, "qp-max": 131072). Backward compatible: no output
change when kernel lacks the new attribute.

Signed-off-by: Tao Cui <cuitao@kylinos.cn>
---
 rdma/include/uapi/rdma/rdma_netlink.h |  5 +++++
 rdma/res.c                            | 21 ++++++++++++++++++++-
 rdma/utils.c                          |  1 +
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/rdma/include/uapi/rdma/rdma_netlink.h b/rdma/include/uapi/rdma/rdma_netlink.h
index 4356ec4a..e5b8b065 100644
--- a/rdma/include/uapi/rdma/rdma_netlink.h
+++ b/rdma/include/uapi/rdma/rdma_netlink.h
@@ -604,6 +604,11 @@ enum rdma_nldev_attr {
 	RDMA_NLDEV_ATTR_FRMR_POOL_PINNED_HANDLES,	/* u32 */
 	RDMA_NLDEV_ATTR_FRMR_POOL_KEY_KERNEL_VENDOR_KEY,	/* u64 */
 
+	/*
+	 * Resource summary entry maximum value.
+	 */
+	RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX,		/* u64 */
+
 	/*
 	 * Always the end
 	 */
diff --git a/rdma/res.c b/rdma/res.c
index 062f0007..046935e2 100644
--- a/rdma/res.c
+++ b/rdma/res.c
@@ -55,7 +55,26 @@ static int res_print_summary(struct nlattr **tb)
 
 		name = mnl_attr_get_str(nla_line[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME]);
 		curr = mnl_attr_get_u64(nla_line[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR]);
-		res_print_u64(name, curr, nla_line[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR]);
+		if (nla_line[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX]) {
+			uint64_t max;
+			char max_name[64];
+
+			max = mnl_attr_get_u64(
+				nla_line[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX]);
+			snprintf(max_name, sizeof(max_name), "%s-max", name);
+			print_u64(PRINT_JSON, name, NULL, curr);
+			print_u64(PRINT_JSON, max_name, NULL, max);
+			if (!is_json_context()) {
+				char buf[64];
+
+				snprintf(buf, sizeof(buf), "%s %" PRIu64 "/%" PRIu64 " ",
+					 name, curr, max);
+				pr_out("%s", buf);
+			}
+		} else {
+			res_print_u64(name, curr,
+				      nla_line[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR]);
+		}
 	}
 	return 0;
 }
diff --git a/rdma/utils.c b/rdma/utils.c
index 87003b2c..90ea1c55 100644
--- a/rdma/utils.c
+++ b/rdma/utils.c
@@ -480,6 +480,7 @@ static const enum mnl_attr_data_type nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
 	[RDMA_NLDEV_ATTR_EVENT_TYPE] = MNL_TYPE_U8,
 	[RDMA_NLDEV_SYS_ATTR_MONITOR_MODE] = MNL_TYPE_U8,
 	[RDMA_NLDEV_ATTR_STAT_OPCOUNTER_ENABLED] = MNL_TYPE_U8,
+	[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX] = MNL_TYPE_U64,
 };
 
 static int rd_attr_check(const struct nlattr *attr, int *typep)
-- 
2.43.0
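
The curr/max text format from the commit message can be tried outside of
iproute2 with a small standalone helper. This is a sketch for illustration:
format_res_entry() is a hypothetical function, and the fallback format used
when no maximum is available is an assumption, not the exact output of
res_print_u64().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one summary entry as "name curr/max ",
 * or as "name curr " when no maximum was reported (has_max == 0).
 * Returns the snprintf() result.
 */
static int format_res_entry(char *buf, size_t len, const char *name,
			    uint64_t curr, int has_max, uint64_t max)
{
	if (has_max)
		return snprintf(buf, len, "%s %" PRIu64 "/%" PRIu64 " ",
				name, curr, max);

	return snprintf(buf, len, "%s %" PRIu64 " ", name, curr);
}
```

Called with ("qp", 123, 1, 131072) the helper yields the "qp 123/131072 "
fragment shown in the "After" line of the commit message.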



* Re: [PATCH bpf-next v4 0/2] bpf: Fix bpf_get/setsockopt to tos for ipv4-mapped ipv6 socket
From: patchwork-bot+netdevbpf @ 2026-06-15  0:50 UTC (permalink / raw)
  To: Leon Hwang
  Cc: bpf, ast, daniel, andrii, eddyz87, memxor, martin.lau, song,
	yonghong.song, jolsa, emil, john.fastabend, sdf, davem, edumazet,
	kuba, pabeni, horms, shuah, ihor.solodrai, netdev, linux-kernel,
	linux-kselftest, kernel-patches-bot
In-Reply-To: <20260613162443.60515-1-leon.hwang@linux.dev>

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Sun, 14 Jun 2026 00:24:41 +0800 you wrote:
> When TCP over IPv4 via INET6 API, sk->sk_family is AF_INET6, but it is a
> v4 pkt. inet_csk(sk)->icsk_af_ops is ipv6_mapped and use ip_queue_xmit.
> The tos sockopt does not work for bpf [get,set]sockopt() helpers.
> 
> Changelog:
> v3 -> v4:
> * Add 'sk->sk_type != SOCK_RAW && !ipv6_only_sock(sk)' check.
> * Re-implement test with LLM assistance.
> * v3: https://lore.kernel.org/all/20240914103226.71109-1-zhoufeng.zf@bytedance.com/
> 
> [...]

Here is the summary with links:
  - [bpf-next,v4,1/2] bpf: Fix bpf_get/setsockopt to tos for ipv4-mapped ipv6 socket
    https://git.kernel.org/bpf/bpf-next/c/ca0f587c029a
  - [bpf-next,v4,2/2] selftests/bpf: Add test to verify the fix for bpf_setsockopt() helper
    https://git.kernel.org/bpf/bpf-next/c/5cf2c21ab090

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [Intel-wired-lan] [PATCH net-next] i40e: add devlink parameter for Flow Director ATR sample rate
From: kernel test robot @ 2026-06-15  0:42 UTC (permalink / raw)
  To: mheib, intel-wired-lan
  Cc: llvm, oe-kbuild-all, netdev, jiri, davem, edumazet, kuba, pabeni,
	horms, corbet, anthony.l.nguyen, przemyslaw.kitszel,
	andrew+netdev, Mohammad Heib
In-Reply-To: <20260614161131.192068-1-mheib@redhat.com>

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/mheib-redhat-com/i40e-add-devlink-parameter-for-Flow-Director-ATR-sample-rate/20260615-001257
base:   net-next/main
patch link:    https://lore.kernel.org/r/20260614161131.192068-1-mheib%40redhat.com
patch subject: [Intel-wired-lan] [PATCH net-next] i40e: add devlink parameter for Flow Director ATR sample rate
config: sparc64-allmodconfig (https://download.01.org/0day-ci/archive/20260615/202606150807.u6hUM8VJ-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260615/202606150807.u6hUM8VJ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202606150807.u6hUM8VJ-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/net/ethernet/intel/i40e/i40e_devlink.c:106:9: error: incompatible function pointer types initializing 'int (*)(struct devlink *, u32, union devlink_param_value *, struct netlink_ext_ack *)' (aka 'int (*)(struct devlink *, unsigned int, union devlink_param_value *, struct netlink_ext_ack *)') with an expression of type 'int (struct devlink *, u32, union devlink_param_value, struct netlink_ext_ack *)' (aka 'int (struct devlink *, unsigned int, union devlink_param_value, struct netlink_ext_ack *)') [-Wincompatible-function-pointer-types]
     106 |                              i40e_atr_sample_rate_validate),
         |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/net/devlink.h:650:14: note: expanded from macro 'DEVLINK_PARAM_DRIVER'
     650 |         .validate = _validate,                                          \
         |                     ^~~~~~~~~
   1 error generated.


vim +106 drivers/net/ethernet/intel/i40e/i40e_devlink.c

    93	
    94	static const struct devlink_param i40e_dl_params[] = {
    95		DEVLINK_PARAM_GENERIC(MAX_MAC_PER_VF,
    96				      BIT(DEVLINK_PARAM_CMODE_RUNTIME),
    97				      i40e_max_mac_per_vf_get,
    98				      i40e_max_mac_per_vf_set,
    99				      NULL),
   100		DEVLINK_PARAM_DRIVER(I40E_DEVLINK_PARAM_ID_ATR_SAMPLE_RATE,
   101				     "atr_sample_rate",
   102				     DEVLINK_PARAM_TYPE_U32,
   103				     BIT(DEVLINK_PARAM_CMODE_RUNTIME),
   104				     i40e_atr_sample_rate_get,
   105				     i40e_atr_sample_rate_set,
 > 106				     i40e_atr_sample_rate_validate),
   107	};
   108	

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
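
The diagnostic above says the .validate slot expects a callback that takes the
parameter value by pointer, while the function in the patch takes it by value.
A minimal userspace sketch of a matching signature follows. The struct and
union definitions, the names sample_param and sample_rate_validate, and the
1..128 range are hypothetical stand-ins for illustration, not the real devlink
or i40e definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real types live in include/net/devlink.h. */
struct devlink;
struct netlink_ext_ack;

union param_value {
	uint32_t vu32;
};

struct param {
	/* The value is passed by pointer, as the compiler output reports. */
	int (*validate)(struct devlink *dl, uint32_t id,
			union param_value *val,
			struct netlink_ext_ack *extack);
};

/* Matches the callback type above: 'val' is a pointer, not a by-value union. */
static int sample_rate_validate(struct devlink *dl, uint32_t id,
				union param_value *val,
				struct netlink_ext_ack *extack)
{
	(void)dl;
	(void)id;
	(void)extack;

	assert(val != NULL);

	/* Hypothetical range check, only here to give the callback a body. */
	if (val->vu32 < 1 || val->vu32 > 128)
		return -EINVAL;

	return 0;
}

/* With matching types this initializer compiles without a diagnostic. */
static const struct param sample_param = {
	.validate = sample_rate_validate,
};
```

Calling sample_param.validate(NULL, 0, &v, NULL) with v.vu32 set to 20 returns
0 in this sketch; a value of 0 returns -EINVAL.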


* Re: [netfilter-nf-next:for-netdev-nf-next-26-06-14 3/11] net/netfilter/nf_conncount.c:502:18: sparse: sparse: incompatible types in comparison expression (different address spaces):
From: Florian Westphal @ 2026-06-15  0:38 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-kbuild-all, Pablo Neira Ayuso, netdev, netfilter-devel
In-Reply-To: <202606150616.cpmJToWO-lkp@intel.com>

kernel test robot <lkp@intel.com> wrote:
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next.git for-netdev-nf-next-26-06-14
> head:   2354e975932dabb06fad239f07a3b68fd1809737
> commit: 64d7d5abe2160bba369b4a8f06bdf5630573bab0 [3/11] netfilter: nf_conncount: callers must hold rcu read lock
> config: x86_64-randconfig-123-20260614 (https://download.01.org/0day-ci/archive/20260615/202606150616.cpmJToWO-lkp@intel.com/config)
> compiler: gcc-13 (Debian 13.3.0-16) 13.3.0
> sparse: v0.6.5-rc1
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260615/202606150616.cpmJToWO-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202606150616.cpmJToWO-lkp@intel.com/
> 
> sparse warnings: (new ones prefixed by >>)
> >> net/netfilter/nf_conncount.c:502:18: sparse: sparse: incompatible types in comparison expression (different address spaces):
>    net/netfilter/nf_conncount.c:502:18: sparse:    struct rb_node [noderef] __rcu *
>    net/netfilter/nf_conncount.c:502:18: sparse:    struct rb_node *
>    net/netfilter/nf_conncount.c:510:34: sparse: sparse: incompatible types in comparison expression (different address spaces):
>    net/netfilter/nf_conncount.c:510:34: sparse:    struct rb_node [noderef] __rcu *
>    net/netfilter/nf_conncount.c:510:34: sparse:    struct rb_node *
>    net/netfilter/nf_conncount.c:512:34: sparse: sparse: incompatible types in comparison expression (different address spaces):
>    net/netfilter/nf_conncount.c:512:34: sparse:    struct rb_node [noderef] __rcu *
>    net/netfilter/nf_conncount.c:512:34: sparse:    struct rb_node *

Thanks, but I have no intent to fix this.

Between rcu_dereference_raw(), which gives no sparse warnings but also
provides no hints when callers don't hold the rcu read lock, and plain
rcu_dereference(), which does give runtime coverage but results in the above
sparse output, I will pick the latter and just ignore these warnings.


* Re: [PATCH net v2] tipc: fix slab-use-after-free Read in tipc_aead_decrypt_done
From: Doruk Tan Ozturk @ 2026-06-15  0:22 UTC (permalink / raw)
  To: tung.quang.nguyen
  Cc: jmaloy, aleksander.lobakin, davem, edumazet, kuba, pabeni, horms,
	netdev, tipc-discussion, linux-kernel
In-Reply-To: <GV1P189MB1988B2B662DC781F7347EBD4C61B2@GV1P189MB1988.EURP189.PROD.OUTLOOK.COM>

On Tue, Jun 10, 2026, Tung Quang Nguyen wrote:
> Can you decode the stack trace (using linux/scripts/decode_stacktrace.sh)
> for more readable text?
>
> Is the issue reproducible on the latest net branch, or just on the old
> v6.12.92 you mentioned?

Hi Tung,

Thanks for the review. Answers to both questions below.

1) Decoded stack trace
----------------------
Decoded with scripts/decode_stacktrace.sh against the vmlinux that produced
the splat (6.12.92, CONFIG_KASAN_INLINE + CONFIG_TIPC + CONFIG_TIPC_CRYPTO).
The use-after-free read and the allocation/free sites resolve as follows:

BUG: KASAN: slab-use-after-free in tipc_crypto_rcv_complete (net/tipc/crypto.c:1917)
Read of size 8 at addr ffff888104c8c808 by task kworker/3:2/70
Workqueue: cryptd cryptd_queue_worker
Call Trace:
 <TASK>
 dump_stack_lvl (lib/dump_stack.c:123)
 print_report (mm/kasan/report.c:378 mm/kasan/report.c:481)
 kasan_report (mm/kasan/report.c:596)
 tipc_crypto_rcv_complete (net/tipc/crypto.c:1917)
 tipc_aead_decrypt_done (net/tipc/crypto.c:996)
 cryptd_aead_crypt (include/crypto/internal/aead.h:85 crypto/cryptd.c:772)
 cryptd_queue_worker (crypto/cryptd.c:181)
 process_one_work (kernel/workqueue.c:3264)
 worker_thread (kernel/workqueue.c:3339 kernel/workqueue.c:3426)
 kthread (kernel/kthread.c:389)
 ret_from_fork (arch/x86/kernel/process.c:152)
 ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
 </TASK>

Allocated by task 1550:
 kasan_save_stack (mm/kasan/common.c:49)
 kasan_save_track (mm/kasan/common.c:61 mm/kasan/common.c:70)
 __kasan_kmalloc (mm/kasan/common.c:378 mm/kasan/common.c:395)
 tipc_crypto_start (net/tipc/crypto.c:1484)
 tipc_init_net (net/tipc/core.c:73)
 ops_init (net/core/net_namespace.c:140)
 setup_net (net/core/net_namespace.c:357)
 copy_net_ns (net/core/net_namespace.c:512)
 create_new_namespaces (kernel/nsproxy.c:110)

(The captured report has the KASAN read trace and the Allocated-by track; the
free is on the netns teardown path tipc_crypto_stop() <- tipc_exit_net() <-
cleanup_net(), as described in the changelog.)

So the freed object is the per-netns struct tipc_crypto allocated in
tipc_crypto_start() at netns creation (crypto.c:1484), and the cryptd worker
then reads it from the async completion: tipc_aead_decrypt_done()
(crypto.c:996) -> tipc_crypto_rcv_complete() (crypto.c:1917). Immediately after
the UAF read the worker also faults dereferencing the stale node pointer in
tipc_node_put() (net/tipc/node.c:319), confirming the object is gone.

2) Reproducibility on the latest net branch
--------------------------------------------
The bug is still present on the latest net tree. I checked out v7.1-rc7 and
inspected net/tipc/crypto.c first: the encrypt side already carries the
maybe_get_net() guard from commit e279024617134 ("net/tipc: fix
slab-use-after-free Read in tipc_aead_encrypt"), but tipc_aead_decrypt() still
goes straight from tipc_bearer_hold(b) to crypto_aead_decrypt(req) with no
maybe_get_net(aead->crypto->net) and no matching put_net() -- i.e. the exact
gap this patch closes. So the decrypt path is unguarded on 7.1-rc7 and the UAF
is reachable there in the same way.

I also built v7.1-rc7 (HEAD at v7.1-rc7) with KASAN_INLINE + TIPC + TIPC_CRYPTO
and reproduced the UAF live. The workload is the same as on 6.12.92: a UDP
bearer with a cluster key is flooded with crafted encrypted frames from an
unknown peer, taking the cluster-key (pick_tx) RX decrypt path, while the
bearer's netns is repeatedly torn down. Decoded against the rc7 vmlinux:

BUG: KASAN: slab-use-after-free in tipc_aead_decrypt_done (net/tipc/crypto.c:999)
Read of size 8 at addr ffff8881056258a8 by task kworker/u16:2/51
CPU: 2 UID: 0 PID: 51 Comm: kworker/u16:2 Not tainted 7.1.0-rc7-00020-... #15
Workqueue: events_unbound
Call Trace:
 <TASK>
 dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
 print_report (mm/kasan/report.c:378 mm/kasan/report.c:482)
 kasan_report (mm/kasan/report.c:595)
 tipc_aead_decrypt_done (net/tipc/crypto.c:999)
 process_one_work (kernel/workqueue.c:3314)
 worker_thread (kernel/workqueue.c:3397 kernel/workqueue.c:3478)
 kthread (kernel/kthread.c:436)
 ret_from_fork (arch/x86/kernel/process.c:158)
 ret_from_fork_asm (arch/x86/entry/entry_64.S:245)
 </TASK>

Allocated by task 169:
 __kasan_kmalloc (mm/kasan/common.c:398 mm/kasan/common.c:415)
 tipc_crypto_start (net/tipc/crypto.c:1502)
 tipc_init_net (net/tipc/core.c:72)
 ops_init (net/core/net_namespace.c:137)
 setup_net (net/core/net_namespace.c:446)
 copy_net_ns (net/core/net_namespace.c:579)
 create_new_namespaces (kernel/nsproxy.c:132)
 unshare_nsproxy_namespaces (kernel/nsproxy.c:234)
 ksys_unshare (kernel/fork.c:3242)
 __x64_sys_unshare (kernel/fork.c:3316)
 do_syscall_64 (arch/x86/entry/syscall_64.c:63)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)

Freed by task 8:
 kfree (mm/slub.c:6566)
 tipc_exit_net (net/tipc/core.c:119)
 ops_undo_list (net/core/net_namespace.c)
 cleanup_net (net/core/net_namespace.c:704)
 process_one_work (kernel/workqueue.c:3314)
 worker_thread (kernel/workqueue.c)
 kthread (kernel/kthread.c:436)

The freed object is the per-netns struct tipc_crypto allocated in
tipc_crypto_start() at netns creation (crypto.c:1502 on rc7); the async decrypt
completion then reads aead->crypto->stats from it (crypto.c:999) after
cleanup_net() -> tipc_exit_net() -> tipc_crypto_stop() has freed it -- the exact
read/alloc/free triple this patch closes, now on 7.1-rc7 rather than 6.12.92.

One note on the harness: on x86, the in-tree gcm(aes) implementation that the
SIMD aead wrapper used to register via simd_register_aeads_compat() is, as of
the aesni rewrite, registered directly with crypto_register_aeads() and decrypts
synchronously, so the cryptd async window that the original 6.12.92 splat used
does not arise from the stock aesni path on rc7. To exercise the same async
completion the changelog
describes, I forced tipc_aead_decrypt()'s completion onto a workqueue in my test
tree; the unguarded aead->crypto dereference in tipc_aead_decrypt_done() is what
KASAN catches, and that code is byte-for-byte the unpatched upstream path. The
source state is in any case unambiguous: tipc_aead_decrypt() on rc7 still lacks
maybe_get_net(aead->crypto->net), so the completion can outlive the free on any
config where crypto_aead_decrypt() goes async (e.g. cryptd offload).

Reproduced under KASAN on both v6.12.92 and v7.1-rc7; the decrypt path lacks the
guard on the latest net tree.

Thanks,
Doruk
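
For readers following along, the guard discussed above (take a reference on
the owner before queuing async work, drop it in the completion) can be
sketched in userspace C. This only illustrates the "get unless zero" pattern
that maybe_get_net() is built on; struct owner, owner_get_unless_zero(),
owner_put() and submit_async() are hypothetical names and not TIPC or netns
code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted owner such as a network namespace. */
struct owner {
	atomic_int refs;
};

/* Take a reference only while the count is still above zero. */
static bool owner_get_unless_zero(struct owner *o)
{
	int old = atomic_load(&o->refs);

	while (old > 0) {
		/* On failure 'old' is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}

	return false;
}

/* Drop a reference; the completion handler would call this exactly once. */
static void owner_put(struct owner *o)
{
	int prev = atomic_fetch_sub(&o->refs, 1);

	assert(prev > 0);
	(void)prev;
}

/*
 * Submit path: pin the owner before starting async work. If teardown has
 * already dropped the last reference, refuse to queue anything, so no
 * completion can run against freed state.
 */
static bool submit_async(struct owner *o)
{
	if (!owner_get_unless_zero(o))
		return false;

	/* Async work would be queued here; its completion calls owner_put(). */
	return true;
}
```

In this sketch a submit attempted after the last reference is gone fails
instead of queuing work, which is the property the decrypt path needs.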


* Re: [Intel-wired-lan] [PATCH net-next] i40e: add devlink parameter for Flow Director ATR sample rate
From: kernel test robot @ 2026-06-14 22:16 UTC (permalink / raw)
  To: mheib, intel-wired-lan
  Cc: oe-kbuild-all, netdev, jiri, davem, edumazet, kuba, pabeni, horms,
	corbet, anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev,
	Mohammad Heib
In-Reply-To: <20260614161131.192068-1-mheib@redhat.com>

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/mheib-redhat-com/i40e-add-devlink-parameter-for-Flow-Director-ATR-sample-rate/20260615-001257
base:   net-next/main
patch link:    https://lore.kernel.org/r/20260614161131.192068-1-mheib%40redhat.com
patch subject: [Intel-wired-lan] [PATCH net-next] i40e: add devlink parameter for Flow Director ATR sample rate
config: openrisc-allmodconfig (https://download.01.org/0day-ci/archive/20260615/202606150639.gh5uBfAP-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 16.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260615/202606150639.gh5uBfAP-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202606150639.gh5uBfAP-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/net/ethernet/intel/i40e/i40e_devlink.c:4:
>> drivers/net/ethernet/intel/i40e/i40e_devlink.c:106:30: error: initialization of 'int (*)(struct devlink *, u32,  union devlink_param_value *, struct netlink_ext_ack *)' {aka 'int (*)(struct devlink *, unsigned int,  union devlink_param_value *, struct netlink_ext_ack *)'} from incompatible pointer type 'int (*)(struct devlink *, u32,  union devlink_param_value,  struct netlink_ext_ack *)' {aka 'int (*)(struct devlink *, unsigned int,  union devlink_param_value,  struct netlink_ext_ack *)'} [-Wincompatible-pointer-types]
     106 |                              i40e_atr_sample_rate_validate),
         |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/net/devlink.h:650:21: note: in definition of macro 'DEVLINK_PARAM_DRIVER'
     650 |         .validate = _validate,                                          \
         |                     ^~~~~~~~~
   drivers/net/ethernet/intel/i40e/i40e_devlink.c:106:30: note: (near initialization for 'i40e_dl_params[1].validate')
   include/net/devlink.h:650:21: note: in definition of macro 'DEVLINK_PARAM_DRIVER'
     650 |         .validate = _validate,                                          \
         |                     ^~~~~~~~~
   drivers/net/ethernet/intel/i40e/i40e_devlink.c:77:12: note: 'i40e_atr_sample_rate_validate' declared here
      77 | static int i40e_atr_sample_rate_validate(struct devlink *devlink, u32 id,
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~


vim +106 drivers/net/ethernet/intel/i40e/i40e_devlink.c

    93	
    94	static const struct devlink_param i40e_dl_params[] = {
    95		DEVLINK_PARAM_GENERIC(MAX_MAC_PER_VF,
    96				      BIT(DEVLINK_PARAM_CMODE_RUNTIME),
    97				      i40e_max_mac_per_vf_get,
    98				      i40e_max_mac_per_vf_set,
    99				      NULL),
   100		DEVLINK_PARAM_DRIVER(I40E_DEVLINK_PARAM_ID_ATR_SAMPLE_RATE,
   101				     "atr_sample_rate",
   102				     DEVLINK_PARAM_TYPE_U32,
   103				     BIT(DEVLINK_PARAM_CMODE_RUNTIME),
   104				     i40e_atr_sample_rate_get,
   105				     i40e_atr_sample_rate_set,
 > 106				     i40e_atr_sample_rate_validate),
   107	};
   108	

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH net-next] r8169: migrate Rx path to page_pool
From: Francois Romieu @ 2026-06-14 22:09 UTC (permalink / raw)
  To: atharva-potdar
  Cc: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
	pabeni, netdev
In-Reply-To: <20260614054137.32181-1-atharvapotdar07@gmail.com>

atharva-potdar <atharvapotdar07@gmail.com> :
> Replace the driver-managed skb+copy Rx model with page_pool
> zero-copy in preparation for XDP support.
> 
> Key changes:
> - Allocate order-0 pages via page_pool instead of alloc_pages + dma_map
> - Build skbs directly from pages with napi_build_skb (zero-copy)
> - Add rtl8169_rx_refill() to replenish descriptors after processing
> - Track dirty_rx boundary for efficient refill scheduling
> - Cap max_mtu to R8169_RX_BUF_SIZE - VLAN_ETH_HLEN - ETH_FCS_LEN
>   (order-0 pages can't support arbitrary jumbo frames)
> 
> Tested on RTL8168h with iperf3 (~470 Mbps, 0 retransmits) and
> 1000 pings (0 drops).

You may consider fdd7b4c3302c93f6833e338903ea77245eb510b4 and some related
changes around that time.

-- 
Ueimor


* e1000e: Report link down after "Detected Hardware Unit Hang" ?
From: Helge Deller @ 2026-06-14 21:48 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, intel-wired-lan, netdev

I'm regularly facing the known "eno1: Detected Hardware Unit Hang:"
with my on-board Intel e1000e NIC hardware.
Since none of the various tips on the internet helped, I had the idea
to set up a master/slave bonding configuration to fail over to another NIC when
the Intel chip hangs.

Sadly this doesn't work as intended, because the link of the Intel NIC
isn't reported "down", so the failover never happens unless I manually
run "ifconfig eno1 down".

My question: Shouldn't the Intel NIC ideally report Link Down if we know
it hangs? That way a failover should at least happen, right?

Below is a completely untested patch.
Does it make sense for me to test and/or develop such a patch, or
is there something I'm missing?

Helge 


diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 7ce0cc8ab8f4..c6edcf4ac032 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1157,6 +1157,10 @@ static void e1000_print_hw_hang(struct work_struct *work)
 
 	e1000e_dump(adapter);
 
+	/* The NIC hangs. Force link down in e1000e_has_link() such that a
+	 * failover can happen */
+	hw->phy.media_type = e1000_media_type_unknown;
+
 	/* Suggest workaround for known h/w issue */
 	if ((hw->mac.type == e1000_pchlan) && (er32(CTRL) & E1000_CTRL_TFCE))
 		e_err("Try turning off Tx pause (flow control) via ethtool\n");


* Re: [PATCH net-next v4 0/5] ionic: Expose more port stats to ethtool
From: Eric Joyner @ 2026-06-14 20:54 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Jacob Keller
In-Reply-To: <20260612091825.GB625020@horms.kernel.org>

On 6/12/2026 2:18 AM, Simon Horman wrote:
> On Tue, Jun 09, 2026 at 11:18:25PM -0700, Eric Joyner wrote:
>> The primary aim of this patchset is to support the reporting of new port
>> statistics (and one old one) that firmware sends to the driver; these
>> include general FEC codeword stats and the FEC histogram. A scheme for
>> these extra stats is introduced in order to prevent devices that don't
>> support these new statistics from unconditionally setting them or
>> reporting them in ethtool.
> 
> ...
> 
> Hi Eric,
> 
> There is AI-generated review of this patch-set available on both
> https://sashiko.dev and https://netdev-ai.bots.linux.dev/sashiko/
> I would appreciate it if you could look over that with a view
> to addressing any issues that directly affect this patch.

Thanks for letting me know; I've submitted a v5 where I hope I've addressed all
of the comments.

- Eric


* [PATCH net-next v5 5/5] ionic: Add .get_fec_stats ethtool handler
From: Eric Joyner @ 2026-06-14 20:53 UTC (permalink / raw)
  To: netdev
  Cc: Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Eric Joyner, Nikhil P . Rao,
	Simon Horman
In-Reply-To: <20260614205303.48088-1-eric.joyner@amd.com>

Reports FEC statistics totals and an 802.3ck FEC histogram. Per-lane
counts currently aren't supported.

The reporting of these statistics is gated by DEV_CAP_EXTRA_STATS and
checks for IONIC_STAT_INVALID, since only the newest devices support
reporting all of these stats. Older devices report only some of these
statistics, or none at all, so the output excludes the unsupported
ones.

Assisted-by: Claude:claude-4.6-sonnet
Signed-off-by: Eric Joyner <eric.joyner@amd.com>
---
 .../ethernet/pensando/ionic/ionic_ethtool.c   | 77 +++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
index c4ab4b5caa0a..e88e818667c2 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
@@ -441,6 +441,82 @@ static int ionic_get_fecparam(struct net_device *netdev,
 	return 0;
 }
 
+static const struct ethtool_fec_hist_range ionic_fec_ranges[] = {
+	{ 0, 0},
+	{ 1, 1},
+	{ 2, 2},
+	{ 3, 3},
+	{ 4, 4},
+	{ 5, 5},
+	{ 6, 6},
+	{ 7, 7},
+	{ 8, 8},
+	{ 9, 9},
+	{ 10, 10},
+	{ 11, 11},
+	{ 12, 12},
+	{ 13, 13},
+	{ 14, 14},
+	{ 15, 15},
+	{ 0, 0},
+};
+
+#define IONIC_FEC_STAT(dst, src)			\
+	do {						\
+		if ((src) != IONIC_STAT_INVALID)	\
+			(dst) = le64_to_cpu((src));	\
+	} while (0)
+
+static void
+ionic_fill_fec_hist(const struct ionic_port_extra_stats *port_extra_stats,
+		    struct ethtool_fec_hist *hist)
+{
+	__le64 fec_cw_err_bin;
+	int i;
+
+	if (port_extra_stats->fec_codeword_error_bin[0] == IONIC_STAT_INVALID)
+		return;
+
+	hist->ranges = ionic_fec_ranges;
+	/* All bins in ranges must be set */
+	for (i = 0; i < ARRAY_SIZE(ionic_fec_ranges) - 1; i++) {
+		fec_cw_err_bin = port_extra_stats->fec_codeword_error_bin[i];
+
+		if (fec_cw_err_bin != IONIC_STAT_INVALID)
+			hist->values[i].sum = le64_to_cpu(fec_cw_err_bin);
+		else
+			hist->values[i].sum = 0;
+	}
+}
+
+static void ionic_get_fec_stats(struct net_device *netdev,
+				struct ethtool_fec_stats *fec_stats,
+				struct ethtool_fec_hist *hist)
+{
+	struct ionic_port_extra_stats *port_extra_stats;
+	struct ionic_lif *lif = netdev_priv(netdev);
+
+	if (!(lif->ionic->ident.dev.capabilities &
+	      cpu_to_le64(IONIC_DEV_CAP_EXTRA_STATS)))
+		return;
+
+	if (!lif->ionic->idev.port_info) {
+		netdev_err_once(netdev, "port_info not initialized\n");
+		return;
+	}
+
+	port_extra_stats = &lif->ionic->idev.port_info->extra_stats;
+
+	IONIC_FEC_STAT(fec_stats->corrected_blocks.total,
+		       port_extra_stats->rsfec_correctable_blocks);
+	IONIC_FEC_STAT(fec_stats->uncorrectable_blocks.total,
+		       port_extra_stats->rsfec_uncorrectable_blocks);
+	IONIC_FEC_STAT(fec_stats->corrected_bits.total,
+		       port_extra_stats->fec_corrected_bits_total);
+
+	ionic_fill_fec_hist(port_extra_stats, hist);
+}
+
 static int ionic_set_fecparam(struct net_device *netdev,
 			      struct ethtool_fecparam *fec)
 {
@@ -1177,6 +1253,7 @@ static const struct ethtool_ops ionic_ethtool_ops = {
 	.get_module_eeprom_by_page	= ionic_get_module_eeprom_by_page,
 	.get_pauseparam		= ionic_get_pauseparam,
 	.set_pauseparam		= ionic_set_pauseparam,
+	.get_fec_stats		= ionic_get_fec_stats,
 	.get_fecparam		= ionic_get_fecparam,
 	.set_fecparam		= ionic_set_fecparam,
 	.get_ts_info		= ionic_get_ts_info,
-- 
2.17.1
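
The gating described in the commit message (skip any statistic that still
holds the "invalid" marker) can be sketched in userspace C. This sketch
assumes the marker is the all-ones 64-bit value, which the memset(..., 0xff,
...) in ionic_port_init() suggests but which is not defined in this message.
The struct and function names are hypothetical, and the le64_to_cpu()
conversion is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sentinel: every bit set, which is what memset(..., 0xff, ...) yields. */
#define STAT_INVALID UINT64_MAX

/* Hypothetical stand-in for the stats area that the device writes into. */
struct extra_stats {
	uint64_t corrected_blocks;
	uint64_t uncorrectable_blocks;
};

/* Hypothetical stand-in for the caller-owned report structure. */
struct fec_report {
	uint64_t corrected_blocks;
	uint64_t uncorrectable_blocks;
};

/* Mark every statistic as "not written by the device yet". */
static void extra_stats_mark_invalid(struct extra_stats *s)
{
	memset(s, 0xff, sizeof(*s));
}

/* Copy a statistic only if the device has replaced the sentinel. */
static void copy_if_valid(uint64_t *dst, uint64_t src)
{
	assert(dst != NULL);

	if (src != STAT_INVALID)
		*dst = src;
}

/* Fields the device never wrote keep whatever the caller initialized. */
static void fill_report(const struct extra_stats *s, struct fec_report *r)
{
	copy_if_valid(&r->corrected_blocks, s->corrected_blocks);
	copy_if_valid(&r->uncorrectable_blocks, s->uncorrectable_blocks);
}
```

Leaving unsupported fields untouched lets the caller's own "not set" marker
survive, so the consumer can tell an unsupported statistic from a genuine
zero.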



* [PATCH net-next v5 4/5] ionic: Get "link_down_count" ext link stat from firmware
From: Eric Joyner @ 2026-06-14 20:53 UTC (permalink / raw)
  To: netdev
  Cc: Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Eric Joyner, Nikhil P . Rao,
	Simon Horman
In-Reply-To: <20260614205303.48088-1-eric.joyner@amd.com>

The number of times that link has gone down at the port level is tracked
by the firmware and sent to the driver via regular DMA writes to an
instance of struct ionic_port_status in the driver's memory.

This statistic was never reported in favor of a driver-derived stat, but
doing it in the driver was never necessary since firmware had been
reporting it the whole time. Since it would be more accurate and true to
the description of the statistic to get this count at the PHY level,
replace the driver-calculated statistic with one derived from the
firmware one and remove the driver-calculated one entirely.

The stat reported by the ethtool .get_link_ext_stats() handler is
normalized to 0 on driver load and on any device reset that requires the
driver to rebuild state; wraparound of the 16-bit firmware counter is
also handled.

Signed-off-by: Eric Joyner <eric.joyner@amd.com>
---
 .../net/ethernet/pensando/ionic/ionic_dev.c   | 10 +++++++++
 .../net/ethernet/pensando/ionic/ionic_dev.h   |  5 +++++
 .../ethernet/pensando/ionic/ionic_ethtool.c   | 22 ++++++++++++++++---
 .../net/ethernet/pensando/ionic/ionic_lif.c   |  4 +++-
 .../net/ethernet/pensando/ionic/ionic_lif.h   |  1 -
 .../net/ethernet/pensando/ionic/ionic_main.c  |  2 ++
 6 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.c b/drivers/net/ethernet/pensando/ionic/ionic_dev.c
index 3838c4a70766..648d9d24be85 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_dev.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.c
@@ -1076,3 +1076,13 @@ bool ionic_q_is_posted(struct ionic_queue *q, unsigned int pos)
 
 	return ((pos - tail) & mask) < ((head - tail) & mask);
 }
+
+void ionic_reset_link_down_count(struct ionic_dev *idev)
+{
+	if (!READ_ONCE(idev->link_down_count_init)) {
+		idev->link_down_count_total = 0;
+		idev->link_down_count_last =
+			le16_to_cpu(idev->port_info->status.link_down_count);
+		WRITE_ONCE(idev->link_down_count_init, true);
+	}
+}
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
index 5f677bcbaf02..db90e39a1442 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
@@ -185,6 +185,9 @@ struct ionic_dev {
 	struct ionic_port_info *port_info;
 	dma_addr_t port_info_pa;
 	struct ionic_port_extra_stats port_extra_stats_cache;
+	bool link_down_count_init;
+	u16 link_down_count_last;
+	u32 link_down_count_total;
 
 	struct ionic_devinfo dev_info;
 };
@@ -395,4 +398,6 @@ bool ionic_adminq_poke_doorbell(struct ionic_queue *q);
 bool ionic_txq_poke_doorbell(struct ionic_queue *q);
 bool ionic_rxq_poke_doorbell(struct ionic_queue *q);
 
+void ionic_reset_link_down_count(struct ionic_dev *idev);
+
 #endif /* _IONIC_DEV_H_ */
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
index 6069fa460913..c4ab4b5caa0a 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
@@ -115,16 +115,32 @@ static void ionic_get_link_ext_stats(struct net_device *netdev,
 				     struct ethtool_link_ext_stats *stats)
 {
 	struct ionic_lif *lif = netdev_priv(netdev);
+	struct ionic *ionic = lif->ionic;
+	u64 link_down_count_total;
+	u16 link_down_count_fw;
 
-	if (lif->ionic->pdev->is_virtfn)
+	if (ionic->pdev->is_virtfn)
 		return;
 
-	if (!lif->ionic->idev.port_info) {
+	if (!ionic->idev.port_info) {
 		netdev_err_once(netdev, "port_info not initialized\n");
 		return;
 	}
 
-	stats->link_down_events = lif->link_down_count;
+	link_down_count_fw =
+	    le16_to_cpu(ionic->idev.port_info->status.link_down_count);
+	link_down_count_total = ionic->idev.link_down_count_total +
+				link_down_count_fw -
+				ionic->idev.link_down_count_last;
+
+	/* The firmware counter is only 16 bits and can wraparound */
+	if (link_down_count_fw < ionic->idev.link_down_count_last)
+		link_down_count_total += BIT(16);
+
+	ionic->idev.link_down_count_last = link_down_count_fw;
+	ionic->idev.link_down_count_total = link_down_count_total;
+
+	stats->link_down_events = link_down_count_total;
 }
 
 static int ionic_get_link_ksettings(struct net_device *netdev,
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index 637e635bbf03..fd3ee9820531 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -140,6 +140,7 @@ void ionic_lif_deferred_enqueue(struct ionic_lif *lif,
 
 static void ionic_link_status_check(struct ionic_lif *lif)
 {
+	struct ionic_dev *idev = &lif->ionic->idev;
 	struct net_device *netdev = lif->netdev;
 	u16 link_status;
 	bool link_up;
@@ -153,6 +154,8 @@ static void ionic_link_status_check(struct ionic_lif *lif)
 		return;
 	}
 
+	ionic_reset_link_down_count(idev);
+
 	link_status = le16_to_cpu(lif->info->status.link_status);
 	link_up = link_status == IONIC_PORT_OPER_STATUS_UP;
 
@@ -179,7 +182,6 @@ static void ionic_link_status_check(struct ionic_lif *lif)
 		}
 	} else {
 		if (netif_carrier_ok(netdev)) {
-			lif->link_down_count++;
 			netdev_info(netdev, "Link down\n");
 			netif_carrier_off(netdev);
 		}
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
index 8e10f66dc50e..d34692462036 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
@@ -214,7 +214,6 @@ struct ionic_lif {
 	bool registered;
 	bool doorbell_wa;
 	u16 lif_type;
-	unsigned int link_down_count;
 	unsigned int nmcast;
 	unsigned int nucast;
 	unsigned int nvlans;
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_main.c b/drivers/net/ethernet/pensando/ionic/ionic_main.c
index 306d9d160e17..6e6f3ed07271 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_main.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_main.c
@@ -737,6 +737,8 @@ int ionic_port_init(struct ionic *ionic)
 	memset(&idev->port_info->extra_stats, 0xff,
 	       sizeof(idev->port_info->extra_stats));
 
+	WRITE_ONCE(idev->link_down_count_init, false);
+
 	sz = min(sizeof(ident->port.config), sizeof(idev->dev_cmd_regs->data));
 
 	mutex_lock(&ionic->dev_cmd_lock);
-- 
2.17.1
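
The accumulation of a wrapping 16-bit firmware counter into a wider running
total, as described in the commit message, can be sketched in userspace C
using unsigned modular subtraction. This is one common way to absorb a single
wrap between samples and is not a copy of the patch's arithmetic; struct
wrap_counter and both helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping; not the ionic_dev fields. */
struct wrap_counter {
	uint16_t last;   /* last raw 16-bit value sampled from the device */
	uint64_t total;  /* events accumulated since the baseline was taken */
};

/* Take a new baseline so the reported total starts again from zero. */
static void wrap_counter_reset(struct wrap_counter *c, uint16_t raw)
{
	assert(c != NULL);

	c->last = raw;
	c->total = 0;
}

/* Fold a new raw sample into the running total and return the total. */
static uint64_t wrap_counter_update(struct wrap_counter *c, uint16_t raw)
{
	uint16_t delta;

	assert(c != NULL);

	/*
	 * Truncating the difference to 16 bits yields the forward distance
	 * modulo 2^16, so one wrap between samples needs no special case.
	 */
	delta = (uint16_t)(raw - c->last);

	c->last = raw;
	c->total += delta;

	return c->total;
}
```

More than one wrap between two samples cannot be detected this way, so the
sampling interval has to be short relative to the event rate.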



* [PATCH net-next v5 2/5] ionic: Update ionic_if.h with new extra port stats
From: Eric Joyner @ 2026-06-14 20:53 UTC (permalink / raw)
  To: netdev
  Cc: Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Eric Joyner, Nikhil P . Rao,
	Simon Horman
In-Reply-To: <20260614205303.48088-1-eric.joyner@amd.com>

Add a new structure to report additional statistics from the firmware to
struct ionic_port_info. This new struct currently only contains FEC
related statistics, but any new port-level statistics collected by the
firmware would go into it.

The new structure is located in the same area as the unused
ionic_port_pb_stats structure, so this patch also removes that and its
supporting enumerations since they were never used in this driver.

Finally, to indicate firmware support for the new structure, introduce a
new device capability that the driver can use to see if the attached
device supports reporting these extra stats.

Signed-off-by: Eric Joyner <eric.joyner@amd.com>
---
 .../net/ethernet/pensando/ionic/ionic_if.h    | 64 ++++---------------
 1 file changed, 11 insertions(+), 53 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_if.h b/drivers/net/ethernet/pensando/ionic/ionic_if.h
index 23d6e2b4791e..0a201422d0c5 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_if.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_if.h
@@ -273,10 +273,12 @@ union ionic_drv_identity {
  * enum ionic_dev_capability - Device capabilities
  * @IONIC_DEV_CAP_VF_CTRL:     Device supports VF ctrl operations
  * @IONIC_DEV_CAP_DISC_CMB:    Device supports CMB discovery operations
+ * @IONIC_DEV_CAP_EXTRA_STATS: Device supports extra stats schema
  */
 enum ionic_dev_capability {
 	IONIC_DEV_CAP_VF_CTRL        = BIT(0),
 	IONIC_DEV_CAP_DISC_CMB       = BIT(1),
+	IONIC_DEV_CAP_EXTRA_STATS    = BIT(4),
 };
 
 /**
@@ -2329,7 +2331,7 @@ struct ionic_qos_identify_comp {
 /* Capri max supported, should be renamed. */
 #define IONIC_QOS_CLASS_MAX		7
 #define IONIC_QOS_PCP_MAX		8
-#define IONIC_QOS_CLASS_NAME_SZ	32
+#define IONIC_QOS_CLASS_NAME_SZ		32
 #define IONIC_QOS_DSCP_MAX		64
 #define IONIC_QOS_ALL_PCP		0xFF
 #define IONIC_DSCP_BLOCK_SIZE		8
@@ -2855,54 +2857,12 @@ struct ionic_mgmt_port_stats {
 	__le64 frames_tx_pause;
 };
 
-enum ionic_pb_buffer_drop_stats {
-	IONIC_BUFFER_INTRINSIC_DROP = 0,
-	IONIC_BUFFER_DISCARDED,
-	IONIC_BUFFER_ADMITTED,
-	IONIC_BUFFER_OUT_OF_CELLS_DROP,
-	IONIC_BUFFER_OUT_OF_CELLS_DROP_2,
-	IONIC_BUFFER_OUT_OF_CREDIT_DROP,
-	IONIC_BUFFER_TRUNCATION_DROP,
-	IONIC_BUFFER_PORT_DISABLED_DROP,
-	IONIC_BUFFER_COPY_TO_CPU_TAIL_DROP,
-	IONIC_BUFFER_SPAN_TAIL_DROP,
-	IONIC_BUFFER_MIN_SIZE_VIOLATION_DROP,
-	IONIC_BUFFER_ENQUEUE_ERROR_DROP,
-	IONIC_BUFFER_INVALID_PORT_DROP,
-	IONIC_BUFFER_INVALID_OUTPUT_QUEUE_DROP,
-	IONIC_BUFFER_DROP_MAX,
-};
-
-enum ionic_oflow_drop_stats {
-	IONIC_OFLOW_OCCUPANCY_DROP,
-	IONIC_OFLOW_EMERGENCY_STOP_DROP,
-	IONIC_OFLOW_WRITE_BUFFER_ACK_FILL_UP_DROP,
-	IONIC_OFLOW_WRITE_BUFFER_ACK_FULL_DROP,
-	IONIC_OFLOW_WRITE_BUFFER_FULL_DROP,
-	IONIC_OFLOW_CONTROL_FIFO_FULL_DROP,
-	IONIC_OFLOW_DROP_MAX,
-};
-
-/* struct ionic_port_pb_stats - packet buffers system stats
- * uses ionic_pb_buffer_drop_stats for drop_counts[]
- */
-struct ionic_port_pb_stats {
-	__le64 sop_count_in;
-	__le64 eop_count_in;
-	__le64 sop_count_out;
-	__le64 eop_count_out;
-	__le64 drop_counts[IONIC_BUFFER_DROP_MAX];
-	__le64 input_queue_buffer_occupancy[IONIC_QOS_TC_MAX];
-	__le64 input_queue_port_monitor[IONIC_QOS_TC_MAX];
-	__le64 output_queue_port_monitor[IONIC_QOS_TC_MAX];
-	__le64 oflow_drop_counts[IONIC_OFLOW_DROP_MAX];
-	__le64 input_queue_good_pkts_in[IONIC_QOS_TC_MAX];
-	__le64 input_queue_good_pkts_out[IONIC_QOS_TC_MAX];
-	__le64 input_queue_err_pkts_in[IONIC_QOS_TC_MAX];
-	__le64 input_queue_fifo_depth[IONIC_QOS_TC_MAX];
-	__le64 input_queue_max_fifo_depth[IONIC_QOS_TC_MAX];
-	__le64 input_queue_peak_occupancy[IONIC_QOS_TC_MAX];
-	__le64 output_queue_buffer_occupancy[IONIC_QOS_TC_MAX];
+struct ionic_port_extra_stats {
+	__le64 rsfec_correctable_blocks;
+	__le64 rsfec_uncorrectable_blocks;
+	__le64 fec_corrected_bits_total;
+	__le64 rx_bits_phy;
+	__le64 fec_codeword_error_bin[16];
 };
 
 /**
@@ -2950,7 +2910,7 @@ union ionic_port_identity {
  * @sprom_page2:     Extended Transceiver sprom, page 2
  * @sprom_page17:    Extended Transceiver sprom, page 17
  * @rsvd:            reserved byte(s)
- * @pb_stats:        uplink pb drop stats
+ * @extra_stats:     Extra port statistics data
  */
 struct ionic_port_info {
 	union ionic_port_config config;
@@ -2968,9 +2928,7 @@ struct ionic_port_info {
 		};
 	};
 	u8     rsvd[376];
-
-	/* pb_stats must start at 2k offset */
-	struct ionic_port_pb_stats  pb_stats;
+	struct ionic_port_extra_stats extra_stats;
 };
 
 /*
-- 
2.17.1



* [PATCH net-next v5 3/5] ionic: Report "rx_bits_phy" stat to ethtool
From: Eric Joyner @ 2026-06-14 20:53 UTC (permalink / raw)
  To: netdev
  Cc: Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Eric Joyner, Nikhil P . Rao,
	Simon Horman
In-Reply-To: <20260614205303.48088-1-eric.joyner@amd.com>

This stat contains the number of total bits that the PHY has received;
it's useful for BER calculations. Add it to the ethtool stats output.

However, since this is one of the new "extra port stats", it's reported
in a different manner than the existing port stats and is only
conditionally added to the ethtool stats output list: the firmware must
both support the DEV_CAP_EXTRA_STATS capability and set the value of
the statistic to something other than IONIC_STAT_INVALID.

To help support this scheme, the extra port stats region is initialized to
0xff's/IONIC_STAT_INVALID by the driver, to ensure the statistics that
the driver knows about but the firmware does not are still invalid
to the driver.
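
As an illustration only, the validity scheme described above can be
sketched in plain userspace C. The struct, the field selection and the
helper names below are hypothetical stand-ins and are not the driver's
definitions; the only ideas carried over from this patch are the 0xff
fill and the comparison against an all-ones "invalid" value. An all-ones
64-bit value reads the same in either byte order, so the sketch needs no
endian conversion for the comparison.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stats region that firmware writes into. */
struct extra_stats {
	uint64_t rx_bits_phy;
	uint64_t fec_corrected_bits_total;
};

/* All-ones marks "firmware never wrote this stat" (assumed sentinel). */
#define STAT_INVALID (~0ULL)

/* Driver side: mark every stat invalid before firmware fills anything in. */
static void extra_stats_init(struct extra_stats *es)
{
	memset(es, 0xff, sizeof(*es));
}

/* Count only the stats that firmware has overwritten with a real value.
 * Without the capability bit, nothing is reported at all.
 */
static unsigned int extra_stats_count(const struct extra_stats *es,
				      int fw_has_cap)
{
	unsigned int count = 0;

	if (!fw_has_cap)
		return 0;
	if (es->rx_bits_phy != STAT_INVALID)
		count++;
	if (es->fec_corrected_bits_total != STAT_INVALID)
		count++;

	return count;
}
```

With this shape, a stat that the driver knows about but older firmware
never writes simply stays at the sentinel and is left out of the count.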

Signed-off-by: Eric Joyner <eric.joyner@amd.com>
---
 .../net/ethernet/pensando/ionic/ionic_dev.h   |  1 +
 .../net/ethernet/pensando/ionic/ionic_main.c  |  6 ++
 .../net/ethernet/pensando/ionic/ionic_stats.c | 65 ++++++++++++++++++-
 .../net/ethernet/pensando/ionic/ionic_stats.h |  2 +
 4 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
index 35566f97eaea..5f677bcbaf02 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
@@ -184,6 +184,7 @@ struct ionic_dev {
 	u32 port_info_sz;
 	struct ionic_port_info *port_info;
 	dma_addr_t port_info_pa;
+	struct ionic_port_extra_stats port_extra_stats_cache;
 
 	struct ionic_devinfo dev_info;
 };
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_main.c b/drivers/net/ethernet/pensando/ionic/ionic_main.c
index 3c5200e2fdb7..306d9d160e17 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_main.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_main.c
@@ -731,6 +731,12 @@ int ionic_port_init(struct ionic *ionic)
 			return -ENOMEM;
 	}
 
+	/* If the driver knows about more "extra stats" than the firmware,
+	 * make sure these stats are marked as invalid.
+	 */
+	memset(&idev->port_info->extra_stats, 0xff,
+	       sizeof(idev->port_info->extra_stats));
+
 	sz = min(sizeof(ident->port.config), sizeof(idev->dev_cmd_regs->data));
 
 	mutex_lock(&ionic->dev_cmd_lock);
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_stats.c b/drivers/net/ethernet/pensando/ionic/ionic_stats.c
index 0107599a9dd4..428d5cca930f 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_stats.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_stats.c
@@ -167,6 +167,7 @@ static const struct ionic_stat_desc ionic_rx_stats_desc[] = {
 #define IONIC_NUM_PORT_STATS ARRAY_SIZE(ionic_port_stats_desc)
 #define IONIC_NUM_TX_STATS ARRAY_SIZE(ionic_tx_stats_desc)
 #define IONIC_NUM_RX_STATS ARRAY_SIZE(ionic_rx_stats_desc)
+#define IONIC_NUM_EXTRA_PORT_STATS	1
 
 #define MAX_Q(lif)   ((lif)->netdev->real_num_tx_queues)
 
@@ -232,6 +233,33 @@ static void ionic_get_lif_stats(struct ionic_lif *lif,
 	stats->hw_tx_aborted_errors = ns.tx_aborted_errors;
 }
 
+static u32 ionic_extra_port_stats_get_count(struct ionic_lif *lif)
+{
+	struct ionic_dev *idev = &lif->ionic->idev;
+	struct ionic_port_extra_stats *pes_cache;
+	u32 count = 0;
+
+	if (!(lif->ionic->ident.dev.capabilities &
+	      cpu_to_le64(IONIC_DEV_CAP_EXTRA_STATS)))
+		return count;
+
+	pes_cache = &idev->port_extra_stats_cache;
+	/* Treat all of the extra port stats as invalid in subsequent calls if
+	 * port_info isn't set; otherwise cache a valid snapshot for them.
+	 */
+	if (!idev->port_info) {
+		memset(pes_cache, 0xff, sizeof(*pes_cache));
+		return count;
+	}
+
+	*pes_cache = idev->port_info->extra_stats;
+
+	if (pes_cache->rx_bits_phy != IONIC_STAT_INVALID)
+		count++;
+
+	return count;
+}
+
 static u64 ionic_sw_stats_get_count(struct ionic_lif *lif)
 {
 	u64 total = 0, tx_queues = MAX_Q(lif), rx_queues = MAX_Q(lif);
@@ -243,7 +271,7 @@ static u64 ionic_sw_stats_get_count(struct ionic_lif *lif)
 		rx_queues += 1;
 
 	total += IONIC_NUM_LIF_STATS;
-	total += IONIC_NUM_PORT_STATS;
+	total += IONIC_NUM_PORT_STATS + ionic_extra_port_stats_get_count(lif);
 
 	total += tx_queues * IONIC_NUM_TX_STATS;
 	total += rx_queues * IONIC_NUM_RX_STATS;
@@ -271,6 +299,20 @@ static void ionic_sw_stats_get_rx_strings(struct ionic_lif *lif, u8 **buf,
 				ionic_rx_stats_desc[i].name);
 }
 
+static void ionic_extra_port_stats_get_strings(struct ionic_lif *lif, u8 **buf)
+{
+	struct ionic_port_extra_stats *pes_cache;
+
+	if (!(lif->ionic->ident.dev.capabilities &
+	    cpu_to_le64(IONIC_DEV_CAP_EXTRA_STATS)))
+		return;
+
+	pes_cache = &lif->ionic->idev.port_extra_stats_cache;
+
+	if (pes_cache->rx_bits_phy != IONIC_STAT_INVALID)
+		ethtool_puts(buf, "rx_bits_phy");
+}
+
 static void ionic_sw_stats_get_strings(struct ionic_lif *lif, u8 **buf)
 {
 	int i, q_num;
@@ -280,6 +322,7 @@ static void ionic_sw_stats_get_strings(struct ionic_lif *lif, u8 **buf)
 
 	for (i = 0; i < IONIC_NUM_PORT_STATS; i++)
 		ethtool_puts(buf, ionic_port_stats_desc[i].name);
+	ionic_extra_port_stats_get_strings(lif, buf);
 
 	for (q_num = 0; q_num < MAX_Q(lif); q_num++)
 		ionic_sw_stats_get_tx_strings(lif, buf, q_num);
@@ -322,6 +365,25 @@ static void ionic_sw_stats_get_rxq_values(struct ionic_lif *lif, u64 **buf,
 	}
 }
 
+static void ionic_extra_port_stats_get_values(struct ionic_lif *lif, u64 **buf)
+{
+	struct ionic_port_extra_stats *pes_cache;
+
+	if (!(lif->ionic->ident.dev.capabilities &
+	      cpu_to_le64(IONIC_DEV_CAP_EXTRA_STATS)))
+		return;
+
+	/* The number of statistics added to @buf here must equal
+	 * ionic_extra_port_stats_get_count().
+	 */
+	pes_cache = &lif->ionic->idev.port_extra_stats_cache;
+
+	if (pes_cache->rx_bits_phy != IONIC_STAT_INVALID) {
+		**buf = le64_to_cpu(pes_cache->rx_bits_phy);
+		(*buf)++;
+	}
+}
+
 static void ionic_sw_stats_get_values(struct ionic_lif *lif, u64 **buf)
 {
 	struct ionic_port_stats *port_stats;
@@ -341,6 +403,7 @@ static void ionic_sw_stats_get_values(struct ionic_lif *lif, u64 **buf)
 					     &ionic_port_stats_desc[i]);
 		(*buf)++;
 	}
+	ionic_extra_port_stats_get_values(lif, buf);
 
 	for (q_num = 0; q_num < MAX_Q(lif); q_num++)
 		ionic_sw_stats_get_txq_values(lif, buf, q_num);
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_stats.h b/drivers/net/ethernet/pensando/ionic/ionic_stats.h
index 2a725834f792..7ed935868e84 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_stats.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_stats.h
@@ -4,6 +4,8 @@
 #ifndef _IONIC_STATS_H_
 #define _IONIC_STATS_H_
 
+#define IONIC_STAT_INVALID	(cpu_to_le64(~0ULL))
+
 #define IONIC_STAT_TO_OFFSET(type, stat_name) (offsetof(type, stat_name))
 
 #define IONIC_STAT_DESC(type, stat_name) { \
-- 
2.17.1



* [PATCH net-next v5 1/5] ionic: Fix check in ionic_get_link_ext_stats
From: Eric Joyner @ 2026-06-14 20:52 UTC (permalink / raw)
  To: netdev
  Cc: Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Eric Joyner, Nikhil P . Rao,
	Simon Horman
In-Reply-To: <20260614205303.48088-1-eric.joyner@amd.com>

From: Brett Creeley <brett.creeley@amd.com>

The current check will fail if SR-IOV is not initialized for the
physical function; this is because is_physfn is 0 if sriov_init() isn't
run or fails. Change the check that prevents getting the link down count
to use is_virtfn instead so that VFs don't get this functionality, which
was the original intent.
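
A minimal sketch of the three cases, assuming only what is stated above
(is_physfn stays 0 on a PF until SR-IOV init succeeds). The struct and
function names are hypothetical and merely mirror the two pci_dev bits:

```c
#include <stdbool.h>

/* Hypothetical mirror of the two function-type bits discussed above. */
struct fn_flags {
	bool is_physfn; /* set on a PF only after SR-IOV init succeeded */
	bool is_virtfn; /* set only on a VF */
};

/* Old check: report only when the PF bit is set. */
static bool report_old(const struct fn_flags *f)
{
	return f->is_physfn;
}

/* New check: report unless the function is a VF. */
static bool report_new(const struct fn_flags *f)
{
	return !f->is_virtfn;
}
```

For a PF without SR-IOV both bits are clear, so the old check wrongly
skips it while the new check reports; for a VF both checks skip.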

Fixes: 132b4ebfa090 ("ionic: add support for ethtool extended stat link_down_count")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Eric Joyner <eric.joyner@amd.com>
---
 drivers/net/ethernet/pensando/ionic/ionic_ethtool.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
index 78a802eb159f..6069fa460913 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
@@ -116,8 +116,15 @@ static void ionic_get_link_ext_stats(struct net_device *netdev,
 {
 	struct ionic_lif *lif = netdev_priv(netdev);
 
-	if (lif->ionic->pdev->is_physfn)
-		stats->link_down_events = lif->link_down_count;
+	if (lif->ionic->pdev->is_virtfn)
+		return;
+
+	if (!lif->ionic->idev.port_info) {
+		netdev_err_once(netdev, "port_info not initialized\n");
+		return;
+	}
+
+	stats->link_down_events = lif->link_down_count;
 }
 
 static int ionic_get_link_ksettings(struct net_device *netdev,
-- 
2.17.1



* [PATCH net-next v5 0/5] ionic: Expose more port stats to ethtool
From: Eric Joyner @ 2026-06-14 20:52 UTC (permalink / raw)
  To: netdev
  Cc: Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Eric Joyner, Nikhil P . Rao,
	Simon Horman

The primary aim of this patchset is to support the reporting of new port
statistics (and one old one) that firmware sends to the driver; these
include general FEC codeword stats and the FEC histogram. A scheme for
these extra stats is introduced in order to prevent devices that don't
support these new statistics from unconditionally setting them or
reporting them in ethtool.

---
v5:
Address new Sashiko AI comments and previous feedback
patch 2:
- Remove additional unused enums that were only used by struct
  ionic_port_pb_stats
- Change IONIC_STAT_INVALID to use (~0ULL) instead of ((__le64)-1) to
  avoid sparse warnings (and moved to patch 3)
patch 4:
- Change ethtool link_down_events count to use firmware-derived value;
  handle overflow and set count to zero on driver load and resets. Finally
  drop the driver-calculated value and ethtool general stat counter and
  only report count in the ethtool extended link stat structure.
patch 5:
- Add check for NULL port_info in .get_fec_stats() callback in order to
  prevent a potential NULL pointer dereference, similar to other ethtool
  callbacks.
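
For readers unfamiliar with the overflow handling mentioned for patch 4,
here is a generic sketch of accumulating a small wrapping counter into a
wider one. It is not taken from patch 4; the tracker struct and helper
names are hypothetical, and the 16-bit width is an assumption based on
the le16_to_cpu() note in the v3 changelog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker for a 16-bit firmware counter that may wrap. */
struct link_down_tracker {
	bool     init;   /* false until the first firmware sample is seen */
	uint16_t last;   /* last raw firmware value */
	uint64_t total;  /* events accumulated since the last reset */
};

/* Called on driver load or reset: start counting from zero again. */
static void tracker_reset(struct link_down_tracker *t)
{
	t->init = false;
	t->last = 0;
	t->total = 0;
}

/* Fold a new firmware sample into the running total. */
static uint64_t tracker_update(struct link_down_tracker *t, uint16_t fw_count)
{
	if (!t->init) {
		/* First sample only sets the baseline, so events from
		 * before the load or reset are not counted.
		 */
		t->last = fw_count;
		t->init = true;
		return t->total;
	}

	/* Truncating the difference to 16 bits handles wrap-around. */
	t->total += (uint16_t)(fw_count - t->last);
	t->last = fw_count;

	return t->total;
}
```

The truncating subtraction is only correct if the counter wraps at most
once between two samples.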

v4:
- A big addition is a new scheme where the driver will only report the
  new statistics supported by firmware if the firmware sets the statistic
  value to something that isn't invalid; more details in patch 3, and this
  is used in patch 5 to prevent FEC stats from being unconditionally
  reported on devices/firmware versions that don't support these stats.
- The link_down_count from firmware mentioned in v2 was moved back to an
  entry in the general ethtool statistics; the existing driver-calculated
  value for the ethtool ext link is more appropriate due to the firmware
  value being a small size and not resetting between driver loads.
- Add netdev_err_once() call recommended by Sashiko for ethtool handler
- Drop devcmd retry logic patch from previous versions for now;
  explicitly include Brett's patch from another patchset to fix the
  ionic_get_link_ext_stats() PF/VF check

v3:
Address issues mostly found by Sashiko:
- Fix potential return of uninitialized variable in __ionic_dev_cmd_wait()
- Fix bounds of wait in __ionic_dev_cmd_wait() to prevent function from
  giving up prematurely now that the wait period has changed
- Add NULL check to ionic_get_link_ext_stats(), following the example set by
  ionic_get_link_ksettings()
- Add missing le16_to_cpu() when copying link_down_count from firmware

v2:
- Add missing cpu_to_le64() to FEC histogram stat assignment
- Remove unused pb_stats field that's replaced by the new FEC/extra stats
- Replace ethtool ext link stat with firmware stat instead of adding
  the firmware stat to general ethtool statistics; remove old driver
  calculated stat
- Add explanation for what EAGAIN return value could be used for in
  commit message

Brett Creeley (1):
  ionic: Fix check in ionic_get_link_ext_stats

Eric Joyner (4):
  ionic: Update ionic_if.h with new extra port stats
  ionic: Report "rx_bits_phy" stat to ethtool
  ionic: Get "link_down_count" ext link stat from firmware
  ionic: Add .get_fec_stats ethtool handler

 .../net/ethernet/pensando/ionic/ionic_dev.c   |  10 ++
 .../net/ethernet/pensando/ionic/ionic_dev.h   |   6 +
 .../ethernet/pensando/ionic/ionic_ethtool.c   | 104 +++++++++++++++++-
 .../net/ethernet/pensando/ionic/ionic_if.h    |  64 ++---------
 .../net/ethernet/pensando/ionic/ionic_lif.c   |   4 +-
 .../net/ethernet/pensando/ionic/ionic_lif.h   |   1 -
 .../net/ethernet/pensando/ionic/ionic_main.c  |   8 ++
 .../net/ethernet/pensando/ionic/ionic_stats.c |  65 ++++++++++-
 .../net/ethernet/pensando/ionic/ionic_stats.h |   2 +
 9 files changed, 206 insertions(+), 58 deletions(-)


base-commit: 93790c374b9d77f3db15786d7d432872d92751cf
-- 
2.17.1



* Re: [PATCH net-next v13 1/2] tcp: rehash onto different local ECMP path on retransmit timeout
From: Neil Spring @ 2026-06-14 20:38 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, edumazet, ncardwell, kuniyu, davem, kuba, dsahern, horms,
	shuah, linux-kselftest, bpf, martin.lau, daniel
In-Reply-To: <6ab86192-f0d4-4a00-83c9-3fd6449dc9dd@redhat.com>

On Sun, Jun 14, 2026 at 6:54 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> >
> On 6/12/26 3:00 AM, Neil Spring wrote:
> > Currently sk_rethink_txhash() re-rolls the socket's txhash on RTO, PLB,
> > and spurious-retransmission events, but the cached route is reused and
> > the new hash is not propagated into the ECMP path selection logic.  Two
> > changes are needed to make rehash select a different local ECMP path:
> >
> > 1. Add __sk_dst_reset() alongside sk_rethink_txhash() in
> >    tcp_write_timeout(), tcp_rcv_spurious_retrans(), and
> >    tcp_plb_check_rehash() so the cached dst is invalidated and the
> >    next transmit triggers a fresh route lookup.
> >
> > 2. Set fl6->mp_hash from sk_txhash (or tcp_rsk(req)->txhash for
> >    SYN/ACK retransmits and syncookies) in tcp_v6_connect(),
> >    inet6_sk_rebuild_header(), inet6_csk_route_req(),
> >    inet6_csk_route_socket(), tcp_v6_send_response(), and
> >    cookie_v6_check() so fib6_select_path() picks a path based on the
> >    new hash.
> >
> > The mp_hash override only applies to fib_multipath_hash_policy 0 (the
> > default L3 policy).  Its hash includes the flow label, but that is 0 by
> > default -- np->flow_label is unset, and auto_flowlabels only computes
> > the on-wire label later, per packet -- so flows to the same peer share
> > one local path.  Keying the hash on sk_txhash makes the local path
> > per-connection and lets a rehash re-select it.  Policies 1-3 are left
> > unchanged.
> >
> > The mp_hash assignment is factored into a small helper,
> > ip6_ecmp_set_mp_hash(), shared by inet6_csk_route_req(),
> > inet6_csk_route_socket(), tcp_v6_connect(), inet6_sk_rebuild_header(),
> > tcp_v6_send_response(), and cookie_v6_check().  It applies
> > (txhash >> 1) ?: 1 for policy 0 (the >> 1 keeps mp_hash in the 31-bit
> > range; ?: 1 keeps it non-zero, since 0 would fall back to
> > rt6_multipath_hash()).  inet6_csk_route_socket() calls it only for
> > sk_protocol == IPPROTO_TCP so that non-TCP callers (e.g., L2TP via
> > inet6_csk_xmit) fall through to rt6_multipath_hash() and retain their
> > existing flow-key-based ECMP behavior.
> >
> > tcp_v6_send_response() also sets mp_hash from the response txhash so
> > that a control packet (a RST from the full socket, or an ACK from a
> > time-wait socket) selects the same local ECMP nexthop as the
> > connection's txhash rather than falling back to the flow hash.  The
> > time-wait socket's tw_txhash is copied from sk_txhash when the
> > connection enters TIME_WAIT, so it reflects any rehash that occurred.
> >
> > Setting mp_hash explicitly is necessary because the default ECMP hash
> > derives from fl6->flowlabel via np->flow_label, which is not updated
> > from sk_txhash (REPFLOW is off by default).  ip6_make_flowlabel()
> > cannot help either, as it runs after the route lookup.
> >
> > As a consequence, for policy 0 the local ECMP path of an IPv6 TCP
> > flow follows sk_txhash even when fl6->flowlabel is non-zero, e.g. a
> > reflected (REPFLOW) or explicitly set (IPV6_FLOWLABEL_MGR) flow
> > label.  This is intentional: only local path selection changes, so
> > rehash can recover from a failed path; the on-wire flow label is
> > unchanged.
> >
> > sk_set_txhash() is moved before ip6_dst_lookup_flow() in
> > tcp_v6_connect() so the initial ECMP path is selected by the same
> > txhash that subsequent route rebuilds will use.  This avoids
> > unintended path changes when the cached dst is naturally invalidated
> > (e.g., by PMTU discovery or route changes).
> >
> > The rehash sites (tcp_write_timeout(), tcp_plb_check_rehash(), and
> > tcp_rcv_spurious_retrans()) call __sk_rethink_txhash_reset_dst(),
> > which re-rolls the txhash and, when it changed, drops the cached dst
> > so the next transmit re-runs route selection.  The dst reset is
> > guarded by sk->sk_family == AF_INET6 since IPv4 ECMP does not
> > currently use sk_txhash for path selection.  For IPv4-mapped IPv6
> > sockets this produces a redundant dst reset on a cold path
> > (RTO/PLB); the subsequent IPv4 route lookup returns the same result.
> > The helper is deliberately separate from sk_rethink_txhash() itself:
> > dst_negative_advice() calls sk_rethink_txhash() before its own dst op,
> > so resetting the dst inside sk_rethink_txhash() would skip that op
> > (e.g. rt6_remove_exception_rt()).
> >
> > For syncookies, cookie_init_sequence() computes the cookie value
> > before route_req() and sets txhash so the SYN-ACK selects the same
> > ECMP path that cookie_v6_check() will use when the full socket is
> > created.  cookie_tcp_reqsk_init() derives txhash from the cookie so
> > the full socket's ECMP path matches the SYN-ACK.  Both the SYN-ACK
> > assignment in tcp_conn_request() and the full-socket assignment in
> > cookie_tcp_reqsk_init() are keyed on the packet family
> > (skb->protocol == ETH_P_IPV6), not sk->sk_family: a dual-stack
> > AF_INET6 listener also serves IPv4 connections, and the v4 cookie has
> > mssind bits that would bias TX queue distribution if used as txhash.
> > IPv4 connections retain net_tx_rndhash().
> >
> > cookie_init_sequence() is split from the former version that also
> > called tcp_synq_overflow() and incremented SYNCOOKIESSENT; those
> > side effects are now in cookie_record_sent(), called after
> > route_req() succeeds so they are not bumped when route_req() fails.
> > cookie_record_sent() is guarded by CONFIG_SYN_COOKIES to
> > match the guard on tcp_synq_overflow().  route_req() receives 0 as
> > tw_isn for the syncookie path so that tcp_v6_init_req() still saves
> > ireq->pktopts for REPFLOW flowlabel reflection and IPv6 cmsg
> > options.  The ecn_ok clear for syncookies without timestamps stays
> > after tcp_ecn_create_request() so it takes precedence.
> >
> > Signed-off-by: Neil Spring <ntspring@meta.com>
>
> This looks good to me, with a minor comment below.
>
> > diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
> > index df479277fb80..cc71d84df42b 100644
> > --- a/net/ipv4/syncookies.c
> > +++ b/net/ipv4/syncookies.c
> > @@ -280,9 +280,18 @@ static int cookie_tcp_reqsk_init(struct sock *sk, struct sk_buff *skb,
> >       treq->snt_synack = 0;
> >       treq->snt_tsval_first = 0;
> >       treq->tfo_listener = false;
> > -     treq->txhash = net_tx_rndhash();
> >       treq->rcv_isn = ntohl(th->seq) - 1;
> >       treq->snt_isn = ntohl(th->ack_seq) - 1;
> > +     if (skb->protocol == htons(ETH_P_IPV6)) {
> > +             /* Use the cookie as txhash so the ECMP path matches
> > +              * the SYN-ACK, where txhash was also set to the
> > +              * cookie.  The original request socket (and its
> > +              * txhash) was freed after sending the SYN-ACK.
> > +              */
> > +             treq->txhash = treq->snt_isn;
> > +     } else {
> > +             treq->txhash = net_tx_rndhash();
>
> I'm wondering if it would make sense to always use snt_isn for txhash in
> the syn cookie case, regardless of the IP protocol. Beyond reducing the
> differences between ipv4 and ipv6, it would make the code a little simpler.

I added the conditional in v11 after the AI review poked at how this wasn't
plumbed for IPv4; it seemed safer and clearer about v6-vs-v4 use of txhash,
but I wasn't weighing a goal of avoiding protocol-specific paths.

I will try no-conditional again in v14 with this feedback, happy to change
back if anyone has a strong opinion.
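
For reference, the "(txhash >> 1) ?: 1" mapping from the commit message
can be written in portable C roughly as below. This is a standalone
sketch, not the ip6_ecmp_set_mp_hash() helper itself; the function name
is made up for illustration and the policy check is left out.

```c
#include <stdint.h>

/* Sketch of mapping a 32-bit txhash to a non-zero 31-bit mp_hash. */
static uint32_t mp_hash_from_txhash(uint32_t txhash)
{
	uint32_t h = txhash >> 1;  /* keep the value in the 31-bit range */

	/* 0 would mean "no override", so substitute 1 in that case. */
	return h ? h : 1;
}
```

Note that txhash values 0, 1, 2 and 3 all map to 1, which seems harmless
for a hash whose inputs are effectively random.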

-neil

>
> Not a blocker in any case.
>
> Still I think this could deserve an explicit ack from Eric.
>
> /P
>


* Re: [PATCH net-next] r8169: migrate Rx path to page_pool
From: Heiner Kallweit @ 2026-06-14 20:26 UTC (permalink / raw)
  To: atharva-potdar, nic_swsd, andrew+netdev, davem, edumazet, kuba,
	pabeni
  Cc: netdev
In-Reply-To: <20260614054137.32181-1-atharvapotdar07@gmail.com>

On 14.06.2026 07:41, atharva-potdar wrote:
> Replace the driver-managed skb+copy Rx model with page_pool
> zero-copy in preparation for XDP support.
> 
> Key changes:
> - Allocate order-0 pages via page_pool instead of alloc_pages + dma_map
> - Build skbs directly from pages with napi_build_skb (zero-copy)
> - Add rtl8169_rx_refill() to replenish descriptors after processing
> - Track dirty_rx boundary for efficient refill scheduling
> - Cap max_mtu to R8169_RX_BUF_SIZE - VLAN_ETH_HLEN - ETH_FCS_LEN
>   (order-0 pages can't support arbitrary jumbo frames)
> 
If I read this correctly, max_mtu may be lower with this patch.
This may cause a regression for existing users.
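
To put a rough number on it, here is a back-of-the-envelope sketch of the
cap from the patch description. The helper names are made up, and the
page size, cache line size and sizeof(struct skb_shared_info) are passed
in as parameters because they depend on architecture and config; 320
bytes for the shared info is only an assumed, commonly seen 64-bit value.

```c
#include <stddef.h>

#define SKETCH_XDP_HEADROOM	256	/* XDP_PACKET_HEADROOM */
#define SKETCH_VLAN_ETH_HLEN	18	/* VLAN_ETH_HLEN */
#define SKETCH_ETH_FCS_LEN	4	/* ETH_FCS_LEN */

/* Round v up to the next multiple of a. */
static size_t align_up(size_t v, size_t a)
{
	return (v + a - 1) / a * a;
}

/* Mirrors the new R8169_RX_BUF_SIZE formula from the patch. */
static size_t rx_buf_size(size_t page_size, size_t shinfo_size,
			  size_t cache_line)
{
	return page_size - align_up(SKETCH_XDP_HEADROOM, 8) -
	       align_up(shinfo_size, cache_line);
}

/* Mirrors the max_mtu cap stated in the patch description. */
static size_t max_mtu_cap(size_t page_size, size_t shinfo_size,
			  size_t cache_line)
{
	return rx_buf_size(page_size, shinfo_size, cache_line) -
	       SKETCH_VLAN_ETH_HLEN - SKETCH_ETH_FCS_LEN;
}
```

Under those assumptions, max_mtu_cap(4096, 320, 64) gives 3498, compared
with a previous Rx buffer size of SZ_16K - 1.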

> Tested on RTL8168h with iperf3 (~470 Mbps, 0 retransmits) and
> 1000 pings (0 drops).
> 
Assuming your link speed is 1Gbps, 470Mbps is quite low.

Did you also test on non-x86 architectures? We had DMA-related regressions
in the past which showed up on certain non-x86 architectures only.

> Signed-off-by: atharva-potdar <atharvapotdar07@gmail.com>
> ---
>  drivers/net/ethernet/realtek/r8169_main.c | 128 ++++++++++++++--------
>  1 file changed, 85 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index ec4fc21fa..9d8d678ac 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -31,6 +31,7 @@
>  #include <linux/unaligned.h>
>  #include <net/ip6_checksum.h>
>  #include <net/netdev_queues.h>
> +#include <net/page_pool/helpers.h>
>  #include <net/phy/realtek_phy.h>
>  
>  #include "r8169.h"
> @@ -70,7 +71,9 @@
>  #define InterFrameGap	0x03	/* 3 means InterFrameGap = the shortest one */
>  
>  #define R8169_REGS_SIZE		256
> -#define R8169_RX_BUF_SIZE	(SZ_16K - 1)
> +#define R8169_RX_HEADROOM	ALIGN(XDP_PACKET_HEADROOM, 8)
> +#define R8169_RX_BUF_SIZE	(PAGE_SIZE - R8169_RX_HEADROOM - \
> +				 SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
>  #define NUM_TX_DESC	256	/* Number of Tx descriptor registers */
>  #define NUM_RX_DESC	256	/* Number of Rx descriptor registers */
>  #define R8169_TX_RING_BYTES	(NUM_TX_DESC * sizeof(struct TxDesc))
> @@ -737,6 +740,7 @@ struct rtl8169_private {
>  	enum mac_version mac_version;
>  	enum rtl_dash_type dash_type;
>  	u32 cur_rx; /* Index into the Rx descriptor buffer of next Rx pkt. */
> +	u32 dirty_rx; /* Index of first Rx descriptor needing a new buffer */
>  	u32 cur_tx; /* Index into the Tx descriptor buffer of next Rx pkt. */
>  	u32 dirty_tx;
>  	struct TxDesc *TxDescArray;	/* 256-aligned Tx descriptor ring */
> @@ -745,6 +749,8 @@ struct rtl8169_private {
>  	dma_addr_t RxPhyAddr;
>  	struct page *Rx_databuff[NUM_RX_DESC];	/* Rx data buffers */
>  	struct ring_info tx_skb[NUM_TX_DESC];	/* Tx data buffers */
> +	struct page_pool *page_pool;
> +	u32 rx_buf_sz;
>  	u16 cp_cmd;
>  	u16 tx_lpi_timer;
>  	u32 irq_mask;
> @@ -4148,37 +4154,27 @@ static int rtl8169_change_mtu(struct net_device *dev, int new_mtu)
>  	return 0;
>  }
>  
> -static void rtl8169_mark_to_asic(struct RxDesc *desc)
> +static void rtl8169_mark_to_asic(struct RxDesc *desc, u32 rx_buf_sz)
>  {
>  	u32 eor = le32_to_cpu(desc->opts1) & RingEnd;
>  
>  	desc->opts2 = 0;
>  	/* Force memory writes to complete before releasing descriptor */
>  	dma_wmb();
> -	WRITE_ONCE(desc->opts1, cpu_to_le32(DescOwn | eor | R8169_RX_BUF_SIZE));
> +	WRITE_ONCE(desc->opts1, cpu_to_le32(DescOwn | eor | rx_buf_sz));
>  }
>  
>  static struct page *rtl8169_alloc_rx_data(struct rtl8169_private *tp,
>  					  struct RxDesc *desc)
>  {
> -	struct device *d = tp_to_dev(tp);
> -	int node = dev_to_node(d);
> -	dma_addr_t mapping;
>  	struct page *data;
>  
> -	data = alloc_pages_node(node, GFP_KERNEL, get_order(R8169_RX_BUF_SIZE));
> +	data = page_pool_dev_alloc_pages(tp->page_pool);
>  	if (!data)
>  		return NULL;
>  
> -	mapping = dma_map_page(d, data, 0, R8169_RX_BUF_SIZE, DMA_FROM_DEVICE);
> -	if (unlikely(dma_mapping_error(d, mapping))) {
> -		netdev_err(tp->dev, "Failed to map RX DMA!\n");
> -		__free_pages(data, get_order(R8169_RX_BUF_SIZE));
> -		return NULL;
> -	}
> -
> -	desc->addr = cpu_to_le64(mapping);
> -	rtl8169_mark_to_asic(desc);
> +	desc->addr = cpu_to_le64(page_pool_get_dma_addr(data) + R8169_RX_HEADROOM);
> +	rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
>  
>  	return data;
>  }
> @@ -4187,15 +4183,17 @@ static void rtl8169_rx_clear(struct rtl8169_private *tp)
>  {
>  	int i;
>  
> -	for (i = 0; i < NUM_RX_DESC && tp->Rx_databuff[i]; i++) {
> -		dma_unmap_page(tp_to_dev(tp),
> -			       le64_to_cpu(tp->RxDescArray[i].addr),
> -			       R8169_RX_BUF_SIZE, DMA_FROM_DEVICE);
> -		__free_pages(tp->Rx_databuff[i], get_order(R8169_RX_BUF_SIZE));
> +	for (i = 0; i < NUM_RX_DESC; i++) {
> +		if (!tp->Rx_databuff[i])
> +			continue;
> +		page_pool_put_full_page(tp->page_pool, tp->Rx_databuff[i], true);
>  		tp->Rx_databuff[i] = NULL;
>  		tp->RxDescArray[i].addr = 0;
>  		tp->RxDescArray[i].opts1 = 0;
>  	}
> +
> +	page_pool_destroy(tp->page_pool);
> +	tp->page_pool = NULL;
>  }
>  
>  static int rtl8169_rx_fill(struct rtl8169_private *tp)
> @@ -4221,11 +4219,28 @@ static int rtl8169_rx_fill(struct rtl8169_private *tp)
>  
>  static int rtl8169_init_ring(struct rtl8169_private *tp)
>  {
> +	struct page_pool_params pp_params = { 0 };
> +
>  	rtl8169_init_ring_indexes(tp);
> +	tp->dirty_rx = 0;
> +	tp->rx_buf_sz = R8169_RX_BUF_SIZE;
>  
>  	memset(tp->tx_skb, 0, sizeof(tp->tx_skb));
>  	memset(tp->Rx_databuff, 0, sizeof(tp->Rx_databuff));
>  
> +	pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV;
> +	pp_params.order = 0;
> +	pp_params.pool_size = NUM_RX_DESC;
> +	pp_params.nid = dev_to_node(tp_to_dev(tp));
> +	pp_params.dev = tp_to_dev(tp);
> +	pp_params.dma_dir = DMA_FROM_DEVICE;
> +	pp_params.offset = R8169_RX_HEADROOM;
> +	pp_params.max_len = tp->rx_buf_sz;
> +
> +	tp->page_pool = page_pool_create(&pp_params);
> +	if (IS_ERR(tp->page_pool))
> +		return PTR_ERR(tp->page_pool);
> +
>  	return rtl8169_rx_fill(tp);
>  }
>  
> @@ -4312,7 +4327,7 @@ static void rtl_reset_work(struct rtl8169_private *tp)
>  	rtl8169_cleanup(tp);
>  
>  	for (i = 0; i < NUM_RX_DESC; i++)
> -		rtl8169_mark_to_asic(tp->RxDescArray + i);
> +		rtl8169_mark_to_asic(tp->RxDescArray + i, tp->rx_buf_sz);
>  
>  	napi_enable(&tp->napi);
>  	rtl_hw_start(tp);
> @@ -4776,9 +4791,8 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>  	for (count = 0; count < budget; count++, tp->cur_rx++) {
>  		unsigned int pkt_size, entry = tp->cur_rx % NUM_RX_DESC;
>  		struct RxDesc *desc = tp->RxDescArray + entry;
> +		struct page *page;
>  		struct sk_buff *skb;
> -		const void *rx_buf;
> -		dma_addr_t addr;
>  		u32 status;
>  
>  		status = le32_to_cpu(READ_ONCE(desc->opts1));
> @@ -4791,6 +4805,9 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>  		 */
>  		dma_rmb();
>  
> +		page = tp->Rx_databuff[entry];
> +		tp->Rx_databuff[entry] = NULL;
> +
>  		if (unlikely(status & RxRES)) {
>  			if (net_ratelimit())
>  				netdev_warn(dev, "Rx ERROR. status = %08x\n",
> @@ -4802,9 +4819,9 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>  				dev->stats.rx_crc_errors++;
>  
>  			if (!(dev->features & NETIF_F_RXALL))
> -				goto release_descriptor;
> +				goto recycle;
>  			else if (status & RxRWT || !(status & (RxRUNT | RxCRC)))
> -				goto release_descriptor;
> +				goto recycle;
>  		}
>  
>  		pkt_size = status & GENMASK(13, 0);
> @@ -4817,24 +4834,23 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>  		if (unlikely(rtl8169_fragmented_frame(status))) {
>  			dev->stats.rx_dropped++;
>  			dev->stats.rx_length_errors++;
> -			goto release_descriptor;
> +			goto recycle;
>  		}
>  
> -		skb = napi_alloc_skb(&tp->napi, pkt_size);
> +		dma_sync_single_for_cpu(d,
> +					page_pool_get_dma_addr(page) +
> +					R8169_RX_HEADROOM,
> +					pkt_size, DMA_FROM_DEVICE);
> +
> +		skb = napi_build_skb(page_address(page), PAGE_SIZE);
>  		if (unlikely(!skb)) {
>  			dev->stats.rx_dropped++;
> -			goto release_descriptor;
> +			goto recycle;
>  		}
>  
> -		addr = le64_to_cpu(desc->addr);
> -		rx_buf = page_address(tp->Rx_databuff[entry]);
> -
> -		dma_sync_single_for_cpu(d, addr, pkt_size, DMA_FROM_DEVICE);
> -		prefetch(rx_buf);
> -		skb_copy_to_linear_data(skb, rx_buf, pkt_size);
> -		skb->tail += pkt_size;
> -		skb->len = pkt_size;
> -		dma_sync_single_for_device(d, addr, pkt_size, DMA_FROM_DEVICE);
> +		skb_reserve(skb, R8169_RX_HEADROOM);
> +		skb_put(skb, pkt_size);
> +		skb_mark_for_recycle(skb);
>  
>  		rtl8169_rx_csum(skb, status);
>  		skb->protocol = eth_type_trans(skb, dev);
> @@ -4847,13 +4863,34 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>  		napi_gro_receive(&tp->napi, skb);
>  
>  		dev_sw_netstats_rx_add(dev, pkt_size);
> -release_descriptor:
> -		rtl8169_mark_to_asic(desc);
> +
> +		continue;
> +
> +recycle:
> +		page_pool_put_full_page(tp->page_pool, page, true);
>  	}
>  
>  	return count;
>  }
>  
> +static void rtl8169_rx_refill(struct rtl8169_private *tp)
> +{
> +	u32 dirty_rx = tp->dirty_rx;
> +
> +	while (dirty_rx != tp->cur_rx) {
> +		u32 entry = dirty_rx % NUM_RX_DESC;
> +
> +		if (!tp->Rx_databuff[entry]) {
> +			tp->Rx_databuff[entry] = rtl8169_alloc_rx_data(tp,
> +								       tp->RxDescArray + entry);
> +			if (!tp->Rx_databuff[entry])
> +				break;
> +		}
> +		dirty_rx++;
> +	}
> +	tp->dirty_rx = dirty_rx;
> +}
> +
>  static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  {
>  	struct rtl8169_private *tp = dev_instance;
> @@ -4921,6 +4958,7 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
>  	rtl_tx(dev, tp, budget);
>  
>  	work_done = rtl_rx(dev, tp, budget);
> +	rtl8169_rx_refill(tp);
>  
>  	if (work_done < budget && napi_complete_done(napi, work_done))
>  		rtl_irq_enable(tp);
> @@ -5775,8 +5813,12 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	}
>  
>  	jumbo_max = rtl_jumbo_max(tp);
> -	if (jumbo_max)
> -		dev->max_mtu = jumbo_max;
> +	if (jumbo_max) {
> +		unsigned int page_pool_mtu;
> +
> +		page_pool_mtu = R8169_RX_BUF_SIZE - VLAN_ETH_HLEN - ETH_FCS_LEN;
> +		dev->max_mtu = min_t(int, jumbo_max, page_pool_mtu);
> +	}
>  
>  	rtl_set_irq_mask(tp);
>  
> @@ -5808,7 +5850,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  
>  	if (jumbo_max)
>  		netdev_info(dev, "jumbo features [frames: %d bytes, tx checksumming: %s]\n",
> -			    jumbo_max, tp->mac_version <= RTL_GIGA_MAC_VER_06 ?
> +			    dev->max_mtu, tp->mac_version <= RTL_GIGA_MAC_VER_06 ?
>  			    "ok" : "ko");
>  
>  	if (tp->dash_type != RTL_DASH_NONE) {

