All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Eric Joyner <eric.joyner@amd.com>
Cc: netdev@vger.kernel.org, brett.creeley@amd.com,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com
Subject: Re: [PATCH net-next v3 2/5] ionic: Get "link_down_count" ext link stat from firmware
Date: Tue, 9 Jun 2026 14:26:40 -0700	[thread overview]
Message-ID: <20260609142640.261b7d87@kernel.org> (raw)
In-Reply-To: <f2d17ae9-ec9f-46db-b46a-a2bb680c3460@amd.com>

On Tue, 9 Jun 2026 11:00:03 -0700 Eric Joyner wrote:
> On 5/14/2026 6:27 PM, Jakub Kicinski wrote:
> > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> > 
> > 
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.  
> 
> Man, it's been forever since I've been able to get back to this.
> 
> > ---
> > [Low]
> > Should this NULL check sit inside the is_physfn branch below? The
> > port_info pointer is only dereferenced on the PF path, so a VF calling
> > ethtool on link ext stats will now log a netdev_err even though the VF
> > path never reads port_info.  
> 
> Sure.
> 
> > 
> > The message is also unrate-limited and reachable from userspace via
> > ethtool. port_info is freed and set to NULL in ionic_port_reset() and
> > reallocated in ionic_port_init() on FW reset / FLR / PCI error recovery
> > paths (ionic_reset_prepare()/ionic_reset_done() in ionic_bus_pci.c, and
> > ionic_restart_lif() in ionic_lif.c), so any ethtool query during that
> > transient window will emit a kernel log line where the previous code
> > silently did nothing.
> >   
> >>
> >> -     if (lif->ionic->pdev->is_physfn)
> >> -             stats->link_down_events = lif->link_down_count;
> >> +     if (ionic->pdev->is_physfn)
> >> +             stats->link_down_events =
> >> +                 le16_to_cpu(ionic->idev.port_info->status.link_down_count);
> >>  }  
> 
> I can use netdev_err_once() here, instead? There's no "netdev_err_ratelimited()"
> like how there is for struct device and "dev_err_ratelimited()", interestingly.
> Would that be more appropriate in the future?
> 
> There's also a similar issue in ionic_get_link_ksettings() below (and maybe
> elsewhere?), but that can be fixed in a separate patch.

This is a nit, up to you.

> > [Medium]
> > The firmware-supplied link_down_count is a __le16 in struct
> > ionic_port_status (ionic_if.h), so the value saturates at 65,535. The
> > previous lif->link_down_count was an unsigned int (32-bit), and the
> > ethtool field stats->link_down_events is u64 and treated as monotonic
> > by userspace consumers (per the comment on struct
> > ethtool_link_ext_stats in include/linux/ethtool.h).
> > 
> > Without driver-side rollover detection that accumulates the carry into
> > a wider counter, after 65,536 link bounces the reported value will
> > appear to go backward to a small number. Should the driver detect a
> > firmware value smaller than the previously-read value and fold the
> > delta into a 32/64-bit running tally before assigning to
> > stats->link_down_events?  
> 
> Is this a big enough issue to deal with? Is link really going to go down
> sixty-five thousand times while the driver is loaded? I guess we could have a
> running tally, but that seems like more effort than this is worth.

Seems unlikely indeed, but drivers routinely deal with such wrapping.
It's like 5 lines of code, no?

	u16 val = le16_to_cpu(...

        if ((u16)stats->link_down_events > val)
               stats->link_down_events += BIT(16);
        stats->link_down_events &= 0xffff0000;
	stats->link_down_events |= val;

> > [Medium]
> > Does this change make link_down_events non-persistent across firmware
> > resets? port_info lives in DMA-coherent memory that
> > ionic_port_reset() frees and clears:
> > 
> > drivers/net/ethernet/pensando/ionic/ionic_main.c:ionic_port_reset() {
> >     ...
> >     dma_free_coherent(...);
> >     idev->port_info = NULL;
> >     ...
> > }
> > 
> > and ionic_port_init() reallocates a fresh zeroed page. These are
> > called from the FLR / PCI error recovery path
> > (ionic_reset_prepare()/ionic_reset_done() in ionic_bus_pci.c) and from
> > ionic_restart_lif() in ionic_lif.c, so FW-reset events that previously
> > left the driver-resident counter untouched will now zero this stat.
> > 
> > The commit message does not mention this behavioral change. Should the
> > driver snapshot the firmware value before each reset and accumulate it
> > across reset boundaries to keep the ethtool counter monotonic?  
> 
> This count doesn't get reset in the firmware when the device the ionic driver
> binds to gets reset; the freshly zeroed page will get the count from the
> firmware again soon after the reset occurs.

This is also not great, operationally. Monitoring systems may expect
counters to start from 0 after boot (IIUC you're saying that unless
the card goes thru a full reset we'll keep the value from previous
boot).


  reply	other threads:[~2026-06-09 21:26 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-12 22:44 [PATCH net-next v3 0/5] Expose more port stats to ethtool Eric Joyner
2026-05-12 22:44 ` [PATCH net-next v3 1/5] ionic: Small improvements in devcmd retry logic Eric Joyner
2026-05-15  1:26   ` Jakub Kicinski
2026-05-19  0:01     ` Eric Joyner
2026-05-19  0:23       ` Jakub Kicinski
2026-05-12 22:44 ` [PATCH net-next v3 2/5] ionic: Get "link_down_count" ext link stat from firmware Eric Joyner
2026-05-15  1:27   ` Jakub Kicinski
2026-06-09 18:00     ` Eric Joyner
2026-06-09 21:26       ` Jakub Kicinski [this message]
2026-06-09 22:18         ` Eric Joyner
2026-06-09 22:26           ` Eric Joyner
2026-05-12 22:44 ` [PATCH net-next v3 3/5] ionic: Update ionic_if.h with new extra port stats structure Eric Joyner
2026-05-12 22:44 ` [PATCH net-next v3 4/5] ionic: Report rx_bits_phy stat to ethtool Eric Joyner
2026-05-12 22:44 ` [PATCH net-next v3 5/5] ionic: Add .get_fec_stats ethtool handler Eric Joyner
2026-05-15  1:32   ` Jakub Kicinski
2026-05-15 19:59     ` Eric Joyner
2026-05-15 23:02       ` Jakub Kicinski
2026-05-19  0:06         ` Eric Joyner
2026-05-19  0:22           ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260609142640.261b7d87@kernel.org \
    --to=kuba@kernel.org \
    --cc=andrew+netdev@lunn.ch \
    --cc=brett.creeley@amd.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.joyner@amd.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.