netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Per-queue stats question
@ 2024-08-15 17:11 Edward Cree
  2024-08-15 18:01 ` Simon Horman
  2024-08-15 19:42 ` Jakub Kicinski
  0 siblings, 2 replies; 5+ messages in thread
From: Edward Cree @ 2024-08-15 17:11 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Network Development

I'm working on adding netdev_stat_ops support to sfc, and finding that
 the expectations of the selftest around the relation between qstats
 and rtnl stats are difficult for us to meet.  I'm not sure whether it
 is our existing rtnl stats or the qstats I'm adding that have the
 wrong semantics.

sfc fills in rtnl_link_stats64 with MAC stats from the firmware (or
 'vadaptor stats' if using SR-IOV).  These count packets (or bytes)
 since last FW boot/reset (for instance, "ethtool --reset $dev all"
 clears them).  (Also, for reasons I'm still investigating, while the
 interface is administratively down they read as zero, then jump back
 to what they were on "ip link set up".)  Moreover, the counts are
 updated by periodic DMA, so can be up to 1 second stale.
The queue stats, meanwhile, are maintained in software, and count
 since ifup (efx_start_channels()), so that they can be reset on
 reconfiguration; the base_stats count since driver probe
 (efx_alloc_channels()).

Thus, as it stands, it is possible for qstats and rtstats to disagree,
 in both directions.  For example:
* Driver is unloaded and then loaded again.  base_stats will reset,
  but MAC stats won't.
* ethtool reset.  MAC stats will reset, but base_stats won't.
* Traffic is passing during the test.  qstats will be up to date,
  whereas MAC stats, being up to 1s stale, could be far behind.
* RX filter drops (e.g. unwanted destination MAC address).  These are
  counted in MAC stats but since they never reach the driver they're
  not counted in qstats/base_stats (and by my reading of netdev.yaml
  they shouldn't be, even if we could).

Any of these will cause the stats.pkt_byte_sum selftest to fail.
Which side do I need to change, qstats or rtstats?  Or is the test
 being too strict?

On a related note, I notice that the stat_cmp() function within that
 selftest returns the first nonzero delta it finds in the stats, so
 that if (say) tx-packets goes forwards but rx-packets goes backwards,
 it will return >0 causing the rx-packets delta to be ignored.  Is
 this intended behaviour, or should I submit a patch?

-ed

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Per-queue stats question
  2024-08-15 17:11 Per-queue stats question Edward Cree
@ 2024-08-15 18:01 ` Simon Horman
  2024-08-15 19:42 ` Jakub Kicinski
  1 sibling, 0 replies; 5+ messages in thread
From: Simon Horman @ 2024-08-15 18:01 UTC (permalink / raw)
  To: Edward Cree; +Cc: Jakub Kicinski, Network Development

On Thu, Aug 15, 2024 at 06:11:42PM +0100, Edward Cree wrote:
> I'm working on adding netdev_stat_ops support to sfc, and finding that
>  the expectations of the selftest around the relation between qstats
>  and rtnl stats are difficult for us to meet.  I'm not sure whether it
>  is our existing rtnl stats or the qstats I'm adding that have the
>  wrong semantics.
> 
> sfc fills in rtnl_link_stats64 with MAC stats from the firmware (or
>  'vadaptor stats' if using SR-IOV).  These count packets (or bytes)
>  since last FW boot/reset (for instance, "ethtool --reset $dev all"
>  clears them).  (Also, for reasons I'm still investigating, while the
>  interface is administratively down they read as zero, then jump back
>  to what they were on "ip link set up".)  Moreover, the counts are
>  updated by periodic DMA, so can be up to 1 second stale.
> The queue stats, meanwhile, are maintained in software, and count
>  since ifup (efx_start_channels()), so that they can be reset on
>  reconfiguration; the base_stats count since driver probe
>  (efx_alloc_channels()).
> 
> Thus, as it stands, it is possible for qstats and rtstats to disagree,
>  in both directions.  For example:
> * Driver is unloaded and then loaded again.  base_stats will reset,
>   but MAC stats won't.
> * ethtool reset.  MAC stats will reset, but base_stats won't.
> * Traffic is passing during the test.  qstats will be up to date,
>   whereas MAC stats, being up to 1s stale, could be far behind.
> * RX filter drops (e.g. unwanted destination MAC address).  These are
>   counted in MAC stats but since they never reach the driver they're
>   not counted in qstats/base_stats (and by my reading of netdev.yaml
>   they shouldn't be, even if we could).
> 
> Any of these will cause the stats.pkt_byte_sum selftest to fail.
> Which side do I need to change, qstats or rtstats?  Or is the test
>  being too strict?

Maybe speaking out of turn (or through my hat) but my opinion is that the
test is too strict.

This is because I believe that in practice there can be constraints which
make it impractical or undesirable for stats to be fetched directly from
hardware. And that this leads to some level of caching in software. Which
in turn leads to some scope for stats to be out of date.

I think that is fine, so long as the stats don't lag too much. Where, IMHO
too much might be say more than 0.1s, or something like that. Because, in
my experience, that is fine as a user-experience: either a user, looking at
stats, or a process analysing stats for usage patterns.

> 
> On a related note, I notice that the stat_cmp() function within that
>  selftest returns the first nonzero delta it finds in the stats, so
>  that if (say) tx-packets goes forwards but rx-packets goes backwards,
>  it will return >0 causing the rx-packets delta to be ignored.  Is
>  this intended behaviour, or should I submit a patch?
> 
> -ed
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Per-queue stats question
  2024-08-15 17:11 Per-queue stats question Edward Cree
  2024-08-15 18:01 ` Simon Horman
@ 2024-08-15 19:42 ` Jakub Kicinski
  2024-08-16  9:19   ` Joe Damato
  1 sibling, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2024-08-15 19:42 UTC (permalink / raw)
  To: Edward Cree; +Cc: Network Development

On Thu, 15 Aug 2024 18:11:42 +0100 Edward Cree wrote:
> I'm working on adding netdev_stat_ops support to sfc, and finding that
>  the expectations of the selftest around the relation between qstats
>  and rtnl stats are difficult for us to meet.  I'm not sure whether it
>  is our existing rtnl stats or the qstats I'm adding that have the
>  wrong semantics.
> 
> sfc fills in rtnl_link_stats64 with MAC stats from the firmware (or
>  'vadaptor stats' if using SR-IOV).  These count packets (or bytes)
>  since last FW boot/reset (for instance, "ethtool --reset $dev all"
>  clears them).  (Also, for reasons I'm still investigating, while the
>  interface is administratively down they read as zero, then jump back
>  to what they were on "ip link set up".)  Moreover, the counts are
>  updated by periodic DMA, so can be up to 1 second stale.
> The queue stats, meanwhile, are maintained in software, and count
>  since ifup (efx_start_channels()), so that they can be reset on
>  reconfiguration; the base_stats count since driver probe
>  (efx_alloc_channels()).
> 
> Thus, as it stands, it is possible for qstats and rtstats to disagree,
>  in both directions.  For example:

[reordering for grouped answer]

> * Driver is unloaded and then loaded again.  base_stats will reset,
>   but MAC stats won't.
> * ethtool reset.  MAC stats will reset, but base_stats won't.
> * RX filter drops (e.g. unwanted destination MAC address).  These are
>   counted in MAC stats but since they never reach the driver they're
>   not counted in qstats/base_stats (and by my reading of netdev.yaml
>   they shouldn't be, even if we could).

rstats have no clear semantics on modern devices, some drivers count
at the MAC (potentially including VF traffic), some drivers count after
XDP (i.e. don't count XDP_DROP!?!)

We should maintain the qstat semantics as packets intended for a given
netdev, with rx-packets being packets which got delivered to the host
and picked up by the driver.

> * Traffic is passing during the test.  qstats will be up to date,
>   whereas MAC stats, being up to 1s stale, could be far behind.

That's a bug, we should pop a env.wait_hw_stats_settle() in the right
spots in the test.

> Any of these will cause the stats.pkt_byte_sum selftest to fail.
> Which side do I need to change, qstats or rtstats?  Or is the test
>  being too strict?

Test is too strict, I'm not sure what to do about it. It has proven useful
in the past, mlx5 has "misc queues" for PTP, for example, and it caught
that they are added to rstats but weren't added to the base. IDK what
to do about drivers which use MAC stats for rstat :( The fact that the
test fails doesn't mean they misuse qstat.

> On a related note, I notice that the stat_cmp() function within that
>  selftest returns the first nonzero delta it finds in the stats, so
>  that if (say) tx-packets goes forwards but rx-packets goes backwards,
>  it will return >0 causing the rx-packets delta to be ignored.  Is
>  this intended behaviour, or should I submit a patch?

Looks like a bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Per-queue stats question
  2024-08-15 19:42 ` Jakub Kicinski
@ 2024-08-16  9:19   ` Joe Damato
  2024-08-16 15:44     ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Joe Damato @ 2024-08-16  9:19 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Edward Cree, Network Development

On Thu, Aug 15, 2024 at 12:42:47PM -0700, Jakub Kicinski wrote:
> On Thu, 15 Aug 2024 18:11:42 +0100 Edward Cree wrote:
[...]
> > On a related note, I notice that the stat_cmp() function within that
> >  selftest returns the first nonzero delta it finds in the stats, so
> >  that if (say) tx-packets goes forwards but rx-packets goes backwards,
> >  it will return >0 causing the rx-packets delta to be ignored.  Is
> >  this intended behaviour, or should I submit a patch?
> 
> Looks like a bug.

FWIW, while debugging the stats stuff on mlx5, I tweaked the
stat_cmp function to output a lot more information about each of the
values to help me debug.

It seemed too verbose for an upstream patch at the time, but since
you are going through the same process a patch might make a lot of
sense.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Per-queue stats question
  2024-08-16  9:19   ` Joe Damato
@ 2024-08-16 15:44     ` Jakub Kicinski
  0 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2024-08-16 15:44 UTC (permalink / raw)
  To: Joe Damato; +Cc: Edward Cree, Network Development

On Fri, 16 Aug 2024 10:19:48 +0100 Joe Damato wrote:
> On Thu, Aug 15, 2024 at 12:42:47PM -0700, Jakub Kicinski wrote:
> > > On a related note, I notice that the stat_cmp() function within that
> > >  selftest returns the first nonzero delta it finds in the stats, so
> > >  that if (say) tx-packets goes forwards but rx-packets goes backwards,
> > >  it will return >0 causing the rx-packets delta to be ignored.  Is
> > >  this intended behaviour, or should I submit a patch?  
> > 
> > Looks like a bug.  
> 
> FWIW, while debugging the stats stuff on mlx5, I tweaked the
> stat_cmp function to output a lot more information about each of the
> values to help me debug.
> 
> It seemed too verbose for an upstream patch at the time, but since
> you are going through the same process a patch might make a lot of
> sense.

We are in a desperate need of a better debugging flow :(
I hope Mohsin's patch to add verbose printing is ready soon,
it should be _a_ start.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-08-16 15:44 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-15 17:11 Per-queue stats question Edward Cree
2024-08-15 18:01 ` Simon Horman
2024-08-15 19:42 ` Jakub Kicinski
2024-08-16  9:19   ` Joe Damato
2024-08-16 15:44     ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).