From: Jacob Keller <jacob.e.keller@intel.com>
To: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Cc: Saeed Mahameed <saeed@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
"Jonathan Corbet" <corbet@lwn.net>,
Richard Cochran <richardcochran@gmail.com>,
"Tariq Toukan" <tariqt@nvidia.com>, Gal Pressman <gal@nvidia.com>,
Vadim Fedorenko <vadim.fedorenko@linux.dev>,
Andrew Lunn <andrew@lunn.ch>,
Heiner Kallweit <hkallweit1@gmail.com>,
Przemek Kitszel <przemyslaw.kitszel@intel.com>,
"Ahmed Zaki" <ahmed.zaki@intel.com>,
Alexander Lobakin <aleksander.lobakin@intel.com>,
Hangbin Liu <liuhangbin@gmail.com>,
"Paul Greenwalt" <paul.greenwalt@intel.com>,
Justin Stitt <justinstitt@google.com>,
Randy Dunlap <rdunlap@infradead.org>,
Maxime Chevallier <maxime.chevallier@bootlin.com>,
Kory Maincent <kory.maincent@bootlin.com>,
Wojciech Drewek <wojciech.drewek@intel.com>,
Vladimir Oltean <vladimir.oltean@nxp.com>,
Jiri Pirko <jiri@resnulli.us>,
Alexandre Torgue <alexandre.torgue@foss.st.com>,
Jose Abreu <joabreu@synopsys.com>,
"Dragos Tatulea" <dtatulea@nvidia.com>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <linux-doc@vger.kernel.org>
Subject: Re: [PATCH RFC net-next v1 1/6] ethtool: add interface to read Tx hardware timestamping statistics
Date: Fri, 23 Feb 2024 14:48:51 -0800
Message-ID: <a84df9ec-475d-4ffc-a975-a0911a57901e@intel.com>
In-Reply-To: <875xyex10q.fsf@nvidia.com>
On 2/23/2024 2:21 PM, Rahul Rameshbabu wrote:
>> The Intel ice drivers has the following Tx timestamp statistics:
>>
>> tx_hwtstamp_skipped - indicates when we get a Tx timestamp request but
>> are unable to fulfill it.
>> tx_hwtstamp_timeouts - indicates we had a Tx timestamp skb waiting for a
>> timestamp from hardware but it didn't get received within some internal
>> time limit.
>
> This is interesting. In mlx5 land, the only case where we are unable to
> fulfill a hwtstamp is when the timestamp information is lost or late.
>
For ice, the timestamps are captured in the PHY and stored in a block of
registers with limited slots. The driver tracks the available slots and
uses one when a Tx timestamp request comes in.
So we have "skipped" because it's possible to request too many timestamps
at once and fill up all the slots before the first timestamp is reported
back.
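
As a rough sketch of the slot accounting described above (the struct,
function names, and slot count are made up for illustration and are not
taken from the ice driver), a request either claims a free slot or is
counted as skipped:

```c
#include <assert.h>
#include <stdint.h>

#define TS_NUM_SLOTS 8	/* hypothetical slot count, not the real hardware value */

struct ts_tracker {
	uint32_t in_use;	/* bit n set => slot n has an outstanding request */
	uint64_t skipped;	/* requests dropped because every slot was busy */
};

/*
 * Try to claim a slot for a Tx timestamp request.
 * Returns the slot index, or -1 (and bumps skipped) when all slots are full.
 */
static int ts_request_slot(struct ts_tracker *t)
{
	for (int i = 0; i < TS_NUM_SLOTS; i++) {
		if (!(t->in_use & (1u << i))) {
			t->in_use |= 1u << i;
			return i;
		}
	}
	t->skipped++;
	return -1;
}

/* Release a slot once its timestamp has been reported back. */
static void ts_complete_slot(struct ts_tracker *t, int slot)
{
	assert(slot >= 0 && slot < TS_NUM_SLOTS);
	t->in_use &= ~(1u << slot);
}
```

With all slots claimed, the next request returns -1 and only the skipped
counter changes; completing any slot makes it available again.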
> lost for us means that the timestamp never arrived within some internal
> time limit that our device will supposedly never be able to deliver
> timestamp information after that point.
>
That is more or less the equivalent we have for timeout.
> late for us is that we got hardware timestamp information delivered
> after that internal time limit. We are able to track this by using
> identifiers in our completion events and we only release references to
> these identifiers upon delivery (never delivering leaks the references.
> Enough build up leads to a recovery flow). The theory for us is that
> late timestamp information arrival after that period of time should not
> happen. However the truth is that it does happen and we want our driver
> implementation to be resilient to this case rather than trusting the
> time interval.
>
In our case, because of how the slots work, once we "timeout" a slot, it
could get re-used. We set the timeout to be pretty ridiculous (1 second)
to ensure that if we do time out, it's almost certainly because hardware
never timestamped the packet.
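
A minimal sketch of that timeout check, assuming a per-slot start time in
milliseconds (the types and names here are hypothetical; only the 1 second
limit comes from the text above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TS_TIMEOUT_MS 1000	/* the 1 second limit mentioned above */

struct ts_slot {
	bool in_use;
	uint64_t start_ms;	/* when the request was handed to hardware */
};

struct ts_stats {
	uint64_t timeouts;
};

/*
 * Expire a slot whose timestamp has not arrived within TS_TIMEOUT_MS.
 * Returns true if the slot was timed out and freed for re-use.
 */
static bool ts_slot_check_timeout(struct ts_slot *s, struct ts_stats *st,
				  uint64_t now_ms)
{
	assert(now_ms >= s->start_ms);

	if (!s->in_use || now_ms - s->start_ms <= TS_TIMEOUT_MS)
		return false;

	s->in_use = false;	/* after this point the slot may be re-used */
	st->timeouts++;
	return true;
}
```

Because the slot is freed on timeout, a timestamp that shows up later
could land in a slot that already belongs to a different packet, which is
why the limit is set so generously.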
> Do you have any example of a case of skipping timestamp information that
> is not related to lack of delivery over time? I am wondering if this
> case is more like a hardware error or not. Or is it more like something
> along the lines of being busy/would impact line rate of timestamp
> information must be recorded?
>
The main example for skipped is the event where all our slots are full
at point of timestamp request.
There have also been a few rare cases, such as a link event or the MAC
dropping a packet, where the PHY simply never gets the packet and thus
never timestamps it. That typically results in a lost timestamp.
Flushed, for us, is when we reset the timestamp block while it has
timestamps outstanding. This can happen for example due to link changes,
where we ultimately
>> tx_hwtstamp_flushed - indicates that we flushed an outstanding timestamp
>> before it completed, such as if the link resets or similar.
>> tx_hwtstamp_discarded - indicates that we obtained a timestamp from
>> hardware but were unable to complete it due to invalid cached data used
>> for timestamp extension.
>>
>> I think these could be translated roughly to one of the lost, late, or
>> err stats. I am a bit confused as to how drivers could distinguish
>> between lost and late, but I guess that depends on the specific hardware
>> design.
>>
>> In theory we could keep some of these more detailed stats but I don't
>> think we strictly need to be as detailed as the ice driver is.
>
> We also converged a statistic in the mlx5 driver to the simple error
> counter here. I think what makes sense is design specific counters
> should be exposed as driver specific counters and more common counters
> should be converged into the ethtool_ts_stats struct.
>
Sure that seems reasonable.
>>
>> The only major addition I think is the skipped stat, which I would
>> prefer to have. Perhaps that could be tracked in the netdev layer by
>> checking whether the skb flags to see whether or not the driver actually
>> set the appropriate flag?
>
> I guess the problem is how would the core stack know at what layer this
> was skipped at (I think Kory's patch series can be used to help with
> this since it's adding a common interface in ethtool to select the
> timestamping layer). As of today, mlx5 is the only driver I know of that
> supports selecting between the DMA and PHY layers for timestamp
> information.
>
Well, the way the interface worked in my understanding was that the core
sets the SKBTX_HW_TSTAMP flag. The driver is supposed to then prepare
the packet for timestamp and set the SKBTX_IN_PROGRESS flag. I just
looked though, and it looks like ice doesn't actually set this flag...
If we fixed this, then in theory the stack could check after the packet
is sent: if SKBTX_HW_TSTAMP was set but SKBTX_IN_PROGRESS isn't, then it
would be a skipped timestamp?
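
Something along these lines, purely as a sketch: the SKETCH_* bits below
are stand-ins chosen for illustration (the real flags are defined in
include/linux/skbuff.h), and the helper and stats struct are hypothetical,
not an existing core API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits for this sketch only; see include/linux/skbuff.h
 * for the real SKBTX_* definitions. */
#define SKETCH_SKBTX_HW_TSTAMP		(1u << 0)
#define SKETCH_SKBTX_IN_PROGRESS	(1u << 2)

struct tx_ts_stats {
	uint64_t skipped;
};

/*
 * A timestamp was requested (HW_TSTAMP) but the driver never marked the
 * packet as being timestamped (IN_PROGRESS): treat it as skipped.
 */
static bool tx_tstamp_was_skipped(uint8_t tx_flags)
{
	return (tx_flags & SKETCH_SKBTX_HW_TSTAMP) &&
	       !(tx_flags & SKETCH_SKBTX_IN_PROGRESS);
}

/* Hypothetical accounting hook, called after the driver's xmit returns. */
static void account_tx_tstamp(struct tx_ts_stats *st, uint8_t tx_flags)
{
	if (tx_tstamp_was_skipped(tx_flags))
		st->skipped++;
}
```

This only works if every driver reliably sets the in-progress flag, which
is exactly what ice would need to fix first.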
It's not really a huge deal, and this could just be lumped into either
lost or err.