From: Stephen Hemminger <stephen@networkplumber.org>
To: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Cc: <dev@dpdk.org>, <rasland@nvidia.com>, <matan@nvidia.com>,
<suanmingm@nvidia.com>
Subject: Re: [PATCH 1/1] doc: add mlx5 xstats send scheduling counters description
Date: Mon, 28 Oct 2024 08:57:08 -0700 [thread overview]
Message-ID: <20241028085708.0060bc6f@hermes.local> (raw)
In-Reply-To: <20241028142741.1609088-1-viacheslavo@nvidia.com>
On Mon, 28 Oct 2024 16:27:41 +0200
Viacheslav Ovsiienko <viacheslavo@nvidia.com> wrote:
> The mlx5 provides the scheduling send on time capability.
> The check the operating status of this feature the xstats
> counters are provided. This patch adds the counter descriptions
> and provides some meaningful information how to interpret
> the counter values in runtime.
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
> doc/guides/nics/mlx5.rst | 48 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index f82e2d75de..8d1a1311d4 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -2655,3 +2655,51 @@ Destroy GENEVE TLV parser for specific port::
>
> This command doesn't destroy the global list,
> For releasing options, ``flush`` command should be used.
> +
> +
> +Extended statistics counters
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Send scheduling related xstats counters
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The mlx5 PMD provides the set of tx_pp feature related counters to provide debug and diagnostics
> +on send packet scheduling. These counters are applicable only if port was probed with ``tx_pp``
> +devarg and reflect the status of PMD scheduling infrastructure based on Clock and Rearm Queues.
> +This infrastructure provedies the Send Scheduling capability on CX6DX NICs as temporary workaround
> +and should not be engaged on the newer hardware.
> +
> +- ``tx_pp_missed_interrupt_errors`` - the Rearm Queue interrupt was not serviced in time. EAL handles
> + interrupts in dedicated thread and, possible, there were another time-consuming actions were taken.
> +
> +- ``tx_pp_rearm_queue_errors`` - hardware errors occurred on Rearm Queue, usually it is caused by not
> + servicing interrupts in time
> +
> +- ``tx_pp_clock_queue_errors`` - hardware errors occurred on Clock Queue, usually it indicates some
> + configuration or internal NIC hardware or firmware issues
> +
> +- ``tx_pp_timestamp_past_errors`` - application tried to send packet(s) with specifying timestamp in the past.
> + This counter is useful to check and debug the application code, it does not indicate PMD malfunction.
> +
> +- ``tx_pp_timestamp_future_errors`` - application tried to send packet(s) with specifying timestamp
> + in the too distant future, beyond the hardware capabilities to schedule the sending
> + This counter is useful to check and debug the application code, it does not indicate PMD malfunction.
> +
> +- ``tx_pp_jitter`` - this counter exposes the internal NIC realtime clock jitter estimation between two
> + neighbour Clock Queue completions in nanoseconds. Significant jitter might alert about clock
> + synchronization issues (say, some system PTP agent might adjust NIC clock in inappropriate way)
> +
> +- ``tx_pp_wander`` - the counter exposes the longterm internal NUC realtime clock stability - tx_pp_wander
> + for 2^24 completions, in nanoseconds. Significant wander might indicate clock synchronization issues.
> +
> +- ``tx_pp_sync_lost`` - the general operating indicator, the non-zero value says the driver lost
> + the Clock Queue synchronization and scheduling does not operate correctly. The port must be restarted
> + to restore the correct scheduling functioning.
> +
> +The following counters are extremely useful for application code check and debug, these ones do not
> +indicate driver or hardware mulfunctions, and are also applicable for the newer hardware (with direct
> +on time scheduling capabilities - ConnectX-7 and above):
> +
> +- ``tx_pp_timestamp_order_errors`` - application tried to send packet(s) with timestamps in not
> + strictly ascending order. Because of PMD does not reorder packets in the hardware queues, scheduling
> + timestamps order violation causes sending packets in wrong moments of time.
Lots of grammar and spelling errors and overly wordy.
Please spend some time cleaning up the wording, find a writer or AI tool to help.
next prev parent reply other threads:[~2024-10-28 15:57 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-28 14:27 [PATCH 1/1] doc: add mlx5 xstats send scheduling counters description Viacheslav Ovsiienko
2024-10-28 15:57 ` Stephen Hemminger [this message]
2024-10-31 8:04 ` [PATCH v2] " Viacheslav Ovsiienko
2024-11-27 0:36 ` Stephen Hemminger
2024-11-27 12:18 ` Thomas Monjalon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241028085708.0060bc6f@hermes.local \
--to=stephen@networkplumber.org \
--cc=dev@dpdk.org \
--cc=matan@nvidia.com \
--cc=rasland@nvidia.com \
--cc=suanmingm@nvidia.com \
--cc=viacheslavo@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.