From: Jacob Keller <jacob.e.keller@intel.com>
To: Miroslav Lichvar <mlichvar@redhat.com>,
Kurt Kanzenbach <kurt@linutronix.de>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
Przemek Kitszel <przemyslaw.kitszel@intel.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
Richard Cochran <richardcochran@gmail.com>,
Vinicius Costa Gomes <vinicius.gomes@intel.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Vadim Fedorenko <vadim.fedorenko@linux.dev>,
<intel-wired-lan@lists.osuosl.org>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH iwl-next v4] igb: Retrieve Tx timestamp from BH workqueue
Date: Mon, 9 Mar 2026 01:37:54 -0700
Message-ID: <ecacc5fd-b3b0-4d66-83ab-4152e4ed22b8@intel.com>
In-Reply-To: <aaf8xVPWQ0-y1BnX@localhost>
On 3/4/2026 1:35 AM, Miroslav Lichvar wrote:
> On Tue, Mar 03, 2026 at 02:38:11PM +0100, Kurt Kanzenbach wrote:
>>> It would be great, if you shared the numbers. Did Miroslav already test
>>> this?
>>
>> Great question. I did test with ptp4l and synchronization looks fine <
>> below 10ns back to back as expected. I did not test with ntpperf,
>> because I was never able to reproduce the NTP regression to the same
>> extent as Miroslav reported. Therefore, Miroslav is on Cc in case he
>> wants to test it. Let's see.
>
> I ran the same test with I350 as before and there still seems to be a
> regression, but interestingly it's quite different to the previous versions of
> the patch. It's like there is a load-sensitive on/off switch.
>
Could you help me understand this data? I think I've been confused every
time I've looked at this.

You mention a load-sensitive on/off switch... which I guess kind of
makes sense with the numbers here:

> Without the patch:
>
>                |           responses           | response time (ns)
>   rate clients |    lost invalid   basic  xleave |     min    mean     max stddev
> 150000   15000     0.00%   0.00%   0.00% 100.00%     +4188  +36475 +193328  16179
> 157500   15750     0.02%   0.00%   0.02%  99.96%     +6373  +42969 +683894  22682
> 165375   16384     0.03%   0.00%   0.00%  99.97%     +7911  +43960 +692471  24454
> 173643   16384     0.06%   0.00%   0.00%  99.94%     +8323  +45627 +707240  28452
> 182325   16384     0.06%   0.00%   0.00%  99.94%     +8404  +47292 +722524  26936
> 191441   16384     0.00%   0.00%   0.00% 100.00%     +8930  +51738 +223727  14272
> 201013   16384     0.05%   0.00%   0.00%  99.95%     +9634  +53696 +776445  23783
> 211063   16384     0.00%   0.00%   0.00% 100.00%    +14393  +54558 +329546  20473
> 221616   16384     2.59%   0.00%   0.05%  97.36%    +23924 +321205 +518192  21838
> 232696   16384     7.00%   0.00%   0.10%  92.90%    +33396 +337709 +575661  21017
> 244330   16384    10.82%   0.00%   0.15%  89.03%    +34188 +340248 +556237  20880
>
This is without the patch: "lost" is 0% at the low rates, and we have a
lower response time mean, max, and standard deviation... But xleave is 100%.

> With the patch:
> 150000   15000     5.11%   0.00%   0.00%  94.88%     +4426 +460642 +640884  83746
> 157500   15750    11.54%   0.00%   0.26%  88.20%    +14434 +543656 +738355  30349
> 165375   16384    15.61%   0.00%   0.31%  84.08%    +35822 +515304 +833859  25596
> 173643   16384    19.58%   0.00%   0.37%  80.05%    +20762 +568962 +900100  28118
> 182325   16384    23.46%   0.00%   0.42%  76.13%    +41829 +547974 +804170  27890
> 191441   16384    27.23%   0.00%   0.46%  72.31%    +15182 +557920 +798212  28868
> 201013   16384    30.51%   0.00%   0.49%  69.00%    +15980 +560764 +805576  29979
> 211063   16384     0.06%   0.00%   0.00%  99.94%    +12668  +80487 +410555  62182
> 221616   16384     2.94%   0.00%   0.05%  97.00%    +21587 +342769 +517566  23359
> 232696   16384     6.94%   0.00%   0.10%  92.96%    +16581 +336068 +484574  18453
> 244330   16384    11.45%   0.00%   0.14%  88.41%    +23608 +345023 +564130  19177
>
With the fix, we have a higher lost percentage, which sounds bad to
me..? And we have a higher response time (which also sounds bad??) and
we have a much worse standard deviation across all the values from low
to high rate.

I guess I just don't understand what these numbers mean and why it's
"better" with the patch. Perhaps it's the naming? Or perhaps "xleave" is
bad, and this is showing that with the patch we get less of that? But
that looks like it gets consistently lower as the rate and number of
clients go up.

> At 211063 requests per second and higher the performance looks the
> same. But at the lower rates there is a clear drop. The higher
> mean response time (difference between server TX and RX timestamps)
> indicates more of the provided TX timestamps are hardware timestamps
> and the chrony server timestamp statistics confirm that.
>
So you're saying a higher mean response time is... better? What is it
really measuring then? Oh, I see. It has a higher response time because
it takes longer to get a Tx timestamp, but the provided timestamp is
higher quality. Previously it was using software timestamps, so it
could reply faster (since it takes less time to get the software
timestamp back out), but the quality is lower?

Ok. That makes a bit more sense...
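
To make the comparison between the two runs concrete, here is a minimal
sketch that lines up the rows quoted above by request rate. It is only an
illustration: the rows are copied by hand from the two tables, the names
`compare` and `regressed_rates` are hypothetical and not part of ntpperf
or chrony, and the 1 percentage point loss threshold is an arbitrary
assumption rather than anything proposed in this thread.

```python
# Rows copied from the ntpperf output quoted above:
# (rate, lost %, xleave %, mean response time in ns)
WITHOUT_PATCH = [
    (150000, 0.00, 100.00, 36475),
    (157500, 0.02, 99.96, 42969),
    (165375, 0.03, 99.97, 43960),
    (173643, 0.06, 99.94, 45627),
    (182325, 0.06, 99.94, 47292),
    (191441, 0.00, 100.00, 51738),
    (201013, 0.05, 99.95, 53696),
    (211063, 0.00, 100.00, 54558),
    (221616, 2.59, 97.36, 321205),
    (232696, 7.00, 92.90, 337709),
    (244330, 10.82, 89.03, 340248),
]
WITH_PATCH = [
    (150000, 5.11, 94.88, 460642),
    (157500, 11.54, 88.20, 543656),
    (165375, 15.61, 84.08, 515304),
    (173643, 19.58, 80.05, 568962),
    (182325, 23.46, 76.13, 547974),
    (191441, 27.23, 72.31, 557920),
    (201013, 30.51, 69.00, 560764),
    (211063, 0.06, 99.94, 80487),
    (221616, 2.94, 97.00, 342769),
    (232696, 6.94, 92.96, 336068),
    (244330, 11.45, 88.41, 345023),
]


def compare(before, after):
    """Yield (rate, change in lost % in points, ratio of mean response times)."""
    for (rate, lost_b, _, mean_b), (rate_a, lost_a, _, mean_a) in zip(before, after):
        if rate != rate_a:
            raise ValueError("rows must be listed at the same rates")
        yield rate, round(lost_a - lost_b, 2), round(mean_a / mean_b, 1)


def regressed_rates(before, after, lost_threshold=1.0):
    """Rates where the patched run loses more than lost_threshold extra points."""
    return [rate for rate, d_lost, _ in compare(before, after) if d_lost > lost_threshold]


if __name__ == "__main__":
    for rate, d_lost, mean_ratio in compare(WITHOUT_PATCH, WITH_PATCH):
        print(f"{rate:>7}  lost {d_lost:+6.2f} pts  mean x{mean_ratio}")
    print("regressed:", regressed_rates(WITHOUT_PATCH, WITH_PATCH))
```

With this threshold, the rates flagged are the seven from 150000 through
201013, and nothing from 211063 upward, which matches the "on/off switch"
description in the quoted text.
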
> So, my interpretation is that like with the earlier version of the
> patch it's trading performance for timestamp quality at lower rates,
> but unlike the earlier version it's not losing performance at the
> higher rates. That seems acceptable to me. Admins of busy servers
> might need to decide if they should keep HW timestamping enabled. In
> theory, chrony could have an option to do that automatically.
>
> Thanks,
>
Thread overview: 8+ messages
2026-03-03 11:48 [PATCH iwl-next v4] igb: Retrieve Tx timestamp from BH workqueue Kurt Kanzenbach
2026-03-03 12:27 ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-03 13:18 ` Paul Menzel
2026-03-03 13:38 ` Kurt Kanzenbach
2026-03-04 9:35 ` Miroslav Lichvar
2026-03-05 8:55 ` Kurt Kanzenbach
2026-03-09 8:37 ` Jacob Keller [this message]
2026-03-09 16:05 ` [Intel-wired-lan] " Miroslav Lichvar