Re: [Intel-wired-lan] [PATCH iwl-next v4] igb: Retrieve Tx timestamp from BH workqueue

From: Jacob Keller <jacob.e.keller@intel.com>
To: Miroslav Lichvar <mlichvar@redhat.com>,
	Kurt Kanzenbach <kurt@linutronix.de>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	Przemek Kitszel <przemyslaw.kitszel@intel.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	Richard Cochran <richardcochran@gmail.com>,
	Vinicius Costa Gomes <vinicius.gomes@intel.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Vadim Fedorenko <vadim.fedorenko@linux.dev>,
	<intel-wired-lan@lists.osuosl.org>, <netdev@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH iwl-next v4] igb: Retrieve Tx timestamp from BH workqueue
Date: Mon, 9 Mar 2026 01:37:54 -0700
Message-ID: <ecacc5fd-b3b0-4d66-83ab-4152e4ed22b8@intel.com>
In-Reply-To: <aaf8xVPWQ0-y1BnX@localhost>

On 3/4/2026 1:35 AM, Miroslav Lichvar wrote:
> On Tue, Mar 03, 2026 at 02:38:11PM +0100, Kurt Kanzenbach wrote:
>>> It would be great, if you shared the numbers. Did Miroslav already test 
>>> this?
>>
>> Great question. I did test with ptp4l and synchronization looks fine <
>> below 10ns back to back as expected. I did not test with ntpperf,
>> because I was never able to reproduce the NTP regression to the same
>> extent as Miroslav reported. Therefore, Miroslav is on Cc in case he
>> wants to test it. Let's see.
> 
> I ran the same test with I350 as before and there still seems to be a
> regression, but interestingly it's quite different to the previous versions of
> the patch. It's like there is a load-sensitive on/off switch.
> 

Could you help me understand this data? I think I've been confused every
time I've looked at this.

You mention a load-sensitive on/off switch, which I guess kind of
makes sense given the numbers here:

> Without the patch:
> 
>                |          responses            |        response time (ns)
> rate   clients |  lost invalid   basic  xleave |    min    mean     max stddev
> 150000   15000   0.00%   0.00%   0.00% 100.00%    +4188  +36475 +193328  16179
> 157500   15750   0.02%   0.00%   0.02%  99.96%    +6373  +42969 +683894  22682
> 165375   16384   0.03%   0.00%   0.00%  99.97%    +7911  +43960 +692471  24454
> 173643   16384   0.06%   0.00%   0.00%  99.94%    +8323  +45627 +707240  28452
> 182325   16384   0.06%   0.00%   0.00%  99.94%    +8404  +47292 +722524  26936
> 191441   16384   0.00%   0.00%   0.00% 100.00%    +8930  +51738 +223727  14272
> 201013   16384   0.05%   0.00%   0.00%  99.95%    +9634  +53696 +776445  23783
> 211063   16384   0.00%   0.00%   0.00% 100.00%   +14393  +54558 +329546  20473
> 221616   16384   2.59%   0.00%   0.05%  97.36%   +23924 +321205 +518192  21838
> 232696   16384   7.00%   0.00%   0.10%  92.90%   +33396 +337709 +575661  21017
> 244330   16384  10.82%   0.00%   0.15%  89.03%   +34188 +340248 +556237  20880
>

This is without the patch: "lost" is roughly 0% at the lower rates, and
we have a lower response time mean, max, and standard deviation... but
xleave is close to 100%.

> With the patch:
> 150000   15000   5.11%   0.00%   0.00%  94.88%    +4426 +460642 +640884  83746
> 157500   15750  11.54%   0.00%   0.26%  88.20%   +14434 +543656 +738355  30349
> 165375   16384  15.61%   0.00%   0.31%  84.08%   +35822 +515304 +833859  25596
> 173643   16384  19.58%   0.00%   0.37%  80.05%   +20762 +568962 +900100  28118
> 182325   16384  23.46%   0.00%   0.42%  76.13%   +41829 +547974 +804170  27890
> 191441   16384  27.23%   0.00%   0.46%  72.31%   +15182 +557920 +798212  28868
> 201013   16384  30.51%   0.00%   0.49%  69.00%   +15980 +560764 +805576  29979
> 211063   16384   0.06%   0.00%   0.00%  99.94%   +12668  +80487 +410555  62182
> 221616   16384   2.94%   0.00%   0.05%  97.00%   +21587 +342769 +517566  23359
> 232696   16384   6.94%   0.00%   0.10%  92.96%   +16581 +336068 +484574  18453
> 244330   16384  11.45%   0.00%   0.14%  88.41%   +23608 +345023 +564130  19177
> 


With the patch, we have a higher lost percentage, which sounds bad to
me? We also have a higher response time (which also sounds bad?), and
a much worse standard deviation across all the values from low to high
rate.
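
To make the comparison concrete, here is a rough Python sketch that lines up
the two quoted tables by request rate and prints the change in "lost",
"xleave" and mean response time. The numbers are copied from the tables in
this thread; the function and variable names are made up for illustration and
are not part of ntpperf or chrony.

```python
# Rows copied from the two tables quoted in this thread:
# (rate, lost %, xleave %, mean response time in ns)
WITHOUT_PATCH = [
    (150000, 0.00, 100.00, 36475),
    (157500, 0.02, 99.96, 42969),
    (165375, 0.03, 99.97, 43960),
    (173643, 0.06, 99.94, 45627),
    (182325, 0.06, 99.94, 47292),
    (191441, 0.00, 100.00, 51738),
    (201013, 0.05, 99.95, 53696),
    (211063, 0.00, 100.00, 54558),
    (221616, 2.59, 97.36, 321205),
    (232696, 7.00, 92.90, 337709),
    (244330, 10.82, 89.03, 340248),
]
WITH_PATCH = [
    (150000, 5.11, 94.88, 460642),
    (157500, 11.54, 88.20, 543656),
    (165375, 15.61, 84.08, 515304),
    (173643, 19.58, 80.05, 568962),
    (182325, 23.46, 76.13, 547974),
    (191441, 27.23, 72.31, 557920),
    (201013, 30.51, 69.00, 560764),
    (211063, 0.06, 99.94, 80487),
    (221616, 2.94, 97.00, 342769),
    (232696, 6.94, 92.96, 336068),
    (244330, 11.45, 88.41, 345023),
]


def compare(before, after):
    """Pair rows by request rate and return per-rate deltas (after - before)."""
    after_by_rate = {row[0]: row for row in after}
    deltas = []
    for rate, lost, xleave, mean in before:
        _, lost2, xleave2, mean2 = after_by_rate[rate]
        deltas.append(
            (rate, round(lost2 - lost, 2), round(xleave2 - xleave, 2), mean2 - mean)
        )
    return deltas


if __name__ == "__main__":
    for rate, d_lost, d_xleave, d_mean in compare(WITHOUT_PATCH, WITH_PATCH):
        print(f"{rate:>7}  lost {d_lost:+6.2f}%  xleave {d_xleave:+7.2f}%  "
              f"mean {d_mean:+8d} ns")
```

Viewed this way, the large deltas are all at 201013 requests per second and
below; from 211063 upward the two runs are close to each other.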

I guess I just don't understand what these numbers mean and why it's
"better" with the patch. Perhaps it's the naming? Or perhaps "xleave" is
bad, and this is showing that with the patch we get less of that? But
that looks like it gets consistently lower as the rate and number of
clients go up.

> At 211063 requests per second and higher the performance looks the
> same. But at the lower rates there is a clear drop. The higher
> mean response time (difference between server TX and RX timestamps)
> indicates more of the provided TX timestamps are hardware timestamps
> and the chrony server timestamp statistics confirm that.
> 


So you're saying a higher mean response time is... better? What is it
really measuring then? Oh, I see. It has a higher response time because
it takes longer to get a Tx timestamp, but the provided timestamp is
higher quality. Previously it was using software timestamps, so it
could reply faster (since it takes less time to get the software
timestamp back out), but the quality was lower?

Ok. That makes a bit more sense...
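
As a sanity check on that reading, here is a toy Python model (an assumption
for illustration, not how ntpperf or chrony compute anything): if replies
carrying a software Tx timestamp show one typical TX-RX difference and replies
carrying a hardware Tx timestamp show a larger one, the reported mean is just
a weighted average of the two, so the mean climbs as the share of hardware
timestamps grows. The delay values used below are hypothetical placeholders,
not measurements from this thread.

```python
def mean_response_ns(hw_fraction, sw_delay_ns, hw_delay_ns):
    """Mean TX-RX difference when hw_fraction of replies use a HW Tx timestamp.

    sw_delay_ns and hw_delay_ns are hypothetical per-class typical values.
    """
    if not 0.0 <= hw_fraction <= 1.0:
        raise ValueError("hw_fraction must be between 0 and 1")
    return hw_fraction * hw_delay_ns + (1.0 - hw_fraction) * sw_delay_ns


def hw_fraction_from_mean(mean_ns, sw_delay_ns, hw_delay_ns):
    """Invert the mixture: estimate the HW timestamp share from an observed mean."""
    if hw_delay_ns == sw_delay_ns:
        raise ValueError("the two delays must differ to separate the classes")
    return (mean_ns - sw_delay_ns) / (hw_delay_ns - sw_delay_ns)


if __name__ == "__main__":
    # Hypothetical delays: 40 us for software, 560 us for hardware timestamps.
    for share in (0.0, 0.25, 0.5, 1.0):
        print(share, mean_response_ns(share, 40_000, 560_000))
```

Under this model a jump in the mean with an unchanged minimum is what you
would expect when most, but not all, replies switch from software to hardware
timestamps.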


> So, my interpretation is that like with the earlier version of the
> patch it's trading performance for timestamp quality at lower rates,
> but unlike the earlier version it's not losing performance at the
> higher rates. That seems acceptable to me. Admins of busy servers
> might need to decide if they should keep HW timestamping enabled. In
> theory, chrony could have an option to do that automatically.
> 

> Thanks,
>
