From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: vladimir.oltean@nxp.com, anthony.l.nguyen@intel.com,
	kurt@linutronix.de, intel-wired-lan@lists.osuosl.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 4/4] igc: Add workaround for missing timestamps
Date: Wed, 07 Jun 2023 10:29:24 -0700
Message-ID: <87ttvjqiyz.fsf@intel.com>
In-Reply-To: <692a458e-f887-f9da-e3cb-904e64a40924@molgen.mpg.de>

Hi Paul,

Paul Menzel <pmenzel@molgen.mpg.de> writes:

> Dear Vinicius,
>
>
> Thank you for your patch.
>
> Am 06.06.23 um 03:33 schrieb Vinicius Costa Gomes:
>
> You could make the commit message summary even more specific:
>
> igc: Work around HW bug causing missing timestamps
>

Sounds better. Will fix.

>> There's an hardware issue that can cause missing timestamps. The bug
>> is that the interrupt is only cleared if the IGC_TXSTMPH_0 register is
>> read.
>
> Is that hardware bug documented in some errata?
>

There is (or there is going to be) an erratum, but I don't think it's
public yet. At least I couldn't find it.

>> The bug can cause a race condition if a timestamp is captured at the
>> wrong time, and we will miss that timestamp. To reduce the time window
>> that the problem is able to happen, in case no timestamp was ready, we
>> read the "previous" value of the timestamp registers, and we compare
>> with the "current" one, if it didn't change we can reasonably sure
>
> can *be*
>

Will fix.

>> that no timestamp was captured. If they are different, we use the new
>> value as the captured timestamp.
>> 
>> This workaround has more impact when multiple timestamp registers are
>> used, and the IGC_TXSTMPH_0 register always need to be read, so the
>> interrupt is cleared.
>
> Although you shared some test cases in the cover letter, it’d be great, 
> if you documented the way to reproduce this issue also in this commit 
> message.
>

The most consistent way I found to reproduce this issue is still not
100% reliable: run ptp4l, plus ntpperf, plus a couple of instances of a
custom application, all requesting timestamps at the same time. Even
then, it sometimes took tens of minutes for the issue to happen.

Will find a way to document this in the commit message.

> In the cover letter you also mention an alternative approach. Should it 
> also be documented here? (If you implemented it already, you could also 
> sent it to the list and reference it here.)
>

The alternative approach is to not request timestamps in the first set
of registers, and only use the first set of registers as a way to clear
the interrupt.

But as we only have 4 of those registers, and it's very easy to be
bottlenecked by this, I felt this approach was a waste of resources. It
also depends on having support for the extra registers (which I am
going to propose for -next).

Will add more details about the alternative to the cover letter.
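
As a rough illustration of that alternative (not driver code; every
name below is hypothetical and made up for this sketch), the first
register set would be excluded from allocation, so only three of the
four sets could carry timestamp requests:

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_TSTAMP_REGS 4 /* "we only have 4 of those registers" */
#define RESERVED_REG    0 /* first set: read only to clear the interrupt */

/* Hypothetical bookkeeping: which register sets have a request in flight. */
static bool in_use[NUM_TSTAMP_REGS];

/* Return a free register set index, or -1 if all usable sets are busy.
 * Set 0 is never handed out, which is the cost of this approach.
 */
static int tstamp_slot_get(void)
{
	for (int i = RESERVED_REG + 1; i < NUM_TSTAMP_REGS; i++) {
		if (!in_use[i]) {
			in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release a register set once its timestamp has been retrieved. */
static void tstamp_slot_put(int slot)
{
	assert(slot > RESERVED_REG && slot < NUM_TSTAMP_REGS);
	in_use[slot] = false;
}
```

With this model the fourth concurrent request already fails, which is
the bottleneck mentioned above.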

>> Fixes: 2c344ae24501 ("igc: Add support for TX timestamping")
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>> ---
>>   drivers/net/ethernet/intel/igc/igc_ptp.c | 48 ++++++++++++++++++------
>>   1 file changed, 37 insertions(+), 11 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
>> index cf963a12a92f..32ef112f8291 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_ptp.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
>> @@ -685,14 +685,49 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
>>   	struct sk_buff *skb = adapter->ptp_tx_skb;
>>   	struct skb_shared_hwtstamps shhwtstamps;
>>   	struct igc_hw *hw = &adapter->hw;
>> +	u32 tsynctxctl;
>>   	int adjust = 0;
>>   	u64 regval;
>>   
>>   	if (WARN_ON_ONCE(!skb))
>>   		return;
>>   
>> -	regval = rd32(IGC_TXSTMPL);
>> -	regval |= (u64)rd32(IGC_TXSTMPH) << 32;
>> +	tsynctxctl = rd32(IGC_TSYNCTXCTL);
>> +	tsynctxctl &= IGC_TSYNCTXCTL_TXTT_0;
>> +	if (tsynctxctl) {
>> +		regval = rd32(IGC_TXSTMPL);
>> +		regval |= (u64)rd32(IGC_TXSTMPH) << 32;
>> +	} else {
>> +		/* There's a bug in the hardware that could cause
>> +		 * missing interrupts for TX timestamping. The issue
>> +		 * is that for new interrupts to be triggered, the
>> +		 * IGC_TXSTMPH_0 register must be read.
>> +		 *
>> +		 * To avoid discarding a valid timestamp that just
>> +		 * happened at the "wrong" time, we need to confirm
>> +		 * that there was no timestamp captured, we do that by
>> +		 * assuming that no two timestamps in sequence have
>> +		 * the same nanosecond value.
>> +		 *
>> +		 * So, we read the "low" register, read the "high"
>> +		 * register (to latch a new timestamp) and read the
>> +		 * "low" register again, if "old" and "new" versions
>> +		 * of the "low" register are different, a valid
>> +		 * timestamp was captured, we can read the "high"
>> +		 * register again.
>> +		 */
>> +		u32 txstmpl_old, txstmpl_new;
>> +
>> +		txstmpl_old = rd32(IGC_TXSTMPL);
>> +		rd32(IGC_TXSTMPH);
>> +		txstmpl_new = rd32(IGC_TXSTMPL);
>> +
>> +		if (txstmpl_old == txstmpl_new)
>> +			return;
>> +
>> +		regval = txstmpl_new;
>> +		regval |= (u64)rd32(IGC_TXSTMPH) << 32;
>> +	}
>>   	if (igc_ptp_systim_to_hwtstamp(adapter, &shhwtstamps, regval))
>>   		return;
>>   
>> @@ -730,22 +765,13 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
>>    */
>>   void igc_ptp_tx_tstamp_event(struct igc_adapter *adapter)
>>   {
>> -	struct igc_hw *hw = &adapter->hw;
>>   	unsigned long flags;
>> -	u32 tsynctxctl;
>>   
>>   	spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
>>   
>>   	if (!adapter->ptp_tx_skb)
>>   		goto unlock;
>>   
>> -	tsynctxctl = rd32(IGC_TSYNCTXCTL);
>> -	tsynctxctl &= IGC_TSYNCTXCTL_TXTT_0;
>> -	if (!tsynctxctl) {
>> -		WARN_ONCE(1, "Received a TSTAMP interrupt but no TSTAMP is ready.\n");
>
> Was this warning printed before your patch?
>

When hammering the NIC with as many timestamping requests as I could
(as explained above), yes, I could see this warning, and that's why I
felt the workaround was needed.
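
The read sequence that replaces the warning can be modeled in user
space. This is only a sketch: "struct fake_hw" and its latching
behavior are assumptions made for illustration, loosely following the
comment in the patch (reading the "high" register lets a pending
timestamp become visible), and are not a description of the real
hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one TX timestamp register set. */
struct fake_hw {
	uint32_t txstmpl;    /* "low" register */
	uint32_t txstmph;    /* "high" register */
	bool txtt_0;         /* "timestamp valid" bit */
	bool pending;        /* a timestamp captured at the "wrong" time */
	uint64_t pending_ts;
};

static uint32_t fake_rd_low(const struct fake_hw *hw)
{
	return hw->txstmpl;
}

/* Assumed behavior: reading "high" clears the valid bit and latches
 * any pending timestamp into the registers without setting the bit.
 */
static uint32_t fake_rd_high(struct fake_hw *hw)
{
	uint32_t val = hw->txstmph;

	hw->txtt_0 = false;
	if (hw->pending) {
		hw->txstmpl = (uint32_t)hw->pending_ts;
		hw->txstmph = (uint32_t)(hw->pending_ts >> 32);
		hw->pending = false;
	}
	return val;
}

/* Same control flow as the patched igc_ptp_tx_hwtstamp(): trust the
 * valid bit if set, otherwise compare "low" before and after a read
 * of "high". Returns false when no timestamp was captured.
 */
static bool fake_read_tx_tstamp(struct fake_hw *hw, uint64_t *out)
{
	uint32_t old_low, new_low;

	assert(out);

	if (hw->txtt_0) {
		*out = fake_rd_low(hw);
		*out |= (uint64_t)fake_rd_high(hw) << 32;
		return true;
	}

	old_low = fake_rd_low(hw);
	(void)fake_rd_high(hw);
	new_low = fake_rd_low(hw);

	if (old_low == new_low)
		return false;

	*out = new_low;
	*out |= (uint64_t)fake_rd_high(hw) << 32;
	return true;
}
```

As in the patch, this relies on the assumption that no two timestamps
in sequence have the same nanosecond value.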

>> -		goto unlock;
>> -	}
>> -
>>   	igc_ptp_tx_hwtstamp(adapter);
>>   
>>   unlock:
>
>
> Kind regards,
>
> Paul


Cheers,
-- 
Vinicius