Intel-Wired-Lan Archive on lore.kernel.org
From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: vladimir.oltean@nxp.com, anthony.l.nguyen@intel.com,
	kurt@linutronix.de, intel-wired-lan@lists.osuosl.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 4/4] igc: Add workaround for missing timestamps
Date: Wed, 07 Jun 2023 10:29:24 -0700
Message-ID: <87ttvjqiyz.fsf@intel.com>
In-Reply-To: <692a458e-f887-f9da-e3cb-904e64a40924@molgen.mpg.de>

Hi Paul,

Paul Menzel <pmenzel@molgen.mpg.de> writes:

> Dear Vinicius,
>
>
> Thank you for your patch.
>
> On 06.06.23 at 03:33, Vinicius Costa Gomes wrote:
>
> You could make the commit message summary even more specific:
>
> igc: Work around HW bug causing missing timestamps
>

Sounds better. Will fix.

>> There's a hardware issue that can cause missing timestamps. The bug
>> is that the interrupt is only cleared if the IGC_TXSTMPH_0 register is
>> read.
>
> Is that hardware bug documented in some errata?
>

There is (or there is going to be) an erratum, but I don't think it's
public yet. At least, I couldn't find it.

>> The bug can cause a race condition if a timestamp is captured at the
>> wrong time, and we will miss that timestamp. To reduce the time window
>> that the problem is able to happen, in case no timestamp was ready, we
>> read the "previous" value of the timestamp registers, and we compare
>> with the "current" one, if it didn't change we can reasonably sure
>
> can *be*
>

Will fix.

>> that no timestamp was captured. If they are different, we use the new
>> value as the captured timestamp.
>> 
>> This workaround has more impact when multiple timestamp registers are
>> used, and the IGC_TXSTMPH_0 register always needs to be read, so the
>> interrupt is cleared.
>
> Although you shared some test cases in the cover letter, it’d be great, 
> if you documented the way to reproduce this issue also in this commit 
> message.
>

The most consistent way I found to reproduce this issue is still not
100% reliable: run ptp4l, ntpperf, and a couple of instances of a custom
application, all requesting timestamps at the same time. Even then, it
sometimes took tens of minutes for the issue to happen.

Will find a way to document this in the commit message.
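
For reference, a minimal sketch of how a userspace test program can
request hardware TX timestamps on a UDP socket, which is the kind of
load described above. The helper names tx_hwtstamp_flags() and
open_tx_hwtstamp_socket() are hypothetical and this is not the custom
application mentioned above. It also assumes that hardware timestamping
has already been enabled on the interface (ptp4l does that when it
starts).

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: SO_TIMESTAMPING flags that ask for TX timestamps
 * generated by the NIC and reported as raw hardware clock values.
 */
static int tx_hwtstamp_flags(void)
{
	return SOF_TIMESTAMPING_TX_HARDWARE | SOF_TIMESTAMPING_RAW_HARDWARE;
}

/* Hypothetical helper: open a UDP socket with TX hardware timestamping
 * requested. Returns the file descriptor, or -1 on failure.
 */
static int open_tx_hwtstamp_socket(void)
{
	int flags = tx_hwtstamp_flags();
	int fd;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
		       &flags, sizeof(flags)) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}
```

Every packet sent on such a socket asks the driver for a TX timestamp,
which the application then reads back from the socket error queue with
recvmsg() and MSG_ERRQUEUE. Running several senders like this in
parallel is one way to keep the timestamp registers busy.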

> In the cover letter you also mention an alternative approach. Should it 
> also be documented here? (If you implemented it already, you could also 
> send it to the list and reference it here.)
>

The alternative approach is to not request timestamps in the first set
of registers, and only use the first set of registers as a way to clear
the interrupt.

But as we only have 4 of those registers, and it's very easy to be
bottlenecked by this, I felt this approach was a waste of resources. And
it kind of depends on having support for the extra registers (that I am
going to propose for -next).

Will add more details about the alternative to the cover letter.

>> Fixes: 2c344ae24501 ("igc: Add support for TX timestamping")
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>> ---
>>   drivers/net/ethernet/intel/igc/igc_ptp.c | 48 ++++++++++++++++++------
>>   1 file changed, 37 insertions(+), 11 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
>> index cf963a12a92f..32ef112f8291 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_ptp.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
>> @@ -685,14 +685,49 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
>>   	struct sk_buff *skb = adapter->ptp_tx_skb;
>>   	struct skb_shared_hwtstamps shhwtstamps;
>>   	struct igc_hw *hw = &adapter->hw;
>> +	u32 tsynctxctl;
>>   	int adjust = 0;
>>   	u64 regval;
>>   
>>   	if (WARN_ON_ONCE(!skb))
>>   		return;
>>   
>> -	regval = rd32(IGC_TXSTMPL);
>> -	regval |= (u64)rd32(IGC_TXSTMPH) << 32;
>> +	tsynctxctl = rd32(IGC_TSYNCTXCTL);
>> +	tsynctxctl &= IGC_TSYNCTXCTL_TXTT_0;
>> +	if (tsynctxctl) {
>> +		regval = rd32(IGC_TXSTMPL);
>> +		regval |= (u64)rd32(IGC_TXSTMPH) << 32;
>> +	} else {
>> +		/* There's a bug in the hardware that could cause
>> +		 * missing interrupts for TX timestamping. The issue
>> +		 * is that for new interrupts to be triggered, the
>> +		 * IGC_TXSTMPH_0 register must be read.
>> +		 *
>> +		 * To avoid discarding a valid timestamp that just
>> +		 * happened at the "wrong" time, we need to confirm
>> +		 * that there was no timestamp captured, we do that by
>> +		 * assuming that no two timestamps in sequence have
>> +		 * the same nanosecond value.
>> +		 *
>> +		 * So, we read the "low" register, read the "high"
>> +		 * register (to latch a new timestamp) and read the
>> +		 * "low" register again, if "old" and "new" versions
>> +		 * of the "low" register are different, a valid
>> +		 * timestamp was captured, we can read the "high"
>> +		 * register again.
>> +		 */
>> +		u32 txstmpl_old, txstmpl_new;
>> +
>> +		txstmpl_old = rd32(IGC_TXSTMPL);
>> +		rd32(IGC_TXSTMPH);
>> +		txstmpl_new = rd32(IGC_TXSTMPL);
>> +
>> +		if (txstmpl_old == txstmpl_new)
>> +			return;
>> +
>> +		regval = txstmpl_new;
>> +		regval |= (u64)rd32(IGC_TXSTMPH) << 32;
>> +	}
>>   	if (igc_ptp_systim_to_hwtstamp(adapter, &shhwtstamps, regval))
>>   		return;
>>   
>> @@ -730,22 +765,13 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
>>    */
>>   void igc_ptp_tx_tstamp_event(struct igc_adapter *adapter)
>>   {
>> -	struct igc_hw *hw = &adapter->hw;
>>   	unsigned long flags;
>> -	u32 tsynctxctl;
>>   
>>   	spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
>>   
>>   	if (!adapter->ptp_tx_skb)
>>   		goto unlock;
>>   
>> -	tsynctxctl = rd32(IGC_TSYNCTXCTL);
>> -	tsynctxctl &= IGC_TSYNCTXCTL_TXTT_0;
>> -	if (!tsynctxctl) {
>> -		WARN_ONCE(1, "Received a TSTAMP interrupt but no TSTAMP is ready.\n");
>
> Was this warning printed before your patch?
>

When smashing the NIC with as many timestamping requests as I could (as
explained above), yeah, I could see this, and that's why I felt the
workaround was needed.
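
The read sequence used by the workaround (low, high, low, compare) can
also be exercised outside the driver. Below is a minimal sketch against
a made-up register model: struct fake_tstamp_regs and the fake_*()
helpers are hypothetical, and the latching behaviour (a pending capture
only becomes visible after the "high" register is read) is an
assumption based on the comment in the patch, not on a hardware
specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register model, NOT the real igc register interface. */
struct fake_tstamp_regs {
	uint32_t low;		/* visible "low" timestamp register */
	uint32_t high;		/* visible "high" timestamp register */
	bool has_pending;	/* a capture is waiting to be latched */
	uint64_t pending;	/* the waiting capture */
};

static uint32_t fake_read_low(struct fake_tstamp_regs *r)
{
	return r->low;
}

/* Assumed behaviour: reading "high" returns the current value and then
 * lets a pending capture replace the visible registers.
 */
static uint32_t fake_read_high(struct fake_tstamp_regs *r)
{
	uint32_t val = r->high;

	if (r->has_pending) {
		r->low = (uint32_t)r->pending;
		r->high = (uint32_t)(r->pending >> 32);
		r->has_pending = false;
	}

	return val;
}

/* Same sequence as the patch: read "low", read "high" to latch, read
 * "low" again. If "low" changed, a timestamp was captured and is
 * returned through @out.
 */
static bool fake_recover_tstamp(struct fake_tstamp_regs *r, uint64_t *out)
{
	uint32_t old_low, new_low;

	assert(r && out);

	old_low = fake_read_low(r);
	(void)fake_read_high(r);
	new_low = fake_read_low(r);

	if (old_low == new_low)
		return false;

	*out = new_low;
	*out |= (uint64_t)fake_read_high(r) << 32;
	return true;
}
```

In this model, a capture whose "low" half happens to equal the previous
"low" value is still missed. That is the same assumption the patch
comment states: no two timestamps in sequence have the same nanosecond
value.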

>> -		goto unlock;
>> -	}
>> -
>>   	igc_ptp_tx_hwtstamp(adapter);
>>   
>>   unlock:
>
>
> Kind regards,
>
> Paul


Cheers,
-- 
Vinicius