public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine
@ 2026-03-29  3:25 Bob Van Valzah
  2026-03-30 16:39 ` Vadim Fedorenko
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Van Valzah @ 2026-03-29  3:25 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: anthony.l.nguyen, netdev, julianstj, jeff

[-- Attachment #1: Type: text/plain, Size: 3897 bytes --]

Hi,

We found a race in igc_ptp_adjfine_i225() that causes "Tx timestamp
timeout" errors and eventually wedges EXTTS when a PTP grandmaster
(ptp4l with hardware timestamping) runs concurrently with PHC
frequency discipline (any GPSDO calling clock_adjtime ADJ_FREQUENCY).

Root cause: igc_ptp_adjfine_i225() writes IGC_TIMINCA without holding
any lock.  Every other PTP clock operation in igc_ptp.c (adjtime,
gettime, settime) holds tmreg_lock, but adjfine does not.  When the
increment rate changes while the hardware is capturing a TX timestamp,
the captured value is corrupt.  The driver retries for
IGC_PTP_TX_TIMEOUT (15s), then logs the timeout and frees the skb.
Repeated occurrences eventually prevent EXTTS from delivering events.

The attached reproducer (triggers in ~17 seconds on i226):

  One thread calling clock_adjtime(ADJ_FREQUENCY) at ~200k/s on the
  PHC, another sending UDP packets with SO_TIMESTAMPING requesting
  hardware TX timestamps at ~100k/s.  A Python reproducer is at:
  https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_timeout_repro.py

  At realistic rates (1 Hz adjfine from a GPSDO + ptp4l at 128 Hz
  sync), the race triggers in ~30 minutes.

The attached patch holds ptp_tx_lock around the TIMINCA write and
skips the write if any TX timestamps are pending (tx_tstamp[i].skb
!= NULL), returning -EBUSY.  This doesn't fully close the hardware
race (a new TX capture can start between the check and the write),
but at realistic rates the residual probability gives ~25 year MTBF
vs ~30 minutes without the patch.

A complete fix would likely require either disabling TX timestamping
around TIMINCA writes (via TSYNCTXCTL), or making the timeout recovery
path more robust so a single corrupt timestamp doesn't wedge the
subsystem.  We'd welcome guidance from the igc maintainers on the
preferred approach.

Tested on:
  - Intel i226 (TimeHAT v5 board on Raspberry Pi 5)
  - Kernel 6.12.62+rpt-rpi-2712 (Raspberry Pi OS)
  - Intel out-of-tree igc driver 5.4.0-7642.46
  - Stock upstream igc_ptp.c (same code, same bug)

	Bob

---

 drivers/net/ethernet/intel/igc/igc_ptp.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index XXXXXXX..XXXXXXX 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -47,8 +47,10 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
 {
        struct igc_adapter *igc = container_of(ptp, struct igc_adapter,
                                               ptp_caps);
        struct igc_hw *hw = &igc->hw;
+       unsigned long flags;
        int neg_adj = 0;
        u64 rate;
        u32 inca;
+       int i;

        if (scaled_ppm < 0) {
                neg_adj = 1;
@@ -63,7 +65,21 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
        if (neg_adj)
                inca |= ISGN;

-       wr32(IGC_TIMINCA, inca);
+       /* Changing the clock increment rate while a TX timestamp is being
+        * captured by the hardware can corrupt the timestamp, causing the
+        * driver to report "Tx timestamp timeout" and eventually wedging
+        * the EXTTS subsystem.  Serialize with pending TX timestamps:
+        * skip the rate change if any are in flight.
+        */
+       spin_lock_irqsave(&igc->ptp_tx_lock, flags);
+       for (i = 0; i < IGC_MAX_TX_TSTAMP_REGS; i++) {
+               if (igc->tx_tstamp[i].skb) {
+                       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
+                       return -EBUSY;
+               }
+       }
+       wr32(IGC_TIMINCA, inca);
+       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);

        return 0;
 }
--
2.39.2

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: igc_tx_timeout_repro.py --]
[-- Type: text/x-python-script; x-unix-mode=0644; name="igc_tx_timeout_repro.py", Size: 6600 bytes --]

#!/usr/bin/env python3
"""Reproducer for igc driver Tx timestamp timeout bug.

The igc driver (Intel i225/i226) has a race between clock_adjtime
(ADJ_FREQUENCY) and hardware TX timestamping.  When adjfine() is
called while a TX timestamp is pending, the timestamp register can
wedge, producing "Tx timestamp timeout" errors in dmesg and eventually
breaking EXTTS (PPS capture).

This reproducer runs two threads:
  1. Hammers clock_adjtime(ADJ_FREQUENCY) on the PHC
  2. Sends UDP packets with SO_TIMESTAMPING (hardware TX timestamps)

On a vulnerable igc driver, "Tx timestamp timeout" appears in dmesg
within seconds.

Usage:
    sudo python3 tools/igc_tx_timeout_repro.py eth1 /dev/ptp0
    # Watch with: dmesg -w | grep "Tx timestamp"

Requires root for SO_TIMESTAMPING and clock_adjtime on PHC.
"""

import ctypes
import ctypes.util
import os
import socket
import struct
import sys
import threading
import time


def get_phc_clockid(fd):
    return (~fd << 3) | 3


def adjfine_loop(phc_fd, stop_event, stats):
    """Hammer adjfine() on the PHC."""
    librt = ctypes.CDLL(ctypes.util.find_library("rt"), use_errno=True)

    class Timeval(ctypes.Structure):
        _fields_ = [("tv_sec", ctypes.c_long), ("tv_usec", ctypes.c_long)]

    class Timex(ctypes.Structure):
        _fields_ = [
            ("modes", ctypes.c_uint),
            ("offset", ctypes.c_long),
            ("freq", ctypes.c_long),
            ("maxerror", ctypes.c_long),
            ("esterror", ctypes.c_long),
            ("status", ctypes.c_int),
            ("constant", ctypes.c_long),
            ("precision", ctypes.c_long),
            ("tolerance", ctypes.c_long),
            ("time", Timeval),
            ("tick", ctypes.c_long),
            ("ppsfreq", ctypes.c_long),
            ("jitter", ctypes.c_long),
            ("shift", ctypes.c_int),
            ("stabil", ctypes.c_long),
            ("jitcnt", ctypes.c_long),
            ("calcnt", ctypes.c_long),
            ("errcnt", ctypes.c_long),
            ("stbcnt", ctypes.c_long),
            ("tai", ctypes.c_int),
        ]

    ADJ_FREQUENCY = 0x0002
    clockid = get_phc_clockid(phc_fd)
    n = 0
    toggle = False

    while not stop_event.is_set():
        tx = Timex()
        tx.modes = ADJ_FREQUENCY
        # Alternate between two tiny frequency offsets
        tx.freq = 100 if toggle else -100  # ~0.0015 ppb
        toggle = not toggle
        ret = librt.clock_adjtime(ctypes.c_int32(clockid), ctypes.byref(tx))
        if ret < 0:
            print(f"adjfine error: {ctypes.get_errno()}")
            break
        n += 1

    stats["adjfine_count"] = n


def tx_timestamp_loop(iface, stop_event, stats):
    """Send UDP packets requesting hardware TX timestamps."""
    # SOL_SOCKET options for timestamping
    SO_TIMESTAMPING = 37
    SOF_TIMESTAMPING_TX_HARDWARE = (1 << 0)
    SOF_TIMESTAMPING_RAW_HARDWARE = (1 << 6)
    SOF_TIMESTAMPING_OPT_TSONLY = (1 << 11)

    flags = (SOF_TIMESTAMPING_TX_HARDWARE |
             SOF_TIMESTAMPING_RAW_HARDWARE |
             SOF_TIMESTAMPING_OPT_TSONLY)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, SO_TIMESTAMPING, flags)
    # Bind to the specific interface
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
                    iface.encode() + b'\0')
    sock.settimeout(0.01)

    # Send to a harmless destination (localhost or broadcast)
    dest = ("224.0.0.1", 9999)  # multicast, won't route
    payload = b"igc_repro" + b"\x00" * 32
    n = 0
    errors = 0

    while not stop_event.is_set():
        try:
            sock.sendto(payload, dest)
            n += 1
        except OSError:
            errors += 1
        # Don't sleep — maximize collision probability
        # But yield to avoid starving the adjfine thread
        if n % 100 == 0:
            time.sleep(0.0001)

    sock.close()
    stats["tx_count"] = n
    stats["tx_errors"] = errors


def check_dmesg_for_timeout():
    """Check if Tx timestamp timeout appeared in dmesg."""
    try:
        import subprocess
        result = subprocess.run(
            ["dmesg", "--since", "60 seconds ago"],
            capture_output=True, text=True, timeout=2
        )
        return "Tx timestamp timeout" in result.stdout
    except Exception:
        return False


def main():
    if len(sys.argv) < 3:
        print(f"Usage: sudo {sys.argv[0]} <interface> <ptp_device>")
        print(f"Example: sudo {sys.argv[0]} eth1 /dev/ptp0")
        sys.exit(1)

    iface = sys.argv[1]
    ptp_dev = sys.argv[2]
    duration = float(sys.argv[3]) if len(sys.argv) > 3 else 30.0

    if os.geteuid() != 0:
        print("ERROR: must run as root (sudo)")
        sys.exit(1)

    phc_fd = os.open(ptp_dev, os.O_RDWR)
    print(f"Opened {ptp_dev} (fd={phc_fd})")
    print(f"Interface: {iface}")
    print(f"Duration: {duration}s")
    print(f"Watch for: dmesg -w | grep 'Tx timestamp timeout'")
    print()

    stop = threading.Event()
    stats = {}

    t_adj = threading.Thread(target=adjfine_loop, args=(phc_fd, stop, stats))
    t_tx = threading.Thread(target=tx_timestamp_loop, args=(iface, stop, stats))

    t_adj.start()
    t_tx.start()

    start = time.monotonic()
    triggered = False
    while time.monotonic() - start < duration:
        time.sleep(1)
        elapsed = time.monotonic() - start
        if check_dmesg_for_timeout():
            print(f"\n*** Tx timestamp timeout detected after {elapsed:.1f}s ***")
            triggered = True
            break
        print(f"  {elapsed:.0f}s: running...", end="\r")

    stop.set()
    t_adj.join(timeout=2)
    t_tx.join(timeout=2)
    os.close(phc_fd)

    adj_n = stats.get("adjfine_count", 0)
    tx_n = stats.get("tx_count", 0)
    tx_err = stats.get("tx_errors", 0)
    elapsed = time.monotonic() - start

    print(f"\nResults ({elapsed:.1f}s):")
    print(f"  adjfine calls: {adj_n} ({adj_n/elapsed:.0f}/s)")
    print(f"  TX packets:    {tx_n} ({tx_n/elapsed:.0f}/s), {tx_err} errors")

    if triggered:
        print(f"\n  BUG TRIGGERED: igc Tx timestamp timeout")
        print(f"  The igc driver's PTP timestamp register wedged due to")
        print(f"  concurrent adjfine() and hardware TX timestamping.")
        sys.exit(1)
    else:
        print(f"\n  No timeout detected in {duration}s")
        sys.exit(0)


if __name__ == "__main__":
    main()

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine
  2026-03-29  3:25 [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine Bob Van Valzah
@ 2026-03-30 16:39 ` Vadim Fedorenko
  2026-03-30 19:42   ` Bob Van Valzah
  0 siblings, 1 reply; 6+ messages in thread
From: Vadim Fedorenko @ 2026-03-30 16:39 UTC (permalink / raw)
  To: Bob Van Valzah, intel-wired-lan; +Cc: anthony.l.nguyen, netdev, julianstj, jeff

On 29/03/2026 04:25, Bob Van Valzah wrote:
> Hi,
> 
> We found a race in igc_ptp_adjfine_i225() that causes "Tx timestamp
> timeout" errors and eventually wedges EXTTS when a PTP grandmaster
> (ptp4l with hardware timestamping) runs concurrently with PHC
> frequency discipline (any GPSDO calling clock_adjtime ADJ_FREQUENCY).
> 
> Root cause: igc_ptp_adjfine_i225() writes IGC_TIMINCA without holding
> any lock.  Every other PTP clock operation in igc_ptp.c (adjtime,
> gettime, settime) holds tmreg_lock, but adjfine does not.  When the
> increment rate changes while the hardware is capturing a TX timestamp,
> the captured value is corrupt.  The driver retries for
> IGC_PTP_TX_TIMEOUT (15s), then logs the timeout and frees the skb.
> Repeated occurrences eventually prevent EXTTS from delivering events.
> 
> The attached reproducer (triggers in ~17 seconds on i226):
> 
>    One thread calling clock_adjtime(ADJ_FREQUENCY) at ~200k/s on the
>    PHC, another sending UDP packets with SO_TIMESTAMPING requesting
>    hardware TX timestamps at ~100k/s.  A Python reproducer is at:
>    https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_timeout_repro.py
> 
>    At realistic rates (1 Hz adjfine from a GPSDO + ptp4l at 128 Hz
>    sync), the race triggers in ~30 minutes.
> 
> The attached patch holds ptp_tx_lock around the TIMINCA write and
> skips the write if any TX timestamps are pending (tx_tstamp[i].skb
> != NULL), returning -EBUSY.  This doesn't fully close the hardware
> race (a new TX capture can start between the check and the write),
> but at realistic rates the residual probability gives ~25 year MTBF
> vs ~30 minutes without the patch.
> 
> A complete fix would likely require either disabling TX timestamping
> around TIMINCA writes (via TSYNCTXCTL), or making the timeout recovery
> path more robust so a single corrupt timestamp doesn't wedge the
> subsystem.  We'd welcome guidance from the igc maintainers on the
> preferred approach.
> 
> Tested on:
>    - Intel i226 (TimeHAT v5 board on Raspberry Pi 5)
>    - Kernel 6.12.62+rpt-rpi-2712 (Raspberry Pi OS)
>    - Intel out-of-tree igc driver 5.4.0-7642.46
>    - Stock upstream igc_ptp.c (same code, same bug)
> 
> 	Bob
> 
> ---
> 
>   drivers/net/ethernet/intel/igc/igc_ptp.c | 18 +++++++++++++++++-
>   1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
> index XXXXXXX..XXXXXXX 100644
> --- a/drivers/net/ethernet/intel/igc/igc_ptp.c
> +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
> @@ -47,8 +47,10 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>   {
>          struct igc_adapter *igc = container_of(ptp, struct igc_adapter,
>                                                 ptp_caps);
>          struct igc_hw *hw = &igc->hw;
> +       unsigned long flags;
>          int neg_adj = 0;
>          u64 rate;
>          u32 inca;
> +       int i;
> 
>          if (scaled_ppm < 0) {
>                  neg_adj = 1;
> @@ -63,7 +65,21 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>          if (neg_adj)
>                  inca |= ISGN;
> 
> -       wr32(IGC_TIMINCA, inca);
> +       /* Changing the clock increment rate while a TX timestamp is being
> +        * captured by the hardware can corrupt the timestamp, causing the
> +        * driver to report "Tx timestamp timeout" and eventually wedging
> +        * the EXTTS subsystem.  Serialize with pending TX timestamps:
> +        * skip the rate change if any are in flight.
> +        */
> +       spin_lock_irqsave(&igc->ptp_tx_lock, flags);
> +       for (i = 0; i < IGC_MAX_TX_TSTAMP_REGS; i++) {
> +               if (igc->tx_tstamp[i].skb) {
> +                       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
> +                       return -EBUSY;
> +               }
> +       }
> +       wr32(IGC_TIMINCA, inca);
> +       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);

It's a bit weird solution, because in this case we may end up having no
successful calls to adjfine with high amount of TX timestamp packets in 
flight. Another problem here is that access to timing registers is
guarded by tmreg_lock, but here you use ptp_tx_lock, which protects
queue.

Were you able to recover "corrupted" time stamps to figure out why they
are discarded?


> 
>          return 0;
>   }
> --
> 2.39.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine
  2026-03-30 16:39 ` Vadim Fedorenko
@ 2026-03-30 19:42   ` Bob Van Valzah
  2026-04-01  1:14     ` Bob Van Valzah
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Van Valzah @ 2026-03-30 19:42 UTC (permalink / raw)
  To: Vadim Fedorenko
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, julianstj, jeff

Thanks for following up Vadim. To be clear, we do not see our patch as a solution to the problem, since it does not eliminate the underlying contention, it just reduces the likelihood.

We now have a bit more hands-on time running the patch and can see its limitations. The driver just throws the timeout error less often now. Moreover, our code calling adjfine() has to be ready for an EBUSY. The patch is certainly not an elegant solution. It may well have worked better, or perhaps been a complete fix, if we had taken tmreg_lock. Sorry if we sent our patch prematurely.

We think the contention happens when a system call made by a clock disciplining daemon like ts2phc changes a PHC's frequency, while a timestamping daemon like ptp4l has requested a timestamp against that same PHC. A plausible explanation is that the hardware fails to produce a timestamp for a packet when it collides with a PHC frequency adjustment, leading to the timestamp timeout. We see this contention 128x more often than the average user because we cranked our PTP sync rate up to 128 Hz. We care about every ns here. The likelihood probably also scales with the number of PTP clients.

Sorry, in retrospect, we may have overstated our case in declaring the root cause as the lack of locking, since our locking only reduces the likelihood of the timestamp timeout. Intel may provide insight to the underlying cause of the timeout, but my hunch is that the collision of a timestamp request against a PHC and a frequency change of the same PHC causes the timestamp request to fail, leading to the timeout. Our repro code makes this happen so reliably that it should be easy for the hardware guys to explain exactly what’s going on. Hopefully, they can advise on a more elegant avoidance strategy than our ham-handed lock.

We have not tried recovering a timestamp following the error. We may have used the term “corrupted” poorly here. Since the error message says “timeout," that could mean the hardware produced no timestamp at all, rather than a corrupted one.

This isn’t a show-stopper for us, but we note that timekeeping daemons with imperfect error handling could react to this situation poorly and perhaps fail silently.

	Bob

> On Mar 30, 2026, at 11:39 AM, Vadim Fedorenko <vadim.fedorenko@linux.dev> wrote:
> 
> On 29/03/2026 04:25, Bob Van Valzah wrote:
>> Hi,
>> We found a race in igc_ptp_adjfine_i225() that causes "Tx timestamp
>> timeout" errors and eventually wedges EXTTS when a PTP grandmaster
>> (ptp4l with hardware timestamping) runs concurrently with PHC
>> frequency discipline (any GPSDO calling clock_adjtime ADJ_FREQUENCY).
>> Root cause: igc_ptp_adjfine_i225() writes IGC_TIMINCA without holding
>> any lock.  Every other PTP clock operation in igc_ptp.c (adjtime,
>> gettime, settime) holds tmreg_lock, but adjfine does not.  When the
>> increment rate changes while the hardware is capturing a TX timestamp,
>> the captured value is corrupt.  The driver retries for
>> IGC_PTP_TX_TIMEOUT (15s), then logs the timeout and frees the skb.
>> Repeated occurrences eventually prevent EXTTS from delivering events.
>> The attached reproducer (triggers in ~17 seconds on i226):
>>   One thread calling clock_adjtime(ADJ_FREQUENCY) at ~200k/s on the
>>   PHC, another sending UDP packets with SO_TIMESTAMPING requesting
>>   hardware TX timestamps at ~100k/s.  A Python reproducer is at:
>>   https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_timeout_repro.py
>>   At realistic rates (1 Hz adjfine from a GPSDO + ptp4l at 128 Hz
>>   sync), the race triggers in ~30 minutes.
>> The attached patch holds ptp_tx_lock around the TIMINCA write and
>> skips the write if any TX timestamps are pending (tx_tstamp[i].skb
>> != NULL), returning -EBUSY.  This doesn't fully close the hardware
>> race (a new TX capture can start between the check and the write),
>> but at realistic rates the residual probability gives ~25 year MTBF
>> vs ~30 minutes without the patch.
>> A complete fix would likely require either disabling TX timestamping
>> around TIMINCA writes (via TSYNCTXCTL), or making the timeout recovery
>> path more robust so a single corrupt timestamp doesn't wedge the
>> subsystem.  We'd welcome guidance from the igc maintainers on the
>> preferred approach.
>> Tested on:
>>   - Intel i226 (TimeHAT v5 board on Raspberry Pi 5)
>>   - Kernel 6.12.62+rpt-rpi-2712 (Raspberry Pi OS)
>>   - Intel out-of-tree igc driver 5.4.0-7642.46
>>   - Stock upstream igc_ptp.c (same code, same bug)
>> Bob
>> ---
>>  drivers/net/ethernet/intel/igc/igc_ptp.c | 18 +++++++++++++++++-
>>  1 file changed, 17 insertions(+), 1 deletion(-)
>> diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
>> index XXXXXXX..XXXXXXX 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_ptp.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
>> @@ -47,8 +47,10 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>>  {
>>         struct igc_adapter *igc = container_of(ptp, struct igc_adapter,
>>                                                ptp_caps);
>>         struct igc_hw *hw = &igc->hw;
>> +       unsigned long flags;
>>         int neg_adj = 0;
>>         u64 rate;
>>         u32 inca;
>> +       int i;
>>         if (scaled_ppm < 0) {
>>                 neg_adj = 1;
>> @@ -63,7 +65,21 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>>         if (neg_adj)
>>                 inca |= ISGN;
>> -       wr32(IGC_TIMINCA, inca);
>> +       /* Changing the clock increment rate while a TX timestamp is being
>> +        * captured by the hardware can corrupt the timestamp, causing the
>> +        * driver to report "Tx timestamp timeout" and eventually wedging
>> +        * the EXTTS subsystem.  Serialize with pending TX timestamps:
>> +        * skip the rate change if any are in flight.
>> +        */
>> +       spin_lock_irqsave(&igc->ptp_tx_lock, flags);
>> +       for (i = 0; i < IGC_MAX_TX_TSTAMP_REGS; i++) {
>> +               if (igc->tx_tstamp[i].skb) {
>> +                       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
>> +                       return -EBUSY;
>> +               }
>> +       }
>> +       wr32(IGC_TIMINCA, inca);
>> +       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
> 
> It's a bit weird solution, because in this case we may end up having no
> successful calls to adjfine with high amount of TX timestamp packets in flight. Another problem here is that access to timing registers is
> guarded by tmreg_lock, but here you use ptp_tx_lock, which protects
> queue.
> 
> Were you able to recover "corrupted" time stamps to figure out why they
> are discarded?
> 
> 
>>         return 0;
>>  }
>> --
>> 2.39.2



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine
  2026-03-30 19:42   ` Bob Van Valzah
@ 2026-04-01  1:14     ` Bob Van Valzah
  2026-04-02  0:49       ` [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine] Bob Van Valzah
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Van Valzah @ 2026-04-01  1:14 UTC (permalink / raw)
  To: Vadim Fedorenko
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, julianstj, jeff,
	Lasse Johnsen

There is one other symptom of this bug that we neglected to report previously: PEROUT stops. That’s PPS OUT for us, so it’s how we first noticed the problem. Recovery was to unload and reload the igc driver.

I take this as further evidence for my hunch that the hardware just reaches some internal “this should never happen” state and the PHC just stops ticking.

Hoping this additional symptom report might help with diagnosis.

	Bob

> On Mar 30, 2026, at 12:42 PM, Bob Van Valzah <Bob@VanValzah.Com> wrote:
> 
> Thanks for following up Vadim. To be clear, we do not see our patch as a solution to the problem, since it does not eliminate the underlying contention, it just reduces the likelihood.
> 
> We now have a bit more hands-on time running the patch and can see its limitations. The driver just throws the timeout error less often now. Moreover, our code calling adjfine() has to be ready for an EBUSY. The patch is certainly not an elegant solution. It may well have worked better, or perhaps been a complete fix, if we had taken tmreg_lock. Sorry if we sent our patch prematurely.
> 
> We think the contention happens when a system call made by a clock disciplining daemon like ts2phc changes a PHC's frequency, while a timestamping daemon like ptp4l has requested a timestamp against that same PHC. A plausible explanation is that the hardware fails to produce a timestamp for a packet when it collides with a PHC frequency adjustment, leading to the timestamp timeout. We see this contention 128x more often than the average user because we cranked our PTP sync rate up to 128 Hz. We care about every ns here. The likelihood probably also scales with the number of PTP clients.
> 
> Sorry, in retrospect, we may have overstated our case in declaring the root cause as the lack of locking, since our locking only reduces the likelihood of the timestamp timeout. Intel may provide insight to the underlying cause of the timeout, but my hunch is that the collision of a timestamp request against a PHC and a frequency change of the same PHC causes the timestamp request to fail, leading to the timeout. Our repro code makes this happen so reliably that it should be easy for the hardware guys to explain exactly what’s going on. Hopefully, they can advise on a more elegant avoidance strategy than our ham-handed lock.
> 
> We have not tried recovering a timestamp following the error. We may have used the term “corrupted” poorly here. Since the error message says “timeout," that could mean the hardware produced no timestamp at all, rather than a corrupted one.
> 
> This isn’t a show-stopper for us, but we note that timekeeping daemons with imperfect error handling could react to this situation poorly and perhaps fail silently.
> 
> Bob
> 
>> On Mar 30, 2026, at 11:39 AM, Vadim Fedorenko <vadim.fedorenko@linux.dev> wrote:
>> 
>> On 29/03/2026 04:25, Bob Van Valzah wrote:
>>> Hi,
>>> We found a race in igc_ptp_adjfine_i225() that causes "Tx timestamp
>>> timeout" errors and eventually wedges EXTTS when a PTP grandmaster
>>> (ptp4l with hardware timestamping) runs concurrently with PHC
>>> frequency discipline (any GPSDO calling clock_adjtime ADJ_FREQUENCY).
>>> Root cause: igc_ptp_adjfine_i225() writes IGC_TIMINCA without holding
>>> any lock.  Every other PTP clock operation in igc_ptp.c (adjtime,
>>> gettime, settime) holds tmreg_lock, but adjfine does not.  When the
>>> increment rate changes while the hardware is capturing a TX timestamp,
>>> the captured value is corrupt.  The driver retries for
>>> IGC_PTP_TX_TIMEOUT (15s), then logs the timeout and frees the skb.
>>> Repeated occurrences eventually prevent EXTTS from delivering events.
>>> The attached reproducer (triggers in ~17 seconds on i226):
>>>  One thread calling clock_adjtime(ADJ_FREQUENCY) at ~200k/s on the
>>>  PHC, another sending UDP packets with SO_TIMESTAMPING requesting
>>>  hardware TX timestamps at ~100k/s.  A Python reproducer is at:
>>>  https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_timeout_repro.py
>>>  At realistic rates (1 Hz adjfine from a GPSDO + ptp4l at 128 Hz
>>>  sync), the race triggers in ~30 minutes.
>>> The attached patch holds ptp_tx_lock around the TIMINCA write and
>>> skips the write if any TX timestamps are pending (tx_tstamp[i].skb
>>> != NULL), returning -EBUSY.  This doesn't fully close the hardware
>>> race (a new TX capture can start between the check and the write),
>>> but at realistic rates the residual probability gives ~25 year MTBF
>>> vs ~30 minutes without the patch.
>>> A complete fix would likely require either disabling TX timestamping
>>> around TIMINCA writes (via TSYNCTXCTL), or making the timeout recovery
>>> path more robust so a single corrupt timestamp doesn't wedge the
>>> subsystem.  We'd welcome guidance from the igc maintainers on the
>>> preferred approach.
>>> Tested on:
>>>  - Intel i226 (TimeHAT v5 board on Raspberry Pi 5)
>>>  - Kernel 6.12.62+rpt-rpi-2712 (Raspberry Pi OS)
>>>  - Intel out-of-tree igc driver 5.4.0-7642.46
>>>  - Stock upstream igc_ptp.c (same code, same bug)
>>> Bob
>>> ---
>>> drivers/net/ethernet/intel/igc/igc_ptp.c | 18 +++++++++++++++++-
>>> 1 file changed, 17 insertions(+), 1 deletion(-)
>>> diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
>>> index XXXXXXX..XXXXXXX 100644
>>> --- a/drivers/net/ethernet/intel/igc/igc_ptp.c
>>> +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
>>> @@ -47,8 +47,10 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>>> {
>>>        struct igc_adapter *igc = container_of(ptp, struct igc_adapter,
>>>                                               ptp_caps);
>>>        struct igc_hw *hw = &igc->hw;
>>> +       unsigned long flags;
>>>        int neg_adj = 0;
>>>        u64 rate;
>>>        u32 inca;
>>> +       int i;
>>>        if (scaled_ppm < 0) {
>>>                neg_adj = 1;
>>> @@ -63,7 +65,21 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>>>        if (neg_adj)
>>>                inca |= ISGN;
>>> -       wr32(IGC_TIMINCA, inca);
>>> +       /* Changing the clock increment rate while a TX timestamp is being
>>> +        * captured by the hardware can corrupt the timestamp, causing the
>>> +        * driver to report "Tx timestamp timeout" and eventually wedging
>>> +        * the EXTTS subsystem.  Serialize with pending TX timestamps:
>>> +        * skip the rate change if any are in flight.
>>> +        */
>>> +       spin_lock_irqsave(&igc->ptp_tx_lock, flags);
>>> +       for (i = 0; i < IGC_MAX_TX_TSTAMP_REGS; i++) {
>>> +               if (igc->tx_tstamp[i].skb) {
>>> +                       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
>>> +                       return -EBUSY;
>>> +               }
>>> +       }
>>> +       wr32(IGC_TIMINCA, inca);
>>> +       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
>> 
>> It's a bit weird solution, because in this case we may end up having no
>> successful calls to adjfine with high amount of TX timestamp packets in flight. Another problem here is that access to timing registers is
>> guarded by tmreg_lock, but here you use ptp_tx_lock, which protects
>> queue.
>> 
>> Were you able to recover "corrupted" time stamps to figure out why they
>> are discarded?
>> 
>> 
>>>        return 0;
>>> }
>>> --
>>> 2.39.2
> 
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine]
  2026-04-01  1:14     ` Bob Van Valzah
@ 2026-04-02  0:49       ` Bob Van Valzah
  2026-04-02 23:48         ` [Intel-wired-lan] " Vinicius Costa Gomes
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Van Valzah @ 2026-04-02  0:49 UTC (permalink / raw)
  To: Vadim Fedorenko
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, julianstj, jeff,
	Lasse Johnsen, Ian Gough

Vadim,

Thanks for the feedback on our first patch. We've spent more time in
the lab studying the igc TX timestamp behavior under stress. We
understand the failure modes much better now.

You were right that ptp_tx_lock was the wrong lock — it guards the TX
queue, not the timing registers. You suggested tmreg_lock instead. We
tested tmreg_lock alone (v2) and found it doesn't fix the bug: it appears
that the race is between the software TIMINCA write and the hardware’s
asynchronous TX timestamp capture pipeline, not between two software threads.
tmreg_lock serializes software register accesses but can't prevent the hardware
from reading TIMINCA at the instant software writes it.

Our v3 patch (attached) takes tmreg_lock as you suggested, and
additionally disables TX timestamping in hardware via TSYNCTXCTL around
the TIMINCA write. This prevents the hardware from starting new
timestamp captures during the rate change:

    spin_lock_irqsave(&igc->tmreg_lock, flags);
    txctl = rd32(IGC_TSYNCTXCTL);
    wr32(IGC_TSYNCTXCTL, txctl & ~IGC_TSYNCTXCTL_ENABLED);
    wr32(IGC_TIMINCA, inca);
    wr32(IGC_TSYNCTXCTL, txctl);
    spin_unlock_irqrestore(&igc->tmreg_lock, flags);

v3 doesn't return -EBUSY, so there's no risk of adjfine starvation
under TX load. It always completes the TIMINCA write.

This still isn't a complete fix. Under extreme stress (100k+ adjfine/s
concurrent with 100k+ TX timestamps/s), disabling TSYNCTXCTL can strand
an in-progress capture that started before the disable. The stranded
slot sits occupied for IGC_PTP_TX_TIMEOUT (15s) before the watchdog
clears it. However, these rates are far beyond any realistic PTP
deployment. At realistic rates (1 Hz adjfine + 128 Hz PTP sync), v3
passes all our tests cleanly.

We don't know of a way to prevent pathological system call patterns
from causing the hardware to lock up. The vulnerable window seems to
span the hardware's internal capture-to-latch pipeline, not just the
TIMINCA write instant, and there's no software-visible indication of
when a capture is mid-flight.

Test methodology
----------------

Our earlier multi-test runs gave misleading results because the
15-second timeout pipeline carried stranded slots from one test into
the next. We now reload the igc driver between each test to ensure
clean hardware state. The reproducer is the same one we submitted
previously:

  https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_timeout_repro.py

We also have a single-test runner that reloads the driver automatically:

  https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_single_test.py


Results (clean state, driver reloaded between each test)
--------------------------------------------------------

Stock driver (no patch):

  TX 100/s  + adj 1/s     : no timeout in 60s  (0 TO, 0 skip)
  TX 1k/s   + adj 1/s     : no timeout in 60s  (0 TO, 0 skip)
  TX 10k/s  + adj 1/s     : no timeout in 60s  (0 TO, 0 skip)
  TX 1k/s   + adj 10k/s   : no timeout in 60s  (0 TO, 0 skip)
  TX 10k/s  + adj 100k/s  : no timeout in 60s  (0 TO, 0 skip)
  TX 100k/s + adj 100k/s  : FAIL at 16s        (3 TO, 76 skip)
  TX 100k/s + no adj      : FAIL at 16s        (4 TO, 8M skip)

v3 (tmreg_lock + TSYNCTXCTL disable/enable):

  TX 128/s  + adj 1/s     : no timeout in 300s (0 TO, 0 skip)
  TX 100k/s + adj 100k/s  : FAIL at 17s        (12 TO, 1.7M skip)

Three distinct failure modes
----------------------------

1. TIMINCA corruption (the original bug):

   adjfine writes TIMINCA while a TX timestamp capture is in progress.
   The hardware latches a value computed with an inconsistent increment
   rate. The valid bit in TSYNCTXCTL is never set, the slot stays
   occupied, and the watchdog times out after 15 seconds.

   Signature: low skip count + moderate timeout count. The capture
   acquired a slot normally but the timestamp was corrupt.

   This is the bug that matters in practice. At 1 Hz adjfine + 128 Hz
   PTP sync, the collision probability per adjfine call is low but
   nonzero — we measured ~30 minute MTBF empirically.

2. TX timestamp slot exhaustion:

   At extreme TX rates (100k+) without adjfine, all 4 timestamp slots
   (IGC_MAX_TX_TSTAMP_REGS) stay occupied because the interrupt/drain
   path can't keep up. New packets requesting timestamps are skipped
   (tx_hwtstamp_skipped). Eventually some slot's 15-second timeout
   fires.

   Signature: massive skip count (millions), timeout count = 4 (one
   per slot). This is unrelated to the TIMINCA race.

3. TSYNCTXCTL stranding (v3 side effect):

   When v3 disables TSYNCTXCTL_ENABLED, a capture that was already
   in progress may never complete — the hardware won't set the valid
   bit for a capture started while timestamping was enabled if it
   finishes after timestamping is disabled. The slot stays occupied
   for 15 seconds.

   Signature: high skip count + high timeout count. At 100k adj/s,
   timestamping is disabled 100k times per second, frequently
   stranding captures.

   This doesn't occur at realistic rates because the 1 Hz adjfine
   only creates one brief (~1 µs) disable window per second.


PTP GM scaling implications
---------------------------

With adjfine fixed at 1/s (realistic for any GPSDO), the adjfine race
is not the scaling bottleneck.  The limit is the 4-slot TX timestamp
queue (IGC_MAX_TX_TSTAMP_REGS).  We swept TX timestamp rates from
128/s to 150k/s with 1 Hz adjfine and driver reload between each
test (v3 patch):

  TX   128/s + adj 1/s : clean 60s   (1 PTP client at 128 Hz)
  TX  1024/s + adj 1/s : clean 60s   (~8 clients)
  TX  8192/s + adj 1/s : clean 60s   (~64 clients)
  TX 65536/s + adj 1/s : clean 60s   (~512 clients)
  TX 100k/s  + adj 1/s : FAIL 17s    (12 TO, 9M skip — slot exhaustion)
  TX 100k/s  + NO adj  : FAIL 17s    (12 TO, 10M skip — same without adj)

The 100k/s failure is slot exhaustion (identical with and without
adjfine), not the TIMINCA race.  The i225/i226 hardware can handle
roughly 65k TX timestamps per second before the 4-slot queue becomes
the bottleneck.  This limits the i226 to approximately 500 PTP
clients at logSyncInterval -7 (128 Hz sync).

The adjfine race itself has negligible probability at 1 Hz adjfine.
The TSYNCTXCTL disable window in v3 is ~1 µs per second — a duty
cycle of 10^-6.  Even at 65k TX timestamps/s, the probability of
a capture starting during that window is ~0.065 per second, giving
an estimated MTBF of many hours.


Note on igc_ptp_tx_hang
------------------------

We noticed that igc_ptp_tx_hang() does not check TSYNCTXCTL valid
bits before declaring a timeout. It only checks whether 15 seconds
have elapsed since the slot was assigned. If the hardware captured a
valid timestamp but the completion interrupt was lost or coalesced,
the valid timestamp is thrown away by the timeout handler (which reads
TXSTMPH_0 to clear all valid bits). A more robust timeout handler
could check TSYNCTXCTL and salvage valid-but-undelivered timestamps
before declaring a timeout. This wouldn't fix the TIMINCA race but
would make the system more resilient to lost interrupts.


Tested on:
  - Intel i226 (TimeHAT board on Raspberry Pi 5)
  - Kernel 6.12.62+rpt-rpi-2712 (Raspberry Pi OS)
  - Intel out-of-tree igc driver 5.4.0-7642.46

Thanks,
Bob Van Valzah

—

 drivers/net/ethernet/intel/igc/igc_ptp.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/int
el/igc/igc_ptp.c
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -47,6 +47,8 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, lo
ng scaled_ppm)
 {
        struct igc_adapter *igc = container_of(ptp, struct igc_adapter,
                                               ptp_caps);
        struct igc_hw *hw = &igc->hw;
+       unsigned long flags;
+       u32 txctl;
        int neg_adj = 0;
        u64 rate;
        u32 inca;
@@ -63,7 +65,17 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, l
ong scaled_ppm)
        if (neg_adj)
                inca |= ISGN;

-       wr32(IGC_TIMINCA, inca);
+       /* Changing TIMINCA while the hardware is capturing a TX timestamp
+        * corrupts the timestamp, causing "Tx timestamp timeout."
+        * Temporarily disable TX timestamping so no new capture can start
+        * during the rate change.
+        */
+       spin_lock_irqsave(&igc->tmreg_lock, flags);
+       txctl = rd32(IGC_TSYNCTXCTL);
+       wr32(IGC_TSYNCTXCTL, txctl & ~IGC_TSYNCTXCTL_ENABLED);
+       wr32(IGC_TIMINCA, inca);
+       wr32(IGC_TSYNCTXCTL, txctl);
+       spin_unlock_irqrestore(&igc->tmreg_lock, flags);

        return 0;
 }
--
2.39.2


> On Mar 31, 2026, at 6:14 PM, Bob Van Valzah <Bob@VanValzah.Com> wrote:
> 
> There is one other symptom of this bug that we neglected to report previously: PEROUT stops. That’s PPS OUT for us, so it’s how we first noticed the problem. Recovery was to unload and reload the igc driver.
> 
> I take this as further evidence for my hunch that the hardware just reaches some internal “this should never happen” state and the PHC just stops ticking.
> 
> Hoping this additional symptom report might help with diagnosis.
> 
> Bob
> 
>> On Mar 30, 2026, at 12:42 PM, Bob Van Valzah <Bob@VanValzah.Com> wrote:
>> 
>> Thanks for following up Vadim. To be clear, we do not see our patch as a solution to the problem, since it does not eliminate the underlying contention, it just reduces the likelihood.
>> 
>> We now have a bit more hands-on time running the patch and can see its limitations. The driver just throws the timeout error less often now. Moreover, our code calling adjfine() has to be ready for an EBUSY. The patch is certainly not an elegant solution. It may well have worked better, or perhaps been a complete fix, if we had taken tmreg_lock. Sorry if we sent our patch prematurely.
>> 
>> We think the contention happens when a system call made by a clock disciplining daemon like ts2phc changes a PHC's frequency, while a timestamping daemon like ptp4l has requested a timestamp against that same PHC. A plausible explanation is that the hardware fails to produce a timestamp for a packet when it collides with a PHC frequency adjustment, leading to the timestamp timeout. We see this contention 128x more often than the average user because we cranked our PTP sync rate up to 128 Hz. We care about every ns here. The likelihood probably also scales with the number of PTP clients.
>> 
>> Sorry, in retrospect, we may have overstated our case in declaring the root cause as the lack of locking, since our locking only reduces the likelihood of the timestamp timeout. Intel may provide insight to the underlying cause of the timeout, but my hunch is that the collision of a timestamp request against a PHC and a frequency change of the same PHC causes the timestamp request to fail, leading to the timeout. Our repro code makes this happen so reliably that it should be easy for the hardware guys to explain exactly what’s going on. Hopefully, they can advise on a more elegant avoidance strategy than our ham-handed lock.
>> 
>> We have not tried recovering a timestamp following the error. We may have used the term “corrupted” poorly here. Since the error message says “timeout," that could mean the hardware produced no timestamp at all, rather than a corrupted one.
>> 
>> This isn’t a show-stopper for us, but we note that timekeeping daemons with imperfect error handling could react to this situation poorly and perhaps fail silently.
>> 
>> Bob
>> 
>>> On Mar 30, 2026, at 11:39 AM, Vadim Fedorenko <vadim.fedorenko@linux.dev> wrote:
>>> 
>>> On 29/03/2026 04:25, Bob Van Valzah wrote:
>>>> Hi,
>>>> We found a race in igc_ptp_adjfine_i225() that causes "Tx timestamp
>>>> timeout" errors and eventually wedges EXTTS when a PTP grandmaster
>>>> (ptp4l with hardware timestamping) runs concurrently with PHC
>>>> frequency discipline (any GPSDO calling clock_adjtime ADJ_FREQUENCY).
>>>> Root cause: igc_ptp_adjfine_i225() writes IGC_TIMINCA without holding
>>>> any lock.  Every other PTP clock operation in igc_ptp.c (adjtime,
>>>> gettime, settime) holds tmreg_lock, but adjfine does not.  When the
>>>> increment rate changes while the hardware is capturing a TX timestamp,
>>>> the captured value is corrupt.  The driver retries for
>>>> IGC_PTP_TX_TIMEOUT (15s), then logs the timeout and frees the skb.
>>>> Repeated occurrences eventually prevent EXTTS from delivering events.
>>>> The attached reproducer (triggers in ~17 seconds on i226):
>>>> One thread calling clock_adjtime(ADJ_FREQUENCY) at ~200k/s on the
>>>> PHC, another sending UDP packets with SO_TIMESTAMPING requesting
>>>> hardware TX timestamps at ~100k/s.  A Python reproducer is at:
>>>> https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_timeout_repro.py
>>>> At realistic rates (1 Hz adjfine from a GPSDO + ptp4l at 128 Hz
>>>> sync), the race triggers in ~30 minutes.
>>>> The attached patch holds ptp_tx_lock around the TIMINCA write and
>>>> skips the write if any TX timestamps are pending (tx_tstamp[i].skb
>>>> != NULL), returning -EBUSY.  This doesn't fully close the hardware
>>>> race (a new TX capture can start between the check and the write),
>>>> but at realistic rates the residual probability gives ~25 year MTBF
>>>> vs ~30 minutes without the patch.
>>>> A complete fix would likely require either disabling TX timestamping
>>>> around TIMINCA writes (via TSYNCTXCTL), or making the timeout recovery
>>>> path more robust so a single corrupt timestamp doesn't wedge the
>>>> subsystem.  We'd welcome guidance from the igc maintainers on the
>>>> preferred approach.
>>>> Tested on:
>>>> - Intel i226 (TimeHAT v5 board on Raspberry Pi 5)
>>>> - Kernel 6.12.62+rpt-rpi-2712 (Raspberry Pi OS)
>>>> - Intel out-of-tree igc driver 5.4.0-7642.46
>>>> - Stock upstream igc_ptp.c (same code, same bug)
>>>> Bob
>>>> ---
>>>> drivers/net/ethernet/intel/igc/igc_ptp.c | 18 +++++++++++++++++-
>>>> 1 file changed, 17 insertions(+), 1 deletion(-)
>>>> diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
>>>> index XXXXXXX..XXXXXXX 100644
>>>> --- a/drivers/net/ethernet/intel/igc/igc_ptp.c
>>>> +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
>>>> @@ -47,8 +47,10 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>>>> {
>>>>       struct igc_adapter *igc = container_of(ptp, struct igc_adapter,
>>>>                                              ptp_caps);
>>>>       struct igc_hw *hw = &igc->hw;
>>>> +       unsigned long flags;
>>>>       int neg_adj = 0;
>>>>       u64 rate;
>>>>       u32 inca;
>>>> +       int i;
>>>>       if (scaled_ppm < 0) {
>>>>               neg_adj = 1;
>>>> @@ -63,7 +65,21 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
>>>>       if (neg_adj)
>>>>               inca |= ISGN;
>>>> -       wr32(IGC_TIMINCA, inca);
>>>> +       /* Changing the clock increment rate while a TX timestamp is being
>>>> +        * captured by the hardware can corrupt the timestamp, causing the
>>>> +        * driver to report "Tx timestamp timeout" and eventually wedging
>>>> +        * the EXTTS subsystem.  Serialize with pending TX timestamps:
>>>> +        * skip the rate change if any are in flight.
>>>> +        */
>>>> +       spin_lock_irqsave(&igc->ptp_tx_lock, flags);
>>>> +       for (i = 0; i < IGC_MAX_TX_TSTAMP_REGS; i++) {
>>>> +               if (igc->tx_tstamp[i].skb) {
>>>> +                       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
>>>> +                       return -EBUSY;
>>>> +               }
>>>> +       }
>>>> +       wr32(IGC_TIMINCA, inca);
>>>> +       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
>>> 
>>> It's a bit weird solution, because in this case we may end up having no
>>> successful calls to adjfine with high amount of TX timestamp packets in flight. Another problem here is that access to timing registers is
>>> guarded by tmreg_lock, but here you use ptp_tx_lock, which protects
>>> queue.
>>> 
>>> Were you able to recover "corrupted" time stamps to figure out why they
>>> are discarded?
>>> 
>>> 
>>>>       return 0;
>>>> }
>>>> --
>>>> 2.39.2
>> 
>> 
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Intel-wired-lan] [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine]
  2026-04-02  0:49       ` [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine] Bob Van Valzah
@ 2026-04-02 23:48         ` Vinicius Costa Gomes
  0 siblings, 0 replies; 6+ messages in thread
From: Vinicius Costa Gomes @ 2026-04-02 23:48 UTC (permalink / raw)
  To: Bob Van Valzah, Vadim Fedorenko
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, julianstj, jeff,
	Lasse Johnsen, Ian Gough

Hi,

Bob Van Valzah <bob@vanvalzah.com> writes:

> Vadim,
>
> Thanks for the feedback on our first patch. We've spent more time in
> the lab studying the igc TX timestamp behavior under stress. We
> understand the failure modes much better now.
>
> You were right that ptp_tx_lock was the wrong lock — it guards the TX
> queue, not the timing registers. You suggested tmreg_lock instead. We
> tested tmreg_lock alone (v2) and found it doesn't fix the bug: it appears
> that the race is between the software TIMINCA write and the hardware’s
> asynchronous TX timestamp capture pipeline, not between two software threads.
> tmreg_lock serializes software register accesses but can't prevent the hardware
> from reading TIMINCA at the instant software writes it.
>
> Our v3 patch (attached) takes tmreg_lock as you suggested, and
> additionally disables TX timestamping in hardware via TSYNCTXCTL around
> the TIMINCA write. This prevents the hardware from starting new
> timestamp captures during the rate change:
>
>     spin_lock_irqsave(&igc->tmreg_lock, flags);
>     txctl = rd32(IGC_TSYNCTXCTL);
>     wr32(IGC_TSYNCTXCTL, txctl & ~IGC_TSYNCTXCTL_ENABLED);
>     wr32(IGC_TIMINCA, inca);
>     wr32(IGC_TSYNCTXCTL, txctl);
>     spin_unlock_irqrestore(&igc->tmreg_lock, flags);
>

I sent, a couple of days ago, the link to your report to our hardware
folks, waiting for them to take a look.

I think that this workaround, even if incomplete, will be interesting to
them as well. Again, thanks for the detailed report.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-04-02 23:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-29  3:25 [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine Bob Van Valzah
2026-03-30 16:39 ` Vadim Fedorenko
2026-03-30 19:42   ` Bob Van Valzah
2026-04-01  1:14     ` Bob Van Valzah
2026-04-02  0:49       ` [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine] Bob Van Valzah
2026-04-02 23:48         ` [Intel-wired-lan] " Vinicius Costa Gomes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox