public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine
@ 2026-03-29  3:25 Bob Van Valzah
  2026-03-30 16:39 ` Vadim Fedorenko
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Van Valzah @ 2026-03-29  3:25 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: anthony.l.nguyen, netdev, julianstj, jeff

[-- Attachment #1: Type: text/plain, Size: 3897 bytes --]

Hi,

We found a race in igc_ptp_adjfine_i225() that causes "Tx timestamp
timeout" errors and eventually wedges EXTTS when a PTP grandmaster
(ptp4l with hardware timestamping) runs concurrently with PHC
frequency discipline (any GPSDO calling clock_adjtime ADJ_FREQUENCY).

Root cause: igc_ptp_adjfine_i225() writes IGC_TIMINCA without holding
any lock.  Every other PTP clock operation in igc_ptp.c (adjtime,
gettime, settime) holds tmreg_lock, but adjfine does not.  When the
increment rate changes while the hardware is capturing a TX timestamp,
the captured value is corrupt.  The driver retries for
IGC_PTP_TX_TIMEOUT (15s), then logs the timeout and frees the skb.
Repeated occurrences eventually prevent EXTTS from delivering events.

The attached reproducer (triggers in ~17 seconds on i226):

  One thread calling clock_adjtime(ADJ_FREQUENCY) at ~200k/s on the
  PHC, another sending UDP packets with SO_TIMESTAMPING requesting
  hardware TX timestamps at ~100k/s.  A Python reproducer is at:
  https://github.com/bobvan/PePPAR-Fix/blob/main/tools/igc_tx_timeout_repro.py

  At realistic rates (1 Hz adjfine from a GPSDO + ptp4l at 128 Hz
  sync), the race triggers in ~30 minutes.

The attached patch holds ptp_tx_lock around the TIMINCA write and
skips the write if any TX timestamps are pending (tx_tstamp[i].skb
!= NULL), returning -EBUSY.  This doesn't fully close the hardware
race (a new TX capture can start between the check and the write),
but at realistic rates the residual probability gives ~25 year MTBF
vs ~30 minutes without the patch.

A complete fix would likely require either disabling TX timestamping
around TIMINCA writes (via TSYNCTXCTL), or making the timeout recovery
path more robust so a single corrupt timestamp doesn't wedge the
subsystem.  We'd welcome guidance from the igc maintainers on the
preferred approach.

Tested on:
  - Intel i226 (TimeHAT v5 board on Raspberry Pi 5)
  - Kernel 6.12.62+rpt-rpi-2712 (Raspberry Pi OS)
  - Intel out-of-tree igc driver 5.4.0-7642.46
  - Stock upstream igc_ptp.c (same code, same bug)

	Bob

---

 drivers/net/ethernet/intel/igc/igc_ptp.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index XXXXXXX..XXXXXXX 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -47,8 +47,10 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
 {
        struct igc_adapter *igc = container_of(ptp, struct igc_adapter,
                                               ptp_caps);
        struct igc_hw *hw = &igc->hw;
+       unsigned long flags;
        int neg_adj = 0;
        u64 rate;
        u32 inca;
+       int i;

        if (scaled_ppm < 0) {
                neg_adj = 1;
@@ -63,7 +65,21 @@ static int igc_ptp_adjfine_i225(struct ptp_clock_info *ptp, long scaled_ppm)
        if (neg_adj)
                inca |= ISGN;

-       wr32(IGC_TIMINCA, inca);
+       /* Changing the clock increment rate while a TX timestamp is being
+        * captured by the hardware can corrupt the timestamp, causing the
+        * driver to report "Tx timestamp timeout" and eventually wedging
+        * the EXTTS subsystem.  Serialize with pending TX timestamps:
+        * skip the rate change if any are in flight.
+        */
+       spin_lock_irqsave(&igc->ptp_tx_lock, flags);
+       for (i = 0; i < IGC_MAX_TX_TSTAMP_REGS; i++) {
+               if (igc->tx_tstamp[i].skb) {
+                       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);
+                       return -EBUSY;
+               }
+       }
+       wr32(IGC_TIMINCA, inca);
+       spin_unlock_irqrestore(&igc->ptp_tx_lock, flags);

        return 0;
 }
--
2.39.2

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: igc_tx_timeout_repro.py --]
[-- Type: text/x-python-script; x-unix-mode=0644; name="igc_tx_timeout_repro.py", Size: 6600 bytes --]

#!/usr/bin/env python3
"""Reproducer for igc driver Tx timestamp timeout bug.

The igc driver (Intel i225/i226) has a race between clock_adjtime
(ADJ_FREQUENCY) and hardware TX timestamping.  When adjfine() is
called while a TX timestamp is pending, the timestamp register can
wedge, producing "Tx timestamp timeout" errors in dmesg and eventually
breaking EXTTS (PPS capture).

This reproducer runs two threads:
  1. Hammers clock_adjtime(ADJ_FREQUENCY) on the PHC
  2. Sends UDP packets with SO_TIMESTAMPING (hardware TX timestamps)

On a vulnerable igc driver, "Tx timestamp timeout" appears in dmesg
within seconds.

Usage:
    sudo python3 tools/igc_tx_timeout_repro.py eth1 /dev/ptp0
    # Watch with: dmesg -w | grep "Tx timestamp"

Requires root for SO_TIMESTAMPING and clock_adjtime on PHC.
"""

import ctypes
import ctypes.util
import os
import socket
import struct
import sys
import threading
import time


def get_phc_clockid(fd):
    return (~fd << 3) | 3


def adjfine_loop(phc_fd, stop_event, stats):
    """Hammer adjfine() on the PHC."""
    librt = ctypes.CDLL(ctypes.util.find_library("rt"), use_errno=True)

    class Timeval(ctypes.Structure):
        _fields_ = [("tv_sec", ctypes.c_long), ("tv_usec", ctypes.c_long)]

    class Timex(ctypes.Structure):
        _fields_ = [
            ("modes", ctypes.c_uint),
            ("offset", ctypes.c_long),
            ("freq", ctypes.c_long),
            ("maxerror", ctypes.c_long),
            ("esterror", ctypes.c_long),
            ("status", ctypes.c_int),
            ("constant", ctypes.c_long),
            ("precision", ctypes.c_long),
            ("tolerance", ctypes.c_long),
            ("time", Timeval),
            ("tick", ctypes.c_long),
            ("ppsfreq", ctypes.c_long),
            ("jitter", ctypes.c_long),
            ("shift", ctypes.c_int),
            ("stabil", ctypes.c_long),
            ("jitcnt", ctypes.c_long),
            ("calcnt", ctypes.c_long),
            ("errcnt", ctypes.c_long),
            ("stbcnt", ctypes.c_long),
            ("tai", ctypes.c_int),
        ]

    ADJ_FREQUENCY = 0x0002
    clockid = get_phc_clockid(phc_fd)
    n = 0
    toggle = False

    while not stop_event.is_set():
        tx = Timex()
        tx.modes = ADJ_FREQUENCY
        # Alternate between two tiny frequency offsets
        tx.freq = 100 if toggle else -100  # ~0.0015 ppb
        toggle = not toggle
        ret = librt.clock_adjtime(ctypes.c_int32(clockid), ctypes.byref(tx))
        if ret < 0:
            print(f"adjfine error: {ctypes.get_errno()}")
            break
        n += 1

    stats["adjfine_count"] = n


def tx_timestamp_loop(iface, stop_event, stats):
    """Send UDP packets requesting hardware TX timestamps."""
    # SOL_SOCKET options for timestamping
    SO_TIMESTAMPING = 37
    SOF_TIMESTAMPING_TX_HARDWARE = (1 << 0)
    SOF_TIMESTAMPING_RAW_HARDWARE = (1 << 6)
    SOF_TIMESTAMPING_OPT_TSONLY = (1 << 11)

    flags = (SOF_TIMESTAMPING_TX_HARDWARE |
             SOF_TIMESTAMPING_RAW_HARDWARE |
             SOF_TIMESTAMPING_OPT_TSONLY)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, SO_TIMESTAMPING, flags)
    # Bind to the specific interface
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
                    iface.encode() + b'\0')
    sock.settimeout(0.01)

    # Send to a harmless destination (localhost or broadcast)
    dest = ("224.0.0.1", 9999)  # multicast, won't route
    payload = b"igc_repro" + b"\x00" * 32
    n = 0
    errors = 0

    while not stop_event.is_set():
        try:
            sock.sendto(payload, dest)
            n += 1
        except OSError:
            errors += 1
        # Don't sleep — maximize collision probability
        # But yield to avoid starving the adjfine thread
        if n % 100 == 0:
            time.sleep(0.0001)

    sock.close()
    stats["tx_count"] = n
    stats["tx_errors"] = errors


def check_dmesg_for_timeout():
    """Check if Tx timestamp timeout appeared in dmesg."""
    try:
        import subprocess
        result = subprocess.run(
            ["dmesg", "--since", "60 seconds ago"],
            capture_output=True, text=True, timeout=2
        )
        return "Tx timestamp timeout" in result.stdout
    except Exception:
        return False


def main():
    if len(sys.argv) < 3:
        print(f"Usage: sudo {sys.argv[0]} <interface> <ptp_device>")
        print(f"Example: sudo {sys.argv[0]} eth1 /dev/ptp0")
        sys.exit(1)

    iface = sys.argv[1]
    ptp_dev = sys.argv[2]
    duration = float(sys.argv[3]) if len(sys.argv) > 3 else 30.0

    if os.geteuid() != 0:
        print("ERROR: must run as root (sudo)")
        sys.exit(1)

    phc_fd = os.open(ptp_dev, os.O_RDWR)
    print(f"Opened {ptp_dev} (fd={phc_fd})")
    print(f"Interface: {iface}")
    print(f"Duration: {duration}s")
    print(f"Watch for: dmesg -w | grep 'Tx timestamp timeout'")
    print()

    stop = threading.Event()
    stats = {}

    t_adj = threading.Thread(target=adjfine_loop, args=(phc_fd, stop, stats))
    t_tx = threading.Thread(target=tx_timestamp_loop, args=(iface, stop, stats))

    t_adj.start()
    t_tx.start()

    start = time.monotonic()
    triggered = False
    while time.monotonic() - start < duration:
        time.sleep(1)
        elapsed = time.monotonic() - start
        if check_dmesg_for_timeout():
            print(f"\n*** Tx timestamp timeout detected after {elapsed:.1f}s ***")
            triggered = True
            break
        print(f"  {elapsed:.0f}s: running...", end="\r")

    stop.set()
    t_adj.join(timeout=2)
    t_tx.join(timeout=2)
    os.close(phc_fd)

    adj_n = stats.get("adjfine_count", 0)
    tx_n = stats.get("tx_count", 0)
    tx_err = stats.get("tx_errors", 0)
    elapsed = time.monotonic() - start

    print(f"\nResults ({elapsed:.1f}s):")
    print(f"  adjfine calls: {adj_n} ({adj_n/elapsed:.0f}/s)")
    print(f"  TX packets:    {tx_n} ({tx_n/elapsed:.0f}/s), {tx_err} errors")

    if triggered:
        print(f"\n  BUG TRIGGERED: igc Tx timestamp timeout")
        print(f"  The igc driver's PTP timestamp register wedged due to")
        print(f"  concurrent adjfine() and hardware TX timestamping.")
        sys.exit(1)
    else:
        print(f"\n  No timeout detected in {duration}s")
        sys.exit(0)


if __name__ == "__main__":
    main()

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-04-02 23:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-29  3:25 [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine Bob Van Valzah
2026-03-30 16:39 ` Vadim Fedorenko
2026-03-30 19:42   ` Bob Van Valzah
2026-04-01  1:14     ` Bob Van Valzah
2026-04-02  0:49       ` [PATCH] igc: fix Tx timestamp timeout caused by unlocked TIMINCA write in adj fine] Bob Van Valzah
2026-04-02 23:48         ` [Intel-wired-lan] " Vinicius Costa Gomes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox