* [Intel-wired-lan] [PATCH iwl-net v4 1/6] igc: fix PTM cycle trigger logic
2025-04-01 23:35 [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Jacob Keller
@ 2025-04-01 23:35 ` Jacob Keller
2025-04-02 10:46 ` Corinna Vinschen
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 2/6] igc: increase wait time before retrying PTM Jacob Keller
` (5 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2025-04-01 23:35 UTC (permalink / raw)
To: Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Jacob Keller, Christopher S M Hall, Michal Swiatkowski,
Mor Bar-Gabay, Avigail Dahan, Corinna Vinschen
From: Christopher S M Hall <christopher.s.hall@intel.com>
Writing to clear the PTM status 'valid' bit while the PTM cycle is
triggered results in unreliable PTM operation. To fix this, clear the
PTM 'trigger' and status after each PTM transaction.
The issue can be reproduced with the following:
$ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
quickly reproduce the issue.
PHC2SYS exits with:
"ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
fails
This patch also fixes a hang in igc_probe() when loading the igc
driver in the kdump kernel on systems supporting PTM.
The igc driver running in the base kernel enables PTM trigger in
igc_probe(). Therefore the driver is always in PTM trigger mode,
except in brief periods when manually triggering a PTM cycle.
When a crash occurs, the NIC is reset while PTM trigger is enabled.
Due to a hardware problem, the NIC is subsequently in a bad busmaster
state and doesn't handle register reads/writes. When running
igc_probe() in the kdump kernel, the first register access to a NIC
register hangs driver probing and ultimately breaks kdump.
With this patch, igc has PTM trigger disabled most of the time,
and the trigger is only enabled for very brief (10 - 100 us) periods
when manually triggering a PTM cycle. Chances that a crash occurs
during a PTM trigger are not 0, but extremely reduced.
Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/igc/igc_defines.h | 1 +
drivers/net/ethernet/intel/igc/igc_ptp.c | 70 ++++++++++++++++------------
2 files changed, 42 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 8e449904aa7dbd12ea1181c9635a909e4c50afda..2ff292f5f63be29e42dc4491a56602d811cf22cc 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -593,6 +593,7 @@
#define IGC_PTM_STAT_T4M1_OVFL BIT(3) /* T4 minus T1 overflow */
#define IGC_PTM_STAT_ADJUST_1ST BIT(4) /* 1588 timer adjusted during 1st PTM cycle */
#define IGC_PTM_STAT_ADJUST_CYC BIT(5) /* 1588 timer adjusted during non-1st PTM cycle */
+#define IGC_PTM_STAT_ALL GENMASK(5, 0) /* Used to clear all status */
/* PCIe PTM Cycle Control */
#define IGC_PTM_CYCLE_CTRL_CYC_TIME(msec) ((msec) & 0x3ff) /* PTM Cycle Time (msec) */
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 946edbad43022c9fdb5f2196b72c0e2d07436ed5..c640e346342be80fb53e68455d510fc6491366cd 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -974,13 +974,40 @@ static void igc_ptm_log_error(struct igc_adapter *adapter, u32 ptm_stat)
}
}
+static void igc_ptm_trigger(struct igc_hw *hw)
+{
+ u32 ctrl;
+
+ /* To "manually" start the PTM cycle we need to set the
+ * trigger (TRIG) bit
+ */
+ ctrl = rd32(IGC_PTM_CTRL);
+ ctrl |= IGC_PTM_CTRL_TRIG;
+ wr32(IGC_PTM_CTRL, ctrl);
+ /* Perform flush after write to CTRL register otherwise
+ * transaction may not start
+ */
+ wrfl();
+}
+
+static void igc_ptm_reset(struct igc_hw *hw)
+{
+ u32 ctrl;
+
+ ctrl = rd32(IGC_PTM_CTRL);
+ ctrl &= ~IGC_PTM_CTRL_TRIG;
+ wr32(IGC_PTM_CTRL, ctrl);
+ /* Write to clear all status */
+ wr32(IGC_PTM_STAT, IGC_PTM_STAT_ALL);
+}
+
static int igc_phc_get_syncdevicetime(ktime_t *device,
struct system_counterval_t *system,
void *ctx)
{
- u32 stat, t2_curr_h, t2_curr_l, ctrl;
struct igc_adapter *adapter = ctx;
struct igc_hw *hw = &adapter->hw;
+ u32 stat, t2_curr_h, t2_curr_l;
int err, count = 100;
ktime_t t1, t2_curr;
@@ -994,25 +1021,13 @@ static int igc_phc_get_syncdevicetime(ktime_t *device,
* are transitory. Repeating the process returns valid
* data eventually.
*/
-
- /* To "manually" start the PTM cycle we need to clear and
- * then set again the TRIG bit.
- */
- ctrl = rd32(IGC_PTM_CTRL);
- ctrl &= ~IGC_PTM_CTRL_TRIG;
- wr32(IGC_PTM_CTRL, ctrl);
- ctrl |= IGC_PTM_CTRL_TRIG;
- wr32(IGC_PTM_CTRL, ctrl);
-
- /* The cycle only starts "for real" when software notifies
- * that it has read the registers, this is done by setting
- * VALID bit.
- */
- wr32(IGC_PTM_STAT, IGC_PTM_STAT_VALID);
+ igc_ptm_trigger(hw);
err = readx_poll_timeout(rd32, IGC_PTM_STAT, stat,
stat, IGC_PTM_STAT_SLEEP,
IGC_PTM_STAT_TIMEOUT);
+ igc_ptm_reset(hw);
+
if (err < 0) {
netdev_err(adapter->netdev, "Timeout reading IGC_PTM_STAT register\n");
return err;
@@ -1021,15 +1036,7 @@ static int igc_phc_get_syncdevicetime(ktime_t *device,
if ((stat & IGC_PTM_STAT_VALID) == IGC_PTM_STAT_VALID)
break;
- if (stat & ~IGC_PTM_STAT_VALID) {
- /* An error occurred, log it. */
- igc_ptm_log_error(adapter, stat);
- /* The STAT register is write-1-to-clear (W1C),
- * so write the previous error status to clear it.
- */
- wr32(IGC_PTM_STAT, stat);
- continue;
- }
+ igc_ptm_log_error(adapter, stat);
} while (--count);
if (!count) {
@@ -1255,7 +1262,7 @@ void igc_ptp_stop(struct igc_adapter *adapter)
void igc_ptp_reset(struct igc_adapter *adapter)
{
struct igc_hw *hw = &adapter->hw;
- u32 cycle_ctrl, ctrl;
+ u32 cycle_ctrl, ctrl, stat;
unsigned long flags;
u32 timadj;
@@ -1290,14 +1297,19 @@ void igc_ptp_reset(struct igc_adapter *adapter)
ctrl = IGC_PTM_CTRL_EN |
IGC_PTM_CTRL_START_NOW |
IGC_PTM_CTRL_SHRT_CYC(IGC_PTM_SHORT_CYC_DEFAULT) |
- IGC_PTM_CTRL_PTM_TO(IGC_PTM_TIMEOUT_DEFAULT) |
- IGC_PTM_CTRL_TRIG;
+ IGC_PTM_CTRL_PTM_TO(IGC_PTM_TIMEOUT_DEFAULT);
wr32(IGC_PTM_CTRL, ctrl);
/* Force the first cycle to run. */
- wr32(IGC_PTM_STAT, IGC_PTM_STAT_VALID);
+ igc_ptm_trigger(hw);
+ if (readx_poll_timeout_atomic(rd32, IGC_PTM_STAT, stat,
+ stat, IGC_PTM_STAT_SLEEP,
+ IGC_PTM_STAT_TIMEOUT))
+ netdev_err(adapter->netdev, "Timeout reading IGC_PTM_STAT register\n");
+
+ igc_ptm_reset(hw);
break;
default:
/* No work to do. */
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v4 1/6] igc: fix PTM cycle trigger logic
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 1/6] igc: fix PTM cycle trigger logic Jacob Keller
@ 2025-04-02 10:46 ` Corinna Vinschen
0 siblings, 0 replies; 12+ messages in thread
From: Corinna Vinschen @ 2025-04-02 10:46 UTC (permalink / raw)
To: Jacob Keller
Cc: Anthony Nguyen, david.zage, vinicius.gomes, rodrigo.cadore,
intel-wired-lan, netdev, Christopher S M Hall, Michal Swiatkowski,
Mor Bar-Gabay, Avigail Dahan
On Apr 1 16:35, Jacob Keller wrote:
> From: Christopher S M Hall <christopher.s.hall@intel.com>
>
> Writing to clear the PTM status 'valid' bit while the PTM cycle is
> triggered results in unreliable PTM operation. To fix this, clear the
> PTM 'trigger' and status after each PTM transaction.
>
> The issue can be reproduced with the following:
>
> $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
>
> Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
> quickly reproduce the issue.
>
> PHC2SYS exits with:
>
> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
> fails
>
> This patch also fixes a hang in igc_probe() when loading the igc
> driver in the kdump kernel on systems supporting PTM.
>
> The igc driver running in the base kernel enables PTM trigger in
> igc_probe(). Therefore the driver is always in PTM trigger mode,
> except in brief periods when manually triggering a PTM cycle.
>
> When a crash occurs, the NIC is reset while PTM trigger is enabled.
> Due to a hardware problem, the NIC is subsequently in a bad busmaster
> state and doesn't handle register reads/writes. When running
> igc_probe() in the kdump kernel, the first register access to a NIC
> register hangs driver probing and ultimately breaks kdump.
>
> With this patch, igc has PTM trigger disabled most of the time,
> and the trigger is only enabled for very brief (10 - 100 us) periods
> when manually triggering a PTM cycle. Chances that a crash occurs
> during a PTM trigger are not 0, but extremely reduced.
>
> Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
> Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
> Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
> Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Corinna Vinschen <vinschen@redhat.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
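The trigger, poll, reset ordering described in patch 1/6 can be sketched in user space as follows. This is only an illustration against a simulated register pair, not the driver code: `struct sim_ptm`, the `sim_*` helpers, the `SIM_*` bit positions and the `polls_until_valid` latency model are all assumptions made up for the example. The point it tries to show is that the trigger bit is set only for the duration of one transaction, and that trigger and status are both cleared afterwards whether or not the poll succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PTM control/status bits (illustrative values). */
#define SIM_PTM_CTRL_TRIG  (1u << 31)
#define SIM_PTM_STAT_VALID (1u << 0)
#define SIM_PTM_STAT_ALL   0x3fu /* six status bits, like GENMASK(5, 0) */

/* Simulated device: two registers plus a made-up completion latency. */
struct sim_ptm {
	uint32_t ctrl;
	uint32_t stat;
	int polls_until_valid;
};

/* Start one PTM cycle by setting the trigger bit. */
static void sim_trigger(struct sim_ptm *p)
{
	p->ctrl |= SIM_PTM_CTRL_TRIG;
}

/* Drop the trigger first, then clear every status bit. */
static void sim_reset(struct sim_ptm *p)
{
	p->ctrl &= ~SIM_PTM_CTRL_TRIG;
	p->stat &= ~SIM_PTM_STAT_ALL;
}

/* The simulated hardware only makes progress while it is triggered. */
static uint32_t sim_read_stat(struct sim_ptm *p)
{
	if ((p->ctrl & SIM_PTM_CTRL_TRIG) && p->polls_until_valid > 0) {
		if (--p->polls_until_valid == 0)
			p->stat |= SIM_PTM_STAT_VALID;
	}
	return p->stat;
}

/* One transaction: trigger, poll for any status bit, always reset. */
static bool sim_transaction(struct sim_ptm *p, int max_polls)
{
	uint32_t stat = 0;

	assert(max_polls > 0);
	sim_trigger(p);
	while (max_polls-- > 0) {
		stat = sim_read_stat(p);
		if (stat)
			break;
	}
	sim_reset(p); /* runs on success and on timeout alike */

	return (stat & SIM_PTM_STAT_VALID) != 0;
}
```

In this model a timed-out transaction still leaves the trigger bit clear, which mirrors the commit message's claim that the trigger is now enabled only briefly.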
* [Intel-wired-lan] [PATCH iwl-net v4 2/6] igc: increase wait time before retrying PTM
2025-04-01 23:35 [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Jacob Keller
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 1/6] igc: fix PTM cycle trigger logic Jacob Keller
@ 2025-04-01 23:35 ` Jacob Keller
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 3/6] igc: move ktime snapshot into PTM retry loop Jacob Keller
` (4 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Jacob Keller @ 2025-04-01 23:35 UTC (permalink / raw)
To: Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Jacob Keller, Christopher S M Hall, Michal Swiatkowski,
Mor Bar-Gabay, Avigail Dahan, Corinna Vinschen
From: Christopher S M Hall <christopher.s.hall@intel.com>
The i225/i226 hardware retries if it receives an inappropriate response
from the upstream device. If the device retries too quickly, the root
port does not respond.
The wait between attempts was reduced from 10us to 1us in commit
6b8aa753a9f9 ("igc: Decrease PTM short interval from 10 us to 1 us"), which
said:
With the 10us interval, we were seeing PTM transactions take around
12us. Hardware team suggested this interval could be lowered to 1us
which was confirmed with PCIe sniffer. With the 1us interval, PTM
dialogs took around 2us.
While a 1us short cycle time was thought to be theoretically sufficient, it
turns out in practice it is not quite long enough. It is unclear if the
problem is in the root port or an issue in i225/i226.
Increase the wait from 1us to 4us. Increasing to 2us appeared to work in
practice on the setups we have available. A value of 4us was chosen due to
the limited hardware available for testing, with a goal of ensuring we wait
long enough without overly penalizing the response time when unnecessary.
The issue can be reproduced with the following:
$ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
quickly reproduce the issue.
PHC2SYS exits with:
"ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
fails
Fixes: 6b8aa753a9f9 ("igc: Decrease PTM short interval from 10 us to 1 us")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/igc/igc_defines.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 2ff292f5f63be29e42dc4491a56602d811cf22cc..d19325b0e6e0ba684abbe10482fecce92e405420 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -574,7 +574,10 @@
#define IGC_PTM_CTRL_SHRT_CYC(usec) (((usec) & 0x3f) << 2)
#define IGC_PTM_CTRL_PTM_TO(usec) (((usec) & 0xff) << 8)
-#define IGC_PTM_SHORT_CYC_DEFAULT 1 /* Default short cycle interval */
+/* A short cycle time of 1us theoretically should work, but appears to be too
+ * short in practice.
+ */
+#define IGC_PTM_SHORT_CYC_DEFAULT 4 /* Default short cycle interval */
#define IGC_PTM_CYC_TIME_DEFAULT 5 /* Default PTM cycle time */
#define IGC_PTM_TIMEOUT_DEFAULT 255 /* Default timeout for PTM errors */
--
2.48.1.397.gec9d649cc640
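As a worked example of what the changed default means for the control register, the sketch below re-creates the two field macros visible in the diff context above and packs a short-cycle interval and a timeout into one value. The `DEMO_` macro names and the `demo_ptm_timing_bits()` helper are hypothetical; only the mask and shift values are taken from the quoted `igc_defines.h` lines.

```c
#include <assert.h>
#include <stdint.h>

/* Same masks and shifts as the macros shown in the diff context. */
#define DEMO_PTM_CTRL_SHRT_CYC(usec) ((((uint32_t)(usec)) & 0x3f) << 2)
#define DEMO_PTM_CTRL_PTM_TO(usec)   ((((uint32_t)(usec)) & 0xff) << 8)

/* Hypothetical helper: pack both timing fields into one register value. */
static uint32_t demo_ptm_timing_bits(unsigned int short_cyc_us,
				     unsigned int timeout_us)
{
	/* The fields are 6 and 8 bits wide; larger values would be truncated. */
	assert(short_cyc_us <= 0x3f);
	assert(timeout_us <= 0xff);

	return DEMO_PTM_CTRL_SHRT_CYC(short_cyc_us) |
	       DEMO_PTM_CTRL_PTM_TO(timeout_us);
}
```

With the timeout left at 255, moving the short cycle from 1 to 4 changes only the low field: `demo_ptm_timing_bits(1, 255)` is `0xff04` and `demo_ptm_timing_bits(4, 255)` is `0xff10`.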
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Intel-wired-lan] [PATCH iwl-net v4 3/6] igc: move ktime snapshot into PTM retry loop
2025-04-01 23:35 [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Jacob Keller
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 1/6] igc: fix PTM cycle trigger logic Jacob Keller
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 2/6] igc: increase wait time before retrying PTM Jacob Keller
@ 2025-04-01 23:35 ` Jacob Keller
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 4/6] igc: handle the IGC_PTP_ENABLED flag correctly Jacob Keller
` (3 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Jacob Keller @ 2025-04-01 23:35 UTC (permalink / raw)
To: Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Jacob Keller, Christopher S M Hall, Michal Swiatkowski,
Mor Bar-Gabay, Avigail Dahan, Corinna Vinschen
From: Christopher S M Hall <christopher.s.hall@intel.com>
Move ktime_get_snapshot() into the loop. If a retry does occur, a more
recent snapshot will result in a more accurate cross-timestamp.
Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/igc/igc_ptp.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index c640e346342be80fb53e68455d510fc6491366cd..516abe7405deee94866c22ccc3d101db1a21dbb6 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -1011,16 +1011,16 @@ static int igc_phc_get_syncdevicetime(ktime_t *device,
int err, count = 100;
ktime_t t1, t2_curr;
- /* Get a snapshot of system clocks to use as historic value. */
- ktime_get_snapshot(&adapter->snapshot);
-
+ /* Doing this in a loop because in the event of a
+ * badly timed (ha!) system clock adjustment, we may
+ * get PTM errors from the PCI root, but these errors
+ * are transitory. Repeating the process returns valid
+ * data eventually.
+ */
do {
- /* Doing this in a loop because in the event of a
- * badly timed (ha!) system clock adjustment, we may
- * get PTM errors from the PCI root, but these errors
- * are transitory. Repeating the process returns valid
- * data eventually.
- */
+ /* Get a snapshot of system clocks to use as historic value. */
+ ktime_get_snapshot(&adapter->snapshot);
+
igc_ptm_trigger(hw);
err = readx_poll_timeout(rd32, IGC_PTM_STAT, stat,
--
2.48.1.397.gec9d649cc640
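The effect of moving the snapshot into the retry loop can be illustrated with a toy model. The sketch below is not driver code: it assumes a made-up fixed cost per attempt and a hypothetical helper, `demo_snapshot_age_ns()`, that reports how old the snapshot is when an attempt finally succeeds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: every attempt costs attempt_cost_ns, and the attempt
 * numbered succeed_on (1-based) is the first one that succeeds. Returns the
 * age of the snapshot at the moment of success.
 */
static uint64_t demo_snapshot_age_ns(int snapshot_in_loop, int succeed_on,
				     uint64_t attempt_cost_ns)
{
	uint64_t now = 0;
	uint64_t snap = 0;
	int attempt;

	assert(succeed_on >= 1);

	if (!snapshot_in_loop)
		snap = now; /* old behaviour: one snapshot before the loop */

	for (attempt = 1; attempt <= succeed_on; attempt++) {
		if (snapshot_in_loop)
			snap = now; /* new behaviour: refreshed on every attempt */
		now += attempt_cost_ns;
	}

	return now - snap;
}
```

Under these assumptions the snapshot age grows with the number of retries when it is taken once up front, and stays at one attempt's worth when it is taken inside the loop. Without retries the two variants behave the same.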
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Intel-wired-lan] [PATCH iwl-net v4 4/6] igc: handle the IGC_PTP_ENABLED flag correctly
2025-04-01 23:35 [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Jacob Keller
` (2 preceding siblings ...)
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 3/6] igc: move ktime snapshot into PTM retry loop Jacob Keller
@ 2025-04-01 23:35 ` Jacob Keller
2025-04-10 12:11 ` Mor Bar-Gabay
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 5/6] igc: cleanup PTP module if probe fails Jacob Keller
` (2 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2025-04-01 23:35 UTC (permalink / raw)
To: Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Jacob Keller, Christopher S M Hall, Corinna Vinschen
From: Christopher S M Hall <christopher.s.hall@intel.com>
All functions in igc_ptp.c called from igc_main.c should check the
IGC_PTP_ENABLED flag. Add a check for this flag to the stop and reset
functions.
Fixes: 5f2958052c58 ("igc: Add basic skeleton for PTP")
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/igc/igc_ptp.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 516abe7405deee94866c22ccc3d101db1a21dbb6..343205bffc355022306bcb1db35109e2113bb430 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -1244,8 +1244,12 @@ void igc_ptp_suspend(struct igc_adapter *adapter)
**/
void igc_ptp_stop(struct igc_adapter *adapter)
{
+ if (!(adapter->ptp_flags & IGC_PTP_ENABLED))
+ return;
+
igc_ptp_suspend(adapter);
+ adapter->ptp_flags &= ~IGC_PTP_ENABLED;
if (adapter->ptp_clock) {
ptp_clock_unregister(adapter->ptp_clock);
netdev_info(adapter->netdev, "PHC removed\n");
@@ -1266,6 +1270,9 @@ void igc_ptp_reset(struct igc_adapter *adapter)
unsigned long flags;
u32 timadj;
+ if (!(adapter->ptp_flags & IGC_PTP_ENABLED))
+ return;
+
/* reset the tstamp_config */
igc_ptp_set_timestamp_mode(adapter, &adapter->tstamp_config);
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v4 4/6] igc: handle the IGC_PTP_ENABLED flag correctly
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 4/6] igc: handle the IGC_PTP_ENABLED flag correctly Jacob Keller
@ 2025-04-10 12:11 ` Mor Bar-Gabay
0 siblings, 0 replies; 12+ messages in thread
From: Mor Bar-Gabay @ 2025-04-10 12:11 UTC (permalink / raw)
To: Jacob Keller, Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Christopher S M Hall, Corinna Vinschen
On 02/04/2025 2:35, Jacob Keller wrote:
> From: Christopher S M Hall <christopher.s.hall@intel.com>
>
> All functions in igc_ptp.c called from igc_main.c should check the
> IGC_PTP_ENABLED flag. Add a check for this flag to the stop and reset
> functions.
>
> Fixes: 5f2958052c58 ("igc: Add basic skeleton for PTP")
> Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
> Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> drivers/net/ethernet/intel/igc/igc_ptp.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
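The guard added in patch 4/6 makes the stop path safe to call when PTP was never enabled or has already been stopped, which is what lets patch 5/6 call it from the probe error path. A minimal user-space sketch of that pattern follows. `struct demo_adapter`, `DEMO_PTP_ENABLED` and the `demo_*` functions are hypothetical stand-ins, and the teardown counter exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PTP_ENABLED (1u << 0) /* hypothetical flag bit */

struct demo_adapter {
	uint32_t ptp_flags;
	int teardown_calls; /* counts how often real teardown work ran */
};

/* The flag is only set when clock registration succeeded. */
static void demo_ptp_init(struct demo_adapter *a, int register_ok)
{
	assert(a != NULL);
	if (register_ok)
		a->ptp_flags |= DEMO_PTP_ENABLED;
}

/* Early return makes stop a no-op unless PTP is currently enabled. */
static void demo_ptp_stop(struct demo_adapter *a)
{
	assert(a != NULL);
	if (!(a->ptp_flags & DEMO_PTP_ENABLED))
		return;

	a->ptp_flags &= ~DEMO_PTP_ENABLED;
	a->teardown_calls++;
}
```

Calling `demo_ptp_stop()` twice after a successful init performs the teardown once, and calling it after a failed init performs none.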
* [Intel-wired-lan] [PATCH iwl-net v4 5/6] igc: cleanup PTP module if probe fails
2025-04-01 23:35 [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Jacob Keller
` (3 preceding siblings ...)
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 4/6] igc: handle the IGC_PTP_ENABLED flag correctly Jacob Keller
@ 2025-04-01 23:35 ` Jacob Keller
2025-04-10 12:12 ` Mor Bar-Gabay
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 6/6] igc: add lock preventing multiple simultaneous PTM transactions Jacob Keller
2025-04-10 23:44 ` [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Vinicius Costa Gomes
6 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2025-04-01 23:35 UTC (permalink / raw)
To: Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Jacob Keller, Christopher S M Hall, Corinna Vinschen
From: Christopher S M Hall <christopher.s.hall@intel.com>
Make sure that the PTP module is cleaned up if igc_probe() fails by
calling igc_ptp_stop() on exit.
Fixes: d89f88419f99 ("igc: Add skeletal frame for Intel(R) 2.5G Ethernet Controller support")
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/igc/igc_main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 491d942cefca7add260a76b06aea9d2e2a9e4cce..e62d76e857c7d7d3197014d90902a1abad4ee497 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -7231,6 +7231,7 @@ static int igc_probe(struct pci_dev *pdev,
err_register:
igc_release_hw_control(adapter);
+ igc_ptp_stop(adapter);
err_eeprom:
if (!igc_check_reset_block(hw))
igc_reset_phy(hw);
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v4 5/6] igc: cleanup PTP module if probe fails
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 5/6] igc: cleanup PTP module if probe fails Jacob Keller
@ 2025-04-10 12:12 ` Mor Bar-Gabay
0 siblings, 0 replies; 12+ messages in thread
From: Mor Bar-Gabay @ 2025-04-10 12:12 UTC (permalink / raw)
To: Jacob Keller, Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Christopher S M Hall, Corinna Vinschen
On 02/04/2025 2:35, Jacob Keller wrote:
> From: Christopher S M Hall <christopher.s.hall@intel.com>
>
> Make sure that the PTP module is cleaned up if igc_probe() fails by
> calling igc_ptp_stop() on exit.
>
> Fixes: d89f88419f99 ("igc: Add skeletal frame for Intel(R) 2.5G Ethernet Controller support")
> Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
> Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 1 +
> 1 file changed, 1 insertion(+)
>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Intel-wired-lan] [PATCH iwl-net v4 6/6] igc: add lock preventing multiple simultaneous PTM transactions
2025-04-01 23:35 [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Jacob Keller
` (4 preceding siblings ...)
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 5/6] igc: cleanup PTP module if probe fails Jacob Keller
@ 2025-04-01 23:35 ` Jacob Keller
2025-04-10 12:14 ` Mor Bar-Gabay
2025-04-10 23:44 ` [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Vinicius Costa Gomes
6 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2025-04-01 23:35 UTC (permalink / raw)
To: Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Jacob Keller, Christopher S M Hall, Corinna Vinschen
From: Christopher S M Hall <christopher.s.hall@intel.com>
Add a mutex around the PTM transaction to prevent multiple simultaneous
transactions. If multiple processes try to initiate a PTM transaction at
the same time, one or all may fail. This can be reproduced by running two
instances of the following:
$ sudo phc2sys -O 0 -i tsn0 -m
PHC2SYS exits with:
"ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
fails
Note: Normally two instances of PHC2SYS will not run, but one process
should not break another.
Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/igc/igc.h | 1 +
drivers/net/ethernet/intel/igc/igc_ptp.c | 20 ++++++++++++++++++--
2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index cd1d7b6c1782352094f6867a31b6958c929bbbf4..e03b5c89bdb1ab8b1a04b6e10ab8d666d383eee2 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -319,6 +319,7 @@ struct igc_adapter {
struct timespec64 prev_ptp_time; /* Pre-reset PTP clock */
ktime_t ptp_reset_start; /* Reset time in clock mono */
struct system_time_snapshot snapshot;
+ struct mutex ptm_lock; /* Only allow one PTM transaction at a time */
char fw_version[32];
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 343205bffc355022306bcb1db35109e2113bb430..612ed26a29c5d491cc3f2c2af803adc770d60fc2 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -974,6 +974,7 @@ static void igc_ptm_log_error(struct igc_adapter *adapter, u32 ptm_stat)
}
}
+/* The PTM lock: adapter->ptm_lock must be held when calling igc_ptm_trigger() */
static void igc_ptm_trigger(struct igc_hw *hw)
{
u32 ctrl;
@@ -990,6 +991,7 @@ static void igc_ptm_trigger(struct igc_hw *hw)
wrfl();
}
+/* The PTM lock: adapter->ptm_lock must be held when calling igc_ptm_reset() */
static void igc_ptm_reset(struct igc_hw *hw)
{
u32 ctrl;
@@ -1068,9 +1070,16 @@ static int igc_ptp_getcrosststamp(struct ptp_clock_info *ptp,
{
struct igc_adapter *adapter = container_of(ptp, struct igc_adapter,
ptp_caps);
+ int ret;
- return get_device_system_crosststamp(igc_phc_get_syncdevicetime,
- adapter, &adapter->snapshot, cts);
+ /* This blocks until any in progress PTM transactions complete */
+ mutex_lock(&adapter->ptm_lock);
+
+ ret = get_device_system_crosststamp(igc_phc_get_syncdevicetime,
+ adapter, &adapter->snapshot, cts);
+ mutex_unlock(&adapter->ptm_lock);
+
+ return ret;
}
static int igc_ptp_getcyclesx64(struct ptp_clock_info *ptp,
@@ -1169,6 +1178,7 @@ void igc_ptp_init(struct igc_adapter *adapter)
spin_lock_init(&adapter->ptp_tx_lock);
spin_lock_init(&adapter->free_timer_lock);
spin_lock_init(&adapter->tmreg_lock);
+ mutex_init(&adapter->ptm_lock);
adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF;
@@ -1181,6 +1191,7 @@ void igc_ptp_init(struct igc_adapter *adapter)
if (IS_ERR(adapter->ptp_clock)) {
adapter->ptp_clock = NULL;
netdev_err(netdev, "ptp_clock_register failed\n");
+ mutex_destroy(&adapter->ptm_lock);
} else if (adapter->ptp_clock) {
netdev_info(netdev, "PHC added\n");
adapter->ptp_flags |= IGC_PTP_ENABLED;
@@ -1210,10 +1221,12 @@ static void igc_ptm_stop(struct igc_adapter *adapter)
struct igc_hw *hw = &adapter->hw;
u32 ctrl;
+ mutex_lock(&adapter->ptm_lock);
ctrl = rd32(IGC_PTM_CTRL);
ctrl &= ~IGC_PTM_CTRL_EN;
wr32(IGC_PTM_CTRL, ctrl);
+ mutex_unlock(&adapter->ptm_lock);
}
/**
@@ -1255,6 +1268,7 @@ void igc_ptp_stop(struct igc_adapter *adapter)
netdev_info(adapter->netdev, "PHC removed\n");
adapter->ptp_flags &= ~IGC_PTP_ENABLED;
}
+ mutex_destroy(&adapter->ptm_lock);
}
/**
@@ -1294,6 +1308,7 @@ void igc_ptp_reset(struct igc_adapter *adapter)
if (!igc_is_crosststamp_supported(adapter))
break;
+ mutex_lock(&adapter->ptm_lock);
wr32(IGC_PCIE_DIG_DELAY, IGC_PCIE_DIG_DELAY_DEFAULT);
wr32(IGC_PCIE_PHY_DELAY, IGC_PCIE_PHY_DELAY_DEFAULT);
@@ -1317,6 +1332,7 @@ void igc_ptp_reset(struct igc_adapter *adapter)
netdev_err(adapter->netdev, "Timeout reading IGC_PTM_STAT register\n");
igc_ptm_reset(hw);
+ mutex_unlock(&adapter->ptm_lock);
break;
default:
/* No work to do. */
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v4 6/6] igc: add lock preventing multiple simultaneous PTM transactions
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 6/6] igc: add lock preventing multiple simultaneous PTM transactions Jacob Keller
@ 2025-04-10 12:14 ` Mor Bar-Gabay
0 siblings, 0 replies; 12+ messages in thread
From: Mor Bar-Gabay @ 2025-04-10 12:14 UTC (permalink / raw)
To: Jacob Keller, Anthony Nguyen
Cc: david.zage, vinicius.gomes, rodrigo.cadore, intel-wired-lan,
netdev, Christopher S M Hall, Corinna Vinschen
On 02/04/2025 2:35, Jacob Keller wrote:
> From: Christopher S M Hall <christopher.s.hall@intel.com>
>
> Add a mutex around the PTM transaction to prevent multiple simultaneous
> transactions. If multiple processes try to initiate a PTM transaction at
> the same time, one or all may fail. This can be reproduced by running two
> instances of the following:
>
> $ sudo phc2sys -O 0 -i tsn0 -m
>
> PHC2SYS exits with:
>
> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
> fails
>
> Note: Normally two instances of PHC2SYS will not run, but one process
> should not break another.
>
> Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
> Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
> Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> drivers/net/ethernet/intel/igc/igc.h | 1 +
> drivers/net/ethernet/intel/igc/igc_ptp.c | 20 ++++++++++++++++++--
> 2 files changed, 19 insertions(+), 2 deletions(-)
>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
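The serialization that patch 6/6 adds can be sketched in user space with POSIX threads standing in for two concurrent callers. This is an analogy rather than the kernel code: it uses `pthread_mutex_t` instead of the kernel `struct mutex`, and `struct demo_ptm` and the `demo_*` functions are hypothetical. The `in_flight` counter is there only to check that no two transactions ever overlap.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_ptm {
	pthread_mutex_t lock; /* only one transaction at a time */
	int in_flight;        /* transactions currently inside the lock */
	int completed;        /* total finished transactions */
};

static void demo_ptm_init(struct demo_ptm *p)
{
	pthread_mutex_init(&p->lock, NULL);
	p->in_flight = 0;
	p->completed = 0;
}

/* Blocks until any transaction already in progress has finished. */
static void demo_ptm_transaction(struct demo_ptm *p)
{
	pthread_mutex_lock(&p->lock);
	p->in_flight++;
	assert(p->in_flight == 1); /* would fire if two callers overlapped */
	/* trigger, poll and reset would run here */
	p->completed++;
	p->in_flight--;
	pthread_mutex_unlock(&p->lock);
}

static void *demo_worker(void *arg)
{
	struct demo_ptm *p = arg;
	int i;

	for (i = 0; i < 1000; i++)
		demo_ptm_transaction(p);
	return NULL;
}

/* Run two concurrent callers, loosely like two phc2sys instances. */
static int demo_run_two_callers(struct demo_ptm *p)
{
	pthread_t a, b;

	if (pthread_create(&a, NULL, demo_worker, p))
		return -1;
	if (pthread_create(&b, NULL, demo_worker, p)) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}
```

Because every caller takes the same lock, the second caller waits instead of disturbing a transaction that is already in progress. Depending on the toolchain, this may need to be built with `-pthread`.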
* Re: [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout
2025-04-01 23:35 [Intel-wired-lan] [PATCH iwl-net v4 0/6] igc: Fix PTM timeout Jacob Keller
` (5 preceding siblings ...)
2025-04-01 23:35 ` [Intel-wired-lan] [PATCH iwl-net v4 6/6] igc: add lock preventing multiple simultaneous PTM transactions Jacob Keller
@ 2025-04-10 23:44 ` Vinicius Costa Gomes
6 siblings, 0 replies; 12+ messages in thread
From: Vinicius Costa Gomes @ 2025-04-10 23:44 UTC (permalink / raw)
To: Jacob Keller, Anthony Nguyen
Cc: david.zage, rodrigo.cadore, intel-wired-lan, netdev, Jacob Keller,
Christopher S M Hall, Michal Swiatkowski, Mor Bar-Gabay,
Avigail Dahan, Corinna Vinschen
Hi,
Jacob Keller <jacob.e.keller@intel.com> writes:
> There have been sporadic reports of PTM timeouts using i225/i226 devices
>
> These timeouts have been root caused to:
>
> 1) Manipulating the PTM status register while PTM is enabled and triggered
> 2) The hardware retrying too quickly when an inappropriate response is
> received from the upstream device
>
> The issue can be reproduced with the following:
>
> $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
>
> Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
> quickly reproduce the issue.
>
> PHC2SYS exits with:
>
> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
> fails
>
> The first patch in this series also resolves an issue reported by Corinna
> Vinschen relating to kdump:
>
> This patch also fixes a hang in igc_probe() when loading the igc
> driver in the kdump kernel on systems supporting PTM.
>
> The igc driver running in the base kernel enables PTM trigger in
> igc_probe(). Therefore the driver is always in PTM trigger mode,
> except in brief periods when manually triggering a PTM cycle.
>
> When a crash occurs, the NIC is reset while PTM trigger is enabled.
> Due to a hardware problem, the NIC is subsequently in a bad busmaster
> state and doesn't handle register reads/writes. When running
> igc_probe() in the kdump kernel, the first register access to a NIC
> register hangs driver probing and ultimately breaks kdump.
>
> With this patch, igc has PTM trigger disabled most of the time,
> and the trigger is only enabled for very brief (10 - 100 us) periods
> when manually triggering a PTM cycle. Chances that a crash occurs
> during a PTM trigger are not zero, but extremely reduced.
>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> Changes in v4:
> - Jacob taking over sending v4 due to lack of time on Chris's part.
> - Updated commit messages based on review feedback from v3
> - Updated commit titles to slightly more imperative wording
> - Link to v3: https://lore.kernel.org/r/20241106184722.17230-1-christopher.s.hall@intel.com
> Changes in v3:
> - Added mutex_destroy() to clean up PTM lock.
> - Added missing checks for PTP enabled flag called from igc_main.c.
> - Cleanup PTP module if probe fails.
> - Wrap all access to PTM registers with PTM lock/unlock.
> - Link to v2: https://lore.kernel.org/netdev/20241023023040.111429-1-christopher.s.hall@intel.com/
> Changes in v2:
> - Removed patch modifying PTM retry loop count.
> - Moved PTM mutex initialization from igc_reset() to igc_ptp_init(), called
> once during igc_probe().
> - Link to v1: https://lore.kernel.org/netdev/20240807003032.10300-1-christopher.s.hall@intel.com/
>
> ---
> Christopher S M Hall (6):
> igc: fix PTM cycle trigger logic
> igc: increase wait time before retrying PTM
> igc: move ktime snapshot into PTM retry loop
> igc: handle the IGC_PTP_ENABLED flag correctly
> igc: cleanup PTP module if probe fails
> igc: add lock preventing multiple simultaneous PTM transactions
>
For the series:
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Cheers,
--
Vinicius
^ permalink raw reply [flat|nested] 12+ messages in thread