* [PATCH v2 0/2] Update the PHY timeouts
@ 2026-02-12 6:34 Arun R Murthy
2026-02-12 6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Arun R Murthy @ 2026-02-12 6:34 UTC (permalink / raw)
To: Jani Nikula, uma.shankar, suraj.kandpal, ankit.k.nautiyal
Cc: intel-gfx, intel-xe, Arun R Murthy
The timeouts mentioned in the spec are the PHY's recommendations and
don't include the turnaround time of the SoC and the OS. So ensure that
sufficient overhead is added for the SoC and OS on top of the
PHY-recommended timeouts.
Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
---
Changes in v2:
- Link to v1: https://lore.kernel.org/r/20260212-timeout-v1-0-591fa766e8a1@intel.com
---
Arun R Murthy (2):
drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
drm/i915/lt_phy_regs: Add SoC/OS turnaround time
drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h | 4 ++--
drivers/gpu/drm/i915/display/intel_lt_phy_regs.h | 8 ++++----
2 files changed, 6 insertions(+), 6 deletions(-)
---
base-commit: b4bfe7d753afaf6ea4950111a309a4e2ef5aef68
change-id: 20260212-timeout-06cb232f71af
Best regards,
--
Arun R Murthy <arun.r.murthy@intel.com>
* [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
2026-02-12 6:34 [PATCH v2 0/2] Update the PHY timeouts Arun R Murthy
@ 2026-02-12 6:34 ` Arun R Murthy
2026-02-14 8:48 ` Cole Leavitt
2026-02-12 6:34 ` [PATCH v2 2/2] drm/i915/lt_phy_regs: Add SoC/OS " Arun R Murthy
2026-02-12 8:06 ` ✗ i915.CI.BAT: failure for Update the PHY timeouts (rev2) Patchwork
2 siblings, 1 reply; 6+ messages in thread
From: Arun R Murthy @ 2026-02-12 6:34 UTC (permalink / raw)
To: Jani Nikula, uma.shankar, suraj.kandpal, ankit.k.nautiyal
Cc: intel-gfx, intel-xe, Arun R Murthy
The port refclk enable timeout and the SoC ready timeout values mentioned
in the spec are PHY timings and don't include the turnaround time
from the SoC or OS. So add an overhead on top of the timeouts
recommended by the PHY spec.
The overhead value is based on stress test results with multiple
available panels.
Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
---
drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
index 658890f7351530e5686c23e067deb359b3283d59..152a4e751bdcf216a95714a2bd2d6612cbbd4698 100644
--- a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
+++ b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
@@ -78,10 +78,10 @@
#define XELPDP_PCLK_PLL_ENABLE_TIMEOUT_US 3200
#define XELPDP_PCLK_PLL_DISABLE_TIMEOUT_US 20
#define XELPDP_PORT_BUF_SOC_READY_TIMEOUT_US 100
-#define XELPDP_PORT_RESET_START_TIMEOUT_US 5
+#define XELPDP_PORT_RESET_START_TIMEOUT_US 10
#define XELPDP_PORT_POWERDOWN_UPDATE_TIMEOUT_MS 2
#define XELPDP_PORT_RESET_END_TIMEOUT_MS 15
-#define XELPDP_REFCLK_ENABLE_TIMEOUT_US 1
+#define XELPDP_REFCLK_ENABLE_TIMEOUT_US 10
#define _XELPDP_PORT_BUF_CTL1_LN0_A 0x64004
#define _XELPDP_PORT_BUF_CTL1_LN0_B 0x64104
--
2.25.1
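The "spec value plus overhead" idea in this patch can be sketched as below. This is only an illustration, not part of the patch: all macro names are hypothetical, the "spec" numbers are assumed to be the pre-patch values from the diff, and the overhead numbers are simply the difference to the post-patch values.

```c
#include <assert.h>

/* Assumption: the pre-patch values in the diff are the PHY spec timings. */
#define PHY_SPEC_RESET_START_US			5
#define PHY_SPEC_REFCLK_ENABLE_US		1

/* Hypothetical SoC/OS overhead, inferred as (new value - old value). */
#define SOC_OS_OVERHEAD_RESET_START_US		5
#define SOC_OS_OVERHEAD_REFCLK_ENABLE_US	9

/* Effective timeout = PHY spec timing + SoC/OS turnaround overhead. */
#define RESET_START_TIMEOUT_US \
	(PHY_SPEC_RESET_START_US + SOC_OS_OVERHEAD_RESET_START_US)
#define REFCLK_ENABLE_TIMEOUT_US \
	(PHY_SPEC_REFCLK_ENABLE_US + SOC_OS_OVERHEAD_REFCLK_ENABLE_US)

/* Compile-time check that the sums match the values the patch sets. */
static_assert(RESET_START_TIMEOUT_US == 10, "reset start timeout");
static_assert(REFCLK_ENABLE_TIMEOUT_US == 10, "refclk enable timeout");
```

Keeping the two terms separate would make it visible which part comes from the spec and which part is measured margin, at the cost of more defines than the single-number form the patch uses.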
* [PATCH v2 2/2] drm/i915/lt_phy_regs: Add SoC/OS turnaround time
2026-02-12 6:34 [PATCH v2 0/2] Update the PHY timeouts Arun R Murthy
2026-02-12 6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
@ 2026-02-12 6:34 ` Arun R Murthy
2026-02-12 8:06 ` ✗ i915.CI.BAT: failure for Update the PHY timeouts (rev2) Patchwork
2 siblings, 0 replies; 6+ messages in thread
From: Arun R Murthy @ 2026-02-12 6:34 UTC (permalink / raw)
To: Jani Nikula, uma.shankar, suraj.kandpal, ankit.k.nautiyal
Cc: intel-gfx, intel-xe, Arun R Murthy
The timeouts mentioned in the spec cover only the PHY. On top of them,
include the SoC and OS turnaround time.
The overhead value is based on stress test results with multiple
available panels.
Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
---
drivers/gpu/drm/i915/display/intel_lt_phy_regs.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h b/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h
index 37e46fb9abde4156ebd7ad1eb6cbbc12e7026b23..ff6d7829dbb9c50b2001d079b435b894faf9659e 100644
--- a/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h
+++ b/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h
@@ -6,12 +6,12 @@
#ifndef __INTEL_LT_PHY_REGS_H__
#define __INTEL_LT_PHY_REGS_H__
-#define XE3PLPD_MSGBUS_TIMEOUT_FAST_US 500
+#define XE3PLPD_MSGBUS_TIMEOUT_FAST_US 500
#define XE3PLPD_MACCLK_TURNON_LATENCY_MS 2
-#define XE3PLPD_MACCLK_TURNOFF_LATENCY_US 1
+#define XE3PLPD_MACCLK_TURNOFF_LATENCY_US 10
#define XE3PLPD_RATE_CALIB_DONE_LATENCY_MS 1
-#define XE3PLPD_RESET_START_LATENCY_US 10
-#define XE3PLPD_PWRDN_TO_RDY_LATENCY_US 4
+#define XE3PLPD_RESET_START_LATENCY_US 10
+#define XE3PLPD_PWRDN_TO_RDY_LATENCY_US 10
#define XE3PLPD_RESET_END_LATENCY_MS 2
/* LT Phy MAC Register */
--
2.25.1
* ✗ i915.CI.BAT: failure for Update the PHY timeouts (rev2)
2026-02-12 6:34 [PATCH v2 0/2] Update the PHY timeouts Arun R Murthy
2026-02-12 6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
2026-02-12 6:34 ` [PATCH v2 2/2] drm/i915/lt_phy_regs: Add SoC/OS " Arun R Murthy
@ 2026-02-12 8:06 ` Patchwork
2 siblings, 0 replies; 6+ messages in thread
From: Patchwork @ 2026-02-12 8:06 UTC (permalink / raw)
To: Arun R Murthy; +Cc: intel-gfx
[-- Attachment #1: Type: text/plain, Size: 3310 bytes --]
== Series Details ==
Series: Update the PHY timeouts (rev2)
URL : https://patchwork.freedesktop.org/series/161527/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_17977 -> Patchwork_161527v2
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_161527v2 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_161527v2, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/index.html
Participating hosts (43 -> 40)
------------------------------
Missing (3): bat-dg2-13 fi-snb-2520m bat-adls-6
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_161527v2:
### IGT changes ###
#### Possible regressions ####
* igt@kms_pipe_crc_basic@read-crc:
- fi-cfl-8109u: [PASS][1] -> [DMESG-WARN][2] +48 other tests dmesg-warn
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/fi-cfl-8109u/igt@kms_pipe_crc_basic@read-crc.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/fi-cfl-8109u/igt@kms_pipe_crc_basic@read-crc.html
Known issues
------------
Here are the changes found in Patchwork_161527v2 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@i915_selftest@live:
- bat-dg2-8: [PASS][3] -> [DMESG-FAIL][4] ([i915#12061]) +1 other test dmesg-fail
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/bat-dg2-8/igt@i915_selftest@live.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/bat-dg2-8/igt@i915_selftest@live.html
* igt@i915_selftest@live@late_gt_pm:
- fi-cfl-8109u: [PASS][5] -> [DMESG-WARN][6] ([i915#13735]) +32 other tests dmesg-warn
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/fi-cfl-8109u/igt@i915_selftest@live@late_gt_pm.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/fi-cfl-8109u/igt@i915_selftest@live@late_gt_pm.html
* igt@i915_selftest@live@workarounds:
- bat-arls-6: [PASS][7] -> [DMESG-FAIL][8] ([i915#12061]) +1 other test dmesg-fail
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/bat-arls-6/igt@i915_selftest@live@workarounds.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/bat-arls-6/igt@i915_selftest@live@workarounds.html
[i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
[i915#13735]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13735
Build changes
-------------
* Linux: CI_DRM_17977 -> Patchwork_161527v2
CI-20190529: 20190529
CI_DRM_17977: b4bfe7d753afaf6ea4950111a309a4e2ef5aef68 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_8751: af788251f1ef729d17c802aec2c4547b52059e58 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
Patchwork_161527v2: b4bfe7d753afaf6ea4950111a309a4e2ef5aef68 @ git://anongit.freedesktop.org/gfx-ci/linux
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/index.html
* Re: [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
2026-02-12 6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
@ 2026-02-14 8:48 ` Cole Leavitt
2026-02-16 4:53 ` Murthy, Arun R
0 siblings, 1 reply; 6+ messages in thread
From: Cole Leavitt @ 2026-02-14 8:48 UTC (permalink / raw)
To: arun.r.murthy; +Cc: intel-gfx, suraj.kandpal, jani.nikula, Cole Leavitt
On Thu, 12 Feb 2026, Arun R Murthy <arun.r.murthy@intel.com> wrote:
> The port refclk enable timeout and the soc ready timeout value mentioned
> in the spec is the PHY timings and doesn't include the turnaround time
> from the SoC or OS. So add an overhead timeout value on top of the
> recommended timeouts from the PHY spec.
Hi Arun,
Thanks for the fix. I wanted to flag that I independently identified
this exact issue and posted a detailed root cause analysis on the i915
GitLab tracker five days before this patch series.
On February 7, 2026, I filed the analysis on GitLab issue #14713:
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14713#note_2739573
That comment includes the following findings, which directly correspond
to what this patch addresses:
1. Traced the error to intel_cx0_phy_lane_reset() in intel_cx0_phy.c
(line ~2911), where the driver writes the PCLK_REFCLK_REQUEST bit to
XELPDP_PORT_CLOCK_CTL and polls for PCLK_REFCLK_ACK with a timeout
of XELPDP_REFCLK_ENABLE_TIMEOUT_US = 1 (1 us).
2. Identified that this calls __intel_wait_for_register() with
fast_timeout_us=1 and slow_timeout_ms=0 -- a single spin-poll with
no slow-path fallback.
3. Compared the 1 us refclk timeout against other timeouts in the same
PHY init sequence:
XELPDP_PORT_BUF_SOC_READY_TIMEOUT_US = 100 us
XELPDP_PORT_RESET_START_TIMEOUT_US = 5 us
XELPDP_PCLK_PLL_ENABLE_TIMEOUT_US = 3200 us
XELPDP_PORT_RESET_END_TIMEOUT_MS = 15 ms
The 1 us value is an outlier by 1-3 orders of magnitude compared to
every other timeout in the same code path.
4. Recommended increasing XELPDP_REFCLK_ENABLE_TIMEOUT_US to ~100 us
or adding a slow-path ms fallback, consistent with how other waits
in the same function are structured.
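To illustrate points 1 and 2, here is a minimal user-space sketch of a poll-with-timeout loop. It is not the driver code: struct fake_phy, fake_phy_ack() and wait_for_ack() are hypothetical names, time is a simulated counter, and the 3 us ack latency used in the example is an assumption rather than a measured value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware: the ack bit reads as set once
 * ack_latency_us simulated microseconds have elapsed. */
struct fake_phy {
	uint32_t now_us;
	uint32_t ack_latency_us;
};

static bool fake_phy_ack(const struct fake_phy *phy)
{
	return phy->now_us >= phy->ack_latency_us;
}

/* Poll once per simulated microsecond until the ack is seen or the total
 * budget (fast + slow) is spent. Returns 0 on success, -1 on timeout. */
static int wait_for_ack(struct fake_phy *phy, uint32_t fast_timeout_us,
			uint32_t slow_timeout_ms)
{
	uint32_t deadline;

	assert(phy != NULL);
	deadline = phy->now_us + fast_timeout_us + slow_timeout_ms * 1000;

	for (;;) {
		if (fake_phy_ack(phy))
			return 0;
		if (phy->now_us >= deadline)
			return -1;
		phy->now_us++;
	}
}
```

With a simulated ack latency of 3 us, wait_for_ack(&phy, 1, 0) times out because the whole budget is 1 us, while wait_for_ack(&phy, 10, 0) on a fresh struct succeeds. A non-zero slow_timeout_ms would also succeed, which is the fallback option mentioned in point 4.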
This analysis was performed on a Lenovo ThinkPad P16 Gen 3 with an
Arrow Lake-S Core Ultra 9 275HX (device ID 7d67) running kernel
6.19.0-rc8. The PHY A refclk failure reproduced on every boot at ~8.5s
after i915 init, during the eDP panel probe path.
Your patch does the right thing -- increasing the timeout values and
adding SoC/OS overhead. Since my analysis identified the root cause and
recommended the same fix direction, I'd appreciate attribution:
Reported-by: Cole Leavitt <cole@unwrap.rs>
Thanks,
Cole
* RE: [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
2026-02-14 8:48 ` Cole Leavitt
@ 2026-02-16 4:53 ` Murthy, Arun R
0 siblings, 0 replies; 6+ messages in thread
From: Murthy, Arun R @ 2026-02-16 4:53 UTC (permalink / raw)
To: Cole Leavitt
Cc: intel-gfx@lists.freedesktop.org, Kandpal, Suraj, Nikula, Jani
> -----Original Message-----
> From: Cole Leavitt <cole@unwrap.rs>
> Sent: Saturday, February 14, 2026 2:18 PM
> To: Murthy, Arun R <arun.r.murthy@intel.com>
> Cc: intel-gfx@lists.freedesktop.org; Kandpal, Suraj <suraj.kandpal@intel.com>;
> Nikula, Jani <jani.nikula@intel.com>; Cole Leavitt <cole@unwrap.rs>
> Subject: Re: [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS
> turnaround time
>
> On Thu, 12 Feb 2026, Arun R Murthy <arun.r.murthy@intel.com> wrote:
> > The port refclk enable timeout and the soc ready timeout value
> > mentioned in the spec is the PHY timings and doesn't include the
> > turnaround time from the SoC or OS. So add an overhead timeout value
> > on top of the recommended timeouts from the PHY spec.
>
> Hi Arun,
>
> Thanks for the fix. I wanted to flag that I independently identified this exact
> issue and posted a detailed root cause analysis on the i915 GitLab tracker five
> days before this patch series.
>
> On February 7, 2026, I filed the analysis on GitLab issue #14713:
>
> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14713#note_2739573
>
> That comment includes the following findings, which directly correspond to
> what this patch addresses:
>
> 1. Traced the error to intel_cx0_phy_lane_reset() in intel_cx0_phy.c
> (line ~2911), where the driver writes the PCLK_REFCLK_REQUEST bit to
> XELPDP_PORT_CLOCK_CTL and polls for PCLK_REFCLK_ACK with a timeout
> of XELPDP_REFCLK_ENABLE_TIMEOUT_US = 1 (1 us).
>
> 2. Identified that this calls __intel_wait_for_register() with
> fast_timeout_us=1 and slow_timeout_ms=0 -- a single spin-poll with
> no slow-path fallback.
>
> 3. Compared the 1 us refclk timeout against other timeouts in the same
> PHY init sequence:
>
> XELPDP_PORT_BUF_SOC_READY_TIMEOUT_US = 100 us
> XELPDP_PORT_RESET_START_TIMEOUT_US = 5 us
> XELPDP_PCLK_PLL_ENABLE_TIMEOUT_US = 3200 us
> XELPDP_PORT_RESET_END_TIMEOUT_MS = 15 ms
>
> The 1 us value is an outlier by 1-3 orders of magnitude compared to
> every other timeout in the same code path.
>
> 4. Recommended increasing XELPDP_REFCLK_ENABLE_TIMEOUT_US to ~100 us
> or adding a slow-path ms fallback, consistent with how other waits
> in the same function are structured.
>
> This analysis was performed on a Lenovo ThinkPad P16 Gen 3 with an Arrow
> Lake-S Core Ultra 9 275HX (device ID 7d67) running kernel 6.19.0-rc8. The PHY
> A refclk failure reproduced on every boot at ~8.5s after i915 init, during the eDP
> panel probe path.
>
> Your patch does the right thing -- increasing the timeout values and adding
> SoC/OS overhead. Since my analysis identified the root cause and
> recommended the same fix direction, I'd appreciate attribution:
>
> Reported-by: Cole Leavitt <cole@unwrap.rs>
>
Yes, I will definitely credit you in this patch.
Thanks and Regards,
Arun R Murthy
--------------------