Intel-GFX Archive on lore.kernel.org
* [PATCH v2 0/2] Update the PHY timeouts
@ 2026-02-12  6:34 Arun R Murthy
  2026-02-12  6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Arun R Murthy @ 2026-02-12  6:34 UTC (permalink / raw)
  To: Jani Nikula, uma.shankar, suraj.kandpal, ankit.k.nautiyal
  Cc: intel-gfx, intel-xe, Arun R Murthy

The timeouts listed in the spec are the PHY's recommendations and don't
include the turnaround time of the SoC and the OS. So ensure that
sufficient overhead for the SoC and OS is added on top of the
PHY-recommended timeouts.

Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
---
Changes in v2:
- EDITME: describe what is new in this series revision.
- EDITME: use bulletpoints and terse descriptions.
- Link to v1: https://lore.kernel.org/r/20260212-timeout-v1-0-591fa766e8a1@intel.com

---
Arun R Murthy (2):
      drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
      drm/i915/lt_phy_regs: Add SoC/OS turnaround time

 drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h | 4 ++--
 drivers/gpu/drm/i915/display/intel_lt_phy_regs.h  | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)
---
base-commit: b4bfe7d753afaf6ea4950111a309a4e2ef5aef68
change-id: 20260212-timeout-06cb232f71af

Best regards,
-- 
Arun R Murthy <arun.r.murthy@intel.com>



* [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
  2026-02-12  6:34 [PATCH v2 0/2] Update the PHY timeouts Arun R Murthy
@ 2026-02-12  6:34 ` Arun R Murthy
  2026-02-14  8:48   ` Cole Leavitt
  2026-02-12  6:34 ` [PATCH v2 2/2] drm/i915/lt_phy_regs: Add SoC/OS " Arun R Murthy
  2026-02-12  8:06 ` ✗ i915.CI.BAT: failure for Update the PHY timeouts (rev2) Patchwork
  2 siblings, 1 reply; 6+ messages in thread
From: Arun R Murthy @ 2026-02-12  6:34 UTC (permalink / raw)
  To: Jani Nikula, uma.shankar, suraj.kandpal, ankit.k.nautiyal
  Cc: intel-gfx, intel-xe, Arun R Murthy

The port refclk enable timeout and the SoC ready timeout values given
in the spec are PHY timings and don't include the turnaround time
of the SoC or OS. So add an overhead on top of the timeouts
recommended by the PHY spec.
The overhead value is based on stress test results with multiple
available panels.

Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
---
 drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
index 658890f7351530e5686c23e067deb359b3283d59..152a4e751bdcf216a95714a2bd2d6612cbbd4698 100644
--- a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
+++ b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
@@ -78,10 +78,10 @@
 #define XELPDP_PCLK_PLL_ENABLE_TIMEOUT_US		3200
 #define XELPDP_PCLK_PLL_DISABLE_TIMEOUT_US		20
 #define XELPDP_PORT_BUF_SOC_READY_TIMEOUT_US		100
-#define XELPDP_PORT_RESET_START_TIMEOUT_US		5
+#define XELPDP_PORT_RESET_START_TIMEOUT_US		10
 #define XELPDP_PORT_POWERDOWN_UPDATE_TIMEOUT_MS		2
 #define XELPDP_PORT_RESET_END_TIMEOUT_MS		15
-#define XELPDP_REFCLK_ENABLE_TIMEOUT_US			1
+#define XELPDP_REFCLK_ENABLE_TIMEOUT_US			10
 
 #define _XELPDP_PORT_BUF_CTL1_LN0_A			0x64004
 #define _XELPDP_PORT_BUF_CTL1_LN0_B			0x64104

-- 
2.25.1
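
The commit message describes each new timeout as a PHY spec value plus
SoC/OS overhead. A minimal sketch of that arithmetic follows, assuming
the removed values in the diff (5 us and 1 us) are the PHY spec numbers.
The EXAMPLE_* names are hypothetical and do not exist in the driver,
which stores only the final sums.

```c
/* Assumption: the values on the removed diff lines are the PHY spec timings. */
#define EXAMPLE_PHY_SPEC_PORT_RESET_START_US	5
#define EXAMPLE_PHY_SPEC_REFCLK_ENABLE_US	1

/* Hypothetical SoC/OS margins: new value minus old value from the diff. */
#define EXAMPLE_OVERHEAD_PORT_RESET_START_US	5
#define EXAMPLE_OVERHEAD_REFCLK_ENABLE_US	9

/* Effective timeout = PHY spec timing + SoC/OS overhead. */
#define EXAMPLE_PORT_RESET_START_TIMEOUT_US \
	(EXAMPLE_PHY_SPEC_PORT_RESET_START_US + EXAMPLE_OVERHEAD_PORT_RESET_START_US)
#define EXAMPLE_REFCLK_ENABLE_TIMEOUT_US \
	(EXAMPLE_PHY_SPEC_REFCLK_ENABLE_US + EXAMPLE_OVERHEAD_REFCLK_ENABLE_US)

/* Compile-time check that the sums match the values added by the patch. */
_Static_assert(EXAMPLE_PORT_RESET_START_TIMEOUT_US == 10,
	       "port reset start timeout should be 10 us");
_Static_assert(EXAMPLE_REFCLK_ENABLE_TIMEOUT_US == 10,
	       "refclk enable timeout should be 10 us");
```

Keeping the spec value and the margin as separate terms would make it
visible which part comes from the PHY spec and which part was tuned
from stress testing.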



* [PATCH v2 2/2] drm/i915/lt_phy_regs: Add SoC/OS turnaround time
  2026-02-12  6:34 [PATCH v2 0/2] Update the PHY timeouts Arun R Murthy
  2026-02-12  6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
@ 2026-02-12  6:34 ` Arun R Murthy
  2026-02-12  8:06 ` ✗ i915.CI.BAT: failure for Update the PHY timeouts (rev2) Patchwork
  2 siblings, 0 replies; 6+ messages in thread
From: Arun R Murthy @ 2026-02-12  6:34 UTC (permalink / raw)
  To: Jani Nikula, uma.shankar, suraj.kandpal, ankit.k.nautiyal
  Cc: intel-gfx, intel-xe, Arun R Murthy

The timeouts mentioned in the spec cover only the PHY. Add the SoC and
OS turnaround time on top of them.
The overhead value is based on stress test results with multiple
available panels.

Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
---
 drivers/gpu/drm/i915/display/intel_lt_phy_regs.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h b/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h
index 37e46fb9abde4156ebd7ad1eb6cbbc12e7026b23..ff6d7829dbb9c50b2001d079b435b894faf9659e 100644
--- a/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h
+++ b/drivers/gpu/drm/i915/display/intel_lt_phy_regs.h
@@ -6,12 +6,12 @@
 #ifndef __INTEL_LT_PHY_REGS_H__
 #define __INTEL_LT_PHY_REGS_H__
 
-#define XE3PLPD_MSGBUS_TIMEOUT_FAST_US	500
+#define XE3PLPD_MSGBUS_TIMEOUT_FAST_US		500
 #define XE3PLPD_MACCLK_TURNON_LATENCY_MS	2
-#define XE3PLPD_MACCLK_TURNOFF_LATENCY_US	1
+#define XE3PLPD_MACCLK_TURNOFF_LATENCY_US	10
 #define XE3PLPD_RATE_CALIB_DONE_LATENCY_MS	1
-#define XE3PLPD_RESET_START_LATENCY_US	10
-#define XE3PLPD_PWRDN_TO_RDY_LATENCY_US	4
+#define XE3PLPD_RESET_START_LATENCY_US		10
+#define XE3PLPD_PWRDN_TO_RDY_LATENCY_US		10
 #define XE3PLPD_RESET_END_LATENCY_MS		2
 
 /* LT Phy MAC Register */

-- 
2.25.1



* ✗ i915.CI.BAT: failure for Update the PHY timeouts (rev2)
  2026-02-12  6:34 [PATCH v2 0/2] Update the PHY timeouts Arun R Murthy
  2026-02-12  6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
  2026-02-12  6:34 ` [PATCH v2 2/2] drm/i915/lt_phy_regs: Add SoC/OS " Arun R Murthy
@ 2026-02-12  8:06 ` Patchwork
  2 siblings, 0 replies; 6+ messages in thread
From: Patchwork @ 2026-02-12  8:06 UTC (permalink / raw)
  To: Arun R Murthy; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 3310 bytes --]

== Series Details ==

Series: Update the PHY timeouts (rev2)
URL   : https://patchwork.freedesktop.org/series/161527/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_17977 -> Patchwork_161527v2
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_161527v2 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_161527v2, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/index.html

Participating hosts (43 -> 40)
------------------------------

  Missing    (3): bat-dg2-13 fi-snb-2520m bat-adls-6 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_161527v2:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_pipe_crc_basic@read-crc:
    - fi-cfl-8109u:       [PASS][1] -> [DMESG-WARN][2] +48 other tests dmesg-warn
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/fi-cfl-8109u/igt@kms_pipe_crc_basic@read-crc.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/fi-cfl-8109u/igt@kms_pipe_crc_basic@read-crc.html

  
Known issues
------------

  Here are the changes found in Patchwork_161527v2 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live:
    - bat-dg2-8:          [PASS][3] -> [DMESG-FAIL][4] ([i915#12061]) +1 other test dmesg-fail
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/bat-dg2-8/igt@i915_selftest@live.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/bat-dg2-8/igt@i915_selftest@live.html

  * igt@i915_selftest@live@late_gt_pm:
    - fi-cfl-8109u:       [PASS][5] -> [DMESG-WARN][6] ([i915#13735]) +32 other tests dmesg-warn
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/fi-cfl-8109u/igt@i915_selftest@live@late_gt_pm.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/fi-cfl-8109u/igt@i915_selftest@live@late_gt_pm.html

  * igt@i915_selftest@live@workarounds:
    - bat-arls-6:         [PASS][7] -> [DMESG-FAIL][8] ([i915#12061]) +1 other test dmesg-fail
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17977/bat-arls-6/igt@i915_selftest@live@workarounds.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/bat-arls-6/igt@i915_selftest@live@workarounds.html

  
  [i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
  [i915#13735]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13735


Build changes
-------------

  * Linux: CI_DRM_17977 -> Patchwork_161527v2

  CI-20190529: 20190529
  CI_DRM_17977: b4bfe7d753afaf6ea4950111a309a4e2ef5aef68 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_8751: af788251f1ef729d17c802aec2c4547b52059e58 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_161527v2: b4bfe7d753afaf6ea4950111a309a4e2ef5aef68 @ git://anongit.freedesktop.org/gfx-ci/linux

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_161527v2/index.html



* Re: [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
  2026-02-12  6:34 ` [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time Arun R Murthy
@ 2026-02-14  8:48   ` Cole Leavitt
  2026-02-16  4:53     ` Murthy, Arun R
  0 siblings, 1 reply; 6+ messages in thread
From: Cole Leavitt @ 2026-02-14  8:48 UTC (permalink / raw)
  To: arun.r.murthy; +Cc: intel-gfx, suraj.kandpal, jani.nikula, Cole Leavitt

On Thu, 12 Feb 2026, Arun R Murthy <arun.r.murthy@intel.com> wrote:
> The port refclk enable timeout and the soc ready timeout value mentioned
> in the spec is the PHY timings and doesn't include the turnaround time
> from the SoC or OS. So add an overhead timeout value on top of the
> recommended timeouts from the PHY spec.

Hi Arun,

Thanks for the fix. I wanted to flag that I independently identified
this exact issue and posted a detailed root cause analysis on the i915
GitLab tracker five days before this patch series.

On February 7, 2026, I filed the analysis on GitLab issue #14713:

  https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14713#note_2739573

That comment includes the following findings, which directly correspond
to what this patch addresses:

1. Traced the error to intel_cx0_phy_lane_reset() in intel_cx0_phy.c
   (line ~2911), where the driver writes the PCLK_REFCLK_REQUEST bit to
   XELPDP_PORT_CLOCK_CTL and polls for PCLK_REFCLK_ACK with a timeout
   of XELPDP_REFCLK_ENABLE_TIMEOUT_US = 1 (1 us).

2. Identified that this calls __intel_wait_for_register() with
   fast_timeout_us=1 and slow_timeout_ms=0 -- a single spin-poll with
   no slow-path fallback.

3. Compared the 1 us refclk timeout against other timeouts in the same
   PHY init sequence:

     XELPDP_PORT_BUF_SOC_READY_TIMEOUT_US  = 100 us
     XELPDP_PORT_RESET_START_TIMEOUT_US     =   5 us
     XELPDP_PCLK_PLL_ENABLE_TIMEOUT_US      = 3200 us
     XELPDP_PORT_RESET_END_TIMEOUT_MS       =  15 ms

   The 1 us value is 5x to 15,000x smaller than every other timeout in
   the same code path.

4. Recommended increasing XELPDP_REFCLK_ENABLE_TIMEOUT_US to ~100 us
   or adding a slow-path ms fallback, consistent with how other waits
   in the same function are structured.
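
The difference between a single spin-poll and a wait with a slow-path
fallback, as described in items 2 and 4, can be sketched in userspace C.
This is not the i915 implementation: the fake_phy structure, the 1 us
cost per register read, and the 1000 us sleep step are all assumptions
made so the example is deterministic.

```c
#include <stdbool.h>
#include <stdint.h>

#define ACK_BIT (1u << 0)

/*
 * Simulated PHY, not an i915 structure. The clock is virtual and every
 * register read is assumed to cost 1 us.
 */
struct fake_phy {
	uint64_t now_us;	/* simulated clock */
	uint64_t ack_at_us;	/* simulated time at which ACK becomes set */
};

static uint32_t fake_read(struct fake_phy *phy)
{
	phy->now_us += 1;
	return phy->now_us >= phy->ack_at_us ? ACK_BIT : 0;
}

/* Stand-in for sleeping between slow-path polls. */
static void fake_sleep_us(struct fake_phy *phy, uint64_t us)
{
	phy->now_us += us;
}

/* Fast phase: spin on the register until ACK or the budget runs out. */
static bool spin_for_ack(struct fake_phy *phy, uint64_t budget_us)
{
	uint64_t deadline = phy->now_us + budget_us;

	do {
		if (fake_read(phy) & ACK_BIT)
			return true;
	} while (phy->now_us < deadline);

	return false;
}

/* Slow phase: poll with sleeps in between, then read one last time. */
static bool sleep_for_ack(struct fake_phy *phy, uint64_t budget_ms)
{
	uint64_t deadline = phy->now_us + budget_ms * 1000;

	do {
		if (fake_read(phy) & ACK_BIT)
			return true;
		fake_sleep_us(phy, 1000);
	} while (phy->now_us < deadline);

	/* Re-check once so an overlong sleep cannot cause a false timeout. */
	return (fake_read(phy) & ACK_BIT) != 0;
}

/* Two-phase wait: fast spin first, optional sleeping fallback second. */
bool wait_for_ack(struct fake_phy *phy, uint64_t fast_us, uint64_t slow_ms)
{
	if (spin_for_ack(phy, fast_us))
		return true;
	if (slow_ms == 0)
		return false;	/* no slow-path fallback configured */
	return sleep_for_ack(phy, slow_ms);
}
```

With a hypothetical ACK latency of 5 us, wait_for_ack(&phy, 1, 0) gives
up after a single read, while a 10 us fast budget or a non-zero
slow_ms lets the same wait succeed.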

This analysis was performed on a Lenovo ThinkPad P16 Gen 3 with an
Arrow Lake-S Core Ultra 9 275HX (device ID 7d67) running kernel
6.19.0-rc8. The PHY A refclk failure reproduced on every boot at ~8.5s
after i915 init, during the eDP panel probe path.

Your patch does the right thing -- increasing the timeout values and
adding SoC/OS overhead. Since my analysis identified the root cause and
recommended the same fix direction, I'd appreciate attribution:

Reported-by: Cole Leavitt <cole@unwrap.rs>

Thanks,
Cole


* RE: [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS turnaround time
  2026-02-14  8:48   ` Cole Leavitt
@ 2026-02-16  4:53     ` Murthy, Arun R
  0 siblings, 0 replies; 6+ messages in thread
From: Murthy, Arun R @ 2026-02-16  4:53 UTC (permalink / raw)
  To: Cole Leavitt
  Cc: intel-gfx@lists.freedesktop.org, Kandpal, Suraj, Nikula, Jani


> -----Original Message-----
> From: Cole Leavitt <cole@unwrap.rs>
> Sent: Saturday, February 14, 2026 2:18 PM
> To: Murthy, Arun R <arun.r.murthy@intel.com>
> Cc: intel-gfx@lists.freedesktop.org; Kandpal, Suraj <suraj.kandpal@intel.com>;
> Nikula, Jani <jani.nikula@intel.com>; Cole Leavitt <cole@unwrap.rs>
> Subject: Re: [PATCH v2 1/2] drm/i915/cx0_phy_regs: Include SoC and OS
> turnaround time
> 
> On Thu, 12 Feb 2026, Arun R Murthy <arun.r.murthy@intel.com> wrote:
> > The port refclk enable timeout and the soc ready timeout value
> > mentioned in the spec is the PHY timings and doesn't include the
> > turnaround time from the SoC or OS. So add an overhead timeout value
> > on top of the recommended timeouts from the PHY spec.
> 
> Hi Arun,
> 
> Thanks for the fix. I wanted to flag that I independently identified this exact
> issue and posted a detailed root cause analysis on the i915 GitLab tracker five
> days before this patch series.
> 
> On February 7, 2026, I filed the analysis on GitLab issue #14713:
> 
>   https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14713#note_2739573
> 
> That comment includes the following findings, which directly correspond to
> what this patch addresses:
> 
> 1. Traced the error to intel_cx0_phy_lane_reset() in intel_cx0_phy.c
>    (line ~2911), where the driver writes the PCLK_REFCLK_REQUEST bit to
>    XELPDP_PORT_CLOCK_CTL and polls for PCLK_REFCLK_ACK with a timeout
>    of XELPDP_REFCLK_ENABLE_TIMEOUT_US = 1 (1 us).
> 
> 2. Identified that this calls __intel_wait_for_register() with
>    fast_timeout_us=1 and slow_timeout_ms=0 -- a single spin-poll with
>    no slow-path fallback.
> 
> 3. Compared the 1 us refclk timeout against other timeouts in the same
>    PHY init sequence:
> 
>      XELPDP_PORT_BUF_SOC_READY_TIMEOUT_US  = 100 us
>      XELPDP_PORT_RESET_START_TIMEOUT_US     =   5 us
>      XELPDP_PCLK_PLL_ENABLE_TIMEOUT_US      = 3200 us
>      XELPDP_PORT_RESET_END_TIMEOUT_MS       =  15 ms
> 
>    The 1 us value is 5x to 15,000x smaller than every other timeout in
>    the same code path.
> 
> 4. Recommended increasing XELPDP_REFCLK_ENABLE_TIMEOUT_US to ~100 us
>    or adding a slow-path ms fallback, consistent with how other waits
>    in the same function are structured.
> 
> This analysis was performed on a Lenovo ThinkPad P16 Gen 3 with an Arrow
> Lake-S Core Ultra 9 275HX (device ID 7d67) running kernel 6.19.0-rc8. The PHY
> A refclk failure reproduced on every boot at ~8.5s after i915 init, during the eDP
> panel probe path.
> 
> Your patch does the right thing -- increasing the timeout values and adding
> SoC/OS overhead. Since my analysis identified the root cause and
> recommended the same fix direction, I'd appreciate attribution:
> 
> Reported-by: Cole Leavitt <cole@unwrap.rs>
> 
Yes, I will definitely credit you in this patch.

Thanks and Regards,
Arun R Murthy
--------------------

