All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails
@ 2023-08-24 20:50 ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel

Next version of https://patchwork.freedesktop.org/series/122850/

v4:
  Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
  instead of drm-tip/drm-tip. Apologies for the noise :(

v3:
  Still learning the ropes of upstream workflow. Apologies for mucking up v2.
  This is just a re-upload.

v2:
  Reorganize into:
  1) Add for final failure state for SST and MST link training fallback.
  2) Add a DRM helper for setting downstream MST ports' link-status state.
  3) Make handling SST and MST connectors simpler via intel_dp.
  4) Update link-status for downstream MST ports.
  5) Emit a uevent with the "link-status" trigger property.

v1:
Currently, when link training fails after all fallback values have been
exhausted, the i915 driver seizes to send uevents to userspace. This leave
userspace thinking that the last passing atomic commit was successful, and that
all connectors (displays) are connected and operational, when in fact, the last
link failed to train and the displays remain dark. This manifests as "zombie"
displays in userspace, in which users observe the displays appear in their
display settings page, but they are dark and unresponsive.

Since, at the time of writing, MST link training fallback is not implemented,
failing MST link training is a significantly more common case then a complete
SST link training failure. And with users using MST hubs more than ever to
connect multiple displays via their USB-C ports we observe this case often.

This patchset series suggest a solution, in which a final failure state is
defined. In this final state, the connector's bit rate capabilities, namely
max_link_rate and max_link_lane_count, are set to 0. This effectively set the
connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
following connector probing.

Next, with this state defined, we emit a link-status=Bad uevent. The next time
userspace probes the connector, it should recognize that the connector has no
modes and ignore it since it is in a bad state.

I am aware that always sending a uevent and never stopping may result in some
userspaces having their expectations broken and enter an infinite loop of
modesets and link-training attempts. However, per DRM link-status spec:
```
 * link-status:
 *      Connector link-status property to indicate the status of link. The
 *      default value of link-status is "GOOD". If something fails during or
 *      after modeset, the kernel driver may set this to "BAD" and issue a
 *      hotplug uevent. Drivers should update this value using
 *      drm_connector_set_link_status_property().
 *
 *      When user-space receives the hotplug uevent and detects a "BAD"
 *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
 *      becomes completely black). The list of available modes may have
 *      changed. User-space is expected to pick a new mode if the current one
 *      has disappeared and perform a new modeset with link-status set to
 *      "GOOD" to re-enable the connector.
```
(form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)

it seems reasonable to assume that the suggested state is an extension of the
spec's guidelines, in which the next new mode userspace picks for a connector
with no modes is - none, thus breaking the cycle of failed link-training
attempts.

I suspect that, maybe, zeroing out the bit rate capabilities is not the right
way to go, and perhaps marking the connector as disconnected instead may be a
better solution. However, if marking a connector disconnected is the way to go,
We will have to iterate over all MST ports in the MST case and mark the spawned
connectors as disconnected as well.

As a final note I should add that this approach was tested with ChromeOS as
userspace, and we observed that the zombie displays stop showing up once the
connectors are pruned of all their modes and are ignored by userspace.

For your consideration and guidance.
Thanks,

Gil Dekel (6):
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback for MST
  drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
  drm/i915: Move DP modeset_retry_work into intel_dp
  drm/i915/dp_link_training: Set all downstream MST ports to BAD before
    retrying
  drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
    property

 drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
 drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
 .../drm/i915/display/intel_display_types.h    |  6 +-
 drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
 drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
 .../drm/i915/display/intel_dp_link_training.c | 11 ++-
 include/drm/display/drm_dp_mst_helper.h       |  3 +
 7 files changed, 110 insertions(+), 40 deletions(-)

--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails
@ 2023-08-24 20:50 ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel, navaremanasi

Next version of https://patchwork.freedesktop.org/series/122850/

v4:
  Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
  instead of drm-tip/drm-tip. Apologies for the noise :(

v3:
  Still learning the ropes of upstream workflow. Apologies for mucking up v2.
  This is just a re-upload.

v2:
  Reorganize into:
  1) Add for final failure state for SST and MST link training fallback.
  2) Add a DRM helper for setting downstream MST ports' link-status state.
  3) Make handling SST and MST connectors simpler via intel_dp.
  4) Update link-status for downstream MST ports.
  5) Emit a uevent with the "link-status" trigger property.

v1:
Currently, when link training fails after all fallback values have been
exhausted, the i915 driver seizes to send uevents to userspace. This leave
userspace thinking that the last passing atomic commit was successful, and that
all connectors (displays) are connected and operational, when in fact, the last
link failed to train and the displays remain dark. This manifests as "zombie"
displays in userspace, in which users observe the displays appear in their
display settings page, but they are dark and unresponsive.

Since, at the time of writing, MST link training fallback is not implemented,
failing MST link training is a significantly more common case then a complete
SST link training failure. And with users using MST hubs more than ever to
connect multiple displays via their USB-C ports we observe this case often.

This patchset series suggest a solution, in which a final failure state is
defined. In this final state, the connector's bit rate capabilities, namely
max_link_rate and max_link_lane_count, are set to 0. This effectively set the
connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
following connector probing.

Next, with this state defined, we emit a link-status=Bad uevent. The next time
userspace probes the connector, it should recognize that the connector has no
modes and ignore it since it is in a bad state.

I am aware that always sending a uevent and never stopping may result in some
userspaces having their expectations broken and enter an infinite loop of
modesets and link-training attempts. However, per DRM link-status spec:
```
 * link-status:
 *      Connector link-status property to indicate the status of link. The
 *      default value of link-status is "GOOD". If something fails during or
 *      after modeset, the kernel driver may set this to "BAD" and issue a
 *      hotplug uevent. Drivers should update this value using
 *      drm_connector_set_link_status_property().
 *
 *      When user-space receives the hotplug uevent and detects a "BAD"
 *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
 *      becomes completely black). The list of available modes may have
 *      changed. User-space is expected to pick a new mode if the current one
 *      has disappeared and perform a new modeset with link-status set to
 *      "GOOD" to re-enable the connector.
```
(form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)

it seems reasonable to assume that the suggested state is an extension of the
spec's guidelines, in which the next new mode userspace picks for a connector
with no modes is - none, thus breaking the cycle of failed link-training
attempts.

I suspect that, maybe, zeroing out the bit rate capabilities is not the right
way to go, and perhaps marking the connector as disconnected instead may be a
better solution. However, if marking a connector disconnected is the way to go,
We will have to iterate over all MST ports in the MST case and mark the spawned
connectors as disconnected as well.

As a final note I should add that this approach was tested with ChromeOS as
userspace, and we observed that the zombie displays stop showing up once the
connectors are pruned of all their modes and are ignored by userspace.

For your consideration and guidance.
Thanks,

Gil Dekel (6):
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback for MST
  drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
  drm/i915: Move DP modeset_retry_work into intel_dp
  drm/i915/dp_link_training: Set all downstream MST ports to BAD before
    retrying
  drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
    property

 drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
 drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
 .../drm/i915/display/intel_display_types.h    |  6 +-
 drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
 drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
 .../drm/i915/display/intel_dp_link_training.c | 11 ++-
 include/drm/display/drm_dp_mst_helper.h       |  3 +
 7 files changed, 110 insertions(+), 40 deletions(-)

--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Intel-gfx] [PATCH v4 1/6] drm/i915/dp_link_training: Add a final failing state to link training fallback
  2023-08-24 20:50 ` Gil Dekel
@ 2023-08-24 20:50   ` Gil Dekel
  -1 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel

Instead of silently giving up when all link-training fallback values are
exhausted, this patch modifies the fallback's failure branch to reduces
both max_link_lane_count and max_link_rate to zero (0) and continues to
emit uevents until userspace stops attempting to modeset.

By doing so, we ensure the failing connector, which is in
link-status=Bad, has all its modes pruned (due to effectively having a
bandwidth of 0Gbps).

It is then the userspace's responsibility to ignore connectors with no
modes, even if they are marked as connected.

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_dp.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 7067ee3a4bd3..2152ddbab557 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -276,8 +276,12 @@ static int intel_dp_common_len_rate_limit(const struct intel_dp *intel_dp,

 static int intel_dp_common_rate(struct intel_dp *intel_dp, int index)
 {
+	/* This occurs when max link rate drops to 0 via link training fallback*/
+	if (index < 0)
+		return 0;
+
 	if (drm_WARN_ON(&dp_to_i915(intel_dp)->drm,
-			index < 0 || index >= intel_dp->num_common_rates))
+			index >= intel_dp->num_common_rates))
 		return 162000;

 	return intel_dp->common_rates[index];
@@ -318,6 +322,9 @@ static int intel_dp_max_common_lane_count(struct intel_dp *intel_dp)
 int intel_dp_max_lane_count(struct intel_dp *intel_dp)
 {
 	switch (intel_dp->max_link_lane_count) {
+	/* This occurs when max link lane count drops to 0 via link training fallback*/
+	case 0:
+		return 0;
 	case 1:
 	case 2:
 	case 4:
@@ -672,7 +679,14 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 		intel_dp->max_link_lane_count = lane_count >> 1;
 	} else {
 		drm_err(&i915->drm, "Link Training Unsuccessful\n");
-		return -1;
+		/*
+		 * Ensure all of the connector's modes are pruned in the next
+		 * probe by effectively reducing its bandwidth to 0 so userspace
+		 * can ignore it within the next modeset attempt.
+		 */
+		intel_dp->max_link_rate = 0;
+		intel_dp->max_link_lane_count = 0;
+		return 0;
 	}

 	return 0;
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 1/6] drm/i915/dp_link_training: Add a final failing state to link training fallback
@ 2023-08-24 20:50   ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel, navaremanasi

Instead of silently giving up when all link-training fallback values are
exhausted, this patch modifies the fallback's failure branch to reduces
both max_link_lane_count and max_link_rate to zero (0) and continues to
emit uevents until userspace stops attempting to modeset.

By doing so, we ensure the failing connector, which is in
link-status=Bad, has all its modes pruned (due to effectively having a
bandwidth of 0Gbps).

It is then the userspace's responsibility to ignore connectors with no
modes, even if they are marked as connected.

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_dp.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 7067ee3a4bd3..2152ddbab557 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -276,8 +276,12 @@ static int intel_dp_common_len_rate_limit(const struct intel_dp *intel_dp,

 static int intel_dp_common_rate(struct intel_dp *intel_dp, int index)
 {
+	/* This occurs when max link rate drops to 0 via link training fallback*/
+	if (index < 0)
+		return 0;
+
 	if (drm_WARN_ON(&dp_to_i915(intel_dp)->drm,
-			index < 0 || index >= intel_dp->num_common_rates))
+			index >= intel_dp->num_common_rates))
 		return 162000;

 	return intel_dp->common_rates[index];
@@ -318,6 +322,9 @@ static int intel_dp_max_common_lane_count(struct intel_dp *intel_dp)
 int intel_dp_max_lane_count(struct intel_dp *intel_dp)
 {
 	switch (intel_dp->max_link_lane_count) {
+	/* This occurs when max link lane count drops to 0 via link training fallback*/
+	case 0:
+		return 0;
 	case 1:
 	case 2:
 	case 4:
@@ -672,7 +679,14 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 		intel_dp->max_link_lane_count = lane_count >> 1;
 	} else {
 		drm_err(&i915->drm, "Link Training Unsuccessful\n");
-		return -1;
+		/*
+		 * Ensure all of the connector's modes are pruned in the next
+		 * probe by effectively reducing its bandwidth to 0 so userspace
+		 * can ignore it within the next modeset attempt.
+		 */
+		intel_dp->max_link_rate = 0;
+		intel_dp->max_link_lane_count = 0;
+		return 0;
 	}

 	return 0;
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Intel-gfx] [PATCH v4 2/6] drm/i915/dp_link_training: Add a final failing state to link training fallback for MST
  2023-08-24 20:50 ` Gil Dekel
@ 2023-08-24 20:50   ` Gil Dekel
  -1 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel

Currently, MST link training has no fallback. This means that if an MST
base connector fails to link-train once, the training completely fails,
which makes this case significantly more common than a complete SST link
training failure.

Similar to the final failure state of SST, this patch zeros out both
max_link_rate and max_link_lane_count. In addition, it stops resetting
MST params so the zeroing of the HBR fields stick. This ensures that
the MST base connector's modes will be completely pruned, since it is
effectively left with 0Gbps bandwidth.

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_dp.c       | 27 ++++++++++---------
 drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
 .../drm/i915/display/intel_dp_link_training.c |  8 +++---
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 2152ddbab557..01b180c8d9bd 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -630,7 +630,7 @@ static bool intel_dp_can_link_train_fallback_for_edp(struct intel_dp *intel_dp,
 	return true;
 }

-int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
+void intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 					    int link_rate, u8 lane_count)
 {
 	struct drm_i915_private *i915 = dp_to_i915(intel_dp);
@@ -638,18 +638,23 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,

 	/*
 	 * TODO: Enable fallback on MST links once MST link compute can handle
-	 * the fallback params.
+	 * the fallback params. For now, similar to the SST case, ensure all of
+	 * the base connector's modes are pruned in the next connector probe by
+	 * effectively reducing its bandwidth to 0 so userspace can ignore it
+	 * within the next modeset attempt.
 	 */
 	if (intel_dp->is_mst) {
 		drm_err(&i915->drm, "Link Training Unsuccessful\n");
-		return -1;
+		intel_dp->max_link_rate = 0;
+		intel_dp->max_link_lane_count = 0;
+		return;
 	}

 	if (intel_dp_is_edp(intel_dp) && !intel_dp->use_max_params) {
 		drm_dbg_kms(&i915->drm,
 			    "Retrying Link training for eDP with max parameters\n");
 		intel_dp->use_max_params = true;
-		return 0;
+		return;
 	}

 	index = intel_dp_rate_index(intel_dp->common_rates,
@@ -662,7 +667,7 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 							      lane_count)) {
 			drm_dbg_kms(&i915->drm,
 				    "Retrying Link training for eDP with same parameters\n");
-			return 0;
+			return;
 		}
 		intel_dp->max_link_rate = intel_dp_common_rate(intel_dp, index - 1);
 		intel_dp->max_link_lane_count = lane_count;
@@ -673,7 +678,7 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 							      lane_count >> 1)) {
 			drm_dbg_kms(&i915->drm,
 				    "Retrying Link training for eDP with same parameters\n");
-			return 0;
+			return;
 		}
 		intel_dp->max_link_rate = intel_dp_max_common_rate(intel_dp);
 		intel_dp->max_link_lane_count = lane_count >> 1;
@@ -686,10 +691,7 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 		 */
 		intel_dp->max_link_rate = 0;
 		intel_dp->max_link_lane_count = 0;
-		return 0;
 	}
-
-	return 0;
 }

 u32 intel_dp_mode_to_fec_clock(u32 mode_clock)
@@ -5310,10 +5312,11 @@ intel_dp_detect(struct drm_connector *connector,
 	intel_dp_configure_mst(intel_dp);

 	/*
-	 * TODO: Reset link params when switching to MST mode, until MST
-	 * supports link training fallback params.
+	 * Note: Even though MST link training fallback is not yet implemented,
+	 * do not reset. This is because the base connector needs to have all
+	 * its modes pruned when link training for the MST port fails.
 	 */
-	if (intel_dp->reset_link_params || intel_dp->is_mst) {
+	if (intel_dp->reset_link_params) {
 		intel_dp_reset_max_link_params(intel_dp);
 		intel_dp->reset_link_params = false;
 	}
diff --git a/drivers/gpu/drm/i915/display/intel_dp.h b/drivers/gpu/drm/i915/display/intel_dp.h
index 788a577ebe16..7388510e0cb2 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.h
+++ b/drivers/gpu/drm/i915/display/intel_dp.h
@@ -40,7 +40,7 @@ bool intel_dp_init_connector(struct intel_digital_port *dig_port,
 			     struct intel_connector *intel_connector);
 void intel_dp_set_link_params(struct intel_dp *intel_dp,
 			      int link_rate, int lane_count);
-int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
+void intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 					    int link_rate, u8 lane_count);
 int intel_dp_get_active_pipes(struct intel_dp *intel_dp,
 			      struct drm_modeset_acquire_ctx *ctx,
diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
index 4485ef4f8ec6..31d0d7854003 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
@@ -1075,10 +1075,10 @@ static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 		lt_dbg(intel_dp, DP_PHY_DPRX,
 		       "Link Training failed with HOBL active, not enabling it from now on\n");
 		intel_dp->hobl_failed = true;
-	} else if (intel_dp_get_link_train_fallback_values(intel_dp,
-							   crtc_state->port_clock,
-							   crtc_state->lane_count)) {
-		return;
+	} else {
+		intel_dp_get_link_train_fallback_values(intel_dp,
+							crtc_state->port_clock,
+							crtc_state->lane_count);
 	}

 	/* Schedule a Hotplug Uevent to userspace to start modeset */
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 2/6] drm/i915/dp_link_training: Add a final failing state to link training fallback for MST
@ 2023-08-24 20:50   ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel, navaremanasi

Currently, MST link training has no fallback. This means that if an MST
base connector fails to link-train once, the training completely fails,
which makes this case significantly more common than a complete SST link
training failure.

Similar to the final failure state of SST, this patch zeros out both
max_link_rate and max_link_lane_count. In addition, it stops resetting
MST params so the zeroing of the HBR fields stick. This ensures that
the MST base connector's modes will be completely pruned, since it is
effectively left with 0Gbps bandwidth.

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_dp.c       | 27 ++++++++++---------
 drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
 .../drm/i915/display/intel_dp_link_training.c |  8 +++---
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 2152ddbab557..01b180c8d9bd 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -630,7 +630,7 @@ static bool intel_dp_can_link_train_fallback_for_edp(struct intel_dp *intel_dp,
 	return true;
 }

-int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
+void intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 					    int link_rate, u8 lane_count)
 {
 	struct drm_i915_private *i915 = dp_to_i915(intel_dp);
@@ -638,18 +638,23 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,

 	/*
 	 * TODO: Enable fallback on MST links once MST link compute can handle
-	 * the fallback params.
+	 * the fallback params. For now, similar to the SST case, ensure all of
+	 * the base connector's modes are pruned in the next connector probe by
+	 * effectively reducing its bandwidth to 0 so userspace can ignore it
+	 * within the next modeset attempt.
 	 */
 	if (intel_dp->is_mst) {
 		drm_err(&i915->drm, "Link Training Unsuccessful\n");
-		return -1;
+		intel_dp->max_link_rate = 0;
+		intel_dp->max_link_lane_count = 0;
+		return;
 	}

 	if (intel_dp_is_edp(intel_dp) && !intel_dp->use_max_params) {
 		drm_dbg_kms(&i915->drm,
 			    "Retrying Link training for eDP with max parameters\n");
 		intel_dp->use_max_params = true;
-		return 0;
+		return;
 	}

 	index = intel_dp_rate_index(intel_dp->common_rates,
@@ -662,7 +667,7 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 							      lane_count)) {
 			drm_dbg_kms(&i915->drm,
 				    "Retrying Link training for eDP with same parameters\n");
-			return 0;
+			return;
 		}
 		intel_dp->max_link_rate = intel_dp_common_rate(intel_dp, index - 1);
 		intel_dp->max_link_lane_count = lane_count;
@@ -673,7 +678,7 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 							      lane_count >> 1)) {
 			drm_dbg_kms(&i915->drm,
 				    "Retrying Link training for eDP with same parameters\n");
-			return 0;
+			return;
 		}
 		intel_dp->max_link_rate = intel_dp_max_common_rate(intel_dp);
 		intel_dp->max_link_lane_count = lane_count >> 1;
@@ -686,10 +691,7 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 		 */
 		intel_dp->max_link_rate = 0;
 		intel_dp->max_link_lane_count = 0;
-		return 0;
 	}
-
-	return 0;
 }

 u32 intel_dp_mode_to_fec_clock(u32 mode_clock)
@@ -5310,10 +5312,11 @@ intel_dp_detect(struct drm_connector *connector,
 	intel_dp_configure_mst(intel_dp);

 	/*
-	 * TODO: Reset link params when switching to MST mode, until MST
-	 * supports link training fallback params.
+	 * Note: Even though MST link training fallback is not yet implemented,
+	 * do not reset. This is because the base connector needs to have all
+	 * its modes pruned when link training for the MST port fails.
 	 */
-	if (intel_dp->reset_link_params || intel_dp->is_mst) {
+	if (intel_dp->reset_link_params) {
 		intel_dp_reset_max_link_params(intel_dp);
 		intel_dp->reset_link_params = false;
 	}
diff --git a/drivers/gpu/drm/i915/display/intel_dp.h b/drivers/gpu/drm/i915/display/intel_dp.h
index 788a577ebe16..7388510e0cb2 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.h
+++ b/drivers/gpu/drm/i915/display/intel_dp.h
@@ -40,7 +40,7 @@ bool intel_dp_init_connector(struct intel_digital_port *dig_port,
 			     struct intel_connector *intel_connector);
 void intel_dp_set_link_params(struct intel_dp *intel_dp,
 			      int link_rate, int lane_count);
-int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
+void intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
 					    int link_rate, u8 lane_count);
 int intel_dp_get_active_pipes(struct intel_dp *intel_dp,
 			      struct drm_modeset_acquire_ctx *ctx,
diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
index 4485ef4f8ec6..31d0d7854003 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
@@ -1075,10 +1075,10 @@ static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 		lt_dbg(intel_dp, DP_PHY_DPRX,
 		       "Link Training failed with HOBL active, not enabling it from now on\n");
 		intel_dp->hobl_failed = true;
-	} else if (intel_dp_get_link_train_fallback_values(intel_dp,
-							   crtc_state->port_clock,
-							   crtc_state->lane_count)) {
-		return;
+	} else {
+		intel_dp_get_link_train_fallback_values(intel_dp,
+							crtc_state->port_clock,
+							crtc_state->lane_count);
 	}

 	/* Schedule a Hotplug Uevent to userspace to start modeset */
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Intel-gfx] [PATCH v4 3/6] drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
  2023-08-24 20:50 ` Gil Dekel
@ 2023-08-24 20:50   ` Gil Dekel
  -1 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel

Unlike SST, MST can support multiple displays connected to a single
connector. However, this also means that if the DisplayPort link to the
top-level MST branch device becomes unstable, then every single branch
device has an unstable link.

Since there are multiple downstream ports per connector, setting the
link status of the parent mstb's port to BAD is not enough. All of the
downstream mstb ports must also have their link status set to BAD.

This aligns to how the DP link status logic in DRM works. We notify
userspace that all of the mstb ports need retraining and apply new lower
bandwidth constraints to all future atomic commits on the topology that
follow.

Since any driver supporting MST needs to figure out which connectors
live downstream on an MST topology and update their link status in order
to retrain MST links properly, we add the
drm_dp_set_mst_topology_link_status() helper. This helper simply marks
the link status of all connectors living in that topology as bad. We
will make use of this helper in i915 later in this series.

Credit: this patch is a refactor of Lyude Pual's original patch:
https://patchwork.kernel.org/project/dri-devel/patch/20180308232421.14049-5-lyude@redhat.com/

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 +++++++++++++++++++
 include/drm/display/drm_dp_mst_helper.h       |  3 ++
 2 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index ed96cfcfa304..17cbadfb6ccb 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -3566,6 +3566,45 @@ int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
 }
 EXPORT_SYMBOL(drm_dp_get_vc_payload_bw);

+/**
+ * drm_dp_set_mst_topology_link_status() - set all downstream MST ports' link status
+ * @mgr: MST topology manager to set state for
+ * @status: The new status to set the MST topology to
+ *
+ * Set all downstream ports' link-status within the topology to the given status.
+ */
+void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
+					 enum drm_link_status status)
+{
+	struct drm_dp_mst_port *port;
+	struct drm_dp_mst_branch *rmstb;
+	struct drm_dp_mst_branch *mstb =
+		drm_dp_mst_topology_get_mstb_validated(mgr, mgr->mst_primary);
+
+	list_for_each_entry_reverse(port, &mstb->ports, next) {
+		struct drm_connector *connector = port->connector;
+
+		if (connector) {
+			mutex_lock(&connector->dev->mode_config.mutex);
+			drm_dbg_kms(
+				connector->dev,
+				"[MST-CONNECTOR:%d:%s] link status %d -> %d\n",
+				connector->base.id, connector->name,
+				connector->state->link_status, status);
+			connector->state->link_status = status;
+			mutex_unlock(&connector->dev->mode_config.mutex);
+		}
+
+		rmstb = drm_dp_mst_topology_get_mstb_validated(mstb->mgr,
+							       port->mstb);
+		if (rmstb) {
+			drm_dp_set_mst_topology_link_status(rmstb->mgr, status);
+			drm_dp_mst_topology_put_mstb(rmstb);
+		}
+	}
+}
+EXPORT_SYMBOL(drm_dp_set_mst_topology_link_status);
+
 /**
  * drm_dp_read_mst_cap() - check whether or not a sink supports MST
  * @aux: The DP AUX channel to use
diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h
index ed5c9660563c..855d488bf364 100644
--- a/include/drm/display/drm_dp_mst_helper.h
+++ b/include/drm/display/drm_dp_mst_helper.h
@@ -832,6 +832,9 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector *connector,
 int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
 			     int link_rate, int link_lane_count);

+void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
+					 enum drm_link_status status);
+
 int drm_dp_calc_pbn_mode(int clock, int bpp, bool dsc);

 void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, uint8_t link_encoding_cap);
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 3/6] drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
@ 2023-08-24 20:50   ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel, navaremanasi

Unlike SST, MST can support multiple displays connected to a single
connector. However, this also means that if the DisplayPort link to the
top-level MST branch device becomes unstable, then every single branch
device has an unstable link.

Since there are multiple downstream ports per connector, setting the
link status of the parent mstb's port to BAD is not enough. All of the
downstream mstb ports must also have their link status set to BAD.

This aligns to how the DP link status logic in DRM works. We notify
userspace that all of the mstb ports need retraining and apply new lower
bandwidth constraints to all future atomic commits on the topology that
follow.

Since any driver supporting MST needs to figure out which connectors
live downstream on an MST topology and update their link status in order
to retrain MST links properly, we add the
drm_dp_set_mst_topology_link_status() helper. This helper simply marks
the link status of all connectors living in that topology as bad. We
will make use of this helper in i915 later in this series.

Credit: this patch is a refactor of Lyude Pual's original patch:
https://patchwork.kernel.org/project/dri-devel/patch/20180308232421.14049-5-lyude@redhat.com/

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 +++++++++++++++++++
 include/drm/display/drm_dp_mst_helper.h       |  3 ++
 2 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index ed96cfcfa304..17cbadfb6ccb 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -3566,6 +3566,45 @@ int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
 }
 EXPORT_SYMBOL(drm_dp_get_vc_payload_bw);

+/**
+ * drm_dp_set_mst_topology_link_status() - set all downstream MST ports' link status
+ * @mgr: MST topology manager to set state for
+ * @status: The new status to set the MST topology to
+ *
+ * Set all downstream ports' link-status within the topology to the given status.
+ */
+void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
+					 enum drm_link_status status)
+{
+	struct drm_dp_mst_port *port;
+	struct drm_dp_mst_branch *rmstb;
+	struct drm_dp_mst_branch *mstb =
+		drm_dp_mst_topology_get_mstb_validated(mgr, mgr->mst_primary);
+
+	list_for_each_entry_reverse(port, &mstb->ports, next) {
+		struct drm_connector *connector = port->connector;
+
+		if (connector) {
+			mutex_lock(&connector->dev->mode_config.mutex);
+			drm_dbg_kms(
+				connector->dev,
+				"[MST-CONNECTOR:%d:%s] link status %d -> %d\n",
+				connector->base.id, connector->name,
+				connector->state->link_status, status);
+			connector->state->link_status = status;
+			mutex_unlock(&connector->dev->mode_config.mutex);
+		}
+
+		rmstb = drm_dp_mst_topology_get_mstb_validated(mstb->mgr,
+							       port->mstb);
+		if (rmstb) {
+			drm_dp_set_mst_topology_link_status(rmstb->mgr, status);
+			drm_dp_mst_topology_put_mstb(rmstb);
+		}
+	}
+}
+EXPORT_SYMBOL(drm_dp_set_mst_topology_link_status);
+
 /**
  * drm_dp_read_mst_cap() - check whether or not a sink supports MST
  * @aux: The DP AUX channel to use
diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h
index ed5c9660563c..855d488bf364 100644
--- a/include/drm/display/drm_dp_mst_helper.h
+++ b/include/drm/display/drm_dp_mst_helper.h
@@ -832,6 +832,9 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector *connector,
 int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
 			     int link_rate, int link_lane_count);

+void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
+					 enum drm_link_status status);
+
 int drm_dp_calc_pbn_mode(int clock, int bpp, bool dsc);

 void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, uint8_t link_encoding_cap);
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Intel-gfx] [PATCH v4 4/6] drm/i915: Move DP modeset_retry_work into intel_dp
  2023-08-24 20:50 ` Gil Dekel
@ 2023-08-24 20:50   ` Gil Dekel
  -1 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel

Currently, link-training fallback is only implemented for SST, so having
modeset_retry_work in intel_connector makes sense. However, we hope to
implement link training fallback for MST in a follow-up patchset, so
moving modeset_retry_work to indel_dp will make handling both SST and
MST connectors simpler. This patch does exactly that, and updates all
modeset_retry_work dependencies to use an intel_dp instead.

Credit: this patch is a rebase of Lyude Pual's original patch:
https://patchwork.freedesktop.org/patch/216627/?series=41576&rev=3

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_display.c       | 14 +++++++++++---
 drivers/gpu/drm/i915/display/intel_display_types.h |  6 +++---
 drivers/gpu/drm/i915/display/intel_dp.c            | 11 ++++-------
 .../gpu/drm/i915/display/intel_dp_link_training.c  |  3 +--
 4 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index db3c26e013e3..2ec75aa0b4ee 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -7962,20 +7962,28 @@ void i830_disable_pipe(struct drm_i915_private *dev_priv, enum pipe pipe)

 void intel_hpd_poll_fini(struct drm_i915_private *i915)
 {
-	struct intel_connector *connector;
 	struct drm_connector_list_iter conn_iter;
+	struct intel_connector *connector;
+	struct intel_dp *intel_dp;
+	struct intel_encoder *encoder;

 	/* Kill all the work that may have been queued by hpd. */
 	drm_connector_list_iter_begin(&i915->drm, &conn_iter);
 	for_each_intel_connector_iter(connector, &conn_iter) {
-		if (connector->modeset_retry_work.func)
-			cancel_work_sync(&connector->modeset_retry_work);
 		if (connector->hdcp.shim) {
 			cancel_delayed_work_sync(&connector->hdcp.check_work);
 			cancel_work_sync(&connector->hdcp.prop_work);
 		}
 	}
 	drm_connector_list_iter_end(&conn_iter);
+
+	for_each_intel_dp(&i915->drm, encoder) {
+		if (encoder->type == DRM_MODE_CONNECTOR_eDP ||
+		    encoder->type == DRM_MODE_CONNECTOR_DisplayPort) {
+			intel_dp = enc_to_intel_dp(encoder);
+			cancel_work_sync(&intel_dp->modeset_retry_work);
+		}
+	}
 }

 bool intel_scanout_needs_vtd_wa(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
index 731f2ec04d5c..b92bb69a3fe4 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -620,9 +620,6 @@ struct intel_connector {

 	struct intel_dp *mst_port;

-	/* Work struct to schedule a uevent on link train failure */
-	struct work_struct modeset_retry_work;
-
 	struct intel_hdcp hdcp;
 };

@@ -1779,6 +1776,9 @@ struct intel_dp {
 	/* Displayport compliance testing */
 	struct intel_dp_compliance compliance;

+	/* Work struct to schedule a uevent on link train failure */
+	struct work_struct modeset_retry_work;
+
 	/* Downstream facing port caps */
 	struct {
 		int min_tmds_clock, max_tmds_clock;
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 01b180c8d9bd..42353b1ac487 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -5992,12 +5992,9 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,

 static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 {
-	struct intel_connector *intel_connector;
-	struct drm_connector *connector;
-
-	intel_connector = container_of(work, typeof(*intel_connector),
-				       modeset_retry_work);
-	connector = &intel_connector->base;
+	struct intel_dp *intel_dp =
+		container_of(work, typeof(*intel_dp), modeset_retry_work);
+	struct drm_connector *connector = &intel_dp->attached_connector->base;
 	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
 		    connector->name);

@@ -6027,7 +6024,7 @@ intel_dp_init_connector(struct intel_digital_port *dig_port,
 	int type;

 	/* Initialize the work for modeset in case of link train failure */
-	INIT_WORK(&intel_connector->modeset_retry_work,
+	INIT_WORK(&intel_dp->modeset_retry_work,
 		  intel_dp_modeset_retry_work_fn);

 	if (drm_WARN(dev, dig_port->max_lanes < 1,
diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
index 31d0d7854003..87d13cd03ef5 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
@@ -1063,7 +1063,6 @@ intel_dp_link_train_phy(struct intel_dp *intel_dp,
 static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 						     const struct intel_crtc_state *crtc_state)
 {
-	struct intel_connector *intel_connector = intel_dp->attached_connector;
 	struct drm_i915_private *i915 = dp_to_i915(intel_dp);

 	if (!intel_digital_port_connected(&dp_to_dig_port(intel_dp)->base)) {
@@ -1082,7 +1081,7 @@ static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 	}

 	/* Schedule a Hotplug Uevent to userspace to start modeset */
-	queue_work(i915->unordered_wq, &intel_connector->modeset_retry_work);
+	queue_work(i915->unordered_wq, &intel_dp->modeset_retry_work);
 }

 /* Perform the link training on all LTTPRs and the DPRX on a link. */
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 4/6] drm/i915: Move DP modeset_retry_work into intel_dp
@ 2023-08-24 20:50   ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel, navaremanasi

Currently, link-training fallback is only implemented for SST, so having
modeset_retry_work in intel_connector makes sense. However, we hope to
implement link training fallback for MST in a follow-up patchset, so
moving modeset_retry_work to indel_dp will make handling both SST and
MST connectors simpler. This patch does exactly that, and updates all
modeset_retry_work dependencies to use an intel_dp instead.

Credit: this patch is a rebase of Lyude Pual's original patch:
https://patchwork.freedesktop.org/patch/216627/?series=41576&rev=3

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_display.c       | 14 +++++++++++---
 drivers/gpu/drm/i915/display/intel_display_types.h |  6 +++---
 drivers/gpu/drm/i915/display/intel_dp.c            | 11 ++++-------
 .../gpu/drm/i915/display/intel_dp_link_training.c  |  3 +--
 4 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index db3c26e013e3..2ec75aa0b4ee 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -7962,20 +7962,28 @@ void i830_disable_pipe(struct drm_i915_private *dev_priv, enum pipe pipe)

 void intel_hpd_poll_fini(struct drm_i915_private *i915)
 {
-	struct intel_connector *connector;
 	struct drm_connector_list_iter conn_iter;
+	struct intel_connector *connector;
+	struct intel_dp *intel_dp;
+	struct intel_encoder *encoder;

 	/* Kill all the work that may have been queued by hpd. */
 	drm_connector_list_iter_begin(&i915->drm, &conn_iter);
 	for_each_intel_connector_iter(connector, &conn_iter) {
-		if (connector->modeset_retry_work.func)
-			cancel_work_sync(&connector->modeset_retry_work);
 		if (connector->hdcp.shim) {
 			cancel_delayed_work_sync(&connector->hdcp.check_work);
 			cancel_work_sync(&connector->hdcp.prop_work);
 		}
 	}
 	drm_connector_list_iter_end(&conn_iter);
+
+	for_each_intel_dp(&i915->drm, encoder) {
+		if (encoder->type == DRM_MODE_CONNECTOR_eDP ||
+		    encoder->type == DRM_MODE_CONNECTOR_DisplayPort) {
+			intel_dp = enc_to_intel_dp(encoder);
+			cancel_work_sync(&intel_dp->modeset_retry_work);
+		}
+	}
 }

 bool intel_scanout_needs_vtd_wa(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
index 731f2ec04d5c..b92bb69a3fe4 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -620,9 +620,6 @@ struct intel_connector {

 	struct intel_dp *mst_port;

-	/* Work struct to schedule a uevent on link train failure */
-	struct work_struct modeset_retry_work;
-
 	struct intel_hdcp hdcp;
 };

@@ -1779,6 +1776,9 @@ struct intel_dp {
 	/* Displayport compliance testing */
 	struct intel_dp_compliance compliance;

+	/* Work struct to schedule a uevent on link train failure */
+	struct work_struct modeset_retry_work;
+
 	/* Downstream facing port caps */
 	struct {
 		int min_tmds_clock, max_tmds_clock;
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 01b180c8d9bd..42353b1ac487 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -5992,12 +5992,9 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,

 static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 {
-	struct intel_connector *intel_connector;
-	struct drm_connector *connector;
-
-	intel_connector = container_of(work, typeof(*intel_connector),
-				       modeset_retry_work);
-	connector = &intel_connector->base;
+	struct intel_dp *intel_dp =
+		container_of(work, typeof(*intel_dp), modeset_retry_work);
+	struct drm_connector *connector = &intel_dp->attached_connector->base;
 	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
 		    connector->name);

@@ -6027,7 +6024,7 @@ intel_dp_init_connector(struct intel_digital_port *dig_port,
 	int type;

 	/* Initialize the work for modeset in case of link train failure */
-	INIT_WORK(&intel_connector->modeset_retry_work,
+	INIT_WORK(&intel_dp->modeset_retry_work,
 		  intel_dp_modeset_retry_work_fn);

 	if (drm_WARN(dev, dig_port->max_lanes < 1,
diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
index 31d0d7854003..87d13cd03ef5 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
@@ -1063,7 +1063,6 @@ intel_dp_link_train_phy(struct intel_dp *intel_dp,
 static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 						     const struct intel_crtc_state *crtc_state)
 {
-	struct intel_connector *intel_connector = intel_dp->attached_connector;
 	struct drm_i915_private *i915 = dp_to_i915(intel_dp);

 	if (!intel_digital_port_connected(&dp_to_dig_port(intel_dp)->base)) {
@@ -1082,7 +1081,7 @@ static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 	}

 	/* Schedule a Hotplug Uevent to userspace to start modeset */
-	queue_work(i915->unordered_wq, &intel_connector->modeset_retry_work);
+	queue_work(i915->unordered_wq, &intel_dp->modeset_retry_work);
 }

 /* Perform the link training on all LTTPRs and the DPRX on a link. */
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Intel-gfx] [PATCH v4 5/6] drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying
  2023-08-24 20:50 ` Gil Dekel
@ 2023-08-24 20:50   ` Gil Dekel
  -1 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel

Before sending a uevent to userspace in order to trigger a corrective
modeset, we change the failing connector's link-status to BAD. However,
the downstream MST branch ports are left in their original GOOD state.

This patch utilizes the drm helper function
drm_dp_set_mst_topology_link_status() to rectify this and set all
downstream MST connectors' link-status to BAD before emitting the uevent
to userspace.

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_dp.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 42353b1ac487..e8b10f59e141 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -5995,16 +5995,20 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	struct intel_dp *intel_dp =
 		container_of(work, typeof(*intel_dp), modeset_retry_work);
 	struct drm_connector *connector = &intel_dp->attached_connector->base;
-	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
-		    connector->name);

-	/* Grab the locks before changing connector property*/
-	mutex_lock(&connector->dev->mode_config.mutex);
-	/* Set connector link status to BAD and send a Uevent to notify
-	 * userspace to do a modeset.
+	/* Set the connector's (and possibly all its downstream MST ports') link
+	 * status to BAD.
 	 */
+	mutex_lock(&connector->dev->mode_config.mutex);
+	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s] link status %d -> %d\n",
+		    connector->base.id, connector->name,
+		    connector->state->link_status, DRM_MODE_LINK_STATUS_BAD);
 	drm_connector_set_link_status_property(connector,
 					       DRM_MODE_LINK_STATUS_BAD);
+	if (intel_dp->is_mst) {
+		drm_dp_set_mst_topology_link_status(&intel_dp->mst_mgr,
+						    DRM_MODE_LINK_STATUS_BAD);
+	}
 	mutex_unlock(&connector->dev->mode_config.mutex);
 	/* Send Hotplug uevent so userspace can reprobe */
 	drm_kms_helper_connector_hotplug_event(connector);
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 5/6] drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying
@ 2023-08-24 20:50   ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel, navaremanasi

Before sending a uevent to userspace in order to trigger a corrective
modeset, we change the failing connector's link-status to BAD. However,
the downstream MST branch ports are left in their original GOOD state.

This patch utilizes the drm helper function
drm_dp_set_mst_topology_link_status() to rectify this and set all
downstream MST connectors' link-status to BAD before emitting the uevent
to userspace.

Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_dp.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 42353b1ac487..e8b10f59e141 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -5995,16 +5995,20 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	struct intel_dp *intel_dp =
 		container_of(work, typeof(*intel_dp), modeset_retry_work);
 	struct drm_connector *connector = &intel_dp->attached_connector->base;
-	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
-		    connector->name);

-	/* Grab the locks before changing connector property*/
-	mutex_lock(&connector->dev->mode_config.mutex);
-	/* Set connector link status to BAD and send a Uevent to notify
-	 * userspace to do a modeset.
+	/* Set the connector's (and possibly all its downstream MST ports') link
+	 * status to BAD.
 	 */
+	mutex_lock(&connector->dev->mode_config.mutex);
+	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s] link status %d -> %d\n",
+		    connector->base.id, connector->name,
+		    connector->state->link_status, DRM_MODE_LINK_STATUS_BAD);
 	drm_connector_set_link_status_property(connector,
 					       DRM_MODE_LINK_STATUS_BAD);
+	if (intel_dp->is_mst) {
+		drm_dp_set_mst_topology_link_status(&intel_dp->mst_mgr,
+						    DRM_MODE_LINK_STATUS_BAD);
+	}
 	mutex_unlock(&connector->dev->mode_config.mutex);
 	/* Send Hotplug uevent so userspace can reprobe */
 	drm_kms_helper_connector_hotplug_event(connector);
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Intel-gfx] [PATCH v4 6/6] drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger property
  2023-08-24 20:50 ` Gil Dekel
@ 2023-08-24 20:50   ` Gil Dekel
  -1 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel

When a link-training attempt fails, emit a uevent to user space that
includes the trigger property, which in this case will be
link-statue=Bad.

This will allow userspace to parse the uevent property and better
understand the reason for the previous modeset failure.

Signed-off-by: Gil Dekel <gildekel@chromium.org>

V2:
  - init link_status_property inline.
---
 drivers/gpu/drm/i915/display/intel_dp.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index e8b10f59e141..328e9f030033 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -42,6 +42,7 @@
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_probe_helper.h>
+#include <drm/drm_sysfs.h>

 #include "g4x_dp.h"
 #include "i915_drv.h"
@@ -5995,6 +5996,8 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	struct intel_dp *intel_dp =
 		container_of(work, typeof(*intel_dp), modeset_retry_work);
 	struct drm_connector *connector = &intel_dp->attached_connector->base;
+	struct drm_property *link_status_property =
+		connector->dev->mode_config.link_status_property;

 	/* Set the connector's (and possibly all its downstream MST ports') link
 	 * status to BAD.
@@ -6011,7 +6014,7 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	}
 	mutex_unlock(&connector->dev->mode_config.mutex);
 	/* Send Hotplug uevent so userspace can reprobe */
-	drm_kms_helper_connector_hotplug_event(connector);
+	drm_sysfs_connector_property_event(connector, link_status_property);
 }

 bool
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 6/6] drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger property
@ 2023-08-24 20:50   ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-08-24 20:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: seanpaul, Gil Dekel, navaremanasi

When a link-training attempt fails, emit a uevent to user space that
includes the trigger property, which in this case will be
link-statue=Bad.

This will allow userspace to parse the uevent property and better
understand the reason for the previous modeset failure.

Signed-off-by: Gil Dekel <gildekel@chromium.org>

V2:
  - init link_status_property inline.
---
 drivers/gpu/drm/i915/display/intel_dp.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index e8b10f59e141..328e9f030033 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -42,6 +42,7 @@
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_probe_helper.h>
+#include <drm/drm_sysfs.h>

 #include "g4x_dp.h"
 #include "i915_drv.h"
@@ -5995,6 +5996,8 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	struct intel_dp *intel_dp =
 		container_of(work, typeof(*intel_dp), modeset_retry_work);
 	struct drm_connector *connector = &intel_dp->attached_connector->base;
+	struct drm_property *link_status_property =
+		connector->dev->mode_config.link_status_property;

 	/* Set the connector's (and possibly all its downstream MST ports') link
 	 * status to BAD.
@@ -6011,7 +6014,7 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	}
 	mutex_unlock(&connector->dev->mode_config.mutex);
 	/* Send Hotplug uevent so userspace can reprobe */
-	drm_kms_helper_connector_hotplug_event(connector);
+	drm_sysfs_connector_property_event(connector, link_status_property);
 }

 bool
--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/dp_link_training: Define a final failure state when link training fails (rev2)
  2023-08-24 20:50 ` Gil Dekel
                   ` (6 preceding siblings ...)
  (?)
@ 2023-08-24 22:10 ` Patchwork
  -1 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2023-08-24 22:10 UTC (permalink / raw)
  To: Gil Dekel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/dp_link_training: Define a final failure state when link training fails (rev2)
URL   : https://patchwork.freedesktop.org/series/122850/
State : warning

== Summary ==

Error: dim checkpatch failed
/home/kbuild2/linux/maintainer-tools/dim: line 50: /home/kbuild2/.dimrc: No such file or directory



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/dp_link_training: Define a final failure state when link training fails (rev2)
  2023-08-24 20:50 ` Gil Dekel
                   ` (7 preceding siblings ...)
  (?)
@ 2023-08-24 22:10 ` Patchwork
  -1 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2023-08-24 22:10 UTC (permalink / raw)
  To: Gil Dekel; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/dp_link_training: Define a final failure state when link training fails (rev2)
URL   : https://patchwork.freedesktop.org/series/122850/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/dp_link_training: Define a final failure state when link training fails (rev2)
  2023-08-24 20:50 ` Gil Dekel
                   ` (8 preceding siblings ...)
  (?)
@ 2023-08-24 22:29 ` Patchwork
  -1 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2023-08-24 22:29 UTC (permalink / raw)
  To: Gil Dekel; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 5388 bytes --]

== Series Details ==

Series: drm/i915/dp_link_training: Define a final failure state when link training fails (rev2)
URL   : https://patchwork.freedesktop.org/series/122850/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_13562 -> Patchwork_122850v2
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_122850v2 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_122850v2, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/index.html

Participating hosts (41 -> 39)
------------------------------

  Missing    (2): fi-kbl-soraka fi-snb-2520m 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_122850v2:

### IGT changes ###

#### Possible regressions ####

  * igt@dmabuf@all-tests@dma_fence:
    - fi-hsw-4770:        [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13562/fi-hsw-4770/igt@dmabuf@all-tests@dma_fence.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/fi-hsw-4770/igt@dmabuf@all-tests@dma_fence.html

  * igt@dmabuf@all-tests@sanitycheck:
    - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13562/fi-hsw-4770/igt@dmabuf@all-tests@sanitycheck.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/fi-hsw-4770/igt@dmabuf@all-tests@sanitycheck.html

  
Known issues
------------

  Here are the changes found in Patchwork_122850v2 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s0@lmem0:
    - bat-dg2-9:          [PASS][5] -> [INCOMPLETE][6] ([i915#6311])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13562/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html

  * igt@i915_suspend@basic-s3-without-i915:
    - bat-adlm-1:         NOTRUN -> [INCOMPLETE][7] ([i915#7443])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/bat-adlm-1/igt@i915_suspend@basic-s3-without-i915.html
    - bat-mtlp-8:         NOTRUN -> [SKIP][8] ([i915#6645])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/bat-mtlp-8/igt@i915_suspend@basic-s3-without-i915.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@requests:
    - bat-mtlp-8:         [ABORT][9] ([i915#7982] / [i915#8865]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13562/bat-mtlp-8/igt@i915_selftest@live@requests.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/bat-mtlp-8/igt@i915_selftest@live@requests.html

  * igt@i915_selftest@live@workarounds:
    - bat-adlm-1:         [INCOMPLETE][11] ([i915#4983] / [i915#7677]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13562/bat-adlm-1/igt@i915_selftest@live@workarounds.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/bat-adlm-1/igt@i915_selftest@live@workarounds.html

  
#### Warnings ####

  * igt@kms_psr@cursor_plane_move:
    - bat-rplp-1:         [ABORT][13] ([i915#8469] / [i915#8668]) -> [ABORT][14] ([i915#8668])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13562/bat-rplp-1/igt@kms_psr@cursor_plane_move.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/bat-rplp-1/igt@kms_psr@cursor_plane_move.html

  
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#6311]: https://gitlab.freedesktop.org/drm/intel/issues/6311
  [i915#6645]: https://gitlab.freedesktop.org/drm/intel/issues/6645
  [i915#7443]: https://gitlab.freedesktop.org/drm/intel/issues/7443
  [i915#7677]: https://gitlab.freedesktop.org/drm/intel/issues/7677
  [i915#7982]: https://gitlab.freedesktop.org/drm/intel/issues/7982
  [i915#8469]: https://gitlab.freedesktop.org/drm/intel/issues/8469
  [i915#8668]: https://gitlab.freedesktop.org/drm/intel/issues/8668
  [i915#8865]: https://gitlab.freedesktop.org/drm/intel/issues/8865


Build changes
-------------

  * Linux: CI_DRM_13562 -> Patchwork_122850v2

  CI-20190529: 20190529
  CI_DRM_13562: 6cd46255547ba72bb6cc6aab91c905b1dec95696 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7451: 5d48d1fb231f449fe2f80cda14ea7a1ecfda59fa @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_122850v2: 6cd46255547ba72bb6cc6aab91c905b1dec95696 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

64815520501e drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger property
ccc072ffe679 drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying
8ca4bfecd886 drm/i915: Move DP modeset_retry_work into intel_dp
f9fa0176ea80 drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
9a36725f2f36 drm/i915/dp_link_training: Add a final failing state to link training fallback for MST
586040501769 drm/i915/dp_link_training: Add a final failing state to link training fallback

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_122850v2/index.html

[-- Attachment #2: Type: text/html, Size: 6290 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 3/6] drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
  2023-08-24 20:50   ` Gil Dekel
@ 2023-08-30 21:32     ` Lyude Paul
  -1 siblings, 0 replies; 31+ messages in thread
From: Lyude Paul @ 2023-08-30 21:32 UTC (permalink / raw)
  To: Gil Dekel, intel-gfx, dri-devel; +Cc: seanpaul

On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> Unlike SST, MST can support multiple displays connected to a single
> connector. However, this also means that if the DisplayPort link to the
> top-level MST branch device becomes unstable, then every single branch
> device has an unstable link.
> 
> Since there are multiple downstream ports per connector, setting the
> link status of the parent mstb's port to BAD is not enough. All of the
> downstream mstb ports must also have their link status set to BAD.
> 
> This aligns to how the DP link status logic in DRM works. We notify
> userspace that all of the mstb ports need retraining and apply new lower
> bandwidth constraints to all future atomic commits on the topology that
> follow.
> 
> Since any driver supporting MST needs to figure out which connectors
> live downstream on an MST topology and update their link status in order
> to retrain MST links properly, we add the
> drm_dp_set_mst_topology_link_status() helper. This helper simply marks
> the link status of all connectors living in that topology as bad. We
> will make use of this helper in i915 later in this series.
> 
> Credit: this patch is a refactor of Lyude Pual's original patch:
> https://patchwork.kernel.org/project/dri-devel/patch/20180308232421.14049-5-lyude@redhat.com/

s/Pual/Paul/ (probably want to fix this on the other patches in the series as
well)

> 
> Signed-off-by: Gil Dekel <gildekel@chromium.org>
> ---
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 +++++++++++++++++++
>  include/drm/display/drm_dp_mst_helper.h       |  3 ++
>  2 files changed, 42 insertions(+)
> 
> diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> index ed96cfcfa304..17cbadfb6ccb 100644
> --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> @@ -3566,6 +3566,45 @@ int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
>  }
>  EXPORT_SYMBOL(drm_dp_get_vc_payload_bw);
> 
> +/**
> + * drm_dp_set_mst_topology_link_status() - set all downstream MST ports' link status
> + * @mgr: MST topology manager to set state for
> + * @status: The new status to set the MST topology to
> + *
> + * Set all downstream ports' link-status within the topology to the given status.
> + */
> +void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
> +					 enum drm_link_status status)
> +{
> +	struct drm_dp_mst_port *port;
> +	struct drm_dp_mst_branch *rmstb;
> +	struct drm_dp_mst_branch *mstb =
> +		drm_dp_mst_topology_get_mstb_validated(mgr, mgr->mst_primary);
> +
> +	list_for_each_entry_reverse(port, &mstb->ports, next) {
> +		struct drm_connector *connector = port->connector;
> +
> +		if (connector) {
> +			mutex_lock(&connector->dev->mode_config.mutex);
> +			drm_dbg_kms(
> +				connector->dev,
> +				"[MST-CONNECTOR:%d:%s] link status %d -> %d\n",
> +				connector->base.id, connector->name,
> +				connector->state->link_status, status);
> +			connector->state->link_status = status;
> +			mutex_unlock(&connector->dev->mode_config.mutex);
> +		}
> +
> +		rmstb = drm_dp_mst_topology_get_mstb_validated(mstb->mgr,
> +							       port->mstb);
> +		if (rmstb) {
> +			drm_dp_set_mst_topology_link_status(rmstb->mgr, status);
> +			drm_dp_mst_topology_put_mstb(rmstb);
> +		}
> +	}
> +}
> +EXPORT_SYMBOL(drm_dp_set_mst_topology_link_status);
> +
>  /**
>   * drm_dp_read_mst_cap() - check whether or not a sink supports MST
>   * @aux: The DP AUX channel to use
> diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h
> index ed5c9660563c..855d488bf364 100644
> --- a/include/drm/display/drm_dp_mst_helper.h
> +++ b/include/drm/display/drm_dp_mst_helper.h
> @@ -832,6 +832,9 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector *connector,
>  int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
>  			     int link_rate, int link_lane_count);
> 
> +void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
> +					 enum drm_link_status status);
> +
>  int drm_dp_calc_pbn_mode(int clock, int bpp, bool dsc);
> 
>  void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, uint8_t link_encoding_cap);
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/6] drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
@ 2023-08-30 21:32     ` Lyude Paul
  0 siblings, 0 replies; 31+ messages in thread
From: Lyude Paul @ 2023-08-30 21:32 UTC (permalink / raw)
  To: Gil Dekel, intel-gfx, dri-devel; +Cc: seanpaul, navaremanasi

On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> Unlike SST, MST can support multiple displays connected to a single
> connector. However, this also means that if the DisplayPort link to the
> top-level MST branch device becomes unstable, then every single branch
> device has an unstable link.
> 
> Since there are multiple downstream ports per connector, setting the
> link status of the parent mstb's port to BAD is not enough. All of the
> downstream mstb ports must also have their link status set to BAD.
> 
> This aligns to how the DP link status logic in DRM works. We notify
> userspace that all of the mstb ports need retraining and apply new lower
> bandwidth constraints to all future atomic commits on the topology that
> follow.
> 
> Since any driver supporting MST needs to figure out which connectors
> live downstream on an MST topology and update their link status in order
> to retrain MST links properly, we add the
> drm_dp_set_mst_topology_link_status() helper. This helper simply marks
> the link status of all connectors living in that topology as bad. We
> will make use of this helper in i915 later in this series.
> 
> Credit: this patch is a refactor of Lyude Pual's original patch:
> https://patchwork.kernel.org/project/dri-devel/patch/20180308232421.14049-5-lyude@redhat.com/

s/Pual/Paul/ (probably want to fix this on the other patches in the series as
well)

> 
> Signed-off-by: Gil Dekel <gildekel@chromium.org>
> ---
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 +++++++++++++++++++
>  include/drm/display/drm_dp_mst_helper.h       |  3 ++
>  2 files changed, 42 insertions(+)
> 
> diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> index ed96cfcfa304..17cbadfb6ccb 100644
> --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> @@ -3566,6 +3566,45 @@ int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
>  }
>  EXPORT_SYMBOL(drm_dp_get_vc_payload_bw);
> 
> +/**
> + * drm_dp_set_mst_topology_link_status() - set all downstream MST ports' link status
> + * @mgr: MST topology manager to set state for
> + * @status: The new status to set the MST topology to
> + *
> + * Set all downstream ports' link-status within the topology to the given status.
> + */
> +void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
> +					 enum drm_link_status status)
> +{
> +	struct drm_dp_mst_port *port;
> +	struct drm_dp_mst_branch *rmstb;
> +	struct drm_dp_mst_branch *mstb =
> +		drm_dp_mst_topology_get_mstb_validated(mgr, mgr->mst_primary);
> +
> +	list_for_each_entry_reverse(port, &mstb->ports, next) {
> +		struct drm_connector *connector = port->connector;
> +
> +		if (connector) {
> +			mutex_lock(&connector->dev->mode_config.mutex);
> +			drm_dbg_kms(
> +				connector->dev,
> +				"[MST-CONNECTOR:%d:%s] link status %d -> %d\n",
> +				connector->base.id, connector->name,
> +				connector->state->link_status, status);
> +			connector->state->link_status = status;
> +			mutex_unlock(&connector->dev->mode_config.mutex);
> +		}
> +
> +		rmstb = drm_dp_mst_topology_get_mstb_validated(mstb->mgr,
> +							       port->mstb);
> +		if (rmstb) {
> +			drm_dp_set_mst_topology_link_status(rmstb->mgr, status);
> +			drm_dp_mst_topology_put_mstb(rmstb);
> +		}
> +	}
> +}
> +EXPORT_SYMBOL(drm_dp_set_mst_topology_link_status);
> +
>  /**
>   * drm_dp_read_mst_cap() - check whether or not a sink supports MST
>   * @aux: The DP AUX channel to use
> diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h
> index ed5c9660563c..855d488bf364 100644
> --- a/include/drm/display/drm_dp_mst_helper.h
> +++ b/include/drm/display/drm_dp_mst_helper.h
> @@ -832,6 +832,9 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector *connector,
>  int drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr,
>  			     int link_rate, int link_lane_count);
> 
> +void drm_dp_set_mst_topology_link_status(struct drm_dp_mst_topology_mgr *mgr,
> +					 enum drm_link_status status);
> +
>  int drm_dp_calc_pbn_mode(int clock, int bpp, bool dsc);
> 
>  void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, uint8_t link_encoding_cap);
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails
  2023-08-24 20:50 ` Gil Dekel
@ 2023-08-30 21:41   ` Lyude Paul
  -1 siblings, 0 replies; 31+ messages in thread
From: Lyude Paul @ 2023-08-30 21:41 UTC (permalink / raw)
  To: Gil Dekel, intel-gfx, dri-devel; +Cc: seanpaul

Other then the name typo (s/Pual/Paul):

Signed-off-by: Lyude Paul <lyude@redhat.com> (just since I co-authored
things~)
Reviewed-by: Lyude Paul <lyude@redhat.com>

I think we definitely want to make sure we get intel's opinions on this
though, especially regarding the usage of link-status. I think we're close
enough to link-status's intended purpose, but I definitely would like to know
what others think about that since userspace will definitely have to handle
situations like this a bit differently than with SST.

Also - definitely make sure you take a look at Imre's patch series that's
currently on the list (I just finished reviewing it), since it adds some
things to the helpers that might end up being useful here :)

https://patchwork.freedesktop.org/series/122589/

On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> Next version of https://patchwork.freedesktop.org/series/122850/
> 
> v4:
>   Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
>   instead of drm-tip/drm-tip. Apologies for the noise :(
> 
> v3:
>   Still learning the ropes of upstream workflow. Apologies for mucking up v2.
>   This is just a re-upload.
> 
> v2:
>   Reorganize into:
>   1) Add for final failure state for SST and MST link training fallback.
>   2) Add a DRM helper for setting downstream MST ports' link-status state.
>   3) Make handling SST and MST connectors simpler via intel_dp.
>   4) Update link-status for downstream MST ports.
>   5) Emit a uevent with the "link-status" trigger property.
> 
> v1:
> Currently, when link training fails after all fallback values have been
> exhausted, the i915 driver seizes to send uevents to userspace. This leave
> userspace thinking that the last passing atomic commit was successful, and that
> all connectors (displays) are connected and operational, when in fact, the last
> link failed to train and the displays remain dark. This manifests as "zombie"
> displays in userspace, in which users observe the displays appear in their
> display settings page, but they are dark and unresponsive.
> 
> Since, at the time of writing, MST link training fallback is not implemented,
> failing MST link training is a significantly more common case then a complete
> SST link training failure. And with users using MST hubs more than ever to
> connect multiple displays via their USB-C ports we observe this case often.
> 
> This patchset series suggest a solution, in which a final failure state is
> defined. In this final state, the connector's bit rate capabilities, namely
> max_link_rate and max_link_lane_count, are set to 0. This effectively set the
> connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
> following connector probing.
> 
> Next, with this state defined, we emit a link-status=Bad uevent. The next time
> userspace probes the connector, it should recognize that the connector has no
> modes and ignore it since it is in a bad state.
> 
> I am aware that always sending a uevent and never stopping may result in some
> userspaces having their expectations broken and enter an infinite loop of
> modesets and link-training attempts. However, per DRM link-status spec:
> ```
>  * link-status:
>  *      Connector link-status property to indicate the status of link. The
>  *      default value of link-status is "GOOD". If something fails during or
>  *      after modeset, the kernel driver may set this to "BAD" and issue a
>  *      hotplug uevent. Drivers should update this value using
>  *      drm_connector_set_link_status_property().
>  *
>  *      When user-space receives the hotplug uevent and detects a "BAD"
>  *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
>  *      becomes completely black). The list of available modes may have
>  *      changed. User-space is expected to pick a new mode if the current one
>  *      has disappeared and perform a new modeset with link-status set to
>  *      "GOOD" to re-enable the connector.
> ```
> (form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)
> 
> it seems reasonable to assume that the suggested state is an extension of the
> spec's guidelines, in which the next new mode userspace picks for a connector
> with no modes is - none, thus breaking the cycle of failed link-training
> attempts.
> 
> I suspect that, maybe, zeroing out the bit rate capabilities is not the right
> way to go, and perhaps marking the connector as disconnected instead may be a
> better solution. However, if marking a connector disconnected is the way to go,
> We will have to iterate over all MST ports in the MST case and mark the spawned
> connectors as disconnected as well.

I -think- this is probably fine, that's likely how I'd 

> 
> As a final note I should add that this approach was tested with ChromeOS as
> userspace, and we observed that the zombie displays stop showing up once the
> connectors are pruned of all their modes and are ignored by userspace.
> 
> For your consideration and guidance.
> Thanks,
> 
> Gil Dekel (6):
>   drm/i915/dp_link_training: Add a final failing state to link training
>     fallback
>   drm/i915/dp_link_training: Add a final failing state to link training
>     fallback for MST
>   drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
>   drm/i915: Move DP modeset_retry_work into intel_dp
>   drm/i915/dp_link_training: Set all downstream MST ports to BAD before
>     retrying
>   drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
>     property
> 
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
>  drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
>  .../drm/i915/display/intel_display_types.h    |  6 +-
>  drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
>  drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
>  .../drm/i915/display/intel_dp_link_training.c | 11 ++-
>  include/drm/display/drm_dp_mst_helper.h       |  3 +
>  7 files changed, 110 insertions(+), 40 deletions(-)
> 
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails
@ 2023-08-30 21:41   ` Lyude Paul
  0 siblings, 0 replies; 31+ messages in thread
From: Lyude Paul @ 2023-08-30 21:41 UTC (permalink / raw)
  To: Gil Dekel, intel-gfx, dri-devel; +Cc: seanpaul, navaremanasi

Other then the name typo (s/Pual/Paul):

Signed-off-by: Lyude Paul <lyude@redhat.com> (just since I co-authored
things~)
Reviewed-by: Lyude Paul <lyude@redhat.com>

I think we definitely want to make sure we get intel's opinions on this
though, especially regarding the usage of link-status. I think we're close
enough to link-status's intended purpose, but I definitely would like to know
what others think about that since userspace will definitely have to handle
situations like this a bit differently than with SST.

Also - definitely make sure you take a look at Imre's patch series that's
currently on the list (I just finished reviewing it), since it adds some
things to the helpers that might end up being useful here :)

https://patchwork.freedesktop.org/series/122589/

On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> Next version of https://patchwork.freedesktop.org/series/122850/
> 
> v4:
>   Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
>   instead of drm-tip/drm-tip. Apologies for the noise :(
> 
> v3:
>   Still learning the ropes of upstream workflow. Apologies for mucking up v2.
>   This is just a re-upload.
> 
> v2:
>   Reorganize into:
>   1) Add for final failure state for SST and MST link training fallback.
>   2) Add a DRM helper for setting downstream MST ports' link-status state.
>   3) Make handling SST and MST connectors simpler via intel_dp.
>   4) Update link-status for downstream MST ports.
>   5) Emit a uevent with the "link-status" trigger property.
> 
> v1:
> Currently, when link training fails after all fallback values have been
> exhausted, the i915 driver seizes to send uevents to userspace. This leave
> userspace thinking that the last passing atomic commit was successful, and that
> all connectors (displays) are connected and operational, when in fact, the last
> link failed to train and the displays remain dark. This manifests as "zombie"
> displays in userspace, in which users observe the displays appear in their
> display settings page, but they are dark and unresponsive.
> 
> Since, at the time of writing, MST link training fallback is not implemented,
> failing MST link training is a significantly more common case then a complete
> SST link training failure. And with users using MST hubs more than ever to
> connect multiple displays via their USB-C ports we observe this case often.
> 
> This patchset series suggest a solution, in which a final failure state is
> defined. In this final state, the connector's bit rate capabilities, namely
> max_link_rate and max_link_lane_count, are set to 0. This effectively set the
> connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
> following connector probing.
> 
> Next, with this state defined, we emit a link-status=Bad uevent. The next time
> userspace probes the connector, it should recognize that the connector has no
> modes and ignore it since it is in a bad state.
> 
> I am aware that always sending a uevent and never stopping may result in some
> userspaces having their expectations broken and enter an infinite loop of
> modesets and link-training attempts. However, per DRM link-status spec:
> ```
>  * link-status:
>  *      Connector link-status property to indicate the status of link. The
>  *      default value of link-status is "GOOD". If something fails during or
>  *      after modeset, the kernel driver may set this to "BAD" and issue a
>  *      hotplug uevent. Drivers should update this value using
>  *      drm_connector_set_link_status_property().
>  *
>  *      When user-space receives the hotplug uevent and detects a "BAD"
>  *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
>  *      becomes completely black). The list of available modes may have
>  *      changed. User-space is expected to pick a new mode if the current one
>  *      has disappeared and perform a new modeset with link-status set to
>  *      "GOOD" to re-enable the connector.
> ```
> (form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)
> 
> it seems reasonable to assume that the suggested state is an extension of the
> spec's guidelines, in which the next new mode userspace picks for a connector
> with no modes is - none, thus breaking the cycle of failed link-training
> attempts.
> 
> I suspect that, maybe, zeroing out the bit rate capabilities is not the right
> way to go, and perhaps marking the connector as disconnected instead may be a
> better solution. However, if marking a connector disconnected is the way to go,
> We will have to iterate over all MST ports in the MST case and mark the spawned
> connectors as disconnected as well.

I -think- this is probably fine, that's likely how I'd 

> 
> As a final note I should add that this approach was tested with ChromeOS as
> userspace, and we observed that the zombie displays stop showing up once the
> connectors are pruned of all their modes and are ignored by userspace.
> 
> For your consideration and guidance.
> Thanks,
> 
> Gil Dekel (6):
>   drm/i915/dp_link_training: Add a final failing state to link training
>     fallback
>   drm/i915/dp_link_training: Add a final failing state to link training
>     fallback for MST
>   drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
>   drm/i915: Move DP modeset_retry_work into intel_dp
>   drm/i915/dp_link_training: Set all downstream MST ports to BAD before
>     retrying
>   drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
>     property
> 
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
>  drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
>  .../drm/i915/display/intel_display_types.h    |  6 +-
>  drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
>  drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
>  .../drm/i915/display/intel_dp_link_training.c | 11 ++-
>  include/drm/display/drm_dp_mst_helper.h       |  3 +
>  7 files changed, 110 insertions(+), 40 deletions(-)
> 
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails
  2023-08-30 21:41   ` Lyude Paul
  (?)
@ 2023-09-01 18:38   ` Rodrigo Vivi
  2023-09-01 19:05     ` Rodrigo Vivi
  -1 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2023-09-01 18:38 UTC (permalink / raw)
  To: Lyude Paul; +Cc: seanpaul, intel-gfx, Gil Dekel, dri-devel

On Wed, Aug 30, 2023 at 05:41:37PM -0400, Lyude Paul wrote:
> Other then the name typo (s/Pual/Paul):
> 
> Signed-off-by: Lyude Paul <lyude@redhat.com> (just since I co-authored
> things~)

I believe having the Co-developed-by: in the patches that you helped
out would be nice.

> Reviewed-by: Lyude Paul <lyude@redhat.com>
> 
> I think we definitely want to make sure we get intel's opinions on this
> though, especially regarding the usage of link-status. I think we're close
> enough to link-status's intended purpose, but I definitely would like to know
> what others think about that since userspace will definitely have to handle
> situations like this a bit differently than with SST.
> 
> Also - definitely make sure you take a look at Imre's patch series that's
> currently on the list (I just finished reviewing it), since it adds some
> things to the helpers that might end up being useful here :)
> 
> https://patchwork.freedesktop.org/series/122589/
> 
> On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> > Next version of https://patchwork.freedesktop.org/series/122850/
> > 
> > v4:
> >   Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
> >   instead of drm-tip/drm-tip. Apologies for the noise :(
> > 
> > v3:
> >   Still learning the ropes of upstream workflow. Apologies for mucking up v2.
> >   This is just a re-upload.
> > 
> > v2:
> >   Reorganize into:
> >   1) Add for final failure state for SST and MST link training fallback.
> >   2) Add a DRM helper for setting downstream MST ports' link-status state.
> >   3) Make handling SST and MST connectors simpler via intel_dp.
> >   4) Update link-status for downstream MST ports.
> >   5) Emit a uevent with the "link-status" trigger property.
> > 
> > v1:
> > Currently, when link training fails after all fallback values have been
> > exhausted, the i915 driver seizes to send uevents to userspace. This leave
> > userspace thinking that the last passing atomic commit was successful, and that
> > all connectors (displays) are connected and operational, when in fact, the last
> > link failed to train and the displays remain dark. This manifests as "zombie"
> > displays in userspace, in which users observe the displays appear in their
> > display settings page, but they are dark and unresponsive.
> > 
> > Since, at the time of writing, MST link training fallback is not implemented,
> > failing MST link training is a significantly more common case then a complete
> > SST link training failure. And with users using MST hubs more than ever to
> > connect multiple displays via their USB-C ports we observe this case often.
> > 
> > This patchset series suggest a solution, in which a final failure state is
> > defined. In this final state, the connector's bit rate capabilities, namely
> > max_link_rate and max_link_lane_count, are set to 0. This effectively set the
> > connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
> > following connector probing.
> > 
> > Next, with this state defined, we emit a link-status=Bad uevent. The next time
> > userspace probes the connector, it should recognize that the connector has no
> > modes and ignore it since it is in a bad state.
> > 
> > I am aware that always sending a uevent and never stopping may result in some
> > userspaces having their expectations broken and enter an infinite loop of
> > modesets and link-training attempts. However, per DRM link-status spec:
> > ```
> >  * link-status:
> >  *      Connector link-status property to indicate the status of link. The
> >  *      default value of link-status is "GOOD". If something fails during or
> >  *      after modeset, the kernel driver may set this to "BAD" and issue a
> >  *      hotplug uevent. Drivers should update this value using
> >  *      drm_connector_set_link_status_property().
> >  *
> >  *      When user-space receives the hotplug uevent and detects a "BAD"
> >  *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
> >  *      becomes completely black). The list of available modes may have
> >  *      changed. User-space is expected to pick a new mode if the current one
> >  *      has disappeared and perform a new modeset with link-status set to
> >  *      "GOOD" to re-enable the connector.
> > ```
> > (form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)
> > 
> > it seems reasonable to assume that the suggested state is an extension of the
> > spec's guidelines, in which the next new mode userspace picks for a connector
> > with no modes is - none, thus breaking the cycle of failed link-training
> > attempts.
> > 
> > I suspect that, maybe, zeroing out the bit rate capabilities is not the right
> > way to go, and perhaps marking the connector as disconnected instead may be a
> > better solution. However, if marking a connector disconnected is the way to go,
> > We will have to iterate over all MST ports in the MST case and mark the spawned
> > connectors as disconnected as well.
> 
> I -think- this is probably fine, that's likely how I'd 
> 
> > 
> > As a final note I should add that this approach was tested with ChromeOS as
> > userspace, and we observed that the zombie displays stop showing up once the
> > connectors are pruned of all their modes and are ignored by userspace.
> > 
> > For your consideration and guidance.
> > Thanks,
> > 
> > Gil Dekel (6):
> >   drm/i915/dp_link_training: Add a final failing state to link training
> >     fallback
> >   drm/i915/dp_link_training: Add a final failing state to link training
> >     fallback for MST
> >   drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
> >   drm/i915: Move DP modeset_retry_work into intel_dp
> >   drm/i915/dp_link_training: Set all downstream MST ports to BAD before
> >     retrying
> >   drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
> >     property
> > 
> >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
> >  drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
> >  .../drm/i915/display/intel_display_types.h    |  6 +-
> >  drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
> >  drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
> >  .../drm/i915/display/intel_dp_link_training.c | 11 ++-
> >  include/drm/display/drm_dp_mst_helper.h       |  3 +
> >  7 files changed, 110 insertions(+), 40 deletions(-)
> > 
> > --
> > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> > 
> 
> -- 
> Cheers,
>  Lyude Paul (she/her)
>  Software Engineer at Red Hat
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 5/6] drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying
  2023-08-24 20:50   ` Gil Dekel
  (?)
@ 2023-09-01 18:55   ` Rodrigo Vivi
  2023-09-01 21:13     ` Gil Dekel
  -1 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2023-09-01 18:55 UTC (permalink / raw)
  To: Gil Dekel; +Cc: intel-gfx, seanpaul, dri-devel

On Thu, Aug 24, 2023 at 04:50:20PM -0400, Gil Dekel wrote:
> Before sending a uevent to userspace in order to trigger a corrective
> modeset, we change the failing connector's link-status to BAD. However,
> the downstream MST branch ports are left in their original GOOD state.
> 
> This patch utilizes the drm helper function
> drm_dp_set_mst_topology_link_status() to rectify this and set all
> downstream MST connectors' link-status to BAD before emitting the uevent
> to userspace.
> 
> Signed-off-by: Gil Dekel <gildekel@chromium.org>
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> index 42353b1ac487..e8b10f59e141 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -5995,16 +5995,20 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>  	struct intel_dp *intel_dp =
>  		container_of(work, typeof(*intel_dp), modeset_retry_work);
>  	struct drm_connector *connector = &intel_dp->attached_connector->base;
> -	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
> -		    connector->name);
> 
> -	/* Grab the locks before changing connector property*/
> -	mutex_lock(&connector->dev->mode_config.mutex);
> -	/* Set connector link status to BAD and send a Uevent to notify
> -	 * userspace to do a modeset.
> +	/* Set the connector's (and possibly all its downstream MST ports') link
> +	 * status to BAD.
>  	 */
> +	mutex_lock(&connector->dev->mode_config.mutex);
> +	drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s] link status %d -> %d\n",
> +		    connector->base.id, connector->name,
> +		    connector->state->link_status, DRM_MODE_LINK_STATUS_BAD);
>  	drm_connector_set_link_status_property(connector,
>  					       DRM_MODE_LINK_STATUS_BAD);
> +	if (intel_dp->is_mst) {
> +		drm_dp_set_mst_topology_link_status(&intel_dp->mst_mgr,
> +						    DRM_MODE_LINK_STATUS_BAD);

Something is weird with the locking here.
I noticed that on patch 3 this new function also gets the same
mutex_lock(&connector->dev->mode_config.mutex);

Since you didn't reach the deadlock, I'm clearly missing something
on the flow. But regardless of what I could be missing, I believe
this is totally not future proof and we will for sure hit dead-lock
cases.

> +	}
>  	mutex_unlock(&connector->dev->mode_config.mutex);
>  	/* Send Hotplug uevent so userspace can reprobe */
>  	drm_kms_helper_connector_hotplug_event(connector);
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 1/6] drm/i915/dp_link_training: Add a final failing state to link training fallback
  2023-08-24 20:50   ` Gil Dekel
  (?)
@ 2023-09-01 18:57   ` Rodrigo Vivi
  2023-09-01 22:51     ` Gil Dekel
  -1 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2023-09-01 18:57 UTC (permalink / raw)
  To: Gil Dekel; +Cc: intel-gfx, seanpaul, dri-devel

On Thu, Aug 24, 2023 at 04:50:16PM -0400, Gil Dekel wrote:
> Instead of silently giving up when all link-training fallback values are
> exhausted, this patch modifies the fallback's failure branch to reduces
> both max_link_lane_count and max_link_rate to zero (0) and continues to
> emit uevents until userspace stops attempting to modeset.
> 
> By doing so, we ensure the failing connector, which is in
> link-status=Bad, has all its modes pruned (due to effectively having a
> bandwidth of 0Gbps).
> 
> It is then the userspace's responsibility to ignore connectors with no
> modes, even if they are marked as connected.
> 
> Signed-off-by: Gil Dekel <gildekel@chromium.org>
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> index 7067ee3a4bd3..2152ddbab557 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -276,8 +276,12 @@ static int intel_dp_common_len_rate_limit(const struct intel_dp *intel_dp,
> 
>  static int intel_dp_common_rate(struct intel_dp *intel_dp, int index)
>  {
> +	/* This occurs when max link rate drops to 0 via link training fallback*/
> +	if (index < 0)
> +		return 0;

I'm not comfortable with handling negative input as normal here
and no warning or msg.
Maybe I'm missing the cases in which this will get to negative and
it would be acceptable? In this case probably better to improve the
commit message.

> +
>  	if (drm_WARN_ON(&dp_to_i915(intel_dp)->drm,
> -			index < 0 || index >= intel_dp->num_common_rates))
> +			index >= intel_dp->num_common_rates))
>  		return 162000;
> 
>  	return intel_dp->common_rates[index];
> @@ -318,6 +322,9 @@ static int intel_dp_max_common_lane_count(struct intel_dp *intel_dp)
>  int intel_dp_max_lane_count(struct intel_dp *intel_dp)
>  {
>  	switch (intel_dp->max_link_lane_count) {
> +	/* This occurs when max link lane count drops to 0 via link training fallback*/
> +	case 0:
> +		return 0;
>  	case 1:
>  	case 2:
>  	case 4:
> @@ -672,7 +679,14 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
>  		intel_dp->max_link_lane_count = lane_count >> 1;
>  	} else {
>  		drm_err(&i915->drm, "Link Training Unsuccessful\n");
> -		return -1;
> +		/*
> +		 * Ensure all of the connector's modes are pruned in the next
> +		 * probe by effectively reducing its bandwidth to 0 so userspace
> +		 * can ignore it within the next modeset attempt.
> +		 */
> +		intel_dp->max_link_rate = 0;
> +		intel_dp->max_link_lane_count = 0;
> +		return 0;
>  	}
> 
>  	return 0;
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails
  2023-09-01 18:38   ` [Intel-gfx] " Rodrigo Vivi
@ 2023-09-01 19:05     ` Rodrigo Vivi
  2023-09-01 19:25       ` Gil Dekel
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2023-09-01 19:05 UTC (permalink / raw)
  To: Lyude Paul, Jani Nikula, Ville Syrjälä, Imre
  Cc: Gil Dekel, intel-gfx, seanpaul, dri-devel

On Fri, Sep 01, 2023 at 02:38:11PM -0400, Rodrigo Vivi wrote:
> On Wed, Aug 30, 2023 at 05:41:37PM -0400, Lyude Paul wrote:
> > Other then the name typo (s/Pual/Paul):
> > 
> > Signed-off-by: Lyude Paul <lyude@redhat.com> (just since I co-authored
> > things~)
> 
> I believe having the Co-developed-by: in the patches that you helped
> out would be nice.
> 
> > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > 
> > I think we definitely want to make sure we get intel's opinions on this
> > though, especially regarding the usage of link-status. I think we're close
> > enough to link-status's intended purpose, but I definitely would like to know
> > what others think about that since userspace will definitely have to handle
> > situations like this a bit differently than with SST.

I'd like to get Jani, or Ville, or Imre's eyes here. I believe this
series has a good potential to solve some bad lingering MST issues,
but I'm indeed concerned with the impact on depending on userspace
behavior here.

(Besides that potential dead-lock...)

> > 
> > Also - definitely make sure you take a look at Imre's patch series that's
> > currently on the list (I just finished reviewing it), since it adds some
> > things to the helpers that might end up being useful here :)
> > 
> > https://patchwork.freedesktop.org/series/122589/
> > 
> > On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> > > Next version of https://patchwork.freedesktop.org/series/122850/
> > > 
> > > v4:
> > >   Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
> > >   instead of drm-tip/drm-tip. Apologies for the noise :(
> > > 
> > > v3:
> > >   Still learning the ropes of upstream workflow. Apologies for mucking up v2.
> > >   This is just a re-upload.
> > > 
> > > v2:
> > >   Reorganize into:
> > >   1) Add for final failure state for SST and MST link training fallback.
> > >   2) Add a DRM helper for setting downstream MST ports' link-status state.
> > >   3) Make handling SST and MST connectors simpler via intel_dp.
> > >   4) Update link-status for downstream MST ports.
> > >   5) Emit a uevent with the "link-status" trigger property.
> > > 
> > > v1:
> > > Currently, when link training fails after all fallback values have been
> > > exhausted, the i915 driver seizes to send uevents to userspace. This leave
> > > userspace thinking that the last passing atomic commit was successful, and that
> > > all connectors (displays) are connected and operational, when in fact, the last
> > > link failed to train and the displays remain dark. This manifests as "zombie"
> > > displays in userspace, in which users observe the displays appear in their
> > > display settings page, but they are dark and unresponsive.
> > > 
> > > Since, at the time of writing, MST link training fallback is not implemented,
> > > failing MST link training is a significantly more common case then a complete
> > > SST link training failure. And with users using MST hubs more than ever to
> > > connect multiple displays via their USB-C ports we observe this case often.
> > > 
> > > This patchset series suggest a solution, in which a final failure state is
> > > defined. In this final state, the connector's bit rate capabilities, namely
> > > max_link_rate and max_link_lane_count, are set to 0. This effectively set the
> > > connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
> > > following connector probing.
> > > 
> > > Next, with this state defined, we emit a link-status=Bad uevent. The next time
> > > userspace probes the connector, it should recognize that the connector has no
> > > modes and ignore it since it is in a bad state.
> > > 
> > > I am aware that always sending a uevent and never stopping may result in some
> > > userspaces having their expectations broken and enter an infinite loop of
> > > modesets and link-training attempts. However, per DRM link-status spec:
> > > ```
> > >  * link-status:
> > >  *      Connector link-status property to indicate the status of link. The
> > >  *      default value of link-status is "GOOD". If something fails during or
> > >  *      after modeset, the kernel driver may set this to "BAD" and issue a
> > >  *      hotplug uevent. Drivers should update this value using
> > >  *      drm_connector_set_link_status_property().
> > >  *
> > >  *      When user-space receives the hotplug uevent and detects a "BAD"
> > >  *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
> > >  *      becomes completely black). The list of available modes may have
> > >  *      changed. User-space is expected to pick a new mode if the current one
> > >  *      has disappeared and perform a new modeset with link-status set to
> > >  *      "GOOD" to re-enable the connector.
> > > ```
> > > (form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)
> > > 
> > > it seems reasonable to assume that the suggested state is an extension of the
> > > spec's guidelines, in which the next new mode userspace picks for a connector
> > > with no modes is - none, thus breaking the cycle of failed link-training
> > > attempts.
> > > 
> > > I suspect that, maybe, zeroing out the bit rate capabilities is not the right
> > > way to go, and perhaps marking the connector as disconnected instead may be a
> > > better solution. However, if marking a connector disconnected is the way to go,
> > > We will have to iterate over all MST ports in the MST case and mark the spawned
> > > connectors as disconnected as well.
> > 
> > I -think- this is probably fine, that's likely how I'd 
> > 
> > > 
> > > As a final note I should add that this approach was tested with ChromeOS as
> > > userspace, and we observed that the zombie displays stop showing up once the
> > > connectors are pruned of all their modes and are ignored by userspace.
> > > 
> > > For your consideration and guidance.
> > > Thanks,
> > > 
> > > Gil Dekel (6):
> > >   drm/i915/dp_link_training: Add a final failing state to link training
> > >     fallback
> > >   drm/i915/dp_link_training: Add a final failing state to link training
> > >     fallback for MST
> > >   drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
> > >   drm/i915: Move DP modeset_retry_work into intel_dp
> > >   drm/i915/dp_link_training: Set all downstream MST ports to BAD before
> > >     retrying
> > >   drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
> > >     property
> > > 
> > >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
> > >  drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
> > >  .../drm/i915/display/intel_display_types.h    |  6 +-
> > >  drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
> > >  drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
> > >  .../drm/i915/display/intel_dp_link_training.c | 11 ++-
> > >  include/drm/display/drm_dp_mst_helper.h       |  3 +
> > >  7 files changed, 110 insertions(+), 40 deletions(-)
> > > 
> > > --
> > > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> > > 
> > 
> > -- 
> > Cheers,
> >  Lyude Paul (she/her)
> >  Software Engineer at Red Hat
> > 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails
  2023-09-01 19:05     ` Rodrigo Vivi
@ 2023-09-01 19:25       ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-09-01 19:25 UTC (permalink / raw)
  To: Rodrigo Vivi
  Cc: Ville Syrjälä, Jani Nikula, dri-devel, seanpaul,
	intel-gfx

On Fri, Sep 1, 2023 at 3:05 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>
> On Fri, Sep 01, 2023 at 02:38:11PM -0400, Rodrigo Vivi wrote:
> > On Wed, Aug 30, 2023 at 05:41:37PM -0400, Lyude Paul wrote:
> > > Other then the name typo (s/Pual/Paul):
> > >
> > > Signed-off-by: Lyude Paul <lyude@redhat.com> (just since I co-authored
> > > things~)
> >
> > I believe having the Co-developed-by: in the patches that you helped
> > out would be nice.
> >
Agreed. I think I'll set Lyude as the author of the two patches she originally
wrote, and set myself as the co-developer. Thank you both for pointing this out.

> > > Reviewed-by: Lyude Paul <lyude@redhat.com>
> > >
> > > I think we definitely want to make sure we get intel's opinions on this
> > > though, especially regarding the usage of link-status. I think we're close
> > > enough to link-status's intended purpose, but I definitely would like to know
> > > what others think about that since userspace will definitely have to handle
> > > situations like this a bit differently than with SST.
>
> I'd like to get Jani, or Ville, or Imre's eyes here. I believe this
> series has a good potential to solve some bad lingering MST issues,
> but I'm indeed concerned with the impact on depending on userspace
> behavior here.
>
I would love to have other userspaces reviewing (or at least ACKing)
this series. Any thoughts on who should be added on the next revision?

> (Besides that potential dead-lock...)
>
Response is on patch 4/5.
> > >
> > > Also - definitely make sure you take a look at Imre's patch series that's
> > > currently on the list (I just finished reviewing it), since it adds some
> > > things to the helpers that might end up being useful here :)
> > >
> > > https://patchwork.freedesktop.org/series/122589/
> > >
Do you have anything particular in mind?

> > > On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> > > > Next version of https://patchwork.freedesktop.org/series/122850/
> > > >
> > > > v4:
> > > >   Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
> > > >   instead of drm-tip/drm-tip. Apologies for the noise :(
> > > >
> > > > v3:
> > > >   Still learning the ropes of upstream workflow. Apologies for mucking up v2.
> > > >   This is just a re-upload.
> > > >
> > > > v2:
> > > >   Reorganize into:
> > > >   1) Add for final failure state for SST and MST link training fallback.
> > > >   2) Add a DRM helper for setting downstream MST ports' link-status state.
> > > >   3) Make handling SST and MST connectors simpler via intel_dp.
> > > >   4) Update link-status for downstream MST ports.
> > > >   5) Emit a uevent with the "link-status" trigger property.
> > > >
> > > > v1:
> > > > Currently, when link training fails after all fallback values have been
> > > > exhausted, the i915 driver seizes to send uevents to userspace. This leave
> > > > userspace thinking that the last passing atomic commit was successful, and that
> > > > all connectors (displays) are connected and operational, when in fact, the last
> > > > link failed to train and the displays remain dark. This manifests as "zombie"
> > > > displays in userspace, in which users observe the displays appear in their
> > > > display settings page, but they are dark and unresponsive.
> > > >
> > > > Since, at the time of writing, MST link training fallback is not implemented,
> > > > failing MST link training is a significantly more common case then a complete
> > > > SST link training failure. And with users using MST hubs more than ever to
> > > > connect multiple displays via their USB-C ports we observe this case often.
> > > >
> > > > This patchset series suggest a solution, in which a final failure state is
> > > > defined. In this final state, the connector's bit rate capabilities, namely
> > > > max_link_rate and max_link_lane_count, are set to 0. This effectively set the
> > > > connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
> > > > following connector probing.
> > > >
> > > > Next, with this state defined, we emit a link-status=Bad uevent. The next time
> > > > userspace probes the connector, it should recognize that the connector has no
> > > > modes and ignore it since it is in a bad state.
> > > >
> > > > I am aware that always sending a uevent and never stopping may result in some
> > > > userspaces having their expectations broken and enter an infinite loop of
> > > > modesets and link-training attempts. However, per DRM link-status spec:
> > > > ```
> > > >  * link-status:
> > > >  *      Connector link-status property to indicate the status of link. The
> > > >  *      default value of link-status is "GOOD". If something fails during or
> > > >  *      after modeset, the kernel driver may set this to "BAD" and issue a
> > > >  *      hotplug uevent. Drivers should update this value using
> > > >  *      drm_connector_set_link_status_property().
> > > >  *
> > > >  *      When user-space receives the hotplug uevent and detects a "BAD"
> > > >  *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
> > > >  *      becomes completely black). The list of available modes may have
> > > >  *      changed. User-space is expected to pick a new mode if the current one
> > > >  *      has disappeared and perform a new modeset with link-status set to
> > > >  *      "GOOD" to re-enable the connector.
> > > > ```
> > > > (form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)
> > > >
> > > > it seems reasonable to assume that the suggested state is an extension of the
> > > > spec's guidelines, in which the next new mode userspace picks for a connector
> > > > with no modes is - none, thus breaking the cycle of failed link-training
> > > > attempts.
> > > >
> > > > I suspect that, maybe, zeroing out the bit rate capabilities is not the right
> > > > way to go, and perhaps marking the connector as disconnected instead may be a
> > > > better solution. However, if marking a connector disconnected is the way to go,
> > > > We will have to iterate over all MST ports in the MST case and mark the spawned
> > > > connectors as disconnected as well.
> > >
> > > I -think- this is probably fine, that's likely how I'd
> > >
> > > >
> > > > As a final note I should add that this approach was tested with ChromeOS as
> > > > userspace, and we observed that the zombie displays stop showing up once the
> > > > connectors are pruned of all their modes and are ignored by userspace.
> > > >
> > > > For your consideration and guidance.
> > > > Thanks,
> > > >
> > > > Gil Dekel (6):
> > > >   drm/i915/dp_link_training: Add a final failing state to link training
> > > >     fallback
> > > >   drm/i915/dp_link_training: Add a final failing state to link training
> > > >     fallback for MST
> > > >   drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
> > > >   drm/i915: Move DP modeset_retry_work into intel_dp
> > > >   drm/i915/dp_link_training: Set all downstream MST ports to BAD before
> > > >     retrying
> > > >   drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
> > > >     property
> > > >
> > > >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
> > > >  drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
> > > >  .../drm/i915/display/intel_display_types.h    |  6 +-
> > > >  drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
> > > >  drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
> > > >  .../drm/i915/display/intel_dp_link_training.c | 11 ++-
> > > >  include/drm/display/drm_dp_mst_helper.h       |  3 +
> > > >  7 files changed, 110 insertions(+), 40 deletions(-)
> > > >
> > > > --
> > > > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> > > >
> > >
> > > --
> > > Cheers,
> > >  Lyude Paul (she/her)
> > >  Software Engineer at Red Hat
> > >

Thanks for your time and comments!
--
Best,
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 5/6] drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying
  2023-09-01 18:55   ` [Intel-gfx] " Rodrigo Vivi
@ 2023-09-01 21:13     ` Gil Dekel
  2023-09-01 23:24       ` Gil Dekel
  0 siblings, 1 reply; 31+ messages in thread
From: Gil Dekel @ 2023-09-01 21:13 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, seanpaul, dri-devel

On Fri, Sep 1, 2023 at 2:55 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>
> On Thu, Aug 24, 2023 at 04:50:20PM -0400, Gil Dekel wrote:
> > Before sending a uevent to userspace in order to trigger a corrective
> > modeset, we change the failing connector's link-status to BAD. However,
> > the downstream MST branch ports are left in their original GOOD state.
> >
> > This patch utilizes the drm helper function
> > drm_dp_set_mst_topology_link_status() to rectify this and set all
> > downstream MST connectors' link-status to BAD before emitting the uevent
> > to userspace.
> >
> > Signed-off-by: Gil Dekel <gildekel@chromium.org>
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 16 ++++++++++------
> >  1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 42353b1ac487..e8b10f59e141 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -5995,16 +5995,20 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >       struct intel_dp *intel_dp =
> >               container_of(work, typeof(*intel_dp), modeset_retry_work);
> >       struct drm_connector *connector = &intel_dp->attached_connector->base;
> > -     drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
> > -                 connector->name);
> >
> > -     /* Grab the locks before changing connector property*/
> > -     mutex_lock(&connector->dev->mode_config.mutex);
> > -     /* Set connector link status to BAD and send a Uevent to notify
> > -      * userspace to do a modeset.
> > +     /* Set the connector's (and possibly all its downstream MST ports') link
> > +      * status to BAD.
> >        */
> > +     mutex_lock(&connector->dev->mode_config.mutex);
> > +     drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s] link status %d -> %d\n",
> > +                 connector->base.id, connector->name,
> > +                 connector->state->link_status, DRM_MODE_LINK_STATUS_BAD);
> >       drm_connector_set_link_status_property(connector,
> >                                              DRM_MODE_LINK_STATUS_BAD);
> > +     if (intel_dp->is_mst) {
> > +             drm_dp_set_mst_topology_link_status(&intel_dp->mst_mgr,
> > +                                                 DRM_MODE_LINK_STATUS_BAD);
>
> Something is weird with the locking here.
> I noticed that on patch 3 this new function also gets the same
> mutex_lock(&connector->dev->mode_config.mutex);
>
> Since you didn't reach the deadlock, I'm clearly missing something
> on the flow. But regardless of what I could be missing, I believe
> this is totally not future proof and we will for sure hit dead-lock
> cases.
>
You are not wrong.

Something must have been wrong in my workflow, as I was positive I
tested the code with this lock, but I must remember wrong. I tried
testing my current code and it immediately locked, as you expected.
So thank you for catching this.

Lyude's original patch didn't include drm_dp_set_mst_topology_link_status()
as an exposed drm helper function, so when I adjusted it for this series, I
decided to add locks similar to how her other function using
drm_dp_set_mst_topology_link_status() did. However, I failed to use the
right lock, which is:
drm_modeset_lock(&connector->dev->mode_config.connection_mutex, NULL);
drm_modeset_unlock(&connector->dev->mode_config.connection_mutex);
This is similar to how drm_connector_set_link_status_property() locks
before writing to link_status.

I made sure to test my code with the above locks, and it runs well. Here's
an instrumented log excerpt for failing link-training with an MST hub
(I hacked the driver to always fail non eDP connectors and print the
raw pointer addresses of the drm_device and mutex right before locking):
[   43.466329] i915 0000:00:02.0: [drm] *ERROR* Link Training
Unsuccessful via gildekel HACK - (not eDP)
[   43.594950] i915 0000:00:02.0: [drm] *ERROR* Link Training
Unsuccessful via gildekel HACK - (not eDP)
[   43.594979] i915 0000:00:02.0: [drm] *ERROR* Link Training Unsuccessful
[   43.595023] i915 0000:00:02.0: [drm] *ERROR* [CONNECTOR:273:DP-3]:
[   43.595028] i915 0000:00:02.0: [drm] *ERROR*
connector->dev=00000000d4850450
[   43.595033] i915 0000:00:02.0: [drm] *ERROR*
connector->dev->mode_config.mutex=00000000aac3fe45
[   44.771091] i915 0000:00:02.0: [drm] *ERROR*
[MST-CONNECTOR:300:DP-5]:
[   44.771108] i915 0000:00:02.0: [drm] *ERROR*
connector->dev=000000003fb97435
[   44.771115] i915 0000:00:02.0: [drm] *ERROR*
&connector->dev->mode_config.connection_mutex=000000009aece20e
[   44.771127] i915 0000:00:02.0: [drm] *ERROR*
[MST-CONNECTOR:303:DP-6]:
[   44.771132] i915 0000:00:02.0: [drm] *ERROR*
connector->dev=0000000075236b75
[   44.771137] i915 0000:00:02.0: [drm] *ERROR*
&connector->dev->mode_config.connection_mutex=000000009aece20e

Also, I was under the assumption that all connectors in an MST topology
should reference the same drm_device object, but it seems like that's
not the case. Is my assumption wrong?

> > +     }
> >       mutex_unlock(&connector->dev->mode_config.mutex);
> >       /* Send Hotplug uevent so userspace can reprobe */
> >       drm_kms_helper_connector_hotplug_event(connector);
> > --
> > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics


Thanks for your time and comments!
--
Best,
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 1/6] drm/i915/dp_link_training: Add a final failing state to link training fallback
  2023-09-01 18:57   ` [Intel-gfx] " Rodrigo Vivi
@ 2023-09-01 22:51     ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-09-01 22:51 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, seanpaul, dri-devel

On Fri, Sep 1, 2023 at 2:57 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>
> On Thu, Aug 24, 2023 at 04:50:16PM -0400, Gil Dekel wrote:
> > Instead of silently giving up when all link-training fallback values are
> > exhausted, this patch modifies the fallback's failure branch to reduces
> > both max_link_lane_count and max_link_rate to zero (0) and continues to
> > emit uevents until userspace stops attempting to modeset.
> >
> > By doing so, we ensure the failing connector, which is in
> > link-status=Bad, has all its modes pruned (due to effectively having a
> > bandwidth of 0Gbps).
> >
> > It is then the userspace's responsibility to ignore connectors with no
> > modes, even if they are marked as connected.
> >
> > Signed-off-by: Gil Dekel <gildekel@chromium.org>
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 7067ee3a4bd3..2152ddbab557 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -276,8 +276,12 @@ static int intel_dp_common_len_rate_limit(const struct intel_dp *intel_dp,
> >
> >  static int intel_dp_common_rate(struct intel_dp *intel_dp, int index)
> >  {
> > +     /* This occurs when max link rate drops to 0 via link training fallback*/
> > +     if (index < 0)
> > +             return 0;
>
> I'm not comfortable with handling negative input as normal here
> and no warning or msg.
> Maybe I'm missing the cases in which this will get to negative and
> it would be acceptable? In this case probably better to improve the
> commit message.
>
If we set the max_link_rate to 0, as I am suggesting in this approach,
it will eventually reach:
int intel_dp_rate_limit_len(intel_dp_rate_limit_len(const int *rates,
int len, int max_rate)

and since max_rate is == 0, the returned value will be 0.

The common use case of
int intel_dp_common_rate(struct intel_dp *intel_dp, int index) is:
intel_dp_common_rate(intel_dp, len - 1);

where len is the position of the last link rate value that was attempted
in the connector's array of bit rates.

When len == 0, you hit the case where the index becomes -1, which
indicates we ran out of possible bitrates in:
intel_dp->num_common_rates
so we return the bit rate 0Gbps for all invalid cases (index < 0).

If this is acceptable, I'll gladly add some details to the commit
message.

An alternative approach is to introduce 0 link rate at the front of:
intel_dp->num_common_rates, which will ensure all connectors
have 0 as a last option, and then we can use the normal fallback
code path with no special cases here.

I hope this makes sense...
> > +
> >       if (drm_WARN_ON(&dp_to_i915(intel_dp)->drm,
> > -                     index < 0 || index >= intel_dp->num_common_rates))
> > +                     index >= intel_dp->num_common_rates))
> >               return 162000;
> >
> >       return intel_dp->common_rates[index];
> > @@ -318,6 +322,9 @@ static int intel_dp_max_common_lane_count(struct intel_dp *intel_dp)
> >  int intel_dp_max_lane_count(struct intel_dp *intel_dp)
> >  {
> >       switch (intel_dp->max_link_lane_count) {
> > +     /* This occurs when max link lane count drops to 0 via link training fallback*/
> > +     case 0:
> > +             return 0;
> >       case 1:
> >       case 2:
> >       case 4:
> > @@ -672,7 +679,14 @@ int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
> >               intel_dp->max_link_lane_count = lane_count >> 1;
> >       } else {
> >               drm_err(&i915->drm, "Link Training Unsuccessful\n");
> > -             return -1;
> > +             /*
> > +              * Ensure all of the connector's modes are pruned in the next
> > +              * probe by effectively reducing its bandwidth to 0 so userspace
> > +              * can ignore it within the next modeset attempt.
> > +              */
> > +             intel_dp->max_link_rate = 0;
> > +             intel_dp->max_link_lane_count = 0;
> > +             return 0;
> >       }
> >
> >       return 0;
> > --
> > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

Thanks for your reviews and comments!

--
Best,
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 6/6] drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger property
  2023-08-24 20:50   ` Gil Dekel
@ 2023-09-01 23:22     ` Manasi Navare
  -1 siblings, 0 replies; 31+ messages in thread
From: Manasi Navare @ 2023-09-01 23:22 UTC (permalink / raw)
  To: Gil Dekel; +Cc: seanpaul, intel-gfx, dri-devel

Thanks Gil for completing the logic here by emitting link status = BAD
even for final link failure state.

On Thu, Aug 24, 2023 at 1:54 PM Gil Dekel <gildekel@chromium.org> wrote:
>
> When a link-training attempt fails, emit a uevent to user space that
> includes the trigger property, which in this case will be
> link-statue=Bad.

Fix the typo above for link-status

>
> This will allow userspace to parse the uevent property and better
> understand the reason for the previous modeset failure.

I think we need to add more explanation in the commit message as to
current problem of no link status = BAD for final failure wrongly reflects link
is good in userspace even when there could have been a failure and black screen.

But that this patch in conjunction with proper handling in userspace fixes it.

Also here we should mention that this patch also now uses:
drm_sysfs_connector_property_event
instead of the earlier generic drm_kms_helper_connector_hotplug_event.

This will need some changes in other userspaces that parse this else it will
cause failures for other userspace once this lands.

With all the above changes,

Acked-by: Manasi Navare <navaremanasi@chromium.org>


Regards
Manasi

>
> Signed-off-by: Gil Dekel <gildekel@chromium.org>
>
> V2:
>   - init link_status_property inline.
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> index e8b10f59e141..328e9f030033 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -42,6 +42,7 @@
>  #include <drm/drm_crtc.h>
>  #include <drm/drm_edid.h>
>  #include <drm/drm_probe_helper.h>
> +#include <drm/drm_sysfs.h>
>
>  #include "g4x_dp.h"
>  #include "i915_drv.h"
> @@ -5995,6 +5996,8 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>         struct intel_dp *intel_dp =
>                 container_of(work, typeof(*intel_dp), modeset_retry_work);
>         struct drm_connector *connector = &intel_dp->attached_connector->base;
> +       struct drm_property *link_status_property =
> +               connector->dev->mode_config.link_status_property;
>
>         /* Set the connector's (and possibly all its downstream MST ports') link
>          * status to BAD.
> @@ -6011,7 +6014,7 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>         }
>         mutex_unlock(&connector->dev->mode_config.mutex);
>         /* Send Hotplug uevent so userspace can reprobe */
> -       drm_kms_helper_connector_hotplug_event(connector);
> +       drm_sysfs_connector_property_event(connector, link_status_property);
>  }
>
>  bool
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 6/6] drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger property
@ 2023-09-01 23:22     ` Manasi Navare
  0 siblings, 0 replies; 31+ messages in thread
From: Manasi Navare @ 2023-09-01 23:22 UTC (permalink / raw)
  To: Gil Dekel; +Cc: seanpaul, intel-gfx, dri-devel

Thanks Gil for completing the logic here by emitting link status = BAD
even for final link failure state.

On Thu, Aug 24, 2023 at 1:54 PM Gil Dekel <gildekel@chromium.org> wrote:
>
> When a link-training attempt fails, emit a uevent to user space that
> includes the trigger property, which in this case will be
> link-statue=Bad.

Fix the typo above for link-status

>
> This will allow userspace to parse the uevent property and better
> understand the reason for the previous modeset failure.

I think we need to add more explanation in the commit message as to
current problem of no link status = BAD for final failure wrongly reflects link
is good in userspace even when there could have been a failure and black screen.

But that this patch in conjunction with proper handling in userspace fixes it.

Also here we should mention that this patch also now uses:
drm_sysfs_connector_property_event
instead of the earlier generic drm_kms_helper_connector_hotplug_event.

This will need some changes in other userspaces that parse this else it will
cause failures for other userspace once this lands.

With all the above changes,

Acked-by: Manasi Navare <navaremanasi@chromium.org>


Regards
Manasi

>
> Signed-off-by: Gil Dekel <gildekel@chromium.org>
>
> V2:
>   - init link_status_property inline.
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> index e8b10f59e141..328e9f030033 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -42,6 +42,7 @@
>  #include <drm/drm_crtc.h>
>  #include <drm/drm_edid.h>
>  #include <drm/drm_probe_helper.h>
> +#include <drm/drm_sysfs.h>
>
>  #include "g4x_dp.h"
>  #include "i915_drv.h"
> @@ -5995,6 +5996,8 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>         struct intel_dp *intel_dp =
>                 container_of(work, typeof(*intel_dp), modeset_retry_work);
>         struct drm_connector *connector = &intel_dp->attached_connector->base;
> +       struct drm_property *link_status_property =
> +               connector->dev->mode_config.link_status_property;
>
>         /* Set the connector's (and possibly all its downstream MST ports') link
>          * status to BAD.
> @@ -6011,7 +6014,7 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>         }
>         mutex_unlock(&connector->dev->mode_config.mutex);
>         /* Send Hotplug uevent so userspace can reprobe */
> -       drm_kms_helper_connector_hotplug_event(connector);
> +       drm_sysfs_connector_property_event(connector, link_status_property);
>  }
>
>  bool
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Intel-gfx] [PATCH v4 5/6] drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying
  2023-09-01 21:13     ` Gil Dekel
@ 2023-09-01 23:24       ` Gil Dekel
  0 siblings, 0 replies; 31+ messages in thread
From: Gil Dekel @ 2023-09-01 23:24 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, seanpaul, dri-devel

On Fri, Sep 1, 2023 at 5:13 PM Gil Dekel <gildekel@chromium.org> wrote:
>
> On Fri, Sep 1, 2023 at 2:55 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> >
> > On Thu, Aug 24, 2023 at 04:50:20PM -0400, Gil Dekel wrote:
> > > Before sending a uevent to userspace in order to trigger a corrective
> > > modeset, we change the failing connector's link-status to BAD. However,
> > > the downstream MST branch ports are left in their original GOOD state.
> > >
> > > This patch utilizes the drm helper function
> > > drm_dp_set_mst_topology_link_status() to rectify this and set all
> > > downstream MST connectors' link-status to BAD before emitting the uevent
> > > to userspace.
> > >
> > > Signed-off-by: Gil Dekel <gildekel@chromium.org>
> > > ---
> > >  drivers/gpu/drm/i915/display/intel_dp.c | 16 ++++++++++------
> > >  1 file changed, 10 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > > index 42353b1ac487..e8b10f59e141 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > > @@ -5995,16 +5995,20 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> > >       struct intel_dp *intel_dp =
> > >               container_of(work, typeof(*intel_dp), modeset_retry_work);
> > >       struct drm_connector *connector = &intel_dp->attached_connector->base;
> > > -     drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
> > > -                 connector->name);
> > >
> > > -     /* Grab the locks before changing connector property*/
> > > -     mutex_lock(&connector->dev->mode_config.mutex);
> > > -     /* Set connector link status to BAD and send a Uevent to notify
> > > -      * userspace to do a modeset.
> > > +     /* Set the connector's (and possibly all its downstream MST ports') link
> > > +      * status to BAD.
> > >        */
> > > +     mutex_lock(&connector->dev->mode_config.mutex);
> > > +     drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s] link status %d -> %d\n",
> > > +                 connector->base.id, connector->name,
> > > +                 connector->state->link_status, DRM_MODE_LINK_STATUS_BAD);
> > >       drm_connector_set_link_status_property(connector,
> > >                                              DRM_MODE_LINK_STATUS_BAD);
> > > +     if (intel_dp->is_mst) {
> > > +             drm_dp_set_mst_topology_link_status(&intel_dp->mst_mgr,
> > > +                                                 DRM_MODE_LINK_STATUS_BAD);
> >
> > Something is weird with the locking here.
> > I noticed that on patch 3 this new function also gets the same
> > mutex_lock(&connector->dev->mode_config.mutex);
> >
> > Since you didn't reach the deadlock, I'm clearly missing something
> > on the flow. But regardless of what I could be missing, I believe
> > this is totally not future proof and we will for sure hit dead-lock
> > cases.
> >
> You are not wrong.
>
> Something must have been wrong in my workflow, as I was positive I
> tested the code with this lock, but I must remember wrong. I tried
> testing my current code and it immediately locked, as you expected.
> So thank you for catching this.
>
> Lyude's original patch didn't include drm_dp_set_mst_topology_link_status()
> as an exposed drm helper function, so when I adjusted it for this series, I
> decided to add locks similar to how her other function using
> drm_dp_set_mst_topology_link_status() did. However, I failed to use the
> right lock, which is:
> drm_modeset_lock(&connector->dev->mode_config.connection_mutex, NULL);
> drm_modeset_unlock(&connector->dev->mode_config.connection_mutex);
> This is similar to how drm_connector_set_link_status_property() locks
> before writing to link_status.
>
> I made sure to test my code with the above locks, and it runs well. Here's
> an instrumented log excerpt for failing link-training with an MST hub
> (I hacked the driver to always fail non eDP connectors and print the
> raw pointer addresses of the drm_device and mutex right before locking):
> [   43.466329] i915 0000:00:02.0: [drm] *ERROR* Link Training
> Unsuccessful via gildekel HACK - (not eDP)
> [   43.594950] i915 0000:00:02.0: [drm] *ERROR* Link Training
> Unsuccessful via gildekel HACK - (not eDP)
> [   43.594979] i915 0000:00:02.0: [drm] *ERROR* Link Training Unsuccessful
> [   43.595023] i915 0000:00:02.0: [drm] *ERROR* [CONNECTOR:273:DP-3]:
> [   43.595028] i915 0000:00:02.0: [drm] *ERROR*
> connector->dev=00000000d4850450
> [   43.595033] i915 0000:00:02.0: [drm] *ERROR*
> connector->dev->mode_config.mutex=00000000aac3fe45
> [   44.771091] i915 0000:00:02.0: [drm] *ERROR*
> [MST-CONNECTOR:300:DP-5]:
> [   44.771108] i915 0000:00:02.0: [drm] *ERROR*
> connector->dev=000000003fb97435
> [   44.771115] i915 0000:00:02.0: [drm] *ERROR*
> &connector->dev->mode_config.connection_mutex=000000009aece20e
> [   44.771127] i915 0000:00:02.0: [drm] *ERROR*
> [MST-CONNECTOR:303:DP-6]:
> [   44.771132] i915 0000:00:02.0: [drm] *ERROR*
> connector->dev=0000000075236b75
> [   44.771137] i915 0000:00:02.0: [drm] *ERROR*
> &connector->dev->mode_config.connection_mutex=000000009aece20e
>
> Also, I was under the assumption that all connectors in an MST topology
> should reference the same drm_device object, but it seems like that's
> not the case. Is my assumption wrong?
>
Sorry, please disregard. I was printing the pointers' address instead
of the value's address. Same drm_device is shared:

Link Training Unsuccessful via gildekel HACK - (not eDP)
Link Training Unsuccessful via gildekel HACK - (not eDP)
Link Training Unsuccessful
[CONNECTOR:273:DP-3]:
  connector->dev=00000000b88b882c
  connector->dev->mode_config.mutex=00000000d64b22db
    [MST-CONNECTOR:297:DP-5]:
      connector->dev=00000000b88b882c
      &connector->dev->mode_config.connection_mutex=000000002d876227
    [MST-CONNECTOR:301:DP-6]:
      connector->dev=00000000b88b882c
      &connector->dev->mode_config.connection_mutex=000000002d876227

Sorry for the noise.

> > > +     }
> > >       mutex_unlock(&connector->dev->mode_config.mutex);
> > >       /* Send Hotplug uevent so userspace can reprobe */
> > >       drm_kms_helper_connector_hotplug_event(connector);
> > > --
> > > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
>
>
> Thanks for your time and comments!
> --
> Best,
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics



-- 
Practice Makes Notable Improvements

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2023-09-01 23:24 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-24 20:50 [Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails Gil Dekel
2023-08-24 20:50 ` Gil Dekel
2023-08-24 20:50 ` [Intel-gfx] [PATCH v4 1/6] drm/i915/dp_link_training: Add a final failing state to link training fallback Gil Dekel
2023-08-24 20:50   ` Gil Dekel
2023-09-01 18:57   ` [Intel-gfx] " Rodrigo Vivi
2023-09-01 22:51     ` Gil Dekel
2023-08-24 20:50 ` [Intel-gfx] [PATCH v4 2/6] drm/i915/dp_link_training: Add a final failing state to link training fallback for MST Gil Dekel
2023-08-24 20:50   ` Gil Dekel
2023-08-24 20:50 ` [Intel-gfx] [PATCH v4 3/6] drm/dp_mst: Add drm_dp_set_mst_topology_link_status() Gil Dekel
2023-08-24 20:50   ` Gil Dekel
2023-08-30 21:32   ` [Intel-gfx] " Lyude Paul
2023-08-30 21:32     ` Lyude Paul
2023-08-24 20:50 ` [Intel-gfx] [PATCH v4 4/6] drm/i915: Move DP modeset_retry_work into intel_dp Gil Dekel
2023-08-24 20:50   ` Gil Dekel
2023-08-24 20:50 ` [Intel-gfx] [PATCH v4 5/6] drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying Gil Dekel
2023-08-24 20:50   ` Gil Dekel
2023-09-01 18:55   ` [Intel-gfx] " Rodrigo Vivi
2023-09-01 21:13     ` Gil Dekel
2023-09-01 23:24       ` Gil Dekel
2023-08-24 20:50 ` [Intel-gfx] [PATCH v4 6/6] drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger property Gil Dekel
2023-08-24 20:50   ` Gil Dekel
2023-09-01 23:22   ` [Intel-gfx] " Manasi Navare
2023-09-01 23:22     ` Manasi Navare
2023-08-24 22:10 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/dp_link_training: Define a final failure state when link training fails (rev2) Patchwork
2023-08-24 22:10 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-08-24 22:29 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2023-08-30 21:41 ` [Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails Lyude Paul
2023-08-30 21:41   ` Lyude Paul
2023-09-01 18:38   ` [Intel-gfx] " Rodrigo Vivi
2023-09-01 19:05     ` Rodrigo Vivi
2023-09-01 19:25       ` Gil Dekel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.