intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Important MST fixes for 4.6
@ 2016-05-31 16:49 Lyude
  2016-05-31 16:49 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Lyude @ 2016-05-31 16:49 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: David Airlie, Daniel Vetter,
	open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., dri-devel@lists.freedesktop.org open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., linux-kernel@vger.kernel.org open list

Unfortunately we've never really made use of creating/destroying connectors
on the fly like we have with MST, so 4.6 ended up showing a lot of various
bugs with hotplugging MST displays, booting with MST displays, etc. Most of
these bugs are very likely to panic the kernel, and a couple of them end up
even doing out of bounds memory accesses causing all sorts of other issues.

The proper fix for these issues is Dave's connector lifetime patch series
in 4.7-rc1[1], however backporting those patches would be too big of a fix
to submit for stable. This patch series is a much smaller set of changes to
workaround this issue.

As another note: the first two patches in this series are already upstream for
4.7-rc1, however since we have the connector ref lifetime patches in 4.7-rc1
they don't fix any kernel panics there, only a few inconsistencies in
i915/drm's code. They do however, fix kernel panics for 4.6 since connectors
getting destroyed can make dev->mode_config.num_connector have a different
value from fb_helper->connector_count.

[1]: 0552f7651bc233e5407ab06ba97a9d7c25e19580 in master

Lyude (4):
  drm/i915/fbdev: Fix num_connector references in
    intel_fb_initial_config()
  drm/fb_helper: Fix references to dev->mode_config.num_connector
  drm/i915: Discard previous atomic state on resume if connectors change
  drm/atomic: Verify connector->funcs != NULL when clearing states

 drivers/gpu/drm/drm_atomic.c         |  2 +-
 drivers/gpu/drm/drm_fb_helper.c      |  5 ++---
 drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
 drivers/gpu/drm/i915/intel_fbdev.c   |  6 +++---
 4 files changed, 18 insertions(+), 7 deletions(-)

-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config()
  2016-05-31 16:49 [PATCH 0/4] Important MST fixes for 4.6 Lyude
@ 2016-05-31 16:49 ` Lyude
  2016-05-31 16:49 ` [PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change Lyude
  2016-05-31 19:10 ` [Intel-gfx] [PATCH 0/4] Important MST fixes for 4.6 Daniel Vetter
  2 siblings, 0 replies; 5+ messages in thread
From: Lyude @ 2016-05-31 16:49 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: David Airlie, Daniel Vetter,
	open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., dri-devel@lists.freedesktop.org open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., linux-kernel@vger.kernel.org open list

commit 14a3842a1d5945067d1dd0788f314e14d5b18e5b upstream.

During boot time, MST devices usually send a ton of hotplug events
irregardless of whether or not any physical hotplugs actually occurred.
Hotplugs mean connectors being created/destroyed, and the number of DRM
connectors changing under us. This isn't a problem if we use
fb_helper->connector_count since we only set it once in the code,
however if we use num_connector from struct drm_mode_config we risk it's
value changing under us. On top of that, there's even a chance that
dev->mode_config.num_connector != fb_helper->connector_count. If the
number of connectors happens to increase under us, we'll end up using
the wrong array size for memcpy and start writing beyond the actual
length of the array, occasionally resulting in kernel panics.

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Upstream fix: 0552f7651bc2 ("drm/i915/mst: use reference counted
connectors. (v3)")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/intel_fbdev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 97a91e6..c607217 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -366,12 +366,12 @@ static bool intel_fb_initial_config(struct drm_fb_helper *fb_helper,
 	uint64_t conn_configured = 0, mask;
 	int pass = 0;
 
-	save_enabled = kcalloc(dev->mode_config.num_connector, sizeof(bool),
+	save_enabled = kcalloc(fb_helper->connector_count, sizeof(bool),
 			       GFP_KERNEL);
 	if (!save_enabled)
 		return false;
 
-	memcpy(save_enabled, enabled, dev->mode_config.num_connector);
+	memcpy(save_enabled, enabled, fb_helper->connector_count);
 	mask = (1 << fb_helper->connector_count) - 1;
 retry:
 	for (i = 0; i < fb_helper->connector_count; i++) {
@@ -510,7 +510,7 @@ retry:
 	if (fallback) {
 bail:
 		DRM_DEBUG_KMS("Not using firmware configuration\n");
-		memcpy(enabled, save_enabled, dev->mode_config.num_connector);
+		memcpy(enabled, save_enabled, fb_helper->connector_count);
 		kfree(save_enabled);
 		return false;
 	}
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change
  2016-05-31 16:49 [PATCH 0/4] Important MST fixes for 4.6 Lyude
  2016-05-31 16:49 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
@ 2016-05-31 16:49 ` Lyude
  2016-06-04 20:46   ` Greg Kroah-Hartman
  2016-05-31 19:10 ` [Intel-gfx] [PATCH 0/4] Important MST fixes for 4.6 Daniel Vetter
  2 siblings, 1 reply; 5+ messages in thread
From: Lyude @ 2016-05-31 16:49 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: David Airlie, Daniel Vetter,
	open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., dri-devel@lists.freedesktop.org open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., linux-kernel@vger.kernel.org open list

If an MST device is disconnected while the machine is suspended, the
number of connectors will change as well after we call
intel_dp_mst_resume(). This means that any previous atomic state we had
before suspending is no longer valid, since it'll still be pointing to
missing connectors. We need to check for this before committing the
state, otherwise we'll kernel panic on resume whenever if any MST
display was disconnected before we started resuming:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
Call Trace:
 [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
 [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
 [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
 [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
 [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
 [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
 [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
 [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
 [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
 [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
 [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
 [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
 [<ffffffff814da455>] device_resume+0xd5/0x1f0
 [<ffffffff814da58d>] async_resume+0x1d/0x50
 [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
 [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
 [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
 [<ffffffff810ad038>] worker_thread+0x48/0x4e0
 [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
 [<ffffffff810b3794>] kthread+0xe4/0x100
 [<ffffffff81742672>] ret_from_fork+0x22/0x50
 [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200

Changes since v1:
  - Move drm_atomic_state_free() call down so we're holding the
    appropriate locks when destroying the atomic state
Changes since v2:
  - Check that state != NULL before we start accessing it's members

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Upstream fix: 0552f7651bc2 ("drm/i915/mst: use reference counted
connectors. (v3)")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0104a06..6fdb90e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15958,6 +15958,18 @@ void intel_display_resume(struct drm_device *dev)
 retry:
 	ret = drm_modeset_lock_all_ctx(dev, &ctx);
 
+	/*
+	 * With MST, the number of connectors can change between suspend and
+	 * resume, which means that the state we want to restore might now be
+	 * impossible to use since it'll be pointing to non-existant
+	 * connectors.
+	 */
+	if (ret == 0 && state &&
+	    state->num_connector != dev->mode_config.num_connector) {
+		drm_atomic_state_free(state);
+		state = NULL;
+	}
+
 	if (ret == 0 && !setup) {
 		setup = true;
 
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Intel-gfx] [PATCH 0/4] Important MST fixes for 4.6
  2016-05-31 16:49 [PATCH 0/4] Important MST fixes for 4.6 Lyude
  2016-05-31 16:49 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
  2016-05-31 16:49 ` [PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change Lyude
@ 2016-05-31 19:10 ` Daniel Vetter
  2 siblings, 0 replies; 5+ messages in thread
From: Daniel Vetter @ 2016-05-31 19:10 UTC (permalink / raw)
  To: Lyude
  Cc: stable, Greg Kroah-Hartman, David Airlie, Daniel Vetter,
	open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., dri-devel@lists.freedesktop.org open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., linux-kernel@vger.kernel.org open list

On Tue, May 31, 2016 at 12:49:03PM -0400, Lyude wrote:
> Unfortunately we've never really made use of creating/destroying connectors
> on the fly like we have with MST, so 4.6 ended up showing a lot of various
> bugs with hotplugging MST displays, booting with MST displays, etc. Most of
> these bugs are very likely to panic the kernel, and a couple of them end up
> even doing out of bounds memory accesses causing all sorts of other issues.
> 
> The proper fix for these issues is Dave's connector lifetime patch series
> in 4.7-rc1[1], however backporting those patches would be too big of a fix
> to submit for stable. This patch series is a much smaller set of changes to
> workaround this issue.
> 
> As another note: the first two patches in this series are already upstream for
> 4.7-rc1, however since we have the connector ref lifetime patches in 4.7-rc1
> they don't fix any kernel panics there, only a few inconsistencies in
> i915/drm's code. They do however, fix kernel panics for 4.6 since connectors
> getting destroyed can make dev->mode_config.num_connector have a different
> value from fb_helper->connector_count.
> 
> [1]: 0552f7651bc233e5407ab06ba97a9d7c25e19580 in master

In case it's not clear: There's no way we're going to backport the
connector refcounting stuff from 4.7, way too risky. But we still need
some way to make current stable kernels happy, and the 2 small hacks (plus
2 backports) seem like an adequate solution.

Hence acked from my side for applying to all supported stable kernels.

Cheers, Daniel
> 
> Lyude (4):
>   drm/i915/fbdev: Fix num_connector references in
>     intel_fb_initial_config()
>   drm/fb_helper: Fix references to dev->mode_config.num_connector
>   drm/i915: Discard previous atomic state on resume if connectors change
>   drm/atomic: Verify connector->funcs != NULL when clearing states
> 
>  drivers/gpu/drm/drm_atomic.c         |  2 +-
>  drivers/gpu/drm/drm_fb_helper.c      |  5 ++---
>  drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
>  drivers/gpu/drm/i915/intel_fbdev.c   |  6 +++---
>  4 files changed, 18 insertions(+), 7 deletions(-)
> 
> -- 
> 2.5.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change
  2016-05-31 16:49 ` [PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change Lyude
@ 2016-06-04 20:46   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kroah-Hartman @ 2016-06-04 20:46 UTC (permalink / raw)
  To: Lyude
  Cc: stable, Daniel Vetter, Jani Nikula, David Airlie,
	open list:INTEL DRM DRIVERS (excluding Poulsbo, Moorestow...), dri-devel@lists.freedesktop.org (open list:INTEL DRM DRIVERS (excluding Poulsbo, Moorestow...), linux-kernel@vger.kernel.org (open list))

On Tue, May 31, 2016 at 12:49:06PM -0400, Lyude wrote:
> If an MST device is disconnected while the machine is suspended, the
> number of connectors will change as well after we call
> intel_dp_mst_resume(). This means that any previous atomic state we had
> before suspending is no longer valid, since it'll still be pointing to
> missing connectors. We need to check for this before committing the
> state, otherwise we'll kernel panic on resume whenever if any MST
> display was disconnected before we started resuming:
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
> Call Trace:
>  [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
>  [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
>  [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
>  [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
>  [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
>  [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
>  [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
>  [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
>  [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
>  [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
>  [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
>  [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
>  [<ffffffff814da455>] device_resume+0xd5/0x1f0
>  [<ffffffff814da58d>] async_resume+0x1d/0x50
>  [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
>  [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
>  [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
>  [<ffffffff810ad038>] worker_thread+0x48/0x4e0
>  [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
>  [<ffffffff810b3794>] kthread+0xe4/0x100
>  [<ffffffff81742672>] ret_from_fork+0x22/0x50
>  [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200
> 
> Changes since v1:
>   - Move drm_atomic_state_free() call down so we're holding the
>     appropriate locks when destroying the atomic state
> Changes since v2:
>   - Check that state != NULL before we start accessing it's members
> 
> This fix is only required for 4.6 and below. David Airlie's patchseries
> for 4.7 to add connector reference counting provides a more proper fix
> for this.

This doesn't apply to 4.5-stable or 4.4-stable, if you want it there,
can you provide a backported version for me to apply?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-06-04 20:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-31 16:49 [PATCH 0/4] Important MST fixes for 4.6 Lyude
2016-05-31 16:49 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
2016-05-31 16:49 ` [PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change Lyude
2016-06-04 20:46   ` Greg Kroah-Hartman
2016-05-31 19:10 ` [Intel-gfx] [PATCH 0/4] Important MST fixes for 4.6 Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).