From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 92005] Linux 4.2 DisplayPort MST deadlock?
Date: Tue, 15 Sep 2015 02:20:39 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=92005

    Bug ID: 92005
   Summary: Linux 4.2 DisplayPort MST deadlock?
   Product: DRI
   Version: unspecified
  Hardware: x86-64 (AMD64)
        OS: Linux (All)
    Status: NEW
  Severity: normal
  Priority: medium
 Component: General
  Assignee: dri-devel@lists.freedesktop.org
  Reporter: adam_richter2004@yahoo.com

In Linux-4.2, there appears to be mutex contention, and possibly an occasional
deadlock, between two kernel functions, drm_mode_getconnector and
drm_fb_helper_hotplug_event, over the mutex dev->mode_config.mutex.
Neither of these functions is specific to MultiStreamTransport or even
DisplayPort generally, but I think that the DP-MST code might be unique in
causing the contention and deadlock, either because the unusual tree
structure of DP-MST was not anticipated or because of a bug in the DP-MST
code.  In particular, I have at least once observed the following call trace,
where I think drm_mode_getconnector took the mutex and then, through a long
hierarchy of calls, eventually ended up calling drm_fb_helper_hotplug_event,
which tried to take it again.
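
As a rough userspace illustration of the suspected pattern (not kernel code),
the sketch below uses a POSIX error-checking mutex as a stand-in for
dev->mode_config.mutex.  The names getconnector() and hotplug_event() are
hypothetical placeholders for the two kernel functions, and the nested call
chain between them is collapsed into one direct call.  With an error-checking
mutex, the second lock attempt by the same thread fails with EDEADLK instead
of blocking forever.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for dev->mode_config.mutex. */
static pthread_mutex_t mode_config_mutex;

/* Create the mutex as error-checking, so that a relock by the owning
 * thread returns EDEADLK rather than hanging. */
static void init_mode_config_mutex(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&mode_config_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Placeholder for drm_fb_helper_hotplug_event: takes the mutex itself. */
static int hotplug_event(void)
{
    int rc = pthread_mutex_lock(&mode_config_mutex);

    if (rc != 0)
        return rc;  /* EDEADLK if the calling thread already owns it */
    pthread_mutex_unlock(&mode_config_mutex);
    return 0;
}

/* Placeholder for drm_mode_getconnector: takes the mutex, then reaches
 * hotplug_event() while still holding it. */
static int getconnector(void)
{
    int rc;

    pthread_mutex_lock(&mode_config_mutex);
    rc = hotplug_event();
    pthread_mutex_unlock(&mode_config_mutex);
    return rc;
}
```

After init_mode_config_mutex(), calling hotplug_event() on its own returns 0,
while getconnector() returns EDEADLK because of the nested lock attempt.  With
a default (non-error-checking) mutex the same call would be expected to hang.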

I hesitate to open this ticket, because I am not sure that the "dev" variable
at the top of this stack trace is the same one as at the bottom, especially
considering that I did not notice the system complaining about attempting to
block on a mutex where mutex->owner == current, even though
CONFIG_DEBUG_MUTEXES was set.  The system that I got this trace from was
blocked indefinitely as far as I could tell, which is unusual: the problem
that I usually observe is "xrandr" taking on the order of a minute to
complete, and often reporting inaccurate results, but usually not hanging
forever.

I suspect that what happened involved some intervening hotplug event, or
perhaps kernel work functions, in a way that I am not completely clear about,
such that mutex->owner could somehow have been set to the "current" of a
kernel work thread instead of that of the X server.
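
To illustrate why an owner == current check would stay silent if a second
thread were involved, here is a small userspace sketch (an assumption-laden
analogy, not a reconstruction of what the kernel did).  The main thread holds
an error-checking mutex and waits for a worker; the worker tries to take the
same mutex.  The function names are hypothetical.  Because the owner and the
contender are different threads, the attempt is reported as ordinary
contention (EBUSY), not as a self-deadlock (EDEADLK).  The sketch uses
pthread_mutex_trylock so that it terminates; a blocking lock here would hang,
since the holder is waiting for the worker to finish.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock;

/* Worker: tries to take the mutex that the main thread is holding. */
static void *worker(void *arg)
{
    int *result = arg;

    *result = pthread_mutex_trylock(&lock);
    if (*result == 0)
        pthread_mutex_unlock(&lock);
    return NULL;
}

/* Hold the mutex, then wait for a worker that wants the same mutex.
 * Returns what the worker's lock attempt reported. */
static int lock_from_worker_while_held(void)
{
    pthread_mutexattr_t attr;
    pthread_t tid;
    int result = -1;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&lock);
    rc = pthread_create(&tid, NULL, worker, &result);
    assert(rc == 0);
    pthread_join(tid, NULL);        /* holder waits for the worker */
    pthread_mutex_unlock(&lock);

    pthread_mutex_destroy(&lock);
    return result;
}
```

lock_from_worker_while_held() returns EBUSY: the error-checking mutex only
flags a relock by its own owner, so a cycle that passes through another
thread is not detected by that check.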

Anyhow, the part that I think would likely be helpful to anyone working on this
(and basically the reason I am posting now, rather than waiting) is that this
stack trace might indicate some confusion in assumptions about whether
dev->mode_config.mutex is held by the caller of certain functions in the middle
of this stack trace.
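
One common way to make such caller assumptions explicit is to split a function
into a variant that takes the lock and a variant that requires the caller to
hold it, and to check that requirement at run time.  The userspace sketch
below shows the idea with a hypothetical owner-tracking wrapper; all names are
made up for illustration and none of them are DRM functions.  In the kernel, a
check of this kind is available as lockdep_assert_held() when lockdep is
enabled.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Mutex that remembers which thread holds it (illustrative only). */
struct owned_mutex {
    pthread_mutex_t mutex;
    pthread_t owner;    /* valid only while held != 0 */
    int held;
};

static void owned_mutex_init(struct owned_mutex *m)
{
    int rc = pthread_mutex_init(&m->mutex, NULL);

    assert(rc == 0);
    m->held = 0;
}

static void owned_mutex_lock(struct owned_mutex *m)
{
    pthread_mutex_lock(&m->mutex);
    m->owner = pthread_self();
    m->held = 1;
}

static void owned_mutex_unlock(struct owned_mutex *m)
{
    m->held = 0;
    pthread_mutex_unlock(&m->mutex);
}

/* Meaningful when asked by the thread doing the check about itself. */
static int owned_mutex_held_by_me(struct owned_mutex *m)
{
    return m->held && pthread_equal(m->owner, pthread_self());
}

/* Variant that requires the caller to hold the lock already. */
static int notify_hotplug_locked(struct owned_mutex *m)
{
    if (!owned_mutex_held_by_me(m))
        return EPERM;   /* caller broke the locking contract */
    /* ... work that needs the lock would go here ... */
    return 0;
}

/* Variant for callers that do not hold the lock. */
static int notify_hotplug(struct owned_mutex *m)
{
    int rc;

    owned_mutex_lock(m);
    rc = notify_hotplug_locked(m);
    owned_mutex_unlock(m);
    return rc;
}
```

Calling notify_hotplug_locked() without the lock returns EPERM, while
notify_hotplug() returns 0.  A code path that already holds the lock would
call the _locked variant directly instead of trying to take the lock again.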

[<ffffffffa01837c8>] drm_fb_helper_hotplug_event+0x138/0x150 [drm_kms_helper]
[<ffffffffa02de31e>] intel_fbdev_output_poll_changed+0x1e/0x30 [i915]
[<ffffffffa017755b>] drm_kms_helper_hotplug_event+0x2b/0x40 [drm_kms_helper]
[<ffffffffa02f1c15>] intel_dp_mst_hotplug+0x15/0x20 [i915]
[<ffffffffa017aef4>] drm_dp_destroy_port+0xd4/0xe0 [drm_kms_helper]
[<ffffffffa017af15>] drm_dp_put_port+0x15/0x20 [drm_kms_helper]
[<ffffffffa017b04e>] drm_dp_destroy_mst_branch_device+0x4e/0x100 [drm_kms_helper]
[<ffffffffa017b115>] drm_dp_put_mst_branch_device+0x15/0x20 [drm_kms_helper]
[<ffffffffa017b6fd>] drm_dp_mst_i2c_xfer+0x9d/0x270 [drm_kms_helper]
[<ffffffff814cca91>] __i2c_transfer+0x121/0x430
[<ffffffff814cce19>] i2c_transfer+0x79/0xb0
[<ffffffffa00bb4a9>] drm_do_probe_ddc_edid+0xc9/0x130 [drm]
[<ffffffffa00bb0fa>] drm_do_get_edid+0x17a/0x250 [drm]
[<ffffffffa00bca55>] drm_get_edid+0x45/0x3d0 [drm]
[<ffffffffa017b9ee>] drm_dp_mst_get_edid+0x7e/0xa0 [drm_kms_helper]
[<ffffffffa02f1c99>] intel_dp_mst_get_modes+0x29/0x50 [i915]
[<ffffffffa0177908>] drm_helper_probe_single_connector_modes_merge_bits+0x108/0x4e0 [drm_kms_helper]
[<ffffffffa0177cf3>] drm_helper_probe_single_connector_modes+0x13/0x20 [drm_kms_helper]
[<ffffffffa00b6ab9>] drm_mode_getconnector+0x389/0x410 [drm]
[<ffffffffa00a8685>] drm_ioctl+0x1a5/0x670 [drm]
[<ffffffffa00c4e53>] drm_compat_ioctl+0x33/0x40 [drm]
[<ffffffffa026bde2>] i915_compat_ioctl+0x32/0x40 [i915]
[<ffffffff812475f9>] compat_SyS_ioctl+0xc9/0x15d0
[<ffffffff8161ed22>] sysenter_dispatch+0xf/0x29
[<ffffffffffffffff>] 0xffffffffffffffff

I expect that I will update or close this ticket as (or if) I learn more.

I hope this information is helpful.  Comments, and, of course, fixes, are most
welcome.