All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: Manasi Navare <manasi.d.navare@intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: Tony Cheng <tony.cheng@amd.com>,
	Daniel Vetter <daniel.vetter@intel.com>,
	Harry Wentland <Harry.wentland@amd.com>
Subject: Re: [PATCH] drm/i915: Cancel the modeset retry work during modeset cleanup
Date: Thu, 26 Oct 2017 09:31:19 +0200	[thread overview]
Message-ID: <d0c4111b-e090-c042-7289-327183663d8d@linux.intel.com> (raw)
In-Reply-To: <1508980804-17197-1-git-send-email-manasi.d.navare@intel.com>

Op 26-10-17 om 03:20 schreef Manasi Navare:
> During modeset cleanup on driver unload we may have a pending
> hotplug work. This needs to be cancel early during the teardown
> so that it does not fire after we have freed the connector.
> We do this after drm_kms_helper_poll_fini(dev) since this might
> cause link retrain and before intel_fbdev_fini() since this tries to
> free the connector.
>
> If this is not done we may see something like:
> DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
>  ------------[ cut here ]------------
>  WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60
>  Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178
> +a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid
> +[last unloaded: snd_hda_intel]
>  CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G     U          4.14.0-rc3-CI-CI_DRM_3186+ #1
>  Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
>  task: ffff8803c827aa40 task.stack: ffffc90000520000
>  RIP: 0010:mutex_destroy+0x4e/0x60
>  RSP: 0018:ffffc90000523d58 EFLAGS: 00010292
>  RAX: 000000000000002a RBX: ffff88044fbef648 RCX: 0000000000000000
>  RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0
>  RBP: ffffc90000523d60 R08: 0000000000000001 R09: 0000000000000001
>  R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff88044f71efc8
>  R13: ffffffffa02b3d20 R14: ffffffffa02b3d90 R15: ffff880459b29308
>  FS:  00007f5df4d6e8c0(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 000055ec51f00a18 CR3: 0000000451782006 CR4: 00000000003606e0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  Call Trace:
>   drm_fb_helper_fini+0xd9/0x130
>   intel_fbdev_destroy+0x12/0x60 [i915]
>   intel_fbdev_fini+0x28/0x30 [i915]
>   intel_modeset_cleanup+0x45/0xa0 [i915]
>   i915_driver_unload+0x92/0x180 [i915]
>   i915_pci_remove+0x19/0x30 [i915]
>   i915_driver_unload+0x92/0x180 [i915]
>   i915_pci_remove+0x19/0x30 [i915]
>   pci_device_remove+0x39/0xb0
>   device_release_driver_internal+0x15d/0x220
>   driver_detach+0x40/0x80
>   bus_remove_driver+0x58/0xd0
>   driver_unregister+0x2c/0x40
>   pci_unregister_driver+0x36/0xb0
>   i915_exit+0x1a/0x8b [i915]
>   SyS_delete_module+0x18c/0x1e0
>   entry_SYSCALL_64_fastpath+0x1c/0xb1
>  RIP: 0033:0x7f5df3286287
>  RSP: 002b:00007fff8e107cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
>  RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f5df3286287
>  RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564c7be02e48
>  RBP: ffffc90000523f88 R08: 0000000000000000 R09: 0000000000000080
>  R10: 00007f5df4d6e8c0 R11: 0000000000000246 R12: 0000000000000000
>  R13: 00007fff8e107eb0 R14: 0000000000000000 R15: 0000000000000000
Or a GPF like:

 general protection fault: 0000 [#1] PREEMPT SMP
 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid [last unloaded: snd_hda_intel]
 CPU: 0 PID: 82 Comm: kworker/0:1 Tainted: G     U  W       4.14.0-rc3-CI-CI_DRM_3186+ #1
 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
 Workqueue: events intel_dp_modeset_retry_work_fn [i915]
 task: ffff88045a5caa40 task.stack: ffffc90000378000
 RIP: 0010:drm_setup_crtcs+0x143/0xbf0
 RSP: 0018:ffffc9000037bd20 EFLAGS: 00010202
 RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000001
 RDX: 0000000000000001 RSI: 0000000000000780 RDI: 00000000ffffffff
 RBP: ffffc9000037bdb8 R08: 0000000000000001 R09: 0000000000000001
 R10: 0000000000000780 R11: 0000000000000000 R12: 0000000000000002
 R13: ffff88044fbef4e8 R14: 0000000000000780 R15: 0000000000000438
 FS:  0000000000000000(0000) GS:ffff88045d200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055ec51ee5168 CR3: 000000044c89d003 CR4: 00000000003606f0
 Call Trace:
  drm_fb_helper_hotplug_event.part.18+0x7e/0xc0
  drm_fb_helper_hotplug_event+0x1a/0x20
  intel_fbdev_output_poll_changed+0x1a/0x20 [i915]
  drm_kms_helper_hotplug_event+0x27/0x30
  intel_dp_modeset_retry_work_fn+0x77/0x80 [i915]
  process_one_work+0x233/0x660
  worker_thread+0x206/0x3b0
  kthread+0x152/0x190
  ? process_one_work+0x660/0x660
  ? kthread_create_on_node+0x40/0x40
  ret_from_fork+0x27/0x40
 Code: 06 00 00 45 8b 45 20 31 db 45 31 e4 45 85 c0 0f 8e 91 06 00 00 44 8b 75 94 44 8b 7d 90 49 8b 45 28 49 63 d4 44 89 f6 41 83 c4 01 <48> 8b 04 d0 44 89 fa 48 8b 38 48 8b 87 a8 01 00 00 ff 50 20 01
 RIP: drm_setup_crtcs+0x143/0xbf0 RSP: ffffc9000037bd20
 ---[ end trace 08901ff1a77d30c7 ]---

The second backtrace is also needed, since it shows the full picture of the race.

retry_work_fn calling kms helper hotplug event, which ends up calling drm_fb_helper_hotplug_event.

This is the race, and the reason why retry_work_fn has to be killed after poll_fini and before fbdev_fini.

I agree with Chris though, having a separate function makes the dependency wrt poll_fini more clear.

The connector isn't freed yet, but fbdev is at the point of the GPF. :)

> Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Fixes: 9301397a63b3 ("drm/i915: Implement Link Rate fallback on Link training failure")
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tony Cheng <tony.cheng@amd.com>
> Cc: Harry Wentland <Harry.wentland@amd.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> Cc: Manasi Navare <manasi.d.navare@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h      |  1 +
>  drivers/gpu/drm/i915/intel_display.c | 16 ++++++++++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 366ba74..f81b073 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -4114,6 +4114,7 @@ extern void intel_modeset_gem_init(struct drm_device *dev);
>  extern void intel_modeset_cleanup(struct drm_device *dev);
>  extern int intel_connector_register(struct drm_connector *);
>  extern void intel_connector_unregister(struct drm_connector *);
> +extern void intel_connector_work_fn_cleanup(struct drm_device *dev);
>  extern int intel_modeset_vga_set_state(struct drm_i915_private *dev_priv,
>  				       bool state);
>  extern void intel_display_resume(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 0e493a1..44158ff 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -15200,6 +15200,19 @@ void intel_connector_unregister(struct drm_connector *connector)
>  	intel_panel_destroy_backlight(connector);
>  }
>  
> +void intel_connector_work_fn_cleanup(struct drm_device *dev)
> +{
> +	struct intel_connector *connector;
> +	struct drm_connector_list_iter conn_iter;
> +
> +	drm_connector_list_iter_begin(dev, &conn_iter);
> +	for_each_intel_connector_iter(connector, &conn_iter) {
> +		if (connector->modeset_retry_work.func)
> +			cancel_work_sync(&connector->modeset_retry_work);
> +	}
> +	drm_connector_list_iter_end(&conn_iter);
> +}
> +
>  void intel_modeset_cleanup(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> @@ -15222,6 +15235,9 @@ void intel_modeset_cleanup(struct drm_device *dev)
>  	 */
>  	drm_kms_helper_poll_fini(dev);
>  
> +	/* Cancel pending modeset retry work in intel_connector */
> +	intel_connector_work_fn_cleanup(dev);
> +
>  	/* poll work can call into fbdev, hence clean that up afterwards */
>  	intel_fbdev_fini(dev_priv);
>  


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

      parent reply	other threads:[~2017-10-26  7:31 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-26  1:20 [PATCH] drm/i915: Cancel the modeset retry work during modeset cleanup Manasi Navare
2017-10-26  1:38 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-10-26  3:12 ` ✓ Fi.CI.IGT: " Patchwork
2017-10-26  6:33 ` [PATCH] " Chris Wilson
2017-10-26 16:18   ` Manasi Navare
2017-10-26  7:31 ` Maarten Lankhorst [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d0c4111b-e090-c042-7289-327183663d8d@linux.intel.com \
    --to=maarten.lankhorst@linux.intel.com \
    --cc=Harry.wentland@amd.com \
    --cc=daniel.vetter@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=manasi.d.navare@intel.com \
    --cc=tony.cheng@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.