From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>,
Daniel Vetter <daniel.vetter@ffwll.ch>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 2/2] drm/i915: Use rcu instead of stop_machine in set_wedged
Date: Tue, 10 Oct 2017 12:50:45 +0300
Message-ID: <87r2ubjpqi.fsf@gaia.fi.intel.com>
In-Reply-To: <20171009164401.16035-2-daniel.vetter@ffwll.ch>
Daniel Vetter <daniel.vetter@ffwll.ch> writes:
> stop_machine is not really a locking primitive we should use, except
> when the hw folks tell us the hw is broken and that's the only way to
> work around it.
>
> This patch tries to address the locking abuse of stop_machine() from
>
> commit 20e4933c478a1ca694b38fa4ac44d99e659941f5
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Tue Nov 22 14:41:21 2016 +0000
>
> drm/i915: Stop the machine as we install the wedged submit_request handler
>
> Chris said part of the reason for going with stop_machine() was that
> it adds no overhead to the fast-path. But these callbacks use irqsave
> spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
>
> To stay as close as possible to the stop_machine semantics we first
> update all the submit function pointers to the nop handler, then call
> synchronize_rcu() to make sure no new requests can be submitted. This
> should give us exactly the huge barrier we want.
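(Aside for anyone reading along: the barrier described above is the
classic RCU function-pointer swap. A minimal, illustrative sketch with
made-up names, not the actual i915 code:)

#include <linux/rcupdate.h>

struct request;

struct engine {
	void (*submit_request)(struct request *rq);
};

static void nop_submit_request(struct request *rq)
{
	/* complete the request immediately, never touching the hw */
}

/* Fast path: callers invoke the current handler under rcu_read_lock(). */
static void submit(struct engine *e, struct request *rq)
{
	rcu_read_lock();
	e->submit_request(rq);	/* sees either the real or the nop handler */
	rcu_read_unlock();
}

/* Wedge path: swap in the nop handler, then wait out the old readers. */
static void wedge_all(struct engine *engines, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++)
		engines[i].submit_request = nop_submit_request;

	/*
	 * Any submitter that raced with the swap and picked up the old
	 * handler is still inside its rcu_read_lock() section; once
	 * synchronize_rcu() returns, all of those have finished, so only
	 * the nop handler can run from here on.
	 */
	synchronize_rcu();
}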
>
> I pondered whether we should annotate engine->submit_request as __rcu
> and use rcu_assign_pointer and rcu_dereference on it. But the point of
> those annotations is to provide the compiler/cpu barriers needed when
> the pointer targets an actual data structure, so that all the writes
> to it are seen correctly on the read side. Here we just have a
> function pointer, and .text isn't changed, so there's no need for
> these barriers and hence no need for the annotations.
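(For contrast, a generic sketch of where those annotations do matter:
when the pointer targets a structure whose fields have to be fully
published before readers may see the new pointer. Made-up example,
unrelated to this patch:)

struct config {
	int threshold;
	int timeout_ms;
};

static struct config __rcu *active_config;

/* Writer: initialize the object fully, then publish it. */
static void install_config(struct config *newc)
{
	newc->threshold = 10;
	newc->timeout_ms = 500;
	/* rcu_assign_pointer() orders the stores above before the publish */
	rcu_assign_pointer(active_config, newc);
}

/* Reader: the matching rcu_dereference() gives the dependency ordering. */
static int read_threshold(void)
{
	struct config *c;
	int ret = -1;

	rcu_read_lock();
	c = rcu_dereference(active_config);
	if (c)
		ret = c->threshold;
	rcu_read_unlock();

	return ret;
}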
>
> Unfortunately there's a complication with the call to
> intel_engine_init_global_seqno:
>
> - Without stop_machine we must hold the corresponding spinlock.
>
> - Without stop_machine we must ensure that all requests are marked as
> having failed with dma_fence_set_error() before we call it. That
> means we need to split the nop request submission into two phases,
> both synchronized with rcu:
>
> 1. Only stop submitting the requests to hw and mark them as failed.
>
> 2. Once all pending requests in the scheduler/ring are suitably marked
> up as failed and can therefore be force-completed, also force-complete
> them all by calling intel_engine_init_global_seqno().
>
> This should fix the following lockdep splat:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U
> ------------------------------------------------------
> kworker/3:4/562 is trying to acquire lock:
> (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40
>
> but task is already holding lock:
> (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #6 (&dev->struct_mutex){+.+.}:
> __lock_acquire+0x1420/0x15e0
> lock_acquire+0xb0/0x200
> __mutex_lock+0x86/0x9b0
> mutex_lock_interruptible_nested+0x1b/0x20
> i915_mutex_lock_interruptible+0x51/0x130 [i915]
> i915_gem_fault+0x209/0x650 [i915]
> __do_fault+0x1e/0x80
> __handle_mm_fault+0xa08/0xed0
> handle_mm_fault+0x156/0x300
> __do_page_fault+0x2c5/0x570
> do_page_fault+0x28/0x250
> page_fault+0x22/0x30
>
> -> #5 (&mm->mmap_sem){++++}:
> __lock_acquire+0x1420/0x15e0
> lock_acquire+0xb0/0x200
> __might_fault+0x68/0x90
> _copy_to_user+0x23/0x70
> filldir+0xa5/0x120
> dcache_readdir+0xf9/0x170
> iterate_dir+0x69/0x1a0
> SyS_getdents+0xa5/0x140
> entry_SYSCALL_64_fastpath+0x1c/0xb1
>
> -> #4 (&sb->s_type->i_mutex_key#5){++++}:
> down_write+0x3b/0x70
> handle_create+0xcb/0x1e0
> devtmpfsd+0x139/0x180
> kthread+0x152/0x190
> ret_from_fork+0x27/0x40
>
> -> #3 ((complete)&req.done){+.+.}:
> __lock_acquire+0x1420/0x15e0
> lock_acquire+0xb0/0x200
> wait_for_common+0x58/0x210
> wait_for_completion+0x1d/0x20
> devtmpfs_create_node+0x13d/0x160
> device_add+0x5eb/0x620
> device_create_groups_vargs+0xe0/0xf0
> device_create+0x3a/0x40
> msr_device_create+0x2b/0x40
> cpuhp_invoke_callback+0xc9/0xbf0
> cpuhp_thread_fun+0x17b/0x240
> smpboot_thread_fn+0x18a/0x280
> kthread+0x152/0x190
> ret_from_fork+0x27/0x40
>
> -> #2 (cpuhp_state-up){+.+.}:
> __lock_acquire+0x1420/0x15e0
> lock_acquire+0xb0/0x200
> cpuhp_issue_call+0x133/0x1c0
> __cpuhp_setup_state_cpuslocked+0x139/0x2a0
> __cpuhp_setup_state+0x46/0x60
> page_writeback_init+0x43/0x67
> pagecache_init+0x3d/0x42
> start_kernel+0x3a8/0x3fc
> x86_64_start_reservations+0x2a/0x2c
> x86_64_start_kernel+0x6d/0x70
> verify_cpu+0x0/0xfb
>
> -> #1 (cpuhp_state_mutex){+.+.}:
> __lock_acquire+0x1420/0x15e0
> lock_acquire+0xb0/0x200
> __mutex_lock+0x86/0x9b0
> mutex_lock_nested+0x1b/0x20
> __cpuhp_setup_state_cpuslocked+0x53/0x2a0
> __cpuhp_setup_state+0x46/0x60
> page_alloc_init+0x28/0x30
> start_kernel+0x145/0x3fc
> x86_64_start_reservations+0x2a/0x2c
> x86_64_start_kernel+0x6d/0x70
> verify_cpu+0x0/0xfb
>
> -> #0 (cpu_hotplug_lock.rw_sem){++++}:
> check_prev_add+0x430/0x840
> __lock_acquire+0x1420/0x15e0
> lock_acquire+0xb0/0x200
> cpus_read_lock+0x3d/0xb0
> stop_machine+0x1c/0x40
> i915_gem_set_wedged+0x1a/0x20 [i915]
> i915_reset+0xb9/0x230 [i915]
> i915_reset_device+0x1f6/0x260 [i915]
> i915_handle_error+0x2d8/0x430 [i915]
> hangcheck_declare_hang+0xd3/0xf0 [i915]
> i915_hangcheck_elapsed+0x262/0x2d0 [i915]
> process_one_work+0x233/0x660
> worker_thread+0x4e/0x3b0
> kthread+0x152/0x190
> ret_from_fork+0x27/0x40
>
> other info that might help us debug this:
>
> Chain exists of:
> cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&dev->struct_mutex);
>                                lock(&mm->mmap_sem);
>                                lock(&dev->struct_mutex);
>   lock(cpu_hotplug_lock.rw_sem);
>
> *** DEADLOCK ***
>
> 3 locks held by kworker/3:4/562:
> #0: ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
> #1: ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
> #2: (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]
>
> stack backtrace:
> CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G U 4.14.0-rc3-CI-CI_DRM_3179+ #1
> Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
> Workqueue: events_long i915_hangcheck_elapsed [i915]
> Call Trace:
> dump_stack+0x68/0x9f
> print_circular_bug+0x235/0x3c0
> ? lockdep_init_map_crosslock+0x20/0x20
> check_prev_add+0x430/0x840
> ? irq_work_queue+0x86/0xe0
> ? wake_up_klogd+0x53/0x70
> __lock_acquire+0x1420/0x15e0
> ? __lock_acquire+0x1420/0x15e0
> ? lockdep_init_map_crosslock+0x20/0x20
> lock_acquire+0xb0/0x200
> ? stop_machine+0x1c/0x40
> ? i915_gem_object_truncate+0x50/0x50 [i915]
> cpus_read_lock+0x3d/0xb0
> ? stop_machine+0x1c/0x40
> stop_machine+0x1c/0x40
> i915_gem_set_wedged+0x1a/0x20 [i915]
> i915_reset+0xb9/0x230 [i915]
> i915_reset_device+0x1f6/0x260 [i915]
> ? gen8_gt_irq_ack+0x170/0x170 [i915]
> ? work_on_cpu_safe+0x60/0x60
> i915_handle_error+0x2d8/0x430 [i915]
> ? vsnprintf+0xd1/0x4b0
> ? scnprintf+0x3a/0x70
> hangcheck_declare_hang+0xd3/0xf0 [i915]
> ? intel_runtime_pm_put+0x56/0xa0 [i915]
> i915_hangcheck_elapsed+0x262/0x2d0 [i915]
> process_one_work+0x233/0x660
> worker_thread+0x4e/0x3b0
> kthread+0x152/0x190
> ? process_one_work+0x660/0x660
> ? kthread_create_on_node+0x40/0x40
> ret_from_fork+0x27/0x40
> Setting dangerous option reset - tainting kernel
> i915 0000:00:02.0: Resetting chip after gpu hang
> Setting dangerous option reset - tainting kernel
> i915 0000:00:02.0: Resetting chip after gpu hang
>
> v2: Have 1 global synchronize_rcu() barrier across all engines, and
> improve commit message.
>
> v3: We need to protect the seqno update with the timeline spinlock (in
> set_wedged) to avoid racing with other updates of the seqno, like we
> already do in nop_submit_request (Chris).
>
> v4: Use two-phase sequence to plug the race Chris spotted where we can
> complete requests before they're marked up with -EIO.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Marta Lofstedt <marta.lofstedt@intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 76 +++++++++++++++--------
> drivers/gpu/drm/i915/i915_gem_request.c | 2 +
> drivers/gpu/drm/i915/selftests/i915_gem_request.c | 2 +
> 3 files changed, 53 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 82a10036fb38..8d7d8d1f78db 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3057,49 +3057,71 @@ static void nop_submit_request(struct drm_i915_gem_request *request)
>
> spin_lock_irqsave(&request->engine->timeline->lock, flags);
> __i915_gem_request_submit(request);
> - intel_engine_init_global_seqno(request->engine, request->global_seqno);
> spin_unlock_irqrestore(&request->engine->timeline->lock, flags);
> }
>
> -static void engine_set_wedged(struct intel_engine_cs *engine)
> +static void nop_complete_submit_request(struct drm_i915_gem_request *request)
> {
> - /* We need to be sure that no thread is running the old callback as
> - * we install the nop handler (otherwise we would submit a request
> - * to hardware that will never complete). In order to prevent this
> - * race, we wait until the machine is idle before making the swap
> - * (using stop_machine()).
> - */
> - engine->submit_request = nop_submit_request;
> + unsigned long flags;
>
> - /* Mark all executing requests as skipped */
> - engine->cancel_requests(engine);
> + GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error));
> + dma_fence_set_error(&request->fence, -EIO);
>
> - /* Mark all pending requests as complete so that any concurrent
> - * (lockless) lookup doesn't try and wait upon the request as we
> - * reset it.
> - */
> - intel_engine_init_global_seqno(engine,
> - intel_engine_last_submit(engine));
> + spin_lock_irqsave(&request->engine->timeline->lock, flags);
> + __i915_gem_request_submit(request);
> + intel_engine_init_global_seqno(request->engine, request->global_seqno);
> + spin_unlock_irqrestore(&request->engine->timeline->lock, flags);
> }
>
> -static int __i915_gem_set_wedged_BKL(void *data)
> +void i915_gem_set_wedged(struct drm_i915_private *i915)
> {
> - struct drm_i915_private *i915 = data;
> struct intel_engine_cs *engine;
> enum intel_engine_id id;
>
> + /* First, stop submission to hw, but do not yet complete requests by
> + * rolling the global seqno forward (since this would complete requests
> + * for which we haven't set the fence error to EIO yet).
> + */
> for_each_engine(engine, i915, id)
> - engine_set_wedged(engine);
> + engine->submit_request = nop_submit_request;
>
> - set_bit(I915_WEDGED, &i915->gpu_error.flags);
> - wake_up_all(&i915->gpu_error.reset_queue);
> + /* Make sure no one is running the old callback before we proceed with
> + * cancelling requests and resetting the completion tracking. Otherwise
> + * we might submit a request to the hardware which never completes.
> + */
> + synchronize_rcu();
>
> - return 0;
> -}
> + for_each_engine(engine, i915, id) {
> + /* Mark all executing requests as skipped */
> + engine->cancel_requests(engine);
>
> -void i915_gem_set_wedged(struct drm_i915_private *dev_priv)
> -{
> - stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL);
> + /* Only once we've force-cancelled all in-flight requests can we
> + * start to complete all requests.
> + */
> + engine->submit_request = nop_complete_submit_request;
> + }
> +
> + /* Make sure no request can slip through without getting completed by
> + * either this call here to intel_engine_init_global_seqno, or the one
> + * in nop_complete_submit_request.
> + */
> + synchronize_rcu();
> +
> + for_each_engine(engine, i915, id) {
> + unsigned long flags;
> +
> + /* Mark all pending requests as complete so that any concurrent
> + * (lockless) lookup doesn't try and wait upon the request as we
> + * reset it.
> + */
> + spin_lock_irqsave(&engine->timeline->lock, flags);
> + intel_engine_init_global_seqno(engine,
> + intel_engine_last_submit(engine));
> + spin_unlock_irqrestore(&engine->timeline->lock, flags);
> + }
> +
> + set_bit(I915_WEDGED, &i915->gpu_error.flags);
Can we move this set_bit above the loop, right after synchronize_rcu()?
The thinking is that this way we stop accepting requests we can't
complete as early as possible.
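Roughly what I have in mind, as a fragment of i915_gem_set_wedged() on
top of this patch (untested, just to show the ordering):

	for_each_engine(engine, i915, id)
		engine->submit_request = nop_submit_request;

	/* No new request reaches the hardware after this returns. */
	synchronize_rcu();

	/* Refuse further work as early as possible. */
	set_bit(I915_WEDGED, &i915->gpu_error.flags);

	for_each_engine(engine, i915, id) {
		/* Mark all executing requests as skipped */
		engine->cancel_requests(engine);
		engine->submit_request = nop_complete_submit_request;
	}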
-Mika
> + wake_up_all(&i915->gpu_error.reset_queue);
> }
>
> bool i915_gem_unset_wedged(struct drm_i915_private *i915)
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index b100b38f1dd2..ef78a85cb845 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -556,7 +556,9 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> switch (state) {
> case FENCE_COMPLETE:
> trace_i915_gem_request_submit(request);
> + rcu_read_lock();
> request->engine->submit_request(request);
> + rcu_read_unlock();
> break;
>
> case FENCE_FREE:
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_request.c b/drivers/gpu/drm/i915/selftests/i915_gem_request.c
> index 78b9f811707f..a999161e8db1 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_request.c
> @@ -215,7 +215,9 @@ static int igt_request_rewind(void *arg)
> }
> i915_gem_request_get(vip);
> i915_add_request(vip);
> + rcu_read_lock();
> request->engine->submit_request(request);
> + rcu_read_unlock();
>
> mutex_unlock(&i915->drm.struct_mutex);
>
> --
> 2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx