* [Intel-gfx] [PATCH] drm/i915: Mark up racy read of rq->engine
@ 2020-04-23 11:58 Chris Wilson
2020-04-23 13:25 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
2020-04-23 14:53 ` [Intel-gfx] [PATCH] " Tvrtko Ursulin
0 siblings, 2 replies; 4+ messages in thread
From: Chris Wilson @ 2020-04-23 11:58 UTC (permalink / raw)
To: intel-gfx; +Cc: Chris Wilson
As the i915_request.engine may be updated by a virtual engine to either
point to the virtual engine or the real physical engine on submission,
we have to be wary that the engine pointer may change.
[ 213.317076] BUG: KCSAN: data-race in execlists_dequeue [i915] / i915_request_wait [i915]
[ 213.317097]
[ 213.317110] write (marked) to 0xffff8881e8647650 of 8 bytes by interrupt on cpu 2:
[ 213.317386] execlists_dequeue+0x43b/0x1670 [i915]
[ 213.317645] __execlists_submission_tasklet+0x48/0x60 [i915]
[ 213.317905] execlists_submission_tasklet+0xd3/0x170 [i915]
[ 213.317926] tasklet_action_common.isra.0+0x42/0x90
[ 213.317943] __do_softirq+0xc8/0x206
[ 213.317958] irq_exit+0xcd/0xe0
[ 213.317980] irq_work_interrupt+0xf/0x20
[ 213.317999] __tsan_read8+0x30/0x100
[ 213.318272] retire_requests+0xdd/0xf0 [i915]
[ 213.318502] engine_retire+0xa6/0xe0 [i915]
[ 213.318519] process_one_work+0x3af/0x640
[ 213.318534] worker_thread+0x80/0x670
[ 213.318548] kthread+0x19a/0x1e0
[ 213.318566] ret_from_fork+0x1f/0x30
[ 213.318584]
[ 213.318595] read to 0xffff8881e8647650 of 8 bytes by task 458 on cpu 1:
[ 213.318847] i915_request_wait+0x3e3/0x510 [i915]
[ 213.319088] i915_gem_object_wait_fence+0x81/0xa0 [i915]
[ 213.319328] i915_gem_object_wait+0x26b/0x560 [i915]
[ 213.319578] i915_gem_wait_ioctl+0x141/0x290 [i915]
[ 213.319597] drm_ioctl_kernel+0xe9/0x130
[ 213.319613] drm_ioctl+0x27d/0x45e
[ 213.319628] ksys_ioctl+0x89/0xb0
[ 213.319648] __x64_sys_ioctl+0x42/0x60
[ 213.319666] do_syscall_64+0x6e/0x2c0
[ 213.319680] entry_SYSCALL_64_after_hwframe+0x44/0xa9
In this case, we are merely trying to flush the most recent engine
associated with the request, and do not care which as the process of
chaing the engine is done by a tasklet, with which we are yielding to.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_request.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 22635bbabf06..e9fd20242438 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1660,7 +1660,7 @@ long i915_request_wait(struct i915_request *rq,
break;
}
- intel_engine_flush_submission(rq->engine);
+ intel_engine_flush_submission(READ_ONCE(rq->engine));
if (signal_pending_state(state, current)) {
timeout = -ERESTARTSYS;
--
2.20.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Mark up racy read of rq->engine
2020-04-23 11:58 [Intel-gfx] [PATCH] drm/i915: Mark up racy read of rq->engine Chris Wilson
@ 2020-04-23 13:25 ` Patchwork
2020-04-23 14:53 ` [Intel-gfx] [PATCH] " Tvrtko Ursulin
1 sibling, 0 replies; 4+ messages in thread
From: Patchwork @ 2020-04-23 13:25 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Mark up racy read of rq->engine
URL : https://patchwork.freedesktop.org/series/76392/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_8352 -> Patchwork_17439
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_17439 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_17439, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17439/index.html
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_17439:
### IGT changes ###
#### Possible regressions ####
* igt@i915_selftest@live@gt_pm:
- fi-byt-j1900: [PASS][1] -> [FAIL][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8352/fi-byt-j1900/igt@i915_selftest@live@gt_pm.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17439/fi-byt-j1900/igt@i915_selftest@live@gt_pm.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* igt@i915_selftest@live@workarounds:
- {fi-tgl-dsi}: [PASS][3] -> [INCOMPLETE][4]
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8352/fi-tgl-dsi/igt@i915_selftest@live@workarounds.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17439/fi-tgl-dsi/igt@i915_selftest@live@workarounds.html
Known issues
------------
Here are the changes found in Patchwork_17439 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@i915_selftest@live@sanitycheck:
- fi-bwr-2160: [PASS][5] -> [INCOMPLETE][6] ([i915#489])
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8352/fi-bwr-2160/igt@i915_selftest@live@sanitycheck.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17439/fi-bwr-2160/igt@i915_selftest@live@sanitycheck.html
#### Possible fixes ####
* igt@i915_selftest@live@gt_lrc:
- fi-snb-2600: [FAIL][7] ([i915#1763]) -> [PASS][8]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8352/fi-snb-2600/igt@i915_selftest@live@gt_lrc.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17439/fi-snb-2600/igt@i915_selftest@live@gt_lrc.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[i915#1763]: https://gitlab.freedesktop.org/drm/intel/issues/1763
[i915#489]: https://gitlab.freedesktop.org/drm/intel/issues/489
Participating hosts (48 -> 43)
------------------------------
Additional (1): fi-bsw-kefka
Missing (6): fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus
Build changes
-------------
* CI: CI-20190529 -> None
* Linux: CI_DRM_8352 -> Patchwork_17439
CI-20190529: 20190529
CI_DRM_8352: 248cbab28d58c203de956df1db4cdeb53ea97a89 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5608: e7bcaf1dd251d454706c7cd64282f531aec50183 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_17439: caf81f8e993e52fdcebf6e56807b254a65257613 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
caf81f8e993e drm/i915: Mark up racy read of rq->engine
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17439/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Intel-gfx] [PATCH] drm/i915: Mark up racy read of rq->engine
2020-04-23 11:58 [Intel-gfx] [PATCH] drm/i915: Mark up racy read of rq->engine Chris Wilson
2020-04-23 13:25 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
@ 2020-04-23 14:53 ` Tvrtko Ursulin
2020-04-23 15:11 ` Chris Wilson
1 sibling, 1 reply; 4+ messages in thread
From: Tvrtko Ursulin @ 2020-04-23 14:53 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 23/04/2020 12:58, Chris Wilson wrote:
> As the i915_request.engine may be updated by a virtual engine to either
> point to the virtual engine or the real physical engine on submission,
> we have to be wary that the engine pointer may change.
>
> [ 213.317076] BUG: KCSAN: data-race in execlists_dequeue [i915] / i915_request_wait [i915]
> [ 213.317097]
> [ 213.317110] write (marked) to 0xffff8881e8647650 of 8 bytes by interrupt on cpu 2:
> [ 213.317386] execlists_dequeue+0x43b/0x1670 [i915]
> [ 213.317645] __execlists_submission_tasklet+0x48/0x60 [i915]
> [ 213.317905] execlists_submission_tasklet+0xd3/0x170 [i915]
> [ 213.317926] tasklet_action_common.isra.0+0x42/0x90
> [ 213.317943] __do_softirq+0xc8/0x206
> [ 213.317958] irq_exit+0xcd/0xe0
> [ 213.317980] irq_work_interrupt+0xf/0x20
> [ 213.317999] __tsan_read8+0x30/0x100
> [ 213.318272] retire_requests+0xdd/0xf0 [i915]
> [ 213.318502] engine_retire+0xa6/0xe0 [i915]
> [ 213.318519] process_one_work+0x3af/0x640
> [ 213.318534] worker_thread+0x80/0x670
> [ 213.318548] kthread+0x19a/0x1e0
> [ 213.318566] ret_from_fork+0x1f/0x30
> [ 213.318584]
> [ 213.318595] read to 0xffff8881e8647650 of 8 bytes by task 458 on cpu 1:
> [ 213.318847] i915_request_wait+0x3e3/0x510 [i915]
> [ 213.319088] i915_gem_object_wait_fence+0x81/0xa0 [i915]
> [ 213.319328] i915_gem_object_wait+0x26b/0x560 [i915]
> [ 213.319578] i915_gem_wait_ioctl+0x141/0x290 [i915]
> [ 213.319597] drm_ioctl_kernel+0xe9/0x130
> [ 213.319613] drm_ioctl+0x27d/0x45e
> [ 213.319628] ksys_ioctl+0x89/0xb0
> [ 213.319648] __x64_sys_ioctl+0x42/0x60
> [ 213.319666] do_syscall_64+0x6e/0x2c0
> [ 213.319680] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> In this case, we are merely trying to flush the most recent engine
> associated with the request, and do not care which as the process of
> chaing the engine is done by a tasklet, with which we are yielding to.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_request.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 22635bbabf06..e9fd20242438 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1660,7 +1660,7 @@ long i915_request_wait(struct i915_request *rq,
> break;
> }
>
> - intel_engine_flush_submission(rq->engine);
> + intel_engine_flush_submission(READ_ONCE(rq->engine));
>
> if (signal_pending_state(state, current)) {
> timeout = -ERESTARTSYS;
>
What with the mutex_acquire/release in this case? No practical effect
but they are also dereferencing rq->engine... Take a copy of engine for
lockdep at start and another read for engine flushing in the loop?
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Intel-gfx] [PATCH] drm/i915: Mark up racy read of rq->engine
2020-04-23 14:53 ` [Intel-gfx] [PATCH] " Tvrtko Ursulin
@ 2020-04-23 15:11 ` Chris Wilson
0 siblings, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2020-04-23 15:11 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2020-04-23 15:53:44)
>
> On 23/04/2020 12:58, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 22635bbabf06..e9fd20242438 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1660,7 +1660,7 @@ long i915_request_wait(struct i915_request *rq,
> > break;
> > }
> >
> > - intel_engine_flush_submission(rq->engine);
> > + intel_engine_flush_submission(READ_ONCE(rq->engine));
> >
> > if (signal_pending_state(state, current)) {
> > timeout = -ERESTARTSYS;
> >
>
> What with the mutex_acquire/release in this case? No practical effect
> but they are also dereferencing rq->engine... Take a copy of engine for
> lockdep at start and another read for engine flushing in the loop?
No practical difference [today], since the lockmap is on the gt.
Well that will be interesting in the future. Hmm, we could replace it
with a static [global] lockmap and then link all gt->reset.mutex to.
The only danger then is that we link a wait on one gt with a wait on
another -- except that if we allow it, that will happen naturally.
So I don't see a loss in generality in using a global lockmap.
Make a note for the future.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-04-23 15:11 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-23 11:58 [Intel-gfx] [PATCH] drm/i915: Mark up racy read of rq->engine Chris Wilson
2020-04-23 13:25 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
2020-04-23 14:53 ` [Intel-gfx] [PATCH] " Tvrtko Ursulin
2020-04-23 15:11 ` Chris Wilson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.