From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 2/4] drm/i915/gt: Protect defer_request() from new waiters
Date: Fri, 07 Feb 2020 11:25:28 +0200
Message-ID: <874kw2agvr.fsf@gaia.fi.intel.com>
In-Reply-To: <20200206204915.2636606-2-chris@chris-wilson.co.uk>
Chris Wilson <chris@chris-wilson.co.uk> writes:
> Mika spotted
>
> <4>[17436.705441] general protection fault: 0000 [#1] PREEMPT SMP PTI
> <4>[17436.705447] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.5.0+ #1
> <4>[17436.705449] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3805 05/16/2018
> <4>[17436.705512] RIP: 0010:__execlists_submission_tasklet+0xc4d/0x16e0 [i915]
> <4>[17436.705516] Code: c5 4c 8d 60 e0 75 17 e9 8c 07 00 00 49 8b 44 24 20 49 39 c5 4c 8d 60 e0 0f 84 7a 07 00 00 49 8b 5c 24 08 49 8b 87 80 00 00 00 <48> 39 83 d8 fe ff ff 75 d9 48 8b 83 88 fe ff ff a8 01 0f 84 b6 05
> <4>[17436.705518] RSP: 0018:ffffc9000012ce80 EFLAGS: 00010083
> <4>[17436.705521] RAX: ffff88822ae42000 RBX: 5a5a5a5a5a5a5a5a RCX: dead000000000122
> <4>[17436.705523] RDX: ffff88822ae42588 RSI: ffff8881e32a7908 RDI: ffff8881c429fd48
> <4>[17436.705525] RBP: ffffc9000012cf00 R08: ffff88822ae42588 R09: 00000000fffffffe
> <4>[17436.705527] R10: ffff8881c429fb80 R11: 00000000a677cf08 R12: ffff8881c42a0aa8
> <4>[17436.705529] R13: ffff8881c429fd38 R14: ffff88822ae42588 R15: ffff8881c429fb80
> <4>[17436.705532] FS: 0000000000000000(0000) GS:ffff88822ed00000(0000) knlGS:0000000000000000
> <4>[17436.705534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4>[17436.705536] CR2: 00007f858c76d000 CR3: 0000000005610003 CR4: 00000000003606e0
> <4>[17436.705538] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> <4>[17436.705540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> <4>[17436.705542] Call Trace:
> <4>[17436.705545] <IRQ>
> <4>[17436.705603] execlists_submission_tasklet+0xc0/0x130 [i915]
>
> which is us consuming a partially initialised new waiter in
> defer_request(). We can prevent this by initialising the i915_dependency
> prior to making it visible and, since we are using a concurrent
> list_add/iterator, marking them up to the compiler.
I tried to find the culprit myself but could not tell whether it was
the request or the waiter that was wrong. So here is a short
summary of the discussion on IRC:
RBX: 5a5a5a... is POISON_INUSE.
Requests won't get poisoned, as they are reused and protected
by RCU. Thus it points to the waiter, and the evidence and code match, so,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>
> Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_lrc.c | 7 ++++++-
> drivers/gpu/drm/i915/i915_scheduler.c | 5 +++--
> 2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index c196fb90c59f..b350e01d86d2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1615,6 +1615,11 @@ last_active(const struct intel_engine_execlists *execlists)
> return *last;
> }
>
> +#define for_each_waiter(p__, rq__) \
> + list_for_each_entry_lockless(p__, \
> + &(rq__)->sched.waiters_list, \
> + wait_link)
> +
> static void defer_request(struct i915_request *rq, struct list_head * const pl)
> {
> LIST_HEAD(list);
> @@ -1632,7 +1637,7 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl)
> GEM_BUG_ON(i915_request_is_active(rq));
> list_move_tail(&rq->sched.link, pl);
>
> - list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
> + for_each_waiter(p, rq) {
> struct i915_request *w =
> container_of(p->waiter, typeof(*w), sched);
>
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 5d96cfba40f8..9cbd31443eb0 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -423,8 +423,6 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>
> if (!node_signaled(signal)) {
> INIT_LIST_HEAD(&dep->dfs_link);
> - list_add(&dep->wait_link, &signal->waiters_list);
> - list_add(&dep->signal_link, &node->signalers_list);
> dep->signaler = signal;
> dep->waiter = node;
> dep->flags = flags;
> @@ -434,6 +432,9 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> !node_started(signal))
> node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
>
> + list_add(&dep->signal_link, &node->signalers_list);
> + list_add_rcu(&dep->wait_link, &signal->waiters_list);
> +
> /*
> * As we do not allow WAIT to preempt inflight requests,
> * once we have executed a request, along with triggering
> --
> 2.25.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx