From: Imre Deak <imre.deak@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2 2/2] drm/i915/bxt: work around HW context corruption due to coherency problem
Date: Fri, 18 Sep 2015 15:24:43 +0300
Message-ID: <1442579083.13059.12.camel@intel.com>
In-Reply-To: <20150918090224.GS6442@nuc-i3427.alporthouse.com>

On Fri, 2015-09-18 at 10:02 +0100, Chris Wilson wrote:
> On Thu, Sep 17, 2015 at 07:17:44PM +0300, Imre Deak wrote:
> > The execlist context object is mapped with a CPU/GPU coherent mapping
> > everywhere, but on BXT A stepping due to a HW issue the coherency is not
> > guaranteed. To work around this flush the context object after pinning
> > it (to flush cache lines left by the context initialization/read-back
> > from backing storage) and mark it as uncached so later updates during
> > context switching will be coherent.
> >
> > I noticed this problem via a GPU hang, where IPEHR pointed to an invalid
> > opcode value. I couldn't find this value on the ring but looking at the
> > contents of the active context object it turned out to be a parameter
> > dword of a bigger command there. The original command opcode itself
> > was zeroed out, based on the above I assume due to a CPU writeback of
> > the corresponding cacheline. When restoring the context the GPU would
> > jump over the zeroed out opcode and hang when trying to execute the
> > above parameter dword.
> >
> > I could easily reproduce this by running igt/gem_render_copy_redux and
> > gem_tiled_blits/basic in parallel, but I guess it could be triggered by
> > anything involving frequent switches between two separate contexts. With
> > this workaround I couldn't reproduce the problem.
> >
> > v2:
> > - instead of clflushing after updating the tail and PDP values during
> > context switching, map the corresponding page as uncached to avoid a
> > race between CPU and GPU, both updating the same cacheline at the same
> > time (Ville)
>
> No. Changing PAT involves a stop_machine() and is severely detrimental
> to performance (context creation overhead does impact userspace).

Hm, where on that path is stop_machine()? I haven't found it. I guess
there is some overhead from the TLB flush on each CPU, but I haven't
benchmarked it. If that turns out to be a real issue (at the moment this
is only a w/a for stepping A), we could keep a cache of uncached pages.

> Mapping it as uncached doesn't remove the race anyway.

Please explain; I believe it does.
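
A toy userspace model of the scenario in the commit message may help
frame the question. It is only a sketch, not driver code: the 64-byte
line size, the dword indices, the opcode value and every function name
are assumptions made up for illustration. It models a single hazard, a
CPU writeback of a stale whole cacheline clobbering a dword the GPU wrote
in the meantime, and says nothing about any other race that may exist.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_DWORDS  16            /* assumed 64-byte cacheline */
#define OPCODE_IDX   2             /* hypothetical slot the GPU saves into */
#define TAIL_IDX     5             /* hypothetical slot the CPU updates */
#define MODEL_OPCODE 0xC0DE0001u   /* arbitrary value, not a real opcode */

/* Backing storage of the context page as the GPU sees it. */
static uint32_t memory[LINE_DWORDS];

/* GPU side: the context save writes an opcode straight to memory. */
static void gpu_context_save(uint32_t opcode)
{
	memory[OPCODE_IDX] = opcode;
}

/*
 * Cached CPU path without working coherency: the CPU modifies its stale
 * copy of the line and the writeback replaces the whole line in memory.
 */
static void cpu_update_cached(const uint32_t *stale_line, uint32_t tail)
{
	uint32_t line[LINE_DWORDS];

	memcpy(line, stale_line, sizeof(line));
	line[TAIL_IDX] = tail;
	memcpy(memory, line, sizeof(line));
}

/* Uncached CPU path: only the dword being updated reaches memory. */
static void cpu_update_uncached(uint32_t tail)
{
	memory[TAIL_IDX] = tail;
}

/* Returns the opcode dword left in memory after one CPU/GPU interleaving. */
uint32_t run_scenario(int uncached)
{
	uint32_t stale[LINE_DWORDS];
	const uint32_t tail = 0x40;

	memset(memory, 0, sizeof(memory));
	memcpy(stale, memory, sizeof(stale));  /* CPU caches the zeroed line */
	gpu_context_save(MODEL_OPCODE);        /* GPU writes behind its back */

	if (uncached)
		cpu_update_uncached(tail);
	else
		cpu_update_cached(stale, tail);

	assert(memory[TAIL_IDX] == tail);      /* the tail update always lands */
	return memory[OPCODE_IDX];
}
```

In this model run_scenario(0) returns 0, i.e. the opcode is zeroed while
the tail update survives, which matches the symptom seen in the hang;
run_scenario(1) returns MODEL_OPCODE because the uncached write touches
only the tail dword.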
--Imre