From: Greg KH <gregkh@linuxfoundation.org>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org, stable@vger.kernel.org
Subject: Re: [Intel-gfx] [PATCH 2/4] drm/i915/gt: Wait for CSB entries on Tigerlake
Date: Wed, 16 Sep 2020 10:35:01 +0200
Message-ID: <20200916083501.GA675213@kroah.com>
In-Reply-To: <160024481825.2231.4268855132793535750@build.alporthouse.com>

On Wed, Sep 16, 2020 at 09:26:58AM +0100, Chris Wilson wrote:
> Quoting Greg KH (2020-09-16 07:33:58)
> > On Tue, Sep 15, 2020 at 01:41:48PM +0100, Chris Wilson wrote:
> > > On Tigerlake, we are seeing a repeat of commit d8f505311717 ("drm/i915/icl:
> > > Forcibly evict stale csb entries") where, presumably, due to a missing
> > > Global Observation Point synchronisation, the write pointer of the CSB
> > > ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
> > > we see the GPU report more context-switch entries for us to parse, but
> > > those entries have not been written, leading us to process stale events,
> > > and eventually report a hung GPU.
> > >
> > > However, this effect appears to be much more severe than we previously
> > > saw on Icelake (though it might be best if we try the same approach
> > > there as well and measure), and Bruce suggested the good idea of resetting
> > > the CSB entry after use so that we can detect when it has been updated by
> > > the GPU. By instrumenting how long that may be, we can set a reliable
> > > upper bound for how long we should wait for:
> > >
> > > 513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
> > >
> > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
> > > References: d8f505311717 ("drm/i915/icl: Forcibly evict stale csb entries")
> >
> > What does "References:" mean? Should that be "Fixes:"?
>
> It's a reference to an earlier w/a for a previous generation for the
> same symptoms. This patch should supplement that w/a.

I see no such "reference" to that tag in
Documentation/process/submitting-patches.rst, so how were we supposed to
know this? :)

thanks,

greg k-h

Thread overview: 18+ messages
2020-09-15 12:41 [Intel-gfx] [PATCH 1/4] drm/i915/gt: Widen CSB pointer to u64 for the parsers Chris Wilson
2020-09-15 12:41 ` [Intel-gfx] [PATCH 2/4] drm/i915/gt: Wait for CSB entries on Tigerlake Chris Wilson
2020-09-15 12:41 ` Chris Wilson
2020-09-15 13:19 ` [Intel-gfx] " Mika Kuoppala
2020-09-16 6:33 ` Greg KH
2020-09-16 6:33 ` Greg KH
2020-09-16 8:26 ` [Intel-gfx] " Chris Wilson
2020-09-16 8:26 ` Chris Wilson
2020-09-16 8:35 ` Greg KH [this message]
2020-09-16 8:35 ` Greg KH
2020-09-15 12:41 ` [Intel-gfx] [PATCH 3/4] drm/i915/gt: Apply the CSB w/a for all Chris Wilson
2020-09-15 13:29 ` Mika Kuoppala
2020-09-15 12:41 ` [Intel-gfx] [PATCH 4/4] drm/i915/gt: Use a mmio read of the CSB in case of failure Chris Wilson
2020-09-15 13:39 ` Mika Kuoppala
2020-09-15 13:56 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] drm/i915/gt: Widen CSB pointer to u64 for the parsers Patchwork
2020-09-15 13:57 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-09-15 14:21 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-09-15 17:26 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork