From: Imre Deak <imre.deak@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: make sure power domains don't get disabled during error capture
Date: Thu, 28 Nov 2013 15:53:37 +0200 [thread overview]
Message-ID: <1385646817.16543.10.camel@intelbox> (raw)
In-Reply-To: <20131128133407.GJ27344@phenom.ffwll.local>
[-- Attachment #1.1: Type: text/plain, Size: 3617 bytes --]
On Thu, 2013-11-28 at 14:34 +0100, Daniel Vetter wrote:
> On Thu, Nov 28, 2013 at 12:52:07PM +0200, Imre Deak wrote:
> > On Thu, 2013-11-28 at 11:05 +0100, Daniel Vetter wrote:
> > > On Thu, Nov 28, 2013 at 10:04:42AM +0200, Imre Deak wrote:
> > > > So far we had a race during error capturing where a power domain could
> > > > get disabled after we queried its state to be on. Fix this by protecting
> > > > the power domain state tracking changes with a lock and holding this
> > > > lock during error capturing.
> > > >
> > > > Note that this still allows the case where we are halfway in enabling a
> > > > power domain and still report it as disabled. This isn't a problem as we
> > > > only want to avoid register accesses while the domain is off and that is
> > > > now guaranteed.
> > > >
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > >
> > > > [ Applies on top of "drm/i915: add intel_display_power_enabled_sw() for
> > > > use in atomic ctx" ]
> > >
> > > Tbh I don't see the point for this - the error capture code is inherently
> > > racy, we don't grab any of the other modeset locks. We also don't care and
> > > furthermore we should try to grab as few locks as possible to increase the
> > > chances that we can actually capture the error state successfully. Every
> > > lock you add adds another depency and so more ways to end in tears and
> > > deadlocks.
> > >
> > > So if the racy error capture is the only reason I'll reject this.
> >
> > Yes it is only the race this fixes, but I'd like to argue for its
> > usefulness. It fixes in particular the following two things:
> >
> > 1. avoiding unclaimed register accesses, which was the original goal of
> >
> > commit 9d1cb9147dbe45f6e94dc796518ecf67cb64b359
> > Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Date: Fri Nov 1 13:32:08 2013 -0200
> >
> > drm/i915: avoid unclaimed registers when capturing the error state
>
> Hm, I've thought we won't hit any unclaimed register warnings if there's
> no concurrent/racing other thread doing stuff to the power wells?
We'll hit it if we read a register while its power well is off, but we
think it's on based on SW tracking. That's possible for example if you
get one of the ERROR_INTERRUPTS and call into the error capture code and
this happens while someone is disabling the power well.
> And as explained I don't care that much about racing hangcheck ...
>
> > 2. avoiding an inconsistent power on/off value in the error state wrt.
> > to the captured registers. Without the lock we can have an error state
> > showing the power was on, but register values captured with power off,
> > with some unpredictable values. This consistency check was the whole
> > point of adding the power on/off value to the error state.
> >
> > I understand that adding complexity in general is risky, but in this
> > case the risk is minimal. The only place we take the lock is to adjust a
> > counter and to read HW registers, without any further nested locks, so
> > no chance for lock inversions.
>
> The thing is that we already have tons of other similar races, e.g. we
> could capture hilariously incosistent modeset state where the pfit is
> disabled but pipesrc already set to something that would need upscaling.
> And it's not a problem. So I still think we can just ignore those races,
> even more we /should/ do so.
Ok, I wanted to reduce the uncertainty. But if we don't care about this
and having occasional unclaimed reg access reports are ok then I'm ok to
ignore this patch.
--Imre
[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
prev parent reply other threads:[~2013-11-28 13:53 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-28 8:04 [PATCH] drm/i915: make sure power domains don't get disabled during error capture Imre Deak
2013-11-28 10:05 ` Daniel Vetter
2013-11-28 10:52 ` Imre Deak
2013-11-28 13:34 ` Daniel Vetter
2013-11-28 13:53 ` Imre Deak [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1385646817.16543.10.camel@intelbox \
--to=imre.deak@intel.com \
--cc=daniel@ffwll.ch \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox