public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Imre Deak <imre.deak@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: make sure power domains don't get disabled during error capture
Date: Thu, 28 Nov 2013 15:53:37 +0200	[thread overview]
Message-ID: <1385646817.16543.10.camel@intelbox> (raw)
In-Reply-To: <20131128133407.GJ27344@phenom.ffwll.local>


[-- Attachment #1.1: Type: text/plain, Size: 3617 bytes --]

On Thu, 2013-11-28 at 14:34 +0100, Daniel Vetter wrote:
> On Thu, Nov 28, 2013 at 12:52:07PM +0200, Imre Deak wrote:
> > On Thu, 2013-11-28 at 11:05 +0100, Daniel Vetter wrote:
> > > On Thu, Nov 28, 2013 at 10:04:42AM +0200, Imre Deak wrote:
> > > > So far we had a race during error capturing where a power domain could
> > > > get disabled after we queried its state to be on. Fix this by protecting
> > > > the power domain state tracking changes with a lock and holding this
> > > > lock during error capturing.
> > > > 
> > > > Note that this still allows the case where we are halfway in enabling a
> > > > power domain and still report it as disabled. This isn't a problem as we
> > > > only want to avoid register accesses while the domain is off and that is
> > > > now guaranteed.
> > > > 
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > 
> > > > [ Applies on top of "drm/i915: add intel_display_power_enabled_sw() for
> > > > use in atomic ctx" ]
> > > 
> > > Tbh I don't see the point for this - the error capture code is inherently
> > > racy, we don't grab any of the other modeset locks. We also don't care and
> > > furthermore we should try to grab as few locks as possible to increase the
> > > chances that we can actually capture the error state successfully. Every
> > > lock you add adds another depency and so more ways to end in tears and
> > > deadlocks.
> > >
> > > So if the racy error capture is the only reason I'll reject this.
> > 
> > Yes it is only the race this fixes, but I'd like to argue for its
> > usefulness. It fixes in particular the following two things:
> > 
> > 1. avoiding unclaimed register accesses, which was the original goal of 
> >     
> > commit 9d1cb9147dbe45f6e94dc796518ecf67cb64b359
> > Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Date:   Fri Nov 1 13:32:08 2013 -0200
> > 
> > drm/i915: avoid unclaimed registers when capturing the error state
> 
> Hm, I've thought we won't hit any unclaimed register warnings if there's
> no concurrent/racing other thread doing stuff to the power wells?

We'll hit it if we read a register while its power well is off, but we
think it's on based on SW tracking. That's possible for example if you
get one of the ERROR_INTERRUPTS and call into the error capture code and
this happens while someone is disabling the power well.

> And as explained I don't care that much about racing hangcheck ...
>
> > 2. avoiding an inconsistent power on/off value in the error state wrt.
> > to the captured registers. Without the lock we can have an error state
> > showing the power was on, but register values captured with power off,
> > with some unpredictable values. This consistency check was the whole
> > point of adding the power on/off value to the error state.
> > 
> > I understand that adding complexity in general is risky, but in this
> > case the risk is minimal. The only place we take the lock is to adjust a
> > counter and to read HW registers, without any further nested locks, so
> > no chance for lock inversions.
> 
> The thing is that we already have tons of other similar races, e.g. we
> could capture hilariously incosistent modeset state where the pfit is
> disabled but pipesrc already set to something that would need upscaling.
> And it's not a problem. So I still think we can just ignore those races,
> even more we /should/ do so.

Ok, I wanted to reduce the uncertainty. But if we don't care about this
and having occasional unclaimed reg access reports are ok then I'm ok to
ignore this patch.

--Imre


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

      reply	other threads:[~2013-11-28 13:53 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-28  8:04 [PATCH] drm/i915: make sure power domains don't get disabled during error capture Imre Deak
2013-11-28 10:05 ` Daniel Vetter
2013-11-28 10:52   ` Imre Deak
2013-11-28 13:34     ` Daniel Vetter
2013-11-28 13:53       ` Imre Deak [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1385646817.16543.10.camel@intelbox \
    --to=imre.deak@intel.com \
    --cc=daniel@ffwll.ch \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox