All of lore.kernel.org
 help / color / mirror / Atom feed
From: Imre Deak <imre.deak@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915: make sure power domains don't get disabled during error capture
Date: Thu, 28 Nov 2013 15:53:37 +0200	[thread overview]
Message-ID: <1385646817.16543.10.camel@intelbox> (raw)
In-Reply-To: <20131128133407.GJ27344@phenom.ffwll.local>


[-- Attachment #1.1: Type: text/plain, Size: 3617 bytes --]

On Thu, 2013-11-28 at 14:34 +0100, Daniel Vetter wrote:
> On Thu, Nov 28, 2013 at 12:52:07PM +0200, Imre Deak wrote:
> > On Thu, 2013-11-28 at 11:05 +0100, Daniel Vetter wrote:
> > > On Thu, Nov 28, 2013 at 10:04:42AM +0200, Imre Deak wrote:
> > > > So far we had a race during error capturing where a power domain could
> > > > get disabled after we queried its state to be on. Fix this by protecting
> > > > the power domain state tracking changes with a lock and holding this
> > > > lock during error capturing.
> > > > 
> > > > Note that this still allows the case where we are halfway in enabling a
> > > > power domain and still report it as disabled. This isn't a problem as we
> > > > only want to avoid register accesses while the domain is off and that is
> > > > now guaranteed.
> > > > 
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > 
> > > > [ Applies on top of "drm/i915: add intel_display_power_enabled_sw() for
> > > > use in atomic ctx" ]
> > > 
> > > Tbh I don't see the point for this - the error capture code is inherently
> > > racy, we don't grab any of the other modeset locks. We also don't care and
> > > furthermore we should try to grab as few locks as possible to increase the
> > > chances that we can actually capture the error state successfully. Every
> > > lock you add adds another depency and so more ways to end in tears and
> > > deadlocks.
> > >
> > > So if the racy error capture is the only reason I'll reject this.
> > 
> > Yes it is only the race this fixes, but I'd like to argue for its
> > usefulness. It fixes in particular the following two things:
> > 
> > 1. avoiding unclaimed register accesses, which was the original goal of 
> >     
> > commit 9d1cb9147dbe45f6e94dc796518ecf67cb64b359
> > Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Date:   Fri Nov 1 13:32:08 2013 -0200
> > 
> > drm/i915: avoid unclaimed registers when capturing the error state
> 
> Hm, I've thought we won't hit any unclaimed register warnings if there's
> no concurrent/racing other thread doing stuff to the power wells?

We'll hit it if we read a register while its power well is off, but we
think it's on based on SW tracking. That's possible for example if you
get one of the ERROR_INTERRUPTS and call into the error capture code and
this happens while someone is disabling the power well.

> And as explained I don't care that much about racing hangcheck ...
>
> > 2. avoiding an inconsistent power on/off value in the error state wrt.
> > to the captured registers. Without the lock we can have an error state
> > showing the power was on, but register values captured with power off,
> > with some unpredictable values. This consistency check was the whole
> > point of adding the power on/off value to the error state.
> > 
> > I understand that adding complexity in general is risky, but in this
> > case the risk is minimal. The only place we take the lock is to adjust a
> > counter and to read HW registers, without any further nested locks, so
> > no chance for lock inversions.
> 
> The thing is that we already have tons of other similar races, e.g. we
> could capture hilariously incosistent modeset state where the pfit is
> disabled but pipesrc already set to something that would need upscaling.
> And it's not a problem. So I still think we can just ignore those races,
> even more we /should/ do so.

Ok, I wanted to reduce the uncertainty. But if we don't care about this
and having occasional unclaimed reg access reports are ok then I'm ok to
ignore this patch.

--Imre


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

      reply	other threads:[~2013-11-28 13:53 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-28  8:04 [PATCH] drm/i915: make sure power domains don't get disabled during error capture Imre Deak
2013-11-28 10:05 ` Daniel Vetter
2013-11-28 10:52   ` Imre Deak
2013-11-28 13:34     ` Daniel Vetter
2013-11-28 13:53       ` Imre Deak [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1385646817.16543.10.camel@intelbox \
    --to=imre.deak@intel.com \
    --cc=daniel@ffwll.ch \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.