From mboxrd@z Thu Jan 1 00:00:00 1970 From: Imre Deak Subject: Re: [PATCH] drm/i915: make sure power domains don't get disabled during error capture Date: Thu, 28 Nov 2013 15:53:37 +0200 Message-ID: <1385646817.16543.10.camel@intelbox> References: <1385625882-14169-1-git-send-email-imre.deak@intel.com> <20131128100549.GG27344@phenom.ffwll.local> <1385635927.4181.32.camel@ideak-mobl> <20131128133407.GJ27344@phenom.ffwll.local> Reply-To: imre.deak@intel.com Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1356972403==" Return-path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTP id C8D51F9D00 for ; Thu, 28 Nov 2013 05:53:40 -0800 (PST) In-Reply-To: <20131128133407.GJ27344@phenom.ffwll.local> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org To: Daniel Vetter Cc: intel-gfx@lists.freedesktop.org List-Id: intel-gfx@lists.freedesktop.org --===============1356972403== Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="=-Y9Z7K1uYYSi4qIzilt2J" --=-Y9Z7K1uYYSi4qIzilt2J Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, 2013-11-28 at 14:34 +0100, Daniel Vetter wrote: > On Thu, Nov 28, 2013 at 12:52:07PM +0200, Imre Deak wrote: > > On Thu, 2013-11-28 at 11:05 +0100, Daniel Vetter wrote: > > > On Thu, Nov 28, 2013 at 10:04:42AM +0200, Imre Deak wrote: > > > > So far we had a race during error capturing where a power domain co= uld > > > > get disabled after we queried its state to be on. Fix this by prote= cting > > > > the power domain state tracking changes with a lock and holding thi= s > > > > lock during error capturing. > > > >=20 > > > > Note that this still allows the case where we are halfway in enabli= ng a > > > > power domain and still report it as disabled. This isn't a problem = as we > > > > only want to avoid register accesses while the domain is off and th= at is > > > > now guaranteed. > > > >=20 > > > > Signed-off-by: Imre Deak > > > >=20 > > > > [ Applies on top of "drm/i915: add intel_display_power_enabled_sw()= for > > > > use in atomic ctx" ] > > >=20 > > > Tbh I don't see the point for this - the error capture code is inhere= ntly > > > racy, we don't grab any of the other modeset locks. We also don't car= e and > > > furthermore we should try to grab as few locks as possible to increas= e the > > > chances that we can actually capture the error state successfully. Ev= ery > > > lock you add adds another depency and so more ways to end in tears an= d > > > deadlocks. > > > > > > So if the racy error capture is the only reason I'll reject this. > >=20 > > Yes it is only the race this fixes, but I'd like to argue for its > > usefulness. It fixes in particular the following two things: > >=20 > > 1. avoiding unclaimed register accesses, which was the original goal of= =20 > > =20 > > commit 9d1cb9147dbe45f6e94dc796518ecf67cb64b359 > > Author: Paulo Zanoni > > Date: Fri Nov 1 13:32:08 2013 -0200 > >=20 > > drm/i915: avoid unclaimed registers when capturing the error state >=20 > Hm, I've thought we won't hit any unclaimed register warnings if there's > no concurrent/racing other thread doing stuff to the power wells? We'll hit it if we read a register while its power well is off, but we think it's on based on SW tracking. That's possible for example if you get one of the ERROR_INTERRUPTS and call into the error capture code and this happens while someone is disabling the power well. > And as explained I don't care that much about racing hangcheck ... > > > 2. avoiding an inconsistent power on/off value in the error state wrt. > > to the captured registers. Without the lock we can have an error state > > showing the power was on, but register values captured with power off, > > with some unpredictable values. This consistency check was the whole > > point of adding the power on/off value to the error state. > >=20 > > I understand that adding complexity in general is risky, but in this > > case the risk is minimal. The only place we take the lock is to adjust = a > > counter and to read HW registers, without any further nested locks, so > > no chance for lock inversions. >=20 > The thing is that we already have tons of other similar races, e.g. we > could capture hilariously incosistent modeset state where the pfit is > disabled but pipesrc already set to something that would need upscaling. > And it's not a problem. So I still think we can just ignore those races, > even more we /should/ do so. Ok, I wanted to reduce the uncertainty. But if we don't care about this and having occasional unclaimed reg access reports are ok then I'm ok to ignore this patch. --Imre --=-Y9Z7K1uYYSi4qIzilt2J Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJSl0rhAAoJEORIIAnNuWDF11sIAKlObYe5bkIOx0j1MsjGJU5K ETZlA/DzLkuWX7YuWlF/8czidjDlCUhKg6jjjXmf9DaxO/4dTMHOGvnMT4FAgxKv SlHP0MG0soA0/JpSwdl/7rf/kZuMYm3W8wKnI1W0TR0wT+QhuIMtsKLaDjpMFLtb TkywTBEGf0/vbBCy2FPo65H6srtN1iLvLpyXO874F8T/whRv3bt1y3uQBxPh8Zzc 3t+sm6SJSfQRk0rJOeUVSzq4FX9EtJXXUhYjKsmXAwxsrxUd2Y3RdZM+rM2tkNX2 BVU3U1Keei/slrSDwRM7SHNmWRXPh4VLbUeSfesGZ0ZeJuwgpxkxc72VdAU95MI= =Owon -----END PGP SIGNATURE----- --=-Y9Z7K1uYYSi4qIzilt2J-- --===============1356972403== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx --===============1356972403==--