From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Vetter Subject: Re: [PATCH] drm/i915: don't return -ENXIO from gmbus xfer Date: Sun, 20 May 2012 20:16:46 +0200 Message-ID: <20120520181553.GE5145@phenom.ffwll.local> References: <1336974981-5176-1-git-send-email-daniel.vetter@ffwll.ch> <1337458212-15791-1-git-send-email-daniel.vetter@ffwll.ch> <20120520151916.GB5145@phenom.ffwll.local> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mail-wg0-f43.google.com (mail-wg0-f43.google.com [74.125.82.43]) by gabe.freedesktop.org (Postfix) with ESMTP id 58E409E74F for ; Sun, 20 May 2012 11:15:33 -0700 (PDT) Received: by wgbdr1 with SMTP id dr1so3422163wgb.12 for ; Sun, 20 May 2012 11:15:32 -0700 (PDT) Content-Disposition: inline In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Daniel Kurtz Cc: Daniel Vetter , Intel Graphics Development List-Id: intel-gfx@lists.freedesktop.org On Sun, May 20, 2012 at 11:07:46AM -0700, Daniel Kurtz wrote: > On Sun, May 20, 2012 at 8:19 AM, Daniel Vetter wrote: > > > > On Sat, May 19, 2012 at 10:10:12PM +0200, Daniel Vetter wrote: > > > ... too much risk for flaky edid transfers. > > > > > > This regression has been introduced in > > > > > > commit e646d5773572bf52017983d758bdf05777dc5600 > > > Author: Daniel Kurtz > > > Date: =A0 Fri Mar 30 19:46:38 2012 +0800 > > > > > > =A0 =A0 drm/i915/intel_i2c: always wait for IDLE before clearing NAK > > > > > > This patch keeps the improved NAK handling on the hw side, but reverts > > > the change to return -ENXIO in case the gmbus controller reports a > > > NAK. > > > > > > Cc: Daniel Kurtz > > > > Hi Daniel, > > > > Can you please take a look at this one and smash your r-b onto it if you > > agree? > > > > Thanks, Daniel > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=3D49518 > > > Reported-and-Tested-by: Julian Simioni > > > Signed-Off-by: Daniel Vetter > > > --- > > > =A0drivers/gpu/drm/i915/intel_i2c.c | =A0 =A07 ++++--- > > > =A01 file changed, 4 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/intel_i2c.c b/drivers/gpu/drm/i915/= intel_i2c.c > > > index e04255e..0588d8e 100644 > > > --- a/drivers/gpu/drm/i915/intel_i2c.c > > > +++ b/drivers/gpu/drm/i915/intel_i2c.c > > > @@ -418,10 +418,11 @@ clear_err: > > > =A0 =A0 =A0 =A0* If no ACK is received during the address phase of a = transaction, > > > =A0 =A0 =A0 =A0* the adapter must report -ENXIO. > > > =A0 =A0 =A0 =A0* It is not clear what to return if no ACK is received= at other times. > > > - =A0 =A0 =A0* So, we always return -ENXIO in all NAK cases, to ensur= e we send > > > - =A0 =A0 =A0* it at least during the one case that is specified. > > > + =A0 =A0 =A0* > > > + =A0 =A0 =A0* Unfortunately we can't afford false positives in retur= ning -ENXIO, > > > + =A0 =A0 =A0* hence never return -ENXIO. > > > =A0 =A0 =A0 =A0*/ > > > - =A0 =A0 ret =3D -ENXIO; > > > + =A0 =A0 ret =3D i; > = > = > The bugzilla report shows that in the old case, the gmbus failure was: > [ 2.528812] vgaarb: device changed decodes: > PCI:0000:00:02.0,olddecodes=3Dio+mem,decodes=3Dio+mem:owns=3Dio+mem > [ 2.540753] [drm] GMBUS [i915 gmbus panel] timed out waiting for idle > [ 2.580813] [drm:intel_panel_get_backlight], get backlight PWM =3D 0 > = > And now the failure is: > [ 2.523015] vgaarb: device changed decodes: > PCI:0000:00:02.0,olddecodes=3Dio+mem,decodes=3Dio+mem:owns=3Dio+mem > [ 2.534770] [drm] GMBUS [i915 gmbus panel] timed out after NAK > [ 2.534797] [drm:gmbus_xfer], GMBUS [i915 gmbus panel] NAK for > addr: 0050 w(1) > [ 2.534883] [drm:intel_panel_get_backlight], get backlight PWM =3D 0 > = > From these logs, it looks like in both cases, there is an i2c > communication error between host and panel. The difference is that: > * in the old case, the return value was 0, which triggered a silent > retry (the 40ms gap between GMBUS and PWM messages) > * but, now that the return value is -ENXIO, the caller does not retry. > = > Actually, in the 'new' case, there are two errors happening: a NAK and > then a timeout after the NAK, while waiting for the controller to > clear the ACTIVE bit. I'm not actually sure what causes "timeout > after NAK", but I think it means the controller is waiting for entire > transaction to complete (perhaps the STOP bit after NAK?). > = > Since this reported issue is happening in this double error path, I'd > rather a patch that fixes it without disabling the more generic NAK > path. Maybe something like this: Hm, good suggestion. This way, if we get a NAK but no issues later on, we'll still fail faster if gmbus detected that no device is responding. Which should speed up boot a bit. I'll run this past the bug reporter. Thanks, Daniel > = > clear_err: > /* > * Wait for bus to IDLE before clearing NAK. > * If we clear the NAK while bus is still active, then it will stay > * active and the next transaction may fail. > */ > + ret =3D -ENXIO; > if (wait_for((I915_READ(GMBUS2 + reg_offset) & GMBUS_ACTIVE) =3D=3D 0, > - 10)) > + 10)) { > DRM_DEBUG_KMS("GMBUS [%s] timed out after NAK\n", > adapter->name); > + ret =3D -ETIMEDOUT; // Or 0 ? > + } > = > /* Toggle the Software Clear Interrupt bit. This has the effect > * of resetting the GMBUS controller and so clearing the > * BUS_ERROR raised by the slave's NAK. > */ > I915_WRITE(GMBUS1 + reg_offset, GMBUS_SW_CLR_INT); > I915_WRITE(GMBUS1 + reg_offset, 0); > I915_WRITE(GMBUS0 + reg_offset, 0); > = > DRM_DEBUG_KMS("GMBUS [%s] NAK for addr: %04x %c(%d)\n", > adapter->name, msgs[i].addr, > (msgs[i].flags & I2C_M_RD) ? 'r' : 'w', msgs[i].len); > = > /* > * If no ACK is received during the address phase of a transaction, > * the adapter must report -ENXIO. > * It is not clear what to return if no ACK is received at other times. > - * So, we always return -ENXIO in all NAK cases, to ensure we send > - * it at least during the one case that is specified. > + * So, return -ENXIO for NAK after any byte, unless there was a timeout > + * while waiting for IDLE after NAK. > */ > - ret =3D -ENXIO; > goto out; > = > -Daniel > = > > > > > =A0 =A0 =A0 goto out; > > > > > > =A0timeout: > > > -- > > > 1.7.10 > > > > > > > -- > > Daniel Vetter > > Mail: daniel@ffwll.ch > > Mobile: +41 (0)79 365 57 48 -- = Daniel Vetter Mail: daniel@ffwll.ch Mobile: +41 (0)79 365 57 48