From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Vetter
Subject: Re: [PATCH 2/2] [v2] drm/i915: Disable GGTT PTEs on GEN6+ suspend
Date: Fri, 18 Oct 2013 15:45:43 +0200
Message-ID: <20131018134543.GY4830@phenom.ffwll.local>
References: <1381940302-1920-2-git-send-email-benjamin.widawsky@intel.com>
 <1381940490-2082-1-git-send-email-benjamin.widawsky@intel.com>
 <20131016165831.GB32493@nuc-i3427.alporthouse.com>
 <20131016170627.GA2947@bwidawsk.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54])
 by gabe.freedesktop.org (Postfix) with ESMTP id 58CB1E612D
 for ; Fri, 18 Oct 2013 06:45:23 -0700 (PDT)
Received: by mail-ee0-f54.google.com with SMTP id e53so2018597eek.41
 for ; Fri, 18 Oct 2013 06:45:22 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <20131016170627.GA2947@bwidawsk.net>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Ben Widawsky
Cc: Intel GFX , Ben Widawsky
List-Id: intel-gfx@lists.freedesktop.org

On Wed, Oct 16, 2013 at 10:06:27AM -0700, Ben Widawsky wrote:
> On Wed, Oct 16, 2013 at 05:58:31PM +0100, Chris Wilson wrote:
> > On Wed, Oct 16, 2013 at 09:21:30AM -0700, Ben Widawsky wrote:
> > > Once the machine gets to a certain point in the suspend process, we
> > > expect the GPU to be idle. If it is not, we might corrupt memory.
> > > Empirically (with an early version of this patch) we have seen this is
> > > not the case. We cannot currently explain why the latent GPU writes
> > > occur.
> > >
> > > In the technical sense, this patch is a workaround in that we have an
> > > issue we can't explain, and the patch indirectly solves the issue.
> > > However, it's really better than a workaround because we understand why
> > > it works, and it really should be a safe thing to do in all cases.
> > >
> > > The noticeable effect other than the debug messages would be an increase
> > > in the suspend time. I have not measure how expensive it actually is.
> > >
> > > I think it would be good to spend further time to root cause why we're
> > > seeing these latent writes, but it shouldn't preclude preventing the
> > > fallout.
> > >
> > > NOTE: It should be safe (and makes some sense IMO) to also keep the
> > > VALID bit unset on resume when we clear_range(). I've opted not to do
> > > this as properly clearing those bits at some later point would be extra
> > > work.
> > >
> > > v2: Fix bugzilla link
> >
> > And the other one?
> >
>
> I'm really amazing. If we move ahead with this patch, Daniel, can you just erase
> the extra bugs.freedesktop.org/6549://
>
> > > Bugzilla: http://bugs.freedesktop.org/6549://bugs.freedesktop.org/show_bug.cgi?id=65496
>
> Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65496

Fixed and merged with cc: stable.
-Daniel

>
> > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59321
> > > Tested-by: Takashi Iwai
> > > Tested-by: Paulo Zanoni
> > > Signed-off-by: Ben Widawsky
> >
> > So clearing the valid bit should result in the GPU reporting errors for
> > delayed accesses, but none were reported?
> > -Chris
> >
>
> So I can't actually reproduce the problem for some reason. Paulo will
> need to answer. One theory is the fault information is lost on suspend.
>
> The original patch put faults both in suspend, and resume. After this, I
> asked Paulo to wedge the GPU, and there I saw faults.
>
> --
> Ben Widawsky, Intel Open Source Technology Center
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch