From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mout.perfora.net ([74.208.4.197]:49352 "EHLO
	mout.perfora.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752117AbdBITm2 (ORCPT );
	Thu, 9 Feb 2017 14:42:28 -0500
Date: Thu, 9 Feb 2017 14:09:41 -0500
From: Jim Rees
To: Daniel Vetter
Cc: Jani Nikula, Maarten Lankhorst, Chris Wilson, Daniel Vetter,
	DRI Development, Intel Graphics Development,
	stable@vger.kernel.org, Daniel Vetter
Subject: Re: [PATCH 1/2] drm: refernce count event->completion
Message-ID: <20170209190941.GA11649@rees.org>
References: <20161221102331.31033-1-daniel.vetter@ffwll.ch>
	<20161221103641.GA735@nuc-i3427.alporthouse.com>
	<5d8242c1-f3e5-4042-a222-c36119089e68@linux.intel.com>
	<20170104100510.fhgu237qfhiv47gr@phenom.ffwll.local>
	<87mvdvsa1q.fsf@intel.com>
	<20170209172029.fkmco4meu5q2ydfw@phenom.ffwll.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170209172029.fkmco4meu5q2ydfw@phenom.ffwll.local>
Sender: stable-owner@vger.kernel.org
List-ID:

Daniel Vetter wrote:

> Latest report just says that the revert isn't helping either. I suspect
> the report is a giantic conflagration of everything ever that kills
> various reporters boxes. I still believe that the patch here fixes the
> original bug, but there might be a lot more hiding. It's at least seen
> quite a pile of testing, so I think it's sounds, and we could
> cherry-pick it to dinf with cc: stable for 4.9+. Worst case it's not
> going to help for the other problems.

No, that's not what the latest report says. It says, "running for 2 weeks
... This is certainly way, way better than the current stock experience,
which results in my T460s entirely locking up daily." and "Less than a day
after I made that comment I got a hard lockup".

So reverting the buggy helper nonblock tracking commit took this reporter
from locking up daily to locking up once in two weeks.
For everyone else, reverting the buggy commit fixes all bugs. Also note
that this most recent lockup appears to be a different bug ("GPU HANG:
ecode").

So we have a commit that is causing hard lockups and flip_done timeouts
for multiple users. Reverting this commit fixes the problem. But we did
not push the revert up for 4.9, and it looks like we're not going to push
it up for 4.10 either.