From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1759282AbcAUKJF (ORCPT ); Thu, 21 Jan 2016 05:09:05 -0500
Received: from mail-wm0-f43.google.com ([74.125.82.43]:37231 "EHLO
	mail-wm0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1759215AbcAUKI7 (ORCPT );
	Thu, 21 Jan 2016 05:08:59 -0500
Date: Thu, 21 Jan 2016 11:09:05 +0100
From: Daniel Vetter
To: Michel =?iso-8859-1?Q?D=E4nzer?=
Cc: Mario Kleiner , Vlastimil Babka ,
	Ville =?iso-8859-1?Q?Syrj=E4l=E4?= , LKML ,
	dri-devel@lists.freedesktop.org, mgraesslin@kde.org, kwin@kde.org,
	Alex Deucher , Christian =?iso-8859-1?Q?K=F6nig?=
Subject: Re: linux-4.4 bisected: kwin5 stuck on kde5 loading screen with radeon
Message-ID: <20160121100905.GL19130@phenom.ffwll.local>
Mail-Followup-To: Michel =?iso-8859-1?Q?D=E4nzer?= , Mario Kleiner ,
	Vlastimil Babka , Ville =?iso-8859-1?Q?Syrj=E4l=E4?= , LKML ,
	dri-devel@lists.freedesktop.org, mgraesslin@kde.org, kwin@kde.org,
	Alex Deucher , Christian =?iso-8859-1?Q?K=F6nig?=
References: <20160115122629.GC23290@intel.com> <5699C5E5.90702@gmail.com>
	<569CC357.8030302@suse.cz> <569FEEDE.4060409@gmail.com>
	<56A053CE.7000500@daenzer.net> <56A06D2E.4000008@gmail.com>
	<56A07CF9.5060506@daenzer.net> <56A07D97.6030606@daenzer.net>
	<20160121075849.GH19130@phenom.ffwll.local>
	<56A0989E.30006@daenzer.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <56A0989E.30006@daenzer.net>
X-Operating-System: Linux phenom 4.3.0-1-amd64
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 21, 2016 at 05:36:46PM +0900, Michel Dänzer wrote:
> On 21.01.2016 16:58, Daniel Vetter wrote:
> > On Thu, Jan 21, 2016 at 03:41:27PM +0900, Michel Dänzer wrote:
> >> On 21.01.2016 15:38, Michel Dänzer wrote:
> >>> On 21.01.2016 14:31, Mario Kleiner wrote:
> >>>> On 01/21/2016 04:43 AM, Michel Dänzer wrote:
> >>>>> On 21.01.2016 05:32, Mario Kleiner wrote:
> >>>>>>
> >>>>>> So the problem is that AMDs hardware frame counters reset to
> >>>>>> zero during a modeset. The old DRM code dealt with drivers doing that by
> >>>>>> keeping vblank irqs enabled during modesets and incrementing vblank
> >>>>>> count by one during each vblank irq, i think that's what
> >>>>>> drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.
> >>>>>
> >>>>> Right, looks like there's been a regression breaking this. I suspect the
> >>>>> problem is that vblank->last isn't getting updated from
> >>>>> drm_vblank_post_modeset. Not sure which change broke that though, or how
> >>>>> to fix it. Ville?
> >>>>>
> >>>>
> >>>> The whole logic has changed and the software counter updates are now
> >>>> driven all the time by the hw counter.
> >>>>
> >>>>>
> >>>>> BTW, I'm seeing a similar issue with drm_vblank_on/off as well, which
> >>>>> exposed the bug fixed by 209e4dbc ("drm/vblank: Use u32 consistently for
> >>>>> vblank counters"). I've been meaning to track that down since then; one
> >>>>> of these days hopefully, but if anybody has any ideas offhand...
> >>>>
> >>>> I spent the last few hours reading through the drm and radeon code and i
> >>>> think what should probably work is to replace the
> >>>> drm_vblank_pre/post_modeset calls in radeon/amdgpu by drm_vblank_off/on
> >>>> calls. These are apparently meant for drivers whose hw counters reset
> >>>> during modeset, [...]
> >>>
> >>> ... just like drm_vblank_pre/post_modeset. That those were broken is a
> >>> regression which needs to be fixed anyway. I don't think switching to
> >>> drm_vblank_on/off is suitable for stable trees.
> >>
> >> Even more so since as I mentioned, there is (has been since at least
> >> about half a year ago) a counter jumping bug with drm_vblank_on/off as well.
> >
> > Hm, never noticed you reported that. I thought the reason for not picking
> > up my drm_vblank_on/off patches was that there's a bug in amdgpu userspace
> > where it tried to use vblank waits on a disabled pipe?
>
> http://lists.freedesktop.org/archives/dri-devel/2015-July/086451.html
>
> I don't know why it didn't get picked up.

Yeah, checking my tree your ack is indeed in there. I think I'll resend
them.

> > Can you please point me at the vblank on/off jump bug please?
>
> AFAIR I originally reported it in response to
> http://lists.freedesktop.org/archives/dri-devel/2015-August/087841.html
> , but I can't find that in the archives, so maybe that was just on IRC.
> See
> http://lists.freedesktop.org/archives/dri-devel/2016-January/099122.html
> . Basically, I ran into the bug fixed by your patch because the counter
> jumped forward on every DPMS off, so it hit the 32-bit boundary after
> just a few days.

Ok, so just uncovered the overflow bug.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
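
For readers following the thread, here is a rough sketch of the failure mode discussed above: a software vblank count that is driven by the hardware counter delta jumps forward when the hardware counter resets to zero and the last sampled value is not refreshed. This is a simplified model, not the actual DRM core code. The struct, the function names (vblank_model, vblank_update, vblank_resync) and the 24-bit counter width are assumptions made for illustration only.

```c
#include <stdint.h>

/* Assumed width of the hardware frame counter (24 bits), for illustration. */
#define HW_COUNTER_MASK 0x00ffffffu

/* Hypothetical, simplified stand-in for the per-CRTC vblank state. */
struct vblank_model {
	uint32_t count; /* software vblank count as seen by userspace */
	uint32_t last;  /* hardware counter value at the previous update */
};

/*
 * Advance the software count by the hardware delta since the last sample.
 * The masked subtraction handles a normal wraparound of the hardware
 * counter, but it cannot tell a wraparound from a reset to zero.
 * Returns the delta that was applied.
 */
static uint32_t vblank_update(struct vblank_model *v, uint32_t hw_now)
{
	uint32_t diff = (hw_now - v->last) & HW_COUNTER_MASK;

	v->count += diff;
	v->last = hw_now;
	return diff;
}

/*
 * Re-sample the hardware counter after a reset without changing the
 * software count, so that the next delta starts from the new baseline.
 */
static void vblank_resync(struct vblank_model *v, uint32_t hw_now)
{
	v->last = hw_now;
}
```

In this model, if last is 1010 and the hardware counter restarts from zero, a later sample of 5 gives a delta of (5 - 1010) & 0xffffff = 16776211 rather than 5, so the software count leaps forward by nearly the full counter range. Calling vblank_resync() with the post-reset value first yields the expected delta of 5. Whether the real fix belongs in drm_vblank_post_modeset or in a switch to drm_vblank_on/off is exactly what the thread is debating; the sketch takes no position on that.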