From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754953AbcATUch (ORCPT ); Wed, 20 Jan 2016 15:32:37 -0500
Received: from mail-wm0-f49.google.com ([74.125.82.49]:34059 "EHLO mail-wm0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751112AbcATUce (ORCPT ); Wed, 20 Jan 2016 15:32:34 -0500
Subject: Re: linux-4.4 bisected: kwin5 stuck on kde5 loading screen with radeon
To: Vlastimil Babka , =?UTF-8?B?VmlsbGUgU3lyasOkbMOk?=
References: <5698CB20.9050602@suse.cz> <20160115122629.GC23290@intel.com> <5699C5E5.90702@gmail.com> <569CC357.8030302@suse.cz>
Cc: Alex Deucher , =?UTF-8?Q?Christian_K=c3=b6nig?= , Daniel Vetter , mgraesslin@kde.org, David Airlie , dri-devel@lists.freedesktop.org, LKML , kwin@kde.org
From: Mario Kleiner
Message-ID: <569FEEDE.4060409@gmail.com>
Date: Wed, 20 Jan 2016 21:32:30 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: <569CC357.8030302@suse.cz>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/18/2016 11:49 AM, Vlastimil Babka wrote:
> On 01/16/2016 05:24 AM, Mario Kleiner wrote:
>>
>>
>> On 01/15/2016 01:26 PM, Ville Syrjälä wrote:
>>> On Fri, Jan 15, 2016 at 11:34:08AM +0100, Vlastimil Babka wrote:
>>
>> I'm currently running...
>>
>> while xinit /usr/bin/ksplashqml --test -- :1 ; do echo yay; done
>>
>> ... in an endless loop on Linux 4.4 SMP PREEMPT on HD-5770 and so far i
>> can't trigger a hang after hundreds of runs.
>>
>> Does this also hang for you?
>
> No, test mode seems to be fine.
>
>> I think a drm.debug=0x21 setting and grep'ping the syslog for "vblank"
>> should probably give useful info around the time of the hang.
>
> Attached.
> Captured by having kdm running, switching to console, running
> "dmesg -C ; dmesg -w > /tmp/dmesg", switch to kdm, enter password, see
> frozen splashscreen, switch back, terminate dmesg. So somewhere around
> the middle there should be where ksplashscreen starts...
>
>> Maybe also check XOrg.0.log for (WW) warnings related to flip.
>
> No such warnings there.
>
>> thanks,
>> -mario
>>
>>
>>>> Thanks,
>>>> Vlastimil
>>>
>

Thanks.

So the problem is that AMD's hardware frame counters reset to zero
during a modeset. The old DRM code dealt with drivers doing that by
keeping vblank irqs enabled during modesets and incrementing the vblank
count by one on each vblank irq. I think that's what
drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.

The new code in drm_update_vblank_count() breaks this. The reset of the
counter to zero is treated as a counter wraparound, so our software
vblank counter jumps forward by up to 2^24 counts in response (in the
case of AMD's 24-bit hw counters). The vblank event handling code in
drm_handle_vblank_events() and other places then detects the counter
being more than 2^23 counts ahead of queued vblank events and, as part
of its own wraparound handling for the 32-bit software counter, doesn't
deliver these queued events for a long time -> no vblank swap trigger
event -> no swap -> client hangs waiting for swap completion.

I think I remember seeing the ksplash progress screen occasionally
blanking halfway through login. I guess that's when kwin triggers a
modeset in parallel to ksplash doing its OpenGL animations. So depending
on the hw vblank count at the time of login, ksplash would or wouldn't
hang. Apparently I got "lucky" with my counts at login.

-mario
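
P.S. The counter arithmetic described above can be modelled in a few
lines of standalone C. This is only a rough sketch under stated
assumptions, not the actual DRM code: update_sw_count() and
event_is_due() are hypothetical helpers that stand in for the
wraparound handling in drm_update_vblank_count() and the 2^23 window
check in drm_handle_vblank_events().

```c
#include <stdint.h>

/* Assumption: 24-bit hardware frame counter, as on the AMD parts discussed. */
#define HW_COUNTER_BITS 24u
#define HW_COUNTER_MASK ((1u << HW_COUNTER_BITS) - 1u)

/*
 * Hypothetical model of the software counter update: the difference
 * between two hardware samples is taken modulo 2^24, so a reset of the
 * hardware counter to zero looks like a wraparound.
 */
static uint32_t update_sw_count(uint32_t sw_count, uint32_t last_hw,
                                uint32_t cur_hw)
{
    uint32_t diff = (cur_hw - last_hw) & HW_COUNTER_MASK;

    return sw_count + diff;
}

/*
 * Hypothetical model of the event check: a queued event counts as due
 * only if the software counter is at most 2^23 counts past its target.
 * Anything further ahead is taken as "target not reached yet" under
 * 32-bit wraparound, so the event is skipped.
 */
static int event_is_due(uint32_t sw_count, uint32_t target)
{
    return (sw_count - target) <= (1u << 23);
}
```

With the hw counter at 1000 when the modeset resets it, the software
count jumps from 1000 to 16777216. That is 16776215 counts past an
event queued for count 1001, more than 2^23 = 8388608, so the event is
not delivered. With the hw counter at 10000000 at the reset, the jump
is only 6777216 counts, the event still falls inside the window and is
delivered, which would be the "lucky" case.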