Subject: Re: linux-4.4 bisected: kwin5 stuck on kde5 loading screen with radeon
To: Michel Dänzer, Vlastimil Babka, Ville Syrjälä
References: <5698CB20.9050602@suse.cz> <20160115122629.GC23290@intel.com>
 <5699C5E5.90702@gmail.com> <569CC357.8030302@suse.cz>
 <569FEEDE.4060409@gmail.com> <56A053CE.7000500@daenzer.net>
 <56A06D2E.4000008@gmail.com> <56A07CF9.5060506@daenzer.net>
Cc: Daniel Vetter, LKML, dri-devel@lists.freedesktop.org, mgraesslin@kde.org,
 kwin@kde.org, Alex Deucher, Christian König
From: Mario Kleiner
Message-ID: <56A096B1.4060203@gmail.com>
Date: Thu, 21 Jan 2016 09:28:34 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
In-Reply-To: <56A07CF9.5060506@daenzer.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/21/2016 07:38 AM, Michel Dänzer wrote:
> On 21.01.2016 14:31, Mario Kleiner wrote:
>> On 01/21/2016 04:43 AM, Michel Dänzer wrote:
>>> On 21.01.2016 05:32, Mario Kleiner wrote:
>>>>
>>>> So the problem is that AMDs hardware frame counters reset to
>>>> zero during a modeset. The old DRM code dealt with drivers doing that by
>>>> keeping vblank irqs enabled during modesets and incrementing vblank
>>>> count by one during each vblank irq, i think that's what
>>>> drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.
>>>
>>> Right, looks like there's been a regression breaking this.
>>> I suspect the problem is that vblank->last isn't getting updated from
>>> drm_vblank_post_modeset. Not sure which change broke that though, or how
>>> to fix it. Ville?
>>>
>>
>> The whole logic has changed and the software counter updates are now
>> driven all the time by the hw counter.
>>
>>>
>>> BTW, I'm seeing a similar issue with drm_vblank_on/off as well, which
>>> exposed the bug fixed by 209e4dbc ("drm/vblank: Use u32 consistently for
>>> vblank counters"). I've been meaning to track that down since then; one
>>> of these days hopefully, but if anybody has any ideas offhand...
>>
>> I spent the last few hours reading through the drm and radeon code and i
>> think what should probably work is to replace the
>> drm_vblank_pre/post_modeset calls in radeon/amdgpu by drm_vblank_off/on
>> calls. These are apparently meant for drivers whose hw counters reset
>> during modeset, [...]
>
> ... just like drm_vblank_pre/post_modeset. That those were broken is a
> regression which needs to be fixed anyway. I don't think switching to
> drm_vblank_on/off is suitable for stable trees.
>
> Looking at Vlastimil's original post again, I'd say the most likely
> culprit is 4dfd6486 ("drm: Use vblank timestamps to guesstimate how many
> vblanks were missed").
>

Yes, i think reverting that one alone would likely fix it by reverting
to the old vblank update logic.

>
>> Once drm_vblank_off is called, drm_vblank_get will no-op and return an
>> error, so clients can't enable vblank irqs during the modeset - pageflip
>> ioctl and waitvblank ioctl would fail while a modeset happens -
>> hopefully userspace handles this correctly everywhere.
>
> We've fixed xf86-video-ati for this.
>
>
>> I'll hack up a patch for demonstration now.
>
> You're a bit late to that party. :)
>
> http://lists.freedesktop.org/archives/dri-devel/2015-May/083614.html
> http://lists.freedesktop.org/archives/dri-devel/2015-July/086451.html
>
>

Oops. Just sent out my little (so far untested) creations.
Yes, they are essentially the same as Daniel's patches. The only additions
are a fix for that other potential small race I describe, done by slightly
moving the xxx_pm_compute_clocks() calls around, and a fix for the
drm_vblank_get/put imbalance in radeon_pm if vblank_on/off were used.

-mario
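
For readers following the thread, the failure mode Michel suspects (a stale
vblank->last after the hardware counter resets) can be illustrated with a toy
model. This is only a sketch under stated assumptions, not kernel code: the
names ToyCrtc, update_from_hw and simulate are hypothetical and are not DRM
symbols. It assumes the software count is advanced by the u32 difference
between the current hardware counter and the last sampled value, and that a
modeset resets the hardware counter to zero. The real update logic is more
involved (per the commit named above, it also uses vblank timestamps).

```python
U32_MASK = 0xFFFFFFFF  # counters are treated as unsigned 32-bit values


class ToyCrtc:
    """Hypothetical stand-in for a CRTC whose hw frame counter resets on modeset."""

    def __init__(self):
        self.hw_count = 0

    def vblank(self):
        # One vertical blank elapses; the hw counter ticks (with u32 wraparound).
        self.hw_count = (self.hw_count + 1) & U32_MASK

    def modeset(self):
        # Assumption taken from the thread: the hw counter goes back to zero.
        self.hw_count = 0


def update_from_hw(sw_count, last_hw, cur_hw):
    """Advance the software count by how far the hw counter moved since last_hw."""
    diff = (cur_hw - last_hw) & U32_MASK
    return (sw_count + diff) & U32_MASK, cur_hw


def simulate(resample_after_modeset):
    """Run 100 vblanks, a modeset, then one more vblank; return the sw count."""
    crtc = ToyCrtc()
    sw, last = 0, 0
    for _ in range(100):
        crtc.vblank()
        sw, last = update_from_hw(sw, last, crtc.hw_count)
    crtc.modeset()
    if resample_after_modeset:
        # Refresh the stored sample so the next diff is taken from the reset value.
        last = crtc.hw_count
    crtc.vblank()
    sw, last = update_from_hw(sw, last, crtc.hw_count)
    return sw


if __name__ == "__main__":
    print("stale last:    ", simulate(False))
    print("resampled last:", simulate(True))
```

In this model, leaving the stored sample stale makes the software count fall
from 100 back to 1 after the modeset, while resampling it yields the expected
101. A client waiting for count 101 would then wait a very long time, which
would be consistent with the hang in the subject line, although the thread
does not confirm that this is the exact mechanism.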