From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1758321AbcAUFbf (ORCPT ); Thu, 21 Jan 2016 00:31:35 -0500
Received: from
    mail-wm0-f49.google.com ([74.125.82.49]:38690 "EHLO mail-wm0-f49.google.com" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S1750934AbcAUFbc (ORCPT ); Thu, 21 Jan 2016 00:31:32 -0500
Subject: Re: linux-4.4 bisected: kwin5 stuck on kde5 loading screen with radeon
To: Michel Dänzer, Vlastimil Babka, Ville Syrjälä
References: <5698CB20.9050602@suse.cz> <20160115122629.GC23290@intel.com> <5699C5E5.90702@gmail.com> <569CC357.8030302@suse.cz> <569FEEDE.4060409@gmail.com> <56A053CE.7000500@daenzer.net>
Cc: Daniel Vetter, LKML, dri-devel@lists.freedesktop.org, mgraesslin@kde.org, kwin@kde.org, Alex Deucher, Christian König
From: Mario Kleiner
Message-ID: <56A06D2E.4000008@gmail.com>
Date: Thu, 21 Jan 2016 06:31:26 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: <56A053CE.7000500@daenzer.net>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/21/2016 04:43 AM, Michel Dänzer wrote:
> On 21.01.2016 05:32, Mario Kleiner wrote:
>>
>> So the problem is that AMDs hardware frame counters reset to
>> zero during a modeset. The old DRM code dealt with drivers doing that by
>> keeping vblank irqs enabled during modesets and incrementing vblank
>> count by one during each vblank irq, i think that's what
>> drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.
>
> Right, looks like there's been a regression breaking this. I suspect the
> problem is that vblank->last isn't getting updated from
> drm_vblank_post_modeset. Not sure which change broke that though, or how
> to fix it. Ville?
>

The whole logic has changed and the software counter updates are now
driven all the time by the hw counter.
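
To make that concrete, here is a toy Python model of a software count that is driven purely by hardware counter deltas. It is only a sketch to show the arithmetic, not the kernel code: the class name `VblankModel`, its fields, and the 24-bit `MAX_HW_COUNT` are assumptions picked for illustration.

```python
MAX_HW_COUNT = 0xFFFFFF  # assumed 24-bit hardware frame counter (illustrative)


class VblankModel:
    """Toy model: software vblank count driven by hardware counter deltas."""

    def __init__(self):
        self.count = 0  # software vblank count as seen by clients
        self.last = 0   # hardware counter value at the previous update

    def update(self, hw_count):
        # Wrap-safe delta: any decrease is interpreted as a counter wraparound.
        diff = (hw_count - self.last) & MAX_HW_COUNT
        self.count += diff
        self.last = hw_count
        return diff


model = VblankModel()
model.update(1000)      # normal operation: 1000 frames have elapsed
jump = model.update(0)  # a modeset resets the hardware counter to zero
print(f"apparent jump of {jump} frames, software count is now {model.count}")
```

In this model the reset is indistinguishable from a wraparound, so the software count leaps forward by almost the full counter range instead of advancing by a frame or two.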
>
> BTW, I'm seeing a similar issue with drm_vblank_on/off as well, which
> exposed the bug fixed by 209e4dbc ("drm/vblank: Use u32 consistently for
> vblank counters"). I've been meaning to track that down since then; one
> of these days hopefully, but if anybody has any ideas offhand...
>
>

I spent the last few hours reading through the drm and radeon code and I
think what should probably work is to replace the
drm_vblank_pre/post_modeset calls in radeon/amdgpu with drm_vblank_off/on
calls. These are apparently meant for drivers whose hw counters reset
during modeset, and seem to reinitialize stuff properly and release
clients' queued vblank events to avoid blocking - not tested so far, just
looked at the code.

Once drm_vblank_off is called, drm_vblank_get will no-op and return an
error, so clients can't enable vblank irqs during the modeset - pageflip
ioctl and waitvblank ioctl would fail while a modeset happens -
hopefully userspace handles this correctly everywhere.

It would also cause radeon's power management to not sync its actions to
vblank if it got invoked during a modeset, but that seems to be
handled by a 200 msec timeout and would hopefully only cause visual
glitches - or invisible glitches while the crtc is blanked during modeset?

There could be another tiny race with the new "vblank counter bumping"
logic from commit 5b5561b ("drm/radeon: Fixup hw vblank counters/ts
...") if drm_update_vblank_counter() were called multiple times in
quick succession within the "radeon_crtc->lb_vblank_lead_lines"
scanlines before start of real vblank iff at the same time a modeset
happened and set radeon_crtc->lb_vblank_lead_lines to a smaller
value due to a change in horizontal mode resolution.
That needs a modeset to a higher horizontal resolution to happen exactly
when the scanout is within the right 5 or so scanlines, while some client
is calling drm_vblank_get() to enable vblank irqs at the same time. It
would cause the same hang if it happened - not that likely to happen
often, but still not nice, and there's Murphy's law... If we could switch to
drm_vblank_off/on instead of drm_vblank_pre/post_modeset, we could remove
those races as well by forbidding any vblank irq related activity during
a modeset.

I'll hack up a patch for demonstration now.
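
In the meantime, the intended off/on behaviour can be sketched as a toy Python model. This is not the patch and not the real DRM structures: `Crtc`, its fields, the 24-bit counter width, and the use of EINVAL as the error code are all assumptions made for illustration. It only mirrors the three properties described above: queued events are released at off, "get" fails while off, and the count resyncs to the reset hardware counter at on.

```python
import errno

MAX_HW_COUNT = 0xFFFFFF  # assumed 24-bit hardware frame counter (illustrative)


class Crtc:
    """Toy stand-in for per-crtc vblank state; not the real DRM structures."""

    def __init__(self):
        self.hw_count = 0    # simulated hardware frame counter
        self.count = 0       # software vblank count
        self.last = 0        # hardware counter value at the last update
        self.enabled = True  # False between vblank_off() and vblank_on()
        self.pending = []    # queued client vblank events

    def update(self):
        # Advance the software count by the wrap-safe hardware delta.
        self.count += (self.hw_count - self.last) & MAX_HW_COUNT
        self.last = self.hw_count

    def vblank_get(self):
        # Refuse new vblank references while vblank handling is switched off.
        # EINVAL is a placeholder error code for this sketch.
        return 0 if self.enabled else -errno.EINVAL

    def vblank_off(self):
        self.update()  # account for all frames up to this point
        self.enabled = False
        released, self.pending = self.pending, []
        return released  # waiters are released instead of left blocked

    def vblank_on(self):
        self.last = self.hw_count  # resync to the (possibly reset) counter
        self.enabled = True


crtc = Crtc()
crtc.hw_count = 1000
crtc.pending.append("flip-event")
released = crtc.vblank_off()  # start of modeset
crtc.hw_count = 0             # modeset resets the hardware counter
during = crtc.vblank_get()    # a client tries to enable vblank irqs
crtc.vblank_on()              # end of modeset
crtc.hw_count = 3             # three frames after the modeset
crtc.update()
print(released, during, crtc.count)
```

Because `last` is resynchronized in `vblank_on()`, the reset to zero never enters the delta computation, and the software count simply continues from where it stood at `vblank_off()`.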