From mboxrd@z Thu Jan 1 00:00:00 1970
From: robin.murphy@arm.com (Robin Murphy)
Date: Fri, 8 Apr 2016 19:46:55 +0100
Subject: Nouveau crashes in 4.6-rc on arm64
In-Reply-To: <570737F5.30105@nvidia.com>
References: <57064992.1060509@arm.com> <570737F5.30105@nvidia.com>
Message-ID: <5707FC9F.50905@arm.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

Hi Alex,

On 08/04/16 05:47, Alexandre Courbot wrote:
> Hi Robin,
>
> On 04/07/2016 08:50 PM, Robin Murphy wrote:
>> Hello,
>>
>> With 4.6-rc2 (and -rc1) I'm seeing Nouveau blowing up at boot, from the
>> look of it by dereferencing some offset from NULL inside
>> nouveau_fbcon_imageblit(). My setup is an old XFX 7600GT card plugged
>> into an ARM Juno r1 board, which works fine with 4.5 and earlier.
>>
>> Attached are a couple of logs from booting arm64 defconfig plus DRM and
>> Nouveau enabled - the second also has framebuffer console rotation
>> turned on, which interestingly seems to move the point of failure, and
>> the display does eventually come up to show the tail end of the panic in
>> that case.
>>
>> I might be able to find time for a full bisection next week if it isn't
>> something sufficiently obvious to anyone who knows this driver.
>
> Looking at the log it is not clear to me what could be causing this. I
> can boot 4.6-rc2 with a GM206 card without any issue. A bisect would
> indeed be useful here.

OK, turns out the lure of writing something to remotely drive a Juno and
parse kernel bootlogs through an automatic bisection was too great to
resist on a Friday afternoon :D

Bisection came down to 1733a2ad3674("drm/nouveau/device/pci: set as
non-CPU-coherent on ARM64"), and sure enough reverting that removes the
crash.
I have to say, that commit looks pretty bogus anyway - since
de335bb49269("PCI: Update DMA configuration from DT") in 4.1, PCI
devices should correctly inherit the coherency property from their host
controller's DT node and get the appropriate DMA ops assigned. From a
brief look at the Nouveau code, I guess it could possibly be the
assumptions of the TTM stuff going awry in the presence of coherent DMA
ops. Regardless of how the code goes wrong, though, it's trivially
incorrect to have a blanket statement that PCI devices are non-coherent
on arm64, so whatever the original issue was this isn't the right way to
fix it.

Robin.

>
> Thanks,
> Alex.
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
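
An automatic bisection like the one described above can be driven by
`git bisect run`, which needs a command that reports each revision as
good (exit 0), bad (exit 1) or untestable (exit 125). What follows is only
a minimal sketch of the log-parsing half, not the script actually used
for this report. The crash strings, the "login:" success marker and the
name classify_boot_log are all assumptions chosen for illustration, and
building, flashing and capturing the console of the board are left out.

```python
"""Sketch of a boot-log classifier for use with `git bisect run`."""
import sys

# Exit codes that `git bisect run` interprets as good / bad / skip.
GOOD, BAD, SKIP = 0, 1, 125

# Assumed markers: adjust to whatever the board's console really prints.
CRASH_MARKERS = (
    "Unable to handle kernel NULL pointer dereference",
    "Kernel panic",
)
BOOTED_MARKER = "login:"  # hypothetical sign that userspace came up


def classify_boot_log(log: str) -> int:
    """Map a captured console log to a `git bisect run` exit code."""
    if any(marker in log for marker in CRASH_MARKERS):
        return BAD
    if BOOTED_MARKER in log:
        return GOOD
    # Neither crashed nor booted (build failure, hung board, ...): skip.
    return SKIP


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: classify.py <path-to-captured-boot-log>
    with open(sys.argv[1], errors="replace") as fh:
        sys.exit(classify_boot_log(fh.read()))
```

A wrapper that builds and boots the kernel and then calls this on the
captured log could be handed to `git bisect run`. Returning 125 for
inconclusive boots keeps a flaky board from being recorded as a
regression.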