From mboxrd@z Thu Jan 1 00:00:00 1970
From: robin.murphy@arm.com (Robin Murphy)
Date: Wed, 20 Apr 2016 11:51:06 +0100
Subject: Nouveau crashes in 4.6-rc on arm64
In-Reply-To: <57175DA7.3000505@arm.com>
References: <57064992.1060509@arm.com> <570737F5.30105@nvidia.com> <5707FC9F.50905@arm.com> <570B50B4.4020304@nvidia.com> <571706FF.1010300@nvidia.com> <57175DA7.3000505@arm.com>
Message-ID: <57175F1A.5060708@arm.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On 20/04/16 11:44, Robin Murphy wrote:
> Hi Alex,
>
> On 20/04/16 05:35, Alexandre Courbot wrote:
> [...]
>>>> Bisection came down to 1733a2ad3674("drm/nouveau/device/pci: set as
>>>> non-CPU-coherent on ARM64"), and sure enough reverting that removes the
>>>> crash.
>>>
>>> Thanks for taking the time to bisect this. And apologies as it seems my
>>> commit is the reason for your troubles.
>>>
>>> The CPU coherency flag is used for two things: explicitly sync buffers
>>> pages when required, and allocating buffers that are not explicitly
>>> synced (like fences or pushbuffers) using the DMA API. For this latter
>>> use, it also accesses the buffer's content using the mapping provided by
>>> dma_alloc_coherent() instead of creating a new one. All nouveau_bos are
>>> supposed to be written using nouveau_bo_rd32(), and this function
>>> handles the case of an DMA-API allocated object by detecting that the
>>> result of ttm_kmap_obj_virtual() is NULL.
>>>
>>> But as it turns out, OUT_RINGp() also calls ttm_kmap_obj_virtual() in
>>> order to perform a memcpy and uses its result directly - which means we
>>> are doing memcpy on a NULL pointer. We never caught this because we
>>> typically do not use Nouveau's fbcon with an ARM setup.
>>>
>>> I don't really like this special access for coherent objects, and
>>> actually had a patch in my tree to attempt to remove it (attached).
>>> Although it is not the whole solution (see below), the issue should at
>>> least not be visible with it applied - could you confirm?
>>
>> Hi Robin, could you confirm whether the attached patch in my previous
>> mail helps with your problem?
>
> With that patch on top of -rc4, it's conjuring up something that looks
> somewhat more like a real address on top of the offset, as it now
> crashes with "Unable to handle kernel paging request at virtual address
> ffffff8008f841ac", rather than the previous "Unable to handle kernel
> NULL pointer dereference at virtual address 000001ac".
>
> That does of course mean it still crashes in the same place, though :(
>
> Robin.
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose the
> contents to any other person, use it for any purpose, or store or copy
> the information in any medium. Thank you.

And since I intentionally sent this to the lists, anyone reading that
_is_ an intended recipient, so it's all good, I promise!

[sorry, SMTP server mixup on my end... *berates self*]

Robin.

>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel