From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
To: Juergen Gross
Cc: Chris Wilson, Linux Kernel Mailing List, dri-devel@lists.freedesktop.org, intel-gfx, airlied@linux.ie, jani.nikula@linux.intel.com, daniel.vetter@intel.com, Boris Ostrovsky
Subject: Re: i915 regression in kernel 4.10
Date: Tue, 20 Dec 2016 09:42:46 -0500
Message-ID: <20161220144246.GA23668@char.us.oracle.com>
References: <7abf8559-3aa7-af3a-8dc1-1dee42019fcd@suse.com> <20161219122934.GM29871@nuc-i3427.alporthouse.com> <3de0be86-c0bc-6bfd-defa-745b589d7bd9@suse.com>
In-Reply-To: <3de0be86-c0bc-6bfd-defa-745b589d7bd9@suse.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.7.1 (2016-10-04)
List-Id: dri-devel@lists.freedesktop.org

On Mon, Dec 19, 2016 at 03:16:44PM +0100, Juergen Gross wrote:
> On 19/12/16 13:29, Chris Wilson wrote:
> > On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote:
> >> With recent 4.10 kernel the graphics isn't coming up under Xen.
> >> First failure message is:
> >>
> >> [   46.656649] i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)
> >
> > Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we
> > call dma_map_sg() and pass the sg to swiotlb (in this case) for
> > remapping, and we do check for an error value of 0. After that error,
> > SWIOTLB_MAP_ERROR is propagated back and converted to 0 for
> > dma_map_sg(). That looks valid, and we should report ENOMEM back to the
> > caller.
> >
> >> Later I see splats like:
> >>
> >> [   49.393583] general protection fault: 0000 [#1] SMP
> >
> > What was the faulting address? RAX is particularly non-pointer-like so I
> > wonder if we walked onto an uninitialised portion of the sgtable. We may
> > have tripped over a bug in our sg_page iterator.
>
> During the bisect process there have been either GP or NULL pointer
> dereferences or other page faults. Typical addresses were:
>
> xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000000000018
> xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000003020118
>
> >
> > The attached patch should prevent an early ENOMEM following the swiotlb
> > allocation failure. But I suspect that we will still be tripping up the
> > failure in the sg walker when binding to the GPU.
> > -Chris
> >
>
> The patch is working not too bad. :-)
>
> Still several "swiotlb buffer is full" messages (some with sz:, most
> without), but no faults any more (neither GP nor NULL pointer
> dereference). Graphical login is working now.

I think I know why. The optimization that was added assumes that bus
addresses are the same as physical addresses. Hence it packs all of the
virtual addresses in the sg, and hands it off to SWIOTLB, which walks
each one and realizes that it has to use the bounce buffer.

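As a rough illustration of the coalescing described above, here is a
minimal user-space sketch. It is a model only, not the i915 code: the
names coalesce_pages and struct coalesce_stats, and the fixed 4 KiB
PAGE_SIZE, are assumptions made up for this example. Pages with
consecutive pfns are merged into one segment until a size cap is reached.

```c
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumed page size for this model */

/* Summary of how a page list was split into segments (hypothetical type). */
struct coalesce_stats {
    size_t nr_segments;      /* number of segments produced */
    size_t largest_segment;  /* size in bytes of the largest segment */
};

/*
 * Hypothetical model of segment coalescing: a new segment is started for
 * the first page, when the current segment has reached max_segment bytes,
 * or when the next pfn does not directly follow the previous one.
 */
static struct coalesce_stats coalesce_pages(const unsigned long *pfns,
                                            size_t n, size_t max_segment)
{
    struct coalesce_stats st = {0, 0};
    size_t cur_len = 0;
    unsigned long last_pfn = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == 0 || cur_len >= max_segment || pfns[i] != last_pfn + 1) {
            st.nr_segments++;
            cur_len = 0;
        }
        cur_len += PAGE_SIZE;
        if (cur_len > st.largest_segment)
            st.largest_segment = cur_len;
        last_pfn = pfns[i];
    }
    return st;
}
```

With a large cap, a run of consecutive pfns collapses into one multi-page
segment, and a bounce-buffering backend then has to find room for that
whole segment at once. Capping max_segment at one page yields one segment
per page instead.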
I am wondering if it would make sense to pull 'swiotlb_max_size' inside
of SWIOTLB and make it library-ish, so Xen-SWIOTLB can register as well
and report, say, that it can only provide one page (unless it is running
under baremetal).

Or make the usage of 'max_segment' and 'page_to_pfn(page) != last_pfn + 1'
in i915_gem_object_get_pages_gtt use something similar to
xen_biovec_phys_mergeable?
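
To make the second suggestion concrete, here is a small user-space sketch
of a merge check that also looks at bus frames. Everything in it is
hypothetical: struct toy_p2m and toy_pages_mergeable are invented for this
example, and the lookup table merely stands in for whatever pfn-to-bus-frame
translation the platform provides. It is not the kernel's
xen_biovec_phys_mergeable, only the same general idea: two pages are merged
only if they are adjacent both as pfns and as bus frames.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pfn -> bus frame table (stand-in for the real translation). */
struct toy_p2m {
    const unsigned long *bfn;  /* bfn[pfn] is the bus frame backing pfn */
    size_t nr;                 /* number of entries in bfn[] */
};

/*
 * Return true only when next_pfn directly follows prev_pfn AND the bus
 * frames behind them are consecutive as well. Out-of-range pfns are
 * treated as not mergeable.
 */
static bool toy_pages_mergeable(const struct toy_p2m *p2m,
                                unsigned long prev_pfn,
                                unsigned long next_pfn)
{
    if (prev_pfn >= p2m->nr || next_pfn >= p2m->nr)
        return false;
    if (next_pfn != prev_pfn + 1)
        return false;
    return p2m->bfn[next_pfn] == p2m->bfn[prev_pfn] + 1;
}
```

On a system where bus and physical addresses coincide, the table is the
identity and the check reduces to the existing pfn comparison. Where they
differ, pfn-adjacent pages with scattered bus frames stay in separate
segments.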