From: Grigori Goronzy
Subject: Re: CIK hangs with kernel 3.15, bisected
Date: Tue, 13 May 2014 01:38:30 +0200
Message-ID: <53715B76.9040304@chown.ath.cx>
In-Reply-To: <5370C3AB.1080903@vodafone.de>
References: <536D4B08.8030000@chown.ath.cx> <536DE204.5060308@vodafone.de> <536E552F.6010401@vodafone.de> <536F3D93.6050604@vodafone.de> <5370C3AB.1080903@vodafone.de>
To: Christian König, Marek Olšák
Cc: dri-devel
List-Id: dri-devel@lists.freedesktop.org

I can confirm this fixes it for me, too.

3.15 with these fixes and the large PTE patches actually ends up being
noticeably slower than earlier kernels with Xonotic, though. I wonder
what's going on.

Grigori

On 12.05.2014 14:50, Christian König wrote:
> I could reproduce the problem with xonotic and I think I've found the
> issue.
>
> Please test the attached patch.
>
> Thanks,
> Christian.
>
> On 11.05.2014 11:06, Christian König wrote:
>>> I have tested it and it doesn't fix the hangs.
>> Yeah, thought so. Well it was just a guess.
>>
>>> (Also, I don't like the patch, because it reverts the behavior I added
>>> for userspace buffers.)
>> Actually it shouldn't affect that. The alternative domain always
>> contains GART even when userspace only specified VRAM as placement (as
>> long as it is technically possible to do so).
>>
>> So what should happen is that TTM sees the current placement, matches
>> that with the desired placement and should find that it doesn't need
>> to move the buffer (we should just test if this behavior really works
>> as expected).
>>
>> Christian.
>>
>> On 10.05.2014 23:38, Marek Olšák wrote:
>>> Hi Christian,
>>>
>>> I have tested it and it doesn't fix the hangs.
>>>
>>> (Also, I don't like the patch, because it reverts the behavior I added
>>> for userspace buffers.)
>>>
>>> Marek
>>>
>>>
>>>
>>> On Sat, May 10, 2014 at 6:34 PM, Christian König wrote:
>>>> Couldn't reproduce the issue so far. So the attached patch is just a
>>>> complete shot in the dark found by rereading the code, but it might
>>>> actually be the problem.
>>>>
>>>> Please give it a try.
>>>>
>>>> Going to keep testing in the meantime,
>>>> Christian.
>>>>
>>>> On 10.05.2014 10:23, Christian König wrote:
>>>>
>>>>>> I see hangs with kernel 3.15 and SI under memory pressure, e.g. if
>>>>>> I boot with radeon.vramlimit=256 and then run Xonotic timedemo with
>>>>>> high settings. I haven't had a chance to bisect it yet, but it might
>>>>>> be a similar problem.
>>>>> Sounds like the same issue to me. Thx for the good test case.
>>>>>
>>>>>> Any idea what is wrong with it?
>>>>> Actually I was already wondering why it went so smoothly without any
>>>>> regression so far; I hadn't noticed the bug in bugzilla.kernel.org yet.
>>>>>
>>>>>> Some of the tests allocate a lot of MSAA textures and the tests also
>>>>>> run in parallel, which creates a lot of memory pressure and probably
>>>>>> causes buffer evictions.
>>>>> Sounds like the underlying problem to me. We probably evict some
>>>>> part of a page table without updating the page directory. Going to
>>>>> dig into it today, it's probably just a one-liner missing somewhere
>>>>> in the VM code.
>>>>>
>>>>> Christian.
>>>>>
>>>>> On 09.05.2014 23:39, Grigori Goronzy wrote:
>>>>>> On 09.05.2014 20:03, Marek Olšák wrote:
>>>>>>>
>>>>>>> This commit which first appeared in 3.15-rc1 causes hangs on
>>>>>>> Bonaire:
>>>>>>> [...]
>>>>>>>
>>>>>>> The simplest way to reproduce the hangs is to run piglit with these
>>>>>>> parameters:
>>>>>>> -t texelFetch.fs
>>>>>>>
>>>>>>> Some of the tests allocate a lot of MSAA textures and the tests also
>>>>>>> run in parallel, which creates a lot of memory pressure and probably
>>>>>>> causes buffer evictions.
>>>>>>>
>>>>>> I see hangs with kernel 3.15 and SI under memory pressure, e.g. if
>>>>>> I boot with radeon.vramlimit=256 and then run Xonotic timedemo with
>>>>>> high settings. I haven't had a chance to bisect it yet, but it might
>>>>>> be a similar problem.
>>>>>>
>>>>>> Grigori
>>>>>
>>
>