From: Jerome Glisse
Subject: Re: CONFIG_DMA_CMA causes ttm performance problems/hangs.
Date: Tue, 12 Aug 2014 22:04:53 -0400
Message-ID: <20140813020452.GB3001@gmail.com>
References: <53E50C1B.9080507@gmail.com> <53E5B41B.3030009@vmware.com> <60bd3db2-4919-40c4-a4ff-1b7b043cadfc@email.android.com> <53E628FE.10808@vmware.com> <53E6E2CE.8070005@gmail.com> <53E75192.3070003@vmware.com> <53E7B39D.2060900@gmail.com> <53E896C9.5010501@vmware.com> <20140811151712.GA3541@gmail.com> <53EAC461.2060503@daenzer.net>
In-Reply-To: <53EAC461.2060503@daenzer.net>
To: Michel Dänzer
Cc: Thomas Hellstrom, Konrad Rzeszutek Wilk, kamal@canonical.com, LKML, "dri-devel@lists.freedesktop.org", Dave Airlie, ben@decadent.org.uk, m.szyprowski@samsung.com
List-Id: dri-devel@lists.freedesktop.org

On Wed, Aug 13, 2014 at 10:50:25AM +0900, Michel Dänzer wrote:
> On 12.08.2014 00:17, Jerome Glisse wrote:
> > On Mon, Aug 11, 2014 at 12:11:21PM +0200, Thomas Hellstrom wrote:
> >> On 08/10/2014 08:02 PM, Mario Kleiner wrote:
> >>> On 08/10/2014 01:03 PM, Thomas Hellstrom wrote:
> >>>> On 08/10/2014 05:11 AM, Mario Kleiner wrote:
> >>>>>
> >>>>> The other problem is that probably TTM does not reuse pages from the
> >>>>> DMA pool.
> >>>>> If i trace the __ttm_dma_alloc_page and __ttm_dma_free_page calls for
> >>>>> those single page allocs/frees, then over a 20 second interval of
> >>>>> tracing and switching tabs in firefox, scrolling things around etc. i
> >>>>> find about as many alloc's as i find free's, e.g., 1607 allocs vs.
> >>>>> 1648 frees.
> >>>> This is because historically the pools have been designed to keep only
> >>>> pages with nonstandard caching attributes since changing page caching
> >>>> attributes have been very slow but the kernel page allocators have been
> >>>> reasonably fast.
> >>>>
> >>>> /Thomas
> >>>
> >>> Ok. A bit more ftraceing showed my hang problem case goes through the
> >>> "if (is_cached)" paths, so the pool doesn't recycle anything and i see
> >>> it bouncing up and down by 4 pages all the time.
> >>>
> >>> But for the non-cached case, which i don't hit with my problem, could
> >>> one of you look at line 954...
> >>>
> >>> https://urldefense.proofpoint.com/v1/url?u=http://lxr.free-electrons.com/source/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c%23L954&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=l5Ago9ekmVFZ3c4M6eauqrJWGwjf6fTb%2BP3CxbBFkVM%3D%0A&m=QQSN6uVpEiw6RuWLAfK%2FKWBFV5HspJUfDh4Y2mUz%2FH4%3D%0A&s=e15c51805d429ee6d8960d6b88035e9811a1cdbfbf13168eec2fbb2214b99c60
> >>>
> >>>
> >>> ... and tell me why that unconditional npages = count; assignment
> >>> makes sense? It seems to essentially disable all recycling for the dma
> >>> pool whenever the pool isn't filled up to/beyond its maximum with free
> >>> pages? When the pool is filled up, lots of stuff is recycled, but when
> >>> it is already somewhat below capacity, it gets "punished" by not
> >>> getting refilled? I'd just like to understand the logic behind that line.
> >>>
> >>> thanks,
> >>> -mario
> >>
> >> I'll happily forward that question to Konrad who wrote the code (or it
> >> may even stem from the ordinary page pool code which IIRC has Dave
> >> Airlie / Jerome Glisse as authors)
> >
> > This is effectively bogus code, i now wonder how it came to stay alive.
> > Attached patch will fix that.
>
> I haven't tested Mario's scenario specifically, but it survived piglit
> and the UE4 Effects Cave Demo (for which 1GB of VRAM isn't enough, so
> some BOs ended up in GTT instead with write-combined CPU mappings) on
> radeonsi without any noticeable issues.
>
> Tested-by: Michel Dänzer
>

My patch does not fix the CMA bug: CMA should not allocate single pages in
its reserved contiguous memory. But CMA is a broken technology in the first
place and it should not be enabled on x86; whoever did that is a moron. So I
would definitely encourage opening a bug against CMA.

Nonetheless, the TTM code was buggy too, and this patch will fix that, but it
will only alleviate or delay the symptoms reported by Mario.

Cheers,
Jérôme

>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Mesa and X developer
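
For anyone following the thread, the recycling question Mario raises about
the unconditional npages = count; assignment can be illustrated with a toy
model. This is only a sketch: it is not the TTM code and not the attached
patch. It just counts pages, and toy_pool, toy_put_pages_old and
toy_put_pages_new are hypothetical names. It assumes, following Mario's
description, that the old path starts from npages = count and only switches
to the excess once the pool is over its cap, and it contrasts that with
starting from zero. The minimum free batch size used by the real code is
left out.

```c
#include <assert.h>

/* Toy model of a page pool: it only counts pages, no memory is managed.
 * All names here are hypothetical and exist only for illustration. */
struct toy_pool {
    unsigned npages_free; /* pages currently kept in the pool */
    unsigned max_size;    /* cap on pages the pool may keep   */
};

/* Assumed old behaviour: npages starts at count, so whenever the pool is
 * not over its cap, every page just returned is released again. */
unsigned toy_put_pages_old(struct toy_pool *pool, unsigned count)
{
    unsigned npages;

    pool->npages_free += count;
    npages = count; /* the assignment questioned in the thread */
    if (pool->npages_free > pool->max_size)
        npages = pool->npages_free - pool->max_size;

    assert(npages <= pool->npages_free);
    pool->npages_free -= npages; /* released back to the system */
    return npages;
}

/* Assumed alternative: npages starts at zero, so only the excess above
 * the cap is released and the rest stays available for reuse. */
unsigned toy_put_pages_new(struct toy_pool *pool, unsigned count)
{
    unsigned npages = 0;

    pool->npages_free += count;
    if (pool->npages_free > pool->max_size)
        npages = pool->npages_free - pool->max_size;

    pool->npages_free -= npages; /* released back to the system */
    return npages;
}
```

With an empty pool capped at 16 pages, returning 4 pages leaves the pool at
0 pages under the first function and at 4 pages under the second. Once the
pool is above its cap, both functions trim only the excess.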