From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christian König
Subject: Re: [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT
Date: Fri, 18 Jul 2014 17:47:45 +0200
Message-ID: <53C941A1.60001@vodafone.de>
References: <1405591275-14461-1-git-send-email-michel@daenzer.net> <53C7A0D0.6080202@vodafone.de> <53C88F8C.40907@daenzer.net>
In-Reply-To: <53C88F8C.40907@daenzer.net>
To: Michel Dänzer
Cc: mesa-dev@lists.freedesktop.org, dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

On 18.07.2014 05:07, Michel Dänzer wrote:
>>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
>> I'm still not very keen with this change since I still don't understand
>> the reason why it's faster than with GTT. Definitely needs more testing
>> on a wider range of systems.
> Sure. If anyone wants to give this patch a spin and see if they can
> measure any performance difference, good or bad, that would be interesting.
>
>> Maybe limit it to APUs for now?
> But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an even
> bigger win with dedicated GPUs than with the Kaveri built-in GPU on my
> system. I suspect it may depend on the bandwidth available for PCIe vs.
> system memory though.

I've made a few tests today with the kernel part of the patches, running
Xonotic on Ultra at 1920 x 1080.

Without any patches I get around 47.0fps on average with my dedicated
HD7870.

Adding only "drm/radeon: Use write-combined CPU mappings of rings and
IBs on >= SI", that goes down to ~45.3fps.

Adding "drm/radeon: Use VRAM for indirect buffers on >= SI" on top of
that, the frame rate goes down to ~27.74fps.

So enabling this unconditionally is definitely not a good idea. What I
don't understand yet is why using USWC reduces the fps on SI as well. It
looks like the reads from the IB buffer for command stream validation on
SI affect that more than I thought.

Christian.
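
To put the three measurements in proportion, the relative slowdowns can be worked out with a few lines of Python. This is only a sketch: the dictionary labels and the helper name are illustrative shorthand, not patch or kernel identifiers, and the only data used are the three average frame rates quoted in this mail.

```python
# Average frame rates reported in this mail (Xonotic, Ultra, 1920 x 1080, HD7870).
# The keys are shorthand chosen for this sketch, not real patch names.
fps = {
    "baseline": 47.0,            # no patches applied
    "wc_rings_ibs": 45.3,        # write-combined CPU mappings of rings and IBs
    "wc_plus_vram_ibs": 27.74,   # the above plus indirect buffers in VRAM
}


def relative_change(baseline, value):
    """Return the change relative to baseline in percent (negative = slower)."""
    return (value - baseline) / baseline * 100.0


# Relative change of every configuration against the unpatched kernel.
changes = {name: relative_change(fps["baseline"], v) for name, v in fps.items()}

for name, pct in changes.items():
    print(f"{name:>18}: {fps[name]:6.2f} fps ({pct:+.1f}%)")
```

That works out to a drop of roughly 3.6% with the write-combined mappings alone, and roughly 41% once the indirect buffers are placed in VRAM as well.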