From: Christian König
Subject: Re: [PATCH] drm/radeon: Inline r100_mm_rreg, v2
Date: Sun, 20 Apr 2014 15:44:32 +0200
Message-ID: <5353CF40.4080504@vodafone.de>
References: <20140419001147.89fe2275.cand@gmx.com> <53524A99.1070705@vodafone.de> <20140419203305.ae6edc0e.cand@gmx.com>
In-Reply-To: <20140419203305.ae6edc0e.cand@gmx.com>
To: Lauri Kasanen, Alex Deucher
Cc: Maling list - DRI developers
List-Id: dri-devel@lists.freedesktop.org

On 19.04.2014 19:33, Lauri Kasanen wrote:
> On Sat, 19 Apr 2014 11:15:53 -0400
> Alex Deucher wrote:
>
>> On Sat, Apr 19, 2014 at 6:06 AM, Christian König
>>>> This was originally un-inlined by Andi Kleen in 2011 citing size concerns.
>>>> Indeed, a first attempt at inlining it grew radeon.ko by 7%.
>>>>
>>>> However, 2% of cpu is spent in this function. Simply inlining it gave 1%
>>>> more fps in Urban Terror.
>>>>
>>>> v2: We know the minimum MMIO size. Adding it to the if allows the compiler
>>>> to optimize the branch out, improving both performance and size.
>>>>
>>>> The v2 patch decreases radeon.ko size by 2%.
>>>> I didn't re-benchmark, but common sense says perf is now more than 1%
>>>> better.
>>> Nice!
>>>
>>> But are you sure that the register PCI bar is always at least 64K in size?
>>> Keep in mind that this code is used over all generations since R100.
>>> In addition to that we probably should have a define for that and also apply
>>> the optimizations to r100_mm_wreg as well.
> Yes, I checked the earlier code. It had 64kb hard-coded, and when it
> was changed in 2010 to use the dynamic value, the commit said later
> asics are larger. (07bec2df01)
>
> A quick google also didn't find any dmesg with smaller values, R100
> cards had 64kb.

Thanks for digging that up. This indeed sounds valid and like a very
nice optimization.

As suggested before, please add a define for this in radeon.h, something
like RADEON_MIN_PCI_BAR_SIZE, and do the same for r100_mm_wreg as well.

>> If most of the register accesses are for the interrupt setup, I wonder
>> if it would be better to just clean up the irq_set functions to reduce
>> the register accesses. E.g., only touch the registers for the
>> specific irq masks that have changed.
> Yes, that should also be done. But as this function is used elsewhere
> as well, having it fast (not to mention the size decrease) would be
> good.
>
> I think this patch is safe enough for 3.15, but perhaps it's too late
> now.

Yeah, probably. But 3.16 should work as well.

Christian.

> - Lauri
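
For illustration, the idea discussed in this thread can be sketched in
user-space C. This is not the actual patch: struct fake_device, mm_rreg,
mm_wreg, MIN_MMIO_SIZE and BACKING_WORDS are hypothetical names, and a
plain array stands in for the MMIO aperture and the index/data register
pair. The only facts taken from the thread are the 64 KiB minimum BAR
size and the trick of adding that constant to the "if".

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64 KiB minimum register BAR, as stated in the thread. */
#define MIN_MMIO_SIZE 0x10000u
/* Hypothetical size of the simulated register space (128 KiB). */
#define BACKING_WORDS (0x20000u / 4u)

struct fake_device {
	uint32_t *backing;          /* simulated register space */
	uint32_t mmio_size;         /* directly mapped bytes, known at run time */
	uint32_t last_index;        /* stands in for the index register */
	uint32_t indirect_accesses; /* counts slow-path accesses */
};

static inline uint32_t mm_rreg(struct fake_device *dev, uint32_t reg)
{
	/*
	 * The constant comparison is redundant at run time, but when reg is
	 * a compile-time constant below 64 KiB it lets the compiler prove
	 * the condition true and drop the slow path from the inlined copy.
	 */
	if (reg < MIN_MMIO_SIZE || reg < dev->mmio_size)
		return dev->backing[reg / 4];

	/* Slow path: select the register, then read it back indirectly. */
	assert(reg / 4 < BACKING_WORDS);
	dev->last_index = reg;
	dev->indirect_accesses++;
	return dev->backing[reg / 4];
}

static inline void mm_wreg(struct fake_device *dev, uint32_t reg, uint32_t v)
{
	/* Same condition as the read side, as requested in the review. */
	if (reg < MIN_MMIO_SIZE || reg < dev->mmio_size) {
		dev->backing[reg / 4] = v;
		return;
	}

	assert(reg / 4 < BACKING_WORDS);
	dev->last_index = reg;
	dev->indirect_accesses++;
	dev->backing[reg / 4] = v;
}
```

With optimization enabled, a call such as mm_rreg(dev, 0x40) with a
literal offset should reduce to the direct load alone, which is the
likely reason the inlined v2 ends up smaller than the out-of-line
version. Offsets at or above both limits still take the indirect path.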