From: Lauri Kasanen
Subject: Re: [PATCH] drm/radeon: Inline r100_mm_rreg, v2
Date: Sat, 19 Apr 2014 20:33:05 +0300
Message-ID: <20140419203305.ae6edc0e.cand@gmx.com>
References: <20140419001147.89fe2275.cand@gmx.com> <53524A99.1070705@vodafone.de>
To: Alex Deucher
Cc: Mailing list - DRI developers
List-Id: dri-devel@lists.freedesktop.org

On Sat, 19 Apr 2014 11:15:53 -0400
Alex Deucher wrote:

> On Sat, Apr 19, 2014 at 6:06 AM, Christian König wrote:
> >> This was originally un-inlined by Andi Kleen in 2011 citing size concerns.
> >> Indeed, a first attempt at inlining it grew radeon.ko by 7%.
> >>
> >> However, 2% of cpu is spent in this function. Simply inlining it gave 1%
> >> more fps in Urban Terror.
> >>
> >> v2: We know the minimum MMIO size. Adding it to the if allows the compiler
> >> to optimize the branch out, improving both performance and size.
> >>
> >> The v2 patch decreases radeon.ko size by 2%. I didn't re-benchmark, but
> >> common sense says perf is now more than 1% better.
> >
> > Nice!
> >
> > But are you sure that the register PCI bar is always at least 64K in size?
> > Keep in mind that this code is used over all generations since R100.
> > Additional to that we probably should have a define for that and also apply
> > the optimizations to r100_mm_wreg as well.

Yes, I checked the earlier code. It had 64 KB hard-coded, and when it
was changed in 2010 to use the dynamic value, the commit (07bec2df01)
said later ASICs are larger. A quick Google search also didn't find any
dmesg output with smaller values; R100 cards had 64 KB.

> If most of the register accesses are for the interrupt setup, I wonder
> if it would be better to just clean up the irq_set functions to reduce
> the register accesses. E.g., only touch the registers for the
> specific irq masks have changed.

Yes, that should also be done. But as this function is used elsewhere
as well, having it fast (not to mention the size decrease) would be
good.

I think this patch is safe enough for 3.15, but perhaps it's too late
now.

- Lauri
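
For illustration, the v2 idea discussed above can be sketched roughly as follows. This is a standalone user-space model, not the actual patch: `struct fake_dev`, `MIN_MMIO_SIZE`, `REG_SPACE`, `read_indirect` and `mm_rreg` are all hypothetical names, and the register file is a plain array rather than real MMIO. The point is only the shape of the condition: once a compile-time lower bound on the MMIO size is part of the `if`, any call with a constant register offset below that bound can have the branch folded away when the function is inlined.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed lower bound on the mapped register BAR (64 KB, per the thread). */
#define MIN_MMIO_SIZE 0x10000u
/* Size of the simulated register file; offsets passed in must stay below it. */
#define REG_SPACE     0x20000u

/* Hypothetical stand-in for the device structure. */
struct fake_dev {
    uint32_t mmio_size;            /* size of the mapped window, >= MIN_MMIO_SIZE */
    uint32_t regs[REG_SPACE / 4];  /* simulated registers, indexed by offset / 4 */
    unsigned indirect_reads;       /* counts how often the slow path was taken */
};

/* Slow path: models an index/data style access for offsets outside the window. */
static uint32_t read_indirect(struct fake_dev *dev, uint32_t reg)
{
    dev->indirect_reads++;
    return dev->regs[reg / 4];
}

/*
 * Inlined read. For a constant reg below MIN_MMIO_SIZE the second half of
 * the condition is known true at compile time, so the compiler is free to
 * drop both the runtime size check and the slow path at that call site.
 */
static inline uint32_t mm_rreg(struct fake_dev *dev, uint32_t reg)
{
    if (reg < dev->mmio_size || reg < MIN_MMIO_SIZE)
        return dev->regs[reg / 4];   /* fast path: direct read */
    return read_indirect(dev, reg);
}

/* Small self-check exercising both paths. */
static void self_check(void)
{
    static struct fake_dev d;        /* static: zero-initialised, off the stack */
    d.mmio_size = MIN_MMIO_SIZE;
    d.regs[0x40 / 4] = 0xdeadbeefu;
    d.regs[0x10004 / 4] = 0x1234u;

    assert(mm_rreg(&d, 0x40) == 0xdeadbeefu);   /* inside the window */
    assert(d.indirect_reads == 0);
    assert(mm_rreg(&d, 0x10004) == 0x1234u);    /* outside the window */
    assert(d.indirect_reads == 1);
}
```

Whether the branch actually disappears depends on the compiler and optimization level; the sketch only shows why a constant bound makes that possible.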