From mboxrd@z Thu Jan 1 00:00:00 1970
From: Arnd Bergmann
Subject: Re: [PATCH V3 00/14] genirq endian fixes; bcm7120/brcmstb IRQ updates
Date: Mon, 03 Nov 2014 12:56:52 +0100
Message-ID: <2217077.1aQXS9nJph@wuerfel>
References: <1414890241-9938-1-git-send-email-cernekee@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7Bit
Return-path:
In-Reply-To: <1414890241-9938-1-git-send-email-cernekee@gmail.com>
Sender: linux-sh-owner@vger.kernel.org
To: Kevin Cernekee
Cc: f.fainelli@gmail.com, tglx@linutronix.de, jason@lakedaemon.net,
 ralf@linux-mips.org, linux-sh@vger.kernel.org,
 sergei.shtylyov@cogentembedded.com, linux-kernel@vger.kernel.org,
 devicetree@vger.kernel.org, mbizon@freebox.fr, jogo@openwrt.org,
 linux-mips@linux-mips.org
List-Id: devicetree@vger.kernel.org

On Saturday 01 November 2014 18:03:47 Kevin Cernekee wrote:
> V2->V3:
>
> - Move updated irq_reg_{readl,writel} functions back into
>   so they can be called by irqchip drivers
>
> - Add gc->reg_{readl,writel} function pointers so that irqchip
>   drivers like arch/sh/boards/mach-se/{7343,7722}/irq.c can override them
>
> - CC: linux-sh list in lieu of Paul's defunct linux-sh.org email address
>
> - Fix handling of zero L2 status in bcm7120-l2.c
>
> - Rebase on Linus' head of tree

Looks all great. I also looked at the series now and am very happy
about how it turned out.

> - Drop GENERIC_CHIP / GENERIC_CHIP_BE compile-time optimizations
>
> For the latter item, I ran a quick benchmark to see if the extra
> indirection in irq_reg_{readl,writel} had any perceptible effect on
> register access times. The MIPS BE case did show a small performance
> hit from using the read wrapper, but on ARM LE the only differences
> were attributed to the presence/absence of a barrier:
>
>
> BCM3384 (UBUS architecture, MIPS BE, IRQ_GC_BE_IO):
>
> irq_reg_readl  : 207 ns
> readl          : 186 ns
> __raw_readl    : 186 ns
> ioread32be     : 195 ns
>
> irq_reg_writel : 177 ns
> writel         : 177 ns
> __raw_writel   : 177 ns
> iowrite32be    : 177 ns
>
>
> BCM7445 (GISB architecture, ARM LE, standard LE readl):
>
> irq_reg_readl  : 519 ns
> readl          : 519 ns
> __raw_readl    : 482 ns
> ioread32be     : 519 ns
>
> irq_reg_writel : 500 ns
> writel         : 500 ns
> __raw_writel   : 482 ns
> iowrite32be    : 500 ns
>

Yes, good idea to check this. 43ns is probably not significant enough
to warrant optimizing this, but if we wanted to, a driver could now
override the accessors using readl_relaxed()/writel_relaxed().

Note that the cost of the barriers can depend a lot on the hardware
setup and on the state of the system. I believe synchronizing the L2
cache on some Cortex-A9 machines can be particularly expensive.

Anyway, the existing code doesn't do it, so we can leave that as
a possible optimization.

	Arnd
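
For illustration, the override mechanism discussed above (optional
per-chip reg_{readl,writel} pointers, with a fallback to the default
accessor) can be modelled in plain userspace C. This is only a sketch
under stated assumptions: struct model_gc, the model_* helpers and the
byte-array "register window" are hypothetical stand-ins and not the
kernel's definitions. The little-endian pair stands in for
readl()/writel() and the big-endian pair for ioread32be()/iowrite32be().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a generic irq chip; not the kernel's struct. */
struct model_gc {
	uint8_t *reg_base;                                /* fake MMIO window */
	uint32_t (*reg_readl)(const uint8_t *addr);       /* optional override */
	void (*reg_writel)(uint32_t val, uint8_t *addr);  /* optional override */
};

/* Default accessors: little-endian, standing in for readl()/writel(). */
uint32_t model_readl_le(const uint8_t *a)
{
	return (uint32_t)a[0] | ((uint32_t)a[1] << 8) |
	       ((uint32_t)a[2] << 16) | ((uint32_t)a[3] << 24);
}

void model_writel_le(uint32_t v, uint8_t *a)
{
	a[0] = (uint8_t)v;
	a[1] = (uint8_t)(v >> 8);
	a[2] = (uint8_t)(v >> 16);
	a[3] = (uint8_t)(v >> 24);
}

/* Big-endian accessors, standing in for ioread32be()/iowrite32be(). */
uint32_t model_readl_be(const uint8_t *a)
{
	return ((uint32_t)a[0] << 24) | ((uint32_t)a[1] << 16) |
	       ((uint32_t)a[2] << 8) | (uint32_t)a[3];
}

void model_writel_be(uint32_t v, uint8_t *a)
{
	a[0] = (uint8_t)(v >> 24);
	a[1] = (uint8_t)(v >> 16);
	a[2] = (uint8_t)(v >> 8);
	a[3] = (uint8_t)v;
}

/* Wrapper: use the chip's override if one is set, else the default. */
uint32_t model_irq_reg_readl(const struct model_gc *gc, unsigned int off)
{
	if (gc->reg_readl)
		return gc->reg_readl(gc->reg_base + off);
	return model_readl_le(gc->reg_base + off);
}

void model_irq_reg_writel(const struct model_gc *gc, uint32_t val,
			  unsigned int off)
{
	if (gc->reg_writel)
		gc->reg_writel(val, gc->reg_base + off);
	else
		model_writel_le(val, gc->reg_base + off);
}
```

The extra indirection is the single NULL check and indirect call in
the two wrappers. A chip that leaves both pointers NULL gets the
default accessors; a driver wanting relaxed (barrier-free) accesses
would presumably install wrappers around readl_relaxed()/
writel_relaxed() in the same two hooks.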