From mboxrd@z Thu Jan 1 00:00:00 1970
From: catalin.marinas@arm.com (Catalin Marinas)
Date: Mon, 17 Mar 2014 17:29:44 +0000
Subject: PL310 errata workarounds
In-Reply-To: <20140317153738.GB21483@n2100.arm.linux.org.uk>
References: <20140314144835.GP21483@n2100.arm.linux.org.uk>
 <20140314150110.GQ21483@n2100.arm.linux.org.uk>
 <20140316115207.GW21483@n2100.arm.linux.org.uk>
 <20140317153738.GB21483@n2100.arm.linux.org.uk>
Message-ID: <20140317172943.GI24070@arm.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Mon, Mar 17, 2014 at 03:37:38PM +0000, Russell King - ARM Linux wrote:
> On Mon, Mar 17, 2014 at 10:04:20AM -0500, Rob Herring wrote:
> > On Sun, Mar 16, 2014 at 6:52 AM, Russell King - ARM Linux
> > wrote:
> > > The MCPM stuff is another issue: what the conditions are there I've
> > > no idea, but it looks like other CPUs will be running when it calls
> > > outer_cache_flush(). MCPM commentary claims that this will be
> > > "harmless" and I just had to laugh at that - even with this workaround
> > > enabled, it doesn't fix the problem on L2C-310 R2P0 as the workaround
> > > implementation only works on R3P0!
> >
> > For MCPM, is there even a platform that has a PL310 used as an L3? I
> > suppose architecturally it is possible, but in reality it is probably
> > not something that's ever been tested.
>
> I have no idea about MCPM

I assume that the MCPM comment about outer_cache_flush() being harmless
is because it is assumed to be a no-op.

In the mach-vexpress/dcscb.c file, there is a v7_exit_coherency_flush()
prior to outer_flush_all(). While this looks like the right order, the
comment for v7_exit_coherency_flush() states that ldrex/strex no longer
work after the call. You could do with a lock-less outer_flush_all(),
even though it is a background operation, assuming there is no race
(single CPU running). I think your big clean-up series is already hiding
outer_flush_all() under the L2 disable function.

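To make the lock-less idea concrete, here is a minimal user-space sketch,
not the kernel implementation. It assumes a single CPU is running, so
nothing can race with the by-way operation and no spinlock (hence no
ldrex/strex) is needed. The two register offsets are the L2C-310 ones
found in Linux's cache-l2x0.h; fake_regs, fake_readl(), fake_writel() and
flush_all_nolock() are hypothetical names standing in for real MMIO, and
the fake hardware simply completes one way per poll.

```c
#include <assert.h>
#include <stdint.h>

/* L2C-310 register offsets, as in Linux's cache-l2x0.h */
#define L2X0_CACHE_SYNC     0x730
#define L2X0_CLEAN_INV_WAY  0x7FC

/* Hypothetical stand-in for the controller's 4KB register window. */
static uint32_t fake_regs[0x1000 / 4];

static void fake_writel(uint32_t val, unsigned int off)
{
	fake_regs[off / 4] = val;
}

/*
 * Assumption for this model only: each read of CLEAN_INV_WAY lets the
 * background operation finish one more way (lowest set bit clears).
 */
static uint32_t fake_readl(unsigned int off)
{
	uint32_t v = fake_regs[off / 4];

	if (off == L2X0_CLEAN_INV_WAY && v)
		fake_regs[off / 4] = v & (v - 1);
	return v;
}

/*
 * Clean+invalidate all ways without taking any lock. Only valid when
 * no other CPU (or secure OS) can touch the controller concurrently.
 * Returns the number of polls that still saw the operation in progress.
 */
static unsigned int flush_all_nolock(uint32_t way_mask)
{
	unsigned int polls = 0;

	assert(way_mask != 0);

	/* Start the background by-way operation. */
	fake_writel(way_mask, L2X0_CLEAN_INV_WAY);

	/* The way bits read back as zero once each way is done. */
	while (fake_readl(L2X0_CLEAN_INV_WAY) & way_mask)
		polls++;

	/* Drain the controller's buffers. */
	fake_writel(0, L2X0_CACHE_SYNC);
	return polls;
}
```

Calling flush_all_nolock(0xff) models an 8-way cache and 0xffff a 16-way
one. On real hardware the accessors would be readl_relaxed() and
writel_relaxed() on the mapped base address, and the "single CPU running"
precondition is what replaces the lock.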
> > This would help with contention in readl/writel, but you still have
> > almost all the overhead of a spinlock. I'm not sure which is the bigger
> > component: lock contention or all the loads, stores and dsb/dmbs
> > associated with the lock.
>
> Using the arch r/w locks is not that heavy, and doesn't have the problem
> that interrupts are locked out during much of the L2 maintenance. Even
> with arch r/w locks, the L2 cache ops don't show up much in perf, compared
> to the existing implementation where they show quite highly.
>
> The only issue is we'd only be able to use this optimisation when we
> aren't running in IRQ context anyway, which I think isn't that great a
> restriction on it.

IIRC Will had a patch for this but I don't remember whether it showed
any improvements (I guess not since the patch hasn't been pushed).

> > Isn't using by way ops potentially broken if you are running a secure
> > OS? If Linux is doing a by way operation and the secure OS does range
> > operations, someone is going to crash on an abort. I suppose no one
> > sees this due to the limited function of secure OSs.
>
> Let's cover that should it happen - a secure OS should check the status
> of the L2 hardware before issuing any cache operation anyway for exactly
> this reason (you can always read from the L2 registers to check whether
> any operation is in progress.)

But between the secure OS check and the actual operation, Linux could
start a background one (the SMP case). If this happens, the hardware
generates an error signal; I guess this translates to an external abort
(I haven't tried it but it doesn't look too safe for the secure world).

--
Catalin
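
The window between the secure OS check and its operation can be shown
with a small deterministic model. This is only a sketch under stated
assumptions: struct l2_model and all the function names are hypothetical,
and the "slverr" flag stands in for whatever error response the real
controller raises when a line operation is issued while a background
by-way operation is in progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the controller state, not real registers. */
struct l2_model {
	uint32_t bg_ways;	/* non-zero: background by-way op running */
	bool slverr;		/* error response has been raised */
};

/* What the secure OS can observe by reading the L2 registers. */
static bool l2_idle(const struct l2_model *l2)
{
	return l2->bg_ways == 0;
}

/* Linux on another CPU starting a background by-way operation. */
static void l2_start_by_way(struct l2_model *l2, uint32_t way_mask)
{
	l2->bg_ways = way_mask;
}

/* A line (range) operation; modelled as an error if the L2 is busy. */
static void l2_line_op(struct l2_model *l2, uint32_t pa)
{
	(void)pa;
	if (l2->bg_ways)
		l2->slverr = true;
}

/*
 * Secure OS path: check the status, then operate. If linux_races is
 * true, Linux starts its by-way operation inside the window between
 * the check and the line operation. Returns true if an error was raised.
 */
static bool secure_range_op(struct l2_model *l2, bool linux_races)
{
	if (!l2_idle(l2))
		return false;		/* secure OS backs off, no error */

	if (linux_races)
		l2_start_by_way(l2, 0xff);

	l2_line_op(l2, 0x80000000u);
	return l2->slverr;
}
```

With linux_races false the check is sufficient; with it true the check
passes and the error is still raised, which is why the status read alone
does not close the race on SMP.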