From mboxrd@z Thu Jan 1 00:00:00 1970
From: mark.rutland@arm.com (Mark Rutland)
Date: Wed, 3 Feb 2016 17:53:51 +0000
Subject: [PATCH v4 4/7] arm64: Handle early CPU boot failures
In-Reply-To: <20160203173448.GD26487@MBP.local>
References: <1453745225-27736-1-git-send-email-suzuki.poulose@arm.com>
 <1453745225-27736-5-git-send-email-suzuki.poulose@arm.com>
 <20160203125735.GA26487@MBP.local>
 <20160203164632.GC1234@leverpostej>
 <20160203173448.GD26487@MBP.local>
Message-ID: <20160203175351.GG1234@leverpostej>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Wed, Feb 03, 2016 at 05:34:49PM +0000, Catalin Marinas wrote:
> On Wed, Feb 03, 2016 at 04:46:32PM +0000, Mark Rutland wrote:
> > On Wed, Feb 03, 2016 at 12:57:38PM +0000, Catalin Marinas wrote:
> > > On Mon, Jan 25, 2016 at 06:07:02PM +0000, Suzuki K. Poulose wrote:
> > > > + * update_early_cpu_boot_status tmp, status
> > > > + *  - Corrupts tmp, x0, x1
> > > > + *  - Writes 'status' to __early_cpu_boot_status and makes sure
> > > > + *    it is committed to memory.
> > > > + */
> > > > +
> > > > +	.macro	update_early_cpu_boot_status tmp, status
> > > > +	mov	\tmp, lr
> > > > +	adrp	x0, __early_cpu_boot_status
> > > > +	add	x0, x0, #:lo12:__early_cpu_boot_status
> > >
> > > Nitpick: you could use the adr_l macro.
> > >
> > > > +	mov	x1, #\status
> > > > +	str	x1, [x0]
> > > > +	add	x1, x0, 4
> > > > +	bl	__inval_cache_range
> > > > +	mov	lr, \tmp
> > > > +	.endm
> > >
> > > If the CPU that's currently booting has the MMU off, what's the point of
> > > invalidating the cache here?
> >
> > To invalidate stale lines for this address, held in any caches prior to
> > the PoC. I'm assuming that __early_cpu_boot_status is sufficiently
> > padded to the CWG.
>
> I would have rather invalidated it before writing the [x0], if that's
> what it's aimed at.

That alone wouldn't be sufficient, due to speculative fetches allocating
new (clean) lines prior to the write completing.

I was expecting the CWG-aligned region to only be written to with the
MMU off, i.e. we'd only have clean stale lines and no dirty lines.

> > Cache maintenance works when SCTLR_ELx.M == 0, though barriers are
> > required prior to cache maintenance as non-cacheable accesses do not
> > hazard by VA.
> >
> > The MMU being off has no effect on the cache maintenance itself.
>
> I know, but whether it has an effect on other CPUs is a different
> question (it probably has). Anyway, I would rather do the invalidation
> on the CPU that actually reads this status.

My only worry would be how this gets ordered against the (non-cacheable)
store. I guess we'd complete that with a DSB SY regardless.

Given that, I have no problem doing the invalidate on the read side.
Assuming we only write from the side with the MMU off, we don't need
maintenance on that side.

> > > The operation may not even be broadcast to the other CPU. So you
> > > actually need the invalidation before reading the status on the
> > > primary CPU.
> >
> > We require that CPUs are coherent when they enter the kernel, so any
> > cache maintenance operation _must_ affect all coherent caches (i.e. it
> > must be broadcast and must affect all coherent caches prior to the PoC
> > in this case).
>
> In general, if you perform cache maintenance on a non-shareable mapping,
> I don't think it would be broadcast. But in this case, the MMU is off,
> data accesses default to Device_nGnRnE and considered outer shareable,
> so it may actually work. Is this stated anywhere in the ARM ARM?

In ARM DDI 0487A.h, D4.2.8 "The effects of disabling a stage of address
translation" we state:

	Cache maintenance instructions act on the target cache
	regardless of whether any stages of address translation are
	disabled, and regardless of the values of the memory attributes.
	However, if a stage of address translation is disabled, they use
	the flat address mapping for that translation stage.

Thanks,
Mark.
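
As an aside on the CWG padding assumption above, here is a minimal C
sketch of how the granule size could be derived from a CTR_EL0 value and
how a status variable's placement could be checked against it. The helper
names (cwg_bytes, owns_whole_granules) and the 2048-byte fallback for a
CWG field of zero are assumptions for illustration, not code from this
patch. The register value is passed in as a plain integer so the sketch
builds on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* CTR_EL0.CWG lives in bits [27:24] and encodes log2 of the Cache
 * Writeback Granule measured in 4-byte words. */
#define CTR_CWG_SHIFT	24
#define CTR_CWG_MASK	0xfu

/* Assumption: when the field reads as zero (no CWG information
 * provided), fall back to a conservative 2048-byte granule. */
#define CWG_FALLBACK_BYTES	2048u

/* Hypothetical helper: granule size in bytes for a given CTR_EL0 value. */
static inline uint32_t cwg_bytes(uint64_t ctr_el0)
{
	uint32_t cwg = (uint32_t)(ctr_el0 >> CTR_CWG_SHIFT) & CTR_CWG_MASK;

	return cwg ? (4u << cwg) : CWG_FALLBACK_BYTES;
}

/* Hypothetical helper: non-zero if [addr, addr + size) starts on a
 * granule boundary and covers whole granules, so that no unrelated
 * data shares a granule with it. */
static inline int owns_whole_granules(uintptr_t addr, size_t size,
				      uint32_t granule)
{
	return size != 0 && (addr % granule) == 0 && (size % granule) == 0;
}
```

For example, a CWG field of 4 gives a 64-byte granule, so a 4-byte status
word would need to sit alone in a 64-byte aligned, 64-byte sized slot for
an invalidate of that granule to be safe against neighbouring data.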