From mboxrd@z Thu Jan 1 00:00:00 1970
From: david.vrabel@citrix.com (David Vrabel)
Date: Thu, 22 Dec 2011 16:38:23 +0000
Subject: Oops in guest after ioremap() on ARMv7
In-Reply-To: <20111222144937.GE20635@arm.com>
References: <4EF31DA7.9030407@citrix.com> <20111222144937.GE20635@arm.com>
Message-ID: <4EF35CFF.3050200@citrix.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On 22/12/11 14:49, Catalin Marinas wrote:
> On Thu, Dec 22, 2011 at 12:08:07PM +0000, David Vrabel wrote:
>> When running the linux kernel on the ARMv7 envelope model as a guest
>> under the Xen hypervisor there is a oops (see below for an example of
>> the page translation fault) when trying to access ioremap()'d memory.

The translation tables for userspace also seem to be affected. The
program repeatedly faults with a translation fault on the same address.
Putting a flush_cache_all() after the call to handle_mm_fault() in
__do_page_fault() makes userspace work as well.

>> The same kernel works fine when not running under the hypervisor.
>>
>> It's a 3.2.0-rc5+ kernel with the two additional linux-arch-arm
>> branches: arm-arch/vexpress and arm-arch/arm-lpae.
>>
>> Calling flush_cache_all() in flush_cache_vmap() makes it work. What
>> isn't being correctly flushed? I see that flush_pmd_entry() and
>> cpu_v7_set_pte_ext() already flush the L1 and L2 translation table
>> entries and I can't think of anything else that would need to be flushed
>> (unless the mapped virtual addresses need to be flushed as well?)
>>
>> The "Barrier Litmus Tests and Cookbook" says that a TLB flush and a
>> branch predictor flush are required after a translation table entry
>> update. This seems not to be done but adding this didn't seem to help
>> (and using local_flush_tlb_all()) in flush_cache_vmap() didn't help either).
>>
>> I don't see anything in the hypervisor that could be causing this as the
>> fault is occurring at stage 1 and not stage 2 translation.
>
> Interesting error, I don't have an immediate idea of what might be
> wrong, just some questions.
>
> What's the value of the VTCR register for this guest? Are the
> translation table walks marked as cacheable? Also, are the page table
> attributes Normal Cacheable in the stage 2 translation? The processor
> chooses the more restrictive attribute between stage 1 and stage 2.

VTCR = 0x80002558 which is:

  Outer Shareable;
  Normal memory, outer write-back write-allocate cacheable;
  Normal memory, inner write-back write-allocate cacheable.

L3 TT entries for stage 2 have the following attributes:

  Outer-Shareable;
  Normal, inner write-back cacheable;
  Normal, outer write-back cacheable.

These look sensible to me.

David
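
For anyone wanting to check the VTCR decoding above by hand, here is a
minimal sketch in C. It assumes the ARMv7-A Virtualization Extensions
field layout as I understand it (T0SZ in bits [3:0], S in bit 4, SL0 in
bits [7:6], IRGN0 in bits [9:8], ORGN0 in bits [11:10], SH0 in bits
[13:12]). The struct and function names (vtcr_fields, vtcr_decode,
vtcr_rgn_name, vtcr_sh_name) are hypothetical helpers made up for
illustration; they are not kernel or Xen functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of VTCR; field positions are an assumption, see above. */
struct vtcr_fields {
    int      t0sz;   /* signed value formed from S:T0SZ (bit 4, bits 3:0) */
    unsigned sl0;    /* starting level for stage 2 walks, bits 7:6 */
    unsigned irgn0;  /* inner cacheability of table walks, bits 9:8 */
    unsigned orgn0;  /* outer cacheability of table walks, bits 11:10 */
    unsigned sh0;    /* shareability of table walks, bits 13:12 */
};

/* Split a raw VTCR value into its fields. */
static void vtcr_decode(uint32_t vtcr, struct vtcr_fields *out)
{
    assert(out != NULL);

    unsigned t0sz = vtcr & 0xfu;
    unsigned s = (vtcr >> 4) & 0x1u;

    /* S acts as the sign bit of a 5-bit two's complement T0SZ. */
    out->t0sz  = (int)t0sz - (s ? 16 : 0);
    out->sl0   = (vtcr >> 6) & 0x3u;
    out->irgn0 = (vtcr >> 8) & 0x3u;
    out->orgn0 = (vtcr >> 10) & 0x3u;
    out->sh0   = (vtcr >> 12) & 0x3u;
}

/* Cacheability encoding shared by IRGN0 and ORGN0. */
static const char *vtcr_rgn_name(unsigned rgn)
{
    switch (rgn & 0x3u) {
    case 0:  return "Non-cacheable";
    case 1:  return "Write-Back Write-Allocate";
    case 2:  return "Write-Through";
    default: return "Write-Back no Write-Allocate";
    }
}

/* Shareability encoding of SH0. */
static const char *vtcr_sh_name(unsigned sh)
{
    switch (sh & 0x3u) {
    case 0:  return "Non-shareable";
    case 1:  return "Reserved";
    case 2:  return "Outer Shareable";
    default: return "Inner Shareable";
    }
}
```

Under those assumptions, vtcr_decode(0x80002558, &f) gives SH0 = 2
(Outer Shareable), ORGN0 = IRGN0 = 1 (write-back write-allocate),
SL0 = 1 and T0SZ = -8, i.e. a 40-bit input address (32 - T0SZ), which
agrees with the attributes listed above.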