* cpu_vm_mask checks in ARM flush functions @ 2009-10-24 4:03 muni anda 2009-10-24 11:10 ` Russell King - ARM Linux 0 siblings, 1 reply; 12+ messages in thread From: muni anda @ 2009-10-24 4:03 UTC (permalink / raw) To: linux-arm-kernel I was going though the cache flush functions in arch/arm/mm/flush.c and found that cpu_isset() is used at a lot of places. I couldn't understand the reason why there is a need for cpu_vm_mask checks? My understanding was that those functions will be executed on the CPU for which the cpu_mask is already set (in switch_mm call). Is there a different calling sequence that I am missing? Can someone please the need for those checks (For example flush_ptrace_access() function running on vipt_non_aliasing caches). Thanks, -Muni ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-24 4:03 cpu_vm_mask checks in ARM flush functions muni anda @ 2009-10-24 11:10 ` Russell King - ARM Linux 2009-10-26 10:22 ` Catalin Marinas 0 siblings, 1 reply; 12+ messages in thread From: Russell King - ARM Linux @ 2009-10-24 11:10 UTC (permalink / raw) To: linux-arm-kernel On Fri, Oct 23, 2009 at 09:03:16PM -0700, muni anda wrote: > I was going though the cache flush functions in arch/arm/mm/flush.c > and found that cpu_isset() is used at a lot of places. I couldn't > understand the reason why there is a need for cpu_vm_mask checks? My > understanding was that those functions will be executed on the CPU for > which the cpu_mask is already set (in switch_mm call). Is there a > different calling sequence that I am missing? There are cases where (eg) pages are unmapped in a different process to the one which is running. > Can someone please the need for those checks (For example > flush_ptrace_access() function running on vipt_non_aliasing caches). ptrace accesses are done by a different process from the one being operated on, so it is not likely that their address space is directly accessible - but it can be. For VIVT caches, this means that if the address space is visible, we need to flush the page. If it is not visible, we must not do any cache maintainence (which will be done when the kernel switches address spaces anyway.) For VIPT aliasing caches, we do not flush the caches on every context switch, and so we must always deal with cache aliases caused by accessing the kernel's view of memory. For VIPT non-aliasing caches, there may be a bug there; it requires more time than I currently have to think about to say for certain though. ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-24 11:10 ` Russell King - ARM Linux @ 2009-10-26 10:22 ` Catalin Marinas 2009-10-26 10:51 ` Russell King - ARM Linux 0 siblings, 1 reply; 12+ messages in thread From: Catalin Marinas @ 2009-10-26 10:22 UTC (permalink / raw) To: linux-arm-kernel On Sat, 2009-10-24 at 12:10 +0100, Russell King - ARM Linux wrote: > On Fri, Oct 23, 2009 at 09:03:16PM -0700, muni anda wrote: > > I was going though the cache flush functions in arch/arm/mm/flush.c > > and found that cpu_isset() is used at a lot of places. I couldn't > > understand the reason why there is a need for cpu_vm_mask checks? My > > understanding was that those functions will be executed on the CPU for > > which the cpu_mask is already set (in switch_mm call). Is there a > > different calling sequence that I am missing? [...] > For VIPT non-aliasing caches, there may be a bug there; it requires > more time than I currently have to think about to say for certain > though. Someone in ARM mentioned that setting breakpoints on ARM11MPCore doesn't always work. I gave them a patch with cpu_vm_mask check removed but they said it still doesn't work. I cannot guarantee that the fix doesn't work until I try it but I haven't had time for it yet. -- Catalin ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 10:22 ` Catalin Marinas @ 2009-10-26 10:51 ` Russell King - ARM Linux 2009-10-26 11:15 ` Catalin Marinas 0 siblings, 1 reply; 12+ messages in thread From: Russell King - ARM Linux @ 2009-10-26 10:51 UTC (permalink / raw) To: linux-arm-kernel On Mon, Oct 26, 2009 at 10:22:26AM +0000, Catalin Marinas wrote: > On Sat, 2009-10-24 at 12:10 +0100, Russell King - ARM Linux wrote: > > On Fri, Oct 23, 2009 at 09:03:16PM -0700, muni anda wrote: > > > I was going though the cache flush functions in arch/arm/mm/flush.c > > > and found that cpu_isset() is used at a lot of places. I couldn't > > > understand the reason why there is a need for cpu_vm_mask checks? My > > > understanding was that those functions will be executed on the CPU for > > > which the cpu_mask is already set (in switch_mm call). Is there a > > > different calling sequence that I am missing? > [...] > > For VIPT non-aliasing caches, there may be a bug there; it requires > > more time than I currently have to think about to say for certain > > though. > > Someone in ARM mentioned that setting breakpoints on ARM11MPCore doesn't > always work. I gave them a patch with cpu_vm_mask check removed but they > said it still doesn't work. I cannot guarantee that the fix doesn't work > until I try it but I haven't had time for it yet. Since I don't have an EABI gdb, I can't test it either. Was the ARM11MPCore system being tested one which broadcasts the cache ops in hardware? ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 10:51 ` Russell King - ARM Linux @ 2009-10-26 11:15 ` Catalin Marinas 2009-10-26 11:19 ` Russell King - ARM Linux 0 siblings, 1 reply; 12+ messages in thread From: Catalin Marinas @ 2009-10-26 11:15 UTC (permalink / raw) To: linux-arm-kernel On Mon, 2009-10-26 at 10:51 +0000, Russell King - ARM Linux wrote: > On Mon, Oct 26, 2009 at 10:22:26AM +0000, Catalin Marinas wrote: > > On Sat, 2009-10-24 at 12:10 +0100, Russell King - ARM Linux wrote: > > > On Fri, Oct 23, 2009 at 09:03:16PM -0700, muni anda wrote: > > > > I was going though the cache flush functions in arch/arm/mm/flush.c > > > > and found that cpu_isset() is used at a lot of places. I couldn't > > > > understand the reason why there is a need for cpu_vm_mask checks? My > > > > understanding was that those functions will be executed on the CPU for > > > > which the cpu_mask is already set (in switch_mm call). Is there a > > > > different calling sequence that I am missing? > > [...] > > > For VIPT non-aliasing caches, there may be a bug there; it requires > > > more time than I currently have to think about to say for certain > > > though. > > > > Someone in ARM mentioned that setting breakpoints on ARM11MPCore doesn't > > always work. I gave them a patch with cpu_vm_mask check removed but they > > said it still doesn't work. I cannot guarantee that the fix doesn't work > > until I try it but I haven't had time for it yet. > > Since I don't have an EABI gdb, I can't test it either. Not for this particular case but if you want I have some scripts to generate a debootstrap root filesystem for the RealView boards (they require a Debian PC). > Was the ARM11MPCore system being tested one which broadcasts the cache ops > in hardware? There's no ARM11MPCore with in-hardware cache ops broadcasting (only Cortex-A9). I think there was also a case of not setting the PG_arch_1 bit on SMP at all. Now that you mention this, they tried a patch for Cortex-A9 which goes back to lazy cache-flushing in flush_dcache_page/update_mmu_cache and the gdb breakpoints were working fine. -- Catalin ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 11:15 ` Catalin Marinas @ 2009-10-26 11:19 ` Russell King - ARM Linux 2009-10-26 11:59 ` Catalin Marinas 0 siblings, 1 reply; 12+ messages in thread From: Russell King - ARM Linux @ 2009-10-26 11:19 UTC (permalink / raw) To: linux-arm-kernel On Mon, Oct 26, 2009 at 11:15:19AM +0000, Catalin Marinas wrote: > On Mon, 2009-10-26 at 10:51 +0000, Russell King - ARM Linux wrote: > > Since I don't have an EABI gdb, I can't test it either. > > Not for this particular case but if you want I have some scripts to > generate a debootstrap root filesystem for the RealView boards (they > require a Debian PC). That doesn't help - I don't have any Debian stuff here. > > Was the ARM11MPCore system being tested one which broadcasts the cache ops > > in hardware? > > There's no ARM11MPCore with in-hardware cache ops broadcasting (only > Cortex-A9). I think there was also a case of not setting the PG_arch_1 > bit on SMP at all. That'll be why it removing that check doesn't resolve the problem then. The coherent flushes aren't broadcasted. > Now that you mention this, they tried a patch for Cortex-A9 which goes > back to lazy cache-flushing in flush_dcache_page/update_mmu_cache and > the gdb breakpoints were working fine. The lazy d-cache flushing has nothing to do with gdb breakpoints. ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 11:19 ` Russell King - ARM Linux @ 2009-10-26 11:59 ` Catalin Marinas 2009-10-26 12:15 ` Russell King - ARM Linux 0 siblings, 1 reply; 12+ messages in thread From: Catalin Marinas @ 2009-10-26 11:59 UTC (permalink / raw) To: linux-arm-kernel On Mon, 2009-10-26 at 11:19 +0000, Russell King - ARM Linux wrote: > On Mon, Oct 26, 2009 at 11:15:19AM +0000, Catalin Marinas wrote: > > On Mon, 2009-10-26 at 10:51 +0000, Russell King - ARM Linux wrote: > > > Was the ARM11MPCore system being tested one which broadcasts the cache ops > > > in hardware? > > > > There's no ARM11MPCore with in-hardware cache ops broadcasting (only > > Cortex-A9). I think there was also a case of not setting the PG_arch_1 > > bit on SMP at all. > > That'll be why it removing that check doesn't resolve the problem then. > The coherent flushes aren't broadcasted. You get to flush the D-cache with the breakpoint if removing the check but still not invalidating the I-cache on a different CPU. The particular case the guys here were testing was setting a breakpoint on a page which wasn't executed/mapped yet in the debugged application. > > Now that you mention this, they tried a patch for Cortex-A9 which goes > > back to lazy cache-flushing in flush_dcache_page/update_mmu_cache and > > the gdb breakpoints were working fine. > > The lazy d-cache flushing has nothing to do with gdb breakpoints. Not directly but in the particular case above (page not yet faulted in) PG_arch_1 is set via access_process_vm() -> get_user_pages() -> flush_dcache_page() and this was causing a flush via update_mmu_cache() when the page is later faulted in. As a general solution on Cortex-A9, I think removing the cpu_vm_mask check in flush_ptrace_access() should solve the problem (not on ARM11MPCore). -- Catalin ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 11:59 ` Catalin Marinas @ 2009-10-26 12:15 ` Russell King - ARM Linux 2009-10-26 18:01 ` Catalin Marinas 0 siblings, 1 reply; 12+ messages in thread From: Russell King - ARM Linux @ 2009-10-26 12:15 UTC (permalink / raw) To: linux-arm-kernel On Mon, Oct 26, 2009 at 11:59:41AM +0000, Catalin Marinas wrote: > On Mon, 2009-10-26 at 11:19 +0000, Russell King - ARM Linux wrote: > > On Mon, Oct 26, 2009 at 11:15:19AM +0000, Catalin Marinas wrote: > > > On Mon, 2009-10-26 at 10:51 +0000, Russell King - ARM Linux wrote: > > > > Was the ARM11MPCore system being tested one which broadcasts the cache ops > > > > in hardware? > > > > > > There's no ARM11MPCore with in-hardware cache ops broadcasting (only > > > Cortex-A9). I think there was also a case of not setting the PG_arch_1 > > > bit on SMP at all. > > > > That'll be why it removing that check doesn't resolve the problem then. > > The coherent flushes aren't broadcasted. > > You get to flush the D-cache with the breakpoint if removing the check > but still not invalidating the I-cache on a different CPU. The > particular case the guys here were testing was setting a breakpoint on a > page which wasn't executed/mapped yet in the debugged application. That will cause the page to be faulted into the address space (note that the address space may not be mapped) along with all the processing which faulting a page in usually entails. There shouldn't be any I-cache lines associated with this page after get_user_pages() returns (since it flushes the I-cache and we're not using a CPU which speculatively prefetches). The write to the kernel mapping of the page is done, and with the additional VM mask test removed, a local D-cache and I-cache flush is done (however, if you're running with full preemption enabled, this could occur on a different CPU to that which dirtied its cache.) Was preempt enabled? However, with a page which has been mapped in and executed from, we do need to do an I-cache flush on the other processors which have run this executable. The whole flush_ptrace_access() is definitely weak in this area. > > > Now that you mention this, they tried a patch for Cortex-A9 which goes > > > back to lazy cache-flushing in flush_dcache_page/update_mmu_cache and > > > the gdb breakpoints were working fine. > > > > The lazy d-cache flushing has nothing to do with gdb breakpoints. > > Not directly but in the particular case above (page not yet faulted in) > PG_arch_1 is set via access_process_vm() -> get_user_pages() -> > flush_dcache_page() and this was causing a flush via update_mmu_cache() > when the page is later faulted in. The page will have been faulted in by get_user_pages() itself, which will have internally done the flush_dcache_page() followed by an update_mmu_cache(). get_user_pages() follows this by an additional flush_anon_page() and flush_dcache_page() call, which will just do more flushing. update_mmu_cache() won't be called again. If we're writing to the page, and it's a page of text, that page will be COW'd, and hence become an anonymous page - flush_dcache_page() should have no effect on that (our version does - it immediately flushes the kernel mapping only in that case, whether or not lazy is enabled.) So I think this apparant change of behaviour is entirely misleading. ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 12:15 ` Russell King - ARM Linux @ 2009-10-26 18:01 ` Catalin Marinas 2009-10-26 18:27 ` muni anda 2009-10-26 19:13 ` Russell King - ARM Linux 0 siblings, 2 replies; 12+ messages in thread From: Catalin Marinas @ 2009-10-26 18:01 UTC (permalink / raw) To: linux-arm-kernel On Mon, 2009-10-26 at 12:15 +0000, Russell King - ARM Linux wrote: > On Mon, Oct 26, 2009 at 11:59:41AM +0000, Catalin Marinas wrote: > > On Mon, 2009-10-26 at 11:19 +0000, Russell King - ARM Linux wrote: > > > On Mon, Oct 26, 2009 at 11:15:19AM +0000, Catalin Marinas wrote: > > > > On Mon, 2009-10-26 at 10:51 +0000, Russell King - ARM Linux wrote: > > > > > Was the ARM11MPCore system being tested one which broadcasts the cache ops > > > > > in hardware? > > > > > > > > There's no ARM11MPCore with in-hardware cache ops broadcasting (only > > > > Cortex-A9). I think there was also a case of not setting the PG_arch_1 > > > > bit on SMP at all. > > > > > > That'll be why it removing that check doesn't resolve the problem then. > > > The coherent flushes aren't broadcasted. > > > > You get to flush the D-cache with the breakpoint if removing the check > > but still not invalidating the I-cache on a different CPU. The > > particular case the guys here were testing was setting a breakpoint on a > > page which wasn't executed/mapped yet in the debugged application. [...] > The write to the kernel mapping of the page is done, and with the > additional VM mask test removed, a local D-cache and I-cache flush > is done (however, if you're running with full preemption enabled, > this could occur on a different CPU to that which dirtied its cache.) > > Was preempt enabled? No. > However, with a page which has been mapped in and executed from, we do > need to do an I-cache flush on the other processors which have run this > executable. The whole flush_ptrace_access() is definitely weak in this > area. Can we use smp_call_function()? Are interrupt enabled in flush_ptrace_access()? -- Catalin ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 18:01 ` Catalin Marinas @ 2009-10-26 18:27 ` muni anda 2009-10-28 16:34 ` Catalin Marinas 2009-10-26 19:13 ` Russell King - ARM Linux 1 sibling, 1 reply; 12+ messages in thread From: muni anda @ 2009-10-26 18:27 UTC (permalink / raw) To: linux-arm-kernel >Now that you mention this, they tried a patch for Cortex-A9 which goes >back to lazy cache-flushing in flush_dcache_page/update_mmu_cache and >the gdb breakpoints were working fine. Where can I get this patch? - Muni ^ permalink raw reply [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 18:27 ` muni anda @ 2009-10-28 16:34 ` Catalin Marinas 0 siblings, 0 replies; 12+ messages in thread From: Catalin Marinas @ 2009-10-28 16:34 UTC (permalink / raw) To: linux-arm-kernel On Mon, 2009-10-26 at 11:27 -0700, muni anda wrote: > >Now that you mention this, they tried a patch for Cortex-A9 which goes > >back to lazy cache-flushing in flush_dcache_page/update_mmu_cache and > >the gdb breakpoints were working fine. > > Where can I get this patch? Note it only fixes some gdb issues as a side-effect (see the discussion thread). ARMv7: Use lazy cache flushing if hardware broadcasts cache operations From: Catalin Marinas <catalin.marinas@arm.com> ARMv7 processors like Cortex-A9 broadcast the cache maintenance operations in hardware. The patch adds the CPU ID checks for such feature and allows the flush_dcache_page/update_mmu_cache pair to work in lazy flushing mode similar to the UP case. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> --- arch/arm/include/asm/cachetype.h | 16 ++++++++++++++++ arch/arm/mm/fault-armv.c | 5 ++--- arch/arm/mm/flush.c | 8 +++----- 3 files changed, 21 insertions(+), 8 deletions(-) diff --git a/arch/arm/include/asm/cachetype.h b/arch/arm/include/asm/cachetype.h index d3a4c2c..3d5d1c9 100644 --- a/arch/arm/include/asm/cachetype.h +++ b/arch/arm/include/asm/cachetype.h @@ -1,6 +1,8 @@ #ifndef __ASM_ARM_CACHETYPE_H #define __ASM_ARM_CACHETYPE_H +#include <asm/cputype.h> + #define CACHEID_VIVT (1 << 0) #define CACHEID_VIPT_NONALIASING (1 << 1) #define CACHEID_VIPT_ALIASING (1 << 2) @@ -49,4 +51,18 @@ static inline unsigned int __attribute__((pure)) cacheid_is(unsigned int mask) (~__CACHEID_NEVER & __CACHEID_ARCH_MIN & mask & cacheid); } +/* + * Cache maintenance operations hardware broadcasting. + */ +#ifndef CONFIG_SMP +#define cache_ops_hw_broadcast() 1 +#elif __LINUX_ARM_ARCH__ <= 6 +#define cache_ops_hw_broadcast() 0 +#else +static inline int cache_ops_hw_broadcast(void) +{ + return ((read_cpuid_ext(CPUID_EXT_MMFR3) >> 12) & 0xf) >= 1; +} +#endif + #endif diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index d0d17b6..214c3ed 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -153,10 +153,9 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, pte_t pte) page = pfn_to_page(pfn); mapping = page_mapping(page); -#ifndef CONFIG_SMP - if (test_and_clear_bit(PG_dcache_dirty, &page->flags)) + if (cache_ops_hw_broadcast() && + test_and_clear_bit(PG_dcache_dirty, &page->flags)) __flush_dcache_page(mapping, page); -#endif if (mapping) { if (cache_is_vivt()) make_coherent(mapping, vma, addr, pfn); diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c index b279429..2ebb999 100644 --- a/arch/arm/mm/flush.c +++ b/arch/arm/mm/flush.c @@ -217,12 +217,10 @@ void flush_dcache_page(struct page *page) { struct address_space *mapping = page_mapping(page); -#ifndef CONFIG_SMP - if (!PageHighMem(page) && mapping && !mapping_mapped(mapping)) + if (cache_ops_hw_broadcast() && + !PageHighMem(page) && mapping && !mapping_mapped(mapping)) set_bit(PG_dcache_dirty, &page->flags); - else -#endif - { + else { __flush_dcache_page(mapping, page); if (mapping && cache_is_vivt()) __flush_dcache_aliases(mapping, page); -- Catalin ^ permalink raw reply related [flat|nested] 12+ messages in thread
* cpu_vm_mask checks in ARM flush functions 2009-10-26 18:01 ` Catalin Marinas 2009-10-26 18:27 ` muni anda @ 2009-10-26 19:13 ` Russell King - ARM Linux 1 sibling, 0 replies; 12+ messages in thread From: Russell King - ARM Linux @ 2009-10-26 19:13 UTC (permalink / raw) To: linux-arm-kernel On Mon, Oct 26, 2009 at 06:01:05PM +0000, Catalin Marinas wrote: > On Mon, 2009-10-26 at 12:15 +0000, Russell King - ARM Linux wrote: > > On Mon, Oct 26, 2009 at 11:59:41AM +0000, Catalin Marinas wrote: > > > On Mon, 2009-10-26 at 11:19 +0000, Russell King - ARM Linux wrote: > > > > On Mon, Oct 26, 2009 at 11:15:19AM +0000, Catalin Marinas wrote: > > > > > On Mon, 2009-10-26 at 10:51 +0000, Russell King - ARM Linux wrote: > > > > > > Was the ARM11MPCore system being tested one which broadcasts the cache ops > > > > > > in hardware? > > > > > > > > > > There's no ARM11MPCore with in-hardware cache ops broadcasting (only > > > > > Cortex-A9). I think there was also a case of not setting the PG_arch_1 > > > > > bit on SMP at all. > > > > > > > > That'll be why it removing that check doesn't resolve the problem then. > > > > The coherent flushes aren't broadcasted. > > > > > > You get to flush the D-cache with the breakpoint if removing the check > > > but still not invalidating the I-cache on a different CPU. The > > > particular case the guys here were testing was setting a breakpoint on a > > > page which wasn't executed/mapped yet in the debugged application. > [...] > > The write to the kernel mapping of the page is done, and with the > > additional VM mask test removed, a local D-cache and I-cache flush > > is done (however, if you're running with full preemption enabled, > > this could occur on a different CPU to that which dirtied its cache.) > > > > Was preempt enabled? > > No. In which case, until there is definitive evidence otherwise, I'm tempted to discount this report - logically, I can't see a problem with that additional test gone for the described scenario. > > However, with a page which has been mapped in and executed from, we do > > need to do an I-cache flush on the other processors which have run this > > executable. The whole flush_ptrace_access() is definitely weak in this > > area. > > Can we use smp_call_function()? Are interrupt enabled in > flush_ptrace_access()? Yes, I'm working on a patch which does this (which'll be part of a series of patches in this area.) ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2009-10-28 16:34 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-10-24 4:03 cpu_vm_mask checks in ARM flush functions muni anda 2009-10-24 11:10 ` Russell King - ARM Linux 2009-10-26 10:22 ` Catalin Marinas 2009-10-26 10:51 ` Russell King - ARM Linux 2009-10-26 11:15 ` Catalin Marinas 2009-10-26 11:19 ` Russell King - ARM Linux 2009-10-26 11:59 ` Catalin Marinas 2009-10-26 12:15 ` Russell King - ARM Linux 2009-10-26 18:01 ` Catalin Marinas 2009-10-26 18:27 ` muni anda 2009-10-28 16:34 ` Catalin Marinas 2009-10-26 19:13 ` Russell King - ARM Linux
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).