* [PATCH] arm: add check for global exclusive monitor @ 2013-02-18 16:26 Vladimir Murzin 2013-02-18 16:44 ` Russell King - ARM Linux 0 siblings, 1 reply; 5+ messages in thread From: Vladimir Murzin @ 2013-02-18 16:26 UTC (permalink / raw) To: linux-arm-kernel Since ARMv6 new atomic instructions have been introduced: ldrex/strex. Several implementation are possible based on (1) global and local exclusive monitors and (2) local exclusive monitor and snoop unit. In case of the 2nd option exclusive store operation on uncached region may be faulty. Check for availability of the global monitor to provide some hint about possible issues. Signed-off-by: Vladimir Murzin <murzin.v@gmail.com> --- arch/arm/include/asm/bugs.h | 14 ++++++++++++-- arch/arm/mm/fault-armv.c | 43 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/bugs.h b/arch/arm/include/asm/bugs.h index a97f1ea..230432e 100644 --- a/arch/arm/include/asm/bugs.h +++ b/arch/arm/include/asm/bugs.h @@ -13,9 +13,19 @@ #ifdef CONFIG_MMU extern void check_writebuffer_bugs(void); -#define check_bugs() check_writebuffer_bugs() +#if __LINUX_ARM_ARCH__ < 6 +static void check_gmonitor_bugs(void) {}; #else -#define check_bugs() do { } while (0) +extern void check_gmonitor_bugs(void); +#endif + +static inline void check_bugs(void) +{ + check_writebuffer_bugs(); + check_gmonitor_bugs(); +} +#else +static inline void check_bugs(void) { } #endif #endif diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index 7599e26..c12846b 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -206,6 +206,49 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, __flush_icache_all(); } } +#else +void __init check_gmonitor_bugs(void) +{ + struct page *page; + const char *reason; + unsigned long res = 1; + + printk(KERN_INFO "CPU: Testing for global monitor: "); + + page = alloc_page(GFP_KERNEL); + if (page) { + unsigned long *p; + pgprot_t prot = __pgprot_modify(PAGE_KERNEL, + L_PTE_MT_MASK, L_PTE_MT_UNCACHED); + + p = vmap(&page, 1, VM_IOREMAP, prot); + + if (p) { + int temp; + + __asm__ __volatile__( \ + "ldrex %1, [%2]\n" \ + "strex %0, %1, [%2]" \ + : "=&r" (res), "=&r" (temp) \ + : "r" (p) \ + : "cc", "memory"); + + reason = "n\\a (atomic ops may be faulty)"; + } else { + reason = "unable to map memory\n"; + } + + vunmap(p); + put_page(page); + } else { + reason = "unable to grab page\n"; + } + + if (res) + printk("failed, %s\n", reason); + else + printk("ok\n"); +} #endif /* __LINUX_ARM_ARCH__ < 6 */ /* -- 1.7.8.6 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH] arm: add check for global exclusive monitor 2013-02-18 16:26 [PATCH] arm: add check for global exclusive monitor Vladimir Murzin @ 2013-02-18 16:44 ` Russell King - ARM Linux [not found] ` <20130219105534.GA31156@murzin.rnd.samsung.ru> 0 siblings, 1 reply; 5+ messages in thread From: Russell King - ARM Linux @ 2013-02-18 16:44 UTC (permalink / raw) To: linux-arm-kernel On Mon, Feb 18, 2013 at 08:26:50PM +0400, Vladimir Murzin wrote: > Since ARMv6 new atomic instructions have been introduced: > ldrex/strex. Several implementation are possible based on (1) global > and local exclusive monitors and (2) local exclusive monitor and snoop > unit. > > In case of the 2nd option exclusive store operation on uncached > region may be faulty. > > Check for availability of the global monitor to provide some hint about > possible issues. How does this code actually do that? > +void __init check_gmonitor_bugs(void) > +{ > + struct page *page; > + const char *reason; > + unsigned long res = 1; > + > + printk(KERN_INFO "CPU: Testing for global monitor: "); > + > + page = alloc_page(GFP_KERNEL); > + if (page) { > + unsigned long *p; > + pgprot_t prot = __pgprot_modify(PAGE_KERNEL, > + L_PTE_MT_MASK, L_PTE_MT_UNCACHED); > + > + p = vmap(&page, 1, VM_IOREMAP, prot); This is bad practise. Remapping a page of already mapped kernel memory using different attributes (in this case, strongly ordered) is _definitely_ a violation of the architecture requirements. The behaviour you will see from this are in no way guaranteed. If you want to do this, it must either come from highmem, or not already be mapped. Moreover, this is absolutely silly - the ARM ARM says: "LDREX and STREX operations *must* be performed only on memory with the Normal memory attribute." L_PTE_MT_UNCACHED doesn't get you that. As I say above, that gets you strongly ordered memory, not "normal memory" as required by the architecture for use with exclusive types. > + > + if (p) { > + int temp; > + > + __asm__ __volatile__( \ > + "ldrex %1, [%2]\n" \ > + "strex %0, %1, [%2]" \ > + : "=&r" (res), "=&r" (temp) \ > + : "r" (p) \ \ character not required for any of the above. Neither is the __ version of "asm" and "volatile". > + : "cc", "memory"); > + > + reason = "n\\a (atomic ops may be faulty)"; "n\\a" ? So... at the moment this has me wondering - you're testing atomic operations with a strongly ordered memory region, which ARM already define this to be outside of the architecture spec. The behaviour you see is not defined architecturally. And if you're trying to use LDREX/STREX to a strongly ordered or device memory region, then you're quite right that it'll be unreliable. It's not defined to even work. That's not because they're faulty, it's because you're abusing them. ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <20130219105534.GA31156@murzin.rnd.samsung.ru>]
* [PATCH] arm: add check for global exclusive monitor [not found] ` <20130219105534.GA31156@murzin.rnd.samsung.ru> @ 2013-02-20 4:55 ` Vladimir Murzin 2013-02-20 10:59 ` Russell King - ARM Linux 0 siblings, 1 reply; 5+ messages in thread From: Vladimir Murzin @ 2013-02-20 4:55 UTC (permalink / raw) To: linux-arm-kernel Thanks for review Russel! On Mon, Feb 18, 2013 at 04:44:20PM +0000, Russell King - ARM Linux wrote: > On Mon, Feb 18, 2013 at 08:26:50PM +0400, Vladimir Murzin wrote: > > Since ARMv6 new atomic instructions have been introduced: > > ldrex/strex. Several implementation are possible based on (1) global > > and local exclusive monitors and (2) local exclusive monitor and snoop > > unit. > > > > In case of the 2nd option exclusive store operation on uncached > > region may be faulty. > > > > Check for availability of the global monitor to provide some hint about > > possible issues. > > How does this code actually do that? According to DHT0008A_arm_synchronization_primitives.pdf the global monitor is introduce to track exclusive accesses to sharable memory regions. This article also says about some system-wide implication which should be taken into account: (1) for systems with coherency management (2) for systems without coherency management The first one lay on SCU, L1 data cache and local monitor. The second one requires implementation of global monitor if memory regions cannot be cached. It set up the behaviour for store-exclusive operations when global monitor is not available: these operations always fail. Taking all these into account we can guess about availability of global monitor by using store-exclusive operation on uncached memory region. > > > +void __init check_gmonitor_bugs(void) > > +{ > > + struct page *page; > > + const char *reason; > > + unsigned long res = 1; > > + > > + printk(KERN_INFO "CPU: Testing for global monitor: "); > > + > > + page = alloc_page(GFP_KERNEL); > > + if (page) { > > + unsigned long *p; > > + pgprot_t prot = __pgprot_modify(PAGE_KERNEL, > > + L_PTE_MT_MASK, L_PTE_MT_UNCACHED); > > + > > + p = vmap(&page, 1, VM_IOREMAP, prot); > > This is bad practise. Remapping a page of already mapped kernel memory > using different attributes (in this case, strongly ordered) is _definitely_ > a violation of the architecture requirements. The behaviour you will see > from this are in no way guaranteed. DDI0406C_arm_architecture_reference_manual.pdf (A3-131) says: A memory location can be marked as having different cacheability attributes, for example when using aliases in a virtual to physical address mapping: * if the attributes differ only in the cache allocation hint this does not affect the behavior of accesses to that location * for other cases see Mismatched memory attributes on page A3-136. Isn't L_PTE_MT_UNCACHED about cache allocation hint? > > If you want to do this, it must either come from highmem, or not already > be mapped. > > Moreover, this is absolutely silly - the ARM ARM says: > > "LDREX and STREX operations *must* be performed only on memory with the > Normal memory attribute." DDI0406C_arm_architecture_reference_manual.pdf (A3-121) says: It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be performed to a memory region with the Device or Strongly-ordered memory attribute. Unless the implementation documentation explicitly states that LDREX and STREX operations to a memory region with the Device or Strongly-ordered attribute are permitted, the effect of such operations is UNPREDICTABLE. At least it allows to perform operations on memory region with the Strongly-ordered attribute... but still unpredictable. > > L_PTE_MT_UNCACHED doesn't get you that. As I say above, that gets you > strongly ordered memory, not "normal memory" as required by the > architecture for use with exclusive types. > > > + > > + if (p) { > > + int temp; > > + > > + __asm__ __volatile__( \ > > + "ldrex %1, [%2]\n" \ > > + "strex %0, %1, [%2]" \ > > + : "=&r" (res), "=&r" (temp) \ > > + : "r" (p) \ > > \ character not required for any of the above. Neither is the __ version > of "asm" and "volatile". Thanks. > > > + : "cc", "memory"); > > + > > + reason = "n\\a (atomic ops may be faulty)"; > > "n\\a" ? "not detected"? > So... at the moment this has me wondering - you're testing atomic > operations with a strongly ordered memory region, which ARM already > define this to be outside of the architecture spec. The behaviour you > see is not defined architecturally. > > And if you're trying to use LDREX/STREX to a strongly ordered or device > memory region, then you're quite right that it'll be unreliable. It's > not defined to even work. That's not because they're faulty, it's because > you're abusing them. However, IRL it is not hard to meet this undefined difference. At least I'm able to see it on Tegra2 Harmony and Pandaboard. Moreover, demand on Normal memory attribute breaks up ability to turn caches off. In this case we are not able to boot the system up (seen on Tegra2 harmony). This patch is aimed to highlight the difference in implementation. That's why it has some softering in guessing about faulty. Might be it worth warning about unpredictable effect instead? Best wishes Vladimir ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] arm: add check for global exclusive monitor 2013-02-20 4:55 ` Vladimir Murzin @ 2013-02-20 10:59 ` Russell King - ARM Linux 2013-02-22 16:46 ` Vladimir Murzin 0 siblings, 1 reply; 5+ messages in thread From: Russell King - ARM Linux @ 2013-02-20 10:59 UTC (permalink / raw) To: linux-arm-kernel On Wed, Feb 20, 2013 at 08:55:34AM +0400, Vladimir Murzin wrote: > Thanks for review Russel! > > On Mon, Feb 18, 2013 at 04:44:20PM +0000, Russell King - ARM Linux wrote: > > On Mon, Feb 18, 2013 at 08:26:50PM +0400, Vladimir Murzin wrote: > > > +void __init check_gmonitor_bugs(void) > > > +{ > > > + struct page *page; > > > + const char *reason; > > > + unsigned long res = 1; > > > + > > > + printk(KERN_INFO "CPU: Testing for global monitor: "); > > > + > > > + page = alloc_page(GFP_KERNEL); > > > + if (page) { > > > + unsigned long *p; > > > + pgprot_t prot = __pgprot_modify(PAGE_KERNEL, > > > + L_PTE_MT_MASK, L_PTE_MT_UNCACHED); > > > + > > > + p = vmap(&page, 1, VM_IOREMAP, prot); > > > > This is bad practise. Remapping a page of already mapped kernel memory > > using different attributes (in this case, strongly ordered) is _definitely_ > > a violation of the architecture requirements. The behaviour you will see > > from this are in no way guaranteed. > DDI0406C_arm_architecture_reference_manual.pdf (A3-131) says: > > A memory location can be marked as having different cacheability > attributes, for example when using aliases in a > virtual to physical address mapping: > * if the attributes differ only in the cache allocation hint this does > not affect the behavior of accesses to that location > * for other cases see Mismatched memory attributes on page A3-136. > > Isn't L_PTE_MT_UNCACHED about cache allocation hint? No. I've told you above what this gets you. > > If you want to do this, it must either come from highmem, or not already > > be mapped. > > > > Moreover, this is absolutely silly - the ARM ARM says: > > > > "LDREX and STREX operations *must* be performed only on memory with the > > Normal memory attribute." > > DDI0406C_arm_architecture_reference_manual.pdf (A3-121) says: > > It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be > performed to a memory region with the Device or Strongly-ordered > memory attribute. Unless the implementation documentation explicitly > states that LDREX and STREX operations to a memory region with the > Device or Strongly-ordered attribute are permitted, the effect of such > operations is UNPREDICTABLE. > > At least it allows to perform operations on memory region with the > Strongly-ordered attribute... but still unpredictable. And "unpredictable" means that you can't rely on *any* aspect of its behaviour. You can't test for its behaviour and then rely on the result of that test telling you how it will behave in future. Even if a "ldrex/strex" sequence to a strongly-ordered region succeeds or fails, that's no guarantee with "unpredictable" that it will have that behaviour next time. > > L_PTE_MT_UNCACHED doesn't get you that. As I say above, that gets you > > strongly ordered memory, not "normal memory" as required by the > > architecture for use with exclusive types. > > > > > + > > > + if (p) { > > > + int temp; > > > + > > > + __asm__ __volatile__( \ > > > + "ldrex %1, [%2]\n" \ > > > + "strex %0, %1, [%2]" \ > > > + : "=&r" (res), "=&r" (temp) \ > > > + : "r" (p) \ > > > > \ character not required for any of the above. Neither is the __ version > > of "asm" and "volatile". > Thanks. > > > > > + : "cc", "memory"); > > > + > > > + reason = "n\\a (atomic ops may be faulty)"; > > > > "n\\a" ? > "not detected"? > > So... at the moment this has me wondering - you're testing atomic > > operations with a strongly ordered memory region, which ARM already > > define this to be outside of the architecture spec. The behaviour you > > see is not defined architecturally. > > > > And if you're trying to use LDREX/STREX to a strongly ordered or device > > memory region, then you're quite right that it'll be unreliable. It's > > not defined to even work. That's not because they're faulty, it's because > > you're abusing them. > However, IRL it is not hard to meet this undefined difference. At > least I'm able to see it on Tegra2 Harmony and Pandaboard. Moreover, > demand on Normal memory attribute breaks up ability to turn caches > off. In this case we are not able to boot the system up (seen on > Tegra2 harmony). This patch is aimed to highlight the difference in > implementation. That's why it has some softering in guessing about > faulty. Might be it worth warning about unpredictable effect instead? You're soo busy quoting bits of architecture manual back at me that you can't see your own errors - errors which I've pointed out already. "normal memory" != "uncached". Normal memory can be uncacheable. Asking the kernel for L_PTE_MT_BUFFERABLE will give you "normal memory, uncached". ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] arm: add check for global exclusive monitor 2013-02-20 10:59 ` Russell King - ARM Linux @ 2013-02-22 16:46 ` Vladimir Murzin 0 siblings, 0 replies; 5+ messages in thread From: Vladimir Murzin @ 2013-02-22 16:46 UTC (permalink / raw) To: linux-arm-kernel On Wed, Feb 20, 2013 at 10:59:57AM +0000, Russell King - ARM Linux wrote: > On Wed, Feb 20, 2013 at 08:55:34AM +0400, Vladimir Murzin wrote: > > Thanks for review Russel! > > > > On Mon, Feb 18, 2013 at 04:44:20PM +0000, Russell King - ARM Linux wrote: > > > On Mon, Feb 18, 2013 at 08:26:50PM +0400, Vladimir Murzin wrote: > > > > +void __init check_gmonitor_bugs(void) > > > > +{ > > > > + struct page *page; > > > > + const char *reason; > > > > + unsigned long res = 1; > > > > + > > > > + printk(KERN_INFO "CPU: Testing for global monitor: "); > > > > + > > > > + page = alloc_page(GFP_KERNEL); > > > > + if (page) { > > > > + unsigned long *p; > > > > + pgprot_t prot = __pgprot_modify(PAGE_KERNEL, > > > > + L_PTE_MT_MASK, L_PTE_MT_UNCACHED); > > > > + > > > > + p = vmap(&page, 1, VM_IOREMAP, prot); > > > > > > This is bad practise. Remapping a page of already mapped kernel memory > > > using different attributes (in this case, strongly ordered) is _definitely_ > > > a violation of the architecture requirements. The behaviour you will see > > > from this are in no way guaranteed. > > DDI0406C_arm_architecture_reference_manual.pdf (A3-131) says: > > > > A memory location can be marked as having different cacheability > > attributes, for example when using aliases in a > > virtual to physical address mapping: > > * if the attributes differ only in the cache allocation hint this does > > not affect the behavior of accesses to that location > > * for other cases see Mismatched memory attributes on page A3-136. > > > > Isn't L_PTE_MT_UNCACHED about cache allocation hint? > > No. I've told you above what this gets you. > > > > If you want to do this, it must either come from highmem, or not already > > > be mapped. > > > > > > Moreover, this is absolutely silly - the ARM ARM says: > > > > > > "LDREX and STREX operations *must* be performed only on memory with the > > > Normal memory attribute." > > > > DDI0406C_arm_architecture_reference_manual.pdf (A3-121) says: > > > > It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be > > performed to a memory region with the Device or Strongly-ordered > > memory attribute. Unless the implementation documentation explicitly > > states that LDREX and STREX operations to a memory region with the > > Device or Strongly-ordered attribute are permitted, the effect of such > > operations is UNPREDICTABLE. > > > > At least it allows to perform operations on memory region with the > > Strongly-ordered attribute... but still unpredictable. > > And "unpredictable" means that you can't rely on *any* aspect of its > behaviour. You can't test for its behaviour and then rely on the > result of that test telling you how it will behave in future. > > Even if a "ldrex/strex" sequence to a strongly-ordered region succeeds > or fails, that's no guarantee with "unpredictable" that it will have > that behaviour next time. > > > > L_PTE_MT_UNCACHED doesn't get you that. As I say above, that gets you > > > strongly ordered memory, not "normal memory" as required by the > > > architecture for use with exclusive types. > > > > > > > + > > > > + if (p) { > > > > + int temp; > > > > + > > > > + __asm__ __volatile__( \ > > > > + "ldrex %1, [%2]\n" \ > > > > + "strex %0, %1, [%2]" \ > > > > + : "=&r" (res), "=&r" (temp) \ > > > > + : "r" (p) \ > > > > > > \ character not required for any of the above. Neither is the __ version > > > of "asm" and "volatile". > > Thanks. > > > > > > > + : "cc", "memory"); > > > > + > > > > + reason = "n\\a (atomic ops may be faulty)"; > > > > > > "n\\a" ? > > "not detected"? > > > So... at the moment this has me wondering - you're testing atomic > > > operations with a strongly ordered memory region, which ARM already > > > define this to be outside of the architecture spec. The behaviour you > > > see is not defined architecturally. > > > > > > And if you're trying to use LDREX/STREX to a strongly ordered or device > > > memory region, then you're quite right that it'll be unreliable. It's > > > not defined to even work. That's not because they're faulty, it's because > > > you're abusing them. > > However, IRL it is not hard to meet this undefined difference. At > > least I'm able to see it on Tegra2 Harmony and Pandaboard. Moreover, > > demand on Normal memory attribute breaks up ability to turn caches > > off. In this case we are not able to boot the system up (seen on > > Tegra2 harmony). This patch is aimed to highlight the difference in > > implementation. That's why it has some softering in guessing about > > faulty. Might be it worth warning about unpredictable effect instead? > > You're soo busy quoting bits of architecture manual back at me that you > can't see your own errors - errors which I've pointed out already. > > "normal memory" != "uncached". Normal memory can be uncacheable. > Asking the kernel for L_PTE_MT_BUFFERABLE will give you "normal memory, > uncached". Sorry for delay. Thanks for pointing the error. It was bad idea to make assumption about attribute based on the name. I shoul have checked them first. I'll update the patch soon. Best wishes Vladimir ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-02-22 16:46 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-02-18 16:26 [PATCH] arm: add check for global exclusive monitor Vladimir Murzin 2013-02-18 16:44 ` Russell King - ARM Linux [not found] ` <20130219105534.GA31156@murzin.rnd.samsung.ru> 2013-02-20 4:55 ` Vladimir Murzin 2013-02-20 10:59 ` Russell King - ARM Linux 2013-02-22 16:46 ` Vladimir Murzin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).