LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH v3 0/7] Avoid cache trashing on clearing huge/gigantic page
From: Kirill A. Shutemov @ 2012-08-16 15:15 UTC
  To: linux-mm
  Cc: linux-mips, linux-sh, Jan Beulich, H. Peter Anvin, sparclinux,
	Andrea Arcangeli, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Clearing a 2MB huge page will typically blow away several levels of CPU
caches.  To avoid this, clear only the 4K area around the fault address
through the cache and use cache-avoiding clears for the rest of the 2MB
area.
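
As a rough illustration of the selection rule described above, here is a
minimal userspace sketch. It assumes 4K base pages; pick_clear() and
count_cached() are hypothetical helper names used only for this example and
are not part of the patchset.

```c
#define PAGE_SHIFT 12			/* assumed 4K base pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

enum clear_kind { CLEAR_CACHED, CLEAR_NOCACHE };

/*
 * Decide how subpage i of a huge page starting at haddr would be cleared,
 * given the address that faulted. Only the subpage containing the fault
 * address goes through the cache.
 */
static enum clear_kind pick_clear(unsigned long haddr,
				  unsigned long fault_address,
				  unsigned int i)
{
	unsigned long target = (fault_address - haddr) >> PAGE_SHIFT;

	return i == target ? CLEAR_CACHED : CLEAR_NOCACHE;
}

/* Count how many subpages would be cleared through the cache. */
static unsigned int count_cached(unsigned long haddr,
				 unsigned long fault_address,
				 unsigned int pages_per_huge_page)
{
	unsigned int i, n = 0;

	for (i = 0; i < pages_per_huge_page; i++)
		if (pick_clear(haddr, fault_address, i) == CLEAR_CACHED)
			n++;
	return n;
}
```

For a 2MB page (512 subpages) exactly one subpage is cleared through the
cache, whichever offset the fault hits.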

This patchset implements a cache-avoiding version of clear_page only for
x86. If an architecture wants to provide a cache-avoiding version of
clear_page, it should define ARCH_HAS_USER_NOCACHE to 1 and implement
clear_page_nocache() and clear_user_highpage_nocache().
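
The opt-in can be modelled in userspace roughly as follows. This is only a
sketch of the compile-time fallback pattern: clear_page() here is a memset()
stand-in, page_is_zero() is a hypothetical checking helper, and the 4K
PAGE_SIZE is an assumption.

```c
#include <string.h>

#define PAGE_SIZE 4096	/* assumed base page size */

/* An architecture with a real cache-avoiding clear would define this to 1. */
#ifndef ARCH_HAS_USER_NOCACHE
#define ARCH_HAS_USER_NOCACHE 0
#endif

/* Stand-in for the ordinary cached clear_page(). */
static void clear_page(void *page)
{
	memset(page, 0, PAGE_SIZE);
}

#if ARCH_HAS_USER_NOCACHE
/* The architecture supplies its own implementation elsewhere. */
void clear_page_nocache(void *page);
#else
/* No arch support: the "nocache" name simply aliases the cached clear. */
#define clear_page_nocache clear_page
#endif

/* Checking helper: returns 1 if the whole page reads back as zero. */
static int page_is_zero(const unsigned char *page)
{
	size_t i;

	for (i = 0; i < PAGE_SIZE; i++)
		if (page[i])
			return 0;
	return 1;
}
```

Callers can use clear_page_nocache() unconditionally; without arch support
it degrades to the cached clear.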

v3:
  - Rebased to current Linus' tree. kmap_atomic() build issue is fixed;
  - Pass fault address to clear_huge_page(). v2 had problem with clearing
    for sizes other than HPAGE_SIZE
  - x86: fix 32bit variant. Fallback version of clear_page_nocache() has
    been added for non-SSE2 systems;
  - x86: clear_page_nocache() moved to clear_page_{32,64}.S;
  - x86: use pushq_cfi/popq_cfi instead of push/pop;
v2:
  - No code change. Only commit messages are updated.
  - RFC mark is dropped.

Andi Kleen (5):
  THP: Use real address for NUMA policy
  THP: Pass fault address to __do_huge_pmd_anonymous_page()
  x86: Add clear_page_nocache
  mm: make clear_huge_page cache clear only around the fault address
  x86: switch the 64bit uncached page clear to SSE/AVX v2

Kirill A. Shutemov (2):
  hugetlb: pass fault address to hugetlb_no_page()
  mm: pass fault address to clear_huge_page()

 arch/x86/include/asm/page.h      |    2 +
 arch/x86/include/asm/string_32.h |    5 ++
 arch/x86/include/asm/string_64.h |    5 ++
 arch/x86/lib/Makefile            |    3 +-
 arch/x86/lib/clear_page_32.S     |   72 +++++++++++++++++++++++++++++++++++
 arch/x86/lib/clear_page_64.S     |   78 ++++++++++++++++++++++++++++++++++++++
 arch/x86/mm/fault.c              |    7 +++
 include/linux/mm.h               |    2 +-
 mm/huge_memory.c                 |   17 ++++----
 mm/hugetlb.c                     |   39 ++++++++++---------
 mm/memory.c                      |   37 +++++++++++++++---
 11 files changed, 232 insertions(+), 35 deletions(-)
 create mode 100644 arch/x86/lib/clear_page_32.S

-- 
1.7.7.6


* [PATCH v3 2/7] THP: Pass fault address to __do_huge_pmd_anonymous_page()
From: Kirill A. Shutemov @ 2012-08-16 15:15 UTC
  To: linux-mm
  Cc: linux-mips, linux-sh, Jan Beulich, H. Peter Anvin, sparclinux,
	Andrea Arcangeli, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <1345130154-9602-1-git-send-email-kirill.shutemov@linux.intel.com>

From: Andi Kleen <ak@linux.intel.com>

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/huge_memory.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 70737ec..6f0825b611 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -633,7 +633,8 @@ static inline pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 
 static int __do_huge_pmd_anonymous_page(struct mm_struct *mm,
 					struct vm_area_struct *vma,
-					unsigned long haddr, pmd_t *pmd,
+					unsigned long haddr,
+					unsigned long address, pmd_t *pmd,
 					struct page *page)
 {
 	pgtable_t pgtable;
@@ -720,8 +721,8 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			put_page(page);
 			goto out;
 		}
-		if (unlikely(__do_huge_pmd_anonymous_page(mm, vma, haddr, pmd,
-							  page))) {
+		if (unlikely(__do_huge_pmd_anonymous_page(mm, vma, haddr,
+						address, pmd, page))) {
 			mem_cgroup_uncharge_page(page);
 			put_page(page);
 			goto out;
-- 
1.7.7.6


* [PATCH v3 3/7] hugetlb: pass fault address to hugetlb_no_page()
From: Kirill A. Shutemov @ 2012-08-16 15:15 UTC
  To: linux-mm
  Cc: linux-mips, linux-sh, Jan Beulich, H. Peter Anvin, sparclinux,
	Andrea Arcangeli, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <1345130154-9602-1-git-send-email-kirill.shutemov@linux.intel.com>

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/hugetlb.c |   38 +++++++++++++++++++-------------------
 1 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bc72712..3c86d3d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2672,7 +2672,8 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
 }
 
 static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
-			unsigned long address, pte_t *ptep, unsigned int flags)
+			unsigned long haddr, unsigned long fault_address,
+			pte_t *ptep, unsigned int flags)
 {
 	struct hstate *h = hstate_vma(vma);
 	int ret = VM_FAULT_SIGBUS;
@@ -2696,7 +2697,7 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	}
 
 	mapping = vma->vm_file->f_mapping;
-	idx = vma_hugecache_offset(h, vma, address);
+	idx = vma_hugecache_offset(h, vma, haddr);
 
 	/*
 	 * Use page lock to guard against racing truncation
@@ -2708,7 +2709,7 @@ retry:
 		size = i_size_read(mapping->host) >> huge_page_shift(h);
 		if (idx >= size)
 			goto out;
-		page = alloc_huge_page(vma, address, 0);
+		page = alloc_huge_page(vma, haddr, 0);
 		if (IS_ERR(page)) {
 			ret = PTR_ERR(page);
 			if (ret == -ENOMEM)
@@ -2717,7 +2718,7 @@ retry:
 				ret = VM_FAULT_SIGBUS;
 			goto out;
 		}
-		clear_huge_page(page, address, pages_per_huge_page(h));
+		clear_huge_page(page, haddr, pages_per_huge_page(h));
 		__SetPageUptodate(page);
 
 		if (vma->vm_flags & VM_MAYSHARE) {
@@ -2763,7 +2764,7 @@ retry:
 	 * the spinlock.
 	 */
 	if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED))
-		if (vma_needs_reservation(h, vma, address) < 0) {
+		if (vma_needs_reservation(h, vma, haddr) < 0) {
 			ret = VM_FAULT_OOM;
 			goto backout_unlocked;
 		}
@@ -2778,16 +2779,16 @@ retry:
 		goto backout;
 
 	if (anon_rmap)
-		hugepage_add_new_anon_rmap(page, vma, address);
+		hugepage_add_new_anon_rmap(page, vma, haddr);
 	else
 		page_dup_rmap(page);
 	new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
 				&& (vma->vm_flags & VM_SHARED)));
-	set_huge_pte_at(mm, address, ptep, new_pte);
+	set_huge_pte_at(mm, haddr, ptep, new_pte);
 
 	if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
 		/* Optimization, do the COW without a second fault */
-		ret = hugetlb_cow(mm, vma, address, ptep, new_pte, page);
+		ret = hugetlb_cow(mm, vma, haddr, ptep, new_pte, page);
 	}
 
 	spin_unlock(&mm->page_table_lock);
@@ -2813,21 +2814,20 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct page *pagecache_page = NULL;
 	static DEFINE_MUTEX(hugetlb_instantiation_mutex);
 	struct hstate *h = hstate_vma(vma);
+	unsigned long haddr = address & huge_page_mask(h);
 
-	address &= huge_page_mask(h);
-
-	ptep = huge_pte_offset(mm, address);
+	ptep = huge_pte_offset(mm, haddr);
 	if (ptep) {
 		entry = huge_ptep_get(ptep);
 		if (unlikely(is_hugetlb_entry_migration(entry))) {
-			migration_entry_wait(mm, (pmd_t *)ptep, address);
+			migration_entry_wait(mm, (pmd_t *)ptep, haddr);
 			return 0;
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 			return VM_FAULT_HWPOISON_LARGE |
 				VM_FAULT_SET_HINDEX(hstate_index(h));
 	}
 
-	ptep = huge_pte_alloc(mm, address, huge_page_size(h));
+	ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
 	if (!ptep)
 		return VM_FAULT_OOM;
 
@@ -2839,7 +2839,7 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	mutex_lock(&hugetlb_instantiation_mutex);
 	entry = huge_ptep_get(ptep);
 	if (huge_pte_none(entry)) {
-		ret = hugetlb_no_page(mm, vma, address, ptep, flags);
+		ret = hugetlb_no_page(mm, vma, haddr, address, ptep, flags);
 		goto out_mutex;
 	}
 
@@ -2854,14 +2854,14 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * consumed.
 	 */
 	if ((flags & FAULT_FLAG_WRITE) && !pte_write(entry)) {
-		if (vma_needs_reservation(h, vma, address) < 0) {
+		if (vma_needs_reservation(h, vma, haddr) < 0) {
 			ret = VM_FAULT_OOM;
 			goto out_mutex;
 		}
 
 		if (!(vma->vm_flags & VM_MAYSHARE))
 			pagecache_page = hugetlbfs_pagecache_page(h,
-								vma, address);
+								vma, haddr);
 	}
 
 	/*
@@ -2884,16 +2884,16 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	if (flags & FAULT_FLAG_WRITE) {
 		if (!pte_write(entry)) {
-			ret = hugetlb_cow(mm, vma, address, ptep, entry,
+			ret = hugetlb_cow(mm, vma, haddr, ptep, entry,
 							pagecache_page);
 			goto out_page_table_lock;
 		}
 		entry = pte_mkdirty(entry);
 	}
 	entry = pte_mkyoung(entry);
-	if (huge_ptep_set_access_flags(vma, address, ptep, entry,
+	if (huge_ptep_set_access_flags(vma, haddr, ptep, entry,
 						flags & FAULT_FLAG_WRITE))
-		update_mmu_cache(vma, address, ptep);
+		update_mmu_cache(vma, haddr, ptep);
 
 out_page_table_lock:
 	spin_unlock(&mm->page_table_lock);
-- 
1.7.7.6


* [PATCH v3 5/7] x86: Add clear_page_nocache
From: Kirill A. Shutemov @ 2012-08-16 15:15 UTC
  To: linux-mm
  Cc: linux-mips, linux-sh, Jan Beulich, H. Peter Anvin, sparclinux,
	Andrea Arcangeli, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <1345130154-9602-1-git-send-email-kirill.shutemov@linux.intel.com>

From: Andi Kleen <ak@linux.intel.com>

Add a cache-avoiding version of clear_page. It is a straightforward integer
variant of the existing 64bit clear_page, for both 32bit and 64bit.

Also add the necessary glue for highmem including a layer that non cache
coherent architectures that use the virtual address for flushing can
hook in. This is not needed on x86 of course.

If an architecture wants to provide a cache-avoiding version of clear_page,
it should define ARCH_HAS_USER_NOCACHE to 1 and implement
clear_page_nocache() and clear_user_highpage_nocache().
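
For readers who want to experiment outside the kernel, the non-temporal
store loop can be approximated in userspace C roughly as below. This is a
sketch, not the assembly from this patch: clear_page_nocache_user() is a
hypothetical name, the 4K PAGE_SIZE is assumed, the trailing _mm_sfence() is
an addition of this example, and builds without SSE2 fall back to memset().

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_stream_si32(), _mm_sfence() */
#endif

#define PAGE_SIZE 4096	/* assumed base page size */

/*
 * Zero one page. With SSE2 available this issues 32-bit non-temporal
 * stores (movnti), which are intended to bypass the cache hierarchy;
 * otherwise it falls back to an ordinary cached memset().
 */
static void clear_page_nocache_user(void *page)
{
#if defined(__SSE2__)
	int *p = page;
	size_t i;

	for (i = 0; i < PAGE_SIZE / sizeof(int); i++)
		_mm_stream_si32(p + i, 0);
	/* Order the weakly-ordered streaming stores before later accesses. */
	_mm_sfence();
#else
	memset(page, 0, PAGE_SIZE);
#endif
}
```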

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/page.h      |    2 +
 arch/x86/include/asm/string_32.h |    5 +++
 arch/x86/include/asm/string_64.h |    5 +++
 arch/x86/lib/Makefile            |    3 +-
 arch/x86/lib/clear_page_32.S     |   72 ++++++++++++++++++++++++++++++++++++++
 arch/x86/lib/clear_page_64.S     |   29 +++++++++++++++
 arch/x86/mm/fault.c              |    7 ++++
 7 files changed, 122 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/lib/clear_page_32.S

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 8ca8283..aa83a1b 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -29,6 +29,8 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 	copy_page(to, from);
 }
 
+void clear_user_highpage_nocache(struct page *page, unsigned long vaddr);
+
 #define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
 	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
diff --git a/arch/x86/include/asm/string_32.h b/arch/x86/include/asm/string_32.h
index 3d3e835..3f2fbcf 100644
--- a/arch/x86/include/asm/string_32.h
+++ b/arch/x86/include/asm/string_32.h
@@ -3,6 +3,8 @@
 
 #ifdef __KERNEL__
 
+#include <linux/linkage.h>
+
 /* Let gcc decide whether to inline or use the out of line functions */
 
 #define __HAVE_ARCH_STRCPY
@@ -337,6 +339,9 @@ void *__constant_c_and_count_memset(void *s, unsigned long pattern,
 #define __HAVE_ARCH_MEMSCAN
 extern void *memscan(void *addr, int c, size_t size);
 
+#define ARCH_HAS_USER_NOCACHE 1
+asmlinkage void clear_page_nocache(void *page);
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_STRING_32_H */
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index 19e2c46..ca23d1d 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -3,6 +3,8 @@
 
 #ifdef __KERNEL__
 
+#include <linux/linkage.h>
+
 /* Written 2002 by Andi Kleen */
 
 /* Only used for special circumstances. Stolen from i386/string.h */
@@ -63,6 +65,9 @@ char *strcpy(char *dest, const char *src);
 char *strcat(char *dest, const char *src);
 int strcmp(const char *cs, const char *ct);
 
+#define ARCH_HAS_USER_NOCACHE 1
+asmlinkage void clear_page_nocache(void *page);
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_STRING_64_H */
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index b00f678..14e47a2 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -23,6 +23,7 @@ lib-y += memcpy_$(BITS).o
 lib-$(CONFIG_SMP) += rwlock.o
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
 lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
+lib-y += clear_page_$(BITS).o
 
 obj-y += msr.o msr-reg.o msr-reg-export.o
 
@@ -40,7 +41,7 @@ endif
 else
         obj-y += iomap_copy_64.o
         lib-y += csum-partial_64.o csum-copy_64.o csum-wrappers_64.o
-        lib-y += thunk_64.o clear_page_64.o copy_page_64.o
+        lib-y += thunk_64.o copy_page_64.o
         lib-y += memmove_64.o memset_64.o
         lib-y += copy_user_64.o copy_user_nocache_64.o
 	lib-y += cmpxchg16b_emu.o
diff --git a/arch/x86/lib/clear_page_32.S b/arch/x86/lib/clear_page_32.S
new file mode 100644
index 0000000..9592161
--- /dev/null
+++ b/arch/x86/lib/clear_page_32.S
@@ -0,0 +1,72 @@
+#include <linux/linkage.h>
+#include <asm/alternative-asm.h>
+#include <asm/cpufeature.h>
+#include <asm/dwarf2.h>
+
+/*
+ * Fallback version if SSE2 is not available.
+ */
+ENTRY(clear_page_nocache)
+	CFI_STARTPROC
+	mov    %eax,%edx
+	xorl   %eax,%eax
+	movl   $4096/32,%ecx
+	.p2align 4
+.Lloop:
+	decl	%ecx
+#define PUT(x) mov %eax,x*4(%edx)
+	PUT(0)
+	PUT(1)
+	PUT(2)
+	PUT(3)
+	PUT(4)
+	PUT(5)
+	PUT(6)
+	PUT(7)
+#undef PUT
+	lea	32(%edx),%edx
+	jnz	.Lloop
+	nop
+	ret
+	CFI_ENDPROC
+ENDPROC(clear_page_nocache)
+
+	.section .altinstr_replacement,"ax"
+1:      .byte 0xeb /* jmp <disp8> */
+	.byte (clear_page_nocache_sse2 - clear_page_nocache) - (2f - 1b)
+	/* offset */
+2:
+	.previous
+	.section .altinstructions,"a"
+	altinstruction_entry clear_page_nocache,1b,X86_FEATURE_XMM2,\
+				16, 2b-1b
+	.previous
+
+/*
+ * Zero a page avoiding the caches
+ * eax	page
+ */
+ENTRY(clear_page_nocache_sse2)
+	CFI_STARTPROC
+	mov    %eax,%edx
+	xorl   %eax,%eax
+	movl   $4096/32,%ecx
+	.p2align 4
+.Lloop_sse2:
+	decl	%ecx
+#define PUT(x) movnti %eax,x*4(%edx)
+	PUT(0)
+	PUT(1)
+	PUT(2)
+	PUT(3)
+	PUT(4)
+	PUT(5)
+	PUT(6)
+	PUT(7)
+#undef PUT
+	lea	32(%edx),%edx
+	jnz	.Lloop_sse2
+	nop
+	ret
+	CFI_ENDPROC
+ENDPROC(clear_page_nocache_sse2)
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index f2145cf..9d2f3c2 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -40,6 +40,7 @@ ENTRY(clear_page)
 	PUT(5)
 	PUT(6)
 	PUT(7)
+#undef PUT
 	leaq	64(%rdi),%rdi
 	jnz	.Lloop
 	nop
@@ -71,3 +72,31 @@ ENDPROC(clear_page)
 	altinstruction_entry clear_page,2b,X86_FEATURE_ERMS,   \
 			     .Lclear_page_end-clear_page,3b-2b
 	.previous
+
+/*
+ * Zero a page avoiding the caches
+ * rdi	page
+ */
+ENTRY(clear_page_nocache)
+	CFI_STARTPROC
+	xorl   %eax,%eax
+	movl   $4096/64,%ecx
+	.p2align 4
+.Lloop_nocache:
+	decl	%ecx
+#define PUT(x) movnti %rax,x*8(%rdi)
+	movnti %rax,(%rdi)
+	PUT(1)
+	PUT(2)
+	PUT(3)
+	PUT(4)
+	PUT(5)
+	PUT(6)
+	PUT(7)
+#undef PUT
+	leaq	64(%rdi),%rdi
+	jnz	.Lloop_nocache
+	nop
+	ret
+	CFI_ENDPROC
+ENDPROC(clear_page_nocache)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 76dcd9d..d8cf231 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1209,3 +1209,10 @@ good_area:
 
 	up_read(&mm->mmap_sem);
 }
+
+void clear_user_highpage_nocache(struct page *page, unsigned long vaddr)
+{
+	void *p = kmap_atomic(page);
+	clear_page_nocache(p);
+	kunmap_atomic(p);
+}
-- 
1.7.7.6


* [PATCH v3 6/7] mm: make clear_huge_page cache clear only around the fault address
From: Kirill A. Shutemov @ 2012-08-16 15:15 UTC
  To: linux-mm
  Cc: linux-mips, linux-sh, Jan Beulich, H. Peter Anvin, sparclinux,
	Andrea Arcangeli, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <1345130154-9602-1-git-send-email-kirill.shutemov@linux.intel.com>

From: Andi Kleen <ak@linux.intel.com>

Clearing a 2MB huge page will typically blow away several levels
of CPU caches. To avoid this, clear only the 4K area around the
fault address through the cache and use cache-avoiding clears
for the rest of the 2MB area.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/memory.c |   34 +++++++++++++++++++++++++++++-----
 1 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index dfc179b..d4626b9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3969,18 +3969,34 @@ EXPORT_SYMBOL(might_fault);
 #endif
 
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
+
+#ifndef ARCH_HAS_USER_NOCACHE
+#define ARCH_HAS_USER_NOCACHE 0
+#endif
+
+#if ARCH_HAS_USER_NOCACHE == 0
+#define clear_user_highpage_nocache clear_user_highpage
+#endif
+
 static void clear_gigantic_page(struct page *page,
-				unsigned long addr,
-				unsigned int pages_per_huge_page)
+		unsigned long haddr, unsigned long fault_address,
+		unsigned int pages_per_huge_page)
 {
 	int i;
 	struct page *p = page;
+	unsigned long vaddr;
+	int target = (fault_address - haddr) >> PAGE_SHIFT;
 
 	might_sleep();
+	vaddr = haddr;
 	for (i = 0; i < pages_per_huge_page;
 	     i++, p = mem_map_next(p, page, i)) {
 		cond_resched();
-		clear_user_highpage(p, addr + i * PAGE_SIZE);
+		vaddr = haddr + i*PAGE_SIZE;
+		if (!ARCH_HAS_USER_NOCACHE  || i == target)
+			clear_user_highpage(p, vaddr);
+		else
+			clear_user_highpage_nocache(p, vaddr);
 	}
 }
 void clear_huge_page(struct page *page,
@@ -3988,16 +4004,24 @@ void clear_huge_page(struct page *page,
 		     unsigned int pages_per_huge_page)
 {
 	int i;
+	unsigned long vaddr;
+	int target = (fault_address - haddr) >> PAGE_SHIFT;
 
 	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
-		clear_gigantic_page(page, haddr, pages_per_huge_page);
+		clear_gigantic_page(page, haddr, fault_address,
+				pages_per_huge_page);
 		return;
 	}
 
 	might_sleep();
+	vaddr = haddr;
 	for (i = 0; i < pages_per_huge_page; i++) {
 		cond_resched();
-		clear_user_highpage(page + i, haddr + i * PAGE_SIZE);
+		vaddr = haddr + i*PAGE_SIZE;
+		if (!ARCH_HAS_USER_NOCACHE || i == target)
+			clear_user_highpage(page + i, vaddr);
+		else
+			clear_user_highpage_nocache(page + i, vaddr);
 	}
 }
 
-- 
1.7.7.6


* Re: therm_pm72 units, interface
From: Jan Engelhardt @ 2012-08-16 15:24 UTC
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1345066551.11751.1.camel@pasglop>


On Wednesday 2012-08-15 23:35, Benjamin Herrenschmidt wrote:
>> XServe G5 of mine started powering off more or less 
>> randomly
>
>BTW. There's a new windfarm driver for these in recent kernels...
>
>Apart from that, the trip points are coming from a calibration EEPROM,
>you may want to tweak the driver to warn a bit earlier or that sort of
>thing? (Or just to print more things out?)

If you have more things to print/offer via sysfs, I'm all for it.

The XsG5 really has (by looking into the casing): 1 PCI Fan,
6 center fans, 1 PSU intake and 1 PSU outblow fan (this last one
seems rather slow-turning, but maybe that's normal).
It is not quite clear which is which in the sysfs display.

What I did figure out: at the PROM, fans run at what seems
to be full speed (some 8000-9000 rpm?). Once Linux and therm_pm72
are loaded, the fans settle down towards 4000 rpm, and once the machine
has warmed up, that is when it powers off. (The kernel is indeed
3.4. I now need to figure out how to place a new kernel on it without
it powering off in between.)

>> $ cd /sys/devices/temperature; grep '' *;
>> backside_fan_pwm:32
>> backside_temperature:54.000
>> cpu0_current:34.423
>> cpu0_exhaust_fan_rpm:5340
>> cpu0_intake_fan_rpm:5340
>> cpu0_temperature:72.889
>> cpu0_voltage:1.252
>> cpu1_current:34.179
>> cpu1_exhaust_fan_rpm:4584
>> cpu1_intake_fan_rpm:4584
>> cpu1_temperature:68.526
>> cpu1_voltage:1.259
>> dimms_temperature:53.000
>> grep: driver: Is a directory
>> modalias:platform:temperature
>> grep: power: Is a directory
>> slots_fan_pwm:20
>> slots_temperature:38.500
>> grep: subsystem: Is a directory
>> uevent:DRIVER=temperature
>> uevent:OF_NAME=fan
>> uevent:OF_FULLNAME=/u3@0,f8000000/i2c@f8001000/fan@15e
>> uevent:OF_TYPE=fcu
>> uevent:OF_COMPATIBLE_0=fcu
>> uevent:OF_COMPATIBLE_N=1
>> uevent:MODALIAS=of:NfanTfcuCfcu


* Re: [PATCH v3 2/2] powerpc: Uprobes port to powerpc
From: Oleg Nesterov @ 2012-08-16 15:21 UTC
  To: Ananth N Mavinakayanahalli
  Cc: Srikar Dronamraju, peterz, lkml, Paul Mackerras, Anton Blanchard,
	Ingo Molnar, linuxppc-dev
In-Reply-To: <20120816050030.GA12060@in.ibm.com>

On 08/16, Ananth N Mavinakayanahalli wrote:
>
> On Thu, Aug 16, 2012 at 07:41:53AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2012-08-15 at 18:59 +0200, Oleg Nesterov wrote:
> > > On 07/26, Ananth N Mavinakayanahalli wrote:
> > > >
> > > > From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
> > > >
> > > > This is the port of uprobes to powerpc. Usage is similar to x86.
> > >
> > > I am just curious why this series was ignored by powerpc maintainers...
> >
> > Because it arrived too late for the previous merge window considering my
> > limited bandwidth for reviewing things and that nobody else seems to
> > have reviewed it :-)
> >
> > It's still on track for the next one, and I'm hoping to dedicate most of
> > next week going through patches & doing a powerpc -next.
>
> Thanks Ben!

Great!

> > > Just one question... Shouldn't arch_uprobe_pre_xol() forbid to probe
> > > UPROBE_SWBP_INSN (at least) ?
> > >
> > > (I assume that emulate_step() can't handle this case but of course I
> > >  do not understand arch/powerpc/lib/sstep.c)
> > >
> > > Note that uprobe_pre_sstep_notifier() sets utask->state = UTASK_BP_HIT
> > > without any checks. This doesn't look right if it was UTASK_SSTEP...
> > >
> > > But again, I do not know what powerpc will actually do if we try to
> > > single-step over UPROBE_SWBP_INSN.
> >
> > Ananth ?
>
> set_swbp() will return -EEXIST to install_breakpoint if we are trying to
> put a breakpoint on UPROBE_SWBP_INSN.

not really, this -EEXIST (already removed by recent changes) means that
bp was already installed.

But this doesn't matter,

> So, the arch agnostic code itself
> takes care of this case...

Yes. I forgot about install_breakpoint()->is_swbp_insn() check which
returns -ENOTSUPP, somehow I thought arch_uprobe_analyze_insn() does
this.

> or am I missing something?

No, it is me.

> However, I see that we need a powerpc specific is_swbp_insn()
> implementation since we will have to take care of all the trap variants.

Hmm, I am not sure. is_swbp_insn(insn), as it is used in the arch agnostic
code, should only return true if insn == UPROBE_SWBP_INSN (just in case,
this logic needs more fixes but this is offtopic).

If powerpc has another insn(s) which can trigger powerpc's do_int3()
counterpart, they should be rejected by arch_uprobe_analyze_insn().
I think.

> I will need to update the patches based on changes being made by Oleg
> and Sebastien for the single-step issues.

Perhaps you can do this in a separate change?

We need some (simple) changes in the arch agnostic code first, they
should not break powerpc. These changes are still under discussion.
Once we have "__weak  arch_uprobe_step*" you can reimplement these
hooks and fix the problems with single-stepping.

Oleg.


* Re: [PATCH v3 6/7] mm: make clear_huge_page cache clear only around the fault address
From: Andrea Arcangeli @ 2012-08-16 16:16 UTC
  To: Kirill A. Shutemov
  Cc: linux-mips, Andi Kleen, Alex Shi, Robert Richter, linuxppc-dev,
	x86, Hugh Dickins, linux-kernel, Jan Beulich, Andy Lutomirski,
	Johannes Weiner, linux-mm, linux-sh, Ingo Molnar, Mel Gorman,
	H. Peter Anvin, sparclinux, Thomas Gleixner, Tim Chen,
	Andrew Morton, KAMEZAWA Hiroyuki
In-Reply-To: <1345130154-9602-7-git-send-email-kirill.shutemov@linux.intel.com>

Hi Kirill,

On Thu, Aug 16, 2012 at 06:15:53PM +0300, Kirill A. Shutemov wrote:
>  	for (i = 0; i < pages_per_huge_page;
>  	     i++, p = mem_map_next(p, page, i)) {

It may be more optimal to avoid a multiplication/shiftleft before the
add, and to do:

  	for (i = 0, vaddr = haddr; i < pages_per_huge_page;
  	     i++, p = mem_map_next(p, page, i), vaddr += PAGE_SIZE) {

>  		cond_resched();
> -		clear_user_highpage(p, addr + i * PAGE_SIZE);
> +		vaddr = haddr + i*PAGE_SIZE;

Not sure if gcc can optimize it away because of the external calls.
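
To make the equivalence concrete, here is a small userspace sketch that
walks the incremented vaddr alongside the recomputed one. It assumes 4K base
pages; loops_agree() is a hypothetical name used only for this check.

```c
#define PAGE_SHIFT 12			/* assumed 4K base pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/*
 * Walk n subpages with vaddr carried in the loop increment and compare it
 * against the value recomputed from i on every iteration.
 * Returns 1 if both forms yield the same address at each step.
 */
static int loops_agree(unsigned long haddr, unsigned int n)
{
	unsigned long vaddr;
	unsigned int i;

	for (i = 0, vaddr = haddr; i < n; i++, vaddr += PAGE_SIZE)
		if (vaddr != haddr + i * PAGE_SIZE)
			return 0;
	return 1;
}
```

Both forms visit the same addresses; the incremented one just replaces the
per-iteration multiply or shift with a single add.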

> +		if (!ARCH_HAS_USER_NOCACHE || i == target)
> +			clear_user_highpage(page + i, vaddr);
> +		else
> +			clear_user_highpage_nocache(page + i, vaddr);
>  	}


My only worry overall is if there can be some workload where this may
actually slow down userland if the CPU cache is very large and
userland would access most of the faulted in memory after the first
fault.

So I wouldn't mind adding one more check, in addition to
!ARCH_HAS_USER_NOCACHE above, to check a runtime sysctl variable. It'll
waste a cacheline yes but I doubt it's measurable compared to the time
it takes to do a >=2M hugepage copy.

Furthermore it would allow people to benchmark its effect without
having to rebuild the kernel themselves.
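
Something along these lines is what I have in mind, written here as a
userspace sketch. The variable name sysctl_clear_huge_page_nocache and the
helper use_cached_clear() are hypothetical, and ARCH_HAS_USER_NOCACHE is
forced to 1 just for the example.

```c
#include <stdbool.h>

/* Pretend the architecture provides the cache-avoiding clear. */
#ifndef ARCH_HAS_USER_NOCACHE
#define ARCH_HAS_USER_NOCACHE 1
#endif

/* Hypothetical tunable standing in for the sysctl; not a real kernel knob. */
static int sysctl_clear_huge_page_nocache = 1;

/*
 * Returns true if subpage i should be cleared through the cache:
 * no arch support, the knob turned off, or the subpage that faulted.
 */
static bool use_cached_clear(unsigned int i, unsigned int target)
{
	return !ARCH_HAS_USER_NOCACHE ||
	       !sysctl_clear_huge_page_nocache ||
	       i == target;
}
```

With the knob at 0 every subpage goes through the ordinary cached clear, so
the two behaviours can be compared on the same kernel build.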

All other patches look fine to me.

Thanks!
Andrea


* Re: [PATCH v3 0/7] mv643xx.c: Add basic device tree support.
From: Ian Molton @ 2012-08-16 16:30 UTC
  To: Arnd Bergmann
  Cc: thomas.petazzoni, andrew, netdev, devicetree-discuss, ben.dooks,
	linuxppc-dev, David Miller, linux-arm-kernel
In-Reply-To: <5028D040.60604@codethink.co.uk>

Ping :)

Can we get some consensus on the right approach here? I'm loath to code
this if it's going to be rejected.

I'd prefer the driver to be properly split so we don't have the MDIO
driver mapping the ethernet driver's address spaces, but if that's not
going to be merged, I'm not feeling like doing the work for nothing.

If the driver is to use the overlapping-address mapped-by-the-mdio
scheme, then so be it, but I could do with knowing.

Another point against the latter scheme is that the MDIO driver could
sensibly be used (the block is identical) on the ArmadaXP, which has 4
ethernet blocks rather than two, yet grouped in two pairs with a
discontiguous address range.

I'd like to get this moved along as soon as possible though.

-Ian


* Re: [PATCH v3 6/7] mm: make clear_huge_page cache clear only around the fault address
From: Kirill A. Shutemov @ 2012-08-16 16:43 UTC
  To: Andrea Arcangeli
  Cc: linux-mips, linux-sh, Jan Beulich, linux-mm, H. Peter Anvin,
	sparclinux, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <20120816161647.GM11188@redhat.com>

On Thu, Aug 16, 2012 at 06:16:47PM +0200, Andrea Arcangeli wrote:
> Hi Kirill,
> 
> On Thu, Aug 16, 2012 at 06:15:53PM +0300, Kirill A. Shutemov wrote:
> >  	for (i = 0; i < pages_per_huge_page;
> >  	     i++, p = mem_map_next(p, page, i)) {
> 
> It may be more optimal to avoid a multiplication/shiftleft before the
> add, and to do:
> 
>   	for (i = 0, vaddr = haddr; i < pages_per_huge_page;
>   	     i++, p = mem_map_next(p, page, i), vaddr += PAGE_SIZE) {
> 

Makes sense. I'll update it.

> >  		cond_resched();
> > -		clear_user_highpage(p, addr + i * PAGE_SIZE);
> > +		vaddr = haddr + i*PAGE_SIZE;
> 
> Not sure if gcc can optimize it away because of the external calls.
> 
> > +		if (!ARCH_HAS_USER_NOCACHE || i == target)
> > +			clear_user_highpage(page + i, vaddr);
> > +		else
> > +			clear_user_highpage_nocache(page + i, vaddr);
> >  	}
> 
> 
> My only worry overall is if there can be some workload where this may
> actually slow down userland if the CPU cache is very large and
> userland would access most of the faulted in memory after the first
> fault.
> 
> So I wouldn't mind adding one more check, in addition to
> !ARCH_HAS_USER_NOCACHE above, to check a runtime sysctl variable. It'll
> waste a cacheline yes but I doubt it's measurable compared to the time
> it takes to do a >=2M hugepage copy.

Hm.. I think with static_key we can avoid cache overhead here. I'll try.
 
> Furthermore it would allow people to benchmark its effect without
> having to rebuild the kernel themselves.
> 
> All other patches look fine to me.

Thanks for the review. Could you take a look at the huge zero page patchset? ;)

-- 
 Kirill A. Shutemov


* Re: [PATCH v3 6/7] mm: make clear_huge_page cache clear only around the fault address
From: Andrea Arcangeli @ 2012-08-16 18:29 UTC
  To: Kirill A. Shutemov
  Cc: linux-mips, Andi Kleen, Alex Shi, Robert Richter, linuxppc-dev,
	x86, Hugh Dickins, linux-kernel, Jan Beulich, Andy Lutomirski,
	Johannes Weiner, linux-mm, linux-sh, Ingo Molnar, Mel Gorman,
	H. Peter Anvin, sparclinux, Thomas Gleixner, Tim Chen,
	Andrew Morton, KAMEZAWA Hiroyuki
In-Reply-To: <20120816164356.GA30106@shutemov.name>

On Thu, Aug 16, 2012 at 07:43:56PM +0300, Kirill A. Shutemov wrote:
> Hm.. I think with static_key we can avoid cache overhead here. I'll try.

Could you elaborate on the static_key? Is it some sort of self
modifying code?

> Thanks for the review. Could you take a look at the huge zero page patchset? ;)

I've noticed that too, nice :). I'm checking some detail on the
wrprotect fault behavior but I'll comment there.


* Re: [PATCH v3 6/7] mm: make clear_huge_page cache clear only around the fault address
From: Kirill A. Shutemov @ 2012-08-16 18:37 UTC
  To: Andrea Arcangeli
  Cc: linux-mips, linux-sh, Jan Beulich, linux-mm, H. Peter Anvin,
	sparclinux, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <20120816182944.GN11188@redhat.com>

On Thu, Aug 16, 2012 at 08:29:44PM +0200, Andrea Arcangeli wrote:
> On Thu, Aug 16, 2012 at 07:43:56PM +0300, Kirill A. Shutemov wrote:
> > Hm.. I think with static_key we can avoid cache overhead here. I'll try.
> 
> Could you elaborate on the static_key? Is it some sort of self
> modifying code?

Runtime code patching. See Documentation/static-keys.txt. We can patch it
on sysctl.
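
For readers who have not used static keys, the pattern from that document looks roughly like the sketch below. To keep it compilable outside the kernel, the three static_key helpers here are plain-C stand-ins written for this example (the real ones patch the branch site at runtime instead of reading a variable), and nocache_key, nocache_enabled, set_nocache_sysctl() and pick_clear_path() are hypothetical names, not anything from the patchset.

```c
#include <stdbool.h>

/* Userspace stand-ins for illustration only; the kernel's versions
 * rewrite the branch instruction rather than test a counter. */
struct static_key { int enabled; };
#define STATIC_KEY_INIT_FALSE { 0 }

static bool static_key_false(struct static_key *key) { return key->enabled > 0; }
static void static_key_slow_inc(struct static_key *key) { key->enabled++; }
static void static_key_slow_dec(struct static_key *key) { key->enabled--; }

/* Hypothetical key guarding the cache-avoiding clear path. */
static struct static_key nocache_key = STATIC_KEY_INIT_FALSE;
static bool nocache_enabled;

/* Hypothetical sysctl handler: touch the key only when the value changes,
 * since each inc/dec is a slow (code patching) operation in the kernel. */
static void set_nocache_sysctl(bool on)
{
	if (on == nocache_enabled)
		return;
	if (on)
		static_key_slow_inc(&nocache_key);
	else
		static_key_slow_dec(&nocache_key);
	nocache_enabled = on;
}

/* Hypothetical hot path: returns 1 for the cache-avoiding clear,
 * 0 for the regular clear. */
static int pick_clear_path(void)
{
	if (static_key_false(&nocache_key))
		return 1;
	return 0;
}
```

The point of the pattern is that the hot path pays no memory load for the check; only the rare sysctl write is expensive.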

> 
> > Thanks for the review. Could you take a look at the huge zero page patchset? ;)
> 
> I've noticed that too, nice :). I'm checking some detail on the
> wrprotect fault behavior but I'll comment there.
> 

-- 
 Kirill A. Shutemov

^ permalink raw reply

* Re: [PATCH v3 6/7] mm: make clear_huge_page cache clear only around the fault address
From: Andrea Arcangeli @ 2012-08-16 19:42 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mips, linux-sh, Jan Beulich, linux-mm, H. Peter Anvin,
	sparclinux, Andi Kleen, Robert Richter, x86, Hugh Dickins,
	Ingo Molnar, Mel Gorman, Alex Shi, Thomas Gleixner,
	KAMEZAWA Hiroyuki, Tim Chen, linux-kernel, Andy Lutomirski,
	Johannes Weiner, Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <20120816183725.GA30284@shutemov.name>

On Thu, Aug 16, 2012 at 09:37:25PM +0300, Kirill A. Shutemov wrote:
> On Thu, Aug 16, 2012 at 08:29:44PM +0200, Andrea Arcangeli wrote:
> > On Thu, Aug 16, 2012 at 07:43:56PM +0300, Kirill A. Shutemov wrote:
> > > Hm.. I think with static_key we can avoid cache overhead here. I'll try.
> > 
> > Could you elaborate on the static_key? Is it some sort of self
> > modifying code?
> 
> Runtime code patching. See Documentation/static-keys.txt. We can patch it
> on sysctl.

I guessed it had to be patching the code, thanks for the
pointer. It looks like a perfect fit for this one, agreed.

^ permalink raw reply

* Re: [PATCH] scsi/ibmvscsi: /sys/class/scsi_host/hostX/config doesn't show any information
From: Robert Jennings @ 2012-08-16 19:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: olaf, Linda Xie, Brian J King, linux-scsi, James E.J. Bottomley,
	linuxppc-dev
In-Reply-To: <1343611985.21647.25.camel@pasglop>

On Sun, Jul 29, 2012 at 8:33 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Wed, 2012-07-18 at 18:49 +0200, olaf@aepfle.de wrote:
>> From: Linda Xie <lxiep@us.ibm.com>
>>
>> Expected result:
>> It should show something like this:
>> x1521p4:~ # cat /sys/class/scsi_host/host1/config
>> PARTITIONNAME='x1521p4'
>> NWSDNAME='X1521P4'
>> HOSTNAME='X1521P4'
>> DOMAINNAME='RCHLAND.IBM.COM'
>> NAMESERVERS='9.10.244.100 9.10.244.200'
>>
>> Actual result:
>> x1521p4:~ # cat /sys/class/scsi_host/host0/config
>> x1521p4:~ #
>>
>> This patch changes the size of the buffer used for transferring config
>> data to 4K. It was tested against the 2.6.19-rc2 tree.
>>
>> Reported by IBM during SLES11 beta testing:
>
> So this patch just seems to blindly replace all occurrences of PAGE_SIZE
> with HOST_PAGE_SIZE which is utterly wrong. Only one of those needs to
> be changed, the one passed to ibmvscsi_do_host_config() which is what's
> visible to the server, all the rest is just sysfs attributes and should
> remain as-is.
>
> Additionally (not even mentioning that there is no explanation as to
> what the real problem is anywhere in the changeset) I don't like the
> fix. The root of the problem is that the MAD header has a 16-bit length
> field, so writing 0x10000 (64K PAGE_SIZE) into it doesn't quite work.
>
> So in addition to a better comment, I would suggest a fix more like
> this:
>
> scsi/ibmvscsi: Fix host config length field overflow
>
> The length field in the host config packet is only 16 bits long, so
> passing it 0x10000 (64K, which is our standard PAGE_SIZE) doesn't
> work and results in an empty config from the server.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: <stable@vger.kernel.org>

Acked-by: Robert Jennings <rcj@linux.vnet.ibm.com>

Tested with an IBM i host and confirmed the fix.
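
For anyone following along, the overflow is easy to reproduce in userspace. The sketch below is only an illustration: struct fake_mad_header and the two set_length_*() helpers are made-up names, not the driver's structures. Only the 16-bit width of the length field and the 0xffff clamp come from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a header with a 16-bit length field. */
struct fake_mad_header {
	uint16_t length;
};

/* Store the length without clamping: the value is reduced modulo
 * 0x10000, so a 64K PAGE_SIZE ends up as 0. */
static void set_length_unclamped(struct fake_mad_header *h, int length)
{
	h->length = (uint16_t)length;
}

/* Store the length the way the fix does: clamp to 0xffff first. */
static void set_length_clamped(struct fake_mad_header *h, int length)
{
	if (length > 0xffff)
		length = 0xffff;
	h->length = (uint16_t)length;
}
```

With a 4K page both helpers store 4096; they only differ once the length no longer fits in 16 bits.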

> ---
>
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index 3a6c474..337e8b3 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -1541,6 +1541,9 @@ static int ibmvscsi_do_host_config(struct ibmvscsi_host_data *hostdata,
>
>         host_config = &evt_struct->iu.mad.host_config;
>
> +       /* The transport length field is only 16-bit */
> +       length = min(0xffff, length);
> +
>         /* Set up a lun reset SRP command */
>         memset(host_config, 0x00, sizeof(*host_config));
>         host_config->common.type = VIOSRP_HOST_CONFIG_TYPE;
>
>

^ permalink raw reply

* Re: therm_pm72 units, interface
From: Benjamin Herrenschmidt @ 2012-08-16 20:51 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: linuxppc-dev
In-Reply-To: <alpine.LNX.2.01.1208161718210.8111@frira.zrqbmnf.qr>


> If you have more things to print/offer via sysfs, I'm all for it.
> 
> The XsG5 really has (by looking into the casing): 1 PCI Fan,
> 6 center fans, 1 PSU intake and 1 PSU outblow fan (this last one
> seems rather slow-turning, but maybe that's normal).
> It is not quite clear which is which in the sysfs display.

The CPU intake and exhaust fans are the same; they are handled in groups
of 3, i.e. cpu0_* is the 3 fans on CPU 0, cpu1_* is the 3 fans on CPU 1.

The backside fan is supposed to blow on the U3 chip, I don't remember where
it's located, and the slots fan is the PCI one afaik. The PSU's own fan
isn't under our direct control.

> What I did figure out: at the PROM, fans run at what seems
> to be full speed (some 8000-9000 rpm?). Once Linux and therm_pm72
> are loaded, the fans settle down towards 4000 rpm, and if the machine
> has warmed up, that is then when it powers off. (The kernel is indeed
> 3.4. I now need to figure out how to place a new kernel on it without
> it powering off in between.)

You can try netbooting... OF netboot is limited to 4M sized zImages
which can be a bit tough nowadays, but modern yaboot can netboot larger
files. Another option is USB sticks.

> >> $ cd /sys/devices/temperature; grep '' *;
> >> backside_fan_pwm:32
> >> backside_temperature:54.000
> >> cpu0_current:34.423
> >> cpu0_exhaust_fan_rpm:5340
> >> cpu0_intake_fan_rpm:5340
> >> cpu0_temperature:72.889
> >> cpu0_voltage:1.252
> >> cpu1_current:34.179
> >> cpu1_exhaust_fan_rpm:4584
> >> cpu1_intake_fan_rpm:4584
> >> cpu1_temperature:68.526
> >> cpu1_voltage:1.259
> >> dimms_temperature:53.000
> >> grep: driver: Is a directory
> >> modalias:platform:temperature
> >> grep: power: Is a directory
> >> slots_fan_pwm:20
> >> slots_temperature:38.500
> >> grep: subsystem: Is a directory
> >> uevent:DRIVER=temperature
> >> uevent:OF_NAME=fan
> >> uevent:OF_FULLNAME=/u3@0,f8000000/i2c@f8001000/fan@15e
> >> uevent:OF_TYPE=fcu
> >> uevent:OF_COMPATIBLE_0=fcu
> >> uevent:OF_COMPATIBLE_N=1
> >> uevent:MODALIAS=of:NfanTfcuCfcu

Cheers,
Ben.

^ permalink raw reply

* Re: powerpc/perf: hw breakpoints return ENOSPC
From: Michael Neuling @ 2012-08-16 23:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Frederic Weisbecker, K Prasad, linux-kernel,
	linuxppc-dev
In-Reply-To: <1345117498.29668.23.camel@twins>

> > > > On this second syscall, fetch_bp_busy_slots() sets slots.pinned to be 1,
> > > > despite there being no breakpoint on this CPU.  This is because the call
> > > > to task_bp_pinned() checks all CPUs, rather than just the current CPU.
> > > > POWER7 only has one hardware breakpoint per CPU (ie. HBP_NUM=1), so we
> > > > return ENOSPC.
> > > 
> > > I think this comes from the ptrace legacy, we register a breakpoint on
> > > all cpus because when we migrate a task it cannot fail to migrate the
> > > breakpoint.
> > > 
> > > It's one of the things I hate most about the hwbp stuff as it relates to
> > > perf.
> > > 
> > > Frederic knows more...
> > 
> > Maybe I should wait for Frederic to respond but I'm not sure I
> > understand what you're saying.
> > 
> > I can see how using ptrace hw breakpoints and perf hw breakpoints at the
> > same time could be a problem, but I'm not sure how this would stop it.
> 
> ptrace uses perf for hwbp support so we're stuck with all kinds of
> stupid ptrace constraints.. or somesuch.

OK

> > Are you saying that we need to keep at least 1 slot free at all times,
> > so that we can use it for ptrace?
> 
> No, I'm saying perf-hwbp is weird because of ptrace, maybe the ptrace
> weirdness shouldn't live in perf-hwbp but in the ptrace-perf glue
> however..

OK.

> > Is "perf record -e mem:0x10000000 true" ever going to be able to work on
> > POWER7 with only one hw breakpoint resource per CPU?  
> 
> I think it should work... but I'm fairly sure it currently doesn't
> because of how things are done. 'perf record -ie mem:0x100... true'
> might just work.

Adding -i doesn't help. 

Mikey

^ permalink raw reply

* Re: powerpc/perf: hw breakpoints return ENOSPC
From: Michael Ellerman @ 2012-08-17  1:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Frederic Weisbecker, linux-kernel, linuxppc-dev,
	K Prasad, Ingo Molnar
In-Reply-To: <1345126502.29668.36.camel@twins>

On Thu, 2012-08-16 at 16:15 +0200, Peter Zijlstra wrote:
> On Fri, 2012-08-17 at 00:02 +1000, Michael Ellerman wrote:
> > You do want to guarantee that the task will always be subject to the
> > breakpoint, even if it moves cpus. So is there any way to guarantee that
> > other than reserving a breakpoint slot on every cpu ahead of time? 
> 
> That's not how regular perf works.. regular perf can overload hw
> resources at will and stuff is strictly per-cpu.
..
> For regular (!pinned) events, we'll RR the created events on the
> available hardware resources.

Yeah I know, but that isn't really the semantics you want for a
breakpoint. You don't want to sometimes have the breakpoint active and
sometimes not, it needs to be active at all times when the task is
running.

At the very least you want it to behave like a pinned event, ie. if it
can't be scheduled you get notified and can tell the user.

> HWBP does things completely different and reserves a slot over all CPUs
> for everything, thus stuff completely falls apart.

So it would seem :)

I guess my point was that reserving a slot on each cpu seems like a
reasonable way of guaranteeing that wherever the task goes we will be
able to install the breakpoint.
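
To make the interaction concrete, here is a toy model of that accounting. It is a sketch for discussion only, not the kernel's hw_breakpoint code: NR_CPUS is an arbitrary value, and pinned[], reserve_task_bp() and reserve_cpu_bp() are names made up for the example. Only HBP_NUM=1 on POWER7 and the -ENOSPC result come from this thread.

```c
#include <errno.h>

#define NR_CPUS 4	/* arbitrary, for illustration */
#define HBP_NUM 1	/* POWER7: one hardware breakpoint per CPU */

static int pinned[NR_CPUS];	/* slots currently reserved on each CPU */

/* Reserve a slot on every CPU, as for a breakpoint that must follow a
 * task wherever it migrates. Fails if any CPU is already full. */
static int reserve_task_bp(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (pinned[cpu] >= HBP_NUM)
			return -ENOSPC;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		pinned[cpu]++;
	return 0;
}

/* Reserve a slot on a single CPU only. */
static int reserve_cpu_bp(int cpu)
{
	if (pinned[cpu] >= HBP_NUM)
		return -ENOSPC;
	pinned[cpu]++;
	return 0;
}
```

With one slot per CPU, a single task-wide reservation uses up every CPU, so any later request fails with -ENOSPC even on a CPU where no breakpoint is actually installed.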

But obviously we need some way to make it play nice with perf.

cheers

^ permalink raw reply

* Re: [PATCH v3 2/2] powerpc: Uprobes port to powerpc
From: Ananth N Mavinakayanahalli @ 2012-08-17  5:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Srikar Dronamraju, peterz, lkml, Paul Mackerras, Anton Blanchard,
	Ingo Molnar, linuxppc-dev
In-Reply-To: <20120816152112.GA8874@redhat.com>

On Thu, Aug 16, 2012 at 05:21:12PM +0200, Oleg Nesterov wrote:

...

> > So, the arch agnostic code itself
> > takes care of this case...
> 
> Yes. I forgot about install_breakpoint()->is_swbp_insn() check which
> returns -ENOTSUPP, somehow I thought arch_uprobe_analyze_insn() does
> this.
> 
> > or am I missing something?
> 
> No, it is me.
> 
> > However, I see that we need a powerpc specific is_swbp_insn()
> > implementation since we will have to take care of all the trap variants.
> 
> Hmm, I am not sure. is_swbp_insn(insn), as it is used in the arch agnostic
> code, should only return true if insn == UPROBE_SWBP_INSN (just in case,
> this logic needs more fixes but this is offtopic).

I think it does...

> If powerpc has another insn(s) which can trigger powerpc's do_int3()
> counterpart, they should be rejected by arch_uprobe_analyze_insn().
> I think.

The insn that gets passed to arch_uprobe_analyze_insn() is copy_insn()'s
version, which is the file copy of the instruction. We should also take
care of the in-memory copy, in case gdb had inserted a breakpoint at the
same location, right? Updating is_swbp_insn() per-arch where needed will
take care of both the cases, 'cos it gets called before
arch_uprobe_analyze_insn() too.
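
As a rough idea of what "all the trap variants" means, a check along these lines would cover tw, twi, td and tdi. This is a sketch under my reading of the instruction encodings (primary opcode 2 for tdi, 3 for twi, and primary opcode 31 with extended opcode 4 for tw or 68 for td); is_trap_insn() and the two helper macros are hypothetical names, not existing kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Primary opcode is the top 6 bits of the instruction word; for X-form
 * instructions the extended opcode sits in bits 1..10 (counting from
 * the least significant bit). */
#define PPC_OPCODE(insn)	((uint32_t)(insn) >> 26)
#define PPC_XOP(insn)		(((uint32_t)(insn) >> 1) & 0x3ff)

/* Return true for any of the four trap instructions, whatever the
 * condition and register fields are. */
static bool is_trap_insn(uint32_t insn)
{
	switch (PPC_OPCODE(insn)) {
	case 2:		/* tdi */
	case 3:		/* twi */
		return true;
	case 31:	/* tw (xop 4) or td (xop 68) */
		return PPC_XOP(insn) == 4 || PPC_XOP(insn) == 68;
	default:
		return false;
	}
}
```

For example, 0x7fe00008 (the unconditional trap, tw 31,0,0) is matched, while 0x60000000 (nop) is not.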

> > I will need to update the patches based on changes being made by Oleg
> > and Sebastien for the single-step issues.
> 
> Perhaps you can do this in a separate change?
> 
> We need some (simple) changes in the arch agnostic code first, they
> should not break powerpc. These changes are still under discussion.
> Once we have "__weak  arch_uprobe_step*" you can reimplement these
> hooks and fix the problems with single-stepping.

OK. Agreed.

Ananth

^ permalink raw reply

* Re: [git pull] Please pull powerpc.git merge branch (updated)
From: Kumar Gala @ 2012-08-17  5:28 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <alpine.LFD.2.00.1208100806200.16855@right.am.freescale.net>

Ben,

Poke.  :)

- k

On Aug 10, 2012, at 8:07 AM, Kumar Gala wrote:

> Ben,
>
> Two updates from last week (one dts bug fix, one minor defconfig update)
>
> - k
>
> The following changes since commit 0d7614f09c1ebdbaa1599a5aba7593f147bf96ee:
>
>  Linux 3.6-rc1 (2012-08-02 16:38:10 -0700)
>
> are available in the git repository at:
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git merge
>
> for you to fetch changes up to 09a3017a585eb8567a7de15b426bb1dfb548bf0f:
>
>  powerpc/p4080ds: dts - add usb controller version info and port0 (2012-08-10 07:47:02 -0500)
>
> ----------------------------------------------------------------
> Jia Hongtao (1):
>      powerpc/fsl-pci: Only scan PCI bus if configured as a host
>
> Shengzhou Liu (1):
>      powerpc/p4080ds: dts - add usb controller version info and port0
>
> Zhao Chenhui (1):
>      powerpc/85xx: mpc85xx_defconfig - add VIA PATA support for MPC85xxCDS
>
> arch/powerpc/boot/dts/fsl/p4080si-post.dtsi |    7 +++++++
> arch/powerpc/configs/mpc85xx_defconfig      |    1 +
> arch/powerpc/sysdev/fsl_pci.c               |   13 ++++++++-----
> 3 files changed, 16 insertions(+), 5 deletions(-)

^ permalink raw reply

* RE: [PATCH v2 1/2] powerpc/mpic: Add Open-PIC global timer document
From: Wang Dongsheng-B40534 @ 2012-08-17  7:15 UTC (permalink / raw)
  To: Wood Scott-B07421
  Cc: Li Yang-R58472, devicetree-discuss@lists.ozlabs.org,
	paulus@samba.org, Gala Kumar-B11780,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <502AC0B0.5030801@freescale.com>

> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Wednesday, August 15, 2012 5:19 AM
> To: Wang Dongsheng-B40534
> Cc: Wood Scott-B07421; benh@kernel.crashing.org; paulus@samba.org;
> linuxppc-dev@lists.ozlabs.org; devicetree-discuss@lists.ozlabs.org; Gala
> Kumar-B11780; Li Yang-R58472
> Subject: Re: [PATCH v2 1/2] powerpc/mpic: Add Open-PIC global timer
> document
> 
> On 08/13/2012 09:40 PM, Wang Dongsheng-B40534 wrote:
> >>>> +Example 2:
> >>>>> +
> >>>>> +	timer: timer@010f0 {
> >>>>> +		compatible = "open-pic,global-timer";
> >>>>> +		device_type = "open-pic";
> >>>>> +		reg = <0x010f0 4 0x01100 0x100>;
> >>>>> +		interrupts = <0 0 3 0
> >>>>> +			      1 0 3 0
> >>>>> +			      2 0 3 0
> >>>>> +		              3 0 3 0>;
> >>>>> +	};
> >>>>
> >>>> 4-cell interrupt specifiers are specific to Freescale MPICs.  This
> >>>> means there's no way to describe the timer interrupt on a non-
> >> Freescale openpic.
> >>>> Again, I suggest we not bother with this in the absence of an actual
> >>>> need to support the timer on non-Freescale openpic in partitioned
> >> scenarios.
> >>>> The existing openpic node is sufficient to describe the
> >>>> hardware in the absence of partitioning.   We could have an
> >>>> "openpic-no-timer" property to indicate that we're describing it
> >>>> separately, so that the absence of a timer node isn't ambiguous as
> to
> >>>> whether it's an old tree or a partitioned scenario.  An fsl,mpic
> >>>> compatible would imply openpic-no-timer.
> >>>>
> >>>> Note that I believe many of the non-Freescale openpic nodes are
> going
> >>>> to be found on systems with real Open Firmware, so we can't go
> >>>> changing the device tree for them.
> >>> [Wang Dongsheng] In the Open-PIC specification, there are four timer.
> >>> 		interrupts = <0 0 3 0
> >>> 			      1 0 3 0
> >>> 			      2 0 3 0
> >>> 		              3 0 3 0>;
> >>>
> >>> The "interrupts" just let user know there are four timers. Usage
> based
> >> "interrupts"
> >>> binding to change dts.
> >>
> >> I can't understand the above or how it's a response to what I wrote.
> >>
> > [Wang Dongsheng] I mean this just to tell how many timers to support in
> Open-PIC
> > specification. If someone needs to write "interrupts" into dts, this
> must comply
> > with the specification of the interrupt to write. this is based on the
> pic driver
> > should be changed in different platforms.
> 
> My point (beyond that examples provided should be valid for *some*
> system) is there is no valid thing to put in the interrupts property
> here when the interrupt controller is not "fsl,mpic", so this doesn't
> work.
> 
[Wang Dongsheng] Fine, I will remove this document of Open-PIC global timer.
We only support mpic timer. Driver will be compatible with OPEN-PIC
specification. Let someone who cares about ordinary OpenPIC drivers add
support?

> -Scott

^ permalink raw reply

* Re: [PATCH v3 0/7] mv643xx.c: Add basic device tree support.
From: Arnd Bergmann @ 2012-08-17 12:13 UTC (permalink / raw)
  To: Ian Molton
  Cc: thomas.petazzoni, andrew, netdev, devicetree-discuss, ben.dooks,
	linuxppc-dev, David Miller, linux-arm-kernel
In-Reply-To: <5028D040.60604@codethink.co.uk>

On Monday 13 August 2012, Ian Molton wrote:
> On 10/08/12 11:49, Arnd Bergmann wrote:
> > On Thursday 09 August 2012, Ian Molton wrote:
> >>>  The driver
> >>> already knows all those offsets and they are always the same
> >>> for all variants of mv643xx, right?
> >> Yes, but it's not clean. And no amount of refactoring is
> >> really going to make a nice driver that also fits the ancient
> >> (and badly thought out) OF bindings.
> > In what way is it badly thought out, or not clean? The use of
> > underscores in the properties, and the way that the sram
> > is configured is problematic, I agree. But the way that
> > the three ports are addressed and how the PHY is found
> > seems quite clever.
> 
> It forces one to load the MDIO driver first, because it maps ALL the
> registers for both itself and all the ports, and the MDIO driver has no
> way of knowing how many ethernet blocks are present (I have a device
> here with two, and another with four). That's anywhere from 1 to 12
> ports, split across 1 to 4 address ranges, and there's a big gap in the
> address range between controllers 0,1 and 2,3. *ALL* the devices on the
> board are sharing ethernet block 0's MDIO bus. By pure luck it happens
> to work, because the blocks 2,3 have an alias of the MDIO registers from
> blocks 0,1.
> 
> Having the MDIO driver map the ethernet drivers memory is a terrible
> solution, IMO. Ethernet drivers should map their own memory, and that
> introduces the n-ports-per-block problem, because their address ranges
> overlap.
> 
> I think the best solution is to make each ethernet block register 3 ports.
> 
> the PPC code can simply generate different fixups so that instead of
> creating 3 devices, it creates one, with three ports.

Ok.

> Can we get some consensus on the right approach here? I'm loath to code
> this if it's going to be rejected.
> 
> I'd prefer the driver to be properly split so we don't have the MDIO
> driver mapping the ethernet drivers' address spaces, but if that's not
> going to be merged, I'm not feeling like doing the work for nothing.
> 
> If the driver is to use the overlapping-address mapped-by-the-mdio
> scheme, then so be it, but I could do with knowing.
> 
> Another point against the latter scheme is that the MDIO driver could
> sensibly be used (the block is identical) on the ArmadaXP, which has 4
> ethernet blocks rather than two, yet grouped in two pairs with a
> discontiguous address range.
> 
> I'd like to get this moved along as soon as possible though.

I don't object to any device driver changes, but I do want to make
sure that the bindings are sensible and can coexist with the
ones that have been used for the past 5 years.

Maybe you can move the binding for the ethernet parts out of the
marvell.txt file into the place you want to use for the new
bindings and then extend it to cover both the old and the new style.

	Arnd

^ permalink raw reply

* Re: therm_pm72 units, interface
From: Jan Engelhardt @ 2012-08-17 13:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1345150317.11751.38.camel@pasglop>


On Thursday 2012-08-16 22:51, Benjamin Herrenschmidt wrote:
>
>You can try netbooting... OF netboot is limited to 4M sized zImages
>which can be a bit tough nowadays, but modern yaboot can netboot larger
>files. Another option is USB sticks.

I can just exploit the fact that the machine will run for about an
hour after it has cooled down overnight.

Except that 3.5, as I already expected from the scary mails I read on
linux-kernel, looked dangerous to use. Here is a boot-time
hang.

---


Apple RackMac3,1 5.1.7f2 BootROM built on 12/09/04 at 10:58:45
Copyright 1994-2004 Apple Computer, Inc.
All Rights Reserved.

Welcome to Open Firmware, the system time and date is: 14:03:41 08/17/2012

To continue booting, type "mac-boot" and press return.
To shut down, type "shut-down" and press return.

 ok
0 > boot load-size=97d adler32=7e30648e

parsing <CHRP-BOOT>

evaluating <BOOT-SCRIPT>
DART table allocated at: c00000007f000000
Using PowerMac machine description
Found initrd at 0xc000000002400000:0xc000000002783e10
Found U3 memory controller & host bridge @ 0xf8000000 revision: 0x35
Mapped at 0xd000080080000000
Found a K2 mac-io controller, rev: 96, mapped at 0xd000080080050000
PowerMac motherboard: XServe G5
DART IOMMU initialized for U3 type chipset
bootconsole [udbg0] enabled
CPU maps initialized for 1 thread per core
Starting Linux PPC64 #1 SMP Wed Aug 15 21:49:59 UTC 2012 (4904750)
-----------------------------------------------------
ppc64_pft_size                = 0x0
physicalMemorySize            = 0x80000000
htab_address                  = 0xc00000007c000000
htab_hash_mask                = 0x3ffff
-----------------------------------------------------
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 3.5.2-5-ppc64 (geeko@buildhost) (gcc version 4.7.1 20120723 [gcc-4_7-branch revision 189773] (SUSE Linux) ) #1 SMP Wed Aug 15 21:49:59 UTC 2012 (4904750)
[boot]0012 Setup Arch
Found U3-AGP PCI host bridge.  Firmware bus number: 240->255
PCI host bridge /pci@0,f0000000  ranges:
 MEM 0x00000000f1000000..0x00000000f1ffffff -> 0x00000000f1000000
  IO 0x00000000f0000000..0x00000000f07fffff -> 0x0000000000000000
 MEM 0x00000000b0000000..0x00000000bfffffff -> 0x00000000b0000000
Can't get bus-range for /ht@0,f2000000, assume bus 0
Found U3-HT PCI host bridge.  Firmware bus number: 0->239
PCI host bridge /ht@0,f2000000 (primary) ranges:
via-pmu: Server Mode is disabled
PMU driver v2 initialized for Core99, firmware: 0c
nvram: Checking bank 0...
nvram: gen0=508, gen1=507
nvram: Active bank is: 0
nvram: OF partition at 0x410
nvram: XP partition at 0x1020
nvram: NR partition at 0x1120
Zone ranges:
  DMA      [mem 0x00000000-0x7fffffff]
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x00000000-0x7fffffff]
[boot]0015 Setup Done
PERCPU: Embedded 2 pages/cpu @c000000001700000 s85504 r0 d45568 u524288
Built 1 zonelists in Node order, mobility grouping on.  Total pages: 32740
Policy zone: DMA
Kernel command line: root=/dev/disk/by-label/silvroot sysrq=511 console=ttyPZ0,57600 console=tty0
PID hash table entries: 4096 (order: -1, 32768 bytes)
freeing bootmem node 0
Memory: 2012160k/2097152k available (17152k kernel code, 84992k reserved, 1984k data, 3243k bss, 5952k init)
Hierarchical RCU implementation.
        CONFIG_RCU_FANOUT set to non-default value of 32
        RCU dyntick-idle grace-period acceleration is enabled.
NR_IRQS:512 nr_irqs:512 16
mpic: Setting up MPIC " MPIC 1   " version 1.2 at 80040000, max 2 CPUs
mpic: ISU size: 120, shift: 7, mask: 7f
mpic: Initializing for 120 sources
mpic: Setting up MPIC " MPIC 2   " version 1.2 at f8040000, max 2 CPUs
mpic: ISU size: 124, shift: 7, mask: 7f
mpic: Initializing for 124 sources
/u3@0,f8000000/mpic@f8040000: hooking up to IRQ 56
clocksource: timebase mult[1e000005] shift[24] registered
Console: colour dummy device 80x25
console [tty0] enabled, bootconsole disabled
console [ttyPZ0] enabled
allocated 524288 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
pid_max: default: 32768 minimum: 301
Security Framework initialized
AppArmor: AppArmor initialized
Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
Mount-cache hash table entries: 4096
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys net_cls
Initializing cgroup subsys blkio
Initializing cgroup subsys perf_event
PowerMac SMP probe found 2 cpus
KeyWest i2c @0xf8001003 irq 16 /u3@0,f8000000/i2c@f8001000
 channel 0 bus <multibus>
 channel 1 bus <multibus>
KeyWest i2c @0x80018000 irq 26 /ht@0,f2000000/pci@3/mac-io@7/i2c@18000
 channel 0 bus <multibus>
PMU i2c /ht@0,f2000000/pci@3/mac-io@7/via-pmu@16000/pmu-i2c
 channel 1 bus <multibus>
 channel 2 bus <multibus>
Processor timebase sync using Pulsar i2c clock
mpic: requesting IPIs...
PPC970/FX/MP performance monitor hardware support registered
Brought up 2 CPUs
devtmpfs: initialized
NET: Registered protocol family 16
IBM eBus Device Driver
CPU Hotplug not supported by firmware - disabling.
PCI: Probing PCI hardware
PCI host bridge to bus 0000:f0
pci_bus 0000:f0: root bus resource [io  0x10000-0x80ffff] (bus address [0x0000-0x7fffff])
pci_bus 0000:f0: root bus resource [mem 0xf1000000-0xf1ffffff]
pci_bus 0000:f0: root bus resource [mem 0xb0000000-0xbfffffff]
IOMMU table initialized, virtual merging enabled
PCI host bridge to bus 0001:00
pci_bus 0001:00: root bus resource [io  0x820000-0xc1ffff] (bus address [0x0000-0x3fffff])
pci_bus 0001:00: root bus resource [mem 0xfa000000-0xffffffff]
pci_bus 0001:00: root bus resource [mem 0x80000000-0xafffffff]
pci_bus 0001:00: root bus resource [mem 0xc0000000-0xefffffff]
pci 0001:00:01.0: PCI bridge to [bus 06-06]
pci 0001:00:02.0: PCI bridge to [bus 07-07]
pci 0001:00:03.0: PCI bridge to [bus 01-01]
pci 0001:00:04.0: PCI bridge to [bus 02-02]
pci 0001:00:05.0: PCI bridge to [bus 03-03]
pci 0001:00:06.0: PCI bridge to [bus 04-04]
pci 0001:00:07.0: PCI bridge to [bus 05-05]
PCI: Cannot allocate resource region 1 of device 0001:06:03.0, will remap
pci 0001:00:01.0: BAR 13: assigned [io  0x821000-0x821fff]
pci 0001:06:03.0: BAR 1: assigned [io  0x821000-0x8210ff]
pci 0001:00:01.0: PCI bridge to [bus 06-06]
pci 0001:00:01.0:   bridge window [io  0x821000-0x821fff]
pci 0001:00:01.0:   bridge window [mem 0x90000000-0x9fffffff]
pci 0001:00:02.0: PCI bridge to [bus 07-07]
pci 0001:00:02.0:   bridge window [mem 0xa0000000-0xa00fffff]
pci 0001:00:03.0: PCI bridge to [bus 01-01]
pci 0001:00:03.0:   bridge window [mem 0x80000000-0x800fffff]
pci 0001:00:04.0: PCI bridge to [bus 02-02]
pci 0001:00:04.0:   bridge window [mem 0x80100000-0x801fffff]
pci 0001:00:05.0: PCI bridge to [bus 03-03]
pci 0001:00:05.0:   bridge window [mem 0x80200000-0x802fffff]
pci 0001:00:06.0: PCI bridge to [bus 04-04]
pci 0001:00:06.0:   bridge window [mem 0x80300000-0x805fffff]
pci 0001:00:07.0: PCI bridge to [bus 05-05]
pci 0001:00:07.0:   bridge window [mem 0x80600000-0x806fffff]
opal: Node not found
bio: create slab <bio-0> at 0
vgaarb: device added: PCI:0001:06:03.0,decodes=io+mem,owns=none,locks=none
vgaarb: loaded
vgaarb: bridge control possible 0001:06:03.0
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
Switching to clocksource timebase
AppArmor: AppArmor Filesystem Enabled
NET: Registered protocol family 2
IP route cache hash table entries: 16384 (order: 1, 131072 bytes)
TCP established hash table entries: 65536 (order: 4, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 4, 1048576 bytes)
TCP: Hash tables configured (established 65536 bind 65536)
TCP: reno registered
UDP hash table entries: 2048 (order: 0, 65536 bytes)
UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
NET: Registered protocol family 1
pci 0001:00:01.0: MSI quirk detected; subordinate MSI disabled
pci 0001:00:01.0: AMD8131 rev 12 detected; disabling PCI-X MMRBC
pci 0001:00:02.0: MSI quirk detected; subordinate MSI disabled
pci 0001:00:02.0: AMD8131 rev 12 detected; disabling PCI-X MMRBC
pci 0001:02:0b.0: enabling device (0000 -> 0002)
pci 0001:02:0b.1: enabling device (0000 -> 0002)
pci 0001:02:0b.2: enabling device (0004 -> 0006)
Unpacking initramfs...
Freeing initrd memory: 3648k freed
rtas_flash: no firmware flash support
Registering G5 CPU frequency driver
Frequency method: i2c/pfunc, Voltage method: i2c/pfunc
Low: 1800 Mhz, High: 2300 Mhz, Cur: 1800 MHz
audit: initializing netlink socket (disabled)
type=2000 audit(1345212264.179:1): initialized
HugeTLB registered 16 MB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
msgmni has been set to 3936
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
rpadlpar_io_init: partition not DLPAR capable
Using unsupported 640x480 ATY,Rage128y at 98000000, depth=8, pitch=640
Console: switching to colour frame buffer device 80x30
fb0: Open Firmware frame buffer device on /ht@0,f2000000/pci@1/ATY,Rage128y@3
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
pmac_zilog: 0.6 (Benjamin Herrenschmidt <benh@kernel.crashing.org>)
Registering IBM pSeries RNG driver
MacIO PCI driver attached to K2 chipset
0.00013020:ch-a: ttyPZ0 at MMIO 0x80013020 (irq = 22) is a Z85c30 ESCC - Serial port
0.00013000:ch-b: ttyPZ1 at MMIO 0x80013000 (irq = 23) is a Z85c30 ESCC - Serial port
Uniform Multi-Platform E-IDE driver
ide-pmac 0001:03:0d.0: enabling device (0014 -> 0016)
adb: starting probe task...
adb: finished probe task...
ide-pmac: Found Apple K2 ATA-6 controller (PCI), bus ID 3, irq 39
hda: MATSHITACD-RW CW-8124, ATAPI CD/DVD-ROM drive
hda: UDMA/33 mode selected
ide0 at 0xd0000800825e6000-0xd0000800825e6070,0xd0000800825e6160 on irq 39
mousedev: PS/2 mouse device common for all mice
PowerMac i2c bus pmu 2 registered
PowerMac i2c bus pmu 1 registered
PowerMac i2c bus mac-io 0 registered
PowerMac i2c bus u3 1 registered
i2c i2c-3: i2c-powermac: modalias failure on /u3@0,f8000000/i2c@f8001000/cereal@1c0
PowerMac i2c bus u3 0 registered
EDAC MC: Ver: 2.1.0
cpuidle: using governor ladder
cpuidle: using governor menu
TCP: cubic registered
NET: Registered protocol family 10
NET: Registered protocol family 15
lib80211: common routines for IEEE802.11 drivers
Key type dns_resolver registered
PM: Registered nosave memory: 000000007f000000 - 0000000080000000
registered taskstats version 1
input: PMU as /devices/virtual/input/input0
/home/abuild/rpmbuild/BUILD/kernel-ppc64-3.5.2/linux-3.5/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
Freeing unused kernel memory: 5952k freed
SCSI subsystem initialized
scsi0 : sata_svw
scsi1 : sata_svw
scsi2 : sata_svw
scsi3 : sata_svw
ata1: SATA max UDMA/133 mmio m8192@0x80600000 port 0x80600000 irq 17
ata2: SATA max UDMA/133 mmio m8192@0x80600000 port 0x80600100 irq 17
ata3: SATA max UDMA/133 mmio m8192@0x80600000 port 0x80600200 irq 17
ata4: SATA max UDMA/133 mmio m8192@0x80600000 port 0x80600300 irq 17
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7: HITACHI HDS7216SBSUN160G 0825QPY8WM, P22OAB8A, max UDMA/133
ata1.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      HITACHI HDS7216S P22O PQ: 0 ANSI: 5
ata2: SATA link down (SStatus 4 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata4: SATA link down (SStatus 4 SControl 300)
rdac: device handler registered
alua: device handler registered
hp_sw: device handler registered
emc: device handler registered
udevd[86]: starting version 182
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
BUG: soft lockup - CPU#0 stuck for 22s! [udevd:88]
BUG: soft lockup - CPU#1 stuck for 22s! [udevd:93]
NIP: c00000000000fc84 LR: c00000000000fc84 CTR: c000000000163c20
REGS: c0000000797af830 TRAP: 0901   Not tainted  (3.5.2-5-ppc64)
MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 24222484  XER: 20000000
SOFTE: 1
TASK = c000000079500260[93] 'udevd' THREAD: c0000000797ac000 CPU: 1
GPR00: c00000000000fc48 c0000000797afab0 c00000000121ee98 0000000000000800
GPR04: 0000000000000001 d000000001382a58 00000000aadf3316 0000000000000000
GPR08: 00000000aadf331b 0000000000000000 0000000080000001 0000000000000000
GPR12: 0000000024222482 c00000000fe20780
NIP [c00000000000fc84] .arch_local_irq_restore+0x74/0x90
LR [c00000000000fc84] .arch_local_irq_restore+0x74/0x90
Call Trace:
[c0000000797afab0] [c00000000000fc48] .arch_local_irq_restore+0x38/0x90 (unreliable)
[c0000000797afb20] [c0000000007901b4] ._raw_spin_unlock_irqrestore+0x34/0x80
[c0000000797afb90] [c0000000000d8bdc] .lowest_in_progress+0xbc/0xe0
[c0000000797afc20] [c0000000000d8c58] .async_synchronize_cookie_domain+0x58/0x170
[c0000000797afd00] [c0000000000d8dc8] .async_synchronize_full+0x38/0x70
[c0000000797afd90] [c00000000011d4b0] .SyS_init_module+0xf0/0x240
[c0000000797afe30] [c0000000000098dc] syscall_exit+0x0/0xa0
Instruction dump:
409e002c e92d0020 61298000 7d210164 38210070 e8010010 7c0803a6 4e800020
60000000 60000000 60000000 4bff3985 <60000000> 4bffffdc e92d0020 7d210164
Modules linked in: sd_mod crc_t10dif usbcore usb_common scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh sata_svw libata scsi_mod
Modules linked in: sd_mod crc_t10dif usbcore usb_common scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh sata_svw libata scsi_mod
NIP: c00000000079095c LR: c0000000007909a0 CTR: c000000000163c20
REGS: c000000079abb880 TRAP: 0901   Not tainted  (3.5.2-5-ppc64)
MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 24224484  XER: 20000000
SOFTE: 1
TASK = c000000079b45410[88] 'udevd' THREAD: c000000079ab8000 CPU: 0
GPR00: c0000000000d8b4c c000000079abbb00 c00000000121ee98 0000000000000001
GPR04: c0000000010cab88 d00000000169ab38 00000000aad506ec 0000000000000000
GPR08: 00000000aad506f4 0000000080000001 0000000000000000 0000000000000000
GPR12: 0000000024222482 c00000000fe20000
NIP [c00000000079095c] ._raw_spin_lock_irqsave+0x9c/0x110
LR [c0000000007909a0] ._raw_spin_lock_irqsave+0xe0/0x110
Call Trace:
[c000000079abbb00] [c000000079abbb90] 0xc000000079abbb90 (unreliable)
[c000000079abbb90] [c0000000000d8b4c] .lowest_in_progress+0x2c/0xe0
[c000000079abbc20] [c0000000000d8c58] .async_synchronize_cookie_domain+0x58/0x170
[c000000079abbd00] [c0000000000d8dc8] .async_synchronize_full+0x38/0x70
[c000000079abbd90] [c00000000011d4b0] .SyS_init_module+0xf0/0x240
[c000000079abbe30] [c0000000000098dc] syscall_exit+0x0/0xa0
Instruction dump:
81810008 eb81ffe0 eba1ffe8 7c0803a6 ebe1fff8 7d908120 4e800020 8b8d022a
4092004c 38600000 4b87f2bd 60000000 <7c210b78> e92d0000 e9290008 792a7fe1
INFO: rcu_sched self-detected stall on CPU { 1}  (t=6000 jiffies)
Call Trace:
[c000000079abb180] [c000000000014d84] .show_stack+0x74/0x1b0 (unreliable)
[c000000079abb230] [c00000000015da0c] .__rcu_pending+0x1fc/0x570
[c000000079abb2f0] [c00000000015ddc0] .rcu_pending+0x40/0xc0
[c000000079abb380] [c00000000015efe8] .rcu_check_callbacks+0x88/0x200
[c000000079abb420] [c0000000000b67f4] .update_process_times+0x44/0xa0
[c000000079abb4b0] [c00000000010e70c] .tick_sched_timer+0x7c/0x100
[c000000079abb550] [c0000000000d3a04] .__run_hrtimer+0xb4/0x2b0
[c000000079abb600] [c0000000000d4b28] .hrtimer_interrupt+0x138/0x3c0
[c000000079abb710] [c00000000001cbc0] .timer_interrupt+0x120/0x2f0
[c000000079abb7c0] [c000000000003cd8] decrementer_common+0x158/0x180
--- Exception: 901 at .arch_local_irq_restore+0x74/0x90
    LR = .arch_local_irq_restore+0x74/0x90
[c000000079abbab0] [c00000000000fc48] .arch_local_irq_restore+0x38/0x90 (unreliable)
[c000000079abbb20] [c0000000007901b4] ._raw_spin_unlock_irqrestore+0x34/0x80
[c000000079abbb90] [c0000000000d8bdc] .lowest_in_progress+0xbc/0xe0
[c000000079abbc20] [c0000000000d8c58] .async_synchronize_cookie_domain+0x58/0x170
[c000000079abbd00] [c0000000000d8dc8] .async_synchronize_full+0x38/0x70
[c000000079abbd90] [c00000000011d4b0] .SyS_init_module+0xf0/0x240
[c000000079abbe30] [c0000000000098dc] syscall_exit+0x0/0xa0
INFO: rcu_sched self-detected stall on CPU { 0}  (t=6000 jiffies)
Call Trace:
[c0000000797af180] [c000000000014d84] .show_stack+0x74/0x1b0 (unreliable)
[c0000000797af230] [c00000000015da0c] .__rcu_pending+0x1fc/0x570
[c0000000797af2f0] [c00000000015ddc0] .rcu_pending+0x40/0xc0
[c0000000797af380] [c00000000015efe8] .rcu_check_callbacks+0x88/0x200
[c0000000797af420] [c0000000000b67f4] .update_process_times+0x44/0xa0
[c0000000797af4b0] [c00000000010e70c] .tick_sched_timer+0x7c/0x100
[c0000000797af550] [c0000000000d3a04] .__run_hrtimer+0xb4/0x2b0
[c0000000797af600] [c0000000000d4b28] .hrtimer_interrupt+0x138/0x3c0
[c0000000797af710] [c00000000001cbc0] .timer_interrupt+0x120/0x2f0
[c0000000797af7c0] [c000000000003cd8] decrementer_common+0x158/0x180
--- Exception: 901 at .arch_local_irq_restore+0x74/0x90
    LR = .arch_local_irq_restore+0x74/0x90
[c0000000797afab0] [c00000000000fc48] .arch_local_irq_restore+0x38/0x90 (unreliable)
[c0000000797afb20] [c0000000007901b4] ._raw_spin_unlock_irqrestore+0x34/0x80
[c0000000797afb90] [c0000000000d8bdc] .lowest_in_progress+0xbc/0xe0
[c0000000797afc20] [c0000000000d8c58] .async_synchronize_cookie_domain+0x58/0x170
[c0000000797afd00] [c0000000000d8dc8] .async_synchronize_full+0x38/0x70
[c0000000797afd90] [c00000000011d4b0] .SyS_init_module+0xf0/0x240
[c0000000797afe30] [c0000000000098dc] syscall_exit+0x0/0xa0


* Re: [PATCH] powerpc:Update Integrated Flash controller device tree bindings
From: Kumar Gala @ 2012-08-17 14:07 UTC (permalink / raw)
  To: Prabhakar Kushwaha; +Cc: linuxppc-dev
In-Reply-To: <1345043632-28535-1-git-send-email-prabhakar@freescale.com>


On Aug 15, 2012, at 10:13 AM, Prabhakar Kushwaha wrote:

> Freescale's Integrated Flash controller (IFC) may have one or two
> interrupts. In case of single interrupt line, it will cover all IFC
> interrupts.
> 
> Update this information in IFC device tree bindings
> 
> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
> ---
> Base upon git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git
> Branch next
> 
> .../devicetree/bindings/powerpc/fsl/ifc.txt        |    9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)

applied to next

- k


* Re: [PATCH] powerpc/mpc85xx:Add new ext fields to Integrated FLash Controller
From: Kumar Gala @ 2012-08-17 14:07 UTC (permalink / raw)
  To: Prabhakar Kushwaha; +Cc: linuxppc-dev, York Sun
In-Reply-To: <1345089502-23979-1-git-send-email-prabhakar@freescale.com>


On Aug 15, 2012, at 10:58 PM, Prabhakar Kushwaha wrote:

> Freescale's Integrated Flash controller(IFC) v1.1.0 supports 40 bit
> address bus width. 
> In case more than 32 bit address is used, the EXT registers should be set.
> 
> Add support of ext registers.
> 
> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
> Signed-off-by: York Sun <yorksun@freescale.com>
> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
> ---
> Base upon git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git
> Branch next
> 
> arch/powerpc/include/asm/fsl_ifc.h |   14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)

applied to next

- k


* Re: [v2 PATCH 1/1] booke/wdt: some ioctls do not return values properly
From: Kumar Gala @ 2012-08-17 14:07 UTC (permalink / raw)
  To: Tiejun Chen; +Cc: linuxppc-dev, B04825, linux-watchdog
In-Reply-To: <1344304780-13555-1-git-send-email-tiejun.chen@windriver.com>


On Aug 6, 2012, at 8:59 PM, Tiejun Chen wrote:

> Fix some booke wdt ioctls return value error.
> 
> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
> ---
> drivers/watchdog/booke_wdt.c |    7 +++----
> 1 files changed, 3 insertions(+), 4 deletions(-)

applied to merge

- k


