LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v5 01/45] percpu_rwlock: Introduce the global reader-writer lock backend
From: Michel Lespinasse @ 2013-01-24  4:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-doc, peterz, fweisbec, linux-kernel, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rjw, namhyung, tglx, linux-arm-kernel, netdev, oleg, sbw,
	Srivatsa S. Bhat, tj, akpm, linuxppc-dev
In-Reply-To: <1358883152.21576.55.camel@gandalf.local.home>

On Tue, Jan 22, 2013 at 11:32 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2013-01-22 at 13:03 +0530, Srivatsa S. Bhat wrote:
>> A straight-forward (and obvious) algorithm to implement Per-CPU Reader-Writer
>> locks can also lead to too many deadlock possibilities which can make it very
>> hard/impossible to use. This is explained in the example below, which helps
>> justify the need for a different algorithm to implement flexible Per-CPU
>> Reader-Writer locks.
>>
>> We can use global rwlocks as shown below safely, without fear of deadlocks:
>>
>> Readers:
>>
>>          CPU 0                                CPU 1
>>          ------                               ------
>>
>> 1.    spin_lock(&random_lock);             read_lock(&my_rwlock);
>>
>>
>> 2.    read_lock(&my_rwlock);               spin_lock(&random_lock);
>>
>>
>> Writer:
>>
>>          CPU 2:
>>          ------
>>
>>        write_lock(&my_rwlock);
>>
>
> I thought global locks are now fair. That is, a reader will block if a
> writer is waiting. Hence, the above should deadlock on the current
> rwlock_t types.

I believe you are mistaken here. struct rw_semaphore is fair (and
blocking), but rwlock_t is unfair. The reason we can't easily make
rwlock_t fair is because tasklist_lock currently depends on the
rwlock_t unfairness - tasklist_lock readers typically don't disable
local interrupts, and tasklist_lock may be acquired again from within
an interrupt, which would deadlock if rwlock_t was fair and a writer
was queued by the time the interrupt is processed.

> We need to fix those locations (or better yet, remove all rwlocks ;-)

tasklist_lock is the main remaining user. I'm not sure about removing
rwlock_t, but I would like to at least make it fair somehow :)

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

^ permalink raw reply

* Re: [v0][PATCH 1/1] powerpc/book3e: disable interrupt after preempt_schedule_irq
From: Benjamin Herrenschmidt @ 2013-01-24  4:02 UTC (permalink / raw)
  To: Tiejun Chen; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1357469374-25719-1-git-send-email-tiejun.chen@windriver.com>

On Sun, 2013-01-06 at 18:49 +0800, Tiejun Chen wrote:
> In preempt case current arch_local_irq_restore() from
> preempt_schedule_irq() may enable hard interrupt but we really
> should disable interrupts when we return from the interrupt,
> and so that we don't get interrupted after loading SRR0/1.

This is an excellent catch, thanks !

Cheers,
Ben.

> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
> ---
>  arch/powerpc/kernel/entry_64.S |   13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index e9a906c..4e1de34 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -662,6 +662,19 @@ resume_kernel:
>  	ld	r4,TI_FLAGS(r9)
>  	andi.	r0,r4,_TIF_NEED_RESCHED
>  	bne	1b
> +
> +	/*
> +	 * arch_local_irq_restore() from preempt_schedule_irq above may
> +	 * enable hard interrupt but we really should disable interrupts
> +	 * when we return from the interrupt, and so that we don't get
> +	 * interrupted after loading SRR0/1.
> +	 */
> +#ifdef CONFIG_PPC_BOOK3E
> +	wrteei	0
> +#else
> +	ld	r10,PACAKMSR(r13) /* Get kernel MSR without EE */
> +	mtmsrd	r10,1		  /* Update machine state */
> +#endif /* CONFIG_PPC_BOOK3E */
>  #endif /* CONFIG_PREEMPT */
>  
>  	.globl	fast_exc_return_irq

^ permalink raw reply

* Re: [PATCH powerpc ] Avoid debug_smp_processor_id() check in arch_spin_unlock_wait()
From: Benjamin Herrenschmidt @ 2013-01-24  3:47 UTC (permalink / raw)
  To: Li Zhong; +Cc: Paul Mackerras, Paul E. McKenney, PowerPC email list
In-Reply-To: <1357808418.10378.16.camel@ThinkPad-T5421.cn.ibm.com>

On Thu, 2013-01-10 at 17:00 +0800, Li Zhong wrote:
> Use local_paca directly in arch_spin_unlock_wait(), as all processors have the
> same value for the field shared_proc, so we don't need care racy here.

Of course that won't build if CONFIG_PPC_SPLPAR isn't defined...

Maybe you could change the definition of the SHARED_PROCESSOR
macro itself. The only possible "risk" would be a stale lppaca
if we preempt & hot unplug the CPU at the wrong time (provided
we no longer stop_machine either), I suppose if that's a real
concern we could delay freeing of lppaca's via RCU or such.

Ben.

> Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
> ---
>  arch/powerpc/lib/locks.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
> index bb7cfec..850bea6 100644
> --- a/arch/powerpc/lib/locks.c
> +++ b/arch/powerpc/lib/locks.c
> @@ -72,7 +72,7 @@ void arch_spin_unlock_wait(arch_spinlock_t *lock)
>  {
>  	while (lock->slock) {
>  		HMT_low();
> -		if (SHARED_PROCESSOR)
> +		if (local_paca->lppaca_ptr->shared_proc)
>  			__spin_yield(lock);
>  	}
>  	HMT_medium();

^ permalink raw reply

* Re: [PATCH] perf: Fix compile warnings in tests/attr.c
From: Michael Ellerman @ 2013-01-24  2:42 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Anton Blanchard, linux-kernel, linuxppc-dev, paulus, acme, mingo,
	Jiri Olsa
In-Reply-To: <20130123185733.GA22906@us.ibm.com>

On Wed, 2013-01-23 at 10:57 -0800, Sukadev Bhattiprolu wrote:
> Michael Ellerman [michael@ellerman.id.au] wrote:
> | > | make: *** [tests/attr.o] Error 1
> | > | 
> | > | i386 compiles fine
> | > 
> | > __u64 is 'unsigned long long' on x86 and PRIu64 is 'llu' which is fine.
> | > 
> | > __u64 is 'unsigned long' on Power and PRIu64 is 'lu' which is again fine.
> | > 
> | > But __u64 is 'unsigned long long' on x86_64, but PRIu64 is '%lu' bc __WORDSIZE
> | > is 64.
> | 
> | 
> | This is a bit of a mess, but let me see if I can help explain it.
> 
> Yes it is :-) thanks for explaining it.
> 
> | 
> | The root of the problem is that you're mixing up the kernel type __u64,
> | with the userspace format specifier PRIu64.
> 
> struct perf_event_attr is shared with user space and is using __u64. Should
> it use uint64_t instead ?

Absolutely not. That's the whole reason we have __u64, it's so we can
define it in the kernel but share it with userspace, and not have it
conflict with any userspace types.


> | PRIu64 is the format specifier for printing a uint64_t, it _may_ also be
> | the right specifier for a __u64, but there's no guarantee of that - as
> | you have discovered.
> | 
> | Inside the kernel both x86 and powerpc use unsigned long long always, in
> | 32-bit and 64-bit code. That means in the kernel we can always use %llu.
> | 
> | On x86 that definition is also exported to userspace, so on x86 __u64 is
> | always unsigned long long. As you noticed this potentially differs from
> | uint64_t, which can be confusing. However it means in x86 userspace code
> | you can always print a __u64 with %llu.
> | 
> | On powerpc we default to using definitions that match userspace, so
> | __u64 changes depending on your wordsize, and so you must use PRIu64
> | etc. to print them.
> 
> Well, using __u64 and PRIu64 seems breaks x86-64...

Yes, see the previous paragraph.

> | There is however support in recent powerpc kernels to switch to using
> | unsigned long long even on 64-bit. See commit 2c9c6ce.
> | 
> | You need to define __SANE_USERSPACE_TYPES__ before including types.h.
> | Then you can always use %llu to print __u64.
> 
> but __SANE_USERSPACE_TYPES__ with __u64 and %llu seems to work on x86,
> x86-64, powerpc.

> Will modify my patch to add __SANE_USERSPACE_TYPES__ but leave the %llu
> as is.

Right. It will only work with newer kernel headers, but that's basically
unsolvable.

cheers

^ permalink raw reply

* Re: [PATCH Bug fix 0/5] Bug fix for physical memory hot-remove.
From: Simon Jeons @ 2013-01-24  1:48 UTC (permalink / raw)
  To: Tang Chen
  Cc: linux-mm, paulus, hpa, cl, sfr, x86, linux-acpi, isimatu.yasuaki,
	linfeng, mgorman, kosaki.motohiro, rientjes, len.brown, jiang.liu,
	wency, julian.calaby, glommer, wujianguo, yinghai, laijs,
	linux-kernel, minchan.kim, akpm, linuxppc-dev
In-Reply-To: <51009003.5020901@cn.fujitsu.com>

On Thu, 2013-01-24 at 09:36 +0800, Tang Chen wrote:
> On 01/24/2013 08:35 AM, Simon Jeons wrote:
> > On Wed, 2013-01-23 at 21:17 +0800, Tang Chen wrote:
> >> On 01/23/2013 08:29 PM, Simon Jeons wrote:
> >>> Hi Tang,
> >>>
> >>> I remember your big physical memory hot-remove patchset has already
> >>> merged by Andrew, but where I can find it? Could you give me git tree
> >>> address?
> >>
> >> Hi Simon,
> >>
> >> You can find all the physical memory hot-remove patches and related bugfix
> >> patches from the following url:
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm
> >
> > ~/linux-next$ git remote -v
> > origin git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > (fetch)
> > origin git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > (push)
> > ~/linux-next$ git branch
> > * akpm
> >    master
> > ~/linux-next$ wc -l mm/memory_hotplug.c
> > 1173 mm/memory_hotplug.c
> >
> > I still can't find it. :(
> 
> Hi Simon,
> 
> It's weird, in my akpm git log, I can find the following commit.
> 
> commit deed0460e01b3968f2cf46fb94851936535b7e0d
> Author: Tang Chen <tangchen@cn.fujitsu.com>
> Date:   Sat Jan 19 11:07:13 2013 +1100
> 
>      memory-hotplug: do not allocate pgdat if it was not freed when offline.
> 
> And there are 15 related patches following it. And above it, there are 
> some more related bugfix
> from me and Andrew. They are all in -mm tree.
> 
> This is what I did:
> 
>    $ git remote add next 
> http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
>    $ git remote update next
>    $ git checkout -b next-akpm remotes/next/akpm
> 
> Please pull your tree and try again. :)

Got it. Thanks! :)

> 
> Thanks. :)
> 
> 
> >
> >>
> >>
> >> Thanks. :)
> >>
> >> --
> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> >> the body to majordomo@kvack.org.  For more info on Linux MM,
> >> see: http://www.linux-mm.org/ .
> >> Don't email:<a href=mailto:"dont@kvack.org">  email@kvack.org</a>
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >

^ permalink raw reply

* Re: [PATCH Bug fix 0/5] Bug fix for physical memory hot-remove.
From: Tang Chen @ 2013-01-24  1:36 UTC (permalink / raw)
  To: Simon Jeons
  Cc: linux-mm, paulus, hpa, cl, sfr, x86, linux-acpi, isimatu.yasuaki,
	linfeng, mgorman, kosaki.motohiro, rientjes, len.brown, jiang.liu,
	wency, julian.calaby, glommer, wujianguo, yinghai, laijs,
	linux-kernel, minchan.kim, akpm, linuxppc-dev
In-Reply-To: <1358987715.3351.3.camel@kernel>

On 01/24/2013 08:35 AM, Simon Jeons wrote:
> On Wed, 2013-01-23 at 21:17 +0800, Tang Chen wrote:
>> On 01/23/2013 08:29 PM, Simon Jeons wrote:
>>> Hi Tang,
>>>
>>> I remember your big physical memory hot-remove patchset has already
>>> merged by Andrew, but where I can find it? Could you give me git tree
>>> address?
>>
>> Hi Simon,
>>
>> You can find all the physical memory hot-remove patches and related bugfix
>> patches from the following url:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm
>
> ~/linux-next$ git remote -v
> origin git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> (fetch)
> origin git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> (push)
> ~/linux-next$ git branch
> * akpm
>    master
> ~/linux-next$ wc -l mm/memory_hotplug.c
> 1173 mm/memory_hotplug.c
>
> I still can't find it. :(

Hi Simon,

It's weird, in my akpm git log, I can find the following commit.

commit deed0460e01b3968f2cf46fb94851936535b7e0d
Author: Tang Chen <tangchen@cn.fujitsu.com>
Date:   Sat Jan 19 11:07:13 2013 +1100

     memory-hotplug: do not allocate pgdat if it was not freed when offline.

And there are 15 related patches following it. And above it, there are 
some more related bugfix
from me and Andrew. They are all in -mm tree.

This is what I did:

   $ git remote add next 
http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
   $ git remote update next
   $ git checkout -b next-akpm remotes/next/akpm

Please pull your tree and try again. :)

Thanks. :)


>
>>
>>
>> Thanks. :)
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email:<a href=mailto:"dont@kvack.org">  email@kvack.org</a>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* [PATCH 8/8] mm: remove free_area_cache
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>

---
 arch/arm/mm/mmap.c               |    2 --
 arch/arm64/mm/mmap.c             |    2 --
 arch/mips/mm/mmap.c              |    2 --
 arch/powerpc/mm/mmap_64.c        |    2 --
 arch/s390/mm/mmap.c              |    4 ----
 arch/sparc/kernel/sys_sparc_64.c |    2 --
 arch/tile/mm/mmap.c              |    2 --
 arch/x86/ia32/ia32_aout.c        |    2 --
 arch/x86/mm/mmap.c               |    2 --
 fs/binfmt_aout.c                 |    2 --
 fs/binfmt_elf.c                  |    2 --
 include/linux/mm_types.h         |    3 ---
 include/linux/sched.h            |    2 --
 kernel/fork.c                    |    4 ----
 mm/mmap.c                        |   28 ----------------------------
 mm/nommu.c                       |    4 ----
 mm/util.c                        |    1 -
 17 files changed, 0 insertions(+), 66 deletions(-)

diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index 10062ceadd1c..0c6356255fe3 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -181,11 +181,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base(random_factor);
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 7c7be7855638..8ed6cb1a900f 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -90,11 +90,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 EXPORT_SYMBOL_GPL(arch_pick_mmap_layout);
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index d9be7540a6be..f4e63c29d044 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -158,11 +158,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base(random_factor);
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
diff --git a/arch/powerpc/mm/mmap_64.c b/arch/powerpc/mm/mmap_64.c
index 67a42ed0d2fc..cb8bdbe4972f 100644
--- a/arch/powerpc/mm/mmap_64.c
+++ b/arch/powerpc/mm/mmap_64.c
@@ -92,10 +92,8 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index c59a5efa58b1..f2a462625c9e 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -91,11 +91,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
@@ -173,11 +171,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = s390_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = s390_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 708bc29d36a8..f3c169f9d3a1 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -290,7 +290,6 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	    sysctl_legacy_va_layout) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		/* We know it's 32-bit */
 		unsigned long task_size = STACK_TOP32;
@@ -302,7 +301,6 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 
 		mm->mmap_base = PAGE_ALIGN(task_size - gap - random_factor);
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
diff --git a/arch/tile/mm/mmap.c b/arch/tile/mm/mmap.c
index f96f4cec602a..d67d91ebf63e 100644
--- a/arch/tile/mm/mmap.c
+++ b/arch/tile/mm/mmap.c
@@ -66,10 +66,8 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (!is_32bit || rlimit(RLIMIT_STACK) == RLIM_INFINITY) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base(mm);
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index a703af19c281..3b3558577642 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -309,8 +309,6 @@ static int load_aout_binary(struct linux_binprm *bprm)
 		(current->mm->start_data = N_DATADDR(ex));
 	current->mm->brk = ex.a_bss +
 		(current->mm->start_brk = N_BSSADDR(ex));
-	current->mm->free_area_cache = TASK_UNMAPPED_BASE;
-	current->mm->cached_hole_size = 0;
 
 	retval = setup_arg_pages(bprm, IA32_STACK_TOP, EXSTACK_DEFAULT);
 	if (retval < 0) {
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 845df6835f9f..62c29a5bfe26 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -115,10 +115,8 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = mmap_legacy_base();
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 6043567b95c2..692e75ca6415 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -256,8 +256,6 @@ static int load_aout_binary(struct linux_binprm * bprm)
 		(current->mm->start_data = N_DATADDR(ex));
 	current->mm->brk = ex.a_bss +
 		(current->mm->start_brk = N_BSSADDR(ex));
-	current->mm->free_area_cache = current->mm->mmap_base;
-	current->mm->cached_hole_size = 0;
 
 	retval = setup_arg_pages(bprm, STACK_TOP, EXSTACK_DEFAULT);
 	if (retval < 0) {
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 0c42cdbabecf..e2087dea9c1e 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -730,8 +730,6 @@ static int load_elf_binary(struct linux_binprm *bprm)
 
 	/* Do this so that we can load the interpreter, if need be.  We will
 	   change some of these later */
-	current->mm->free_area_cache = current->mm->mmap_base;
-	current->mm->cached_hole_size = 0;
 	retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP),
 				 executable_stack);
 	if (retval < 0) {
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index f8f5162a3571..e50eb047ea8a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -329,12 +329,9 @@ struct mm_struct {
 	unsigned long (*get_unmapped_area) (struct file *filp,
 				unsigned long addr, unsigned long len,
 				unsigned long pgoff, unsigned long flags);
-	void (*unmap_area) (struct mm_struct *mm, unsigned long addr);
 #endif
 	unsigned long mmap_base;		/* base of mmap area */
 	unsigned long task_size;		/* size of task vm space */
-	unsigned long cached_hole_size; 	/* if non-zero, the largest hole below free_area_cache */
-	unsigned long free_area_cache;		/* first hole of size cached_hole_size or larger */
 	unsigned long highest_vm_end;		/* highest vma end address */
 	pgd_t * pgd;
 	atomic_t mm_users;			/* How many users with user space? */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 206bb089c06b..fa7e0a60ebe9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -366,8 +366,6 @@ extern unsigned long
 arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 			  unsigned long len, unsigned long pgoff,
 			  unsigned long flags);
-extern void arch_unmap_area(struct mm_struct *, unsigned long);
-extern void arch_unmap_area_topdown(struct mm_struct *, unsigned long);
 #else
 static inline void arch_pick_mmap_layout(struct mm_struct *mm) {}
 #endif
diff --git a/kernel/fork.c b/kernel/fork.c
index a31b823b3c2d..bdf61755ef4a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -364,8 +364,6 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 	mm->locked_vm = 0;
 	mm->mmap = NULL;
 	mm->mmap_cache = NULL;
-	mm->free_area_cache = oldmm->mmap_base;
-	mm->cached_hole_size = ~0UL;
 	mm->map_count = 0;
 	cpumask_clear(mm_cpumask(mm));
 	mm->mm_rb = RB_ROOT;
@@ -539,8 +537,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p)
 	mm->nr_ptes = 0;
 	memset(&mm->rss_stat, 0, sizeof(mm->rss_stat));
 	spin_lock_init(&mm->page_table_lock);
-	mm->free_area_cache = TASK_UNMAPPED_BASE;
-	mm->cached_hole_size = ~0UL;
 	mm_init_aio(mm);
 	mm_init_owner(mm, p);
 
diff --git a/mm/mmap.c b/mm/mmap.c
index f54b235f29a9..532f447879d4 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1800,15 +1800,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 }
 #endif	
 
-void arch_unmap_area(struct mm_struct *mm, unsigned long addr)
-{
-	/*
-	 * Is this a new hole at the lowest possible address?
-	 */
-	if (addr >= TASK_UNMAPPED_BASE && addr < mm->free_area_cache)
-		mm->free_area_cache = addr;
-}
-
 /*
  * This mmap-allocator allocates new areas top-down from below the
  * stack's low limit (the base):
@@ -1865,19 +1856,6 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 }
 #endif
 
-void arch_unmap_area_topdown(struct mm_struct *mm, unsigned long addr)
-{
-	/*
-	 * Is this a new hole at the highest possible address?
-	 */
-	if (addr > mm->free_area_cache)
-		mm->free_area_cache = addr;
-
-	/* dont allow allocations above current base */
-	if (mm->free_area_cache > mm->mmap_base)
-		mm->free_area_cache = mm->mmap_base;
-}
-
 unsigned long
 get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags)
@@ -2276,7 +2254,6 @@ detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
 {
 	struct vm_area_struct **insertion_point;
 	struct vm_area_struct *tail_vma = NULL;
-	unsigned long addr;
 
 	insertion_point = (prev ? &prev->vm_next : &mm->mmap);
 	vma->vm_prev = NULL;
@@ -2293,11 +2270,6 @@ detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
 	} else
 		mm->highest_vm_end = prev ? prev->vm_end : 0;
 	tail_vma->vm_next = NULL;
-	if (mm->unmap_area == arch_unmap_area)
-		addr = prev ? prev->vm_end : mm->mmap_base;
-	else
-		addr = vma ?  vma->vm_start : mm->mmap_base;
-	mm->unmap_area(mm, addr);
 	mm->mmap_cache = NULL;		/* Kill the cache. */
 }
 
diff --git a/mm/nommu.c b/mm/nommu.c
index 79c3cac87afa..b5535ff2f9d1 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1852,10 +1852,6 @@ unsigned long arch_get_unmapped_area(struct file *file, unsigned long addr,
 	return -ENOMEM;
 }
 
-void arch_unmap_area(struct mm_struct *mm, unsigned long addr)
-{
-}
-
 void unmap_mapping_range(struct address_space *mapping,
 			 loff_t const holebegin, loff_t const holelen,
 			 int even_cows)
diff --git a/mm/util.c b/mm/util.c
index c55e26b17d93..4c19aa6a1b43 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -293,7 +293,6 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 {
 	mm->mmap_base = TASK_UNMAPPED_BASE;
 	mm->get_unmapped_area = arch_get_unmapped_area;
-	mm->unmap_area = arch_unmap_area;
 }
 #endif
 
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

Update the powerpc slice_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>

---
 arch/powerpc/mm/slice.c |  123 ++++++++++++++++++++++++++++++-----------------
 1 files changed, 78 insertions(+), 45 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 999a74f25ebe..3e99c149271a 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -237,36 +237,69 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 #endif
 }
 
+/*
+ * Compute which slice addr is part of;
+ * set *boundary_addr to the start or end boundary of that slice
+ * (depending on 'end' parameter);
+ * return boolean indicating if the slice is marked as available in the
+ * 'available' slice_mark.
+ */
+static bool slice_scan_available(unsigned long addr,
+				 struct slice_mask available,
+				 int end,
+				 unsigned long *boundary_addr)
+{
+	unsigned long slice;
+	if (addr < SLICE_LOW_TOP) {
+		slice = GET_LOW_SLICE_INDEX(addr);
+		*boundary_addr = (slice + end) << SLICE_LOW_SHIFT;
+		return !!(available.low_slices & (1u << slice));
+	} else {
+		slice = GET_HIGH_SLICE_INDEX(addr);
+		*boundary_addr = (slice + end) ?
+			((slice + end) << SLICE_HIGH_SHIFT) : SLICE_LOW_TOP;
+		return !!(available.high_slices & (1u << slice));
+	}
+}
+
 static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 					      unsigned long len,
 					      struct slice_mask available,
 					      int psize)
 {
-	struct vm_area_struct *vma;
-	unsigned long addr;
-	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
+	unsigned long addr, found, next_end;
+	struct vm_unmapped_area_info info;
 
-	addr = TASK_UNMAPPED_BASE;
-
-	for (;;) {
-		addr = _ALIGN_UP(addr, 1ul << pshift);
-		if ((TASK_SIZE - len) < addr)
-			break;
-		vma = find_vma(mm, addr);
-		BUG_ON(vma && (addr >= vma->vm_end));
+	info.flags = 0;
+	info.length = len;
+	info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
+	info.align_offset = 0;
 
-		mask = slice_range_to_mask(addr, len);
-		if (!slice_check_fit(mask, available)) {
-			if (addr < SLICE_LOW_TOP)
-				addr = _ALIGN_UP(addr + 1,  1ul << SLICE_LOW_SHIFT);
-			else
-				addr = _ALIGN_UP(addr + 1,  1ul << SLICE_HIGH_SHIFT);
+	addr = TASK_UNMAPPED_BASE;
+	while (addr < TASK_SIZE) {
+		info.low_limit = addr;
+		if (!slice_scan_available(addr, available, 1, &addr))
 			continue;
+
+ next_slice:
+		/*
+		 * At this point [info.low_limit; addr) covers
+		 * available slices only and ends at a slice boundary.
+		 * Check if we need to reduce the range, or if we can
+		 * extend it to cover the next available slice.
+		 */
+		if (addr >= TASK_SIZE)
+			addr = TASK_SIZE;
+		else if (slice_scan_available(addr, available, 1, &next_end)) {
+			addr = next_end;
+			goto next_slice;
 		}
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = vma->vm_end;
+		info.high_limit = addr;
+
+		found = vm_unmapped_area(&info);
+		if (!(found & ~PAGE_MASK))
+			return found;
 	}
 
 	return -ENOMEM;
@@ -277,39 +310,39 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 					     struct slice_mask available,
 					     int psize)
 {
-	struct vm_area_struct *vma;
-	unsigned long addr;
-	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
+	unsigned long addr, found, prev;
+	struct vm_unmapped_area_info info;
 
-	addr = mm->mmap_base;
-	while (addr > len) {
-		/* Go down by chunk size */
-		addr = _ALIGN_DOWN(addr - len, 1ul << pshift);
+	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+	info.length = len;
+	info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
+	info.align_offset = 0;
 
-		/* Check for hit with different page size */
-		mask = slice_range_to_mask(addr, len);
-		if (!slice_check_fit(mask, available)) {
-			if (addr < SLICE_LOW_TOP)
-				addr = _ALIGN_DOWN(addr, 1ul << SLICE_LOW_SHIFT);
-			else if (addr < (1ul << SLICE_HIGH_SHIFT))
-				addr = SLICE_LOW_TOP;
-			else
-				addr = _ALIGN_DOWN(addr, 1ul << SLICE_HIGH_SHIFT);
+	addr = mm->mmap_base;
+	while (addr > PAGE_SIZE) {
+		info.high_limit = addr;
+		if (!slice_scan_available(addr - 1, available, 0, &addr))
 			continue;
-		}
 
+ prev_slice:
 		/*
-		 * Lookup failure means no vma is above this address,
-		 * else if new region fits below vma->vm_start,
-		 * return with success:
+		 * At this point [addr; info.high_limit) covers
+		 * available slices only and starts at a slice boundary.
+		 * Check if we need to reduce the range, or if we can
+		 * extend it to cover the previous available slice.
 		 */
-		vma = find_vma(mm, addr);
-		if (!vma || (addr + len) <= vma->vm_start)
-			return addr;
+		if (addr < PAGE_SIZE)
+			addr = PAGE_SIZE;
+		else if (slice_scan_available(addr - 1, available, 0, &prev)) {
+			addr = prev;
+			goto prev_slice;
+		}
+		info.low_limit = addr;
 
-		/* try just below the current vma->vm_start */
-		addr = vma->vm_start;
+		found = vm_unmapped_area(&info);
+		if (!(found & ~PAGE_MASK))
+			return found;
 	}
 
 	/*
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 6/8] mm: remove free_area_cache use in powerpc architecture
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

As all other architectures have been converted to use vm_unmapped_area(),
we are about to retire the free_area_cache.

This change simply removes the use of that cache in
slice_get_unmapped_area(), which will most certainly have a
performance cost. Next one will convert that function to use the
vm_unmapped_area() infrastructure and regain the performance.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>

---
 arch/powerpc/include/asm/page_64.h       |    3 +-
 arch/powerpc/mm/hugetlbpage.c            |    2 +-
 arch/powerpc/mm/slice.c                  |  108 +++++------------------------
 arch/powerpc/platforms/cell/spufs/file.c |    2 +-
 4 files changed, 22 insertions(+), 93 deletions(-)

diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index cd915d6b093d..88693cef4f3d 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -99,8 +99,7 @@ extern unsigned long slice_get_unmapped_area(unsigned long addr,
 					     unsigned long len,
 					     unsigned long flags,
 					     unsigned int psize,
-					     int topdown,
-					     int use_cache);
+					     int topdown);
 
 extern unsigned int get_slice_psize(struct mm_struct *mm,
 				    unsigned long addr);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 1a6de0a7d8eb..5dc52d803ed8 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -742,7 +742,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	struct hstate *hstate = hstate_file(file);
 	int mmu_psize = shift_to_mmu_psize(huge_page_shift(hstate));
 
-	return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1, 0);
+	return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1);
 }
 #endif
 
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index cf9dada734b6..999a74f25ebe 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -240,23 +240,15 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 					      unsigned long len,
 					      struct slice_mask available,
-					      int psize, int use_cache)
+					      int psize)
 {
 	struct vm_area_struct *vma;
-	unsigned long start_addr, addr;
+	unsigned long addr;
 	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 
-	if (use_cache) {
-		if (len <= mm->cached_hole_size) {
-			start_addr = addr = TASK_UNMAPPED_BASE;
-			mm->cached_hole_size = 0;
-		} else
-			start_addr = addr = mm->free_area_cache;
-	} else
-		start_addr = addr = TASK_UNMAPPED_BASE;
+	addr = TASK_UNMAPPED_BASE;
 
-full_search:
 	for (;;) {
 		addr = _ALIGN_UP(addr, 1ul << pshift);
 		if ((TASK_SIZE - len) < addr)
@@ -272,63 +264,24 @@ full_search:
 				addr = _ALIGN_UP(addr + 1,  1ul << SLICE_HIGH_SHIFT);
 			continue;
 		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/*
-			 * Remember the place where we stopped the search:
-			 */
-			if (use_cache)
-				mm->free_area_cache = addr + len;
+		if (!vma || addr + len <= vma->vm_start)
 			return addr;
-		}
-		if (use_cache && (addr + mm->cached_hole_size) < vma->vm_start)
-		        mm->cached_hole_size = vma->vm_start - addr;
 		addr = vma->vm_end;
 	}
 
-	/* Make sure we didn't miss any holes */
-	if (use_cache && start_addr != TASK_UNMAPPED_BASE) {
-		start_addr = addr = TASK_UNMAPPED_BASE;
-		mm->cached_hole_size = 0;
-		goto full_search;
-	}
 	return -ENOMEM;
 }
 
 static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 					     unsigned long len,
 					     struct slice_mask available,
-					     int psize, int use_cache)
+					     int psize)
 {
 	struct vm_area_struct *vma;
 	unsigned long addr;
 	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 
-	/* check if free_area_cache is useful for us */
-	if (use_cache) {
-		if (len <= mm->cached_hole_size) {
-			mm->cached_hole_size = 0;
-			mm->free_area_cache = mm->mmap_base;
-		}
-
-		/* either no address requested or can't fit in requested
-		 * address hole
-		 */
-		addr = mm->free_area_cache;
-
-		/* make sure it can fit in the remaining address space */
-		if (addr > len) {
-			addr = _ALIGN_DOWN(addr - len, 1ul << pshift);
-			mask = slice_range_to_mask(addr, len);
-			if (slice_check_fit(mask, available) &&
-			    slice_area_is_free(mm, addr, len))
-					/* remember the address as a hint for
-					 * next time
-					 */
-					return (mm->free_area_cache = addr);
-		}
-	}
-
 	addr = mm->mmap_base;
 	while (addr > len) {
 		/* Go down by chunk size */
@@ -352,16 +305,8 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 		 * return with success:
 		 */
 		vma = find_vma(mm, addr);
-		if (!vma || (addr + len) <= vma->vm_start) {
-			/* remember the address as a hint for next time */
-			if (use_cache)
-				mm->free_area_cache = addr;
+		if (!vma || (addr + len) <= vma->vm_start)
 			return addr;
-		}
-
-		/* remember the largest hole we saw so far */
-		if (use_cache && (addr + mm->cached_hole_size) < vma->vm_start)
-		        mm->cached_hole_size = vma->vm_start - addr;
 
 		/* try just below the current vma->vm_start */
 		addr = vma->vm_start;
@@ -373,28 +318,18 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 	 * can happen with large stack limits and large mmap()
 	 * allocations.
 	 */
-	addr = slice_find_area_bottomup(mm, len, available, psize, 0);
-
-	/*
-	 * Restore the topdown base:
-	 */
-	if (use_cache) {
-		mm->free_area_cache = mm->mmap_base;
-		mm->cached_hole_size = ~0UL;
-	}
-
-	return addr;
+	return slice_find_area_bottomup(mm, len, available, psize);
 }
 
 
 static unsigned long slice_find_area(struct mm_struct *mm, unsigned long len,
 				     struct slice_mask mask, int psize,
-				     int topdown, int use_cache)
+				     int topdown)
 {
 	if (topdown)
-		return slice_find_area_topdown(mm, len, mask, psize, use_cache);
+		return slice_find_area_topdown(mm, len, mask, psize);
 	else
-		return slice_find_area_bottomup(mm, len, mask, psize, use_cache);
+		return slice_find_area_bottomup(mm, len, mask, psize);
 }
 
 #define or_mask(dst, src)	do {			\
@@ -415,7 +350,7 @@ static unsigned long slice_find_area(struct mm_struct *mm, unsigned long len,
 
 unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 				      unsigned long flags, unsigned int psize,
-				      int topdown, int use_cache)
+				      int topdown)
 {
 	struct slice_mask mask = {0, 0};
 	struct slice_mask good_mask;
@@ -430,8 +365,8 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	BUG_ON(mm->task_size == 0);
 
 	slice_dbg("slice_get_unmapped_area(mm=%p, psize=%d...\n", mm, psize);
-	slice_dbg(" addr=%lx, len=%lx, flags=%lx, topdown=%d, use_cache=%d\n",
-		  addr, len, flags, topdown, use_cache);
+	slice_dbg(" addr=%lx, len=%lx, flags=%lx, topdown=%d\n",
+		  addr, len, flags, topdown);
 
 	if (len > mm->task_size)
 		return -ENOMEM;
@@ -503,8 +438,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 		/* Now let's see if we can find something in the existing
 		 * slices for that size
 		 */
-		newaddr = slice_find_area(mm, len, good_mask, psize, topdown,
-					  use_cache);
+		newaddr = slice_find_area(mm, len, good_mask, psize, topdown);
 		if (newaddr != -ENOMEM) {
 			/* Found within the good mask, we don't have to setup,
 			 * we thus return directly
@@ -536,8 +470,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 * anywhere in the good area.
 	 */
 	if (addr) {
-		addr = slice_find_area(mm, len, good_mask, psize, topdown,
-				       use_cache);
+		addr = slice_find_area(mm, len, good_mask, psize, topdown);
 		if (addr != -ENOMEM) {
 			slice_dbg(" found area at 0x%lx\n", addr);
 			return addr;
@@ -547,15 +480,14 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	/* Now let's see if we can find something in the existing slices
 	 * for that size plus free slices
 	 */
-	addr = slice_find_area(mm, len, potential_mask, psize, topdown,
-			       use_cache);
+	addr = slice_find_area(mm, len, potential_mask, psize, topdown);
 
 #ifdef CONFIG_PPC_64K_PAGES
 	if (addr == -ENOMEM && psize == MMU_PAGE_64K) {
 		/* retry the search with 4k-page slices included */
 		or_mask(potential_mask, compat_mask);
 		addr = slice_find_area(mm, len, potential_mask, psize,
-				       topdown, use_cache);
+				       topdown);
 	}
 #endif
 
@@ -586,8 +518,7 @@ unsigned long arch_get_unmapped_area(struct file *filp,
 				     unsigned long flags)
 {
 	return slice_get_unmapped_area(addr, len, flags,
-				       current->mm->context.user_psize,
-				       0, 1);
+				       current->mm->context.user_psize, 0);
 }
 
 unsigned long arch_get_unmapped_area_topdown(struct file *filp,
@@ -597,8 +528,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp,
 					     const unsigned long flags)
 {
 	return slice_get_unmapped_area(addr0, len, flags,
-				       current->mm->context.user_psize,
-				       1, 1);
+				       current->mm->context.user_psize, 1);
 }
 
 unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index 0cfece4cf6ef..2eb4df2a9388 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -352,7 +352,7 @@ static unsigned long spufs_get_unmapped_area(struct file *file,
 
 	/* Else, try to obtain a 64K pages slice */
 	return slice_get_unmapped_area(addr, len, flags,
-				       MMU_PAGE_64K, 1, 0);
+				       MMU_PAGE_64K, 1);
 }
 #endif /* CONFIG_SPU_FS_64K_LS */
 
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 5/8] mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

Update the ia64 hugetlb_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>

---
 arch/ia64/mm/hugetlbpage.c |   20 +++++++++-----------
 1 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c
index 5ca674b74737..76069c18ee42 100644
--- a/arch/ia64/mm/hugetlbpage.c
+++ b/arch/ia64/mm/hugetlbpage.c
@@ -148,7 +148,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags)
 {
-	struct vm_area_struct *vmm;
+	struct vm_unmapped_area_info info;
 
 	if (len > RGN_MAP_LIMIT)
 		return -ENOMEM;
@@ -165,16 +165,14 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, u
 	/* This code assumes that RGN_HPAGE != 0. */
 	if ((REGION_NUMBER(addr) != RGN_HPAGE) || (addr & (HPAGE_SIZE - 1)))
 		addr = HPAGE_REGION_BASE;
-	else
-		addr = ALIGN(addr, HPAGE_SIZE);
-	for (vmm = find_vma(current->mm, addr); ; vmm = vmm->vm_next) {
-		/* At this point:  (!vmm || addr < vmm->vm_end). */
-		if (REGION_OFFSET(addr) + len > RGN_MAP_LIMIT)
-			return -ENOMEM;
-		if (!vmm || (addr + len) <= vmm->vm_start)
-			return addr;
-		addr = ALIGN(vmm->vm_end, HPAGE_SIZE);
-	}
+
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = addr;
+	info.high_limit = HPAGE_REGION_BASE + RGN_MAP_LIMIT;
+	info.align_mask = PAGE_MASK & (HPAGE_SIZE - 1);
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 static int __init hugetlb_setup_sz(char *str)
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 4/8] mm: use vm_unmapped_area() on ia64 architecture
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

Update the ia64 arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>

---
 arch/ia64/kernel/sys_ia64.c |   37 ++++++++++++-------------------------
 1 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/arch/ia64/kernel/sys_ia64.c b/arch/ia64/kernel/sys_ia64.c
index d9439ef2f661..41e33f84c185 100644
--- a/arch/ia64/kernel/sys_ia64.c
+++ b/arch/ia64/kernel/sys_ia64.c
@@ -25,9 +25,9 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
 			unsigned long pgoff, unsigned long flags)
 {
 	long map_shared = (flags & MAP_SHARED);
-	unsigned long start_addr, align_mask = PAGE_SIZE - 1;
+	unsigned long align_mask = 0;
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
+	struct vm_unmapped_area_info info;
 
 	if (len > RGN_MAP_LIMIT)
 		return -ENOMEM;
@@ -44,7 +44,7 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
 		addr = 0;
 #endif
 	if (!addr)
-		addr = mm->free_area_cache;
+		addr = TASK_UNMAPPED_BASE;
 
 	if (map_shared && (TASK_SIZE > 0xfffffffful))
 		/*
@@ -53,28 +53,15 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
 		 * tasks, we prefer to avoid exhausting the address space too quickly by
 		 * limiting alignment to a single page.
 		 */
-		align_mask = SHMLBA - 1;
-
-  full_search:
-	start_addr = addr = (addr + align_mask) & ~align_mask;
-
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr || RGN_MAP_LIMIT - len < REGION_OFFSET(addr)) {
-			if (start_addr != TASK_UNMAPPED_BASE) {
-				/* Start a new search --- just in case we missed some holes.  */
-				addr = TASK_UNMAPPED_BASE;
-				goto full_search;
-			}
-			return -ENOMEM;
-		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/* Remember the address where we stopped this search:  */
-			mm->free_area_cache = addr + len;
-			return addr;
-		}
-		addr = (vma->vm_end + align_mask) & ~align_mask;
-	}
+		align_mask = PAGE_MASK & (SHMLBA - 1);
+
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = addr;
+	info.high_limit = TASK_SIZE;
+	info.align_mask = align_mask;
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 asmlinkage long
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 3/8] mm: use vm_unmapped_area() on frv architecture
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

Update the frv arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>

---
 arch/frv/mm/elf-fdpic.c |   49 ++++++++++++++++------------------------------
 1 files changed, 17 insertions(+), 32 deletions(-)

diff --git a/arch/frv/mm/elf-fdpic.c b/arch/frv/mm/elf-fdpic.c
index 385fd30b142f..836f14707a62 100644
--- a/arch/frv/mm/elf-fdpic.c
+++ b/arch/frv/mm/elf-fdpic.c
@@ -60,7 +60,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 				     unsigned long pgoff, unsigned long flags)
 {
 	struct vm_area_struct *vma;
-	unsigned long limit;
+	struct vm_unmapped_area_info info;
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
@@ -79,39 +79,24 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 	}
 
 	/* search between the bottom of user VM and the stack grow area */
-	addr = PAGE_SIZE;
-	limit = (current->mm->start_stack - 0x00200000);
-	if (addr + len <= limit) {
-		limit -= len;
-
-		if (addr <= limit) {
-			vma = find_vma(current->mm, PAGE_SIZE);
-			for (; vma; vma = vma->vm_next) {
-				if (addr > limit)
-					break;
-				if (addr + len <= vma->vm_start)
-					goto success;
-				addr = vma->vm_end;
-			}
-		}
-	}
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = PAGE_SIZE;
+	info.high_limit = (current->mm->start_stack - 0x00200000);
+	info.align_mask = 0;
+	info.align_offset = 0;
+	addr = vm_unmapped_area(&info);
+	if (!(addr & ~PAGE_MASK))
+		goto success;
+	VM_BUG_ON(addr != -ENOMEM);
 
 	/* search from just above the WorkRAM area to the top of memory */
-	addr = PAGE_ALIGN(0x80000000);
-	limit = TASK_SIZE - len;
-	if (addr <= limit) {
-		vma = find_vma(current->mm, addr);
-		for (; vma; vma = vma->vm_next) {
-			if (addr > limit)
-				break;
-			if (addr + len <= vma->vm_start)
-				goto success;
-			addr = vma->vm_end;
-		}
-
-		if (!vma && addr <= limit)
-			goto success;
-	}
+	info.low_limit = PAGE_ALIGN(0x80000000);
+	info.high_limit = TASK_SIZE;
+	addr = vm_unmapped_area(&info);
+	if (!(addr & ~PAGE_MASK))
+		goto success;
+	VM_BUG_ON(addr != -ENOMEM);
 
 #if 0
 	printk("[area] l=%lx (ENOMEM) f='%s'\n",
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 2/8] mm: use vm_unmapped_area() on alpha architecture
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

Update the alpha arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>

---
 arch/alpha/kernel/osf_sys.c |   20 +++++++++-----------
 1 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 14db93e4c8a8..ba707e23ef37 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1298,17 +1298,15 @@ static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
 		         unsigned long limit)
 {
-	struct vm_area_struct *vma = find_vma(current->mm, addr);
-
-	while (1) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (limit - len < addr)
-			return -ENOMEM;
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = vma->vm_end;
-		vma = vma->vm_next;
-	}
+	struct vm_unmapped_area_info info;
+
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = addr;
+	info.high_limit = limit;
+	info.align_mask = 0;
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 unsigned long
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 1/8] mm: use vm_unmapped_area() on parisc architecture
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev
In-Reply-To: <1358990991-21316-1-git-send-email-walken@google.com>

Update the parisc arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>

---
 arch/parisc/kernel/sys_parisc.c |   46 ++++++++++++++------------------------
 1 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index f76c10863c62..6ab138088076 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -35,18 +35,15 @@
 
 static unsigned long get_unshared_area(unsigned long addr, unsigned long len)
 {
-	struct vm_area_struct *vma;
+	struct vm_unmapped_area_info info;
 
-	addr = PAGE_ALIGN(addr);
-
-	for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr)
-			return -ENOMEM;
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = vma->vm_end;
-	}
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = PAGE_ALIGN(addr);
+	info.high_limit = TASK_SIZE;
+	info.align_mask = 0;
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 #define DCACHE_ALIGN(addr) (((addr) + (SHMLBA - 1)) &~ (SHMLBA - 1))
@@ -63,30 +60,21 @@ static unsigned long get_unshared_area(unsigned long addr, unsigned long len)
  */
 static int get_offset(struct address_space *mapping)
 {
-	int offset = (unsigned long) mapping << (PAGE_SHIFT - 8);
-	return offset & 0x3FF000;
+	return (unsigned long) mapping >> 8;
 }
 
 static unsigned long get_shared_area(struct address_space *mapping,
 		unsigned long addr, unsigned long len, unsigned long pgoff)
 {
-	struct vm_area_struct *vma;
-	int offset = mapping ? get_offset(mapping) : 0;
-
-	offset = (offset + (pgoff << PAGE_SHIFT)) & 0x3FF000;
+	struct vm_unmapped_area_info info;
 
-	addr = DCACHE_ALIGN(addr - offset) + offset;
-
-	for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr)
-			return -ENOMEM;
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = DCACHE_ALIGN(vma->vm_end - offset) + offset;
-		if (addr < vma->vm_end) /* handle wraparound */
-			return -ENOMEM;
-	}
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = PAGE_ALIGN(addr);
+	info.high_limit = TASK_SIZE;
+	info.align_mask = PAGE_MASK & (SHMLBA - 1);
+	info.align_offset = (get_offset(mapping) + pgoff) << PAGE_SHIFT;
+	return vm_unmapped_area(&info);
 }
 
 unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 0/8] convert remaining archs to use vm_unmapped_area()
From: Michel Lespinasse @ 2013-01-24  1:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, James E.J. Bottomley,
	Helge Deller, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	David Howells, Tony Luck, Fenghua Yu
  Cc: linux-ia64, linux-parisc, linux-kernel, linux-mm, linux-alpha,
	Andrew Morton, linuxppc-dev

This is a resend of my "finish the mission" patch series. I need arch
maintainers to approve so I can push this to andrew's -mm tree.

These patches, which apply on top of v3.8-rc kernels, are to complete the
VMA gap finding code I introduced (following Rik's initial proposal) in
v3.8-rc1.

First 5 patches introduce the use of vm_unmapped_area() to replace brute
force searches on parisc, alpha, frv and ia64 architectures (all relatively
trivial uses of the vm_unmapped_area() infrastructure)

Next 2 patches do the same as above for the powerpc architecture. This
change is not as trivial as for the other architectures, because we
need to account for each address space slice potentially having a
different page size.

The last patch removes the free_area_cache, which was used by all the
brute force searches before they got converted to the
vm_unmapped_area() infrastructure.

I did some basic testing on x86 and powerpc; however the first 5 (simpler)
patches for parisc, alpha, frv and ia64 architectures are untested.

Michel Lespinasse (8):
  mm: use vm_unmapped_area() on parisc architecture
  mm: use vm_unmapped_area() on alpha architecture
  mm: use vm_unmapped_area() on frv architecture
  mm: use vm_unmapped_area() on ia64 architecture
  mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture
  mm: remove free_area_cache use in powerpc architecture
  mm: use vm_unmapped_area() on powerpc architecture
  mm: remove free_area_cache

 arch/alpha/kernel/osf_sys.c              |   20 ++--
 arch/arm/mm/mmap.c                       |    2 -
 arch/arm64/mm/mmap.c                     |    2 -
 arch/frv/mm/elf-fdpic.c                  |   49 +++----
 arch/ia64/kernel/sys_ia64.c              |   37 ++----
 arch/ia64/mm/hugetlbpage.c               |   20 ++--
 arch/mips/mm/mmap.c                      |    2 -
 arch/parisc/kernel/sys_parisc.c          |   46 +++----
 arch/powerpc/include/asm/page_64.h       |    3 +-
 arch/powerpc/mm/hugetlbpage.c            |    2 +-
 arch/powerpc/mm/mmap_64.c                |    2 -
 arch/powerpc/mm/slice.c                  |  228 +++++++++++++-----------------
 arch/powerpc/platforms/cell/spufs/file.c |    2 +-
 arch/s390/mm/mmap.c                      |    4 -
 arch/sparc/kernel/sys_sparc_64.c         |    2 -
 arch/tile/mm/mmap.c                      |    2 -
 arch/x86/ia32/ia32_aout.c                |    2 -
 arch/x86/mm/mmap.c                       |    2 -
 fs/binfmt_aout.c                         |    2 -
 fs/binfmt_elf.c                          |    2 -
 include/linux/mm_types.h                 |    3 -
 include/linux/sched.h                    |    2 -
 kernel/fork.c                            |    4 -
 mm/mmap.c                                |   28 ----
 mm/nommu.c                               |    4 -
 mm/util.c                                |    1 -
 26 files changed, 163 insertions(+), 310 deletions(-)

-- 
1.7.7.3

^ permalink raw reply

* Re: [PATCH Bug fix 0/5] Bug fix for physical memory hot-remove.
From: Simon Jeons @ 2013-01-24  0:35 UTC (permalink / raw)
  To: Tang Chen
  Cc: linux-mm, paulus, hpa, cl, sfr, x86, linux-acpi, isimatu.yasuaki,
	linfeng, mgorman, kosaki.motohiro, rientjes, len.brown, jiang.liu,
	wency, julian.calaby, glommer, wujianguo, yinghai, laijs,
	linux-kernel, minchan.kim, akpm, linuxppc-dev
In-Reply-To: <50FFE2FC.9030401@cn.fujitsu.com>

On Wed, 2013-01-23 at 21:17 +0800, Tang Chen wrote:
> On 01/23/2013 08:29 PM, Simon Jeons wrote:
> > Hi Tang,
> >
> > I remember your big physical memory hot-remove patchset has already
> > merged by Andrew, but where I can find it? Could you give me git tree
> > address?
> 
> Hi Simon,
> 
> You can find all the physical memory hot-remove patches and related bugfix
> patches from the following url:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm

~/linux-next$ git remote -v
origin git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(fetch)
origin git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(push)
~/linux-next$ git branch
* akpm
  master
~/linux-next$ wc -l mm/memory_hotplug.c 
1173 mm/memory_hotplug.c

I still can't find it. :(

> 
> 
> Thanks. :)
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH 2/2] powerpc/85xx: describe the PAMU topology in the device tree
From: Scott Wood @ 2013-01-23 22:15 UTC (permalink / raw)
  To: Gala Kumar-B11780
  Cc: linuxppc-dev@ozlabs.org list, Wood Scott-B07421, Timur Tabi,
	Yoder Stuart-B08248
In-Reply-To: <CF1EE7AED478CD48A05574C8E2DA142D6DA7A5@039-SN1MPN1-005.039d.mgd.msft.net>

On 01/23/2013 11:27:29 AM, Gala Kumar-B11780 wrote:
>=20
> On Jan 17, 2013, at 4:34 PM, Timur Tabi wrote:
>=20
> > From: Timur Tabi <timur@freescale.com>
> >
> > The PAMU caches use the LIODNs to determine which cache lines hold =20
> the
> > entries for the corresponding LIODs.  The LIODNs must therefore be
> > carefully assigned to avoid cache thrashing -- two active LIODs with
> > LIODNs that put them in the same cache line.
> >
> > Currently, LIODNs are statically assigned by U-Boot, but this has
> > limitations.  LIODNs are assigned even for devices that may be =20
> disabled
> > or unused by the kernel.  Static assignments also do not allow for =20
> device
> > drivers which may know which LIODs can be used simultaneously.  In
> > other words, we really should assign LIODNs dynamically in Linux.
> >
> > To do that, we need to describe the PAMU device and cache =20
> topologies in
> > the device trees.
> >
> > Signed-off-by: Timur Tabi <timur@freescale.com>
> > ---
> > .../devicetree/bindings/powerpc/fsl/guts.txt       |   14 ++-
> > .../devicetree/bindings/powerpc/fsl/pamu.txt       |  142 =20
> ++++++++++++++++++++
> > arch/powerpc/boot/dts/fsl/p2041si-post.dtsi        |   87 =20
> +++++++++++--
> > arch/powerpc/boot/dts/fsl/p3041si-post.dtsi        |   87 =20
> +++++++++++--
> > arch/powerpc/boot/dts/fsl/p4080si-post.dtsi        |   68 +++++++++-
> > arch/powerpc/boot/dts/fsl/p5020si-post.dtsi        |   92 =20
> +++++++++++--
> > arch/powerpc/boot/dts/fsl/p5040si-post.dtsi        |   92 =20
> +++++++++++--
> > 7 files changed, 533 insertions(+), 49 deletions(-)
> > create mode 100644 =20
> Documentation/devicetree/bindings/powerpc/fsl/pamu.txt
>=20
> Scott, Stuart, does this have your guys Ack?

ACK

-Scott=

^ permalink raw reply

* Re: Freescale P2020 CPU Freeze over PCIe abort signal
From: Scott Wood @ 2013-01-23 21:40 UTC (permalink / raw)
  To: siva kumar; +Cc: linuxppc-dev
In-Reply-To: <CAJrcUeQCJu2wnMSWdNBSxc_5hzm1NmQw2sQjKQTwvoAF5Q3hYw@mail.gmail.com>

On 01/23/2013 11:41:26 AM, siva kumar wrote:
> Hi ,
>=20
> Is there any update on this , am getting in to the same state .
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2010-October/086680.html

Eran's initial comment of "This should probably go to the Freescale =20
support" seems right.  You can reach Freescale support at =20
support@freescale.com.

-Scott=

^ permalink raw reply

* [PATCH 3/4] powerpc: add alternate dts file for sbc8548 boot via SODIMM
From: Paul Gortmaker @ 2013-01-23 20:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Paul Gortmaker
In-Reply-To: <1358972013-5271-1-git-send-email-paul.gortmaker@windriver.com>

By moving the two JP12 jumpers 90 degrees, and switching the
setting of SW2.8, the sbc8548 can be configured to boot off
the alternate 64MB SODIMM, which when populated with u-boot
can be a handy recovery option, in case the u-boot in the
8MB soldered on flash gets corrupted.  Here we add an alternate
dts file to match that configuration.

To better highlight the differences, the output from the u-boot
"fli" command is shown for the normal configuration and then
the alternate configuration.

Normal:
 -----------------------
Bank # 1: CFI conformant flash (8 x 8)  Size: 8 MB in 64 Sectors
  Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x17
  Erase timeout: 4096 ms, write timeout: 1 ms
  Buffer write timeout: 2 ms, buffer size: 32 bytes

  Sector Start Addresses:
  FF800000 E      FF820000 E      FF840000 E      FF860000 E      FF880000 E
 [...]
  FFEE0000 E      FFF00000 E      FFF20000 E      FFF40000 E      FFF60000 E
  FFF80000        FFFA0000   RO   FFFC0000   RO   FFFE0000   RO

Bank # 2: CFI conformant flash (32 x 8)  Size: 64 MB in 128 Sectors
  Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x18
  Erase timeout: 4096 ms, write timeout: 1 ms
  Buffer write timeout: 2 ms, buffer size: 32 bytes

  Sector Start Addresses:
  EC000000 E      EC080000 E      EC100000 E      EC180000 E      EC200000 E
 [...]
  EFC00000 E      EFC80000 E      EFD00000 E      EFD80000 E      EFE00000 E
  EFE80000 E      EFF00000        EFF80000
 -----------------------

Alternate:
 -----------------------
Bank # 1: CFI conformant flash (32 x 8)  Size: 64 MB in 128 Sectors
  Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x18
  Erase timeout: 4096 ms, write timeout: 1 ms
  Buffer write timeout: 2 ms, buffer size: 32 bytes

  Sector Start Addresses:
  FC000000 E      FC080000 E      FC100000 E      FC180000 E      FC200000 E
 [...]
  FFC00000 E      FFC80000 E      FFD00000 E      FFD80000 E      FFE00000 E
  FFE80000 E      FFF00000   RO   FFF80000   RO

Bank # 2: CFI conformant flash (8 x 8)  Size: 8 MB in 64 Sectors
  Intel Extended command set, Manufacturer ID: 0x89, Device ID: 0x17
  Erase timeout: 4096 ms, write timeout: 1 ms
  Buffer write timeout: 2 ms, buffer size: 32 bytes

  Sector Start Addresses:
  EF800000 E      EF820000 E      EF840000 E      EF860000 E      EF880000 E
 [...]
  EFEE0000 E      EFF00000 E      EFF20000 E      EFF40000 E      EFF60000 E
  EFF80000 E      EFFA0000        EFFC0000        EFFE0000
 -----------------------

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 arch/powerpc/boot/dts/sbc8548-altflash.dts | 115 +++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/sbc8548-altflash.dts

diff --git a/arch/powerpc/boot/dts/sbc8548-altflash.dts b/arch/powerpc/boot/dts/sbc8548-altflash.dts
new file mode 100644
index 0000000..0b38a0d
--- /dev/null
+++ b/arch/powerpc/boot/dts/sbc8548-altflash.dts
@@ -0,0 +1,115 @@
+/*
+ * SBC8548 Device Tree Source
+ *
+ * Configured for booting off the alternate (64MB SODIMM) flash.
+ * Requires switching JP12 jumpers and changing SW2.8 setting.
+ *
+ * Copyright 2013 Wind River Systems Inc.
+ *
+ * Paul Gortmaker (see MAINTAINERS for contact information)
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+
+/dts-v1/;
+
+/include/ "sbc8548-pre.dtsi"
+
+/{
+	localbus@e0000000 {
+		#address-cells = <2>;
+		#size-cells = <1>;
+		compatible = "simple-bus";
+		reg = <0xe0000000 0x5000>;
+		interrupt-parent = <&mpic>;
+
+		ranges = <0x0 0x0 0xfc000000 0x04000000		/*64MB Flash*/
+			  0x3 0x0 0xf0000000 0x04000000		/*64MB SDRAM*/
+			  0x4 0x0 0xf4000000 0x04000000 	/*64MB SDRAM*/
+			  0x5 0x0 0xf8000000 0x00b10000		/* EPLD */
+			  0x6 0x0 0xef800000 0x00800000>;	/*8MB Flash*/
+
+		flash@0,0 {
+			#address-cells = <1>;
+			#size-cells = <1>;
+			reg = <0x0 0x0 0x04000000>;
+			compatible = "intel,JS28F128", "cfi-flash";
+			bank-width = <4>;
+			device-width = <1>;
+			partition@0x0 {
+				label = "space";
+				/* FC000000 -> FFEFFFFF */
+				reg = <0x00000000 0x03f00000>;
+			};
+			partition@0x03f00000 {
+				label = "bootloader";
+				/* FFF00000 -> FFFFFFFF */
+				reg = <0x03f00000 0x00100000>;
+				read-only;
+			};
+                };
+
+
+		epld@5,0 {
+			compatible = "wrs,epld-localbus";
+			#address-cells = <2>;
+			#size-cells = <1>;
+			reg = <0x5 0x0 0x00b10000>;
+			ranges = <
+				0x0 0x0 0x5 0x000000 0x1fff	/* LED */
+				0x1 0x0 0x5 0x100000 0x1fff	/* Switches */
+				0x3 0x0 0x5 0x300000 0x1fff	/* HW Rev. */
+				0xb 0x0	0x5 0xb00000 0x1fff	/* EEPROM */
+			>;
+
+			led@0,0 {
+				compatible = "led";
+				reg = <0x0 0x0 0x1fff>;
+			};
+
+			switches@1,0 {
+				compatible = "switches";
+				reg = <0x1 0x0 0x1fff>;
+			};
+
+			hw-rev@3,0 {
+				compatible = "hw-rev";
+				reg = <0x3 0x0 0x1fff>;
+			};
+
+			eeprom@b,0 {
+				compatible = "eeprom";
+				reg = <0xb 0 0x1fff>;
+			};
+
+		};
+
+		alt-flash@6,0 {
+			#address-cells = <1>;
+			#size-cells = <1>;
+			compatible = "intel,JS28F640", "cfi-flash";
+			reg = <0x6 0x0 0x800000>;
+			bank-width = <1>;
+			device-width = <1>;
+			partition@0x0 {
+				label = "space";
+				/* EF800000 -> EFF9FFFF */
+				reg = <0x00000000 0x007a0000>;
+			};
+			partition@0x7a0000 {
+				label = "bootloader";
+				/* EFFA0000 -> EFFFFFFF */
+				reg = <0x007a0000 0x00060000>;
+				read-only;
+			};
+		};
+
+
+        };
+};
+
+/include/ "sbc8548-post.dtsi"
-- 
1.8.1

^ permalink raw reply related

* [PATCH 2/4] powerpc: update sbc8548 flash information to match recent u-boot
From: Paul Gortmaker @ 2013-01-23 20:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Paul Gortmaker
In-Reply-To: <1358972013-5271-1-git-send-email-paul.gortmaker@windriver.com>

The original memory map for the sbc8548 had the 64MB SODIMM flash
device misaligned by 8MB to allow a window of address space for
the soldered on 8MB device -- i.e.

 start           end             CS<n>   width   Desc.
 ----------------------------------------------------------
 fb80_0000       ff7f_ffff       CS6     32      SODIMM flash (64MB)
 ff80_0000       ffff_ffff       CS0     8       Boot flash (8MB)

However, if we want to change the configuration so that it boots
off the 64MB flash, it is in turn then aligned with a 64MB boundary,
starting at fc00_0000 (and the 8MB @ fb80_0000 -> fbff_ffff).

This makes for complicated updates, since what is the beginning
of the physical device is 8MB into its address space in the default
configuration shown above.

This issue was fixed as of u-boot commit 3fd673cf363bc86ed42eff713d4
("sbc8548: relocate 64MB user flash to sane boundary") -- in which
the SODIMM was mapped to ec00_0000 (natively aligned under efff_ffff)
and so when JP12/SW2.8 are switched, it will be a a simple 0xec --> 0xfc
mapping between the two instances.

Here we make the associated changes in the localbus flash memory
map in the dts file:  indicating the 64MB device starts at ec00_0000
and that the tail end of the 64MB device (last 2 sectors) can contain
a bootloader image.

The partitions for both flash devices get a clean-up; there were
non-meaningful assignments in there that probably originated from
the MPC8548CDS on which the file was based on.  Now there is just
the categorization of free space and bootloader images.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 arch/powerpc/boot/dts/sbc8548.dts | 34 +++++++++++++++-------------------
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/boot/dts/sbc8548.dts b/arch/powerpc/boot/dts/sbc8548.dts
index ad8dc68..1df2a09 100644
--- a/arch/powerpc/boot/dts/sbc8548.dts
+++ b/arch/powerpc/boot/dts/sbc8548.dts
@@ -28,23 +28,25 @@
 			  0x3 0x0 0xf0000000 0x04000000		/*64MB SDRAM*/
 			  0x4 0x0 0xf4000000 0x04000000 	/*64MB SDRAM*/
 			  0x5 0x0 0xf8000000 0x00b10000		/* EPLD */
-			  0x6 0x0 0xfb800000 0x04000000>;	/*64MB Flash*/
+			  0x6 0x0 0xec000000 0x04000000>;	/*64MB Flash*/
 
 
 		flash@0,0 {
 			#address-cells = <1>;
 			#size-cells = <1>;
-			compatible = "cfi-flash";
+			compatible = "intel,JS28F640", "cfi-flash";
 			reg = <0x0 0x0 0x800000>;
 			bank-width = <1>;
 			device-width = <1>;
 			partition@0x0 {
 				label = "space";
-				reg = <0x00000000 0x00100000>;
+				/* FF800000 -> FFF9FFFF */
+				reg = <0x00000000 0x007a0000>;
 			};
-			partition@0x100000 {
+			partition@0x7a0000 {
 				label = "bootloader";
-				reg = <0x00100000 0x00700000>;
+				/* FFFA0000 -> FFFFFFFF */
+				reg = <0x007a0000 0x00060000>;
 				read-only;
 			};
 		};
@@ -87,26 +89,20 @@
 			#address-cells = <1>;
 			#size-cells = <1>;
 			reg = <0x6 0x0 0x04000000>;
-			compatible = "cfi-flash";
+			compatible = "intel,JS28F128", "cfi-flash";
 			bank-width = <4>;
 			device-width = <1>;
 			partition@0x0 {
+				label = "space";
+				/* EC000000 -> EFEFFFFF */
+				reg = <0x00000000 0x03f00000>;
+			};
+			partition@0x03f00000 {
 				label = "bootloader";
-				reg = <0x00000000 0x00100000>;
+				/* EFF00000 -> EFFFFFFF */
+				reg = <0x03f00000 0x00100000>;
 				read-only;
 			};
-			partition@0x00100000 {
-				label = "file-system";
-				reg = <0x00100000 0x01f00000>;
-			};
-			partition@0x02000000 {
-				label = "boot-config";
-				reg = <0x02000000 0x00100000>;
-			};
-			partition@0x02100000 {
-				label = "space";
-				reg = <0x02100000 0x01f00000>;
-			};
                 };
         };
 };
-- 
1.8.1

^ permalink raw reply related

* [PATCH 1/4] powerpc: split sbc8548 dts file into pre and post chunks
From: Paul Gortmaker @ 2013-01-23 20:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Paul Gortmaker
In-Reply-To: <1358972013-5271-1-git-send-email-paul.gortmaker@windriver.com>

Updates to u-boot allow this board to boot off of either
the 8MB soldered on flash, or the 64MB SODIMM flash.

This is achieved by changing JP12 and SW2.8 which in turn
swaps which flash device appears on /CS0 and /CS6 respectively.

Since the flash devices are not the same size, this also
changes the MTD memory map layout on the local bus.

Here we split the common chunks out into a pre and post
include, so they can be reused by an upcoming "alternative
boot" dts file; leaving only the local bus chunk behind.

No content changes are made at this point - it is just purely
the move to using include files.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 arch/powerpc/boot/dts/sbc8548-post.dtsi | 295 +++++++++++++++++++++++++++++
 arch/powerpc/boot/dts/sbc8548-pre.dtsi  |  52 ++++++
 arch/powerpc/boot/dts/sbc8548.dts       | 322 +-------------------------------
 3 files changed, 351 insertions(+), 318 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/sbc8548-post.dtsi
 create mode 100644 arch/powerpc/boot/dts/sbc8548-pre.dtsi

diff --git a/arch/powerpc/boot/dts/sbc8548-post.dtsi b/arch/powerpc/boot/dts/sbc8548-post.dtsi
new file mode 100644
index 0000000..33a47e2
--- /dev/null
+++ b/arch/powerpc/boot/dts/sbc8548-post.dtsi
@@ -0,0 +1,295 @@
+/*
+ * SBC8548 Device Tree Source
+ *
+ * Copyright 2007 Wind River Systems Inc.
+ *
+ * Paul Gortmaker (see MAINTAINERS for contact information)
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+/{
+	soc8548@e0000000 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		device_type = "soc";
+		ranges = <0x00000000 0xe0000000 0x00100000>;
+		bus-frequency = <0>;
+		compatible = "simple-bus";
+
+		ecm-law@0 {
+			compatible = "fsl,ecm-law";
+			reg = <0x0 0x1000>;
+			fsl,num-laws = <10>;
+		};
+
+		ecm@1000 {
+			compatible = "fsl,mpc8548-ecm", "fsl,ecm";
+			reg = <0x1000 0x1000>;
+			interrupts = <17 2>;
+			interrupt-parent = <&mpic>;
+		};
+
+		memory-controller@2000 {
+			compatible = "fsl,mpc8548-memory-controller";
+			reg = <0x2000 0x1000>;
+			interrupt-parent = <&mpic>;
+			interrupts = <0x12 0x2>;
+		};
+
+		L2: l2-cache-controller@20000 {
+			compatible = "fsl,mpc8548-l2-cache-controller";
+			reg = <0x20000 0x1000>;
+			cache-line-size = <0x20>;	// 32 bytes
+			cache-size = <0x80000>;	// L2, 512K
+			interrupt-parent = <&mpic>;
+			interrupts = <0x10 0x2>;
+		};
+
+		i2c@3000 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			cell-index = <0>;
+			compatible = "fsl-i2c";
+			reg = <0x3000 0x100>;
+			interrupts = <0x2b 0x2>;
+			interrupt-parent = <&mpic>;
+			dfsrr;
+		};
+
+		i2c@3100 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			cell-index = <1>;
+			compatible = "fsl-i2c";
+			reg = <0x3100 0x100>;
+			interrupts = <0x2b 0x2>;
+			interrupt-parent = <&mpic>;
+			dfsrr;
+		};
+
+		dma@21300 {
+			#address-cells = <1>;
+			#size-cells = <1>;
+			compatible = "fsl,mpc8548-dma", "fsl,eloplus-dma";
+			reg = <0x21300 0x4>;
+			ranges = <0x0 0x21100 0x200>;
+			cell-index = <0>;
+			dma-channel@0 {
+				compatible = "fsl,mpc8548-dma-channel",
+						"fsl,eloplus-dma-channel";
+				reg = <0x0 0x80>;
+				cell-index = <0>;
+				interrupt-parent = <&mpic>;
+				interrupts = <20 2>;
+			};
+			dma-channel@80 {
+				compatible = "fsl,mpc8548-dma-channel",
+						"fsl,eloplus-dma-channel";
+				reg = <0x80 0x80>;
+				cell-index = <1>;
+				interrupt-parent = <&mpic>;
+				interrupts = <21 2>;
+			};
+			dma-channel@100 {
+				compatible = "fsl,mpc8548-dma-channel",
+						"fsl,eloplus-dma-channel";
+				reg = <0x100 0x80>;
+				cell-index = <2>;
+				interrupt-parent = <&mpic>;
+				interrupts = <22 2>;
+			};
+			dma-channel@180 {
+				compatible = "fsl,mpc8548-dma-channel",
+						"fsl,eloplus-dma-channel";
+				reg = <0x180 0x80>;
+				cell-index = <3>;
+				interrupt-parent = <&mpic>;
+				interrupts = <23 2>;
+			};
+		};
+
+		enet0: ethernet@24000 {
+			#address-cells = <1>;
+			#size-cells = <1>;
+			cell-index = <0>;
+			device_type = "network";
+			model = "eTSEC";
+			compatible = "gianfar";
+			reg = <0x24000 0x1000>;
+			ranges = <0x0 0x24000 0x1000>;
+			local-mac-address = [ 00 00 00 00 00 00 ];
+			interrupts = <0x1d 0x2 0x1e 0x2 0x22 0x2>;
+			interrupt-parent = <&mpic>;
+			tbi-handle = <&tbi0>;
+			phy-handle = <&phy0>;
+
+			mdio@520 {
+				#address-cells = <1>;
+				#size-cells = <0>;
+				compatible = "fsl,gianfar-mdio";
+				reg = <0x520 0x20>;
+
+				phy0: ethernet-phy@19 {
+					interrupt-parent = <&mpic>;
+					interrupts = <0x6 0x1>;
+					reg = <0x19>;
+					device_type = "ethernet-phy";
+				};
+				phy1: ethernet-phy@1a {
+					interrupt-parent = <&mpic>;
+					interrupts = <0x7 0x1>;
+					reg = <0x1a>;
+					device_type = "ethernet-phy";
+				};
+				tbi0: tbi-phy@11 {
+					reg = <0x11>;
+					device_type = "tbi-phy";
+				};
+			};
+		};
+
+		enet1: ethernet@25000 {
+			#address-cells = <1>;
+			#size-cells = <1>;
+			cell-index = <1>;
+			device_type = "network";
+			model = "eTSEC";
+			compatible = "gianfar";
+			reg = <0x25000 0x1000>;
+			ranges = <0x0 0x25000 0x1000>;
+			local-mac-address = [ 00 00 00 00 00 00 ];
+			interrupts = <0x23 0x2 0x24 0x2 0x28 0x2>;
+			interrupt-parent = <&mpic>;
+			tbi-handle = <&tbi1>;
+			phy-handle = <&phy1>;
+
+			mdio@520 {
+				#address-cells = <1>;
+				#size-cells = <0>;
+				compatible = "fsl,gianfar-tbi";
+				reg = <0x520 0x20>;
+
+				tbi1: tbi-phy@11 {
+					reg = <0x11>;
+					device_type = "tbi-phy";
+				};
+			};
+		};
+
+		serial0: serial@4500 {
+			cell-index = <0>;
+			device_type = "serial";
+			compatible = "fsl,ns16550", "ns16550";
+			reg = <0x4500 0x100>;	// reg base, size
+			clock-frequency = <0>;	// should we fill in in uboot?
+			interrupts = <0x2a 0x2>;
+			interrupt-parent = <&mpic>;
+		};
+
+		serial1: serial@4600 {
+			cell-index = <1>;
+			device_type = "serial";
+			compatible = "fsl,ns16550", "ns16550";
+			reg = <0x4600 0x100>;	// reg base, size
+			clock-frequency = <0>;	// should we fill in in uboot?
+			interrupts = <0x2a 0x2>;
+			interrupt-parent = <&mpic>;
+		};
+
+		global-utilities@e0000 {	//global utilities reg
+			compatible = "fsl,mpc8548-guts";
+			reg = <0xe0000 0x1000>;
+			fsl,has-rstcr;
+		};
+
+		crypto@30000 {
+			compatible = "fsl,sec2.1", "fsl,sec2.0";
+			reg = <0x30000 0x10000>;
+			interrupts = <45 2>;
+			interrupt-parent = <&mpic>;
+			fsl,num-channels = <4>;
+			fsl,channel-fifo-len = <24>;
+			fsl,exec-units-mask = <0xfe>;
+			fsl,descriptor-types-mask = <0x12b0ebf>;
+		};
+
+		mpic: pic@40000 {
+			interrupt-controller;
+			#address-cells = <0>;
+			#interrupt-cells = <2>;
+			reg = <0x40000 0x40000>;
+			compatible = "chrp,open-pic";
+			device_type = "open-pic";
+		};
+	};
+
+	pci0: pci@e0008000 {
+		interrupt-map-mask = <0xf800 0x0 0x0 0x7>;
+		interrupt-map = <
+			/* IDSEL 0x01 (PCI-X slot) @66MHz */
+			0x0800 0x0 0x0 0x1 &mpic 0x2 0x1
+			0x0800 0x0 0x0 0x2 &mpic 0x3 0x1
+			0x0800 0x0 0x0 0x3 &mpic 0x4 0x1
+			0x0800 0x0 0x0 0x4 &mpic 0x1 0x1
+
+			/* IDSEL 0x11 (PCI, 3.3V 32bit) @33MHz */
+			0x8800 0x0 0x0 0x1 &mpic 0x2 0x1
+			0x8800 0x0 0x0 0x2 &mpic 0x3 0x1
+			0x8800 0x0 0x0 0x3 &mpic 0x4 0x1
+			0x8800 0x0 0x0 0x4 &mpic 0x1 0x1>;
+
+		interrupt-parent = <&mpic>;
+		interrupts = <0x18 0x2>;
+		bus-range = <0 0>;
+		ranges = <0x02000000 0x0 0x80000000 0x80000000 0x0 0x10000000
+			  0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00800000>;
+		clock-frequency = <66000000>;
+		#interrupt-cells = <1>;
+		#size-cells = <2>;
+		#address-cells = <3>;
+		reg = <0xe0008000 0x1000>;
+		compatible = "fsl,mpc8540-pcix", "fsl,mpc8540-pci";
+		device_type = "pci";
+	};
+
+	pci1: pcie@e000a000 {
+		interrupt-map-mask = <0xf800 0x0 0x0 0x7>;
+		interrupt-map = <
+
+			/* IDSEL 0x0 (PEX) */
+			0x0000 0x0 0x0 0x1 &mpic 0x0 0x1
+			0x0000 0x0 0x0 0x2 &mpic 0x1 0x1
+			0x0000 0x0 0x0 0x3 &mpic 0x2 0x1
+			0x0000 0x0 0x0 0x4 &mpic 0x3 0x1>;
+
+		interrupt-parent = <&mpic>;
+		interrupts = <0x1a 0x2>;
+		bus-range = <0x0 0xff>;
+		ranges = <0x02000000 0x0 0xa0000000 0xa0000000 0x0 0x10000000
+			  0x01000000 0x0 0x00000000 0xe2800000 0x0 0x08000000>;
+		clock-frequency = <33000000>;
+		#interrupt-cells = <1>;
+		#size-cells = <2>;
+		#address-cells = <3>;
+		reg = <0xe000a000 0x1000>;
+		compatible = "fsl,mpc8548-pcie";
+		device_type = "pci";
+		pcie@0 {
+			reg = <0x0 0x0 0x0 0x0 0x0>;
+			#size-cells = <2>;
+			#address-cells = <3>;
+			device_type = "pci";
+			ranges = <0x02000000 0x0 0xa0000000
+				  0x02000000 0x0 0xa0000000
+				  0x0 0x10000000
+
+				  0x01000000 0x0 0x00000000
+				  0x01000000 0x0 0x00000000
+				  0x0 0x00800000>;
+		};
+	};
+};
diff --git a/arch/powerpc/boot/dts/sbc8548-pre.dtsi b/arch/powerpc/boot/dts/sbc8548-pre.dtsi
new file mode 100644
index 0000000..d8c6629
--- /dev/null
+++ b/arch/powerpc/boot/dts/sbc8548-pre.dtsi
@@ -0,0 +1,52 @@
+/*
+ * SBC8548 Device Tree Source
+ *
+ * Copyright 2007 Wind River Systems Inc.
+ *
+ * Paul Gortmaker (see MAINTAINERS for contact information)
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+/{
+	model = "SBC8548";
+	compatible = "SBC8548";
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	aliases {
+		ethernet0 = &enet0;
+		ethernet1 = &enet1;
+		serial0 = &serial0;
+		serial1 = &serial1;
+		pci0 = &pci0;
+		pci1 = &pci1;
+	};
+
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		PowerPC,8548@0 {
+			device_type = "cpu";
+			reg = <0>;
+			d-cache-line-size = <0x20>;	// 32 bytes
+			i-cache-line-size = <0x20>;	// 32 bytes
+			d-cache-size = <0x8000>;	// L1, 32K
+			i-cache-size = <0x8000>;	// L1, 32K
+			timebase-frequency = <0>;	// From uboot
+			bus-frequency = <0>;
+			clock-frequency = <0>;
+			next-level-cache = <&L2>;
+		};
+	};
+
+	memory {
+		device_type = "memory";
+		reg = <0x00000000 0x10000000>;
+	};
+
+};
diff --git a/arch/powerpc/boot/dts/sbc8548.dts b/arch/powerpc/boot/dts/sbc8548.dts
index 77be771..ad8dc68 100644
--- a/arch/powerpc/boot/dts/sbc8548.dts
+++ b/arch/powerpc/boot/dts/sbc8548.dts
@@ -14,44 +14,9 @@
 
 /dts-v1/;
 
-/ {
-	model = "SBC8548";
-	compatible = "SBC8548";
-	#address-cells = <1>;
-	#size-cells = <1>;
-
-	aliases {
-		ethernet0 = &enet0;
-		ethernet1 = &enet1;
-		serial0 = &serial0;
-		serial1 = &serial1;
-		pci0 = &pci0;
-		pci1 = &pci1;
-	};
-
-	cpus {
-		#address-cells = <1>;
-		#size-cells = <0>;
-
-		PowerPC,8548@0 {
-			device_type = "cpu";
-			reg = <0>;
-			d-cache-line-size = <0x20>;	// 32 bytes
-			i-cache-line-size = <0x20>;	// 32 bytes
-			d-cache-size = <0x8000>;	// L1, 32K
-			i-cache-size = <0x8000>;	// L1, 32K
-			timebase-frequency = <0>;	// From uboot
-			bus-frequency = <0>;
-			clock-frequency = <0>;
-			next-level-cache = <&L2>;
-		};
-	};
-
-	memory {
-		device_type = "memory";
-		reg = <0x00000000 0x10000000>;
-	};
+/include/ "sbc8548-pre.dtsi"
 
+/{
 	localbus@e0000000 {
 		#address-cells = <2>;
 		#size-cells = <1>;
@@ -144,285 +109,6 @@
 			};
                 };
         };
-
-	soc8548@e0000000 {
-		#address-cells = <1>;
-		#size-cells = <1>;
-		device_type = "soc";
-		ranges = <0x00000000 0xe0000000 0x00100000>;
-		bus-frequency = <0>;
-		compatible = "simple-bus";
-
-		ecm-law@0 {
-			compatible = "fsl,ecm-law";
-			reg = <0x0 0x1000>;
-			fsl,num-laws = <10>;
-		};
-
-		ecm@1000 {
-			compatible = "fsl,mpc8548-ecm", "fsl,ecm";
-			reg = <0x1000 0x1000>;
-			interrupts = <17 2>;
-			interrupt-parent = <&mpic>;
-		};
-
-		memory-controller@2000 {
-			compatible = "fsl,mpc8548-memory-controller";
-			reg = <0x2000 0x1000>;
-			interrupt-parent = <&mpic>;
-			interrupts = <0x12 0x2>;
-		};
-
-		L2: l2-cache-controller@20000 {
-			compatible = "fsl,mpc8548-l2-cache-controller";
-			reg = <0x20000 0x1000>;
-			cache-line-size = <0x20>;	// 32 bytes
-			cache-size = <0x80000>;	// L2, 512K
-			interrupt-parent = <&mpic>;
-			interrupts = <0x10 0x2>;
-		};
-
-		i2c@3000 {
-			#address-cells = <1>;
-			#size-cells = <0>;
-			cell-index = <0>;
-			compatible = "fsl-i2c";
-			reg = <0x3000 0x100>;
-			interrupts = <0x2b 0x2>;
-			interrupt-parent = <&mpic>;
-			dfsrr;
-		};
-
-		i2c@3100 {
-			#address-cells = <1>;
-			#size-cells = <0>;
-			cell-index = <1>;
-			compatible = "fsl-i2c";
-			reg = <0x3100 0x100>;
-			interrupts = <0x2b 0x2>;
-			interrupt-parent = <&mpic>;
-			dfsrr;
-		};
-
-		dma@21300 {
-			#address-cells = <1>;
-			#size-cells = <1>;
-			compatible = "fsl,mpc8548-dma", "fsl,eloplus-dma";
-			reg = <0x21300 0x4>;
-			ranges = <0x0 0x21100 0x200>;
-			cell-index = <0>;
-			dma-channel@0 {
-				compatible = "fsl,mpc8548-dma-channel",
-						"fsl,eloplus-dma-channel";
-				reg = <0x0 0x80>;
-				cell-index = <0>;
-				interrupt-parent = <&mpic>;
-				interrupts = <20 2>;
-			};
-			dma-channel@80 {
-				compatible = "fsl,mpc8548-dma-channel",
-						"fsl,eloplus-dma-channel";
-				reg = <0x80 0x80>;
-				cell-index = <1>;
-				interrupt-parent = <&mpic>;
-				interrupts = <21 2>;
-			};
-			dma-channel@100 {
-				compatible = "fsl,mpc8548-dma-channel",
-						"fsl,eloplus-dma-channel";
-				reg = <0x100 0x80>;
-				cell-index = <2>;
-				interrupt-parent = <&mpic>;
-				interrupts = <22 2>;
-			};
-			dma-channel@180 {
-				compatible = "fsl,mpc8548-dma-channel",
-						"fsl,eloplus-dma-channel";
-				reg = <0x180 0x80>;
-				cell-index = <3>;
-				interrupt-parent = <&mpic>;
-				interrupts = <23 2>;
-			};
-		};
-
-		enet0: ethernet@24000 {
-			#address-cells = <1>;
-			#size-cells = <1>;
-			cell-index = <0>;
-			device_type = "network";
-			model = "eTSEC";
-			compatible = "gianfar";
-			reg = <0x24000 0x1000>;
-			ranges = <0x0 0x24000 0x1000>;
-			local-mac-address = [ 00 00 00 00 00 00 ];
-			interrupts = <0x1d 0x2 0x1e 0x2 0x22 0x2>;
-			interrupt-parent = <&mpic>;
-			tbi-handle = <&tbi0>;
-			phy-handle = <&phy0>;
-
-			mdio@520 {
-				#address-cells = <1>;
-				#size-cells = <0>;
-				compatible = "fsl,gianfar-mdio";
-				reg = <0x520 0x20>;
-
-				phy0: ethernet-phy@19 {
-					interrupt-parent = <&mpic>;
-					interrupts = <0x6 0x1>;
-					reg = <0x19>;
-					device_type = "ethernet-phy";
-				};
-				phy1: ethernet-phy@1a {
-					interrupt-parent = <&mpic>;
-					interrupts = <0x7 0x1>;
-					reg = <0x1a>;
-					device_type = "ethernet-phy";
-				};
-				tbi0: tbi-phy@11 {
-					reg = <0x11>;
-					device_type = "tbi-phy";
-				};
-			};
-		};
-
-		enet1: ethernet@25000 {
-			#address-cells = <1>;
-			#size-cells = <1>;
-			cell-index = <1>;
-			device_type = "network";
-			model = "eTSEC";
-			compatible = "gianfar";
-			reg = <0x25000 0x1000>;
-			ranges = <0x0 0x25000 0x1000>;
-			local-mac-address = [ 00 00 00 00 00 00 ];
-			interrupts = <0x23 0x2 0x24 0x2 0x28 0x2>;
-			interrupt-parent = <&mpic>;
-			tbi-handle = <&tbi1>;
-			phy-handle = <&phy1>;
-
-			mdio@520 {
-				#address-cells = <1>;
-				#size-cells = <0>;
-				compatible = "fsl,gianfar-tbi";
-				reg = <0x520 0x20>;
-
-				tbi1: tbi-phy@11 {
-					reg = <0x11>;
-					device_type = "tbi-phy";
-				};
-			};
-		};
-
-		serial0: serial@4500 {
-			cell-index = <0>;
-			device_type = "serial";
-			compatible = "fsl,ns16550", "ns16550";
-			reg = <0x4500 0x100>;	// reg base, size
-			clock-frequency = <0>;	// should we fill in in uboot?
-			interrupts = <0x2a 0x2>;
-			interrupt-parent = <&mpic>;
-		};
-
-		serial1: serial@4600 {
-			cell-index = <1>;
-			device_type = "serial";
-			compatible = "fsl,ns16550", "ns16550";
-			reg = <0x4600 0x100>;	// reg base, size
-			clock-frequency = <0>;	// should we fill in in uboot?
-			interrupts = <0x2a 0x2>;
-			interrupt-parent = <&mpic>;
-		};
-
-		global-utilities@e0000 {	//global utilities reg
-			compatible = "fsl,mpc8548-guts";
-			reg = <0xe0000 0x1000>;
-			fsl,has-rstcr;
-		};
-
-		crypto@30000 {
-			compatible = "fsl,sec2.1", "fsl,sec2.0";
-			reg = <0x30000 0x10000>;
-			interrupts = <45 2>;
-			interrupt-parent = <&mpic>;
-			fsl,num-channels = <4>;
-			fsl,channel-fifo-len = <24>;
-			fsl,exec-units-mask = <0xfe>;
-			fsl,descriptor-types-mask = <0x12b0ebf>;
-		};
-
-		mpic: pic@40000 {
-			interrupt-controller;
-			#address-cells = <0>;
-			#interrupt-cells = <2>;
-			reg = <0x40000 0x40000>;
-			compatible = "chrp,open-pic";
-			device_type = "open-pic";
-		};
-	};
-
-	pci0: pci@e0008000 {
-		interrupt-map-mask = <0xf800 0x0 0x0 0x7>;
-		interrupt-map = <
-			/* IDSEL 0x01 (PCI-X slot) @66MHz */
-			0x0800 0x0 0x0 0x1 &mpic 0x2 0x1
-			0x0800 0x0 0x0 0x2 &mpic 0x3 0x1
-			0x0800 0x0 0x0 0x3 &mpic 0x4 0x1
-			0x0800 0x0 0x0 0x4 &mpic 0x1 0x1
-
-			/* IDSEL 0x11 (PCI, 3.3V 32bit) @33MHz */
-			0x8800 0x0 0x0 0x1 &mpic 0x2 0x1
-			0x8800 0x0 0x0 0x2 &mpic 0x3 0x1
-			0x8800 0x0 0x0 0x3 &mpic 0x4 0x1
-			0x8800 0x0 0x0 0x4 &mpic 0x1 0x1>;
-
-		interrupt-parent = <&mpic>;
-		interrupts = <0x18 0x2>;
-		bus-range = <0 0>;
-		ranges = <0x02000000 0x0 0x80000000 0x80000000 0x0 0x10000000
-			  0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00800000>;
-		clock-frequency = <66000000>;
-		#interrupt-cells = <1>;
-		#size-cells = <2>;
-		#address-cells = <3>;
-		reg = <0xe0008000 0x1000>;
-		compatible = "fsl,mpc8540-pcix", "fsl,mpc8540-pci";
-		device_type = "pci";
-	};
-
-	pci1: pcie@e000a000 {
-		interrupt-map-mask = <0xf800 0x0 0x0 0x7>;
-		interrupt-map = <
-
-			/* IDSEL 0x0 (PEX) */
-			0x0000 0x0 0x0 0x1 &mpic 0x0 0x1
-			0x0000 0x0 0x0 0x2 &mpic 0x1 0x1
-			0x0000 0x0 0x0 0x3 &mpic 0x2 0x1
-			0x0000 0x0 0x0 0x4 &mpic 0x3 0x1>;
-
-		interrupt-parent = <&mpic>;
-		interrupts = <0x1a 0x2>;
-		bus-range = <0x0 0xff>;
-		ranges = <0x02000000 0x0 0xa0000000 0xa0000000 0x0 0x10000000
-			  0x01000000 0x0 0x00000000 0xe2800000 0x0 0x08000000>;
-		clock-frequency = <33000000>;
-		#interrupt-cells = <1>;
-		#size-cells = <2>;
-		#address-cells = <3>;
-		reg = <0xe000a000 0x1000>;
-		compatible = "fsl,mpc8548-pcie";
-		device_type = "pci";
-		pcie@0 {
-			reg = <0x0 0x0 0x0 0x0 0x0>;
-			#size-cells = <2>;
-			#address-cells = <3>;
-			device_type = "pci";
-			ranges = <0x02000000 0x0 0xa0000000
-				  0x02000000 0x0 0xa0000000
-				  0x0 0x10000000
-
-				  0x01000000 0x0 0x00000000
-				  0x01000000 0x0 0x00000000
-				  0x0 0x00800000>;
-		};
-	};
 };
+
+/include/ "sbc8548-post.dtsi"
-- 
1.8.1

^ permalink raw reply related

* [PATCH 0/4] powerpc: update sbc8548 flash/mtd settings
From: Paul Gortmaker @ 2013-01-23 20:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Paul Gortmaker

The sbc8548 dts information for flash was out of date with respect
to u-boot configuration changes made just over a year ago.  However
MTD support wasn't enabled by default, so the stale information
wasn't really an issue.

Here we update that information, split the dts file to allow
sharing the common blocks between the two possible flash
mappings, add the second mapping for when the jumpers are
configured for booting off alternate flash, and add the
necessary MTD settings to the defconfig, so that ease of
use out of the box is improved.

Thanks,
Paul

---

The following changes since commit 7d1f9aeff1ee4a20b1aeb377dd0f579fe9647619:

  Linux 3.8-rc4 (2013-01-17 19:25:45 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux.git sbc8548-flash

for you to fetch changes up to e43fbd09e84ad455f84bfd9c317c94d9dac84e31:

  powerpc: enable MTD options in sbc8548 defconfig (2013-01-23 14:49:10 -0500)

----------------------------------------------------------------
Paul Gortmaker (4):
      powerpc: split sbc8548 dts file into pre and post chunks
      powerpc: update sbc8548 flash information to match recent u-boot
      powerpc: add alternate dts file for sbc8548 boot via SODIMM
      powerpc: enable MTD options in sbc8548 defconfig

 arch/powerpc/boot/dts/sbc8548-altflash.dts  | 115 +++++++++
 arch/powerpc/boot/dts/sbc8548-post.dtsi     | 295 +++++++++++++++++++++++
 arch/powerpc/boot/dts/sbc8548-pre.dtsi      |  52 ++++
 arch/powerpc/boot/dts/sbc8548.dts           | 356 ++--------------------------
 arch/powerpc/configs/85xx/sbc8548_defconfig |  19 ++
 5 files changed, 500 insertions(+), 337 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/sbc8548-altflash.dts
 create mode 100644 arch/powerpc/boot/dts/sbc8548-post.dtsi
 create mode 100644 arch/powerpc/boot/dts/sbc8548-pre.dtsi

^ permalink raw reply

* [PATCH 4/4] powerpc: enable MTD options in sbc8548 defconfig
From: Paul Gortmaker @ 2013-01-23 20:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Paul Gortmaker
In-Reply-To: <1358972013-5271-1-git-send-email-paul.gortmaker@windriver.com>

This board has soldered on flash, and a SODIMM flash module.
Both can be used for booting, via switching JP12 and SW2.8
and using the sbc8548-altflash.dts when booting from SODIMM.

Here we enable MTD in kernel so that we can see the bootloader
(and other flash sectors) from linux.

Normal configuration:

 root@sbc8548:~# cat /proc/mtd
 dev:    size   erasesize  name
 mtd0: 007a0000 00020000 "space"
 mtd1: 00060000 00020000 "bootloader"
 mtd2: 03f00000 00080000 "space"
 mtd3: 00100000 00080000 "bootloader"
 root@sbc8548:~# dd if=/dev/mtd1 count=1 bs=48|hexdump -C
 1+0 records in
 1+0 records out
 00000000  27 05 19 56 55 2d 42 6f  6f 74 20 32 30 31 32 2e  |'..VU-Boot 2012.|
 00000010  31 30 2d 64 69 72 74 79  20 28 4a 61 6e 20 31 39  |10-dirty (Jan 19|
 00000020  20 32 30 31 33 20 2d 20  31 39 3a 34 30 3a 31 31  | 2013 - 19:40:11|
 00000030
 root@sbc8548:~# dd if=/dev/mtd3 count=1 bs=48|hexdump -C
 1+0 records in
 1+0 records out
 00000000  27 05 19 56 55 2d 42 6f  6f 74 20 32 30 31 32 2e  |'..VU-Boot 2012.|
 00000010  31 30 2d 64 69 72 74 79  20 28 44 65 63 20 31 33  |10-dirty (Dec 13|
 00000020  20 32 30 31 32 20 2d 20  31 35 3a 30 30 3a 30 37  | 2012 - 15:00:07|
 00000030
 root@sbc8548:~#

Alternate configuration, with sbc8548-altflash.dts:

 root@sbc8548:~# cat /proc/mtd
 dev:    size   erasesize  name
 mtd0: 03f00000 00080000 "space"
 mtd1: 00100000 00080000 "bootloader"
 mtd2: 007a0000 00020000 "space"
 mtd3: 00060000 00020000 "bootloader"
 root@sbc8548:~# dd if=/dev/mtd1 count=1 bs=48|hexdump -C
 1+0 records in
 1+0 records out
 00000000  27 05 19 56 55 2d 42 6f  6f 74 20 32 30 31 32 2e  |'..VU-Boot 2012.|
 00000010  31 30 2d 64 69 72 74 79  20 28 44 65 63 20 31 33  |10-dirty (Dec 13|
 00000020  20 32 30 31 32 20 2d 20  31 35 3a 30 30 3a 30 37  | 2012 - 15:00:07|
 00000030
 root@sbc8548:~# dd if=/dev/mtd3 count=1 bs=48|hexdump -C
 1+0 records in
 1+0 records out
 00000000  27 05 19 56 55 2d 42 6f  6f 74 20 32 30 31 32 2e  |'..VU-Boot 2012.|
 00000010  31 30 2d 64 69 72 74 79  20 28 4a 61 6e 20 31 39  |10-dirty (Jan 19|
 00000020  20 32 30 31 33 20 2d 20  31 39 3a 34 30 3a 31 31  | 2013 - 19:40:11|
 00000030
 root@sbc8548:~#

Note that in the latter, the larger SODIMM device appears 1st,
as mtd0 and mtd1, as indicated in the sizes, and in the date
of the u-boot image.

The kernel configuration is the same in both cases; only the dtb
needs to be changed in accordance with the JP12/SW2.8 settings.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 arch/powerpc/configs/85xx/sbc8548_defconfig | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/powerpc/configs/85xx/sbc8548_defconfig b/arch/powerpc/configs/85xx/sbc8548_defconfig
index 5b2b651..008a7a4 100644
--- a/arch/powerpc/configs/85xx/sbc8548_defconfig
+++ b/arch/powerpc/configs/85xx/sbc8548_defconfig
@@ -55,3 +55,22 @@ CONFIG_ROOT_NFS=y
 # CONFIG_RCU_CPU_STALL_DETECTOR is not set
 CONFIG_SYSCTL_SYSCALL_CHECK=y
 # CONFIG_CRYPTO_ANSI_CPRNG is not set
+CONFIG_MTD=y
+CONFIG_MTD_OF_PARTS=y
+CONFIG_MTD_CHAR=y
+CONFIG_MTD_BLKDEVS=y
+CONFIG_MTD_BLOCK=y
+CONFIG_MTD_CFI=y
+CONFIG_MTD_GEN_PROBE=y
+CONFIG_MTD_CFI_ADV_OPTIONS=y
+CONFIG_MTD_CFI_NOSWAP=y
+CONFIG_MTD_CFI_GEOMETRY=y
+CONFIG_MTD_MAP_BANK_WIDTH_1=y
+CONFIG_MTD_MAP_BANK_WIDTH_2=y
+CONFIG_MTD_MAP_BANK_WIDTH_4=y
+CONFIG_MTD_CFI_I1=y
+CONFIG_MTD_CFI_I2=y
+CONFIG_MTD_CFI_I4=y
+CONFIG_MTD_CFI_INTELEXT=y
+CONFIG_MTD_CFI_UTIL=y
+CONFIG_MTD_PHYSMAP_OF=y
-- 
1.8.1

^ permalink raw reply related

* Re: [PATCH v5 04/45] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Tejun Heo @ 2013-01-23 19:57 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: linux-doc, peterz, fweisbec, linux-kernel, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rostedt, rjw, namhyung, tglx, linux-arm-kernel, netdev, oleg, sbw,
	akpm, linuxppc-dev
In-Reply-To: <51003B20.2060506@linux.vnet.ibm.com>

Hello, Srivatsa.

On Thu, Jan 24, 2013 at 01:03:52AM +0530, Srivatsa S. Bhat wrote:
> Hmm.. I split it up into steps to help explain the reasoning behind
> the code sufficiently, rather than spring all of the intricacies at
> one go (which would make it very hard to write the changelog/comments
> also). The split made it easier for me to document it well in the
> changelog, because I could deal with reasonable chunks of code/complexity
> at a time. IMHO that helps people reading it for the first time to
> understand the logic easily.

I don't know.  It's a judgement call I guess.  I personally would much
prefer having ample documentation as comments in the source itself or
as a separate Documentation/ file as that's what most people are gonna
be looking at to figure out what's going on.  Maybe just compact it a
bit and add more in-line documentation instead?

> > The only two options are either punishing writers or identifying and
> > updating all such possible deadlocks.  percpu_rwsem does the former,
> > right?  I don't know how feasible the latter would be.
> 
> I don't think we can avoid looking into all the possible deadlocks,
> as long as we use rwlocks inside get/put_online_cpus_atomic() (assuming
> rwlocks are fair). Even with Oleg's idea of using synchronize_sched()
> at the writer, we still need to take care of locking rules, because the
> synchronize_sched() only helps avoid the memory barriers at the reader,
> and doesn't help get rid of the rwlocks themselves.

Well, percpu_rwlock don't have to use rwlock for the slow path.  It
can implement its own writer starving locking scheme.  It's not like
implementing slow path global rwlock logic is difficult.

> CPU 0                          CPU 1
> 
> read_lock(&rwlock)
> 
>                               write_lock(&rwlock) //spins, because CPU 0
>                               //has acquired the lock for read
> 
> read_lock(&rwlock)
>    ^^^^^
> What happens here? Does CPU 0 start spinning (and hence deadlock) or will
> it continue realizing that it already holds the rwlock for read?

I don't think rwlock allows nesting write lock inside read lock.
read_lock(); write_lock() will always deadlock.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH v5 04/45] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Srivatsa S. Bhat @ 2013-01-23 19:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-doc, peterz, fweisbec, linux-kernel, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rostedt, rjw, namhyung, tglx, linux-arm-kernel, netdev, oleg, sbw,
	akpm, linuxppc-dev
In-Reply-To: <20130123185522.GG2373@mtj.dyndns.org>

On 01/24/2013 12:25 AM, Tejun Heo wrote:
> Hello, Srivatsa.
> 
> First of all, I'm not sure whether we need to be this step-by-step
> when introducing something new.  It's not like we're transforming an
> existing implementation and it doesn't seem to help understanding the
> series that much either.
> 

Hmm.. I split it up into steps to help explain the reasoning behind
the code sufficiently, rather than spring all of the intricacies at
one go (which would make it very hard to write the changelog/comments
also). The split made it easier for me to document it well in the
changelog, because I could deal with reasonable chunks of code/complexity
at a time. IMHO that helps people reading it for the first time to
understand the logic easily.

> On Tue, Jan 22, 2013 at 01:03:53PM +0530, Srivatsa S. Bhat wrote:
>> Using global rwlocks as the backend for per-CPU rwlocks helps us avoid many
>> lock-ordering related problems (unlike per-cpu locks). However, global
> 
> So, unfortunately, this already seems broken, right?  The problem here
> seems to be that previously, say, read_lock() implied
> preempt_disable() but as this series aims to move away from it, it
> introduces the problem of locking order between such locks and the new
> contruct.
>

Not sure I got your point correctly. Are you referring to Steve's comment
that rwlocks are probably fair now (and hence not really safe when used
like this)? If yes, I haven't actually verified that yet, but yes, that
will make this hard to use, since we need to take care of locking rules.

But suppose rwlocks are unfair (as I had assumed them to be), then we
have absolutely no problems and no lock-ordering to worry about.
 
> The only two options are either punishing writers or identifying and
> updating all such possible deadlocks.  percpu_rwsem does the former,
> right?  I don't know how feasible the latter would be.

I don't think we can avoid looking into all the possible deadlocks,
as long as we use rwlocks inside get/put_online_cpus_atomic() (assuming
rwlocks are fair). Even with Oleg's idea of using synchronize_sched()
at the writer, we still need to take care of locking rules, because the
synchronize_sched() only helps avoid the memory barriers at the reader,
and doesn't help get rid of the rwlocks themselves.
So in short, I don't see how we can punish the writers and thereby somehow
avoid looking into possible deadlocks (if rwlocks are fair).

>  Srivatsa,
> you've been looking at all the places which would require conversion,
> how difficult would doing the latter be?
> 

The problem is that some APIs like smp_call_function() will need to use
get/put_online_cpus_atomic(). That is when the locking becomes tricky
in the subsystem which invokes these APIs with other (subsystem-specific,
internal) locks held. So we could potentially use a convention such as
"Make get/put_online_cpus_atomic() your outer-most calls, within which
you nest the other locks" to rule out all ABBA deadlock possibilities...
But we might still hit some hard-to-convert places.. 

BTW, Steve, fair rwlocks doesn't mean the following scenario will result
in a deadlock right?

CPU 0                          CPU 1

read_lock(&rwlock)

                              write_lock(&rwlock) //spins, because CPU 0
                              //has acquired the lock for read

read_lock(&rwlock)
   ^^^^^
What happens here? Does CPU 0 start spinning (and hence deadlock) or will
it continue realizing that it already holds the rwlock for read?

If the above ends in a deadlock, then its next to impossible to convert
all the places safely (because the above mentioned convention will simply
fall apart).

>> +#define reader_uses_percpu_refcnt(pcpu_rwlock, cpu)			\
>> +		(ACCESS_ONCE(per_cpu(*((pcpu_rwlock)->reader_refcnt), cpu)))
>> +
>> +#define reader_nested_percpu(pcpu_rwlock)				\
>> +			(__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) > 1)
>> +
>> +#define writer_active(pcpu_rwlock)					\
>> +			(__this_cpu_read(*((pcpu_rwlock)->writer_signal)))
> 
> Why are these in the public header file?  Are they gonna be used to
> inline something?
>

No, I can put it in the .c file itself. Will do.
 
>> +static inline void raise_writer_signal(struct percpu_rwlock *pcpu_rwlock,
>> +				       unsigned int cpu)
>> +{
>> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = true;
>> +}
>> +
>> +static inline void drop_writer_signal(struct percpu_rwlock *pcpu_rwlock,
>> +				      unsigned int cpu)
>> +{
>> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = false;
>> +}
>> +
>> +static void announce_writer_active(struct percpu_rwlock *pcpu_rwlock)
>> +{
>> +	unsigned int cpu;
>> +
>> +	for_each_online_cpu(cpu)
>> +		raise_writer_signal(pcpu_rwlock, cpu);
>> +
>> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
>> +}
>> +
>> +static void announce_writer_inactive(struct percpu_rwlock *pcpu_rwlock)
>> +{
>> +	unsigned int cpu;
>> +
>> +	drop_writer_signal(pcpu_rwlock, smp_processor_id());
>> +
>> +	for_each_online_cpu(cpu)
>> +		drop_writer_signal(pcpu_rwlock, cpu);
>> +
>> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
>> +}
> 
> It could be just personal preference but I find the above one line
> wrappers more obfuscating than anything else.  What's the point of
> wrapping writer_signal = true/false into a separate function?  These
> simple wrappers just add layers that people have to dig through to
> figure out what's going on without adding anything of value.  I'd much
> prefer collapsing these into the percpu_write_[un]lock().
>

Sure, I see your point. I'll change that.

Thanks a lot for your feedback Tejun!

Regards,
Srivatsa S. Bhat

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox