LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH 0/7] This patchset adds support for running Linux under the Freescale hypervisor,
From: Tabi Timur-B04825 @ 2011-05-23 21:09 UTC (permalink / raw)
  To: Gala Kumar-B11780
  Cc: Tabi Timur-B04825, linux-kernel@vger.kernel.org, akpm@kernel.org,
	linux-console@vger.kernel.org, greg@kroah.com,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <A221B293-A4DD-4223-AEEA-F3E5243D2C0A@freescale.com>

On Fri, May 20, 2011 at 3:29 PM, Kumar Gala <kumar.gala@freescale.com> wrote:

> Applied to 'test' branch.  (grabbed 'v2' of tty patch).  Fixed merged conflicts.

I don't think you pushed this branch to git.kernel.org

http://git.kernel.org/?p=linux/kernel/git/galak/powerpc.git;a=shortlog;h=refs/heads/test

-- 
Timur Tabi
Linux kernel developer at Freescale


* Re: [PATCH 2/2] powerpc/book3e-64: reraise doorbell when masked by soft-irq-disable
From: Benjamin Herrenschmidt @ 2011-05-23 20:51 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <20110523152618.00a0e5f0@schlenkerla.am.freescale.net>

On Mon, 2011-05-23 at 15:26 -0500, Scott Wood wrote:
> On Sat, 21 May 2011 08:32:58 +1000
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > On Fri, 2011-05-20 at 14:00 -0500, Scott Wood wrote:
> > > Signed-off-by: Scott Wood <scottwood@freescale.com>
> > > ---
> > >  arch/powerpc/kernel/exceptions-64e.S |   22 +++++++++++++++++++++-
> > >  1 files changed, 21 insertions(+), 1 deletions(-)
> > 
> > You can probably remove the doorbell re-check when enabling interrupts
> > now, can't you ?
> 
> Ah, so that's how it currently gets away without re-raising when the
> interrupt happens. :-)
> 
> I'll remove it.

Yup, I was too lazy to make a special case in the exception handlers :-)

Cheers,
Ben.


* Re: [PATCH 2/7] powerpc/mm: 64-bit 4k: use a PMD-based virtual page table
From: Benjamin Herrenschmidt @ 2011-05-23 20:51 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <20110523135433.557e2d63@schlenkerla.am.freescale.net>

On Mon, 2011-05-23 at 13:54 -0500, Scott Wood wrote:
> On Sat, 21 May 2011 08:15:36 +1000
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > On Fri, 2011-05-20 at 15:57 -0500, Scott Wood wrote:
> > 
> > > I see a 2% cost going from virtual pmd to full 4-level walk in the
> > > benchmark mentioned above (some type of sort), and just under 3% in
> > > page-stride lat_mem_rd from lmbench.
> > > 
> > > OTOH, the virtual pmd approach still leaves the possibility of taking a
> > > bunch of virtual page table misses if non-localized accesses happen over a
> > > very large chunk of address space (tens of GiB), and we'd have one fewer
> > > type of TLB miss to worry about complexity-wise with a straight table walk.
> > > 
> > > Let me know what you'd prefer.
> > 
> > I'm tempted to kill the virtual linear feature altogether... it didn't
> > buy us that much. Have you looked if you can snatch back some of those
> > cycles with hand tuning of the level walker ?
> 
> That's after trying a bit of that (pulled the pgd load up before
> normal_tlb_miss, and some other reordering).  Not sure how much more can be
> squeezed out of it with such techniques, at least with e5500.
> 
> Hmm, in the normal miss case we know we're in the first EXTLB level,
> right?  So we could cut out a load/mfspr by subtracting EXTLB from r12
> to get the PACA (that load's latency is pretty well buried, but maybe we
> could replace it with loading pgd, replacing it later if it's a kernel
> region).  Maybe move pgd to the first EXTLB, so it's in the same cache line
> as the state save data. The PACA cacheline containing pgd is probably
> pretty hot in normal kernel code, but not so much in a long stretch of
> userspace plus TLB misses (other than for pgd itself).

Is your linear mapping bolted ? If it is you may be able to cut out most
of the save/restore stuff (SRR0,1, ...) since with a normal walk you
won't take nested misses.
 
> > Would it work/help to have a simple cache of the last pmd & address and
> > compare just that ?
> 
> Maybe.
> 
> It would still slow down the case where you miss that cache -- not by as
> much as a virtual page table miss (and it wouldn't compete for TLB entries
> with actual user pages), but it would happen more often, since you'd only be
> able to cache one pmd.
>
> > Maybe in a SPRG or a known cache hot location like
> > the PACA in a line that we already load anyways ?
> 
> A cache access is faster than a SPRG access on our chips (plus we
> don't have many to spare, especially if we want to avoid swapping SPRG4-7 on
> guest entry/exit in KVM), so I'd favor putting it in the PACA.
> 
> I'll try this stuff out and see what helps.

Cool,

Cheers,
Ben.


* Re: Fwd: UART #1 access on Sequoia board
From: Josh Boyer @ 2011-05-23 20:42 UTC (permalink / raw)
  To: Muhammad Waseem; +Cc: linuxppc-dev
In-Reply-To: <BANLkTikvmy-bkckgxyQttoKQqkuNUd2+5g@mail.gmail.com>

On Tue, May 24, 2011 at 12:22:14AM +0500, Muhammad Waseem wrote:

Please don't top post.

>I am using kernel version 2.6.23 which came with the Sequoia support CD. So
>that means UART # 1 BSP would probably not be there?  But device file
>/dev/ttyS1 is present in rootfs when listed using 'ls'. Do you think
>updating the kernel might solve the problem?

Not necessarily.  You can create device nodes on the rootfs without
anything being bound to them.  One of the easiest ways to see if
something discovered the serial port is to look in the boot output.  You
should see something like:

serial8250.0: ttyS0 at MMIO 0x1ef600300 (irq = 17) is a 16550A                  
console [ttyS0] enabled, bootconsole disabled                                   
console [ttyS0] enabled, bootconsole disabled                                   
serial8250.0: ttyS1 at MMIO 0x1ef600400 (irq = 18) is a 16550A                  
serial8250.0: ttyS2 at MMIO 0x1ef600500 (irq = 19) is a 16550A                  
serial8250.0: ttyS3 at MMIO 0x1ef600600 (irq = 20) is a 16550A   

or:

1ef600300.serial: ttyS0 at MMIO 0x1ef600300 (irq = 17) is a 16550               
1ef600400.serial: ttyS1 at MMIO 0x1ef600400 (irq = 18) is a 16550               
1ef600500.serial: ttyS2 at MMIO 0x1ef600500 (irq = 19) is a 16550               
1ef600600.serial: ttyS3 at MMIO 0x1ef600600 (irq = 20) is a 16550      

>BTW where can I see the board's DTS file?

arch/powerpc/boot/dts/sequoia.dts

josh


* Re: [PATCH 2/2] powerpc/book3e-64: reraise doorbell when masked by soft-irq-disable
From: Scott Wood @ 2011-05-23 20:26 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1305930778.7481.197.camel@pasglop>

On Sat, 21 May 2011 08:32:58 +1000
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Fri, 2011-05-20 at 14:00 -0500, Scott Wood wrote:
> > Signed-off-by: Scott Wood <scottwood@freescale.com>
> > ---
> >  arch/powerpc/kernel/exceptions-64e.S |   22 +++++++++++++++++++++-
> >  1 files changed, 21 insertions(+), 1 deletions(-)
> 
> You can probably remove the doorbell re-check when enabling interrupts
> now, can't you ?

Ah, so that's how it currently gets away without re-raising when the
interrupt happens. :-)

I'll remove it.

-Scott


* Re: [PATCH] oprofile, powerpc: Handle events that raise an exception without overflowing
From: Maynard Johnson @ 2011-05-23 20:04 UTC (permalink / raw)
  To: Eric B Munson
  Cc: robert.richter, linux-kernel, oprofile-list, paulus, linuxppc-dev
In-Reply-To: <20110523193736.GA2997@mgebm.net>

Eric B Munson wrote:
> On Mon, 23 May 2011, Eric B Munson wrote:
> 
>> Commit 0837e3242c73566fc1c0196b4ec61779c25ffc93 fixes a situation on POWER7
>> where events can roll back if a speculative event doesn't actually complete.
>> This can raise a performance monitor exception.  We need to catch this to ensure
>> that we reset the PMC.  In all cases the PMC will be at most 256 cycles from
>> overflow.
>>
>> This patch lifts Anton's fix for the problem in perf and applies it to oprofile
>> as well.
>>
>> Signed-off-by: Eric B Munson <emunson@mgebm.net>
>> Cc: <stable@kernel.org> # as far back as it applies cleanly
> 
> I'd like to get this patch into mainline this merge window if at all possible.
Ack.  I've been able to create a system hang by profiling with speculative
events on POWER7.  This patch fixes that problem.

-Maynard

> 
>> ---
>>  arch/powerpc/oprofile/op_model_power4.c |   24 +++++++++++++++++++++++-
>>  1 files changed, 23 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/powerpc/oprofile/op_model_power4.c b/arch/powerpc/oprofile/op_model_power4.c
>> index 8ee51a2..e6bec74 100644
>> --- a/arch/powerpc/oprofile/op_model_power4.c
>> +++ b/arch/powerpc/oprofile/op_model_power4.c
>> @@ -261,6 +261,28 @@ static int get_kernel(unsigned long pc, unsigned long mmcra)
>>  	return is_kernel;
>>  }
>>  
>> +static bool pmc_overflow(unsigned long val)
>> +{
>> +	if ((int)val < 0)
>> +		return true;
>> +
>> +	/*
>> +	 * Events on POWER7 can roll back if a speculative event doesn't
>> +	 * eventually complete. Unfortunately in some rare cases they will
>> +	 * raise a performance monitor exception. We need to catch this to
>> +	 * ensure we reset the PMC. In all cases the PMC will be 256 or less
>> +	 * cycles from overflow.
>> +	 *
>> +	 * We only do this if the first pass fails to find any overflowing
>> +	 * PMCs because a user might set a period of less than 256 and we
>> +	 * don't want to mistakenly reset them.
>> +	 */
>> +	if (__is_processor(PV_POWER7) && ((0x80000000 - val) <= 256))
>> +		return true;
>> +
>> +	return false;
>> +}
>> +
>>  static void power4_handle_interrupt(struct pt_regs *regs,
>>  				    struct op_counter_config *ctr)
>>  {
>> @@ -281,7 +303,7 @@ static void power4_handle_interrupt(struct pt_regs *regs,
>>  
>>  	for (i = 0; i < cur_cpu_spec->num_pmcs; ++i) {
>>  		val = classic_ctr_read(i);
>> -		if (val < 0) {
>> +		if (pmc_overflow(val)) {
>>  			if (oprofile_running && ctr[i].enabled) {
>>  				oprofile_add_ext_sample(pc, regs, i, is_kernel);
>>  				classic_ctr_write(i, reset_value[i]);
>> -- 
>> 1.7.4.1


* Re: [PATCH] oprofile, powerpc: Handle events that raise an exception without overflowing
From: Eric B Munson @ 2011-05-23 19:37 UTC (permalink / raw)
  To: benh; +Cc: robert.richter, oprofile-list, paulus, linuxppc-dev, linux-kernel
In-Reply-To: <1306160560-5309-1-git-send-email-emunson@mgebm.net>


On Mon, 23 May 2011, Eric B Munson wrote:

> Commit 0837e3242c73566fc1c0196b4ec61779c25ffc93 fixes a situation on POWER7
> where events can roll back if a speculative event doesn't actually complete.
> This can raise a performance monitor exception.  We need to catch this to ensure
> that we reset the PMC.  In all cases the PMC will be at most 256 cycles from
> overflow.
> 
> This patch lifts Anton's fix for the problem in perf and applies it to oprofile
> as well.
> 
> Signed-off-by: Eric B Munson <emunson@mgebm.net>
> Cc: <stable@kernel.org> # as far back as it applies cleanly

I'd like to get this patch into mainline this merge window if at all possible.

> ---
>  arch/powerpc/oprofile/op_model_power4.c |   24 +++++++++++++++++++++++-
>  1 files changed, 23 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/oprofile/op_model_power4.c b/arch/powerpc/oprofile/op_model_power4.c
> index 8ee51a2..e6bec74 100644
> --- a/arch/powerpc/oprofile/op_model_power4.c
> +++ b/arch/powerpc/oprofile/op_model_power4.c
> @@ -261,6 +261,28 @@ static int get_kernel(unsigned long pc, unsigned long mmcra)
>  	return is_kernel;
>  }
>  
> +static bool pmc_overflow(unsigned long val)
> +{
> +	if ((int)val < 0)
> +		return true;
> +
> +	/*
> +	 * Events on POWER7 can roll back if a speculative event doesn't
> +	 * eventually complete. Unfortunately in some rare cases they will
> +	 * raise a performance monitor exception. We need to catch this to
> +	 * ensure we reset the PMC. In all cases the PMC will be 256 or less
> +	 * cycles from overflow.
> +	 *
> +	 * We only do this if the first pass fails to find any overflowing
> +	 * PMCs because a user might set a period of less than 256 and we
> +	 * don't want to mistakenly reset them.
> +	 */
> +	if (__is_processor(PV_POWER7) && ((0x80000000 - val) <= 256))
> +		return true;
> +
> +	return false;
> +}
> +
>  static void power4_handle_interrupt(struct pt_regs *regs,
>  				    struct op_counter_config *ctr)
>  {
> @@ -281,7 +303,7 @@ static void power4_handle_interrupt(struct pt_regs *regs,
>  
>  	for (i = 0; i < cur_cpu_spec->num_pmcs; ++i) {
>  		val = classic_ctr_read(i);
> -		if (val < 0) {
> +		if (pmc_overflow(val)) {
>  			if (oprofile_running && ctr[i].enabled) {
>  				oprofile_add_ext_sample(pc, regs, i, is_kernel);
>  				classic_ctr_write(i, reset_value[i]);
> -- 
> 1.7.4.1
> 



* Fwd: UART #1 access on Sequoia board
From: Muhammad Waseem @ 2011-05-23 19:22 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <BANLkTin5TyfdfNDfXokDdtdpDmnoRQ6aAA@mail.gmail.com>


I am using kernel version 2.6.23 which came with the Sequoia support CD. So
that means UART # 1 BSP would probably not be there?  But device file
/dev/ttyS1 is present in rootfs when listed using 'ls'. Do you think
updating the kernel might solve the problem?

BTW where can I see the board's DTS file?

--
Waseem




On Mon, May 23, 2011 at 5:40 PM, Josh Boyer <jwboyer@linux.vnet.ibm.com> wrote:

> On Mon, May 23, 2011 at 11:34:26AM +0500, Muhammad Waseem wrote:
> >Hello,
> >   I am working on PPC440EPx (Sequoia) to access its UART # 1 port for data
> >transfer, while UART # 0 is connected to remote terminal access on host.
> >However there is no module/driver listed for UART # 1 using 'lsmod'. The
> >kernel version is 2.6.23. How can I access the UART # 1 in user space and/or
> >kernel space? Do I need to develop own BSP for this? please help.
>
> The driver is normally built into the kernel.  If the board DTS file has
> both UARTs listed, and you have /dev/ttyS1 on your rootfs you should be
> able to use it.  However, Sequoia wasn't added until 2.6.24, so I'm not
> really sure what kernel flavor you are using.
>
> josh
>



* Re: [PATCH 2/7] powerpc/mm: 64-bit 4k: use a PMD-based virtual page table
From: Scott Wood @ 2011-05-23 18:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1305929736.7481.188.camel@pasglop>

On Sat, 21 May 2011 08:15:36 +1000
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Fri, 2011-05-20 at 15:57 -0500, Scott Wood wrote:
> 
> > I see a 2% cost going from virtual pmd to full 4-level walk in the
> > benchmark mentioned above (some type of sort), and just under 3% in
> > page-stride lat_mem_rd from lmbench.
> > 
> > OTOH, the virtual pmd approach still leaves the possibility of taking a
> > bunch of virtual page table misses if non-localized accesses happen over a
> > very large chunk of address space (tens of GiB), and we'd have one fewer
> > type of TLB miss to worry about complexity-wise with a straight table walk.
> > 
> > Let me know what you'd prefer.
> 
> I'm tempted to kill the virtual linear feature altogether... it didn't
> buy us that much. Have you looked if you can snatch back some of those
> cycles with hand tuning of the level walker ?

That's after trying a bit of that (pulled the pgd load up before
normal_tlb_miss, and some other reordering).  Not sure how much more can be
squeezed out of it with such techniques, at least with e5500.

Hmm, in the normal miss case we know we're in the first EXTLB level,
right?  So we could cut out a load/mfspr by subtracting EXTLB from r12
to get the PACA (that load's latency is pretty well buried, but maybe we
could replace it with loading pgd, replacing it later if it's a kernel
region).  Maybe move pgd to the first EXTLB, so it's in the same cache line
as the state save data. The PACA cacheline containing pgd is probably
pretty hot in normal kernel code, but not so much in a long stretch of
userspace plus TLB misses (other than for pgd itself).

> Would it work/help to have a simple cache of the last pmd & address and
> compare just that ?

Maybe.

It would still slow down the case where you miss that cache -- not by as
much as a virtual page table miss (and it wouldn't compete for TLB entries
with actual user pages), but it would happen more often, since you'd only be
able to cache one pmd.

> Maybe in a SPRG or a known cache hot location like
> the PACA in a line that we already load anyways ?

A cache access is faster than a SPRG access on our chips (plus we
don't have many to spare, especially if we want to avoid swapping SPRG4-7 on
guest entry/exit in KVM), so I'd favor putting it in the PACA.

I'll try this stuff out and see what helps.

-Scott


* Re: Best Linux choice for POWER7?
From: Josh Boyer @ 2011-05-23 18:11 UTC (permalink / raw)
  To: Gabriel Menini; +Cc: linuxppc-dev
In-Reply-To: <BANLkTinLMava9Oyke2OL7Xy==h8AhBywfQ@mail.gmail.com>

On Mon, May 23, 2011 at 02:43:14PM -0300, Gabriel Menini wrote:
>Hello, list.
>
>I am looking for the most-tested Linux distro for POWER7 architecture.

One of the enterprise distributions is going to be the best bet for a
well tested and supported POWER7.  They, of course, require purchasing a
subscription.  If that isn't something you're looking to pay for, then
your choices are limited to the community distros like Debian.  Fedora
and Ubuntu have a secondary effort, but it's somewhat stale from what I
understand.

>Sorry if it is not the correct list.

I'm not sure there is a great list for that kind of question.  Here is
as good a place as any.

josh


* Best Linux choice for POWER7?
From: Gabriel Menini @ 2011-05-23 17:43 UTC (permalink / raw)
  To: linuxppc-dev

Hello, list.

I am looking for the most-tested Linux distro for POWER7 architecture.

Sorry if it is not the correct list.

Regards,
-- 
Gabriel Menini


* RE: [ v4] powerpc: Force page alignment for initrd reserved memory
From: Milton Miller @ 2011-05-23 17:39 UTC (permalink / raw)
  To: Dave Carroll; +Cc: Paul Mackerras, LPPC, LKML
In-Reply-To: <522F24EF533FC546962ECFA2054FF777373072AB78@MAILSERVER2.cos.astekcorp.com>


On Mon, 23 May 2011 about 10:50:07 -0600, Dave Carroll wrote:
> I'm withdrawing this patch for the moment, due to two areas that need
> further research;
> 
> 1) An adjacent FDT blob, as mentioned by Milton Miller, and
> 

Ok ... by the way, see move_device_tree() in arch/powerpc/kernel/prom.c

> 2) Potential interaction with the crash kernel, as used in
>         init/initramfs.c

which already goes around the start and end of crashk_res, which
is adjusted to PAGE_ALIGN in reserve_crashkernel() in machine_kexec.c

 
> If anyone sees other interactions, please feel free to let me know ...
> 
> Thanks,
> -Dave Carroll

One interaction that I have ignored is preserve_initrd overlapping
crash kernel.  Loading the crash kernel destroys the preserved initrd.
But that is beyond the scope of your current patch (and probably a
separate patch, with cross-architecture scope).

milton


* Re: PCI DMA to user mem on mpc83xx
From: Ira W. Snyder @ 2011-05-23 17:27 UTC (permalink / raw)
  To: Andre Schwarz; +Cc: LinuxPPC List
In-Reply-To: <4DDA2509.6070702@matrix-vision.de>

On Mon, May 23, 2011 at 11:12:41AM +0200, Andre Schwarz wrote:
> Ira,
> 
> we have a pretty old PCI device driver here that needs some basic rework 
> running on 2.6.27 on several MPC83xx.
> It's a simple char-device with "give me some data" implemented using 
> read() resulting in zero-copy DMA to user mem.
> 
> There's get_user_pages() working under the hood along with 
> SetPageDirty() and page_cache_release().
> 
> Main goal is to prepare a sg-list that gets fed into a DMA controller.
> 
> I wonder if there's a more up-to-date/efficient and future proof scheme 
> of creating the mapping.
> 
> 
> Could you provide some pointers or would you stick to the current scheme ?
> 

This scheme is the best you'll come up with for zero-copy IO. I used
get_user_pages_fast(), but otherwise my implementation was the same.
These interfaces should be fairly future proof.

In the end, I realized that most of my transfers were 4 bytes in length,
and zero copy IO was a waste of effort. I decided to use mmap instead.

Ira


* RE: [ v4] powerpc: Force page alignment for initrd reserved memory
From: Dave Carroll @ 2011-05-23 16:50 UTC (permalink / raw)
  To: Dave Carroll, 'Milton Miller'; +Cc: Paul Mackerras, LPPC, LKML
In-Reply-To: <522F24EF533FC546962ECFA2054FF777373072AB75@MAILSERVER2.cos.astekcorp.com>


I'm withdrawing this patch for the moment, due to two areas that need
further research;

1) An adjacent FDT blob, as mentioned by Milton Miller, and

2) Potential interaction with the crash kernel, as used in
        init/initramfs.c

If anyone sees other interactions, please feel free to let me know ...

Thanks,
-Dave Carroll


* Re: [PATCH V2 2/2] cpc925_edac: support single-processor configurations
From: Segher Boessenkool @ 2011-05-23 15:50 UTC (permalink / raw)
  To: Dmitry Eremin-Solenikov
  Cc: Harry Ciao, linuxppc-dev, Paul Mackerras, Doug Thompson
In-Reply-To: <1306059295-25806-2-git-send-email-dbaryshkov@gmail.com>

> If second CPU is not enabled, CPC925 EDAC driver will spill out warnings
> about errors on second Processor Interface. Support masking that out,
> by detecting at runtime which CPUs are present in device tree.
>
> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
> Cc: Harry Ciao <qingtao.cao@windriver.com>
> Cc: Doug Thompson <dougthompson@xmission.com>
> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>

Acked-by: Segher Boessenkool <segher@kernel.crashing.org>

Minor stuff...

> +	/* Get first CPU node */

Comment doesn't match code.

> +	for (cpunode = NULL;
> +	     (cpunode = of_get_next_child(cpus, cpunode)) != NULL;) {

Use a while loop instead?

> +		const u32 *reg = of_get_property(cpunode, "reg", NULL);
> +
> +		if (!strcmp(cpunode->type, "cpu") && reg != NULL)
> +			mask &= ~APIMASK_ADI(*reg);
> +	}

You might want to check if the "reg" value is < 2, you get C undefined
behaviour if it is too big (not that that should happen), and it's clearer
code anyway.

> +	cpumask = cpc925_cpu_getmask();

You could choose a function name that makes more clear these are the
processor _interfaces_ that are _not_ used :-)

You could cache this value as well.
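
The review points above (while loop, bounds check on "reg", a name that says
these are the unused interfaces) can be sketched in plain user-space C. This
is only an illustration under stated assumptions: iface_bit() is a
hypothetical stand-in for APIMASK_ADI(), whose real bit layout lives in the
driver, and the "reg" values are passed in as an array rather than read from
the device tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PROC_IFACES 2u	/* assumed: two processor interfaces */

/* Hypothetical stand-in for APIMASK_ADI(n); not the driver's real layout. */
static uint32_t iface_bit(uint32_t n)
{
	return 1u << n;
}

/*
 * Return the mask of processor interfaces that are NOT in use, given the
 * "reg" values of the CPU nodes that are present.
 */
static uint32_t unused_iface_mask(const uint32_t *regs, size_t count)
{
	uint32_t mask = iface_bit(0) | iface_bit(1);
	size_t i = 0;

	while (i < count) {
		uint32_t reg = regs[i++];

		/* Ignore out-of-range values instead of shifting by them. */
		if (reg < NUM_PROC_IFACES)
			mask &= ~iface_bit(reg);
	}

	return mask;
}
```

With only CPU 0 present the sketch leaves the bit for interface 1 set; an
out-of-range "reg" value leaves the mask unchanged.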


Segher


* Re: [PATCH V2 1/2] Maple: register CPC925 EDAC device on all boards with CPC925
From: Segher Boessenkool @ 2011-05-23 15:40 UTC (permalink / raw)
  To: Dmitry Eremin-Solenikov; +Cc: Harry Ciao, Paul Mackerras, linuxppc-dev
In-Reply-To: <1306059295-25806-1-git-send-email-dbaryshkov@gmail.com>

> Currently Maple setup code creates cpc925_edac device only on
> Motorola ATCA-6101 blade. Make setup code check bridge revision
> and enable EDAC on all U3H bridges.
>
> Verified on Momentum MapleD (ppc970fx kit) board.
>
> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>

Acked-by: Segher Boessenkool <segher@kernel.crashing.org>

One tiny thing:

> +	if (rev >= 0x34 && rev <= 0x3f) { /* U3H */
> +		printk(KERN_ERR "%s: Non-CPC925(U3H) bridge revision: %02x\n",
> +			__func__, rev);
> +		return -ENODEV;
> +	}

That's not really an error, is it?


Segher


* [PATCH] oprofile, powerpc: Handle events that raise an exception without overflowing
From: Eric B Munson @ 2011-05-23 14:22 UTC (permalink / raw)
  To: benh
  Cc: robert.richter, linux-kernel, oprofile-list, Eric B Munson,
	paulus, linuxppc-dev

Commit 0837e3242c73566fc1c0196b4ec61779c25ffc93 fixes a situation on POWER7
where events can roll back if a speculative event doesn't actually complete.
This can raise a performance monitor exception.  We need to catch this to ensure
that we reset the PMC.  In all cases the PMC will be at most 256 cycles from
overflow.

This patch lifts Anton's fix for the problem in perf and applies it to oprofile
as well.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
Cc: <stable@kernel.org> # as far back as it applies cleanly
---
 arch/powerpc/oprofile/op_model_power4.c |   24 +++++++++++++++++++++++-
 1 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/oprofile/op_model_power4.c b/arch/powerpc/oprofile/op_model_power4.c
index 8ee51a2..e6bec74 100644
--- a/arch/powerpc/oprofile/op_model_power4.c
+++ b/arch/powerpc/oprofile/op_model_power4.c
@@ -261,6 +261,28 @@ static int get_kernel(unsigned long pc, unsigned long mmcra)
 	return is_kernel;
 }
 
+static bool pmc_overflow(unsigned long val)
+{
+	if ((int)val < 0)
+		return true;
+
+	/*
+	 * Events on POWER7 can roll back if a speculative event doesn't
+	 * eventually complete. Unfortunately in some rare cases they will
+	 * raise a performance monitor exception. We need to catch this to
+	 * ensure we reset the PMC. In all cases the PMC will be 256 or less
+	 * cycles from overflow.
+	 *
+	 * We only do this if the first pass fails to find any overflowing
+	 * PMCs because a user might set a period of less than 256 and we
+	 * don't want to mistakenly reset them.
+	 */
+	if (__is_processor(PV_POWER7) && ((0x80000000 - val) <= 256))
+		return true;
+
+	return false;
+}
+
 static void power4_handle_interrupt(struct pt_regs *regs,
 				    struct op_counter_config *ctr)
 {
@@ -281,7 +303,7 @@ static void power4_handle_interrupt(struct pt_regs *regs,
 
 	for (i = 0; i < cur_cpu_spec->num_pmcs; ++i) {
 		val = classic_ctr_read(i);
-		if (val < 0) {
+		if (pmc_overflow(val)) {
 			if (oprofile_running && ctr[i].enabled) {
 				oprofile_add_ext_sample(pc, regs, i, is_kernel);
 				classic_ctr_write(i, reset_value[i]);
-- 
1.7.4.1
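
For readers who want to try the boundary cases of the check outside the
kernel, here is a stand-alone model of pmc_overflow(). It is a sketch, not
the kernel code: the counter is taken as a 32-bit value, and the is_power7
argument is a hypothetical stand-in for __is_processor(PV_POWER7).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 set means the 32-bit counter has overflowed. */
#define PMC_OVERFLOW_BIT	0x80000000u
/* Window from the patch comment: "256 or less cycles from overflow". */
#define PMC_ROLLBACK_WINDOW	256u

/*
 * User-space model of pmc_overflow(). is_power7 is a hypothetical
 * replacement for the __is_processor(PV_POWER7) test in the patch.
 */
static bool pmc_overflow_model(uint32_t val, bool is_power7)
{
	if (val & PMC_OVERFLOW_BIT)
		return true;

	/* val is below 0x80000000 here, so the subtraction cannot wrap. */
	if (is_power7 && (PMC_OVERFLOW_BIT - val) <= PMC_ROLLBACK_WINDOW)
		return true;

	return false;
}
```

A counter of 0x7fffff00 is exactly 256 away from overflow and is treated as
overflowed only when is_power7 is true; 0x7ffffeff is 257 away and is not.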


* Re: UART #1 access on Sequoia board
From: Josh Boyer @ 2011-05-23 12:40 UTC (permalink / raw)
  To: Muhammad Waseem; +Cc: linuxppc-dev
In-Reply-To: <BANLkTims8CrK6s0bJZPBt3jTDRDKzg2e-A@mail.gmail.com>

On Mon, May 23, 2011 at 11:34:26AM +0500, Muhammad Waseem wrote:
>Hello,
>   I am working on PPC440EPx (Sequoia) to access its UART # 1 port for data
>transfer, while UART # 0 is connected to remote terminal access on host.
>However there is no module/driver listed for UART # 1 using 'lsmod'. The
>kernel version is 2.6.23. How can I access the UART # 1 in user space and/or
>kernel space? Do I need to develop own BSP for this? please help.

The driver is normally built into the kernel.  If the board DTS file has
both UARTs listed, and you have /dev/ttyS1 on your rootfs you should be
able to use it.  However, Sequoia wasn't added until 2.6.24, so I'm not
really sure what kernel flavor you are using.
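
As a rough illustration of the user-space side, the sketch below opens a
serial device and puts it into raw 8N1 mode using the POSIX termios calls.
It assumes the kernel has actually bound a driver to the node; open_uart()
is a hypothetical helper name, and the device path and speed are whatever
the caller passes in.

```c
#include <assert.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/*
 * Open a serial device in raw 8N1 mode at the given speed.
 * Returns a file descriptor, or -1 on failure (for example when the
 * node exists on the rootfs but no driver is bound to it).
 */
static int open_uart(const char *path, speed_t speed)
{
	struct termios tio;
	int fd = open(path, O_RDWR | O_NOCTTY);

	if (fd < 0)
		return -1;

	if (tcgetattr(fd, &tio) < 0)
		goto fail;

	/* Raw mode: no input translation, no output processing, no echo. */
	tio.c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP |
			 INLCR | IGNCR | ICRNL | IXON);
	tio.c_oflag &= ~OPOST;
	tio.c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);

	/* 8 data bits, no parity, one stop bit, ignore modem control lines. */
	tio.c_cflag &= ~(CSIZE | PARENB | CSTOPB);
	tio.c_cflag |= CS8 | CREAD | CLOCAL;

	/* Block until at least one byte is available. */
	tio.c_cc[VMIN] = 1;
	tio.c_cc[VTIME] = 0;

	if (cfsetispeed(&tio, speed) < 0 || cfsetospeed(&tio, speed) < 0)
		goto fail;

	if (tcsetattr(fd, TCSANOW, &tio) < 0)
		goto fail;

	return fd;

fail:
	close(fd);
	return -1;
}
```

A caller would use something like open_uart("/dev/ttyS1", B9600) and then
read() and write() on the returned descriptor. If the call fails even though
the node exists, that is consistent with nothing being bound to the port.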

josh


* [PATCH][v3] powerpc/85xx: add host-pci(e) bridge only for RC
From: Prabhakar Kushwaha @ 2011-05-23 10:23 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: meet2prabhu, Vivek Mahajan, Prabhakar Kushwaha

The FSL PCIe controller can act as an agent (EP) or a host (RC).
In agent (EP) mode the controller is configured by the host, so it does not
need to be registered with the PCI(e) subsystem.

Add and configure PCIe controller only for RC mode.

Signed-off-by: Vivek Mahajan <vivek.mahajan@freescale.com>
Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
---
 Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)

 Changes for v2: Incorporated Kumar's comment
 	- Use PCI_CLASS_PROG instead of PCI_HEADER_TYPE 
 Changes for v3: 
 	- updated if check condition
	- removed checkpatch warning

 arch/powerpc/sysdev/fsl_pci.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 68ca929..4a1d37c 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -323,6 +323,7 @@ int __init fsl_add_bridge(struct device_node *dev, int is_primary)
 	struct pci_controller *hose;
 	struct resource rsrc;
 	const int *bus_range;
+	u8 progif;
 
 	if (!of_device_is_available(dev)) {
 		pr_warning("%s: disabled\n", dev->full_name);
@@ -353,6 +354,19 @@ int __init fsl_add_bridge(struct device_node *dev, int is_primary)
 
 	setup_indirect_pci(hose, rsrc.start, rsrc.start + 0x4,
 		PPC_INDIRECT_TYPE_BIG_ENDIAN);
+
+	early_read_config_byte(hose, 0, 0, PCI_CLASS_PROG, &progif);
+	if ((progif & 1) == 1) {
+		u32 temp;
+
+		temp = (u32)hose->cfg_data & ~PAGE_MASK;
+		if (((u32)hose->cfg_data & PAGE_MASK) != (u32)hose->cfg_addr)
+			iounmap(hose->cfg_data - temp);
+		iounmap(hose->cfg_addr);
+		pcibios_free_controller(hose);
+		return 0;
+	}
+
 	setup_pci_cmd(hose);
 
 	/* check PCI express link status */
-- 
1.7.4.1
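
The two pieces of arithmetic in the hunk above (the bit 0 test on the
programming-interface byte and the page base/offset split used before
iounmap()) can be modelled in user space as follows. This is a sketch under
assumptions: a 4 KiB page size, hypothetical helper names, and plain integers
in place of the __iomem pointers. Note that the patch compares the page base
of cfg_data against hose->cfg_addr directly; the sketch masks both sides,
which is equivalent only when cfg_addr is page-aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed 4 KiB pages */
#define SKETCH_PAGE_MASK	(~(uintptr_t)(SKETCH_PAGE_SIZE - 1))

/* The patch skips registration when bit 0 of the prog-if byte is set. */
static bool progif_is_agent(uint8_t progif)
{
	return (progif & 1) == 1;
}

/* Start of the page containing addr (addr & PAGE_MASK in the patch). */
static uintptr_t page_base(uintptr_t addr)
{
	return addr & SKETCH_PAGE_MASK;
}

/* Offset of addr within its page (addr & ~PAGE_MASK in the patch). */
static uintptr_t page_offset(uintptr_t addr)
{
	return addr & ~SKETCH_PAGE_MASK;
}

/* True when cfg_data lives in a different page, i.e. its own mapping. */
static bool needs_separate_unmap(uintptr_t cfg_addr, uintptr_t cfg_data)
{
	return page_base(cfg_data) != page_base(cfg_addr);
}
```

With cfg_data only 4 bytes above cfg_addr, as in the setup_indirect_pci()
call shown in the hunk, both addresses fall in the same page and only one
unmap is needed.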


* [PATCH] powerpc/5200: add GPIO functions for simple interrupt GPIOs
From: Anatolij Gustschin @ 2011-05-23  9:25 UTC (permalink / raw)
  To: linuxppc-dev

The mpc52xx_gpio driver currently supports 8 wakeup GPIOs and 32
simple GPIOs. Extend it to also support GPIO function of 8 simple
interrupt GPIOs controlled in the standard GPIO register module.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
---
 arch/powerpc/platforms/52xx/mpc52xx_gpio.c |  117 ++++++++++++++++++++++++++++
 1 files changed, 117 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/52xx/mpc52xx_gpio.c b/arch/powerpc/platforms/52xx/mpc52xx_gpio.c
index 1757d1d..42a0759 100644
--- a/arch/powerpc/platforms/52xx/mpc52xx_gpio.c
+++ b/arch/powerpc/platforms/52xx/mpc52xx_gpio.c
@@ -35,6 +35,9 @@ struct mpc52xx_gpiochip {
 	unsigned int shadow_dvo;
 	unsigned int shadow_gpioe;
 	unsigned int shadow_ddr;
+	unsigned char sint_shadow_dvo;
+	unsigned char sint_shadow_gpioe;
+	unsigned char sint_shadow_ddr;
 };
 
 /*
@@ -309,6 +312,100 @@ mpc52xx_simple_gpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val)
 	return 0;
 }
 
+static int mpc52xx_sint_gpio_get(struct gpio_chip *gc, unsigned int gpio)
+{
+	struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
+	struct mpc52xx_gpio __iomem *regs = mm_gc->regs;
+	unsigned int ret;
+
+	ret = (in_8(&regs->sint_ival) >> (7 - gpio)) & 1;
+
+	pr_debug("%s: gpio: %d ret: %d\n", __func__, gpio, ret);
+
+	return ret;
+}
+
+static inline void
+__mpc52xx_sint_gpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
+{
+	struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
+	struct mpc52xx_gpiochip *chip = container_of(mm_gc,
+			struct mpc52xx_gpiochip, mmchip);
+	struct mpc52xx_gpio __iomem *regs = mm_gc->regs;
+
+	if (val)
+		chip->sint_shadow_dvo |= 1 << (7 - gpio);
+	else
+		chip->sint_shadow_dvo &= ~(1 << (7 - gpio));
+
+	out_8(&regs->sint_dvo, chip->sint_shadow_dvo);
+}
+
+static void
+mpc52xx_sint_gpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&gpio_lock, flags);
+
+	__mpc52xx_sint_gpio_set(gc, gpio, val);
+
+	spin_unlock_irqrestore(&gpio_lock, flags);
+
+	pr_debug("%s: gpio: %d val: %d\n", __func__, gpio, val);
+}
+
+static int mpc52xx_sint_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio)
+{
+	struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
+	struct mpc52xx_gpiochip *chip = container_of(mm_gc,
+			struct mpc52xx_gpiochip, mmchip);
+	struct mpc52xx_gpio __iomem *regs = mm_gc->regs;
+	unsigned long flags;
+
+	spin_lock_irqsave(&gpio_lock, flags);
+
+	/* set the direction */
+	chip->sint_shadow_ddr &= ~(1 << (7 - gpio));
+	out_8(&regs->sint_ddr, chip->sint_shadow_ddr);
+
+	/* and enable the pin */
+	chip->sint_shadow_gpioe |= 1 << (7 - gpio);
+	out_8(&regs->sint_gpioe, chip->sint_shadow_gpioe);
+
+	spin_unlock_irqrestore(&gpio_lock, flags);
+
+	return 0;
+}
+
+static int
+mpc52xx_sint_gpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val)
+{
+	struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
+	struct mpc52xx_gpio __iomem *regs = mm_gc->regs;
+	struct mpc52xx_gpiochip *chip = container_of(mm_gc,
+			struct mpc52xx_gpiochip, mmchip);
+	unsigned long flags;
+
+	spin_lock_irqsave(&gpio_lock, flags);
+
+	__mpc52xx_sint_gpio_set(gc, gpio, val);
+
+	/* Then set direction */
+	chip->sint_shadow_ddr |= 1 << (7 - gpio);
+	out_8(&regs->sint_ddr, chip->sint_shadow_ddr);
+
+	/* Finally enable the pin */
+	chip->sint_shadow_gpioe |= 1 << (7 - gpio);
+	out_8(&regs->sint_gpioe, chip->sint_shadow_gpioe);
+
+	spin_unlock_irqrestore(&gpio_lock, flags);
+
+	pr_debug("%s: gpio: %d val: %d\n", __func__, gpio, val);
+
+	return 0;
+}
+
 static int __devinit mpc52xx_simple_gpiochip_probe(struct platform_device *ofdev)
 {
 	struct mpc52xx_gpiochip *chip;
@@ -337,6 +434,26 @@ static int __devinit mpc52xx_simple_gpiochip_probe(struct platform_device *ofdev
 	chip->shadow_ddr = in_be32(&regs->simple_ddr);
 	chip->shadow_dvo = in_be32(&regs->simple_dvo);
 
+	chip = kzalloc(sizeof(*chip), GFP_KERNEL);
+	if (!chip)
+		return -ENOMEM;
+
+	gc = &chip->mmchip.gc;
+
+	gc->ngpio            = 8;
+	gc->direction_input  = mpc52xx_sint_gpio_dir_in;
+	gc->direction_output = mpc52xx_sint_gpio_dir_out;
+	gc->get              = mpc52xx_sint_gpio_get;
+	gc->set              = mpc52xx_sint_gpio_set;
+
+	ret = of_mm_gpiochip_add(ofdev->dev.of_node, &chip->mmchip);
+	if (ret)
+		return ret;
+
+	regs = chip->mmchip.regs;
+	chip->sint_shadow_gpioe = in_8(&regs->sint_gpioe);
+	chip->sint_shadow_ddr = in_8(&regs->sint_ddr);
+	chip->sint_shadow_dvo = in_8(&regs->sint_dvo);
 	return 0;
 }
 
-- 
1.7.1
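
The bit indexing used by the sint helpers in the patch above can be sketched in plain userspace C. This is only an illustration: it assumes, as the patch's `1 << (7 - gpio)` expression implies, that GPIO 0 maps to the most significant bit of the 8-bit register. The names `sint_mask`, `sint_shadow_set` and `sint_get` are hypothetical, and an ordinary value stands in for the memory-mapped register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit mask for one of the 8 pins; pin 0 is the MSB. */
static inline uint8_t sint_mask(unsigned int gpio)
{
	assert(gpio < 8);
	return (uint8_t)(1u << (7 - gpio));
}

/*
 * Update a shadow copy of an output register and return the new value,
 * which a driver would then write to the hardware with out_8().
 */
static inline uint8_t sint_shadow_set(uint8_t shadow, unsigned int gpio, int val)
{
	if (val)
		return (uint8_t)(shadow | sint_mask(gpio));
	return (uint8_t)(shadow & (uint8_t)~sint_mask(gpio));
}

/* Extract one pin's level from a register value, as the get() path does. */
static inline int sint_get(uint8_t reg, unsigned int gpio)
{
	assert(gpio < 8);
	return (reg >> (7 - gpio)) & 1;
}
```

Keeping the shadow byte means a set operation never has to read the output register back, which is why the probe path seeds the shadows once from the hardware.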

^ permalink raw reply related

* PCI DMA to user mem on mpc83xx
From: Andre Schwarz @ 2011-05-23  9:12 UTC (permalink / raw)
  To: Ira W. Snyder; +Cc: LinuxPPC List

Ira,

we have a pretty old PCI device driver here that needs some basic rework 
running on 2.6.27 on several MPC83xx.
It's a simple char-device with "give me some data" implemented using 
read() resulting in zero-copy DMA to user mem.

There's get_user_pages() working under the hood along with 
SetPageDirty() and page_cache_release().

Main goal is to prepare a sg-list that gets fed into a DMA controller.

I wonder if there's a more up-to-date/efficient and future proof scheme 
of creating the mapping.


Could you provide some pointers or would you stick to the current scheme ?


Regards,
André

MATRIX VISION GmbH, Talstrasse 16, DE-71570 Oppenweiler
Registergericht: Amtsgericht Stuttgart, HRB 271090
Geschaeftsfuehrer: Gerhard Thullner, Werner Armingeon, Uwe Furtner
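
For readers following the sg-list question above, here is a rough userspace sketch of the arithmetic involved in splitting a user buffer into per-page segments. It is not kernel code and not a recommendation of a particular API: it assumes 4K pages, and `struct seg`, `count_pages` and `fill_segs` are hypothetical names. In a real driver the pages would come from get_user_pages() as described in the mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4K pages */

/* Hypothetical descriptor for one scatter-gather entry. */
struct seg {
	size_t page_index;  /* index into the pinned page array */
	size_t offset;      /* byte offset inside that page */
	size_t len;         /* bytes covered in that page */
};

/* Number of pages touched by the buffer [addr, addr + len). */
static size_t count_pages(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;
	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + len - 1) / SKETCH_PAGE_SIZE;
	return (size_t)(last - first + 1);
}

/* Fill up to max entries; returns the number of entries written. */
static size_t fill_segs(uintptr_t addr, size_t len, struct seg *out, size_t max)
{
	size_t n = 0;
	size_t offset = addr % SKETCH_PAGE_SIZE;

	assert(out != NULL);
	while (len > 0 && n < max) {
		size_t chunk = SKETCH_PAGE_SIZE - offset;

		if (chunk > len)
			chunk = len;
		out[n].page_index = n;
		out[n].offset = offset;
		out[n].len = chunk;
		len -= chunk;
		offset = 0;  /* only the first page can start mid-page */
		n++;
	}
	return n;
}
```

Only the first and last entries can be shorter than a page; every entry in between covers a full page starting at offset 0.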

^ permalink raw reply

* UART #1 access on Sequoia board
From: Muhammad Waseem @ 2011-05-23  6:34 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 389 bytes --]

Hello,
   I am working on a PPC440EPx (Sequoia) board and want to use its UART #1
port for data transfer, while UART #0 is connected to a remote terminal on the
host. However, 'lsmod' lists no module/driver for UART #1. The kernel
version is 2.6.23. How can I access UART #1 from user space and/or
kernel space? Do I need to develop my own BSP for this? Please help.

regards,
Waseem

^ permalink raw reply

* [PATCH][v2] powerpc/85xx: add host-pci(e) bridge only for RC
From: Prabhakar Kushwaha @ 2011-05-23  6:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: meet2prabhu, Vivek Mahajan, Prabhakar Kushwaha

FSL PCIe controllers can act as agent (EP) or host (RC).
In agent (EP) mode the controller is configured by the host, so it does not
need to be added to the PCI(e) sub-system.

Add and configure the PCIe controller only in RC mode.

Signed-off-by: Vivek Mahajan <vivek.mahajan@freescale.com>
Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
---
 Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)

 Changes for v2: Incorporated Kumar's comment
 	- Use PCI_CLASS_PROG instead of PCI_HEADER_TYPE 

 arch/powerpc/sysdev/fsl_pci.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 68ca929..4a1d37c 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -323,6 +323,7 @@ int __init fsl_add_bridge(struct device_node *dev, int is_primary)
 	struct pci_controller *hose;
 	struct resource rsrc;
 	const int *bus_range;
+	u8 progif;
 
 	if (!of_device_is_available(dev)) {
 		pr_warning("%s: disabled\n", dev->full_name);
@@ -353,6 +354,19 @@ int __init fsl_add_bridge(struct device_node *dev, int is_primary)
 
 	setup_indirect_pci(hose, rsrc.start, rsrc.start + 0x4,
 		PPC_INDIRECT_TYPE_BIG_ENDIAN);
+
+	early_read_config_byte(hose, 0, 0, PCI_CLASS_PROG, &progif);
+	if (progif & 1 ) {
+		u32 temp;
+
+		temp = (u32)hose->cfg_data & ~PAGE_MASK;
+		if (((u32)hose->cfg_data & PAGE_MASK) != (u32)hose->cfg_addr)
+			iounmap(hose->cfg_data - temp);
+		iounmap(hose->cfg_addr);
+		pcibios_free_controller(hose);
+		return 0;
+	}
+
 	setup_pci_cmd(hose);
 
 	/* check PCI express link status */
-- 
1.7.4.1
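
The two checks in the hunk above can be sketched in isolation as follows. This is a userspace illustration under stated assumptions: 4K pages, 32-bit addresses, and the meaning of bit 0 of PCI_CLASS_PROG taken from the patch itself (set means agent mode). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u                        /* assumption: 4K pages */
#define SKETCH_PAGE_MASK (~(uint32_t)(SKETCH_PAGE_SIZE - 1))

/*
 * Per the patch, bit 0 of the programming interface byte being set means
 * the controller runs as an agent (EP) and should not be registered.
 */
static int is_agent_mode(uint8_t progif)
{
	return (progif & 1) != 0;
}

/* Offset of an address inside its page: what the patch calls temp. */
static uint32_t page_offset(uint32_t addr)
{
	return addr & ~SKETCH_PAGE_MASK;
}

/*
 * Nonzero when cfg_data lives in a different page than cfg_addr and so
 * has a separate mapping to undo. Mirrors the patch's comparison, which
 * relies on cfg_addr itself being page aligned.
 */
static int needs_separate_unmap(uint32_t cfg_addr, uint32_t cfg_data)
{
	assert(page_offset(cfg_addr) == 0);
	return (cfg_data & SKETCH_PAGE_MASK) != cfg_addr;
}
```

When both registers share one page, a single unmap of cfg_addr is enough, which is the common case for a cfg_data that sits 4 bytes after cfg_addr.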

^ permalink raw reply related

* [PATCH v4] powerpc: Force page alignment for initrd reserved memory
From: Dave Carroll @ 2011-05-23  2:31 UTC (permalink / raw)
  To: 'Milton Miller'; +Cc: LPPC, Paul Mackerras, LKML
In-Reply-To: <initrd-reserve-reply2@mdm.bga.com>

When using 64K pages with a separate cpio rootfs, U-Boot will align
the rootfs on a 4K page boundary. When the memory is reserved, and
subsequent early memblock_alloc is called, it will allocate memory
between the 64K page alignment and reserved memory. When the reserved
memory is subsequently freed, it is done so by pages, causing the
early memblock_alloc requests to be re-used, which in my case, caused
the device-tree to be clobbered.

This patch forces the reserved memory for initrd to be kernel page
aligned, and adds the same range extension when freeing initrd.

Signed-off-by: Dave Carroll <dcarroll@astekcorp.com>
---

* I think this handles Milton's concerns with the exception of
  a packed FDT next to initrd with mismatched page sizes. That
  would require coordination with the bootloader.

 arch/powerpc/kernel/prom.c |    4 +++-
 arch/powerpc/mm/init_32.c  |   15 +++++++++------
 arch/powerpc/mm/init_64.c  |   15 +++++++++------
 3 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 48aeb55..387e5c9 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -555,7 +555,9 @@ static void __init early_reserve_mem(void)
 #ifdef CONFIG_BLK_DEV_INITRD
        /* then reserve the initrd, if any */
        if (initrd_start && (initrd_end > initrd_start))
-               memblock_reserve(__pa(initrd_start), initrd_end - initrd_start);
+               memblock_reserve(_ALIGN_DOWN(__pa(initrd_start), PAGE_SIZE),
+                       _ALIGN_UP(initrd_end, PAGE_SIZE) -
+                       _ALIGN_DOWN(initrd_start, PAGE_SIZE));
 #endif /* CONFIG_BLK_DEV_INITRD */

 #ifdef CONFIG_PPC32
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index d65b591..1aad444 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -226,13 +226,16 @@ void free_initmem(void)
 #ifdef CONFIG_BLK_DEV_INITRD
 void free_initrd_mem(unsigned long start, unsigned long end)
 {
-       if (start < end)
+       if (start && (start < end)) {
+               start = _ALIGN_DOWN(start, PAGE_SIZE);
+               end = _ALIGN_UP(end, PAGE_SIZE);
                 printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10);
-       for (; start < end; start += PAGE_SIZE) {
-               ClearPageReserved(virt_to_page(start));
-               init_page_count(virt_to_page(start));
-               free_page(start);
-               totalram_pages++;
+               for (; start < end; start += PAGE_SIZE) {
+                       ClearPageReserved(virt_to_page(start));
+                       init_page_count(virt_to_page(start));
+                       free_page(start);
+                       totalram_pages++;
+               }
        }
 }
 #endif
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 6374b21..fa9586b 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -102,13 +102,16 @@ void free_initmem(void)
 #ifdef CONFIG_BLK_DEV_INITRD
 void free_initrd_mem(unsigned long start, unsigned long end)
 {
-       if (start < end)
+       if (start && (start < end)) {
+               start = _ALIGN_DOWN(start, PAGE_SIZE);
+               end = _ALIGN_UP(end, PAGE_SIZE);
                 printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10);
-       for (; start < end; start += PAGE_SIZE) {
-               ClearPageReserved(virt_to_page(start));
-               init_page_count(virt_to_page(start));
-               free_page(start);
-               totalram_pages++;
+               for (; start < end; start += PAGE_SIZE) {
+                       ClearPageReserved(virt_to_page(start));
+                       init_page_count(virt_to_page(start));
+                       free_page(start);
+                       totalram_pages++;
+               }
        }
 }
 #endif
--
1.7.4
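
A small worked example of the alignment arithmetic described in the commit message. It assumes 64K kernel pages and uses made-up initrd addresses that are only 4K aligned; `align_down` and `align_up` are local stand-ins for the kernel's _ALIGN_DOWN and _ALIGN_UP macros, and `reserved_size` is a hypothetical helper.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 0x10000ul  /* assumption: 64K kernel pages */

/* Local stand-ins for _ALIGN_DOWN/_ALIGN_UP; size must be a power of two. */
static unsigned long align_down(unsigned long addr, unsigned long size)
{
	return addr & ~(size - 1);
}

static unsigned long align_up(unsigned long addr, unsigned long size)
{
	return align_down(addr + size - 1, size);
}

/* Bytes reserved once the initrd range is widened to whole pages. */
static unsigned long reserved_size(unsigned long start, unsigned long end)
{
	assert(start < end);
	return align_up(end, SKETCH_PAGE_SIZE) -
	       align_down(start, SKETCH_PAGE_SIZE);
}
```

With a hypothetical initrd at 0x01FF1000..0x02345000, the widened reservation runs from 0x01FF0000 to 0x02350000. The 4K below the start and the 44K above the end are the regions an early allocation could otherwise land in and later lose when the initrd pages are freed.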

^ permalink raw reply related

* RE: [PATCH v3] powerpc: Force page alignment for initrd reserved memory
From: Dave Carroll @ 2011-05-23  1:29 UTC (permalink / raw)
  To: 'Milton Miller'; +Cc: LPPC, Paul Mackerras, LKML
In-Reply-To: <initrd-reserve-reply2@mdm.bga.com>

>On Sun, 22 May 2011 about 15:17, Milton Miller wrote:
>>On Sat, 21 May 2011 about 11:05:27 -0600, Dave Carroll wrote:
>> When using 64K pages with a separate cpio rootfs, U-Boot will align
>> the rootfs on a 4K page boundary. When the memory is reserved, and
>> subsequent early memblock_alloc is called, it will allocate memory
>> between the 64K page alignment and reserved memory. When the reserved
>> memory is subsequently freed, it is done so by pages, causing the
>> early memblock_alloc requests to be re-used, which in my case, caused
>> the device-tree to be clobbered.
>>
>> This patch forces the reserved memory for initrd to be kernel page
>> aligned, and adds the same range extension when freeing initrd.
>
>Getting better, but
>
>>
>>
>> Signed-off-by: Dave Carroll <dcarroll@astekcorp.com>
>> ---
>>  arch/powerpc/kernel/prom.c |    4 +++-
>>  arch/powerpc/mm/init_32.c  |    3 +++
>>  arch/powerpc/mm/init_64.c  |    3 +++
>>  3 files changed, 9 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
>> index 48aeb55..397d4a0 100644
>> --- a/arch/powerpc/kernel/prom.c
>> +++ b/arch/powerpc/kernel/prom.c
>> @@ -555,7 +555,9 @@ static void __init early_reserve_mem(void)
>>  #ifdef CONFIG_BLK_DEV_INITRD
>>         /* then reserve the initrd, if any */
>>         if (initrd_start && (initrd_end > initrd_start))
>
>Here you test the unaligned values
>
>>  void free_initrd_mem(unsigned long start, unsigned long end)
>>  {
>> +       start = _ALIGN_DOWN(start, PAGE_SIZE);
>> +       end = _ALIGN_UP(end, PAGE_SIZE);
>> +
>>         if (start < end)
>>                 printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10);
>
>But here you test the aligned values.  And they are aligned with
>opposite bias.  Which means that if start == end (or is less than,
>but within the same page), a page that wasn't reserved (same
>32 and 64 bit) gets freed.
>

Agreed ... I'll have the but shortly ...

>I thought "what happens if we are within a page of end, could we
>free the last page of bss?", but then I checked vmlinux.lds and we
>align end to page size.  I thought other allocations should be safe,
>but then remembered:
>
>The flattened device tree (of which we continue to use the string
>table after boot) could be a problem.
>

I had previously looked at free_initrd_mem, and thought the same conditions
should be used to handle the memory release, but as for the explicit alignment
of the release areas, that seemed to be handled by the fact that all of the
releases are specifically page aligned. The remainder of the free_initrd_mem
routine:

        for (; start < end; start += PAGE_SIZE) {
                ClearPageReserved(virt_to_page(start));
                init_page_count(virt_to_page(start));
                free_page(start);
                totalram_pages++;
        }

implicitly aligns down start to a page boundary, and also would implicitly
align up the end address. While I would be a proponent of something like:

        if (start && (start < end)) do { remainder of free_initrd_mem }

I'm not sure of the goal in explicitly attempting to align the addresses in
the routine as you proposed.

As for the FDT, if the FDT is packed contiguous with initrd, and the alignment
is on 4K page boundaries, it would have been released before this patch. In my
case (U-Boot), they are not near each other.

Thanks,
-Dave
>
>milton
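
The ordering issue raised in the review above (testing the range after aligning it versus before) can be illustrated with a short userspace sketch. It assumes 64K pages and simply counts how many pages the free loop would visit; the function names are hypothetical and nothing here touches real memory.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 0x10000ul  /* assumption: 64K kernel pages */

static unsigned long align_down(unsigned long a)
{
	/* The mask trick only works for power-of-two page sizes. */
	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	return a & ~(SKETCH_PAGE_SIZE - 1);
}

static unsigned long align_up(unsigned long a)
{
	return align_down(a + SKETCH_PAGE_SIZE - 1);
}

/* v3 ordering: align first, then loop over the aligned values. */
static unsigned long pages_freed_align_first(unsigned long start, unsigned long end)
{
	unsigned long n = 0;

	start = align_down(start);
	end = align_up(end);
	for (; start < end; start += SKETCH_PAGE_SIZE)
		n++;
	return n;
}

/* Reviewed ordering: compare the raw values, then align and loop. */
static unsigned long pages_freed_test_first(unsigned long start, unsigned long end)
{
	unsigned long n = 0;

	if (!start || start >= end)
		return 0;
	start = align_down(start);
	end = align_up(end);
	for (; start < end; start += SKETCH_PAGE_SIZE)
		n++;
	return n;
}
```

For an empty range that sits mid-page, such as start == end == 0x12345000, the align-first variant still visits one page, which is the unreserved page the review warns about; the test-first variant visits none.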

^ permalink raw reply

