LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: I think I have 8xx working...
From: Benjamin Herrenschmidt @ 2009-10-15  1:12 UTC (permalink / raw)
  To: Rex Feany; +Cc: Scott Wood, linuxppc-dev@ozlabs.org
In-Reply-To: <20091015010855.GA32710@compile2.chatsunix.int.mrv.com>

On Wed, 2009-10-14 at 18:08 -0700, Rex Feany wrote:
> Thus spake Benjamin Herrenschmidt (benh@kernel.crashing.org):
> 
> > On Wed, 2009-10-14 at 17:41 -0700, Rex Feany wrote:
> > > The biggest problem for me turned out to be the MMU context IDs being
> > > clamped to 32 when the 8xx only has 16. With this, things are a bit more
> > > stable :)
> > 
> > This is with Joakim patches or without ?
> 
> with just one: adding tlbil_va() to do_page_fault().

Ah cool, at least that keeps my sanity, I didn't get how it could work
at all without that bit :-)

Scott, Joakim: It will be interesting to figure out exactly what is the
condition where that is necessary. It definitely should make the one
in set_pte_filter() or ptep_set_access_flags_filter() unnecessary.

Joakim, you mentioned increased amount of TLB errors or misses when
doing that too often, that sounds a bit weird and worth investigating

Finally we could implement preload from update_mmu_cache() but that's a
different matter alltogether.

Cheers,
Ben.

^ permalink raw reply

* Yenta + sata_sil + 2.6.30.8 fails because dma_ops==NULL
From: roger blofeld @ 2009-10-15  3:56 UTC (permalink / raw)
  To: linuxppc-dev

Hi
 I have a powerbook G4 which has drives connected via a cardbus sata_sil. It is running fedora 11. It boots fine with 2.6.27.35-170.2.94.fc10.ppc, but with 2.6.30.8-64.fc11.ppc because of an error in the probe function of sata_sil.

 I have traced the problem to be the line in sata_sil.c function sil_init_one

    rc = pci_set_dma_mask(pdev, ATA_DMA_MASK);

which returns -5 (-EIO) because 

 get_dma_ops(&pdev->dev);

returns NULL.

I looked at the diffs to sata_sil.c between these kernel revs and didn't see the problem. Seems to be something at a higher level changed.

Any ideas where I should look or how to find the real cause?

Thanks
-roger

P.S., on the working kernel lspci gives:
0001:11:00.0 Mass storage controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)

^ permalink raw reply

* [PATCH] powerpc perf events: Fix priority of MSR HV vs PR bits
From: Michael Neuling @ 2009-10-15  5:11 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev, Paul Mackerras

The architecture defines that if MSR PR is set we are in problem state
irrespective of the HV bit.  This fixes perf events to reflect this.

Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: paulus@samba.org
---
Tested on PHYP and BML.

This could go back into 31 too with s/event/counters/g.  It only
effects bare metal on server chips, so no biggy if it doesn't.  

 arch/powerpc/kernel/perf_event.c |   17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

Index: linux-2.6-ozlabs/arch/powerpc/kernel/perf_event.c
===================================================================
--- linux-2.6-ozlabs.orig/arch/powerpc/kernel/perf_event.c
+++ linux-2.6-ozlabs/arch/powerpc/kernel/perf_event.c
@@ -116,20 +116,23 @@ static inline void perf_get_data_addr(st
 static inline u32 perf_get_misc_flags(struct pt_regs *regs)
 {
 	unsigned long mmcra = regs->dsisr;
+	unsigned long sihv = MMCRA_SIHV;
+	unsigned long sipr = MMCRA_SIPR;
 
 	if (TRAP(regs) != 0xf00)
 		return 0;	/* not a PMU interrupt */
 
 	if (ppmu->flags & PPMU_ALT_SIPR) {
-		if (mmcra & POWER6_MMCRA_SIHV)
-			return PERF_RECORD_MISC_HYPERVISOR;
-		return (mmcra & POWER6_MMCRA_SIPR) ?
-			PERF_RECORD_MISC_USER : PERF_RECORD_MISC_KERNEL;
+		sihv = POWER6_MMCRA_SIHV;
+		sipr = POWER6_MMCRA_SIPR;
 	}
-	if (mmcra & MMCRA_SIHV)
+
+	/* PR has priority over HV, so order below is important */
+	if (mmcra & sipr)
+		return PERF_RECORD_MISC_USER;
+	if ((mmcra & sihv) && (freeze_events_kernel != MMCR0_FCHV))
 		return PERF_RECORD_MISC_HYPERVISOR;
-	return (mmcra & MMCRA_SIPR) ? PERF_RECORD_MISC_USER :
-		PERF_RECORD_MISC_KERNEL;
+	return PERF_RECORD_MISC_KERNEL;
 }
 
 /*

^ permalink raw reply

* [PATCH] powerpc perf events: Fix priority of MSR HV vs PR bits
From: Michael Neuling @ 2009-10-15  5:32 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev, Paul Mackerras
In-Reply-To: <15714.1255583517@neuling.org>

The architecture defines that if MSR PR is set we are in problem state
irrespective of the HV bit.  This fixes perf events to reflect this.

Also, on bare metal systems, samples taken in Linux will now be reported
as kernel rather than hypervisor.  

Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: paulus@samba.org
---
Paulus suggested that the comment needed updating to reflect the
additional functionality that I've changed.

 arch/powerpc/kernel/perf_event.c |   17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

Index: linux-2.6-ozlabs/arch/powerpc/kernel/perf_event.c
===================================================================
--- linux-2.6-ozlabs.orig/arch/powerpc/kernel/perf_event.c
+++ linux-2.6-ozlabs/arch/powerpc/kernel/perf_event.c
@@ -116,20 +116,23 @@ static inline void perf_get_data_addr(st
 static inline u32 perf_get_misc_flags(struct pt_regs *regs)
 {
 	unsigned long mmcra = regs->dsisr;
+	unsigned long sihv = MMCRA_SIHV;
+	unsigned long sipr = MMCRA_SIPR;
 
 	if (TRAP(regs) != 0xf00)
 		return 0;	/* not a PMU interrupt */
 
 	if (ppmu->flags & PPMU_ALT_SIPR) {
-		if (mmcra & POWER6_MMCRA_SIHV)
-			return PERF_RECORD_MISC_HYPERVISOR;
-		return (mmcra & POWER6_MMCRA_SIPR) ?
-			PERF_RECORD_MISC_USER : PERF_RECORD_MISC_KERNEL;
+		sihv = POWER6_MMCRA_SIHV;
+		sipr = POWER6_MMCRA_SIPR;
 	}
-	if (mmcra & MMCRA_SIHV)
+
+	/* PR has priority over HV, so order below is important */
+	if (mmcra & sipr)
+		return PERF_RECORD_MISC_USER;
+	if ((mmcra & sihv) && (freeze_events_kernel != MMCR0_FCHV))
 		return PERF_RECORD_MISC_HYPERVISOR;
-	return (mmcra & MMCRA_SIPR) ? PERF_RECORD_MISC_USER :
-		PERF_RECORD_MISC_KERNEL;
+	return PERF_RECORD_MISC_KERNEL;
 }
 
 /*

^ permalink raw reply

* powerpc problem with .data.page_aligned -> __page_aligned_data conversion
From: Benjamin Herrenschmidt @ 2009-10-15  5:32 UTC (permalink / raw)
  To: tabbott
  Cc: linuxppc-dev, Andrew Morton, Linus Torvalds, sam,
	linux-kernel@vger.kernel.org

Hi folks !

So I have a problem ... :-)

For some weird reason, our gcc until 4.3 (fixed in 4.3) had the weird
idea that the alignment attribute should not be allowed to force an
alignment greater than 32k. If attempted, it would warn -and- crop the
alignment to 32k.

As you can imagine, this doesn't play terribly well with 64K page size,
which is (sadly) our default on ppc64 on distros nowadays. I know it's
insane but that's what I have to live with...

And of course, our current distros (well, at least RHEL5 which is pretty
common) don't have a fixed gcc.

Until recently, we got away with it because very few people used
__page_aligned and those who did always did it on something that is
naturally a PAGE_SIZE multiple in size. So we just stick the thing into
the appropriate section and that's it.

The new generic macros __page_aligned_data/bss now always add the
alignment directive generically.

This has a few issues for us:

	- The patch that converted bits of powerpc to the new macro break since
it now hits that bug

	- We can't trivially change the macros to not to the align directive on
powerpc, but that's taking the risk that drivers or other pieces of
generic code now assume that they can use __page_aligned_* on a data
structure whose size isn't a multiple of _PAGE_SIZE, since that would
work on all architectures ... except powerpc where it would silently
break by shifting everybody else alignment in that section.

As of today, nothing seems to do that (unless I mis-grepped) though.

What do you recommend I do ? I can ban gcc < 4.3 but that's a bit
harsh :-) I know a few people that won't be happy to be unable to build
newer kernels with current distro gccs.

Or can do the above making the macro definition drop the alignment part
on powerpc. Will work for now, but will require great care to avoid
subtle and nasty breakage (basically same as before)

Or maybe I can do the above but only when using gcc < 4.3 so at least if
the breakage happen, that will only be with older gccs ...

Unless somebody has a better idea ?

Cheers,
Ben.

^ permalink raw reply

* Re: I think I have 8xx working...
From: Joakim Tjernlund @ 2009-10-15  5:32 UTC (permalink / raw)
  To: Rex Feany; +Cc: Scott Wood, linuxppc-dev@ozlabs.org
In-Reply-To: <20091015010855.GA32710@compile2.chatsunix.int.mrv.com>

Rex Feany <RFeany@mrv.com> wrote on 15/10/2009 03:08:55:
>
> Thus spake Benjamin Herrenschmidt (benh@kernel.crashing.org):
>
> > On Wed, 2009-10-14 at 17:41 -0700, Rex Feany wrote:
> > > The biggest problem for me turned out to be the MMU context IDs being
> > > clamped to 32 when the 8xx only has 16. With this, things are a bit more
> > > stable :)
> >
> > This is with Joakim patches or without ?
>
> with just one: adding tlbil_va() to do_page_fault().

Looking forward too hear about the the other patches too :)

^ permalink raw reply

* Re: I think I have 8xx working...
From: Joakim Tjernlund @ 2009-10-15  5:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev@ozlabs.org, Rex Feany
In-Reply-To: <1255569170.2347.62.camel@pasglop>

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 15/10/2009 03:12:50:

>
> On Wed, 2009-10-14 at 18:08 -0700, Rex Feany wrote:
> > Thus spake Benjamin Herrenschmidt (benh@kernel.crashing.org):
> >
> > > On Wed, 2009-10-14 at 17:41 -0700, Rex Feany wrote:
> > > > The biggest problem for me turned out to be the MMU context IDs being
> > > > clamped to 32 when the 8xx only has 16. With this, things are a bit more
> > > > stable :)
> > >
> > > This is with Joakim patches or without ?
> >
> > with just one: adding tlbil_va() to do_page_fault().
>
> Ah cool, at least that keeps my sanity, I didn't get how it could work
> at all without that bit :-)
>
> Scott, Joakim: It will be interesting to figure out exactly what is the
> condition where that is necessary. It definitely should make the one
> in set_pte_filter() or ptep_set_access_flags_filter() unnecessary.
>
> Joakim, you mentioned increased amount of TLB errors or misses when
> doing that too often, that sounds a bit weird and worth investigating

I didn't say that, did I? More like:
if I don't do tlbil_va at all I get a lot of extra/duplicate TLB errors
for the same address. Adding the patch makes these go away.
I guess one could do tlbil_va unconditionally but I didn't
see any improvements from doing that, but this was all on 2.4

>
> Finally we could implement preload from update_mmu_cache() but that's a
> different matter alltogether.
>
> Cheers,
> Ben.
>
>
>
>

^ permalink raw reply

* Re: linux booting fails on ppc440x5 with SRAM
From: Vineeth _ @ 2009-10-15  6:01 UTC (permalink / raw)
  To: Wolfgang Denk; +Cc: Johnny Hung, linuxppc-dev
In-Reply-To: <a9b543570910120634n286693f3s15b39970ad9d1c74@mail.gmail.com>

Hi,

In arch/powerpc/kernel/head_44x.S file, it says it clears all the TLBs
except the current working one. In our case we have mainly 3 TLBs for
FLASH, SRAM and the UART. We have the TLB of 16MB of SRAM *which is
our total memory*. So the TLBs that we create from our bootloader will
be used for the entire linux operation. I am attaching the TLB
initialization code with this mail, does it make any problem ?

  li  r3,1        # Slot1 SRAM
  LOADREG_32    r4,0x0
  LOADREG_32    r6, ( TLB_I_MASK | TLB_U_RWX | TLB_S_RWX |TLB_G_MASK  )
  LOADREG_32    r5,TLB_SIZE_16M
  mr  r7,r4
  bl  Save_TLB

Vineeth _

On Mon, Oct 12, 2009 at 7:04 PM, Vineeth _ <blacklites@gmail.com> wrote:
> Hi Wolfgang,
>
> The link says about the initialization of the SDRAM; Does it
> applicable in our case, where we have SRAM on our board. Does the
> initialization means just clearing the memory in case of SRAM ? We
> tried clearing the memory before the operation which doesnt work too.
> We are creating a TLB of 16MB for the SRAM which is not cachable        .
>
> -Vineeth
>
> On Fri, Oct 9, 2009 at 5:03 PM, Wolfgang Denk <wd@denx.de> wrote:
>> Dear Vineeth _,
>>
>> In message <a9b543570910090320t1444f8f1qf4c8ab7dbbef679f@mail.gmail.com> you wrote:
>>> We ported the uboot Memory test and tested the 15MB ram and it was
>>> successful.interestingly we have only 16MB SRAM in our board.We used 1
>>
>> Such a memory test means nothing. The only thing you can learn from it
>> is that basic read and write accesses are working. You don;t get any
>> information about the behaviour for burst mode accesses from such a
>> test. See the FAQ at
>> http://www.denx.de/wiki/view/DULG/LinuxCrashesRandomly  and at
>> http://www.denx.de/wiki/DULG/UBootCrashAfterRelocation
>>
>> Best regards,
>>
>> Wolfgang Denk
>>
>> --
>> DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
>> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
>> Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
>> I mean, I . . . think to understand you, I just don't know  what  you
>> are saying ...                        - Terry Pratchett, _Soul Music_
>>
>

^ permalink raw reply

* Re: [v8 PATCH 2/8]: cpuidle: implement a list based approach to register a set of idle routines.
From: Arun R Bharadwaj @ 2009-10-15  6:06 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-arch, Peter Zijlstra, linux-kernel, linux-acpi,
	Arun Bharadwaj, Ingo Molnar, linuxppc-dev, Arjan van de Ven
In-Reply-To: <20091014071838.GE23248@one.firstfloor.org>

* Andi Kleen <andi@firstfloor.org> [2009-10-14 09:18:38]:

> > How about something like this..
> > If the arch does not enable CONFIG_CPU_IDLE, the cpuidle_idle_call
> > which is called from cpu_idle() should call default_idle without
> > involving the registering cpuidle steps. This should prevent bloating
> > up of the kernel for archs which dont want to use cpuidle.
> 
> On x86 some people want small kernel too, so selecting it on a architecture
> granuality is not good. Also you can make it default, you just need
> to slim it down first.
> 

No, I dont mean selecting it on an architecture granularity.

At compile time, if CONFIG_CPU_IDLE is disabled, the arch can redefine
cpuidle_idle_call. For e.g. in arch/x86/kernel/process.c

#ifndef CONFIG_CPU_IDLE
void cpuidle_idle_call(void)
{
	if (local_idle)
		local_idle();
	else
		default_idle();
}
#endif

where local_idle points to the idle routine selected using
select_idle_routine() which can be poll, mwait, c1e.

So this way, we still preserve the exact same functionality as before
and we also remove the ugly pm_idle exported function pointer and we
avoid unnecessary code bloat for platforms who do not want to use
cpuidle.

--arun

^ permalink raw reply

* [PATCH -mmotm -v2] Fix bitmap-introduce-bitmap_set-bitmap_clear-bitmap_find_next_zero_area. patch
From: Akinobu Mita @ 2009-10-15  6:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-usb, linux-ia64, linuxppc-dev, Paul Mackerras,
	H. Peter Anvin, sparclinux, Lothar Wassmann, x86, linux-altix,
	Ingo Molnar, Fenghua Yu, Yevgeny Petrilin, Thomas Gleixner,
	Tony Luck, Kyle Hubert, netdev, Greg Kroah-Hartman, linux-kernel,
	FUJITA Tomonori, David S. Miller
In-Reply-To: <20091014032258.GA18527@localhost.localdomain>

From: Akinobu Mita <akinobu.mita@gmail.com>
Subject: Fix bitmap-introduce-bitmap_set-bitmap_clear-bitmap_find_next_zero_area.patch

- Rewrite bitmap_set and bitmap_clear

  Instead of setting or clearing for each bit.

- Fix off-by-one errors in bitmap_find_next_zero_area

  This bug was derived from find_next_zero_area in iommu-helper.

- Add kerneldoc for bitmap_find_next_zero_area

This patch is supposed to be folded into
bitmap-introduce-bitmap_set-bitmap_clear-bitmap_find_next_zero_area.patch

* -v2
- Remove an extra test at the "index" value by find_next_bit

Index was just returned by find_next_zero_bit, so we know it's zero.

Noticed-by: Kyle Hubert <khubert@gmail.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
---
 lib/bitmap.c |   62 ++++++++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 2415da4..962b863 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -271,28 +271,62 @@ int __bitmap_weight(const unsigned long *bitmap, int bits)
 }
 EXPORT_SYMBOL(__bitmap_weight);
 
-void bitmap_set(unsigned long *map, int i, int len)
-{
-	int end = i + len;
+#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))
 
-	while (i < end) {
-		__set_bit(i, map);
-		i++;
+void bitmap_set(unsigned long *map, int start, int nr)
+{
+	unsigned long *p = map + BIT_WORD(start);
+	const int size = start + nr;
+	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+	while (nr - bits_to_set >= 0) {
+		*p |= mask_to_set;
+		nr -= bits_to_set;
+		bits_to_set = BITS_PER_LONG;
+		mask_to_set = ~0UL;
+		p++;
+	}
+	if (nr) {
+		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+		*p |= mask_to_set;
 	}
 }
 EXPORT_SYMBOL(bitmap_set);
 
 void bitmap_clear(unsigned long *map, int start, int nr)
 {
-	int end = start + nr;
-
-	while (start < end) {
-		__clear_bit(start, map);
-		start++;
+	unsigned long *p = map + BIT_WORD(start);
+	const int size = start + nr;
+	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+
+	while (nr - bits_to_clear >= 0) {
+		*p &= ~mask_to_clear;
+		nr -= bits_to_clear;
+		bits_to_clear = BITS_PER_LONG;
+		mask_to_clear = ~0UL;
+		p++;
+	}
+	if (nr) {
+		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+		*p &= ~mask_to_clear;
 	}
 }
 EXPORT_SYMBOL(bitmap_clear);
 
+/*
+ * bitmap_find_next_zero_area - find a contiguous aligned zero area
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @align_mask: Alignment mask for zero area
+ *
+ * The @align_mask should be one less than a power of 2; the effect is that
+ * the bit offset of all zero areas this function finds is multiples of that
+ * power of 2. A @align_mask of 0 means no alignment is required.
+ */
 unsigned long bitmap_find_next_zero_area(unsigned long *map,
 					 unsigned long size,
 					 unsigned long start,
@@ -304,12 +338,12 @@ again:
 	index = find_next_zero_bit(map, size, start);
 
 	/* Align allocation */
-	index = (index + align_mask) & ~align_mask;
+	index = __ALIGN_MASK(index, align_mask);
 
 	end = index + nr;
-	if (end >= size)
+	if (end > size)
 		return end;
-	i = find_next_bit(map, end, index);
+	i = find_next_bit(map, end, index + 1);
 	if (i < end) {
 		start = i + 1;
 		goto again;
-- 
1.5.4.3

^ permalink raw reply related

* Re: I think I have 8xx working...
From: Benjamin Herrenschmidt @ 2009-10-15  6:21 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev@ozlabs.org, Rex Feany
In-Reply-To: <OFA38A7495.FD13F032-ONC1257650.001E7457-C1257650.001F51B8@transmode.se>

On Thu, 2009-10-15 at 07:42 +0200, Joakim Tjernlund wrote:
> I didn't say that, did I? More like:
> if I don't do tlbil_va at all I get a lot of extra/duplicate TLB
> errors
> for the same address. Adding the patch makes these go away.
> I guess one could do tlbil_va unconditionally but I didn't
> see any improvements from doing that, but this was all on 2.4

Ok so I misparsed you. I was advocating going unconditional here
and though you said it would increase the miss rate.

Cheers,
Ben

^ permalink raw reply

* [git pull] Please pull powerpc.git merge branch
From: Benjamin Herrenschmidt @ 2009-10-15  6:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linuxppc-dev list, Andrew Morton, Linux Kernel list

Hi Linus !

A tad late due mostly to me being on vacation :-) Here's some powerpc
fixes for 2.6.32, not that much and nothing really big.

The following changes since commit 80f506918fdaaca6b574ba931536a58ce015c7be:
  Linus Torvalds (1):
        Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge

Anton Blanchard (1):
      powerpc: Fix hypervisor TLB batching

Anton Vorontsov (1):
      powerpc/kgdb: Fix build failure caused by "kgdb.c: unused variable 'acc'"

Benjamin Herrenschmidt (3):
      powerpc/pmac: Fix issues with sleep on some powerbooks
      powerpc/mm: Fix hang accessing top of vmalloc space
      Merge commit 'ftrace/ppc' into merge

Dragos Tatulea (1):
      powerpc/oprofile: Add ppc750 CL as supported by oprofile

Heiko Schocher (1):
      powerpc/pci: Fix MODPOST warning

Michael Ellerman (1):
      powerpc: Fix memory leak in axon_msi.c

Sean MacLennan (1):
      powerpc: warning: allocated section `.data_nosave' not in segment

Steven Rostedt (2):
      powerpc/ftrace: show real return addresses in modules
      powerpc64/ftrace: use PACA to retrieve TOC in mod_return_to_handler

 arch/powerpc/include/asm/firmware.h       |   10 +++---
 arch/powerpc/kernel/cputable.c            |    2 +
 arch/powerpc/kernel/entry_64.S            |    3 +-
 arch/powerpc/kernel/kgdb.c                |    6 ----
 arch/powerpc/kernel/pci-common.c          |    2 +-
 arch/powerpc/kernel/process.c             |   10 +++++--
 arch/powerpc/kernel/vmlinux.lds.S         |    1 +
 arch/powerpc/mm/slb_low.S                 |   10 +++----
 arch/powerpc/platforms/cell/axon_msi.c    |    2 +-
 arch/powerpc/platforms/powermac/low_i2c.c |    7 +++-
 arch/powerpc/platforms/pseries/firmware.c |    3 +-
 drivers/macintosh/via-pmu.c               |   40 ++++++++++++++++------------
 12 files changed, 51 insertions(+), 45 deletions(-)

^ permalink raw reply

* [PATCH v0 1/2] DMA: fsldma: Disable DMA_INTERRUPT when Async_tx enabled
From: Vishnu Suresh @ 2009-10-15  6:41 UTC (permalink / raw)
  To: linuxppc-dev, linux-crypto, linux-kernel, linux-raid, herbert,
	dan.j.williams
  Cc: Vishnu Suresh

This patch disables the use of DMA_INTERRUPT capability with Async_tx

The fsldma produces a null transfer with DMA_INTERRUPT
capability when used with Async_tx. When RAID devices queue
a transaction via Async_tx, this  results in a hang.

Signed-off-by: Vishnu Suresh <Vishnu@freescale.com>
---
 drivers/dma/fsldma.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 296f9e7..66d9b39 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -1200,7 +1200,13 @@ static int __devinit of_fsl_dma_probe(struct of_device *dev,
 						- fdev->reg.start + 1);
 
 	dma_cap_set(DMA_MEMCPY, fdev->common.cap_mask);
+#ifndef CONFIG_ASYNC_CORE
+	/*
+	 * The DMA_INTERRUPT async_tx is a NULL transfer, which will
+	 * triger a PE interrupt.
+	 */
 	dma_cap_set(DMA_INTERRUPT, fdev->common.cap_mask);
+#endif
 	dma_cap_set(DMA_SLAVE, fdev->common.cap_mask);
 	fdev->common.device_alloc_chan_resources = fsl_dma_alloc_chan_resources;
 	fdev->common.device_free_chan_resources = fsl_dma_free_chan_resources;
-- 
1.6.4.2

^ permalink raw reply related

* [PATCH v0 2/2] Crypto: Talitos: Support for Async_tx XOR offload
From: Vishnu Suresh @ 2009-10-15  6:41 UTC (permalink / raw)
  To: linuxppc-dev, linux-crypto, linux-kernel, linux-raid, herbert,
	dan.j.williams
  Cc: Dipen Dudhat, Maneesh Gupta, Vishnu Suresh
In-Reply-To: <1255588866-4133-1-git-send-email-Vishnu@freescale.com>

Expose Talitos's XOR functionality to be used for
RAID Parity calculation via the Async_tx layer.

Thanks to Surender Kumar and Lee Nipper for their help in
realising this driver

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Dipen Dudhat <Dipen.Dudhat@freescale.com>
Signed-off-by: Maneesh Gupta <Maneesh.Gupta@freescale.com>
Signed-off-by: Vishnu Suresh <Vishnu@freescale.com>
---
 drivers/crypto/Kconfig   |    2 +
 drivers/crypto/talitos.c |  423 +++++++++++++++++++++++++++++++++++++++++++++-
 drivers/crypto/talitos.h |    2 +
 3 files changed, 426 insertions(+), 1 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index b08403d..343e578 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -192,6 +192,8 @@ config CRYPTO_DEV_TALITOS
 	select CRYPTO_ALGAPI
 	select CRYPTO_AUTHENC
 	select HW_RANDOM
+	select DMA_ENGINE
+	select ASYNC_XOR
 	depends on FSL_SOC
 	help
 	  Say 'Y' here to use the Freescale Security Engine (SEC)
diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index c47ffe8..84819d4 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -1,7 +1,7 @@
 /*
  * talitos - Freescale Integrated Security Engine (SEC) device driver
  *
- * Copyright (c) 2008 Freescale Semiconductor, Inc.
+ * Copyright (c) 2008-2009 Freescale Semiconductor, Inc.
  *
  * Scatterlist Crypto API glue code copied from files with the following:
  * Copyright (c) 2006-2007 Herbert Xu <herbert@gondor.apana.org.au>
@@ -37,6 +37,8 @@
 #include <linux/io.h>
 #include <linux/spinlock.h>
 #include <linux/rtnetlink.h>
+#include <linux/dmaengine.h>
+#include <linux/raid/xor.h>
 
 #include <crypto/algapi.h>
 #include <crypto/aes.h>
@@ -140,6 +142,9 @@ struct talitos_private {
 
 	/* hwrng device */
 	struct hwrng rng;
+
+	/* XOR Device */
+	struct dma_device dma_dev_common;
 };
 
 /* .features flag */
@@ -685,6 +690,401 @@ static void talitos_unregister_rng(struct device *dev)
 }
 
 /*
+ * async_tx interface for XOR-capable SECs
+ *
+ * Dipen Dudhat <Dipen.Dudhat@freescale.com>
+ * Maneesh Gupta <Maneesh.Gupta@freescale.com>
+ * Vishnu Suresh <Vishnu@freescale.com>
+ */
+
+/**
+ * talitos_xor_chan - context management for the async_tx channel
+ * @completed_cookie: the last completed cookie
+ * @desc_lock: lock for tx queue
+ * @total_desc: number of descriptors allocated
+ * @submit_q: queue of submitted descriptors
+ * @pending_q: queue of pending descriptors
+ * @in_progress_q: queue of descriptors in progress
+ * @free_desc: queue of unused descriptors
+ * @dev: talitos device implementing this channel
+ * @common: the corresponding xor channel in async_tx
+ */
+struct talitos_xor_chan {
+	dma_cookie_t completed_cookie;
+	spinlock_t desc_lock;
+	unsigned int total_desc;
+	struct list_head submit_q;
+	struct list_head pending_q;
+	struct list_head in_progress_q;
+	struct list_head free_desc;
+	struct device *dev;
+	struct dma_chan common;
+};
+
+/**
+ * talitos_xor_desc - software xor descriptor
+ * @async_tx: the referring async_tx descriptor
+ * @node:
+ * @hwdesc: h/w descriptor
+ */
+struct talitos_xor_desc {
+	struct dma_async_tx_descriptor async_tx;
+	struct list_head tx_list;
+	struct list_head node;
+	struct talitos_desc hwdesc;
+};
+
+static void talitos_release_xor(struct device *dev, struct talitos_desc *hwdesc,
+				void *context, int error);
+
+static enum dma_status talitos_is_tx_complete(struct dma_chan *chan,
+					      dma_cookie_t cookie,
+					      dma_cookie_t *done,
+					      dma_cookie_t *used)
+{
+	struct talitos_xor_chan *xor_chan;
+	dma_cookie_t last_used;
+	dma_cookie_t last_complete;
+
+	xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+	last_used = chan->cookie;
+	last_complete = xor_chan->completed_cookie;
+
+	if (done)
+		*done = last_complete;
+
+	if (used)
+		*used = last_used;
+
+	return dma_async_is_complete(cookie, last_complete, last_used);
+}
+
+static void talitos_process_pending(struct talitos_xor_chan *xor_chan)
+{
+	struct talitos_xor_desc *desc, *_desc;
+	unsigned long flags;
+	int status;
+
+	spin_lock_irqsave(&xor_chan->desc_lock, flags);
+
+	list_for_each_entry_safe(desc, _desc, &xor_chan->pending_q, node) {
+		status = talitos_submit(xor_chan->dev, &desc->hwdesc,
+					talitos_release_xor, desc);
+		if (status != -EINPROGRESS)
+			break;
+
+		list_del(&desc->node);
+		list_add_tail(&desc->node, &xor_chan->in_progress_q);
+	}
+
+	spin_unlock_irqrestore(&xor_chan->desc_lock, flags);
+}
+
+static void talitos_release_xor(struct device *dev, struct talitos_desc *hwdesc,
+				void *context, int error)
+{
+	struct talitos_xor_desc *desc = context;
+	struct talitos_xor_chan *xor_chan;
+	dma_async_tx_callback callback;
+	void *callback_param;
+
+	if (unlikely(error)) {
+		dev_err(dev, "xor operation: talitos error %d\n", error);
+		BUG();
+	}
+
+	xor_chan = container_of(desc->async_tx.chan, struct talitos_xor_chan,
+				common);
+	spin_lock_bh(&xor_chan->desc_lock);
+	if (xor_chan->completed_cookie < desc->async_tx.cookie)
+		xor_chan->completed_cookie = desc->async_tx.cookie;
+
+	callback = desc->async_tx.callback;
+	callback_param = desc->async_tx.callback_param;
+
+	if (callback) {
+		spin_unlock_bh(&xor_chan->desc_lock);
+		callback(callback_param);
+		spin_lock_bh(&xor_chan->desc_lock);
+	}
+
+	list_del(&desc->node);
+	list_add_tail(&desc->node, &xor_chan->free_desc);
+	spin_unlock_bh(&xor_chan->desc_lock);
+	if (!list_empty(&xor_chan->pending_q))
+		talitos_process_pending(xor_chan);
+}
+
+/**
+ * talitos_issue_pending - move the descriptors in submit
+ * queue to pending queue and submit them for processing
+ * @chan: DMA channel
+ */
+static void talitos_issue_pending(struct dma_chan *chan)
+{
+	struct talitos_xor_chan *xor_chan;
+
+	xor_chan = container_of(chan, struct talitos_xor_chan, common);
+	spin_lock_bh(&xor_chan->desc_lock);
+	list_splice_tail_init(&xor_chan->submit_q,
+				 &xor_chan->pending_q);
+	spin_unlock_bh(&xor_chan->desc_lock);
+	talitos_process_pending(xor_chan);
+}
+
+static dma_cookie_t talitos_async_tx_submit(struct dma_async_tx_descriptor *tx)
+{
+	struct talitos_xor_desc *desc;
+	struct talitos_xor_chan *xor_chan;
+	dma_cookie_t cookie;
+
+	desc = container_of(tx, struct talitos_xor_desc, async_tx);
+	xor_chan = container_of(tx->chan, struct talitos_xor_chan, common);
+
+	spin_lock_bh(&xor_chan->desc_lock);
+
+	cookie = xor_chan->common.cookie + 1;
+	if (cookie < 0)
+		cookie = 1;
+
+	desc->async_tx.cookie = cookie;
+	xor_chan->common.cookie = desc->async_tx.cookie;
+
+	list_splice_tail_init(&desc->tx_list,
+				 &xor_chan->submit_q);
+
+	spin_unlock_bh(&xor_chan->desc_lock);
+
+	return cookie;
+}
+
+static struct talitos_xor_desc *talitos_xor_alloc_descriptor(
+				struct talitos_xor_chan *xor_chan, gfp_t flags)
+{
+	struct talitos_xor_desc *desc;
+
+	desc = kmalloc(sizeof(*desc), flags);
+	if (desc) {
+		xor_chan->total_desc++;
+		desc->async_tx.tx_submit = talitos_async_tx_submit;
+	}
+
+	return desc;
+}
+
+static void talitos_free_chan_resources(struct dma_chan *chan)
+{
+	struct talitos_xor_chan *xor_chan;
+	struct talitos_xor_desc *desc, *_desc;
+
+	xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+	spin_lock_bh(&xor_chan->desc_lock);
+
+	list_for_each_entry_safe(desc, _desc, &xor_chan->submit_q, node) {
+		list_del(&desc->node);
+		xor_chan->total_desc--;
+		kfree(desc);
+	}
+	list_for_each_entry_safe(desc, _desc, &xor_chan->pending_q, node) {
+		list_del(&desc->node);
+		xor_chan->total_desc--;
+		kfree(desc);
+	}
+	list_for_each_entry_safe(desc, _desc, &xor_chan->in_progress_q, node) {
+		list_del(&desc->node);
+		xor_chan->total_desc--;
+		kfree(desc);
+	}
+	list_for_each_entry_safe(desc, _desc, &xor_chan->free_desc, node) {
+		list_del(&desc->node);
+		xor_chan->total_desc--;
+		kfree(desc);
+	}
+	BUG_ON(unlikely(xor_chan->total_desc));	/* Some descriptor not freed? */
+
+	spin_unlock_bh(&xor_chan->desc_lock);
+}
+
+static int talitos_alloc_chan_resources(struct dma_chan *chan)
+{
+	struct talitos_xor_chan *xor_chan;
+	struct talitos_xor_desc *desc;
+	LIST_HEAD(tmp_list);
+	int i;
+
+	xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+	if (!list_empty(&xor_chan->free_desc))
+		return xor_chan->total_desc;
+
+	/* 256 initial descriptors */
+	for (i = 0; i < 256; i++) {
+		desc = talitos_xor_alloc_descriptor(xor_chan, GFP_KERNEL);
+		if (!desc) {
+			dev_err(xor_chan->common.device->dev,
+				"Only %d initial descriptors\n", i);
+			break;
+		}
+		list_add_tail(&desc->node, &tmp_list);
+	}
+
+	if (!i)
+		return -ENOMEM;
+
+	/* At least one desc is allocated */
+	spin_lock_bh(&xor_chan->desc_lock);
+	list_splice_init(&tmp_list, &xor_chan->free_desc);
+	spin_unlock_bh(&xor_chan->desc_lock);
+
+	return xor_chan->total_desc;
+}
+
+static struct dma_async_tx_descriptor * talitos_prep_dma_xor(
+			struct dma_chan *chan, dma_addr_t dest, dma_addr_t *src,
+			unsigned int src_cnt, size_t len, unsigned long flags)
+{
+	struct talitos_xor_chan *xor_chan;
+	struct talitos_xor_desc *new;
+	struct talitos_desc *desc;
+	int i, j;
+
+	BUG_ON(unlikely(len > TALITOS_MAX_DATA_LEN));
+
+	xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+	spin_lock_bh(&xor_chan->desc_lock);
+	if (!list_empty(&xor_chan->free_desc)) {
+		new = container_of(xor_chan->free_desc.next,
+				   struct talitos_xor_desc, node);
+		list_del(&new->node);
+	} else {
+		new = talitos_xor_alloc_descriptor(xor_chan, GFP_KERNEL);
+	}
+	spin_unlock_bh(&xor_chan->desc_lock);
+
+	if (!new) {
+		dev_err(xor_chan->common.device->dev,
+			"No free memory for XOR DMA descriptor\n");
+		return NULL;
+	}
+	dma_async_tx_descriptor_init(&new->async_tx, &xor_chan->common);
+
+	INIT_LIST_HEAD(&new->node);
+	INIT_LIST_HEAD(&new->tx_list);
+
+	desc = &new->hwdesc;
+	/* Set destination: Last pointer pair */
+	to_talitos_ptr(&desc->ptr[6], dest);
+	desc->ptr[6].len = cpu_to_be16(len);
+	desc->ptr[6].j_extent = 0;
+
+	/* Set Sources: End loading from second-last pointer pair */
+	for (i = 5, j = 0; (j < src_cnt) && (i > 0); i--, j++) {
+		to_talitos_ptr(&desc->ptr[i], src[j]);
+		desc->ptr[i].len = cpu_to_be16(len);
+		desc->ptr[i].j_extent = 0;
+	}
+
+	/*
+	 * documentation states first 0 ptr/len combo marks end of sources
+	 * yet device produces scatter boundary error unless all subsequent
+	 * sources are zeroed out
+	 */
+	for (; i >= 0; i--) {
+		to_talitos_ptr(&desc->ptr[i], 0);
+		desc->ptr[i].len = 0;
+		desc->ptr[i].j_extent = 0;
+	}
+
+	desc->hdr = DESC_HDR_SEL0_AESU | DESC_HDR_MODE0_AESU_XOR
+		    | DESC_HDR_TYPE_RAID_XOR;
+
+	new->async_tx.parent = NULL;
+	new->async_tx.next = NULL;
+	new->async_tx.cookie = 0;
+	async_tx_ack(&new->async_tx);
+
+	list_add_tail(&new->node, &new->tx_list);
+
+	new->async_tx.flags = flags;
+	new->async_tx.cookie = -EBUSY;
+
+	return &new->async_tx;
+}
+
+static void talitos_unregister_async_xor(struct device *dev)
+{
+	struct talitos_private *priv = dev_get_drvdata(dev);
+	struct talitos_xor_chan *xor_chan;
+	struct dma_chan *chan;
+
+	if (priv->dma_dev_common.chancnt)
+		dma_async_device_unregister(&priv->dma_dev_common);
+
+	list_for_each_entry(chan, &priv->dma_dev_common.channels, device_node) {
+		xor_chan = container_of(chan, struct talitos_xor_chan, common);
+		list_del(&chan->device_node);
+		priv->dma_dev_common.chancnt--;
+		kfree(xor_chan);
+	}
+}
+
+/**
+ * talitos_register_dma_async - Initialize the Freescale XOR ADMA device
+ * It is registered as a DMA device with the capability to perform
+ * XOR operation with the Async_tx layer.
+ * The various queues and channel resources are also allocated.
+ */
+static int talitos_register_async_tx(struct device *dev, int max_xor_srcs)
+{
+	struct talitos_private *priv = dev_get_drvdata(dev);
+	struct dma_device *dma_dev = &priv->dma_dev_common;
+	struct talitos_xor_chan *xor_chan;
+	int err;
+
+	xor_chan = kzalloc(sizeof(struct talitos_xor_chan), GFP_KERNEL);
+	if (!xor_chan) {
+		dev_err(dev, "unable to allocate xor channel\n");
+		return -ENOMEM;
+	}
+
+	dma_dev->dev = dev;
+	dma_dev->device_alloc_chan_resources = talitos_alloc_chan_resources;
+	dma_dev->device_free_chan_resources = talitos_free_chan_resources;
+	dma_dev->device_prep_dma_xor = talitos_prep_dma_xor;
+	dma_dev->max_xor = max_xor_srcs;
+	dma_dev->device_is_tx_complete = talitos_is_tx_complete;
+	dma_dev->device_issue_pending = talitos_issue_pending;
+	INIT_LIST_HEAD(&dma_dev->channels);
+	dma_cap_set(DMA_XOR, dma_dev->cap_mask);
+
+	xor_chan->dev = dev;
+	xor_chan->common.device = dma_dev;
+	xor_chan->total_desc = 0;
+	INIT_LIST_HEAD(&xor_chan->submit_q);
+	INIT_LIST_HEAD(&xor_chan->pending_q);
+	INIT_LIST_HEAD(&xor_chan->in_progress_q);
+	INIT_LIST_HEAD(&xor_chan->free_desc);
+	spin_lock_init(&xor_chan->desc_lock);
+
+	list_add_tail(&xor_chan->common.device_node, &dma_dev->channels);
+	dma_dev->chancnt++;
+
+	err = dma_async_device_register(dma_dev);
+	if (err) {
+		dev_err(dev, "Unable to register XOR with Async_tx\n");
+		goto err_out;
+	}
+
+	return err;
+
+err_out:
+	talitos_unregister_async_xor(dev);
+	return err;
+}
+/*
  * crypto alg
  */
 #define TALITOS_CRA_PRIORITY		3000
@@ -1768,6 +2168,8 @@ static int talitos_remove(struct of_device *ofdev)
 	tasklet_kill(&priv->done_task);
 
 	iounmap(priv->reg);
+	if (priv->dma_dev_common.chancnt)
+		talitos_unregister_async_xor(dev);
 
 	dev_set_drvdata(dev, NULL);
 
@@ -1926,6 +2328,25 @@ static int talitos_probe(struct of_device *ofdev,
 			dev_info(dev, "hwrng\n");
 	}
 
+	/*
+	 * register with async_tx xor, if capable
+	 * SEC 2.x support up to 3 RAID sources,
+	 * SEC 3.x support up to 6
+	 */
+	if (hw_supports(dev, DESC_HDR_SEL0_AESU | DESC_HDR_TYPE_RAID_XOR)) {
+		int max_xor_srcs = 3;
+		if (of_device_is_compatible(np, "fsl,sec3.0"))
+			max_xor_srcs = 6;
+
+		err = talitos_register_async_tx(dev, max_xor_srcs);
+		if (err) {
+			dev_err(dev, "failed to register async_tx xor: %d\n",
+				err);
+			goto err_out;
+		}
+		dev_info(dev, "max_xor_srcs %d\n", max_xor_srcs);
+	}
+
 	/* register crypto algorithms the device supports */
 	for (i = 0; i < ARRAY_SIZE(driver_algs); i++) {
 		if (hw_supports(dev, driver_algs[i].desc_hdr_template)) {
diff --git a/drivers/crypto/talitos.h b/drivers/crypto/talitos.h
index ff5a145..b6197bc 100644
--- a/drivers/crypto/talitos.h
+++ b/drivers/crypto/talitos.h
@@ -155,6 +155,7 @@
 /* primary execution unit mode (MODE0) and derivatives */
 #define	DESC_HDR_MODE0_ENCRYPT		cpu_to_be32(0x00100000)
 #define	DESC_HDR_MODE0_AESU_CBC		cpu_to_be32(0x00200000)
+#define	DESC_HDR_MODE0_AESU_XOR		cpu_to_be32(0x0c600000)
 #define	DESC_HDR_MODE0_DEU_CBC		cpu_to_be32(0x00400000)
 #define	DESC_HDR_MODE0_DEU_3DES		cpu_to_be32(0x00200000)
 #define	DESC_HDR_MODE0_MDEU_INIT	cpu_to_be32(0x01000000)
@@ -202,6 +203,7 @@
 #define DESC_HDR_TYPE_IPSEC_ESP			cpu_to_be32(1 << 3)
 #define DESC_HDR_TYPE_COMMON_NONSNOOP_NO_AFEU	cpu_to_be32(2 << 3)
 #define DESC_HDR_TYPE_HMAC_SNOOP_NO_AFEU	cpu_to_be32(4 << 3)
+#define DESC_HDR_TYPE_RAID_XOR			cpu_to_be32(21 << 3)
 
 /* link table extent field bits */
 #define DESC_PTR_LNKTBL_JUMP			0x80
-- 
1.6.4.2

^ permalink raw reply related

* Re: linux-next: tree build failure
From: Jan Beulich @ 2009-10-15  7:27 UTC (permalink / raw)
  To: Hollis Blanchard
  Cc: sfr, Rusty Russell, linux-kernel, kvm-ppc, linux-next, akpm,
	linuxppc-dev
In-Reply-To: <1255561026.29192.0.camel@slab.beaverton.ibm.com>

>>> Hollis Blanchard <hollisb@us.ibm.com> 15.10.09 00:57 >>>
>On Fri, 2009-10-09 at 12:14 -0700, Hollis Blanchard wrote:
>> Rusty's version of BUILD_BUG_ON() does indeed fix the build break, and
>> also exposes the bug in kvmppc_account_exit_stat(). So to recap:
>>=20
>> original: built but didn't work
>> Jan's: doesn't build
>> Rusty's: builds and works
>>=20
>> Where do you want to go from here?
>
>Jan, what are your thoughts? Your BUILD_BUG_ON patch has broken the
>build, and we still need to fix it.

My perspective is that it just uncovered already existing brokenness. And
honestly, I won't be able to get to look into this within the next days. =
(And
btw., when I run into issues with other people's code changes, quite
frequently I'm told to propose a patch, so I'm also having some
philosophical problem understanding why I can't simply expect the same
when people run into issues with changes I made, especially in cases like
this where it wasn't me introducing the broken code.) So, if this can wait
for a couple of days, I can try to find time to look into this. Otherwise, =
I'd
rely on someone running into the actual issue to implement a solution.

Jan

^ permalink raw reply

* [patch 3/7] powerpc: Use unlocked ioctl in nvram_64
From: Thomas Gleixner @ 2009-10-15  8:42 UTC (permalink / raw)
  To: LKML
  Cc: Arnd Bergmann, Frederic Weisbecker, linuxppc-dev, Ingo Molnar,
	ALan Cox
In-Reply-To: <20091015083906.716130653@linutronix.de>

The ioctl is only used for powermac systems and reads a partition
number from an array which is initialized at boot time way before the
nvram code is initialized. So it's safe to switch to unlocked_ioctl.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
---
 arch/powerpc/kernel/nvram_64.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Index: linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
===================================================================
--- linux-2.6-tip.orig/arch/powerpc/kernel/nvram_64.c
+++ linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
@@ -139,8 +139,8 @@ out:
 
 }
 
-static int dev_nvram_ioctl(struct inode *inode, struct file *file,
-	unsigned int cmd, unsigned long arg)
+static long dev_nvram_ioctl(struct file *file, unsigned int cmd,
+			    unsigned long arg)
 {
 	switch(cmd) {
 #ifdef CONFIG_PPC_PMAC
@@ -169,11 +169,11 @@ static int dev_nvram_ioctl(struct inode 
 }
 
 const struct file_operations nvram_fops = {
-	.owner =	THIS_MODULE,
-	.llseek =	dev_nvram_llseek,
-	.read =		dev_nvram_read,
-	.write =	dev_nvram_write,
-	.ioctl =	dev_nvram_ioctl,
+	.owner		= THIS_MODULE,
+	.llseek		= dev_nvram_llseek,
+	.read		= dev_nvram_read,
+	.write		= dev_nvram_write,
+	.unlocked_ioctl	= dev_nvram_ioctl,
 };
 
 static struct miscdevice nvram_dev = {

^ permalink raw reply

* [patch 0/3] powerpc: Fixes for nvram_64
From: Thomas Gleixner @ 2009-10-15  8:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev

Ben,

while looking at the ioctl in nvram_64.c I noticed a couple of issues
which are addressed by the folling patch series.

Thanks,

	tglx

^ permalink raw reply

* [patch 1/3] powerpc: Remove unused code
From: Thomas Gleixner @ 2009-10-15  8:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <20091015085253.295880694@linutronix.de>

nvram_find_partition() has no user. The call site was removed in the
arch/powerpc move, but the function stayed. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
---
 arch/powerpc/include/asm/nvram.h |    1 -
 arch/powerpc/kernel/nvram_64.c   |   25 -------------------------
 2 files changed, 26 deletions(-)

Index: linux-2.6-tip/arch/powerpc/include/asm/nvram.h
===================================================================
--- linux-2.6-tip.orig/arch/powerpc/include/asm/nvram.h
+++ linux-2.6-tip/arch/powerpc/include/asm/nvram.h
@@ -73,7 +73,6 @@ extern int nvram_write_error_log(char * 
 extern int nvram_read_error_log(char * buff, int length,
 					 unsigned int * err_type, unsigned int *err_seq);
 extern int nvram_clear_error_log(void);
-extern struct nvram_partition *nvram_find_partition(int sig, const char *name);
 
 extern int pSeries_nvram_init(void);
 
Index: linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
===================================================================
--- linux-2.6-tip.orig/arch/powerpc/kernel/nvram_64.c
+++ linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
@@ -228,31 +228,6 @@ static unsigned char nvram_checksum(stru
 	return c_sum;
 }
 
-
-/*
- * Find an nvram partition, sig can be 0 for any
- * partition or name can be NULL for any name, else
- * tries to match both
- */
-struct nvram_partition *nvram_find_partition(int sig, const char *name)
-{
-	struct nvram_partition * part;
-	struct list_head * p;
-
-	list_for_each(p, &nvram_part->partition) {
-		part = list_entry(p, struct nvram_partition, partition);
-
-		if (sig && part->header.signature != sig)
-			continue;
-		if (name && 0 != strncmp(name, part->header.name, 12))
-			continue;
-		return part; 
-	}
-	return NULL;
-}
-EXPORT_SYMBOL(nvram_find_partition);
-
-
 static int nvram_remove_os_partition(void)
 {
 	struct list_head *i;

^ permalink raw reply

* [patch 2/3] powerpc: Check nvram_error_log_index in nvram_clear_error_log()
From: Thomas Gleixner @ 2009-10-15  8:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <20091015085253.295880694@linutronix.de>

nvram_clear_error_log() calls ppc_md.nvram_write() even when
nvram_error_log_index is -1 (invalid). The nvram_write() function does
not check for a negative offset. 

Check nvram_error_log_index as the other nvram log functions do.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org

---
 arch/powerpc/kernel/nvram_64.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
===================================================================
--- linux-2.6-tip.orig/arch/powerpc/kernel/nvram_64.c
+++ linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
@@ -681,6 +681,9 @@ int nvram_clear_error_log(void)
 	int clear_word = ERR_FLAG_ALREADY_LOGGED;
 	int rc;
 
+	if (nvram_error_log_index == -1)
+		return -1;
+
 	tmp_index = nvram_error_log_index;
 	
 	rc = ppc_md.nvram_write((char *)&clear_word, sizeof(int), &tmp_index);

^ permalink raw reply

* [patch 3/3] powerpc: Mark init code __init in nvram_64
From: Thomas Gleixner @ 2009-10-15  8:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <20091015085253.295880694@linutronix.de>

Mark all functions which are only called from nvram_init() __init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
---
 arch/powerpc/kernel/nvram_64.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Index: linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
===================================================================
--- linux-2.6-tip.orig/arch/powerpc/kernel/nvram_64.c
+++ linux-2.6-tip/arch/powerpc/kernel/nvram_64.c
@@ -184,7 +184,7 @@ static struct miscdevice nvram_dev = {
 
 
 #ifdef DEBUG_NVRAM
-static void nvram_print_partitions(char * label)
+static void __init nvram_print_partitions(char * label)
 {
 	struct list_head * p;
 	struct nvram_partition * tmp_part;
@@ -202,7 +202,7 @@ static void nvram_print_partitions(char 
 #endif
 
 
-static int nvram_write_header(struct nvram_partition * part)
+static int __init nvram_write_header(struct nvram_partition * part)
 {
 	loff_t tmp_index;
 	int rc;
@@ -214,7 +214,7 @@ static int nvram_write_header(struct nvr
 }
 
 
-static unsigned char nvram_checksum(struct nvram_header *p)
+static unsigned char __init nvram_checksum(struct nvram_header *p)
 {
 	unsigned int c_sum, c_sum2;
 	unsigned short *sp = (unsigned short *)p->name; /* assume 6 shorts */
@@ -228,7 +228,7 @@ static unsigned char nvram_checksum(stru
 	return c_sum;
 }
 
-static int nvram_remove_os_partition(void)
+static int __init nvram_remove_os_partition(void)
 {
 	struct list_head *i;
 	struct list_head *j;
@@ -294,7 +294,7 @@ static int nvram_remove_os_partition(voi
  * Will create a partition starting at the first free
  * space found if space has enough room.
  */
-static int nvram_create_os_partition(void)
+static int __init nvram_create_os_partition(void)
 {
 	struct nvram_partition *part;
 	struct nvram_partition *new_part;
@@ -397,7 +397,7 @@ static int nvram_create_os_partition(voi
  * 5.) If the max chunk cannot be allocated then try finding a chunk
  * that will satisfy the minum needed (NVRAM_MIN_REQ).
  */
-static int nvram_setup_partition(void)
+static int __init nvram_setup_partition(void)
 {
 	struct list_head * p;
 	struct nvram_partition * part;
@@ -455,7 +455,7 @@ static int nvram_setup_partition(void)
 }
 
 
-static int nvram_scan_partitions(void)
+static int __init nvram_scan_partitions(void)
 {
 	loff_t cur_index = 0;
 	struct nvram_header phead;

^ permalink raw reply

* [PATCH 1/8] 8xx: invalidate non present TLBs
From: Joakim Tjernlund @ 2009-10-15  9:04 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, Benjamin Herrenschmidt,
	linuxppc-dev@ozlabs.org
In-Reply-To: <1255597466-30976-1-git-send-email-Joakim.Tjernlund@transmode.se>

8xx sometimes need to load a invalid/non-present TLBs in
it DTLB asm handler.
These must be invalidated separaly as linux mm don't.
---
 arch/powerpc/mm/fault.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 7699394..071e0ca 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -39,7 +39,7 @@
 #include <asm/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
-
+#include <mm/mmu_decl.h>
 
 #ifdef CONFIG_KPROBES
 static inline int notify_page_fault(struct pt_regs *regs)
@@ -243,6 +243,12 @@ good_area:
 		goto bad_area;
 #endif /* CONFIG_6xx */
 #if defined(CONFIG_8xx)
+	/* 8xx sometimes need to load a invalid/non-present TLBs.
+	 * These must be invalidated separately as linux mm don't.
+	 */
+	if (error_code & 0x40000000) /* no translation? */
+		_tlbil_va(address, 0, 0, 0);
+
         /* The MPC8xx seems to always set 0x80000000, which is
          * "undefined".  Of those that can be set, this is the only
          * one which seems bad.
-- 
1.6.4.4

^ permalink raw reply related

* [PATCH 0/8]  Fix 8xx MMU/TLB
From: Joakim Tjernlund @ 2009-10-15  9:04 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, Benjamin Herrenschmidt,
	linuxppc-dev@ozlabs.org

Now updated with Scott's remarks.
There is still(probably) a trivial conflict in pte-8xx.h

Joakim Tjernlund (8):
  8xx: invalidate non present TLBs
  8xx: Update TLB asm so it behaves as linux mm expects.
  8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  8xx: Fixup DAR from buggy dcbX instructions.
  8xx: Add missing Guarded setting in DTLB Error.
  8xx: Restore _PAGE_WRITETHRU
  8xx: start using dcbX instructions in various copy routines
  8xx: Remove DIRTY pte handling in DTLB Error.

 arch/powerpc/include/asm/pte-8xx.h |   14 +-
 arch/powerpc/kernel/head_8xx.S     |  341 +++++++++++++++++++++++------------
 arch/powerpc/kernel/misc_32.S      |   18 --
 arch/powerpc/lib/copy_32.S         |   24 ---
 arch/powerpc/mm/fault.c            |    8 +-
 5 files changed, 238 insertions(+), 167 deletions(-)

^ permalink raw reply

* [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.
From: Joakim Tjernlund @ 2009-10-15  9:04 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, Benjamin Herrenschmidt,
	linuxppc-dev@ozlabs.org
In-Reply-To: <1255597466-30976-4-git-send-email-Joakim.Tjernlund@transmode.se>

This is an assembler version to fixup DAR not being set
by dcbX, icbi instructions. There are two versions, one
uses selfmodifing code, the other uses a
jump table but is much bigger(default).
---
 arch/powerpc/kernel/head_8xx.S |  180 +++++++++++++++++++++++++++++++++++++++-
 1 files changed, 176 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index bca22fa..320f333 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -494,11 +494,16 @@ DataTLBError:
 
 	mfspr	r10, SPRN_DAR
 	cmpwi	cr0, r10, 0x00f0
-	beq-	2f	/* must be a buggy dcbX, icbi insn. */
-
+	beq-	FixupDAR	/* must be a buggy dcbX, icbi insn. */
+DARFixed:/* Return from dcbx instruction bug workaround, r10 holds value of DAR */
 	mfspr	r11, SPRN_DSISR
-	andis.	r11, r11, 0x4800	/* !translation or protection */
-	bne	2f	/* branch if either is set */
+	/* As the DAR fixup may clear store we may have all 3 states zero.
+	 * Make sure only 0x0200(store) falls down into DIRTY handling
+	 */
+	andis.	r11, r11, 0x4a00	/* !translation, protection or store */
+	srwi	r11, r11, 16
+	cmpwi	cr0, r11, 0x0200	/* just store ? */
+	bne	2f
 	/* Only Change bit left now, do it here as it is faster
 	 * than trapping to the C fault handler.
 	*/
@@ -604,6 +609,173 @@ DataTLBError:
 
 	. = 0x2000
 
+/* This is the procedure to calculate the data EA for buggy dcbx,dcbi instructions
+ * by decoding the registers used by the dcbx instruction and adding them.
+ * DAR is set to the calculated address and r10 also holds the EA on exit.
+ */
+#define NO_SELF_MODIFYING_CODE /* define if you don't want to use self modifying code */
+	nop	/* A few nops to make the modified_instr: space below cache line aligned */
+	nop
+139:	/* fetch instruction from userspace memory */
+	DO_8xx_CPU6(0x3780, r3)
+	mtspr	SPRN_MD_EPN, r10
+	mfspr	r11, SPRN_M_TWB	/* Get level 1 table entry address */
+	lwz	r11, 0(r11)	/* Get the level 1 entry */
+	tophys  (r11, r11)
+	DO_8xx_CPU6(0x3b80, r3)
+	mtspr	SPRN_MD_TWC, r11	/* Load pte table base address */
+	mfspr	r11, SPRN_MD_TWC	/* ....and get the pte address */
+	lwz	r11, 0(r11)	/* Get the pte */
+	/* concat physical page address(r11) and page offset(r10) */
+	rlwimi	r11, r10, 0, 20, 31
+	b	140f
+FixupDAR:	/* Entry point for dcbx workaround. */
+	/* fetch instruction from memory. */
+	mfspr	r10, SPRN_SRR0
+	andis.	r11, r10, 0x8000
+	tophys  (r11, r10)
+	beq-	139b		/* Branch if user space address */
+140:	lwz	r11,0(r11)
+/* Check if it really is a dcbx instruction. */
+/* dcbt and dcbtst does not generate DTLB Misses/Errors,
+ * no need to include them here */
+	srwi	r10, r11, 26	/* check if major OP code is 31 */
+	cmpwi	cr0, r10, 31
+	bne-	141f
+	rlwinm	r10, r11, 0, 21, 30
+	cmpwi	cr0, r10, 2028	/* Is dcbz? */
+	beq+	142f
+	cmpwi	cr0, r10, 940	/* Is dcbi? */
+	beq+	142f
+	cmpwi	cr0, r10, 108	/* Is dcbst? */
+	beq+	144f		/* Fix up store bit! */
+	cmpwi	cr0, r10, 172	/* Is dcbf? */
+	beq+	142f
+	cmpwi	cr0, r10, 1964	/* Is icbi? */
+	beq+	142f
+141:	mfspr	r10, SPRN_DAR	/* r10 must hold DAR at exit */
+	b	DARfix		/* Nope, go back to normal TLB processing */
+
+144:	mfspr	r10, SPRN_DSISR
+	rlwinm	r10, r10,0,7,5	/* Clear store bit for buggy dcbst insn */
+	mtspr	SPRN_DSISR, r10
+142:	/* continue, it was a dcbx, dcbi instruction. */
+#ifdef CONFIG_8xx_CPU6
+	lwz	r3, 8(r0)	/* restore r3 from memory */
+#endif
+#ifndef NO_SELF_MODIFYING_CODE
+	andis.	r10,r11,0x1f	/* test if reg RA is r0 */
+	li	r10,modified_instr@l
+	dcbtst	r0,r10		/* touch for store */
+	rlwinm	r11,r11,0,0,20	/* Zero lower 10 bits */
+	oris	r11,r11,640	/* Transform instr. to a "add r10,RA,RB" */
+	ori	r11,r11,532
+	stw	r11,0(r10)	/* store add/and instruction */
+	dcbf	0,r10		/* flush new instr. to memory. */
+	icbi	0,r10		/* invalidate instr. cache line */
+	lwz	r11, 4(r0)	/* restore r11 from memory */
+	mfspr	r10, SPRN_M_TW	/* restore r10 from M_TW */
+	isync			/* Wait until new instr is loaded from memory */
+modified_instr:
+	.space	4		/* this is where the add/and instr. is stored */
+	bne+	143f
+	subf	r10,r0,r10	/* r10=r10-r0, only if reg RA is r0 */
+143:	mtdar	r10		/* store faulting EA in DAR */
+	b	DARFixed	/* Go back to normal TLB handling */
+#else
+	mfctr	r10
+	mtdar	r10			/* save ctr reg in DAR */
+	rlwinm	r10, r11, 24, 24, 28	/* offset into jump table for reg RB */
+	addi	r10, r10, 150f@l	/* add start of table */
+	mtctr	r10			/* load ctr with jump address */
+	xor	r10, r10, r10		/* sum starts at zero */
+	bctr				/* jump into table */
+150:
+	add	r10, r10, r0
+	b	151f
+	add	r10, r10, r1
+	b	151f
+	add	r10, r10, r2
+	b	151f
+	add	r10, r10, r3
+	b	151f
+	add	r10, r10, r4
+	b	151f
+	add	r10, r10, r5
+	b	151f
+	add	r10, r10, r6
+	b	151f
+	add	r10, r10, r7
+	b	151f
+	add	r10, r10, r8
+	b	151f
+	add	r10, r10, r9
+	b	151f
+	add	r10, r10, r10
+	b	151f
+	add	r10, r10, r11
+	b	151f
+	add	r10, r10, r12
+	b	151f
+	add	r10, r10, r13
+	b	151f
+	add	r10, r10, r14
+	b	151f
+	add	r10, r10, r15
+	b	151f
+	add	r10, r10, r16
+	b	151f
+	add	r10, r10, r17
+	b	151f
+	add	r10, r10, r18
+	b	151f
+	add	r10, r10, r19
+	b	151f
+	mtctr	r11	/* r10 needs special handling */
+	b	154f
+	mtctr	r11	/* r11 needs special handling */
+	b	153f
+	add	r10, r10, r22
+	b	151f
+	add	r10, r10, r23
+	b	151f
+	add	r10, r10, r24
+	b	151f
+	add	r10, r10, r25
+	b	151f
+	add	r10, r10, r25
+	b	151f
+	add	r10, r10, r27
+	b	151f
+	add	r10, r10, r28
+	b	151f
+	add	r10, r10, r29
+	b	151f
+	add	r10, r10, r30
+	b	151f
+	add	r10, r10, r31
+151:
+	rlwinm. r11,r11,19,24,28	/* offset into jump table for reg RA */
+	beq	152f			/* if reg RA is zero, don't add it */ 
+	addi	r11, r11, 150b@l	/* add start of table */
+	mtctr	r11			/* load ctr with jump address */
+	rlwinm	r11,r11,0,16,10		/* make sure we don't execute this more than once */
+	bctr				/* jump into table */
+152:
+	mfdar	r11
+	mtctr	r11			/* restore ctr reg from DAR */
+	mtdar	r10			/* save fault EA to DAR */
+	b	DARFixed		/* Go back to normal TLB handling */
+
+	/* special handling for r10,r11 since these are modified already */
+153:	lwz	r11, 4(r0)	/* load r11 from memory */
+	b	155f
+154:	mfspr	r11, SPRN_M_TW	/* load r10 from M_TW */
+155:	add	r10, r10, r11	/* add it */
+	mfctr	r11		/* restore r11 */
+	b	151b
+#endif
+
 	.globl	giveup_fpu
 giveup_fpu:
 	blr
-- 
1.6.4.4

^ permalink raw reply related

* [PATCH 3/8] 8xx: Tag DAR with 0x00f0 to catch buggy instructions.
From: Joakim Tjernlund @ 2009-10-15  9:04 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, Benjamin Herrenschmidt,
	linuxppc-dev@ozlabs.org
In-Reply-To: <1255597466-30976-3-git-send-email-Joakim.Tjernlund@transmode.se>

dcbz, dcbf, dcbi, dcbst and icbi do not set DAR when they
cause a DTLB Error. Dectect this by tagging DAR with 0x00f0
at every exception exit that modifies DAR.
Test for DAR=0x00f0 in DataTLBError and bail
to handle_page_fault().
---
 arch/powerpc/kernel/head_8xx.S |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 2011230..bca22fa 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -206,6 +206,8 @@ MachineCheck:
 	EXCEPTION_PROLOG
 	mfspr r4,SPRN_DAR
 	stw r4,_DAR(r11)
+	li r5,0x00f0
+	mtspr SPRN_DAR,r5	/* Tag DAR, to be used in DTLB Error */
 	mfspr r5,SPRN_DSISR
 	stw r5,_DSISR(r11)
 	addi r3,r1,STACK_FRAME_OVERHEAD
@@ -222,6 +224,8 @@ DataAccess:
 	stw	r10,_DSISR(r11)
 	mr	r5,r10
 	mfspr	r4,SPRN_DAR
+	li	r10,0x00f0
+	mtspr	SPRN_DAR,r10	/* Tag DAR, to be used in DTLB Error */
 	EXC_XFER_EE_LITE(0x300, handle_page_fault)
 
 /* Instruction access exception.
@@ -244,6 +248,8 @@ Alignment:
 	EXCEPTION_PROLOG
 	mfspr	r4,SPRN_DAR
 	stw	r4,_DAR(r11)
+	li	r5,0x00f0
+	mtspr	SPRN_DAR,r5	/* Tag DAR, to be used in DTLB Error */
 	mfspr	r5,SPRN_DSISR
 	stw	r5,_DSISR(r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
@@ -445,6 +451,7 @@ DataStoreTLBMiss:
 	 * of the MMU.
 	 */
 2:	li	r11, 0x00f0
+	mtspr	SPRN_DAR,r11	/* Tag DAR */
 	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
 	DO_8xx_CPU6(0x3d80, r3)
 	mtspr	SPRN_MD_RPN, r10	/* Update TLB entry */
@@ -485,6 +492,10 @@ DataTLBError:
 	stw	r10, 0(r0)
 	stw	r11, 4(r0)
 
+	mfspr	r10, SPRN_DAR
+	cmpwi	cr0, r10, 0x00f0
+	beq-	2f	/* must be a buggy dcbX, icbi insn. */
+
 	mfspr	r11, SPRN_DSISR
 	andis.	r11, r11, 0x4800	/* !translation or protection */
 	bne	2f	/* branch if either is set */
@@ -508,7 +519,8 @@ DataTLBError:
 	 * are initialized in mapin_ram().  This will avoid the problem,
 	 * assuming we only use the dcbi instruction on kernel addresses.
 	 */
-	mfspr	r10, SPRN_DAR
+
+	/* DAR is in r10 already */
 	rlwinm	r11, r10, 0, 0, 19
 	ori	r11, r11, MD_EVALID
 	mfspr	r10, SPRN_M_CASID
@@ -550,6 +562,7 @@ DataTLBError:
 	 * of the MMU.
 	 */
 	li	r11, 0x00f0
+	mtspr	SPRN_DAR,r11	/* Tag DAR */
 	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
 	DO_8xx_CPU6(0x3d80, r3)
 	mtspr	SPRN_MD_RPN, r10	/* Update TLB entry */
-- 
1.6.4.4

^ permalink raw reply related

* [PATCH 2/8] 8xx: Update TLB asm so it behaves as linux mm expects.
From: Joakim Tjernlund @ 2009-10-15  9:04 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, Benjamin Herrenschmidt,
	linuxppc-dev@ozlabs.org
In-Reply-To: <1255597466-30976-2-git-send-email-Joakim.Tjernlund@transmode.se>

Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
Get rid of _PAGE_HWWRITE too.
Pros:
 - I/D TLB Miss never needs to write to the linux pte.
 - _PAGE_ACCESSED is only set on TLB Error fixing accounting
 - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
    when a page has been made dirty.
 - Proper RO/RW mapping of user space.
 - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
 - kernel RO/user NA support.
Cons:
 - A few more instructions in the TLB Miss routines.
---
 arch/powerpc/include/asm/pte-8xx.h |   13 ++---
 arch/powerpc/kernel/head_8xx.S     |   99 ++++++++++++++++++-----------------
 2 files changed, 57 insertions(+), 55 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
index 8c6e312..f23cd15 100644
--- a/arch/powerpc/include/asm/pte-8xx.h
+++ b/arch/powerpc/include/asm/pte-8xx.h
@@ -32,22 +32,21 @@
 #define _PAGE_FILE	0x0002	/* when !present: nonlinear file mapping */
 #define _PAGE_NO_CACHE	0x0002	/* I: cache inhibit */
 #define _PAGE_SHARED	0x0004	/* No ASID (context) compare */
+#define _PAGE_DIRTY	0x0100	/* C: page changed */
 
-/* These five software bits must be masked out when the entry is loaded
- * into the TLB.
+/* These 3 software bits must be masked out when the entry is loaded
+ * into the TLB, 2 SW bits left.
  */
 #define _PAGE_EXEC	0x0008	/* software: i-cache coherency required */
 #define _PAGE_GUARDED	0x0010	/* software: guarded access */
-#define _PAGE_DIRTY	0x0020	/* software: page changed */
-#define _PAGE_RW	0x0040	/* software: user write access allowed */
-#define _PAGE_ACCESSED	0x0080	/* software: page referenced */
+#define _PAGE_ACCESSED	0x0020	/* software: page referenced */
 
 /* Setting any bits in the nibble with the follow two controls will
  * require a TLB exception handler change.  It is assumed unused bits
  * are always zero.
  */
-#define _PAGE_HWWRITE	0x0100	/* h/w write enable: never set in Linux PTE */
-#define _PAGE_USER	0x0800	/* One of the PP bits, the other is USER&~RW */
+#define _PAGE_RW	0x0400	/* lsb PP bits, inverted in HW */
+#define _PAGE_USER	0x0800	/* msb PP bits */
 
 #define _PMD_PRESENT	0x0001
 #define _PMD_BAD	0x0ff0
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 52ff8c5..2011230 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -333,26 +333,20 @@ InstructionTLBMiss:
 	mfspr	r11, SPRN_MD_TWC	/* ....and get the pte address */
 	lwz	r10, 0(r11)	/* Get the pte */
 
-#ifdef CONFIG_SWAP
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
-	stw	r10, 0(r11)
-4:
-#else
-	ori	r10, r10, _PAGE_ACCESSED
-	stw	r10, 0(r11)
-#endif
+	andi.	r11, r10, _PAGE_ACCESSED | _PAGE_PRESENT
+	cmpwi	cr0, r11, _PAGE_ACCESSED | _PAGE_PRESENT
+	bne-	cr0, 2f
+
+	/* Clear PP lsb, 0x400 */
+	rlwinm 	r10, r10, 0, 22, 20
 
 	/* The Linux PTE won't go exactly into the MMU TLB.
-	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bits 22 and 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
 	 * of the MMU.
 	 */
-2:	li	r11, 0x00f0
+	li	r11, 0x00f0
 	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
 	DO_8xx_CPU6(0x2d80, r3)
 	mtspr	SPRN_MI_RPN, r10	/* Update TLB entry */
@@ -365,6 +359,22 @@ InstructionTLBMiss:
 	lwz	r3, 8(r0)
 #endif
 	rfi
+2:
+	mfspr	r11, SPRN_SRR1
+	/* clear all error bits as TLB Miss
+	 * sets a few unconditionally
+	*/
+	rlwinm	r11, r11, 0, 0xffff
+	mtspr	SPRN_SRR1, r11
+
+	mfspr	r10, SPRN_M_TW	/* Restore registers */
+	lwz	r11, 0(r0)
+	mtcr	r11
+	lwz	r11, 4(r0)
+#ifdef CONFIG_8xx_CPU6
+	lwz	r3, 8(r0)
+#endif
+	b	InstructionAccess
 
 	. = 0x1200
 DataStoreTLBMiss:
@@ -409,21 +419,27 @@ DataStoreTLBMiss:
 	DO_8xx_CPU6(0x3b80, r3)
 	mtspr	SPRN_MD_TWC, r11
 
-#ifdef CONFIG_SWAP
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-4:
-	/* and update pte in table */
-#else
-	ori	r10, r10, _PAGE_ACCESSED
-#endif
-	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
-	stw	r10, 0(r11)
+	/* Both _PAGE_ACCESSED and _PAGE_PRESENT has to be set.
+	 * We also need to know if the insn is a load/store, so:
+	 * Clear _PAGE_PRESENT and load that which will
+	 * trap into DTLB Error with store bit set accordinly.
+	 */
+	/* PRESENT=0x1, ACCESSED=0x20
+	 * r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
+	 * r10 = (r10 & ~PRESENT) | r11;
+	 */
+	rlwinm	r11, r10, 32-5, 31, 31
+	and	r11, r11, r10
+	rlwimi	r10, r11, 0, 31, 31
+
+	/* Honour kernel RO, User NA */
+	andi.	r11, r10, _PAGE_USER | _PAGE_RW
+	bne-	cr0, 5f
+	ori	r10,r10, 0x200 /* Extended encoding, bit 22 */
+5:	xori	r10, r10, _PAGE_RW  /* invert RW bit */
 
 	/* The Linux PTE won't go exactly into the MMU TLB.
-	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bits 22 and 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
 	 * of the MMU.
@@ -469,11 +485,12 @@ DataTLBError:
 	stw	r10, 0(r0)
 	stw	r11, 4(r0)
 
-	/* First, make sure this was a store operation.
+	mfspr	r11, SPRN_DSISR
+	andis.	r11, r11, 0x4800	/* !translation or protection */
+	bne	2f	/* branch if either is set */
+	/* Only Change bit left now, do it here as it is faster
+	 * than trapping to the C fault handler.
 	*/
-	mfspr	r10, SPRN_DSISR
-	andis.	r11, r10, 0x0200	/* If set, indicates store op */
-	beq	2f
 
 	/* The EA of a data TLB miss is automatically stored in the MD_EPN
 	 * register.  The EA of a data TLB error is automatically stored in
@@ -522,26 +539,12 @@ DataTLBError:
 	mfspr	r11, SPRN_MD_TWC		/* ....and get the pte address */
 	lwz	r10, 0(r11)		/* Get the pte */
 
-	andi.	r11, r10, _PAGE_RW	/* Is it writeable? */
-	beq	2f			/* Bail out if not */
-
-	/* Update 'changed', among others.
-	*/
-#ifdef CONFIG_SWAP
-	ori	r10, r10, _PAGE_DIRTY|_PAGE_HWWRITE
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-4:
-#else
-	ori	r10, r10, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
-#endif
-	mfspr	r11, SPRN_MD_TWC		/* Get pte address again */
+	ori	r10, r10, _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_HWWRITE
 	stw	r10, 0(r11)		/* and update pte in table */
+	xori	r10, r10, _PAGE_RW	/* RW bit is inverted */
 
 	/* The Linux PTE won't go exactly into the MMU TLB.
-	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bits 22 and 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
 	 * of the MMU.
-- 
1.6.4.4

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox