LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH] powerpc/pci: Don't keep ISA memory hole resources in the tree
From: Benjamin Herrenschmidt @ 2008-07-31  5:24 UTC
  To: linuxppc-dev

When an ISA memory hole (i.e., a PCI window that allows generating
PCI memory cycles at low PCI addresses) is mixed with other
resources using a different CPU <=> PCI mapping, we must not keep
the ISA hole in the bridge resource list.

If we do, things might start trying to allocate device resources
in there and will get the PCI addresses wrong.

This patch fixes it, which fixes various cases of PCMCIA breakage
on PowerBooks using the MPC106 "grackle" bridge that supports
ISA holes.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

 arch/powerpc/kernel/pci-common.c |   17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

--- linux-work.orig/arch/powerpc/kernel/pci-common.c	2008-07-31 14:45:20.000000000 +1000
+++ linux-work/arch/powerpc/kernel/pci-common.c	2008-07-31 14:57:31.000000000 +1000
@@ -650,11 +650,18 @@ void __devinit pci_process_bridge_OF_ran
 		}
 	}
 
-	/* Out of paranoia, let's put the ISA hole last if any */
-	if (isa_hole >= 0 && memno > 0 && isa_hole != (memno-1)) {
-		struct resource tmp = hose->mem_resources[isa_hole];
-		hose->mem_resources[isa_hole] = hose->mem_resources[memno-1];
-		hose->mem_resources[memno-1] = tmp;
+	/* If there's an ISA hole and the pci_mem_offset is -not- matching
+	 * the ISA hole offset, then we need to remove the ISA hole from
+	 * the resource list for that bridge
+	 */
+	if (isa_hole >= 0 && hose->pci_mem_offset != isa_mb) {
+		unsigned int next = isa_hole + 1;
+		printk(KERN_INFO " Removing ISA hole at 0x%016llx\n", isa_mb);
+		if (next < memno)
+			memmove(&hose->mem_resources[isa_hole],
+				&hose->mem_resources[next],
+				sizeof(struct resource) * (memno - next));
+		hose->mem_resources[--memno].flags = 0;
 	}
 }
 


* Re: [PATCH 4/8] Silence warnings in arch/powerpc/platforms/52xx/mpc52xx_pci.c
From: Grant Likely @ 2008-07-31  5:27 UTC
  To: Jon Smirl; +Cc: linuxppc-dev, Paul Mackerras
In-Reply-To: <9e4733910807302108q689c9c82yb010c75cba770218@mail.gmail.com>

On Wed, Jul 30, 2008 at 10:08 PM, Jon Smirl <jonsmirl@gmail.com> wrote:
> There are some warnings in mpc5200 spi that I haven't looked at....
>
> drivers/spi/mpc52xx_psc_spi.c: In function 'mpc52xx_psc_spi_activate_cs':
> drivers/spi/mpc52xx_psc_spi.c:111: warning: passing argument 1 of
> 'in_be16' from incompatible pointer type
> drivers/spi/mpc52xx_psc_spi.c:117: warning: passing argument 1 of
> 'out_be16' from incompatible pointer type
> drivers/spi/mpc52xx_psc_spi.c: In function 'mpc52xx_psc_spi_port_config':
> drivers/spi/mpc52xx_psc_spi.c:350: warning: passing argument 1 of
> 'out_be16' from incompatible pointer type

I've got a patch for these already in my tree.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.


* Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks
From: Nick Piggin @ 2008-07-31  6:04 UTC
  To: Andrew Morton
  Cc: linux-mm, Mel Gorman, libhugetlbfs-devel, linux-kernel,
	linuxppc-dev, Eric Munson
In-Reply-To: <20080730103407.b110afc2.akpm@linux-foundation.org>

On Thursday 31 July 2008 03:34, Andrew Morton wrote:
> On Wed, 30 Jul 2008 18:23:18 +0100 Mel Gorman <mel@csn.ul.ie> wrote:
> > On (30/07/08 01:43), Andrew Morton didst pronounce:
> > > On Mon, 28 Jul 2008 12:17:10 -0700 Eric Munson <ebmunson@us.ibm.com> wrote:
> > > > Certain workloads benefit if their data or text segments are backed
> > > > by huge pages.
> > >
> > > oh.  As this is a performance patch, it would be much better if its
> > > description contained some performance measurement results!  Please.
> >
> > I ran these patches through STREAM (http://www.cs.virginia.edu/stream/).
> > STREAM itself was patched to allocate data from the stack instead of
> > statically for the test. They completed without any problem on x86,
> > x86_64 and PPC64 and each test showed a performance gain from using
> > hugepages.  I can post the raw figures but they are not currently in an
> > eye-friendly format. Here are some plots of the data though;
> >
> > x86:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/x86-stream-stack.ps
> > x86_64:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/x86_64-stream-stack.ps
> > ppc64-small:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/ppc64-small-stream-stack.ps
> > ppc64-large:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/ppc64-large-stream-stack.ps
> >
> > The test was to run STREAM with different array sizes (plotted on X-axis)
> > and measure the average throughput (y-axis). In each case, backing the
> > stack with large pages gave a performance gain.
>
> So about a 10% speedup on x86 for most STREAM configurations.  Handy -
> that's somewhat larger than most hugepage-conversions, iirc.

Although it might be a bit unusual to have codes doing huge streaming
memory operations on stack memory...

We can see why IBM is so keen on their hugepages though :)


> Do we expect that this change will be replicated in other
> memory-intensive apps?  (I do).

Such as what? It would be nice to see some numbers with some HPC or java
or DBMS workload using this. Not that I dispute it will help some cases,
but 10% (or 20% for ppc) I guess is getting toward the best case, short
of a specifically written TLB thrasher.


* Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks
From: Andrew Morton @ 2008-07-31  6:14 UTC
  To: Nick Piggin
  Cc: linux-mm, Mel Gorman, libhugetlbfs-devel, linux-kernel,
	linuxppc-dev, Eric Munson
In-Reply-To: <200807311604.14349.nickpiggin@yahoo.com.au>

On Thu, 31 Jul 2008 16:04:14 +1000 Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> > Do we expect that this change will be replicated in other
> > memory-intensive apps?  (I do).
> 
> Such as what? It would be nice to see some numbers with some HPC or java
> or DBMS workload using this. Not that I dispute it will help some cases,
> but 10% (or 20% for ppc) I guess is getting toward the best case, short
> of a specifically written TLB thrasher.

I didn't realise the STREAM is using vast amounts of automatic memory. 
I'd assumed that it was using sane amounts of stack, but the stack TLB
slots were getting zapped by all the heap-memory activity.  Oh well.

I guess that effect is still there, but smaller.

I agree that few real-world apps are likely to see gains of this
order.  More benchmarks, please :)


* Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI
From: Andrew Morton @ 2008-07-31  6:24 UTC
  To: cijoml; +Cc: linuxppc-dev, linux-scsi, bugme-daemon
In-Reply-To: <bug-11185-10286@http.bugzilla.kernel.org/>


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 30 Jul 2008 23:18:04 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11185
> 
>            Summary: Device/host RESET in SCSI
>            Product: Platform Specific/Hardware
>            Version: 2.5
>      KernelVersion: 2.6.26
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: blocking
>           Priority: P1
>          Component: PPC-64
>         AssignedTo: anton@samba.org
>         ReportedBy: cijoml@volny.cz
> 
> 
> Latest working kernel version: 2.6.25, 2.6.18??? both tested are Debian
> distribution kernels
> Earliest failing kernel version: unknown
> Distribution: Debian stable
> Hardware Environment: IBM H70, PPC64 kernel
> Software Environment: Debian stable, 2.6.26 self compiled

Why do you describe this regression as a powerpc problem rather than a
scsi one?

(It could be either or both, I'm just wondering...)

> Problem Description:
> 
> [    3.881326] sym53c8xx 0000:00:0c.0: enabling device (0140 -> 0143)
> [    3.959117] sym0: <875> rev 0x4 at pci 0000:00:0c.0 irq 17
> [    4.029503] sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
> [    4.108967] sym0: SCSI BUS has been reset.
> [    4.160753] scsi0 : sym-2.2.3
> [    4.200066] sym53c8xx 0000:00:11.0: enabling device (0140 -> 0143)
> [    4.278375] sym1: <895> rev 0x1 at pci 0000:00:11.0 irq 19
> [    4.349340] sym1: No NVRAM, ID 7, Fast-40, SE, parity checking
> [    4.429359] sym1: SCSI BUS has been reset.
> [    4.481660] scsi1 : sym-2.2.3
> [    4.521351] sym53c8xx 0001:40:0c.0: enabling device (0140 -> 0143)
> [    4.600250] sym2: <875> rev 0x3 at pci 0001:40:0c.0 irq 29
> [    4.756252] sym2: No NVRAM, ID 7, Fast-20, SE, parity checking
> [    4.836739] sym2: SCSI BUS has been reset.
> [    4.889450] scsi2 : sym-2.2.3
> [    4.929845] st: Version 20080224, fixed bufsize 32768, s/g segs 256
> [    5.008868] Driver 'st' needs updating - please use bus_type methods
> [    5.089184] Driver 'sd' needs updating - please use bus_type methods
> [    5.169000] Driver 'sr' needs updating - please use bus_type methods
> [    5.248969] SCSI Media Changer driver v0.25
> [    5.303686] Driver 'ch' needs updating - please use bus_type methods
> [    5.385159] mice: PS/2 mouse device common for all mice
> [    5.454519] TCP cubic registered
> [    5.496517] NET: Registered protocol family 17
> [    5.553945] registered taskstats version 1
> [    5.606960] scsi: waiting for bus probes to complete ...
> [   12.689883] scsi 0:0:0:0: ABORT operation started
> [   13.057829] scsi 1:0:0:0: ABORT operation started
> [   13.401828] scsi 2:0:0:0: ABORT operation started
> [   17.745837] scsi 0:0:0:0: ABORT operation timed-out.
> [   17.808370] scsi 0:0:0:0: DEVICE RESET operation started
> [   18.113823] scsi 1:0:0:0: ABORT operation timed-out.
> [   18.176365] scsi 1:0:0:0: DEVICE RESET operation started
> [   18.457824] scsi 2:0:0:0: ABORT operation timed-out.
> [   18.520280] scsi 2:0:0:0: DEVICE RESET operation started
> [   22.873822] scsi 0:0:0:0: DEVICE RESET operation timed-out.
> [   22.943527] scsi 0:0:0:0: BUS RESET operation started
> [   23.241823] scsi 1:0:0:0: DEVICE RESET operation timed-out.
> [   23.311387] scsi 1:0:0:0: BUS RESET operation started
> [   23.585824] scsi 2:0:0:0: DEVICE RESET operation timed-out.
> [   23.655371] scsi 2:0:0:0: BUS RESET operation started
> [   28.005857] scsi 0:0:0:0: BUS RESET operation timed-out.
> [   28.072268] scsi 0:0:0:0: HOST RESET operation started
> [   28.143373] sym0: SCSI BUS has been reset.
> [   28.373822] scsi 1:0:0:0: BUS RESET operation timed-out.
> [   28.440183] scsi 1:0:0:0: HOST RESET operation started
> [   28.511103] sym1: SCSI BUS has been reset.
> [   28.717826] scsi 2:0:0:0: BUS RESET operation timed-out.
> [   28.784138] scsi 2:0:0:0: HOST RESET operation started
> [   28.854961] sym2: SCSI BUS has been reset.
> [   33.193826] scsi 0:0:0:0: HOST RESET operation timed-out.
> [   33.261164] scsi 0:0:0:0: Device offlined - not ready after error recovery
> [   33.561823] scsi 1:0:0:0: HOST RESET operation timed-out.
> [   33.629409] scsi 1:0:0:0: Device offlined - not ready after error recovery
> [   33.905841] scsi 2:0:0:0: HOST RESET operation timed-out.
> [   33.973557] scsi 2:0:0:0: Device offlined - not ready after error recovery
> [   38.845823] scsi 0:0:1:0: ABORT operation started
> [   39.213828] scsi 1:0:1:0: ABORT operation started
> [   39.469825] scsi 2:0:1:0: ABORT operation started
> [   43.901857] scsi 0:0:1:0: ABORT operation timed-out.
> [   43.964323] scsi 0:0:1:0: DEVICE RESET operation started
> [   44.269822] scsi 1:0:1:0: ABORT operation timed-out.
> [   44.332252] scsi 1:0:1:0: DEVICE RESET operation started
> [   44.525821] scsi 2:0:1:0: ABORT operation timed-out.
> [   44.588275] scsi 2:0:1:0: DEVICE RESET operation started
> [   49.029823] scsi 0:0:1:0: DEVICE RESET operation timed-out.
> [   49.099525] scsi 0:0:1:0: BUS RESET operation started
> [   49.397822] scsi 1:0:1:0: DEVICE RESET operation timed-out.
> [   49.467597] scsi 1:0:1:0: BUS RESET operation started
> [   49.653820] scsi 2:0:1:0: DEVICE RESET operation timed-out.
> [   49.723526] scsi 2:0:1:0: BUS RESET operation started
> [   54.161858] scsi 0:0:1:0: BUS RESET operation timed-out.
> [   54.228409] scsi 0:0:1:0: HOST RESET operation started
> [   54.299580] sym0: SCSI BUS has been reset.
> [   54.529821] scsi 1:0:1:0: BUS RESET operation timed-out.
> [   54.596347] scsi 1:0:1:0: HOST RESET operation started
> [   54.667436] sym1: SCSI BUS has been reset.
> [   54.785819] scsi 2:0:1:0: BUS RESET operation timed-out.
> [   54.852267] scsi 2:0:1:0: HOST RESET operation started
> [   54.922982] sym2: SCSI BUS has been reset.
> [   59.349828] scsi 0:0:1:0: HOST RESET operation timed-out.
> [   59.417183] scsi 0:0:1:0: Device offlined - not ready after error recovery
> [   59.717822] scsi 1:0:1:0: HOST RESET operation timed-out.
> [   59.785439] scsi 1:0:1:0: Device offlined - not ready after error recovery
> [   59.973820] scsi 2:0:1:0: HOST RESET operation timed-out.
> [   60.041448] scsi 2:0:1:0: Device offlined - not ready after error recovery
> [   65.001825] scsi 0:0:2:0: ABORT operation started
> [   65.369821] scsi 1:0:2:0: ABORT operation started
> [   65.625824] scsi 2:0:2:0: ABORT operation started
> [   70.057856] scsi 0:0:2:0: ABORT operation timed-out.
> [   70.120341] scsi 0:0:2:0: DEVICE RESET operation started
> [   70.425820] scsi 1:0:2:0: ABORT operation timed-out.
> [   70.488251] scsi 1:0:2:0: DEVICE RESET operation started
> [   70.681820] scsi 2:0:2:0: ABORT operation timed-out.
> [   70.744266] scsi 2:0:2:0: DEVICE RESET operation started
> [   75.185827] scsi 0:0:2:0: DEVICE RESET operation timed-out.
> [   75.255546] scsi 0:0:2:0: BUS RESET operation started
> [   75.553822] scsi 1:0:2:0: DEVICE RESET operation timed-out.
> [   75.623581] scsi 1:0:2:0: BUS RESET operation started
> [   75.809837] scsi 2:0:2:0: DEVICE RESET operation timed-out.
> [   75.879524] scsi 2:0:2:0: BUS RESET operation started
> [   80.317876] scsi 0:0:2:0: BUS RESET operation timed-out.
> [   80.384393] scsi 0:0:2:0: HOST RESET operation started
> [   80.455478] sym0: SCSI BUS has been reset.
> [   80.685820] scsi 1:0:2:0: BUS RESET operation timed-out.
> [   80.752306] scsi 1:0:2:0: HOST RESET operation started
> [   80.823332] sym1: SCSI BUS has been reset.
> [   80.941820] scsi 2:0:2:0: BUS RESET operation timed-out.
> [   81.008217] scsi 2:0:2:0: HOST RESET operation started
> [   81.079035] sym2: SCSI BUS has been reset.
> [   85.505820] scsi 0:0:2:0: HOST RESET operation timed-out.
> [   85.573175] scsi 0:0:2:0: Device offlined - not ready after error recovery
> [   85.873839] scsi 1:0:2:0: HOST RESET operation timed-out.
> [   85.941331] scsi 1:0:2:0: Device offlined - not ready after error recovery
> [   86.129819] scsi 2:0:2:0: HOST RESET operation timed-out.
> [   86.197497] scsi 2:0:2:0: Device offlined - not ready after error recovery
> [   91.157827] scsi 0:0:3:0: ABORT operation started
> [   91.525844] scsi 1:0:3:0: ABORT operation started
> [   91.781824] scsi 2:0:3:0: ABORT operation started
> [   96.213848] scsi 0:0:3:0: ABORT operation timed-out.
> [   96.276335] scsi 0:0:3:0: DEVICE RESET operation started
> [   96.581820] scsi 1:0:3:0: ABORT operation timed-out.
> [   96.644261] scsi 1:0:3:0: DEVICE RESET operation started
> [   96.837819] scsi 2:0:3:0: ABORT operation timed-out.
> [   96.900213] scsi 2:0:3:0: DEVICE RESET operation started
> [  101.341843] scsi 0:0:3:0: DEVICE RESET operation timed-out.
> [  101.411555] scsi 0:0:3:0: BUS RESET operation started
> [  101.709820] scsi 1:0:3:0: DEVICE RESET operation timed-out.
> [  101.779494] scsi 1:0:3:0: BUS RESET operation started
> [  101.965819] scsi 2:0:3:0: DEVICE RESET operation timed-out.
> [  102.035530] scsi 2:0:3:0: BUS RESET operation started
> [  106.473854] scsi 0:0:3:0: BUS RESET operation timed-out.
> [  106.540347] scsi 0:0:3:0: HOST RESET operation started
> [  106.611496] sym0: SCSI BUS has been reset.
> [  106.841818] scsi 1:0:3:0: BUS RESET operation timed-out.
> [  106.908266] scsi 1:0:3:0: HOST RESET operation started
> [  106.979253] sym1: SCSI BUS has been reset.
> [  107.097818] scsi 2:0:3:0: BUS RESET operation timed-out.
> [  107.164208] scsi 2:0:3:0: HOST RESET operation started
> [  107.234919] sym2: SCSI BUS has been reset.
> [  111.661848] scsi 0:0:3:0: HOST RESET operation timed-out.
> [  111.729202] scsi 0:0:3:0: Device offlined - not ready after error recovery
> [  112.029820] scsi 1:0:3:0: HOST RESET operation timed-out.
> [  112.097359] scsi 1:0:3:0: Device offlined - not ready after error recovery
> [  112.285818] scsi 2:0:3:0: HOST RESET operation timed-out.
> [  112.353503] scsi 2:0:3:0: Device offlined - not ready after error recovery
> [  117.313828] scsi 0:0:4:0: ABORT operation started
> [  117.681823] scsi 1:0:4:0: ABORT operation started
> [  117.937839] scsi 2:0:4:0: ABORT operation started
> [  122.369861] scsi 0:0:4:0: ABORT operation timed-out.
> [  122.432287] scsi 0:0:4:0: DEVICE RESET operation started
> [  122.737819] scsi 1:0:4:0: ABORT operation timed-out.
> [  122.800214] scsi 1:0:4:0: DEVICE RESET operation started
> [  122.993820] scsi 2:0:4:0: ABORT operation timed-out.
> [  123.056273] scsi 2:0:4:0: DEVICE RESET operation started
> [  127.497830] scsi 0:0:4:0: DEVICE RESET operation timed-out.
> [  127.567586] scsi 0:0:4:0: BUS RESET operation started
> [  127.865836] scsi 1:0:4:0: DEVICE RESET operation timed-out.
> [  127.935537] scsi 1:0:4:0: BUS RESET operation started
> [  128.121818] scsi 2:0:4:0: DEVICE RESET operation timed-out.
> [  128.191627] scsi 2:0:4:0: BUS RESET operation started
> [  132.629865] scsi 0:0:4:0: BUS RESET operation timed-out.
> [  132.696399] scsi 0:0:4:0: HOST RESET operation started
> [  132.767554] sym0: SCSI BUS has been reset.
> [  132.997819] scsi 1:0:4:0: BUS RESET operation timed-out.
> [  133.064328] scsi 1:0:4:0: HOST RESET operation started
> [  133.135414] sym1: SCSI BUS has been reset.
> [  133.253817] scsi 2:0:4:0: BUS RESET operation timed-out.
> [  133.320242] scsi 2:0:4:0: HOST RESET operation started
> [  133.390987] sym2: SCSI BUS has been reset.
> [  137.817850] scsi 0:0:4:0: HOST RESET operation timed-out.
> [  137.885271] scsi 0:0:4:0: Device offlined - not ready after error recovery
> [  138.185819] scsi 1:0:4:0: HOST RESET operation timed-out.
> [  138.253407] scsi 1:0:4:0: Device offlined - not ready after error recovery
> [  138.441818] scsi 2:0:4:0: HOST RESET operation timed-out.
> [  138.509454] scsi 2:0:4:0: Device offlined - not ready after error recovery
> [  143.469830] scsi 0:0:5:0: ABORT operation started
> [  143.837839] scsi 1:0:5:0: ABORT operation started
> [  144.093822] scsi 2:0:5:0: ABORT operation started
> [  148.525863] scsi 0:0:5:0: ABORT operation timed-out.
> [  148.588334] scsi 0:0:5:0: DEVICE RESET operation started
> [  148.893821] scsi 1:0:5:0: ABORT operation timed-out.
> [  148.956234] scsi 1:0:5:0: DEVICE RESET operation started
> [  149.149817] scsi 2:0:5:0: ABORT operation timed-out.
> [  149.212295] scsi 2:0:5:0: DEVICE RESET operation started
> [  153.653831] scsi 0:0:5:0: DEVICE RESET operation timed-out.
> [  153.723629] scsi 0:0:5:0: BUS RESET operation started
> [  154.021836] scsi 1:0:5:0: DEVICE RESET operation timed-out.
> [  154.091593] scsi 1:0:5:0: BUS RESET operation started
> [  154.277817] scsi 2:0:5:0: DEVICE RESET operation timed-out.
> [  154.347515] scsi 2:0:5:0: BUS RESET operation started
> [  158.785866] scsi 0:0:5:0: BUS RESET operation timed-out.
> [  158.852420] scsi 0:0:5:0: HOST RESET operation started
> [  158.923602] sym0: SCSI BUS has been reset.
> [  159.153819] scsi 1:0:5:0: BUS RESET operation timed-out.
> [  159.220336] scsi 1:0:5:0: HOST RESET operation started
> [  159.291467] sym1: SCSI BUS has been reset.
> [  159.409816] scsi 2:0:5:0: BUS RESET operation timed-out.
> [  159.476196] scsi 2:0:5:0: HOST RESET operation started
> [  159.546998] sym2: SCSI BUS has been reset.
> [  163.973852] scsi 0:0:5:0: HOST RESET operation timed-out.
> [  164.041152] scsi 0:0:5:0: Device offlined - not ready after error recovery
> [  164.341818] scsi 1:0:5:0: HOST RESET operation timed-out.
> [  164.409398] scsi 1:0:5:0: Device offlined - not ready after error recovery
> [  164.597817] scsi 2:0:5:0: HOST RESET operation timed-out.
> [  164.665478] scsi 2:0:5:0: Device offlined - not ready after error recovery
> [  169.625832] scsi 0:0:6:0: ABORT operation started
> [  169.993842] scsi 1:0:6:0: ABORT operation started
> [  170.249820] scsi 2:0:6:0: ABORT operation started
> [  174.681864] scsi 0:0:6:0: ABORT operation timed-out.
> [  174.744345] scsi 0:0:6:0: DEVICE RESET operation started
> [  175.049819] scsi 1:0:6:0: ABORT operation timed-out.
> [  175.112276] scsi 1:0:6:0: DEVICE RESET operation started
> [  175.305816] scsi 2:0:6:0: ABORT operation timed-out.
> [  175.368234] scsi 2:0:6:0: DEVICE RESET operation started
> [  179.809848] scsi 0:0:6:0: DEVICE RESET operation timed-out.
> [  179.879566] scsi 0:0:6:0: BUS RESET operation started
> [  180.177820] scsi 1:0:6:0: DEVICE RESET operation timed-out.
> [  180.247619] scsi 1:0:6:0: BUS RESET operation started
> [  180.433833] scsi 2:0:6:0: DEVICE RESET operation timed-out.
> [  180.503503] scsi 2:0:6:0: BUS RESET operation started
> [  184.941817] scsi 0:0:6:0: BUS RESET operation timed-out.
> [  185.008334] scsi 0:0:6:0: HOST RESET operation started
> [  185.079490] sym0: SCSI BUS has been reset.
> [  185.309817] scsi 1:0:6:0: BUS RESET operation timed-out.
> [  185.376387] scsi 1:0:6:0: HOST RESET operation started
> [  185.447443] sym1: SCSI BUS has been reset.
> [  185.565816] scsi 2:0:6:0: BUS RESET operation timed-out.
> [  185.632200] scsi 2:0:6:0: HOST RESET operation started
> [  185.703010] sym2: SCSI BUS has been reset.
> [  190.129853] scsi 0:0:6:0: HOST RESET operation timed-out.
> [  190.197192] scsi 0:0:6:0: Device offlined - not ready after error recovery
> [  190.497836] scsi 1:0:6:0: HOST RESET operation timed-out.
> [  190.565325] scsi 1:0:6:0: Device offlined - not ready after error recovery
> [  190.753820] scsi 2:0:6:0: HOST RESET operation timed-out.
> [  190.753842] scsi 2:0:6:0: Device offlined - not ready after error recovery
> [  195.781822] scsi 0:0:8:0: ABORT operation started
> [  196.149821] scsi 1:0:8:0: ABORT operation started
> [  196.253813] scsi 2:0:8:0: ABORT operation started
> [  200.837815] scsi 0:0:8:0: ABORT operation timed-out.
> [  200.900204] scsi 0:0:8:0: DEVICE RESET operation started
> [  201.205818] scsi 1:0:8:0: ABORT operation timed-out.
> [  201.268267] scsi 1:0:8:0: DEVICE RESET operation started
> [  201.334835] scsi 2:0:8:0: ABORT operation timed-out.
> [  201.397227] scsi 2:0:8:0: DEVICE RESET operation started
> and so on in neverending loop...
> 
> Steps to reproduce:
> 
> Boot with 2.6.26
> 


* Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks
From: Nick Piggin @ 2008-07-31  6:26 UTC
  To: Andrew Morton
  Cc: linux-mm, Mel Gorman, libhugetlbfs-devel, linux-kernel,
	linuxppc-dev, Eric Munson
In-Reply-To: <20080730231428.a7bdcfa7.akpm@linux-foundation.org>

On Thursday 31 July 2008 16:14, Andrew Morton wrote:
> On Thu, 31 Jul 2008 16:04:14 +1000 Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > > Do we expect that this change will be replicated in other
> > > memory-intensive apps?  (I do).
> >
> > Such as what? It would be nice to see some numbers with some HPC or java
> > or DBMS workload using this. Not that I dispute it will help some cases,
> > but 10% (or 20% for ppc) I guess is getting toward the best case, short
> > of a specifically written TLB thrasher.
>
> I didn't realise the STREAM is using vast amounts of automatic memory.
> I'd assumed that it was using sane amounts of stack, but the stack TLB
> slots were getting zapped by all the heap-memory activity.  Oh well.

An easy mistake to make because that's probably how STREAM would normally
work. I think what Mel had done is to modify the stream kernel so as to
have it operate on arrays of stack memory.


> I guess that effect is still there, but smaller.

I imagine it should be, unless you're using a CPU with separate TLBs for
small and huge pages, and your large data set is mapped with huge pages,
in which case you might now introduce *new* TLB contention between the
stack and the dataset :)

Also, interestingly I have actually seen some CPUs whose memory operations
get significantly slower when operating on large pages than small (in the
case when there is full TLB coverage for both sizes). This would make
sense if the CPU only implements a fast L1 TLB for small pages.

So for the vast majority of workloads, where stacks are relatively small
(or slowly changing), and relatively hot, I suspect this could easily have
no benefit at best and slowdowns at worst.

But I'm not saying that as a reason not to merge it -- this is no
different from any other hugepage allocations and as usual they have to be
used selectively where they help.... I just wonder exactly where huge
stacks will help.


> I agree that few real-world apps are likely to see gains of this
> order.  More benchmarks, please :)

Would be nice, if just out of morbid curiosity :)


* Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI
From: Michael Ellerman @ 2008-07-31  6:43 UTC
  To: Andrew Morton; +Cc: cijoml, linuxppc-dev, bugme-daemon, linux-scsi
In-Reply-To: <20080730232409.81965504.akpm@linux-foundation.org>

[-- Attachment #1: Type: text/plain, Size: 5737 bytes --]

On Wed, 2008-07-30 at 23:24 -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Wed, 30 Jul 2008 23:18:04 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=11185
> > 
> >            Summary: Device/host RESET in SCSI
> >            Product: Platform Specific/Hardware
> >            Version: 2.5
> >      KernelVersion: 2.6.26
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: blocking
> >           Priority: P1
> >          Component: PPC-64
> >         AssignedTo: anton@samba.org
> >         ReportedBy: cijoml@volny.cz
> > 
> > 
> > Latest working kernel version: 2.6.25, 2.6.18??? both tested are Debian
> > distribution kernels
> > Earliest failing kernel version: unknown
> > Distribution: Debian stable
> > Hardware Environment: IBM H70, PPC64 kernel
> > Software Environment: Debian stable, 2.6.26 self compiled
> 
> Why do you describe this regression as a powerpc problem rather than a
> scsi one?
> 
> (It could be either or both, I'm just wondering...)
> 
> > Problem Description:
> > 
> > [    3.881326] sym53c8xx 0000:00:0c.0: enabling device (0140 -> 0143)
> > [    3.959117] sym0: <875> rev 0x4 at pci 0000:00:0c.0 irq 17
> > [    4.029503] sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
> > [    4.108967] sym0: SCSI BUS has been reset.
> > [    4.160753] scsi0 : sym-2.2.3
> > [    4.200066] sym53c8xx 0000:00:11.0: enabling device (0140 -> 0143)
> > [    4.278375] sym1: <895> rev 0x1 at pci 0000:00:11.0 irq 19
> > [    4.349340] sym1: No NVRAM, ID 7, Fast-40, SE, parity checking
> > [    4.429359] sym1: SCSI BUS has been reset.
> > [    4.481660] scsi1 : sym-2.2.3
> > [    4.521351] sym53c8xx 0001:40:0c.0: enabling device (0140 -> 0143)
> > [    4.600250] sym2: <875> rev 0x3 at pci 0001:40:0c.0 irq 29
> > [    4.756252] sym2: No NVRAM, ID 7, Fast-20, SE, parity checking
> > [    4.836739] sym2: SCSI BUS has been reset.
> > [    4.889450] scsi2 : sym-2.2.3


I don't know much about scsi, but I have a 44P (POWER3) which boots fine:

Linux version 2.6.27-rc1 (benh@grosgo) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu8
...
sym53c8xx 0000:00:0c.0: enabling device (0140 -> 0143)
sym0: <896> rev 0x7 at pci 0000:00:0c.0 irq 17
sym0: No NVRAM, ID 7, Fast-40, SE, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
scsi 0:0:1:0: CD-ROM            IBM      CDRM00203        1_05 PQ: 0 ANSI: 2
 target0:0:1: Beginning Domain Validation
 target0:0:1: asynchronous
 target0:0:1: wide asynchronous
 target0:0:1: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 15)
 target0:0:1: Domain Validation skipping write tests
 target0:0:1: Ending Domain Validation
 target0:0:4: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
scsi 0:0:4:0: Direct-Access     IBM      DDYS-T09170N     S96F PQ: 0 ANSI: 3
 target0:0:4: tagged command queuing enabled, command queue depth 16.
 target0:0:4: Beginning Domain Validation
 target0:0:4: asynchronous
 target0:0:4: wide asynchronous
 target0:0:4: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
 target0:0:4: Domain Validation skipping write tests
 target0:0:4: Ending Domain Validation
 target0:0:5: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
scsi 0:0:5:0: Direct-Access     IBM      DDYS-T09170N     S96F PQ: 0 ANSI: 3
 target0:0:5: tagged command queuing enabled, command queue depth 16.
 target0:0:5: Beginning Domain Validation
 target0:0:5: asynchronous
 target0:0:5: wide asynchronous
 target0:0:5: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
 target0:0:5: Domain Validation skipping write tests
 target0:0:5: Ending Domain Validation
sym53c8xx 0000:00:0c.1: enabling device (0140 -> 0143)
sym1: <896> rev 0x7 at pci 0000:00:0c.1 irq 18
sym1: No NVRAM, ID 7, Fast-40, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi1 : sym-2.2.3
ipr: IBM Power RAID SCSI Device Driver version: 2.4.1 (April 24, 2007)
st: Version 20080504, fixed bufsize 32768, s/g segs 256
Driver 'st' needs updating - please use bus_type methods
Driver 'sd' needs updating - please use bus_type methods
sd 0:0:4:0: [sda] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:4:0: [sda] Write Protect is off
sd 0:0:4:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DA
sd 0:0:4:0: [sda] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:4:0: [sda] Write Protect is off
sd 0:0:4:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DA
 sda: sda1
sd 0:0:4:0: [sda] Attached SCSI disk
sd 0:0:5:0: [sdb] Spinning up disk..............ready
sd 0:0:5:0: [sdb] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:5:0: [sdb] Write Protect is off
sd 0:0:5:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DA
sd 0:0:5:0: [sdb] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:5:0: [sdb] Write Protect is off
sd 0:0:5:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DA
 sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 >
sd 0:0:5:0: [sdb] Attached SCSI disk
Driver 'sr' needs updating - please use bus_type methods
sr0: scsi-1 drive
Uniform CD-ROM driver Revision: 3.20
sr 0:0:1:0: Attached scsi generic sg0 type 5
sd 0:0:4:0: Attached scsi generic sg1 type 0
sd 0:0:5:0: Attached scsi generic sg2 type 0


cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* [PATCH] powerpc/kdump: Fix /dev/oldmem interface
From: Sachin P. Sant @ 2008-07-31  6:54 UTC
  To: benh; +Cc: linuxppc-dev, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 432 bytes --]

This patch fixes the /dev/oldmem interface for kdump on ppc64.
The patch originally came from Michael Ellerman, hence I have
retained his Signed-off-by line. I just rediffed/tested it against
the latest git. Michael has ack'ed this patch.

Ben this is not a must for 2.6.27 but would be good if it's 
included.

Thanks
-Sachin

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Michael Ellerman <michael@ellerman.id.au>

---


[-- Attachment #2: fix-oldmem-interface-of-kdump.patch --]
[-- Type: text/x-patch, Size: 1839 bytes --]

Fix /dev/oldmem for kdump

A change to __ioremap() broke reading /dev/oldmem because we're no
longer able to ioremap pfn 0 (d177c207ba16b1db31283e2d1fee7ad4a863584b).

We actually don't need to ioremap for anything that's part of the linear
mapping, so just read it directly.

Also make sure we're only reading one page or less at a time.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
---
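
The two points made in the description above (read directly when the page is
part of the linear mapping, and never read more than one page per call) can be
sketched in userspace C. This is only a model under stated assumptions: a flat
buffer stands in for directly addressable RAM, the ioremap path is not
modelled at all, and the names oldmem_model, model_copy_oldmem_page and
MODEL_PAGE_SIZE are hypothetical, not kernel identifiers.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* Hypothetical model of "old memory": one flat buffer stands in for RAM. */
struct oldmem_model {
	const unsigned char *ram;	/* directly addressable (linear) part */
	unsigned long max_pfn;		/* first pfn beyond the linear part */
};

/*
 * Copy up to one page from page `pfn`, starting at `offset`.
 * Returns the number of bytes copied, or 0 when pfn lies outside the
 * modelled linear range (where the real code would ioremap instead).
 * The caller is assumed to keep offset + csize within one page.
 */
static size_t model_copy_oldmem_page(const struct oldmem_model *m,
				     unsigned long pfn, char *buf,
				     size_t csize, unsigned long offset)
{
	if (!csize)
		return 0;

	/* Never read more than one page per call. */
	if (csize > MODEL_PAGE_SIZE)
		csize = MODEL_PAGE_SIZE;

	if (pfn >= m->max_pfn)
		return 0;	/* the sketch does not model the ioremap path */

	/* Linear part: address it directly, no mapping step needed. */
	memcpy(buf, m->ram + pfn * MODEL_PAGE_SIZE + offset, csize);
	return csize;
}
```

A request for 8192 bytes from a page inside the modelled range is clamped to
4096 bytes, which mirrors the min(csize, PAGE_SIZE) step in the diff below.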

diff -Naurp 1/arch/powerpc/kernel/crash_dump.c 2/arch/powerpc/kernel/crash_dump.c
--- 1/arch/powerpc/kernel/crash_dump.c	2008-07-31 11:55:39.000000000 +0530
+++ 2/arch/powerpc/kernel/crash_dump.c	2008-07-31 12:09:49.000000000 +0530
@@ -86,6 +86,19 @@ static int __init parse_savemaxmem(char 
 }
 __setup("savemaxmem=", parse_savemaxmem);
 
+
+static size_t copy_oldmem_vaddr(void *vaddr, char *buf, size_t csize,
+                               unsigned long offset, int userbuf)
+{
+	if (userbuf) {
+		if (copy_to_user((char __user *)buf, (vaddr + offset), csize))
+			return -EFAULT;
+	} else
+		memcpy(buf, (vaddr + offset), csize);
+
+	return csize;
+}
+
 /**
  * copy_oldmem_page - copy one page from "oldmem"
  * @pfn: page frame number to be copied
@@ -107,16 +120,16 @@ ssize_t copy_oldmem_page(unsigned long p
 	if (!csize)
 		return 0;
 
-	vaddr = __ioremap(pfn << PAGE_SHIFT, PAGE_SIZE, 0);
+	csize = min(csize, PAGE_SIZE);
 
-	if (userbuf) {
-		if (copy_to_user((char __user *)buf, (vaddr + offset), csize)) {
-			iounmap(vaddr);
-			return -EFAULT;
-		}
-	} else
-		memcpy(buf, (vaddr + offset), csize);
+	if (pfn < max_pfn) {
+		vaddr = __va(pfn << PAGE_SHIFT);
+		csize = copy_oldmem_vaddr(vaddr, buf, csize, offset, userbuf);
+	} else {
+		vaddr = __ioremap(pfn << PAGE_SHIFT, PAGE_SIZE, 0);
+		csize = copy_oldmem_vaddr(vaddr, buf, csize, offset, userbuf);
+		iounmap(vaddr);
+	}
 
-	iounmap(vaddr);
 	return csize;
 }

^ permalink raw reply

* Re: [PATCH 5/8] Silence warning in arch/powerpc/mm/ppc_mmu_32.c
From: Milton Miller @ 2008-07-31  6:54 UTC (permalink / raw)
  To: Tony Breeds; +Cc: ppcdev, Stephen Rothwell
In-Reply-To: <20080731152125.2bfec5b7.sfr@canb.auug.org.au>

On Thu Jul 31 at 15:21:25 EST in 2008, Stephen Rothwell wrote:
> On Thu, 31 Jul 2008 13:51:43 +1000 (EST) Tony Breeds <tony at bakeyournoodle.com> wrote:
>>
>> total_memory is a 'phys_addr_t', cast to unsigned long to silence
>> warning.
>>
>> diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c
>> index c53145f..9c19655 100644
>> --- a/arch/powerpc/mm/ppc_mmu_32.c
>> +++ b/arch/powerpc/mm/ppc_mmu_32.c
>> @@ -237,7 +237,7 @@ void __init MMU_init_hw(void)
>>       Hash_end = (struct hash_pte *) ((unsigned long)Hash + Hash_size);
>>
>>       printk("Total memory = %ldMB; using %ldkB for hash table (at %p)\n",
>> -            total_memory >> 20, Hash_size >> 10, Hash);
>> +            (unsigned long)total_memory >> 20, Hash_size >> 10, Hash);
>
> Will this ever be built with CONFIG_PHYS_64BIT?

I think that is how the warning originates.

But please, cast the result of the shift.  Otherwise it will print 0MB 
instead of 4096MB.
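
To make the ordering issue concrete, here is a minimal userspace sketch. It
assumes a 32-bit unsigned long (modelled with uint32_t) and a 64-bit
phys_addr_t (modelled with uint64_t); the type and helper names are
hypothetical and do not come from the kernel.

```c
#include <stdint.h>

/*
 * Stand-ins for a CONFIG_PHYS_64BIT build on 32-bit powerpc (assumption):
 * phys_addr_t is 64 bits wide, unsigned long is 32 bits wide.
 */
typedef uint64_t phys_addr_model_t;
typedef uint32_t ulong_model_t;

/* Cast first, then shift: the upper 32 bits are dropped before the shift. */
static ulong_model_t mb_cast_then_shift(phys_addr_model_t total)
{
	return (ulong_model_t)total >> 20;
}

/* Shift first, then cast: the value in MB fits in 32 bits. */
static ulong_model_t mb_shift_then_cast(phys_addr_model_t total)
{
	return (ulong_model_t)(total >> 20);
}
```

With 4 GB of RAM (1ULL << 32), the first helper yields 0 and the second
yields 4096, which is the 0MB versus 4096MB difference described above.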

The patches for 4G of RAM are in for one platform and in progress for a
second.

milton

^ permalink raw reply

* [PATCH v2] Guard print_device_node_tree() with #if 0.
From: Tony Breeds @ 2008-07-31  6:54 UTC (permalink / raw)
  To: Paul Mackerras, Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <c5fdfe03bcac5d6964b6712a2a15ab0edc12ce83.1217476198.git.tony@bakeyournoodle.com>

Currently print_device_node_tree() isn't called, but it can be useful for
debugging.  Leave the function there but hide it behind '#if 0' to save
it being rewritten.  If you want to call it you're already editing this
file anyway ;P

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
Changes since v1:
 - No longer breaks CONFIG_PPC_PSERIES_DEBUG.

 arch/powerpc/platforms/pseries/eeh_driver.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index 8c1ca47..0ad56ff 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -41,7 +41,7 @@ static inline const char * pcid_name (struct pci_dev *pdev)
 	return "";
 }
 
-#ifdef DEBUG
+#if 0
 static void print_device_node_tree(struct pci_dn *pdn, int dent)
 {
 	int i;
-- 
1.5.6.3

^ permalink raw reply related

* [PATCH v2] Force printing of 'total_memory' to unsigned long long in ppc_mmu_32.c
From: Tony Breeds @ 2008-07-31  6:57 UTC (permalink / raw)
  To: Paul Mackerras, Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <6827a84e83495f49c991a33b2cef07614273fa0a.1217476198.git.tony@bakeyournoodle.com>

total_memory is a 'phys_addr_t', which can be either 64 or 32 bits.
Force printing as unsigned long long to silence the warning.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
Changes since v1:
 - correctly use a 64-bit type, as phys_addr_t won't always be 32 bits.  Thanks to sfr for showing me the error of my ways ;P
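
The approach described above, widening to unsigned long long so that a single
format string works whatever the width of the type, can be sketched in
userspace as follows. The typedef and function name are hypothetical. The
sketch uses %llu, the unsigned conversion; that is a choice made here to match
the unsigned cast, not something taken from the patch.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical stand-in: the real phys_addr_t width depends on the config. */
typedef uint64_t phys_addr_model_t;

/*
 * Format the memory size in MB into buf.
 * Returns the number of characters written, as reported by snprintf().
 */
static int format_total_mb(char *buf, size_t len, phys_addr_model_t total)
{
	/* The cast widens before the shift, so no bits are lost. */
	return snprintf(buf, len, "Total memory = %lluMB",
			(unsigned long long)total >> 20);
}
```

Changing the typedef to uint32_t leaves the function body untouched, which is
the point of casting to the widest type before printing.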

 arch/powerpc/mm/ppc_mmu_32.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c
index c53145f..07473e0 100644
--- a/arch/powerpc/mm/ppc_mmu_32.c
+++ b/arch/powerpc/mm/ppc_mmu_32.c
@@ -236,8 +236,8 @@ void __init MMU_init_hw(void)
 
 	Hash_end = (struct hash_pte *) ((unsigned long)Hash + Hash_size);
 
-	printk("Total memory = %ldMB; using %ldkB for hash table (at %p)\n",
-	       total_memory >> 20, Hash_size >> 10, Hash);
+	printk("Total memory = %lldMB; using %ldkB for hash table (at %p)\n",
+	       (unsigned long long)total_memory >> 20, Hash_size >> 10, Hash);
 
 
 	/*
-- 
1.5.6.3

^ permalink raw reply related

* Re: [PATCH 5/8] Silence warning in arch/powerpc/mm/ppc_mmu_32.c
From: Tony Breeds @ 2008-07-31  6:58 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: Paul Mackerras, linuxppc-dev
In-Reply-To: <20080731152125.2bfec5b7.sfr@canb.auug.org.au>

On Thu, Jul 31, 2008 at 03:21:25PM +1000, Stephen Rothwell wrote:
> Hi Tony,

Rusty has forever ruined 'Hi $name' ;P

<snip>
 
> Will this ever be built with CONFIG_PHYS_64BIT?

Updated patch follows.

Yours Tony

  linux.conf.au    http://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!

^ permalink raw reply

* Re: [PATCH] powerpc/kdump: Fix /dev/oldmem interface
From: Michael Ellerman @ 2008-07-31  7:12 UTC (permalink / raw)
  To: Sachin P. Sant; +Cc: Vivek Goyal, linuxppc-dev
In-Reply-To: <489161A4.6060801@in.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]

On Thu, 2008-07-31 at 12:24 +0530, Sachin P. Sant wrote:
> This patch fixes the /dev/oldmem interface for kdump on ppc64.
> The patch originally came from Michael Ellerman, hence I have
> retained his Signed-off-by line. I just rediffed and tested it against
> the latest git. Michael has acked this patch.
> 
> Ben, this is not a must for 2.6.27, but it would be good if it's
> included.

It's been broken long enough, I think it's a must for 27.


> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
> Acked-by: Michael Ellerman <michael@ellerman.id.au>

Heh, do I get a prize for being neurotic? ;)  .. Or just stupid, ahem.

I think you should also sign off on it Sachin, see clause (c) of the
developers cert (http://lwn.net/Articles/139918/)

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

^ permalink raw reply

* Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI
From: Matthew Wilcox @ 2008-07-31  7:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: cijoml, linuxppc-dev, linux-scsi, bugme-daemon
In-Reply-To: <20080730232409.81965504.akpm@linux-foundation.org>

On Wed, Jul 30, 2008 at 11:24:09PM -0700, Andrew Morton wrote:
> Why do you describe this regression as a powerpc problem rather than a
> scsi one?
> 
> (It could be either or both, I'm just wondering...)

This seems quite astute of the reporter.  The error messages from sym2
are consistent with an interrupt routing problem.  I have an idea for
reporting this more effectively (because this comes up every 3-6 months
or so) but testing that patch will have to wait until I'm back home.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply

* Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI
From: Michael Ellerman @ 2008-07-31  7:32 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: cijoml, linuxppc-dev, Andrew Morton, bugme-daemon, linux-scsi
In-Reply-To: <20080731072159.GB30534@parisc-linux.org>

[-- Attachment #1: Type: text/plain, Size: 869 bytes --]

On Thu, 2008-07-31 at 01:21 -0600, Matthew Wilcox wrote:
> On Wed, Jul 30, 2008 at 11:24:09PM -0700, Andrew Morton wrote:
> > Why do you describe this regression as a powerpc problem rather than a
> > scsi one?
> > 
> > (It could be either or both, I'm just wondering...)
> 
> This seems quite astute of the reporter.  The error messages from sym2
> are consistent with an interrupt routing problem. 

Hmm I suppose.

In that case can we see the full dmesg and a tarball
of /proc/device-tree from a working kernel, Cijoml?

Which raises the question: what was the latest working kernel version?

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

^ permalink raw reply

* Re: ide pmac breakage
From: Alan Cox @ 2008-07-31  8:49 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: FUJITA Tomonori, petkovbb, linuxppc-dev, linux-ide
In-Reply-To: <200807310248.29214.bzolnier@gmail.com>

> There seems to be some confusion between warm-plugging of IDE devices
> and hot-plugging of IDE devices.
> 
> > not a single piece of HW to exercise those code path ? I don't ask you
> > to get a powermac with a media-bay, but ide_cs seems to be a pretty
> > important one that's part of what the ide maintainer should take care
> > of... And I suspect it's going to exercise the same code path as
> > mediabay.

That confuses people sometimes - ide_cs is a controller hotplug not a
device hotplug ...

^ permalink raw reply

* Re: ide pmac breakage
From: Benjamin Herrenschmidt @ 2008-07-31  9:11 UTC (permalink / raw)
  To: Alan Cox
  Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz,
	linuxppc-dev
In-Reply-To: <20080731094901.062a2602@lxorguk.ukuu.org.uk>

On Thu, 2008-07-31 at 09:49 +0100, Alan Cox wrote:
> > There seems to be some confusion between warm-plugging of IDE devices
> > and hot-plugging of IDE devices.
> > 
> > > not a single piece of HW to exercise those code path ? I don't ask you
> > > to get a powermac with a media-bay, but ide_cs seems to be a pretty
> > > important one that's part of what the ide maintainer should take care
> > > of... And I suspect it's going to exercise the same code path as
> > > mediabay.
> 
> That confuses people sometimes - ide_cs is a controller hotplug not a
> device hotplug ...

I could make the media-bay look like a controller hotplug if it was
going to make things easier...

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] powerpc/kdump: Fix /dev/oldmem interface
From: Sachin P. Sant @ 2008-07-31  9:20 UTC (permalink / raw)
  To: michael; +Cc: Vivek Goyal, linuxppc-dev
In-Reply-To: <1217488357.1487.51.camel@localhost>

Michael Ellerman wrote:
> Heh, do I get a prize for being neurotic? ;)  .. Or just stupid, ahem.
>
> I think you should also sign off on it Sachin, see clause (c) of the
> developers cert (http://lwn.net/Articles/139918/)
>
>   
Here is a Signed-off-by from me. Let me know if I need to resubmit
the patch with all Signed-off-by lines.

Signed-off-by: Sachin Sant <sachinp@in.ibm.com>


-- 
Thanks
-Sachin

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

^ permalink raw reply

* Re: ide pmac breakage
From: Alan Cox @ 2008-07-31  9:13 UTC (permalink / raw)
  To: benh
  Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz,
	linuxppc-dev
In-Reply-To: <1217495493.11188.441.camel@pasglop>

> I could make the media-bay look like a controller hotplug if it was
> going to make things easier...

I'm not sure it will. It may do nowadays, but the older IDE code
historically was fairly broken for both cases except in 2.4. Also faking
it as controller hotplug is the wrong path for libata which does real
drive hot plug.


Alan

^ permalink raw reply

* [PATCH] powerpc - Initialize the irq radix tree earlier
From: Sebastien Dugue @ 2008-07-31  9:40 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: tinytim, linux-rt-users, linux-kernel, rostedt, jean-pierre.dion,
	Sebastien Dugue, paulus, gilles.carry, tglx
In-Reply-To: <1217497241-10685-1-git-send-email-sebastien.dugue@bull.net>

  The radix tree used for fast irq reverse mapping by the XICS is initialized
late in the boot process, after the first interrupt (IPI) gets registered
and after the first IPI is received.

  This patch moves the initialization of the XICS radix tree earlier into
the boot process in smp_xics_probe() (the mm is already up but no interrupts
have been registered at that point) to avoid having to insert a mapping into
the tree in interrupt context. This will help in simplifying the locking
constraints and move to a lockless radix tree in subsequent patches.

  As a nice side effect, there is no longer any need to check for
(host->revmap_data.tree.gfp_mask != 0) to know if the tree has been
initialized.


Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/kernel/irq.c            |   44 +++++++++------------------------
 arch/powerpc/platforms/pseries/smp.c |    1 +
 include/asm-powerpc/irq.h            |    5 ++++
 3 files changed, 18 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6ac8612..0a1445c 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -840,9 +840,6 @@ void irq_dispose_mapping(unsigned int virq)
 			host->revmap_data.linear.revmap[hwirq] = NO_IRQ;
 		break;
 	case IRQ_HOST_MAP_TREE:
-		/* Check if radix tree allocated yet */
-		if (host->revmap_data.tree.gfp_mask == 0)
-			break;
 		irq_radix_wrlock(&flags);
 		radix_tree_delete(&host->revmap_data.tree, hwirq);
 		irq_radix_wrunlock(flags);
@@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
 }
 EXPORT_SYMBOL_GPL(irq_find_mapping);
 
+void __init irq_radix_revmap_init(void)
+{
+ 	struct irq_host *h;
+
+	list_for_each_entry(h, &irq_hosts, link) {
+		if (h->revmap_type == IRQ_HOST_MAP_TREE)
+			INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
+	}
+}
 
 unsigned int irq_radix_revmap(struct irq_host *host,
 			      irq_hw_number_t hwirq)
 {
-	struct radix_tree_root *tree;
 	struct irq_map_entry *ptr;
 	unsigned int virq;
 	unsigned long flags;
 
 	WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);
 
-	/* Check if the radix tree exist yet. We test the value of
-	 * the gfp_mask for that. Sneaky but saves another int in the
-	 * structure. If not, we fallback to slow mode
-	 */
-	tree = &host->revmap_data.tree;
-	if (tree->gfp_mask == 0)
-		return irq_find_mapping(host, hwirq);
-
-	/* Now try to resolve */
+	/* Try to resolve */
 	irq_radix_rdlock(&flags);
-	ptr = radix_tree_lookup(tree, hwirq);
+	ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
 	irq_radix_rdunlock(flags);
 
 	/* Found it, return */
@@ -927,7 +924,7 @@ unsigned int irq_radix_revmap(struct irq_host *host,
 	virq = irq_find_mapping(host, hwirq);
 	if (virq != NO_IRQ) {
 		irq_radix_wrlock(&flags);
-		radix_tree_insert(tree, hwirq, &irq_map[virq]);
+		radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
 		irq_radix_wrunlock(flags);
 	}
 	return virq;
@@ -1035,23 +1032,6 @@ void irq_early_init(void)
 		get_irq_desc(i)->status |= IRQ_NOREQUEST;
 }
 
-/* We need to create the radix trees late */
-static int irq_late_init(void)
-{
-	struct irq_host *h;
-	unsigned long flags;
-
-	irq_radix_wrlock(&flags);
-	list_for_each_entry(h, &irq_hosts, link) {
-		if (h->revmap_type == IRQ_HOST_MAP_TREE)
-			INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
-	}
-	irq_radix_wrunlock(flags);
-
-	return 0;
-}
-arch_initcall(irq_late_init);
-
 #ifdef CONFIG_VIRQ_DEBUG
 static int virq_debug_show(struct seq_file *m, void *private)
 {
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index 9d8f8c8..b143fe7 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)
 
 static int __init smp_xics_probe(void)
 {
+	irq_radix_revmap_init();
 	xics_request_IPIs();
 
 	return cpus_weight(cpu_possible_map);
diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
index 1ef8e30..47b8119 100644
--- a/include/asm-powerpc/irq.h
+++ b/include/asm-powerpc/irq.h
@@ -237,6 +237,11 @@ extern unsigned int irq_find_mapping(struct irq_host *host,
  */
 extern unsigned int irq_create_direct_mapping(struct irq_host *host);
 
+/*
+ * Initialize the radix tree used by some irq controllers
+ */
+extern void __init irq_radix_revmap_init(void);
+
 /**
  * irq_radix_revmap - Find a linux virq from a hw irq number.
  * @host: host owning this hardware interrupt
-- 
1.5.5.1

^ permalink raw reply related

* [PATCH] powerpc - Separate the irq radix tree insertion and lookup
From: Sebastien Dugue @ 2008-07-31  9:40 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: tinytim, linux-rt-users, linux-kernel, rostedt, jean-pierre.dion,
	Sebastien Dugue, paulus, gilles.carry, tglx
In-Reply-To: <1217497241-10685-1-git-send-email-sebastien.dugue@bull.net>

  irq_radix_revmap() currently serves 2 purposes, irq mapping lookup
and insertion which happen in interrupt and process context respectively.

  Separate the function into its 2 components, one for lookup only and one
for insertion only.


Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/kernel/irq.c             |   25 ++++++++++++++-----------
 arch/powerpc/platforms/pseries/xics.c |   11 ++++-------
 include/asm-powerpc/irq.h             |   18 +++++++++++++++---
 3 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 0a1445c..083b181 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -900,34 +900,37 @@ void __init irq_radix_revmap_init(void)
 	}
 }
 
-unsigned int irq_radix_revmap(struct irq_host *host,
-			      irq_hw_number_t hwirq)
+unsigned int irq_radix_revmap_lookup(struct irq_host *host,
+				     irq_hw_number_t hwirq)
 {
 	struct irq_map_entry *ptr;
-	unsigned int virq;
+	unsigned int virq = NO_IRQ;
 	unsigned long flags;
 
 	WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);
 
-	/* Try to resolve */
 	irq_radix_rdlock(&flags);
 	ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
 	irq_radix_rdunlock(flags);
 
-	/* Found it, return */
-	if (ptr) {
+	if (ptr)
 		virq = ptr - irq_map;
-		return virq;
-	}
 
-	/* If not there, try to insert it */
-	virq = irq_find_mapping(host, hwirq);
+	return virq;
+}
+
+void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
+			     irq_hw_number_t hwirq)
+{
+	unsigned long flags;
+
+	WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);
+
 	if (virq != NO_IRQ) {
 		irq_radix_wrlock(&flags);
 		radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
 		irq_radix_wrunlock(flags);
 	}
-	return virq;
 }
 
 unsigned int irq_linear_revmap(struct irq_host *host,
diff --git a/arch/powerpc/platforms/pseries/xics.c b/arch/powerpc/platforms/pseries/xics.c
index 0fc830f..6b1a005 100644
--- a/arch/powerpc/platforms/pseries/xics.c
+++ b/arch/powerpc/platforms/pseries/xics.c
@@ -310,12 +310,6 @@ static void xics_mask_irq(unsigned int virq)
 
 static unsigned int xics_startup(unsigned int virq)
 {
-	unsigned int irq;
-
-	/* force a reverse mapping of the interrupt so it gets in the cache */
-	irq = (unsigned int)irq_map[virq].hwirq;
-	irq_radix_revmap(xics_host, irq);
-
 	/* unmask it */
 	xics_unmask_irq(virq);
 	return 0;
@@ -346,7 +340,7 @@ static inline unsigned int xics_remap_irq(unsigned int vec)
 
 	if (vec == XICS_IRQ_SPURIOUS)
 		return NO_IRQ;
-	irq = irq_radix_revmap(xics_host, vec);
+	irq = irq_radix_revmap_lookup(xics_host, vec);
 	if (likely(irq != NO_IRQ))
 		return irq;
 
@@ -530,6 +524,9 @@ static int xics_host_map(struct irq_host *h, unsigned int virq,
 {
 	pr_debug("xics: map virq %d, hwirq 0x%lx\n", virq, hw);
 
+	/* Insert the interrupt mapping into the radix tree for fast lookup */
+	irq_radix_revmap_insert(xics_host, virq, hw);
+
 	get_irq_desc(virq)->status |= IRQ_LEVEL;
 	set_irq_chip_and_handler(virq, xics_irq_chip, handle_fasteoi_irq);
 	return 0;
diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
index 47b8119..5c88acf 100644
--- a/include/asm-powerpc/irq.h
+++ b/include/asm-powerpc/irq.h
@@ -243,15 +243,27 @@ extern unsigned int irq_create_direct_mapping(struct irq_host *host);
 extern void __init irq_radix_revmap_init(void);
 
 /**
- * irq_radix_revmap - Find a linux virq from a hw irq number.
+ * irq_radix_revmap_insert - Insert a hw irq to linux virq number mapping.
+ * @host: host owning this hardware interrupt
+ * @virq: linux irq number
+ * @hwirq: hardware irq number in that host space
+ *
+ * This is for use by irq controllers that use a radix tree reverse
+ * mapping for fast lookup.
+ */
+extern void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
+				    irq_hw_number_t hwirq);
+
+/**
+ * irq_radix_revmap_lookup - Find a linux virq from a hw irq number.
  * @host: host owning this hardware interrupt
  * @hwirq: hardware irq number in that host space
  *
  * This is a fast path, for use by irq controller code that uses radix tree
  * revmaps
  */
-extern unsigned int irq_radix_revmap(struct irq_host *host,
-				     irq_hw_number_t hwirq);
+extern unsigned int irq_radix_revmap_lookup(struct irq_host *host,
+					    irq_hw_number_t hwirq);
 
 /**
  * irq_linear_revmap - Find a linux virq from a hw irq number.
-- 
1.5.5.1

^ permalink raw reply related

* [PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless
From: Sebastien Dugue @ 2008-07-31  9:40 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: tinytim, linux-rt-users, linux-kernel, rostedt, jean-pierre.dion,
	paulus, gilles.carry, tglx

  Hi,

  Here is a respin of the patches I posted last week for the RT kernel, now
targeted at mainline (http://lkml.org/lkml/2008/7/24/98). Thomas, Steven,
there is a note for you at the end.

  The goal of this patchset is to simplify the locking constraints on the radix
tree used for IRQ reverse mapping on the pSeries machines and provide lockless
access to this tree.

  This also solves the following BUG under preempt-rt:

BUG: sleeping function called from invalid context swapper(1) at kernel/rtmutex.c:739
in_atomic():1 [00000002], irqs_disabled():1
Call Trace:
[c0000001e20f3340] [c000000000010370] .show_stack+0x70/0x1bc (unreliable)
[c0000001e20f33f0] [c000000000049380] .__might_sleep+0x11c/0x138
[c0000001e20f3470] [c0000000002a2f64] .__rt_spin_lock+0x3c/0x98
[c0000001e20f34f0] [c0000000000c3f20] .kmem_cache_alloc+0x68/0x184
[c0000001e20f3590] [c000000000193f3c] .radix_tree_node_alloc+0xf0/0x144
[c0000001e20f3630] [c000000000195190] .radix_tree_insert+0x18c/0x2fc
[c0000001e20f36f0] [c00000000000c710] .irq_radix_revmap+0x1a4/0x1e4
[c0000001e20f37b0] [c00000000003b3f0] .xics_startup+0x30/0x54
[c0000001e20f3840] [c00000000008b864] .setup_irq+0x26c/0x370
[c0000001e20f38f0] [c00000000008ba68] .request_irq+0x100/0x158
[c0000001e20f39a0] [c0000000001ee9c0] .hvc_open+0xb4/0x148
[c0000001e20f3a40] [c0000000001d72ec] .tty_open+0x200/0x368
[c0000001e20f3af0] [c0000000000ce928] .chrdev_open+0x1f4/0x25c
[c0000001e20f3ba0] [c0000000000c8bf0] .__dentry_open+0x188/0x2c8
[c0000001e20f3c50] [c0000000000c8dec] .do_filp_open+0x50/0x70
[c0000001e20f3d70] [c0000000000c8e8c] .do_sys_open+0x80/0x148
[c0000001e20f3e20] [c00000000000928c] .init_post+0x4c/0x100
[c0000001e20f3ea0] [c0000000003c0e0c] .kernel_init+0x428/0x478
[c0000001e20f3f90] [c000000000027448] .kernel_thread+0x4c/0x68

  The root cause of this bug lies in the fact that the XICS interrupt controller
uses a radix tree for its reverse irq mapping and that we cannot allocate the tree
nodes (even GFP_ATOMIC) with preemption disabled.

  In fact, preemption is disabled at two nested levels when we want to
allocate a new node:

  - setup_irq() does a spin_lock_irqsave() before calling xics_startup() which
    then calls irq_radix_revmap() to insert a new node in the tree

  - irq_radix_revmap() also does a spin_lock_irqsave() (in irq_radix_wrlock())
    before the radix_tree_insert()

  Also, if an IRQ gets registered before the tree is initialized (namely the
IPI), it will be inserted into the tree in interrupt context once the tree
has been initialized, hence the need for a spin_lock_irqsave() in the
insertion path.

  This series is split into 3 patches:

  - The first patch moves the initialization of the radix tree earlier in the
    boot process before any IRQ gets registered, but after the mm is up.

  - The second patch splits irq_radix_revmap() into its 2 components: one
    for lookup and one for insertion into the radix tree.

  - And finally, the third patch makes the radix tree fully lockless on the 
    lookup side.
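
  The locking scheme the third patch aims for can be sketched in userspace
C11. This is only an illustration under stated assumptions: a flat array of
atomic pointers stands in for the kernel's lockless radix tree, an atomic_flag
spinlock stands in for the writer lock, NO_IRQ is taken to be 0, and all
model_ names and sizes are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_NR_IRQS	16	/* assumed sizes for the sketch */
#define MODEL_NR_HWIRQS	64
#define MODEL_NO_IRQ	0U	/* assumed "no mapping" value */

struct map_entry { unsigned long hwirq; };

/* Static array: entries are never freed, so a pointer to one stays valid. */
static struct map_entry model_irq_map[MODEL_NR_IRQS];

/* Flat table standing in for the radix tree (a deliberate simplification). */
static _Atomic(struct map_entry *) model_revmap[MODEL_NR_HWIRQS];
static atomic_flag model_tree_lock = ATOMIC_FLAG_INIT;

/* Writer side: insertions are serialized by a simple spinlock. */
static void model_revmap_insert(unsigned int virq, unsigned long hwirq)
{
	if (virq == MODEL_NO_IRQ || virq >= MODEL_NR_IRQS ||
	    hwirq >= MODEL_NR_HWIRQS)
		return;

	while (atomic_flag_test_and_set_explicit(&model_tree_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */
	model_irq_map[virq].hwirq = hwirq;
	atomic_store_explicit(&model_revmap[hwirq], &model_irq_map[virq],
			      memory_order_release);
	atomic_flag_clear_explicit(&model_tree_lock, memory_order_release);
}

/* Reader side: no lock; one atomic load gives NULL or a valid pointer. */
static unsigned int model_revmap_lookup(unsigned long hwirq)
{
	struct map_entry *ptr;

	if (hwirq >= MODEL_NR_HWIRQS)
		return MODEL_NO_IRQ;

	ptr = atomic_load_explicit(&model_revmap[hwirq], memory_order_acquire);
	/* Pointer subtraction recovers the index into the static array. */
	return ptr ? (unsigned int)(ptr - model_irq_map) : MODEL_NO_IRQ;
}
```

  Because the stored pointers refer to a static array, a reader never has to
worry about the entry being freed after the load; only writers need to
exclude each other.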


  Here is the diffstat for the whole patchset:

 arch/powerpc/kernel/irq.c             |  134 ++++++++-------------------------
 arch/powerpc/platforms/pseries/smp.c  |    1 +
 arch/powerpc/platforms/pseries/xics.c |   11 +--
 include/asm-powerpc/irq.h             |   24 +++++-
 4 files changed, 58 insertions(+), 112 deletions(-)


  Thomas, Steven, the first 2 patches can be applied seamlessly to 2.6.26-rt1
with offsets, the third patch has a trivial to fix reject in
arch/powerpc/kernel/irq.c because the irq_big_lock is changed to a raw spinlock
in preempt-rt. If you want those patches for RT, just flag me, I have those
sitting on my test box.



  Thanks,

  Sebastien.

^ permalink raw reply

* [PATCH] powerpc - Make the irq reverse mapping radix tree lockless
From: Sebastien Dugue @ 2008-07-31  9:40 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: tinytim, linux-rt-users, linux-kernel, rostedt, jean-pierre.dion,
	Sebastien Dugue, paulus, gilles.carry, tglx
In-Reply-To: <1217497241-10685-1-git-send-email-sebastien.dugue@bull.net>

  The radix trees used by interrupt controllers for their irq reverse mapping
(currently only the XICS found on pSeries) have a complex locking scheme
dating back to before the advent of the lockless radix tree.

  Take advantage of the lockless radix tree, and of the fact that the items
of the tree are pointers to elements of a static array (irq_map) which can
never go away under us, to simplify the locking.

  Concurrency between readers and writers is handled by the intrinsic
properties of the lockless radix tree. Concurrency between writers is handled
with a spinlock added to the irq_host structure.


Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/kernel/irq.c |   75 ++++++--------------------------------------
 include/asm-powerpc/irq.h |    1 +
 2 files changed, 12 insertions(+), 64 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 083b181..3aa683b 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -458,8 +458,6 @@ void do_softirq(void)
 
 static LIST_HEAD(irq_hosts);
 static DEFINE_SPINLOCK(irq_big_lock);
-static DEFINE_PER_CPU(unsigned int, irq_radix_reader);
-static unsigned int irq_radix_writer;
 struct irq_map_entry irq_map[NR_IRQS];
 static unsigned int irq_virq_count = NR_IRQS;
 static struct irq_host *irq_default_host;
@@ -602,57 +600,6 @@ void irq_set_virq_count(unsigned int count)
 		irq_virq_count = count;
 }
 
-/* radix tree not lockless safe ! we use a brlock-type mecanism
- * for now, until we can use a lockless radix tree
- */
-static void irq_radix_wrlock(unsigned long *flags)
-{
-	unsigned int cpu, ok;
-
-	spin_lock_irqsave(&irq_big_lock, *flags);
-	irq_radix_writer = 1;
-	smp_mb();
-	do {
-		barrier();
-		ok = 1;
-		for_each_possible_cpu(cpu) {
-			if (per_cpu(irq_radix_reader, cpu)) {
-				ok = 0;
-				break;
-			}
-		}
-		if (!ok)
-			cpu_relax();
-	} while(!ok);
-}
-
-static void irq_radix_wrunlock(unsigned long flags)
-{
-	smp_wmb();
-	irq_radix_writer = 0;
-	spin_unlock_irqrestore(&irq_big_lock, flags);
-}
-
-static void irq_radix_rdlock(unsigned long *flags)
-{
-	local_irq_save(*flags);
-	__get_cpu_var(irq_radix_reader) = 1;
-	smp_mb();
-	if (likely(irq_radix_writer == 0))
-		return;
-	__get_cpu_var(irq_radix_reader) = 0;
-	smp_wmb();
-	spin_lock(&irq_big_lock);
-	__get_cpu_var(irq_radix_reader) = 1;
-	spin_unlock(&irq_big_lock);
-}
-
-static void irq_radix_rdunlock(unsigned long flags)
-{
-	__get_cpu_var(irq_radix_reader) = 0;
-	local_irq_restore(flags);
-}
-
 static int irq_setup_virq(struct irq_host *host, unsigned int virq,
 			    irq_hw_number_t hwirq)
 {
@@ -807,7 +754,6 @@ void irq_dispose_mapping(unsigned int virq)
 {
 	struct irq_host *host;
 	irq_hw_number_t hwirq;
-	unsigned long flags;
 
 	if (virq == NO_IRQ)
 		return;
@@ -840,9 +786,9 @@ void irq_dispose_mapping(unsigned int virq)
 			host->revmap_data.linear.revmap[hwirq] = NO_IRQ;
 		break;
 	case IRQ_HOST_MAP_TREE:
-		irq_radix_wrlock(&flags);
+		spin_lock(&host->tree_lock);
 		radix_tree_delete(&host->revmap_data.tree, hwirq);
-		irq_radix_wrunlock(flags);
+		spin_unlock(&host->tree_lock);
 		break;
 	}
 
@@ -895,8 +841,10 @@ void __init irq_radix_revmap_init(void)
  	struct irq_host *h;
 
 	list_for_each_entry(h, &irq_hosts, link) {
-		if (h->revmap_type == IRQ_HOST_MAP_TREE)
+		if (h->revmap_type == IRQ_HOST_MAP_TREE) {
 			INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
+			spin_lock_init(&h->tree_lock);
+		}
 	}
 }
 
@@ -905,13 +853,14 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host,
 {
 	struct irq_map_entry *ptr;
 	unsigned int virq = NO_IRQ;
-	unsigned long flags;
 
 	WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);
 
-	irq_radix_rdlock(&flags);
+	/*
+	 * No rcu_read_lock(ing) needed, the ptr returned can't go under us
+	 * as it's referencing an entry in the static irq_map table.
+	 */
 	ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
-	irq_radix_rdunlock(flags);
 
 	if (ptr)
 		virq = ptr - irq_map;
@@ -922,14 +871,12 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host,
 void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
 			     irq_hw_number_t hwirq)
 {
-	unsigned long flags;
-
 	WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);
 
 	if (virq != NO_IRQ) {
-		irq_radix_wrlock(&flags);
+		spin_lock(&host->tree_lock);
 		radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
-		irq_radix_wrunlock(flags);
+		spin_unlock(&host->tree_lock);
 	}
 }
 
diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
index 5c88acf..2ae395f 100644
--- a/include/asm-powerpc/irq.h
+++ b/include/asm-powerpc/irq.h
@@ -121,6 +121,7 @@ struct irq_host {
 		} linear;
 		struct radix_tree_root tree;
 	} revmap_data;
+	spinlock_t	       tree_lock;
 	struct irq_host_ops	*ops;
 	void			*host_data;
 	irq_hw_number_t		inval_irq;
-- 
1.5.5.1
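
For readers following along outside the kernel tree, here is a rough userspace sketch of the locking scheme the patch above moves to: writers serialize on a per-host lock, while the lookup path takes no lock and derives the virq from the pointer's offset into a static table. Everything in it is an assumption made for illustration and is not the kernel code: a flat array of C11 atomic pointers stands in for the radix tree, a pthread mutex stands in for the spinlock, and the names host_init, revmap_insert, revmap_delete, revmap_lookup, NR_VIRQS and NR_HWIRQS are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NO_IRQ    0u   /* same convention as the patch: 0 means "no mapping" */
#define NR_VIRQS  16u  /* hypothetical size of the static virq table */
#define NR_HWIRQS 64u  /* hypothetical range of hardware irq numbers */

struct irq_map_entry {
	unsigned long hwirq;
};

/* Static table: entries are never freed, so a pointer into it stays valid. */
static struct irq_map_entry irq_map[NR_VIRQS];

struct host {
	/* Stand-in for the radix tree: hwirq -> pointer into irq_map. */
	_Atomic(struct irq_map_entry *) revmap[NR_HWIRQS];
	pthread_mutex_t tree_lock;	/* taken by writers only */
};

static void host_init(struct host *h)
{
	for (size_t i = 0; i < NR_HWIRQS; i++)
		atomic_init(&h->revmap[i], (struct irq_map_entry *)NULL);
	pthread_mutex_init(&h->tree_lock, NULL);
}

/* Writer: insertions are serialized by the per-host lock. */
static void revmap_insert(struct host *h, unsigned int virq,
			  unsigned long hwirq)
{
	if (virq == NO_IRQ || virq >= NR_VIRQS || hwirq >= NR_HWIRQS)
		return;

	pthread_mutex_lock(&h->tree_lock);
	irq_map[virq].hwirq = hwirq;
	/* Publish the entry only after it has been filled in. */
	atomic_store_explicit(&h->revmap[hwirq], &irq_map[virq],
			      memory_order_release);
	pthread_mutex_unlock(&h->tree_lock);
}

/* Writer: deletions take the same lock as insertions. */
static void revmap_delete(struct host *h, unsigned long hwirq)
{
	if (hwirq >= NR_HWIRQS)
		return;

	pthread_mutex_lock(&h->tree_lock);
	atomic_store_explicit(&h->revmap[hwirq],
			      (struct irq_map_entry *)NULL,
			      memory_order_release);
	pthread_mutex_unlock(&h->tree_lock);
}

/* Reader: no lock taken; the virq is the entry's index in irq_map. */
static unsigned int revmap_lookup(struct host *h, unsigned long hwirq)
{
	struct irq_map_entry *ptr;

	if (hwirq >= NR_HWIRQS)
		return NO_IRQ;

	ptr = atomic_load_explicit(&h->revmap[hwirq], memory_order_acquire);
	return ptr ? (unsigned int)(ptr - irq_map) : NO_IRQ;
}
```

With this sketch, calling host_init() on a host, then revmap_insert(&h, 5, 42), makes revmap_lookup(&h, 42) return 5, and it returns NO_IRQ again after revmap_delete(&h, 42). The release/acquire pair is only this sketch's stand-in for whatever ordering the real radix tree code provides, and the sketch does not model the lifetime of tree nodes at all.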

^ permalink raw reply related

* Re: ide pmac breakage
From: Benjamin Herrenschmidt @ 2008-07-31  9:48 UTC (permalink / raw)
  To: Alan Cox
  Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz,
	linuxppc-dev
In-Reply-To: <20080731101322.18735e64@lxorguk.ukuu.org.uk>

On Thu, 2008-07-31 at 10:13 +0100, Alan Cox wrote:
> > I could make the media-bay look like a controller hotplug if it was
> > going to make things easier...
> 
> I'm not sure it will. It may do nowadays, but the older IDE code
> historically was fairly broken for both cases except in 2.4. Also faking
> it as controller hotplug is the wrong path for libata which does real
> drive hot plug.

Yeah, that was my line of thinking initially, also the fact that it has
the nice side effect of keeping the minor number stable.

Ben.

^ permalink raw reply

* Re: [PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless
From: Sebastien Dugue @ 2008-07-31 10:12 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: tinytim, linux-rt-users, jean-pierre.dion, rostedt, linux-kernel,
	paulus, gilles.carry, tglx
In-Reply-To: <1217497241-10685-1-git-send-email-sebastien.dugue@bull.net>


  OK, I goofed up with git-format-patch, forgot the --numbered option.

  The patch subjects should read:

	[PATCH 1/3] powerpc - Initialize the irq radix tree earlier
	[PATCH 2/3] powerpc - Separate the irq radix tree insertion and lookup
	[PATCH 3/3] powerpc - Make the irq reverse mapping radix tree lockless

  Sorry for that.

  Sebastien.

^ permalink raw reply

