LinuxPPC-Dev Archive on lore.kernel.org
* I2C node in device tree breaks old-style drivers
From: Timur Tabi @ 2008-07-29 19:43 UTC
  To: linuxppc-dev, Grant Likely

I'm trying to debug an I2C problem I've found in my old-style driver:
sound/soc/codecs/cs4270.c.  My I2C probe function is working, but the I2C
subsystem cannot find my device.  I know it's there, because U-Boot can probe it
just fine.

At first, I thought my problem was this:

static struct i2c_driver cs4270_i2c_driver = {
	.driver = {
		.name = "CS4270 I2C",
		.owner = THIS_MODULE,
	},
	.id =             I2C_DRIVERID_CS4270,
	.attach_adapter = cs4270_i2c_attach,
	.detach_client =  cs4270_i2c_detach,
};

In a slightly older kernel (still 2.6.27), I had to change "CS4270 I2C" to
"cs4270" to get it to work.  However, that change no longer makes a difference.

I turned on debugging and this is what I see:

i2c-core: driver [CS4270 I2C] registered
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x48
i2c-adapter i2c-0: master_xfer[0] W, addr=0x48, len=0
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x49
i2c-adapter i2c-0: master_xfer[0] W, addr=0x49, len=0
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x4a
i2c-adapter i2c-0: master_xfer[0] W, addr=0x4a, len=0
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x4b
i2c-adapter i2c-0: master_xfer[0] W, addr=0x4b, len=0
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x4c
i2c-adapter i2c-0: master_xfer[0] W, addr=0x4c, len=0
i2c-adapter i2c-0: master_xfer[0] W, addr=0x4c, len=1
i2c-adapter i2c-0: master_xfer[1] R, addr=0x4c, len=1
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x4d
i2c-adapter i2c-0: master_xfer[0] W, addr=0x4d, len=0
i2c-adapter i2c-0: master_xfer[0] W, addr=0x4d, len=1
i2c-adapter i2c-0: master_xfer[1] R, addr=0x4d, len=1
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x4e
i2c-adapter i2c-0: master_xfer[0] W, addr=0x4e, len=0
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x4f

My device is at address 4F.  The device tree defines a node for this device.
You can see it in arch/powerpc/boot/dts/mpc8610_hpcd.dts.

When I change the device tree so that it lists the device at an address other
than 4F (e.g. "reg = <0x48>"), my driver works.

So my conclusion is that specifying an I2C node in the device tree *requires*
that the driver be new-style.  Is there any way we can fix this?  I'm not going
to have time to update the CS4270 driver to a new-style interface before the
2.6.27 window closes.

-- 
Timur Tabi
Linux kernel developer at Freescale
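
A minimal user-space sketch of one plausible reading of the log above, not
something confirmed in this message: the device tree node creates a client
that claims address 0x4f, and the old-style probe skips any address that is
already claimed before it attempts a transfer. All model_* names are
hypothetical and only mimic the behaviour; they are not the i2c-core API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLIENTS 8

/* Hypothetical stand-in for an adapter's list of claimed addresses. */
struct model_adapter {
	unsigned short claimed[MAX_CLIENTS];
	size_t nr_claimed;
};

/* Returns true if some client already owns this address. */
static bool model_addr_busy(const struct model_adapter *adap,
			    unsigned short addr)
{
	for (size_t i = 0; i < adap->nr_claimed; i++)
		if (adap->claimed[i] == addr)
			return true;
	return false;
}

/* Models a device-tree node instantiating a client at a fixed address. */
static void model_claim(struct model_adapter *adap, unsigned short addr)
{
	assert(adap->nr_claimed < MAX_CLIENTS);
	adap->claimed[adap->nr_claimed++] = addr;
}

/*
 * Models an old-style driver walking its normal address list: busy
 * addresses are skipped before any bus transfer is attempted.
 * Stores the probed addresses in 'probed' and returns how many there were.
 */
static size_t model_legacy_probe(const struct model_adapter *adap,
				 const unsigned short *addrs, size_t n,
				 unsigned short *probed)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (model_addr_busy(adap, addrs[i]))
			continue;	/* address already owned, skip it */
		probed[count++] = addrs[i];
	}
	return count;
}
```

Under that assumption, moving the node to 0x48 frees 0x4f for the legacy
probe, which would match the behaviour described above.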


* Re: mpc8349mITX developement repository
From: Scott Wood @ 2008-07-29 19:57 UTC
  To: Sparks, Sam; +Cc: linuxppc-dev
In-Reply-To: <6011A7C9CD0EE74792C3ED2A83556086342170@mail.twacs.com>

On Tue, Jul 29, 2008 at 02:26:12PM -0500, Sparks, Sam wrote:
> Which repository on kernel.org should I use to pick up the
> latest-greatest software for the mpc8349mITX?
>  
> I've tried linux/kernel/git/vitb/linux-2.6-8xx.git, but the dts doesn't
> contain compact flash nodes.
> I've tried linux/kernel/git/paulus/powerpc.git, but the kernel hangs
> while trying to uncompress.

linux/kernel/git/benh/powerpc.git is the latest.  It boots fine for me.

-Scott


* [PATCH] powerpc/fsl: proliferate simple-bus compatibility to soc nodes
From: Kim Phillips @ 2008-07-29 20:29 UTC
  To: linuxppc-dev; +Cc: mr.scada

add simple-bus compatible property to soc nodes for 83xx/85xx platforms
that were missing them.  Add same to platform probe code.

This fixes SoC device drivers (such as talitos) to succeed in matching
devices present in the soc node.

also update mpc836x_rdk dts to new SEC bindings (overlooked in commit
3fd4473: powerpc/fsl: update crypto node definition and device tree
instances).

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
---
 arch/powerpc/boot/dts/mpc832x_mds.dts     |    1 +
 arch/powerpc/boot/dts/mpc832x_rdb.dts     |    1 +
 arch/powerpc/boot/dts/mpc8349emitx.dts    |    1 +
 arch/powerpc/boot/dts/mpc8349emitxgp.dts  |    1 +
 arch/powerpc/boot/dts/mpc834x_mds.dts     |    1 +
 arch/powerpc/boot/dts/mpc836x_mds.dts     |    1 +
 arch/powerpc/boot/dts/mpc836x_rdk.dts     |   16 ++++++----------
 arch/powerpc/boot/dts/mpc8377_mds.dts     |    1 +
 arch/powerpc/boot/dts/mpc8378_mds.dts     |    1 +
 arch/powerpc/boot/dts/mpc8379_mds.dts     |    1 +
 arch/powerpc/boot/dts/mpc8536ds.dts       |    1 +
 arch/powerpc/boot/dts/mpc8540ads.dts      |    1 +
 arch/powerpc/boot/dts/mpc8541cds.dts      |    1 +
 arch/powerpc/boot/dts/mpc8544ds.dts       |    1 +
 arch/powerpc/boot/dts/mpc8548cds.dts      |    1 +
 arch/powerpc/boot/dts/mpc8555cds.dts      |    1 +
 arch/powerpc/boot/dts/mpc8560ads.dts      |    1 +
 arch/powerpc/boot/dts/mpc8568mds.dts      |    1 +
 arch/powerpc/boot/dts/mpc8572ds.dts       |    1 +
 arch/powerpc/platforms/83xx/mpc832x_mds.c |    1 +
 arch/powerpc/platforms/83xx/mpc832x_rdb.c |    1 +
 arch/powerpc/platforms/83xx/mpc834x_itx.c |    1 +
 arch/powerpc/platforms/83xx/mpc834x_mds.c |    1 +
 arch/powerpc/platforms/83xx/mpc836x_mds.c |    1 +
 arch/powerpc/platforms/83xx/sbc834x.c     |    1 +
 arch/powerpc/platforms/85xx/ksi8560.c     |    1 +
 arch/powerpc/platforms/85xx/mpc8536_ds.c  |    1 +
 arch/powerpc/platforms/85xx/mpc85xx_ads.c |    1 +
 arch/powerpc/platforms/85xx/mpc85xx_ds.c  |    1 +
 arch/powerpc/platforms/85xx/mpc85xx_mds.c |    1 +
 arch/powerpc/platforms/85xx/sbc8560.c     |    1 +
 31 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/boot/dts/mpc832x_mds.dts b/arch/powerpc/boot/dts/mpc832x_mds.dts
index 7345743..fbc9304 100644
--- a/arch/powerpc/boot/dts/mpc832x_mds.dts
+++ b/arch/powerpc/boot/dts/mpc832x_mds.dts
@@ -68,6 +68,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <132000000>;
diff --git a/arch/powerpc/boot/dts/mpc832x_rdb.dts b/arch/powerpc/boot/dts/mpc832x_rdb.dts
index e74c045..b157d18 100644
--- a/arch/powerpc/boot/dts/mpc832x_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc832x_rdb.dts
@@ -51,6 +51,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8349emitx.dts b/arch/powerpc/boot/dts/mpc8349emitx.dts
index 8dfab56..700e076 100644
--- a/arch/powerpc/boot/dts/mpc8349emitx.dts
+++ b/arch/powerpc/boot/dts/mpc8349emitx.dts
@@ -52,6 +52,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <0>;                    // from bootloader
diff --git a/arch/powerpc/boot/dts/mpc8349emitxgp.dts b/arch/powerpc/boot/dts/mpc8349emitxgp.dts
index 49ca349..cdd3063 100644
--- a/arch/powerpc/boot/dts/mpc8349emitxgp.dts
+++ b/arch/powerpc/boot/dts/mpc8349emitxgp.dts
@@ -50,6 +50,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <0>;                    // from bootloader
diff --git a/arch/powerpc/boot/dts/mpc834x_mds.dts b/arch/powerpc/boot/dts/mpc834x_mds.dts
index ba586cb..783241c 100644
--- a/arch/powerpc/boot/dts/mpc834x_mds.dts
+++ b/arch/powerpc/boot/dts/mpc834x_mds.dts
@@ -57,6 +57,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc836x_mds.dts b/arch/powerpc/boot/dts/mpc836x_mds.dts
index 3701dae..a3b76a7 100644
--- a/arch/powerpc/boot/dts/mpc836x_mds.dts
+++ b/arch/powerpc/boot/dts/mpc836x_mds.dts
@@ -61,6 +61,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <264000000>;
diff --git a/arch/powerpc/boot/dts/mpc836x_rdk.dts b/arch/powerpc/boot/dts/mpc836x_rdk.dts
index 8acd1d6..89c9202 100644
--- a/arch/powerpc/boot/dts/mpc836x_rdk.dts
+++ b/arch/powerpc/boot/dts/mpc836x_rdk.dts
@@ -149,18 +149,14 @@
 		};
 
 		crypto@30000 {
-			compatible = "fsl,sec2-crypto";
+			compatible = "fsl,sec2.0";
 			reg = <0x30000 0x10000>;
-			interrupts = <11 8>;
+			interrupts = <11 0x8>;
 			interrupt-parent = <&ipic>;
-			num-channels = <4>;
-			channel-fifo-len = <24>;
-			exec-units-mask = <0x7e>;
-			/*
-			 * desc mask is for rev1.x, we need runtime fixup
-			 * for >=2.x
-			 */
-			descriptor-types-mask = <0x1010ebf>;
+			fsl,num-channels = <4>;
+			fsl,channel-fifo-len = <24>;
+			fsl,exec-units-mask = <0x7e>;
+			fsl,descriptor-types-mask = <0x01010ebf>;
 		};
 
 		ipic: interrupt-controller@700 {
diff --git a/arch/powerpc/boot/dts/mpc8377_mds.dts b/arch/powerpc/boot/dts/mpc8377_mds.dts
index 0a700cb..432782b 100644
--- a/arch/powerpc/boot/dts/mpc8377_mds.dts
+++ b/arch/powerpc/boot/dts/mpc8377_mds.dts
@@ -117,6 +117,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8378_mds.dts b/arch/powerpc/boot/dts/mpc8378_mds.dts
index 29c8c76..ed32c8d 100644
--- a/arch/powerpc/boot/dts/mpc8378_mds.dts
+++ b/arch/powerpc/boot/dts/mpc8378_mds.dts
@@ -117,6 +117,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8379_mds.dts b/arch/powerpc/boot/dts/mpc8379_mds.dts
index d641a89..f4db9ed 100644
--- a/arch/powerpc/boot/dts/mpc8379_mds.dts
+++ b/arch/powerpc/boot/dts/mpc8379_mds.dts
@@ -117,6 +117,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x00100000>;
 		reg = <0xe0000000 0x00000200>;
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8536ds.dts b/arch/powerpc/boot/dts/mpc8536ds.dts
index 02cfa24..1505d68 100644
--- a/arch/powerpc/boot/dts/mpc8536ds.dts
+++ b/arch/powerpc/boot/dts/mpc8536ds.dts
@@ -49,6 +49,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xffe00000 0x100000>;
 		reg = <0xffe00000 0x1000>;
 		bus-frequency = <0>;		// Filled out by uboot.
diff --git a/arch/powerpc/boot/dts/mpc8540ads.dts b/arch/powerpc/boot/dts/mpc8540ads.dts
index f2273a8..9568bfa 100644
--- a/arch/powerpc/boot/dts/mpc8540ads.dts
+++ b/arch/powerpc/boot/dts/mpc8540ads.dts
@@ -53,6 +53,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x100000>;
 		reg = <0xe0000000 0x100000>;	// CCSRBAR 1M
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8541cds.dts b/arch/powerpc/boot/dts/mpc8541cds.dts
index c4469f1..6480f4f 100644
--- a/arch/powerpc/boot/dts/mpc8541cds.dts
+++ b/arch/powerpc/boot/dts/mpc8541cds.dts
@@ -53,6 +53,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x100000>;
 		reg = <0xe0000000 0x1000>;	// CCSRBAR 1M
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8544ds.dts b/arch/powerpc/boot/dts/mpc8544ds.dts
index 7d3829d..f1fb207 100644
--- a/arch/powerpc/boot/dts/mpc8544ds.dts
+++ b/arch/powerpc/boot/dts/mpc8544ds.dts
@@ -54,6 +54,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 
 		ranges = <0x0 0xe0000000 0x100000>;
 		reg = <0xe0000000 0x1000>;	// CCSRBAR 1M
diff --git a/arch/powerpc/boot/dts/mpc8548cds.dts b/arch/powerpc/boot/dts/mpc8548cds.dts
index d84466b..431b496 100644
--- a/arch/powerpc/boot/dts/mpc8548cds.dts
+++ b/arch/powerpc/boot/dts/mpc8548cds.dts
@@ -58,6 +58,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x100000>;
 		reg = <0xe0000000 0x1000>;	// CCSRBAR
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8555cds.dts b/arch/powerpc/boot/dts/mpc8555cds.dts
index e03a780..d833a5c 100644
--- a/arch/powerpc/boot/dts/mpc8555cds.dts
+++ b/arch/powerpc/boot/dts/mpc8555cds.dts
@@ -53,6 +53,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x100000>;
 		reg = <0xe0000000 0x1000>;	// CCSRBAR 1M
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8560ads.dts b/arch/powerpc/boot/dts/mpc8560ads.dts
index ba8159d..4d1f2f2 100644
--- a/arch/powerpc/boot/dts/mpc8560ads.dts
+++ b/arch/powerpc/boot/dts/mpc8560ads.dts
@@ -53,6 +53,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x100000>;
 		reg = <0xe0000000 0x200>;
 		bus-frequency = <330000000>;
diff --git a/arch/powerpc/boot/dts/mpc8568mds.dts b/arch/powerpc/boot/dts/mpc8568mds.dts
index 9c30a34..a15f103 100644
--- a/arch/powerpc/boot/dts/mpc8568mds.dts
+++ b/arch/powerpc/boot/dts/mpc8568mds.dts
@@ -60,6 +60,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xe0000000 0x100000>;
 		reg = <0xe0000000 0x1000>;
 		bus-frequency = <0>;
diff --git a/arch/powerpc/boot/dts/mpc8572ds.dts b/arch/powerpc/boot/dts/mpc8572ds.dts
index 08c61e3..e124dd1 100644
--- a/arch/powerpc/boot/dts/mpc8572ds.dts
+++ b/arch/powerpc/boot/dts/mpc8572ds.dts
@@ -68,6 +68,7 @@
 		#address-cells = <1>;
 		#size-cells = <1>;
 		device_type = "soc";
+		compatible = "simple-bus";
 		ranges = <0x0 0xffe00000 0x100000>;
 		reg = <0xffe00000 0x1000>;	// CCSRBAR & soc regs, remove once parse code for immrbase fixed
 		bus-frequency = <0>;		// Filled out by uboot.
diff --git a/arch/powerpc/platforms/83xx/mpc832x_mds.c b/arch/powerpc/platforms/83xx/mpc832x_mds.c
index dd4be4a..ec43477 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_mds.c
@@ -105,6 +105,7 @@ static void __init mpc832x_sys_setup_arch(void)
 static struct of_device_id mpc832x_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{ .type = "qe", },
 	{ .compatible = "fsl,qe", },
 	{},
diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
index f049d69..0300268 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
@@ -115,6 +115,7 @@ static void __init mpc832x_rdb_setup_arch(void)
 static struct of_device_id mpc832x_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{ .type = "qe", },
 	{ .compatible = "fsl,qe", },
 	{},
diff --git a/arch/powerpc/platforms/83xx/mpc834x_itx.c b/arch/powerpc/platforms/83xx/mpc834x_itx.c
index 7301d77..76092d3 100644
--- a/arch/powerpc/platforms/83xx/mpc834x_itx.c
+++ b/arch/powerpc/platforms/83xx/mpc834x_itx.c
@@ -41,6 +41,7 @@
 
 static struct of_device_id __initdata mpc834x_itx_ids[] = {
 	{ .compatible = "fsl,pq2pro-localbus", },
+	{ .compatible = "simple-bus", },
 	{},
 };
 
diff --git a/arch/powerpc/platforms/83xx/mpc834x_mds.c b/arch/powerpc/platforms/83xx/mpc834x_mds.c
index 30d509a..fc3f2ed 100644
--- a/arch/powerpc/platforms/83xx/mpc834x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc834x_mds.c
@@ -111,6 +111,7 @@ static void __init mpc834x_mds_init_IRQ(void)
 static struct of_device_id mpc834x_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{},
 };
 
diff --git a/arch/powerpc/platforms/83xx/mpc836x_mds.c b/arch/powerpc/platforms/83xx/mpc836x_mds.c
index 75b80e8..9d46e5b 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_mds.c
@@ -136,6 +136,7 @@ static void __init mpc836x_mds_setup_arch(void)
 static struct of_device_id mpc836x_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{ .type = "qe", },
 	{ .compatible = "fsl,qe", },
 	{},
diff --git a/arch/powerpc/platforms/83xx/sbc834x.c b/arch/powerpc/platforms/83xx/sbc834x.c
index fc21f5c..156c4e2 100644
--- a/arch/powerpc/platforms/83xx/sbc834x.c
+++ b/arch/powerpc/platforms/83xx/sbc834x.c
@@ -83,6 +83,7 @@ static void __init sbc834x_init_IRQ(void)
 static struct __initdata of_device_id sbc834x_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{},
 };
 
diff --git a/arch/powerpc/platforms/85xx/ksi8560.c b/arch/powerpc/platforms/85xx/ksi8560.c
index 2145ade..8a3b117 100644
--- a/arch/powerpc/platforms/85xx/ksi8560.c
+++ b/arch/powerpc/platforms/85xx/ksi8560.c
@@ -222,6 +222,7 @@ static void ksi8560_show_cpuinfo(struct seq_file *m)
 
 static struct of_device_id __initdata of_bus_ids[] = {
 	{ .type = "soc", },
+	{ .compatible = "simple-bus", },
 	{ .name = "cpm", },
 	{ .name = "localbus", },
 	{},
diff --git a/arch/powerpc/platforms/85xx/mpc8536_ds.c b/arch/powerpc/platforms/85xx/mpc8536_ds.c
index 6b846aa..1bf5aef 100644
--- a/arch/powerpc/platforms/85xx/mpc8536_ds.c
+++ b/arch/powerpc/platforms/85xx/mpc8536_ds.c
@@ -91,6 +91,7 @@ static void __init mpc8536_ds_setup_arch(void)
 static struct of_device_id __initdata mpc8536_ds_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{},
 };
 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_ads.c b/arch/powerpc/platforms/85xx/mpc85xx_ads.c
index ba498d6..d17807a 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_ads.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_ads.c
@@ -230,6 +230,7 @@ static struct of_device_id __initdata of_bus_ids[] = {
 	{ .type = "soc", },
 	{ .name = "cpm", },
 	{ .name = "localbus", },
+	{ .compatible = "simple-bus", },
 	{},
 };
 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_ds.c b/arch/powerpc/platforms/85xx/mpc85xx_ds.c
index 00c5358..483b65c 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_ds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_ds.c
@@ -186,6 +186,7 @@ static int __init mpc8544_ds_probe(void)
 static struct of_device_id __initdata mpc85xxds_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{},
 };
 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index 43a459f..2494c51 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -260,6 +260,7 @@ machine_arch_initcall(mpc85xx_mds, board_fixups);
 static struct of_device_id mpc85xx_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
+	{ .compatible = "simple-bus", },
 	{ .type = "qe", },
 	{ .compatible = "fsl,qe", },
 	{},
diff --git a/arch/powerpc/platforms/85xx/sbc8560.c b/arch/powerpc/platforms/85xx/sbc8560.c
index 2c580cd..6509ade 100644
--- a/arch/powerpc/platforms/85xx/sbc8560.c
+++ b/arch/powerpc/platforms/85xx/sbc8560.c
@@ -217,6 +217,7 @@ static struct of_device_id __initdata of_bus_ids[] = {
 	{ .type = "soc", },
 	{ .name = "cpm", },
 	{ .name = "localbus", },
+	{ .compatible = "simple-bus", },
 	{},
 };
 
-- 
1.5.6
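
The matching rule this patch relies on can be sketched in user space as
follows. This is a simplified model under stated assumptions, not the kernel
implementation: the model_* structs and functions are hypothetical stand-ins
for of_device_id and a device node, and compatible strings are compared
exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an of_device_id table entry. */
struct model_match {
	const char *name;
	const char *type;
	const char *compatible;
};

/* Hypothetical, simplified stand-in for a device tree node. */
struct model_node {
	const char *name;
	const char *type;		/* device_type property */
	const char *compatible[4];	/* NULL-terminated compatible list */
};

static bool model_is_compatible(const struct model_node *np,
				const char *compat)
{
	for (size_t i = 0; i < 4 && np->compatible[i]; i++)
		if (strcmp(np->compatible[i], compat) == 0)
			return true;
	return false;
}

/* An entry matches only if every field it sets agrees with the node. */
static bool model_entry_matches(const struct model_match *m,
				const struct model_node *np)
{
	if (m->name && (!np->name || strcmp(m->name, np->name) != 0))
		return false;
	if (m->type && (!np->type || strcmp(m->type, np->type) != 0))
		return false;
	if (m->compatible && !model_is_compatible(np, m->compatible))
		return false;
	return true;
}

/* Walks a table terminated by an all-NULL entry, like the {} sentinel. */
static bool model_match_node(const struct model_match *table,
			     const struct model_node *np)
{
	for (; table->name || table->type || table->compatible; table++)
		if (model_entry_matches(table, np))
			return true;
	return false;
}
```

In this model a node whose compatible list contains "simple-bus" is picked up
only by a table entry that sets .compatible; an entry that sets .type to the
same string tests the device_type property instead and does not match.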


* Re: ide pmac breakage
From: Benjamin Herrenschmidt @ 2008-07-29 21:30 UTC
  To: Bartlomiej Zolnierkiewicz
  Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
In-Reply-To: <200807292126.12238.bzolnier@gmail.com>


> I WON!!!

Heh, great :-)

I'll give your patch a try, thanks!

Cheers,
Ben.

> From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> Subject: [PATCH] ide: fix regression caused by ide_device_{get,put}() addition
> 
> On Monday 28 July 2008, Benjamin Herrenschmidt wrote:
> 
> [...]
> 
> > Vector: 300 (Data Access) at [c58b7b80]
> >     pc: c014f264: elv_may_queue+0x10/0x44
> >     lr: c0152750: get_request+0x2c/0x2c0
> >     sp: c58b7c30
> >    msr: 1032
> >    dar: c
> >  dsisr: 40000000
> >   current = 0xc58aaae0
> >     pid   = 854, comm = media-bay
> > enter ? for help
> > mon> t
> > [c58b7c40] c0152750 get_request+0x2c/0x2c0
> > [c58b7c70] c0152a08 get_request_wait+0x24/0xec
> > [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0
> > [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc
> > [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0
> > [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c
> > [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8
> > [c58b7e50] c022395c ide_cd_release+0x80/0x84
> > [c58b7e70] c0163650 kref_put+0x54/0x6c
> > [c58b7e80] c0223884 ide_cd_put+0x40/0x5c
> > [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c
> > [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4
> > [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44
> > [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8
> > [c58b7f00] c01e7424 device_del+0x104/0x198
> > [c58b7f20] c01e74d0 device_unregister+0x18/0x30
> > [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88
> > [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80
> > [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0
> > [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc
> > [c58b7fd0] c00485c0 kthread+0x48/0x84
> > [c58b7ff0] c0011b20 kernel_thread+0x44/0x60
> 
> The guilty commit turned out to be 08da591e14cf87247ec09b17c350235157a92fc3
> ("ide: add ide_device_{get,put}() helpers").  ide_device_put() is called
> before kref_put() in ide_cd_put() so IDE device is already gone by the time
> ide_cd_release() is reached.
> 
> Fix it by calling ide_device_get() before kref_get() and ide_device_put()
> after kref_put() in all affected device drivers.
> 
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Borislav Petkov <petkovbb@gmail.com>
> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> ---
>  drivers/ide/ide-cd.c     |   10 +++++-----
>  drivers/ide/ide-disk.c   |    9 ++++-----
>  drivers/ide/ide-floppy.c |    9 ++++-----
>  drivers/ide/ide-tape.c   |    9 ++++-----
>  drivers/scsi/ide-scsi.c  |    9 ++++-----
>  5 files changed, 21 insertions(+), 25 deletions(-)
> 
> Index: b/drivers/ide/ide-cd.c
> ===================================================================
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -66,11 +66,11 @@ static struct cdrom_info *ide_cd_get(str
>  	mutex_lock(&idecd_ref_mutex);
>  	cd = ide_cd_g(disk);
>  	if (cd) {
> -		kref_get(&cd->kref);
> -		if (ide_device_get(cd->drive)) {
> -			kref_put(&cd->kref, ide_cd_release);
> +		if (ide_device_get(cd->drive))
>  			cd = NULL;
> -		}
> +		else
> +			kref_get(&cd->kref);
> +
>  	}
>  	mutex_unlock(&idecd_ref_mutex);
>  	return cd;
> @@ -79,8 +79,8 @@ static struct cdrom_info *ide_cd_get(str
>  static void ide_cd_put(struct cdrom_info *cd)
>  {
>  	mutex_lock(&idecd_ref_mutex);
> -	ide_device_put(cd->drive);
>  	kref_put(&cd->kref, ide_cd_release);
> +	ide_device_put(cd->drive);
>  	mutex_unlock(&idecd_ref_mutex);
>  }
>  
> Index: b/drivers/ide/ide-disk.c
> ===================================================================
> --- a/drivers/ide/ide-disk.c
> +++ b/drivers/ide/ide-disk.c
> @@ -65,11 +65,10 @@ static struct ide_disk_obj *ide_disk_get
>  	mutex_lock(&idedisk_ref_mutex);
>  	idkp = ide_disk_g(disk);
>  	if (idkp) {
> -		kref_get(&idkp->kref);
> -		if (ide_device_get(idkp->drive)) {
> -			kref_put(&idkp->kref, ide_disk_release);
> +		if (ide_device_get(idkp->drive))
>  			idkp = NULL;
> -		}
> +		else
> +			kref_get(&idkp->kref);
>  	}
>  	mutex_unlock(&idedisk_ref_mutex);
>  	return idkp;
> @@ -78,8 +77,8 @@ static struct ide_disk_obj *ide_disk_get
>  static void ide_disk_put(struct ide_disk_obj *idkp)
>  {
>  	mutex_lock(&idedisk_ref_mutex);
> -	ide_device_put(idkp->drive);
>  	kref_put(&idkp->kref, ide_disk_release);
> +	ide_device_put(idkp->drive);
>  	mutex_unlock(&idedisk_ref_mutex);
>  }
>  
> Index: b/drivers/ide/ide-floppy.c
> ===================================================================
> --- a/drivers/ide/ide-floppy.c
> +++ b/drivers/ide/ide-floppy.c
> @@ -167,11 +167,10 @@ static struct ide_floppy_obj *ide_floppy
>  	mutex_lock(&idefloppy_ref_mutex);
>  	floppy = ide_floppy_g(disk);
>  	if (floppy) {
> -		kref_get(&floppy->kref);
> -		if (ide_device_get(floppy->drive)) {
> -			kref_put(&floppy->kref, idefloppy_cleanup_obj);
> +		if (ide_device_get(floppy->drive))
>  			floppy = NULL;
> -		}
> +		else
> +			kref_get(&floppy->kref);
>  	}
>  	mutex_unlock(&idefloppy_ref_mutex);
>  	return floppy;
> @@ -180,8 +179,8 @@ static struct ide_floppy_obj *ide_floppy
>  static void ide_floppy_put(struct ide_floppy_obj *floppy)
>  {
>  	mutex_lock(&idefloppy_ref_mutex);
> -	ide_device_put(floppy->drive);
>  	kref_put(&floppy->kref, idefloppy_cleanup_obj);
> +	ide_device_put(floppy->drive);
>  	mutex_unlock(&idefloppy_ref_mutex);
>  }
>  
> Index: b/drivers/ide/ide-tape.c
> ===================================================================
> --- a/drivers/ide/ide-tape.c
> +++ b/drivers/ide/ide-tape.c
> @@ -331,11 +331,10 @@ static struct ide_tape_obj *ide_tape_get
>  	mutex_lock(&idetape_ref_mutex);
>  	tape = ide_tape_g(disk);
>  	if (tape) {
> -		kref_get(&tape->kref);
> -		if (ide_device_get(tape->drive)) {
> -			kref_put(&tape->kref, ide_tape_release);
> +		if (ide_device_get(tape->drive))
>  			tape = NULL;
> -		}
> +		else
> +			kref_get(&tape->kref);
>  	}
>  	mutex_unlock(&idetape_ref_mutex);
>  	return tape;
> @@ -344,8 +343,8 @@ static struct ide_tape_obj *ide_tape_get
>  static void ide_tape_put(struct ide_tape_obj *tape)
>  {
>  	mutex_lock(&idetape_ref_mutex);
> -	ide_device_put(tape->drive);
>  	kref_put(&tape->kref, ide_tape_release);
> +	ide_device_put(tape->drive);
>  	mutex_unlock(&idetape_ref_mutex);
>  }
>  
> Index: b/drivers/scsi/ide-scsi.c
> ===================================================================
> --- a/drivers/scsi/ide-scsi.c
> +++ b/drivers/scsi/ide-scsi.c
> @@ -102,11 +102,10 @@ static struct ide_scsi_obj *ide_scsi_get
>  	mutex_lock(&idescsi_ref_mutex);
>  	scsi = ide_scsi_g(disk);
>  	if (scsi) {
> -		scsi_host_get(scsi->host);
> -		if (ide_device_get(scsi->drive)) {
> -			scsi_host_put(scsi->host);
> +		if (ide_device_get(scsi->drive))
>  			scsi = NULL;
> -		}
> +		else
> +			scsi_host_get(scsi->host);
>  	}
>  	mutex_unlock(&idescsi_ref_mutex);
>  	return scsi;
> @@ -115,8 +114,8 @@ static struct ide_scsi_obj *ide_scsi_get
>  static void ide_scsi_put(struct ide_scsi_obj *scsi)
>  {
>  	mutex_lock(&idescsi_ref_mutex);
> -	ide_device_put(scsi->drive);
>  	scsi_host_put(scsi->host);
> +	ide_device_put(scsi->drive);
>  	mutex_unlock(&idescsi_ref_mutex);
>  }
>  
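
The ordering problem described in the quoted patch can be illustrated with a
small user-space model. It is only a sketch: the model_* types are
hypothetical, reference counts are plain integers, and a flag records the
access to a released device instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the IDE device (the parent object). */
struct model_drive {
	int refs;
	bool gone;	/* set once the last reference is dropped */
};

/* Hypothetical stand-in for the driver object that holds a kref. */
struct model_cd {
	int kref;
	struct model_drive *drive;
	bool touched_dead_drive;	/* records the bug instead of crashing */
};

static void model_drive_put(struct model_drive *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->gone = true;
}

/* Release callback: like ide_cd_release(), it still needs the drive. */
static void model_cd_release(struct model_cd *cd)
{
	if (cd->drive->gone)
		cd->touched_dead_drive = true;
}

static void model_kref_put(struct model_cd *cd)
{
	assert(cd->kref > 0);
	if (--cd->kref == 0)
		model_cd_release(cd);
}

/* Order before the fix: the device reference is dropped first. */
static void model_cd_put_buggy(struct model_cd *cd)
{
	model_drive_put(cd->drive);
	model_kref_put(cd);
}

/* Order after the fix: the kref is dropped first. */
static void model_cd_put_fixed(struct model_cd *cd)
{
	model_kref_put(cd);
	model_drive_put(cd->drive);
}
```

With one reference on each object, the buggy order runs the release callback
after the drive is already gone, while the fixed order releases the driver
object first and only then lets the drive go.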


* Re: [bugme-daemon@bugzilla.kernel.org: [Bug 7306] Yenta-socket causes oops on insertion of any PCMCIA card]
From: Benjamin Herrenschmidt @ 2008-07-29 21:31 UTC
  To: Dominik Brodowski; +Cc: linuxppc-dev, paulus
In-Reply-To: <20080729182819.GA30961@isilmar.linta.de>

On Tue, 2008-07-29 at 20:28 +0200, Dominik Brodowski wrote:
> Ben, Paul,
> 
> any ideas?

Strange. Paul has a lombard, so if he can bring it to the office, I'll
have a look. Paul, bring some legacy PCMCIA cards too if you have any,
I'm not sure I do (though I think we have one or two somewhere in the
lab).

Cheers,
Ben.

> Best,
> 	Dominik
> 
> On Thu, Jul 17, 2008 at 11:14:44AM +0200, Dominik Brodowski wrote:
> > Hi,
> > 
> > on an Apple Powerbook G3 (Lombard) with a PPC 740 running at 333 MHz, the
> > PCI host bridge is configured to allow "downstream" devices to use iomem
> > 
> > 0xfd000000 - 0xfdffffff
> > 
> > However, when using it for PCMCIA purposes, there's a machine check. Any
> > ideas on why this PCI host bridge is mis-configured, and how to resolve this
> > issue (besides adding reserved=0xfd000000,0xffffff as kernel boot option)?
> > 
> > Best,
> > 	Dominik
> > 
> > 
> > ----- Forwarded message from bugme-daemon@bugzilla.kernel.org -----
> > 
> > Subject: [Bug 7306] Yenta-socket causes oops on insertion of any PCMCIA card
> > To: linux-pcmcia@lists.infradead.org
> > From: bugme-daemon@bugzilla.kernel.org
> > Date: Thu, 17 Jul 2008 01:45:44 -0700 (PDT)
> > 
> > http://bugzilla.kernel.org/show_bug.cgi?id=7306
> > 
> > 
> > 
> > 
> > 
> > ------- Comment #17 from linux@brodo.de  2008-07-17 01:45 -------
> > Now this contains interesting information:
> > 
> > pcmcia: parent PCI bridge Memory window: 
> > 
> > means the PCI host bridge is configured to allow "downstream" devices to use
> > this memory area. However, when the PCMCIA socket tries to do so, you get the
> > machine check. So my question would be to the powerpc folks: why is the PCI
> > host bridge configured this way, even if this memory area is not usable?
> > 
> > 
> > -- 
> > Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
> > ------- You are receiving this mail because: -------
> > You are on the CC list for the bug, or are watching someone who is.
> > 
> > _______________________________________________
> > Linux PCMCIA reimplementation list
> > http://lists.infradead.org/mailman/listinfo/linux-pcmcia
> > 
> > ----- End forwarded message -----


* Re: 2.6.27-rc1 problems
From: Benjamin Herrenschmidt @ 2008-07-29 21:33 UTC
  To: Sean MacLennan; +Cc: linuxppc-dev
In-Reply-To: <20080729140617.7bf16aeb@lappy.seanm.ca>

On Tue, 2008-07-29 at 14:06 -0400, Sean MacLennan wrote:
> I am seeing a lot of problems with 2.6.27-rc1.
> 
> Merge conflicts that make no sense.

Ugh ? Have you done a clean pull ?

> Files missing (tracehook.h).

Looks like your repo isn't sane

> Lots of compile errors.

There's one known issue, due to a PCI patch that went in and broke ppc64, but
that's it I think. I'll send a new batch of fixes today.

> Anybody else seeing problems, or is it just me? I am starting
> to wonder if my git repository has gotten corrupted.

Yeah, sounds like it.

Ben.


* Re: Level IRQ handling on Xilinx INTC with ARCH=powerpc
From: Benjamin Herrenschmidt @ 2008-07-29 21:35 UTC
  To: David Howells; +Cc: Ingo Molnar, Thomas Gleixner, linuxppc-dev
In-Reply-To: <16359.1217340857@redhat.com>

On Tue, 2008-07-29 at 15:14 +0100, David Howells wrote:
> Sergey Temerkhanov <temerkhanov@yandex.ru> wrote:
> 
> > And handle_level_irq() which is currently used as high-level IRQ handler for
> > Xilinx INTC only tries to acknowledge IRQ before ISR call. So that the IRQ
> > remains asserted in INTC and after the call to desc->chip->unmask() causes
> > spurious attempt to process the same IRQ again. However, call to
> > desc->chip->ack() this time finishes the required procedure of IRQ
> > acknowledge.
> 
> I think I'm seeing the same on the MN10300 arch with its builtin PIC.  My
> solution was to make unmask() also clear the IRQ latch in the PIC for that
> channel.  We perhaps want an unmask_ack() op.

I've heard about similar issues on other setups... I dislike having a
separate op though, not sure what's the best approach. Another one is to
write a different level handler for such PICs, though that somewhat
sucks too. CC'ing Ingo and Thomas who may have a better idea.

Ben.
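
The sequence discussed in this thread can be sketched as a small user-space
model. It assumes, as the quoted description suggests, a controller whose
pending latch cannot be cleared while the device still asserts its line; the
model_* names and the ack_on_unmask switch are hypothetical and do not
correspond to any genirq API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a PIC channel that latches a level input. */
struct model_pic {
	bool line;	/* device is currently asserting its interrupt line */
	bool latch;	/* pending bit captured by the controller */
	bool masked;
};

/* Ack clears the latch, but an asserted line sets it again at once. */
static void model_ack(struct model_pic *p)
{
	p->latch = false;
	if (p->line)
		p->latch = true;
}

/*
 * Models one pass of a level-type flow handler.
 * Returns the number of spurious re-deliveries seen after unmask.
 */
static int model_handle_level(struct model_pic *p, bool ack_on_unmask)
{
	int spurious = 0;

	p->masked = true;
	model_ack(p);		/* ack before the ISR: line high, latch stays set */
	p->line = false;	/* ISR services the device, which drops the line */
	if (ack_on_unmask)
		model_ack(p);	/* line is low now, so the latch really clears */
	p->masked = false;
	if (p->latch) {		/* stale latch fires again after unmask */
		spurious++;
		model_ack(p);	/* this second ack finally clears it */
	}
	return spurious;
}
```

In the model, acking only before the ISR yields one spurious pass per
interrupt, and clearing the latch again at unmask time avoids it. Whether that
belongs in a separate op or in a different flow handler is the open question
above.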


* Re: 2.6.27-rc1 problems
From: Sean MacLennan @ 2008-07-29 21:50 UTC
  To: benh; +Cc: linuxppc-dev
In-Reply-To: <1217367185.11188.260.camel@pasglop>

On Wed, 30 Jul 2008 07:33:05 +1000
"Benjamin Herrenschmidt" <benh@kernel.crashing.org> wrote:

> Yeah, sounds like it.

Thanks for the confirmation. Yeah, I think the git repository is
corrupt. I did a git pull on a recently branched copy of this git and
it was clean.

Cheers,
   Sean

^ permalink raw reply

* Re: 2.6.27-rc1 problems
From: Benjamin Herrenschmidt @ 2008-07-29 22:01 UTC (permalink / raw)
  To: Sean MacLennan; +Cc: linuxppc-dev
In-Reply-To: <20080729175021.14c458ed@lappy.seanm.ca>

On Tue, 2008-07-29 at 17:50 -0400, Sean MacLennan wrote:
> On Wed, 30 Jul 2008 07:33:05 +1000
> "Benjamin Herrenschmidt" <benh@kernel.crashing.org> wrote:
> 
> > Yeah, sounds like it.
> 
> Thanks for the confirmation. Yeah, I think the git repository is
> corrupt. I did a git pull on a recently branched copy of this git and
> it was clean.

that's what git reset --hard is for... unless your objects are corrupt
too.

Ben.

^ permalink raw reply

* verbose kernel debug
From: Jon Smirl @ 2008-07-29 22:47 UTC (permalink / raw)
  To: ppc-dev

I'm getting a "Badness at c01cc228 [verbose debug info unavailable]"

How do I turn on verbose debug support? Or is it helpful? I see the
option for x86 but I don't see how to do it for PowerPC.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: [PATCH] powerpc/fsl: proliferate simple-bus compatibility to soc nodes
From: Kumar Gala @ 2008-07-29 22:50 UTC (permalink / raw)
  To: Kim Phillips; +Cc: linuxppc-dev, mr.scada
In-Reply-To: <20080729152924.65a02311.kim.phillips@freescale.com>


On Jul 29, 2008, at 3:29 PM, Kim Phillips wrote:

> add simple-bus compatible property to soc nodes for 83xx/85xx  
> platforms
> that were missing them.  Add same to platform probe code.
>
> This fixes SoC device drivers (such as talitos) to succeed in matching
> devices present in the soc node.
>
> also update mpc836x_rdk dts to new SEC bindings (overlooked in commit
> 3fd4473: powerpc/fsl: update crypto node definition and device tree
> instances).
>
> Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
> ---
> arch/powerpc/boot/dts/mpc832x_mds.dts     |    1 +
> arch/powerpc/boot/dts/mpc832x_rdb.dts     |    1 +
> arch/powerpc/boot/dts/mpc8349emitx.dts    |    1 +
> arch/powerpc/boot/dts/mpc8349emitxgp.dts  |    1 +
> arch/powerpc/boot/dts/mpc834x_mds.dts     |    1 +
> arch/powerpc/boot/dts/mpc836x_mds.dts     |    1 +
> arch/powerpc/boot/dts/mpc836x_rdk.dts     |   16 ++++++----------
> arch/powerpc/boot/dts/mpc8377_mds.dts     |    1 +
> arch/powerpc/boot/dts/mpc8378_mds.dts     |    1 +
> arch/powerpc/boot/dts/mpc8379_mds.dts     |    1 +
> arch/powerpc/boot/dts/mpc8536ds.dts       |    1 +
> arch/powerpc/boot/dts/mpc8540ads.dts      |    1 +
> arch/powerpc/boot/dts/mpc8541cds.dts      |    1 +
> arch/powerpc/boot/dts/mpc8544ds.dts       |    1 +
> arch/powerpc/boot/dts/mpc8548cds.dts      |    1 +
> arch/powerpc/boot/dts/mpc8555cds.dts      |    1 +
> arch/powerpc/boot/dts/mpc8560ads.dts      |    1 +
> arch/powerpc/boot/dts/mpc8568mds.dts      |    1 +
> arch/powerpc/boot/dts/mpc8572ds.dts       |    1 +
> arch/powerpc/platforms/83xx/mpc832x_mds.c |    1 +
> arch/powerpc/platforms/83xx/mpc832x_rdb.c |    1 +
> arch/powerpc/platforms/83xx/mpc834x_itx.c |    1 +
> arch/powerpc/platforms/83xx/mpc834x_mds.c |    1 +
> arch/powerpc/platforms/83xx/mpc836x_mds.c |    1 +
> arch/powerpc/platforms/83xx/sbc834x.c     |    1 +
> arch/powerpc/platforms/85xx/ksi8560.c     |    1 +
> arch/powerpc/platforms/85xx/mpc8536_ds.c  |    1 +
> arch/powerpc/platforms/85xx/mpc85xx_ads.c |    1 +
> arch/powerpc/platforms/85xx/mpc85xx_ds.c  |    1 +
> arch/powerpc/platforms/85xx/mpc85xx_mds.c |    1 +
> arch/powerpc/platforms/85xx/sbc8560.c     |    1 +
> 31 files changed, 36 insertions(+), 10 deletions(-)

applied.

- k

^ permalink raw reply

* Re: verbose kernel debug
From: Scott Wood @ 2008-07-29 23:05 UTC (permalink / raw)
  To: Jon Smirl; +Cc: ppc-dev
In-Reply-To: <9e4733910807291547p3ddd6b15nbe4516657e2e014f@mail.gmail.com>

Jon Smirl wrote:
> I'm getting a "Badness at c01cc228 [verbose debug info unavailable]"
> 
> How do I turn on verbose debug support? Or is it helpful? I see the
> option for x86 but I don't see how to do it for PowerPC.

Under "Kernel Hacking", enable "Kernel debugging".  This will expose a 
"Verbose BUG() reporting" option.

IMHO, this option shouldn't be buried in this manner; it's not just for 
hacking kernels, but also for submitting decent bug reports.

-Scott

^ permalink raw reply

* Re: verbose kernel debug
From: Jon Smirl @ 2008-07-29 23:12 UTC (permalink / raw)
  To: Scott Wood; +Cc: ppc-dev
In-Reply-To: <488FA22C.5060100@freescale.com>

On 7/29/08, Scott Wood <scottwood@freescale.com> wrote:
> Jon Smirl wrote:
>
> > I'm getting a "Badness at c01cc228 [verbose debug info unavailable]"
> >
> > How do I turn on verbose debug support? Or is it helpful? I see the
> > option for x86 but I don't see how to do it for PowerPC.
> >
>
>  Under "Kernel Hacking", enable "Kernel debugging".  This will expose a
> "Verbose BUG() reporting" option.
>
>  IMHO, this option shouldn't be buried in this manner; it's not just for
> hacking kernels, but also for submitting decent bug reports.

I looked in there. The option is not appearing for my MPC5200.

>
>  -Scott
>


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: verbose kernel debug
From: Grant Likely @ 2008-07-29 23:40 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Scott Wood, ppc-dev
In-Reply-To: <9e4733910807291612l5e0567d6t376c2f96c5169333@mail.gmail.com>

On Tue, Jul 29, 2008 at 07:12:58PM -0400, Jon Smirl wrote:
> On 7/29/08, Scott Wood <scottwood@freescale.com> wrote:
> > Jon Smirl wrote:
> >
> > > I'm getting a "Badness at c01cc228 [verbose debug info unavailable]"
> > >
> > > How do I turn on verbose debug support? Or is it helpful? I see the
> > > option for x86 but I don't see how to do it for PowerPC.
> > >
> >
> >  Under "Kernel Hacking", enable "Kernel debugging".  This will expose a
> > "Verbose BUG() reporting" option.
> >
> >  IMHO, this option shouldn't be buried in this manner; it's not just for
> > hacking kernels, but also for submitting decent bug reports.
> 
> I looked in there. The option is not appearing for my MPC5200.

Hmmm, I can see it right now on a 5200 configured kernel.  It's right
before "Compile the kernel with debug info" option.

It also helps to make sure CONFIG_EMBEDDED is disabled, or if it is
enabled that CONFIG_KALLSYMS is enabled.

g.

^ permalink raw reply

* Re: Writing to CPLD mapped to EBC Port of AMCC440EP
From: Josh Boyer @ 2008-07-30  0:02 UTC (permalink / raw)
  To: Henry Bausley; +Cc: linuxppc-embedded
In-Reply-To: <23b401c8f10b$54ecf9d0$0109220a@deltatau.local>

On Mon, 28 Jul 2008 16:40:35 -0700
"Henry Bausley" <hbausley@deltatau.com> wrote:

> 
> I am attempting to write to a CPLD mapped to the EBC port of a AMCC 440EP.  When I attempt to write using an unsigned variable
> ie. unsigned *pbase = (unsigned char *)ioremap64(0x8F000000,0x1000000);
> I get a kernel access of bad area, sig: 11 fault.  However, if I change to an unsigned char ie. unsigned char *pbase = (unsigned char *)ioremap64(0x8F000000,
> 0x1000000); The system doesn't crash.  I need to write using an unsigned.  Does any one have any ideas what I am doing wrong?

The documentation I have for the Bamboo board says the EPLD is at
address 0x80002000 and is only 8 bytes in size.  Similarly, the
Yosemite board CPLD is at 0x80002000 and is only 16 bytes in size.  Why
you are ioremapping 16MiB at 0x8F000000 I have no idea.

Also, the individual registers of the EPLD/CPLD on both boards are only
8 bits, so an unsigned char seems appropriate.  If you have a custom
board that does something totally different from how the eval boards
are set up, then I'm not sure many people will be able to help you
without documentation for that board.
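
If a wider value really has to be stored in a device whose registers are
only 8 bits wide, one option is to split it into byte-sized accesses. The
sketch below is plain user-space C and only models the idea: the helper
names, the big-endian byte order and the assumption that the registers
sit at consecutive byte addresses are all hypothetical and would need to
be checked against the board documentation. In a driver the stores would
go through the I/O accessors (for example out_8() on powerpc) on the
ioremap()ed pointer rather than through a plain pointer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store a 32-bit value as four separate 8-bit writes, most significant
 * byte first. 'regs' stands in for an already-mapped register window;
 * the consecutive-byte layout is an assumption made for this sketch.
 */
static void cpld_write32_bytewise(volatile uint8_t *regs, size_t off,
				  uint32_t val)
{
	regs[off + 0] = (uint8_t)(val >> 24);
	regs[off + 1] = (uint8_t)(val >> 16);
	regs[off + 2] = (uint8_t)(val >> 8);
	regs[off + 3] = (uint8_t)val;
}

/* Read the four bytes back and reassemble them in the same order. */
static uint32_t cpld_read32_bytewise(const volatile uint8_t *regs, size_t off)
{
	return ((uint32_t)regs[off + 0] << 24) |
	       ((uint32_t)regs[off + 1] << 16) |
	       ((uint32_t)regs[off + 2] << 8) |
	       (uint32_t)regs[off + 3];
}
```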

josh

^ permalink raw reply

* Re: verbose kernel debug
From: Jon Smirl @ 2008-07-30  0:09 UTC (permalink / raw)
  To: Grant Likely; +Cc: Scott Wood, ppc-dev
In-Reply-To: <20080729234012.GA13708@secretlab.ca>

On 7/29/08, Grant Likely <grant.likely@secretlab.ca> wrote:
> On Tue, Jul 29, 2008 at 07:12:58PM -0400, Jon Smirl wrote:
>  > On 7/29/08, Scott Wood <scottwood@freescale.com> wrote:
>  > > Jon Smirl wrote:
>  > >
>  > > > I'm getting a "Badness at c01cc228 [verbose debug info unavailable]"
>  > > >
>  > > > How do I turn on verbose debug support? Or is it helpful? I see the
>  > > > option for x86 but I don't see how to do it for PowerPC.
>  > > >
>  > >
>  > >  Under "Kernel Hacking", enable "Kernel debugging".  This will expose a
>  > > "Verbose BUG() reporting" option.
>  > >
>  > >  IMHO, this option shouldn't be buried in this manner; it's not just for
>  > > hacking kernels, but also for submitting decent bug reports.
>  >
>  > I looked in there. The option is not appearing for my MPC5200.
>
>
> Hmmm, I can see it right now on a 5200 configured kernel.  It's right
>  before "Compile the kernel with debug info" option.
>
>  It also helps to make sure CONFIG_EMBEDDED is disabled, or if it is
>  enabled that CONFIG_KALLSYMS is enabled.

More needs to be selected, I still can't get it to appear.

I'm digging through Kconfig....


>
>
>  g.
>
>


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: verbose kernel debug
From: Jon Smirl @ 2008-07-30  0:17 UTC (permalink / raw)
  To: Grant Likely; +Cc: Scott Wood, ppc-dev
In-Reply-To: <9e4733910807291709j49cacaa7gcc05f075f0691682@mail.gmail.com>

Why isn't PowerPC in the depends on?

config DEBUG_BUGVERBOSE
	bool "Verbose BUG() reporting (adds 70K)" if DEBUG_KERNEL && EMBEDDED
	depends on BUG
	depends on ARM || AVR32 || M32R || M68K || SPARC32 || SPARC64 || \
		   FRV || SUPERH || GENERIC_BUG || BLACKFIN || MN10300
	default !EMBEDDED
	help
	  Say Y here to make BUG() panics output the file name and line number
	  of the BUG call as well as the EIP and oops trace.  This aids
	  debugging but costs about 70-100K of memory.
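
For what it's worth, the entry above can be read as two separate
questions: is the prompt shown, and what value does the symbol take when
it is not. The C below is only a model of how I read the quoted Kconfig
text; the struct and function names are made up for illustration. On
PowerPC the "depends on" line is presumably satisfied through
GENERIC_BUG rather than through an architecture name, but that is an
assumption here, not something stated in the entry.

```c
#include <stdbool.h>

/* Inputs named after the symbols in the quoted Kconfig entry. */
struct kconfig_model {
	bool debug_kernel;		/* DEBUG_KERNEL */
	bool embedded;			/* EMBEDDED */
	bool bug;			/* BUG */
	bool arch_or_generic_bug;	/* "ARM || ... || GENERIC_BUG || ..." */
};

/* Both "depends on" lines must hold for the symbol to exist at all. */
static bool bugverbose_deps_met(const struct kconfig_model *c)
{
	return c->bug && c->arch_or_generic_bug;
}

/* The prompt carries its own condition: "if DEBUG_KERNEL && EMBEDDED". */
static bool bugverbose_prompt_visible(const struct kconfig_model *c)
{
	return bugverbose_deps_met(c) && c->debug_kernel && c->embedded;
}

/* Value taken when the prompt is hidden or left alone: "default !EMBEDDED". */
static bool bugverbose_default_value(const struct kconfig_model *c)
{
	return bugverbose_deps_met(c) && !c->embedded;
}
```

Read this way, with EMBEDDED off the prompt is hidden but the symbol
defaults to y, and with EMBEDDED and DEBUG_KERNEL both on the prompt
appears but defaults to n.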


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: verbose kernel debug
From: Jon Smirl @ 2008-07-30  0:28 UTC (permalink / raw)
  To: Grant Likely; +Cc: Scott Wood, ppc-dev
In-Reply-To: <9e4733910807291717g7096b8eas455964e0bbb637b9@mail.gmail.com>

I finally figured out the right combo to turn it on. I was expecting
it to appear in the "Kernel Debugging" section but it was appearing
down lower.

Seems like this option should be defaulted on no matter what, and then
turn it off to save memory. It was the interaction with
CONFIG_EMBEDDED that confused me.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: ide pmac breakage
From: FUJITA Tomonori @ 2008-07-30  1:23 UTC (permalink / raw)
  To: bzolnier; +Cc: fujita.tomonori, petkovbb, linuxppc-dev, linux-ide
In-Reply-To: <200807292126.12238.bzolnier@gmail.com>

On Tue, 29 Jul 2008 21:26:11 +0200
Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote:

> On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote:
> > On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote:
> > > On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote:
> > > > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > > > Well, all I do is call into Bart's new helpers to scan for or
> > > > > unregister
> > > > > > devices ...
> > > > > 
> > > > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't
> > > > > bring
> > > > > such behavior change (ditto for new IDE host addition/removal
> > > > > helpers)...
> > > > > 
> > > > > Please try to git-bisect it when you have some time.
> > > > 
> > > > Ok, I will. It worked fine when I last tried your patches. I'll see if I
> > > > can track it down too. Been a bit too busy lately as you can imagine.
> > > > 
> > > > Do you have something that exercises the same code path you can use ?
> > > 
> > > I'll see if I can reproduce it with IDE warm-plug support later...
> > 
> > OK, I reproduced it here with IDE warm-plug support
> > (echo -n "1" > /sys/class/ide_port/ide*/delete_devices)
> > for devices driven by ide-cd.
> > 
> > It is also reproducible under qemu so I'm scripting it
> > into git-bisect run now...
> 
> I WON!!!

Great, seems that I don't need to learn how ide does ref counting. :)

Thanks a lot!

^ permalink raw reply

* Re: [PATCH] powerpc/lpar - defer prefered console setup
From: Michael Ellerman @ 2008-07-30  2:34 UTC (permalink / raw)
  To: Bastian Blank; +Cc: linuxppc-dev, akpm, linux-kernel
In-Reply-To: <20080728185651.GA29530@wavehammer.waldi.eu.org>

On Mon, 2008-07-28 at 20:56 +0200, Bastian Blank wrote:
> Hi

Hi Bastian,

> The powerpc lpar code adds a preferred console at a very early stage,
> during arch_setup. This runs even before the console setup from the
> command line and takes preference.

It runs before the command line parsing, and so /does not/ take
preference. I thought.

/**
 * add_preferred_console - add a device to the list of preferred consoles.
 ...
 * The last preferred console added will be used for kernel messages
 * and stdin/out/err for init. 

The last console will be added by the console= parsing, and so that will
be used. The console we add in the pseries setup is only used if nothing
is specified on the command line.
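
The "last one added wins" rule from the comment quoted above can be
modelled in a few lines of user-space C. This is a sketch of the rule as
the comment describes it, not the kernel's implementation; the struct,
the function names and the table size are all hypothetical.

```c
#include <string.h>

#define MODEL_MAX_PREFS 8

struct console_pref {
	char name[8];
	int index;
};

struct console_prefs {
	struct console_pref entry[MODEL_MAX_PREFS];
	int count;
	int selected;	/* slot used for kernel messages, -1 if none */
};

static void prefs_init(struct console_prefs *p)
{
	p->count = 0;
	p->selected = -1;
}

/*
 * Add a preferred console. Whichever call comes last decides the
 * selection, whether it adds a new entry or names an existing one.
 * Returns 0 on success, -1 if the table is full.
 */
static int prefs_add(struct console_prefs *p, const char *name, int index)
{
	int i;

	for (i = 0; i < p->count; i++) {
		if (strcmp(p->entry[i].name, name) == 0 &&
		    p->entry[i].index == index) {
			p->selected = i;
			return 0;
		}
	}
	if (p->count == MODEL_MAX_PREFS)
		return -1;

	strncpy(p->entry[i].name, name, sizeof(p->entry[i].name) - 1);
	p->entry[i].name[sizeof(p->entry[i].name) - 1] = '\0';
	p->entry[i].index = index;
	p->selected = i;
	p->count++;
	return 0;
}
```

With this model, adding "hvc" from platform setup and then "ttyS" from
the command line leaves "ttyS" selected, which matches the behaviour I
would expect from the comment.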

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


^ permalink raw reply

* [PATCH] powerpc/mm: Lockless get_user_pages_fast()
From: Benjamin Herrenschmidt @ 2008-07-30  3:37 UTC (permalink / raw)
  To: linuxppc-dev list; +Cc: Nick Piggin

From: Nick Piggin <npiggin@suse.de>

Implement lockless get_user_pages_fast for powerpc.  Page table existence
is guaranteed with RCU, and speculative page references are used to take a
reference to the pages without having a prior existence guarantee on them.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

I'm going to merge this; I'm sending it to the list for reference. It
was in -mm for some time, minus some changes/fixes I made to resolve
conflicts with the new multiple huge page sizes.
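
For readers unfamiliar with speculative references, the idea is to take
a reference only if the count has not already dropped to zero, and then
to re-check that the page table entry is still the one that was read.
The sketch below models that in user-space C11 atomics. It is an
illustration only: the function names and the use of <stdatomic.h> are
assumptions for this example, and the patch itself uses the kernel's
own atomic helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add n references unless the count is already zero. Returns true if
 * the references were taken, false if the object was already dead.
 */
static bool add_refs_unless_zero(atomic_int *count, int n)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure the CAS reloads 'old' and the loop retries. */
		if (atomic_compare_exchange_weak(count, &old, old + n))
			return true;
	}
	return false;
}

/*
 * Take one reference, then confirm the entry still holds the value
 * that was read earlier. If it changed, drop the reference again.
 */
static bool grab_if_unchanged(atomic_ulong *pte_slot, unsigned long seen,
			      atomic_int *count)
{
	if (!add_refs_unless_zero(count, 1))
		return false;
	if (atomic_load(pte_slot) != seen) {
		atomic_fetch_sub(count, 1);
		return false;
	}
	return true;
}
```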

Index: linux-work/arch/powerpc/Kconfig
===================================================================
--- linux-work.orig/arch/powerpc/Kconfig	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/arch/powerpc/Kconfig	2008-07-30 13:27:40.000000000 +1000
@@ -42,6 +42,9 @@ config GENERIC_HARDIRQS
 	bool
 	default y
 
+config HAVE_GET_USER_PAGES_FAST
+	def_bool PPC64
+
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool PPC64
 
Index: linux-work/arch/powerpc/mm/Makefile
===================================================================
--- linux-work.orig/arch/powerpc/mm/Makefile	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/arch/powerpc/mm/Makefile	2008-07-30 13:27:40.000000000 +1000
@@ -6,7 +6,7 @@ ifeq ($(CONFIG_PPC64),y)
 EXTRA_CFLAGS	+= -mno-minimal-toc
 endif
 
-obj-y				:= fault.o mem.o \
+obj-y				:= fault.o mem.o gup.o \
 				   init_$(CONFIG_WORD_SIZE).o \
 				   pgtable_$(CONFIG_WORD_SIZE).o \
 				   mmu_context_$(CONFIG_WORD_SIZE).o
Index: linux-work/arch/powerpc/mm/gup.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-work/arch/powerpc/mm/gup.c	2008-07-30 13:28:03.000000000 +1000
@@ -0,0 +1,262 @@
+/*
+ * Lockless get_user_pages_fast for powerpc
+ *
+ * Copyright (C) 2008 Nick Piggin
+ * Copyright (C) 2008 Novell Inc.
+ */
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>
+#include <linux/vmstat.h>
+#include <linux/pagemap.h>
+#include <linux/rwsem.h>
+#include <asm/pgtable.h>
+
+/*
+ * The performance critical leaf functions are made noinline otherwise gcc
+ * inlines everything into a single function which results in too much
+ * register pressure.
+ */
+static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
+		unsigned long end, int write, struct page **pages, int *nr)
+{
+	unsigned long mask, result;
+	pte_t *ptep;
+
+	result = _PAGE_PRESENT|_PAGE_USER;
+	if (write)
+		result |= _PAGE_RW;
+	mask = result | _PAGE_SPECIAL;
+
+	ptep = pte_offset_kernel(&pmd, addr);
+	do {
+		pte_t pte = *ptep;
+		struct page *page;
+
+		if ((pte_val(pte) & mask) != result)
+			return 0;
+		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+		page = pte_page(pte);
+		if (!page_cache_get_speculative(page))
+			return 0;
+		if (unlikely(pte != *ptep)) {
+			put_page(page);
+			return 0;
+		}
+		pages[*nr] = page;
+		(*nr)++;
+
+	} while (ptep++, addr += PAGE_SIZE, addr != end);
+
+	return 1;
+}
+
+static noinline int gup_huge_pte(pte_t *ptep, struct hstate *hstate,
+				 unsigned long *addr, unsigned long end,
+				 int write, struct page **pages, int *nr)
+{
+	unsigned long mask;
+	unsigned long pte_end;
+	struct page *head, *page;
+	pte_t pte;
+	int refs;
+
+	pte_end = (*addr + huge_page_size(hstate)) & huge_page_mask(hstate);
+	if (pte_end < end)
+		end = pte_end;
+
+	pte = *ptep;
+	mask = _PAGE_PRESENT|_PAGE_USER;
+	if (write)
+		mask |= _PAGE_RW;
+	if ((pte_val(pte) & mask) != mask)
+		return 0;
+	/* hugepages are never "special" */
+	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+
+	refs = 0;
+	head = pte_page(pte);
+	page = head + ((*addr & ~huge_page_mask(hstate)) >> PAGE_SHIFT);
+	do {
+		VM_BUG_ON(compound_head(page) != head);
+		pages[*nr] = page;
+		(*nr)++;
+		page++;
+		refs++;
+	} while (*addr += PAGE_SIZE, *addr != end);
+
+	if (!page_cache_add_speculative(head, refs)) {
+		*nr -= refs;
+		return 0;
+	}
+	if (unlikely(pte != *ptep)) {
+		/* Could be optimized better */
+		while (*nr) {
+			put_page(page);
+			(*nr)--;
+		}
+	}
+
+	return 1;
+}
+
+static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pmd_t *pmdp;
+
+	pmdp = pmd_offset(&pud, addr);
+	do {
+		pmd_t pmd = *pmdp;
+
+		next = pmd_addr_end(addr, end);
+		if (pmd_none(pmd))
+			return 0;
+		if (!gup_pte_range(pmd, addr, next, write, pages, nr))
+			return 0;
+	} while (pmdp++, addr = next, addr != end);
+
+	return 1;
+}
+
+static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pud_t *pudp;
+
+	pudp = pud_offset(&pgd, addr);
+	do {
+		pud_t pud = *pudp;
+
+		next = pud_addr_end(addr, end);
+		if (pud_none(pud))
+			return 0;
+		if (!gup_pmd_range(pud, addr, next, write, pages, nr))
+			return 0;
+	} while (pudp++, addr = next, addr != end);
+
+	return 1;
+}
+
+int get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			struct page **pages)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long addr, len, end;
+	unsigned long next;
+	pgd_t *pgdp;
+	int psize, nr = 0;
+	unsigned int shift;
+
+	start &= PAGE_MASK;
+	addr = start;
+	len = (unsigned long) nr_pages << PAGE_SHIFT;
+	end = start + len;
+
+	if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
+					start, len)))
+		goto slow_irqon;
+
+	/* Cross a slice boundary? */
+	/* XXX could be improved by iterating slices instead */
+	if (addr < SLICE_LOW_TOP) {
+		if (end > SLICE_LOW_TOP)
+			goto slow_irqon;
+
+		if (unlikely(GET_LOW_SLICE_INDEX(addr) !=
+			     GET_LOW_SLICE_INDEX(end - 1)))
+			goto slow_irqon;
+	} else {
+		if (unlikely(GET_HIGH_SLICE_INDEX(addr) !=
+			     GET_HIGH_SLICE_INDEX(end - 1)))
+			goto slow_irqon;
+	}
+
+	/*
+	 * XXX: batch / limit 'nr', to avoid large irq off latency
+	 * needs some instrumenting to determine the common sizes used by
+	 * important workloads (eg. DB2), and whether limiting the batch size
+	 * will decrease performance.
+	 *
+	 * It seems like we're in the clear for the moment. Direct-IO is
+	 * the main guy that batches up lots of get_user_pages, and even
+	 * they are limited to 64-at-a-time which is not so many.
+	 */
+	/*
+	 * This doesn't prevent pagetable teardown, but does prevent
+	 * the pagetables from being freed on powerpc.
+	 *
+	 * So long as we atomically load page table pointers versus teardown,
+	 * we can follow the address down to the the page and take a ref on it.
+	 */
+	local_irq_disable();
+
+	psize = get_slice_psize(mm, addr);
+	shift = mmu_psize_defs[psize].shift;
+
+	if (unlikely(mmu_huge_psizes[psize])) {
+		pte_t *ptep;
+		unsigned long a = addr;
+		unsigned long sz = ((1UL) << shift);
+		struct hstate *hstate = size_to_hstate(sz);
+
+		BUG_ON(!hstate);
+		/*
+		 * XXX: could be optimized to avoid hstate
+		 * lookup entirely (just use shift)
+		 */
+
+		do {
+			VM_BUG_ON(shift != mmu_psize_defs[get_slice_psize(mm, a)].shift);
+			ptep = huge_pte_offset(mm, a);
+			if (!gup_huge_pte(ptep, hstate, &a, end, write, pages,
+					  &nr))
+				goto slow;
+		} while (a != end);
+	} else {
+		pgdp = pgd_offset(mm, addr);
+		do {
+			pgd_t pgd = *pgdp;
+
+			VM_BUG_ON(shift != mmu_psize_defs[get_slice_psize(mm, addr)].shift);
+
+			next = pgd_addr_end(addr, end);
+			if (pgd_none(pgd))
+				goto slow;
+			if (!gup_pud_range(pgd, addr, next, write, pages, &nr))
+				goto slow;
+		} while (pgdp++, addr = next, addr != end);
+	}
+	local_irq_enable();
+
+	VM_BUG_ON(nr != (end - start) >> PAGE_SHIFT);
+	return nr;
+
+	{
+		int ret;
+
+slow:
+		local_irq_enable();
+slow_irqon:
+		/* Try to get the remaining pages with get_user_pages */
+		start += nr << PAGE_SHIFT;
+		pages += nr;
+
+		down_read(&mm->mmap_sem);
+		ret = get_user_pages(current, mm, start,
+			(end - start) >> PAGE_SHIFT, write, 0, pages, NULL);
+		up_read(&mm->mmap_sem);
+
+		/* Have to be a bit careful with return values */
+		if (nr > 0) {
+			if (ret < 0)
+				ret = nr;
+			else
+				ret += nr;
+		}
+
+		return ret;
+	}
+}
Index: linux-work/include/asm-powerpc/pgtable-ppc64.h
===================================================================
--- linux-work.orig/include/asm-powerpc/pgtable-ppc64.h	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/include/asm-powerpc/pgtable-ppc64.h	2008-07-30 13:27:40.000000000 +1000
@@ -461,6 +461,8 @@ void pgtable_cache_init(void);
 	return pt;
 }
 
+pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long address);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
Index: linux-work/include/linux/pagemap.h
===================================================================
--- linux-work.orig/include/linux/pagemap.h	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/include/linux/pagemap.h	2008-07-30 13:27:40.000000000 +1000
@@ -142,6 +142,29 @@ static inline int page_cache_get_specula
 	return 1;
 }
 
+/*
+ * Same as above, but add instead of inc (could just be merged)
+ */
+static inline int page_cache_add_speculative(struct page *page, int count)
+{
+	VM_BUG_ON(in_interrupt());
+
+#if !defined(CONFIG_SMP) && defined(CONFIG_CLASSIC_RCU)
+# ifdef CONFIG_PREEMPT
+	VM_BUG_ON(!in_atomic());
+# endif
+	VM_BUG_ON(page_count(page) == 0);
+	atomic_add(count, &page->_count);
+
+#else
+	if (unlikely(!atomic_add_unless(&page->_count, count, 0)))
+		return 0;
+#endif
+	VM_BUG_ON(PageCompound(page) && page != compound_head(page));
+
+	return 1;
+}
+
 static inline int page_freeze_refs(struct page *page, int count)
 {
 	return likely(atomic_cmpxchg(&page->_count, count, 0) == count);

^ permalink raw reply

* Re: [PATCH] powerpc/mm: Lockless get_user_pages_fast()
From: Benjamin Herrenschmidt @ 2008-07-30  4:20 UTC (permalink / raw)
  To: linuxppc-dev list; +Cc: Nick Piggin
In-Reply-To: <1217389038.11188.285.camel@pasglop>

From: Nick Piggin <npiggin@suse.de>

Implement lockless get_user_pages_fast for powerpc.  Page table existence
is guaranteed with RCU, and speculative page references are used to take a
reference to the pages without having a prior existence guarantee on them.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

v2.
Fix the Makefile to build gup.o only on 64-bit, and fix a bug with
huge pages where we would oops (NULL dereference) if
huge_pte_offset() returns NULL (i.e., not yet populated).

v1.
I'm going to merge this; I'm sending it to the list for reference. It
was in -mm, minus some changes/fixes I made to resolve conflicts with
the new multiple huge page sizes.
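
One detail worth spelling out is how the slow path at the end of
get_user_pages_fast() in the patch below combines its result with the
pages the lockless walk already pinned. Pulled out into a stand-alone
helper (the name merge_gup_result is made up for this note, and the
patch keeps the logic inline), the rule looks like this:

```c
/*
 * Combine the number of pages pinned by the fast walk with the return
 * value of the fallback call. If the fast walk pinned anything, an
 * error from the fallback is hidden and the partial count is returned.
 */
static int merge_gup_result(int nr_fast, int ret_slow)
{
	if (nr_fast > 0) {
		if (ret_slow < 0)
			return nr_fast;
		return ret_slow + nr_fast;
	}
	return ret_slow;
}
```

So an error is only reported to the caller when no page at all was
pinned; otherwise the caller sees a short count and is expected to
release those pages itself.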

Index: linux-work/arch/powerpc/Kconfig
===================================================================
--- linux-work.orig/arch/powerpc/Kconfig	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/arch/powerpc/Kconfig	2008-07-30 13:27:40.000000000 +1000
@@ -42,6 +42,9 @@ config GENERIC_HARDIRQS
 	bool
 	default y
 
+config HAVE_GET_USER_PAGES_FAST
+	def_bool PPC64
+
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool PPC64
 
Index: linux-work/arch/powerpc/mm/Makefile
===================================================================
--- linux-work.orig/arch/powerpc/mm/Makefile	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/arch/powerpc/mm/Makefile	2008-07-30 13:42:42.000000000 +1000
@@ -12,7 +12,8 @@ obj-y				:= fault.o mem.o \
 				   mmu_context_$(CONFIG_WORD_SIZE).o
 hash-$(CONFIG_PPC_NATIVE)	:= hash_native_64.o
 obj-$(CONFIG_PPC64)		+= hash_utils_64.o \
-				   slb_low.o slb.o stab.o mmap.o $(hash-y)
+				   slb_low.o slb.o stab.o \
+				   gup.o mmap.o $(hash-y)
 obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o
 obj-$(CONFIG_PPC_STD_MMU)	+= hash_low_$(CONFIG_WORD_SIZE).o \
 				   tlb_$(CONFIG_WORD_SIZE).o
Index: linux-work/arch/powerpc/mm/gup.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-work/arch/powerpc/mm/gup.c	2008-07-30 14:20:00.000000000 +1000
@@ -0,0 +1,271 @@
+/*
+ * Lockless get_user_pages_fast for powerpc
+ *
+ * Copyright (C) 2008 Nick Piggin
+ * Copyright (C) 2008 Novell Inc.
+ */
+#undef DEBUG
+
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>
+#include <linux/vmstat.h>
+#include <linux/pagemap.h>
+#include <linux/rwsem.h>
+#include <asm/pgtable.h>
+
+/*
+ * The performance critical leaf functions are made noinline otherwise gcc
+ * inlines everything into a single function which results in too much
+ * register pressure.
+ */
+static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
+		unsigned long end, int write, struct page **pages, int *nr)
+{
+	unsigned long mask, result;
+	pte_t *ptep;
+
+	result = _PAGE_PRESENT|_PAGE_USER;
+	if (write)
+		result |= _PAGE_RW;
+	mask = result | _PAGE_SPECIAL;
+
+	ptep = pte_offset_kernel(&pmd, addr);
+	do {
+		pte_t pte = *ptep;
+		struct page *page;
+
+		if ((pte_val(pte) & mask) != result)
+			return 0;
+		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+		page = pte_page(pte);
+		if (!page_cache_get_speculative(page))
+			return 0;
+		if (unlikely(pte != *ptep)) {
+			put_page(page);
+			return 0;
+		}
+		pages[*nr] = page;
+		(*nr)++;
+
+	} while (ptep++, addr += PAGE_SIZE, addr != end);
+
+	return 1;
+}
+
+static noinline int gup_huge_pte(pte_t *ptep, struct hstate *hstate,
+				 unsigned long *addr, unsigned long end,
+				 int write, struct page **pages, int *nr)
+{
+	unsigned long mask;
+	unsigned long pte_end;
+	struct page *head, *page;
+	pte_t pte;
+	int refs;
+
+	pte_end = (*addr + huge_page_size(hstate)) & huge_page_mask(hstate);
+	if (pte_end < end)
+		end = pte_end;
+
+	pte = *ptep;
+	mask = _PAGE_PRESENT|_PAGE_USER;
+	if (write)
+		mask |= _PAGE_RW;
+	if ((pte_val(pte) & mask) != mask)
+		return 0;
+	/* hugepages are never "special" */
+	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+
+	refs = 0;
+	head = pte_page(pte);
+	page = head + ((*addr & ~huge_page_mask(hstate)) >> PAGE_SHIFT);
+	do {
+		VM_BUG_ON(compound_head(page) != head);
+		pages[*nr] = page;
+		(*nr)++;
+		page++;
+		refs++;
+	} while (*addr += PAGE_SIZE, *addr != end);
+
+	if (!page_cache_add_speculative(head, refs)) {
+		*nr -= refs;
+		return 0;
+	}
+	if (unlikely(pte != *ptep)) {
+		/* Could be optimized better */
+		while (*nr) {
+			put_page(page);
+			(*nr)--;
+		}
+	}
+
+	return 1;
+}
+
+static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pmd_t *pmdp;
+
+	pmdp = pmd_offset(&pud, addr);
+	do {
+		pmd_t pmd = *pmdp;
+
+		next = pmd_addr_end(addr, end);
+		if (pmd_none(pmd))
+			return 0;
+		if (!gup_pte_range(pmd, addr, next, write, pages, nr))
+			return 0;
+	} while (pmdp++, addr = next, addr != end);
+
+	return 1;
+}
+
+static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pud_t *pudp;
+
+	pudp = pud_offset(&pgd, addr);
+	do {
+		pud_t pud = *pudp;
+
+		next = pud_addr_end(addr, end);
+		if (pud_none(pud))
+			return 0;
+		if (!gup_pmd_range(pud, addr, next, write, pages, nr))
+			return 0;
+	} while (pudp++, addr = next, addr != end);
+
+	return 1;
+}
+
+int get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			struct page **pages)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long addr, len, end;
+	unsigned long next;
+	pgd_t *pgdp;
+	int psize, nr = 0;
+	unsigned int shift;
+
+	pr_debug("%s(%lx,%x,%s)\n", __func__, start, nr_pages, write ? "write" : "read");
+
+	start &= PAGE_MASK;
+	addr = start;
+	len = (unsigned long) nr_pages << PAGE_SHIFT;
+	end = start + len;
+
+	if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
+					start, len)))
+		goto slow_irqon;
+
+	pr_debug("  aligned: %lx .. %lx\n", start, end);
+
+	/* Cross a slice boundary? */
+	/* XXX could be improved by iterating slices instead */
+	if (addr < SLICE_LOW_TOP) {
+		if (end > SLICE_LOW_TOP)
+			goto slow_irqon;
+
+		if (unlikely(GET_LOW_SLICE_INDEX(addr) !=
+			     GET_LOW_SLICE_INDEX(end - 1)))
+			goto slow_irqon;
+	} else {
+		if (unlikely(GET_HIGH_SLICE_INDEX(addr) !=
+			     GET_HIGH_SLICE_INDEX(end - 1)))
+			goto slow_irqon;
+	}
+
+	/*
+	 * XXX: batch / limit 'nr', to avoid large irq off latency
+	 * needs some instrumenting to determine the common sizes used by
+	 * important workloads (eg. DB2), and whether limiting the batch size
+	 * will decrease performance.
+	 *
+	 * It seems like we're in the clear for the moment. Direct-IO is
+	 * the main guy that batches up lots of get_user_pages, and even
+	 * they are limited to 64-at-a-time which is not so many.
+	 */
+	/*
+	 * This doesn't prevent pagetable teardown, but does prevent
+	 * the pagetables from being freed on powerpc.
+	 *
+	 * So long as we atomically load page table pointers versus teardown,
+	 * we can follow the address down to the the page and take a ref on it.
+	 */
+	local_irq_disable();
+
+	psize = get_slice_psize(mm, addr);
+	shift = mmu_psize_defs[psize].shift;
+
+	if (unlikely(mmu_huge_psizes[psize])) {
+		pte_t *ptep;
+		unsigned long a = addr;
+		unsigned long sz = ((1UL) << shift);
+		struct hstate *hstate = size_to_hstate(sz);
+
+		BUG_ON(!hstate);
+		/*
+		 * XXX: could be optimized to avoid hstate
+		 * lookup entirely (just use shift)
+		 */
+
+		do {
+			VM_BUG_ON(shift != mmu_psize_defs[get_slice_psize(mm, a)].shift);
+			ptep = huge_pte_offset(mm, a);
+			pr_debug(" %016lx: huge ptep %p\n", a, ptep);
+			if (!ptep || !gup_huge_pte(ptep, hstate, &a, end, write, pages,
+						   &nr))
+				goto slow;
+		} while (a != end);
+	} else {
+		pgdp = pgd_offset(mm, addr);
+		do {
+			pgd_t pgd = *pgdp;
+
+			VM_BUG_ON(shift != mmu_psize_defs[get_slice_psize(mm, addr)].shift);
+			pr_debug("  %016lx: normal pgd %p\n", addr, (void *)pgd);
+			next = pgd_addr_end(addr, end);
+			if (pgd_none(pgd))
+				goto slow;
+			if (!gup_pud_range(pgd, addr, next, write, pages, &nr))
+				goto slow;
+		} while (pgdp++, addr = next, addr != end);
+	}
+	local_irq_enable();
+
+	VM_BUG_ON(nr != (end - start) >> PAGE_SHIFT);
+	return nr;
+
+	{
+		int ret;
+
+slow:
+		local_irq_enable();
+slow_irqon:
+		pr_debug("  slow path ! nr = %d\n", nr);
+
+		/* Try to get the remaining pages with get_user_pages */
+		start += nr << PAGE_SHIFT;
+		pages += nr;
+
+		down_read(&mm->mmap_sem);
+		ret = get_user_pages(current, mm, start,
+			(end - start) >> PAGE_SHIFT, write, 0, pages, NULL);
+		up_read(&mm->mmap_sem);
+
+		/* Have to be a bit careful with return values */
+		if (nr > 0) {
+			if (ret < 0)
+				ret = nr;
+			else
+				ret += nr;
+		}
+
+		return ret;
+	}
+}
Index: linux-work/include/asm-powerpc/pgtable-ppc64.h
===================================================================
--- linux-work.orig/include/asm-powerpc/pgtable-ppc64.h	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/include/asm-powerpc/pgtable-ppc64.h	2008-07-30 13:27:40.000000000 +1000
@@ -461,6 +461,8 @@ void pgtable_cache_init(void);
 	return pt;
 }
 
+pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long address);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
Index: linux-work/include/linux/pagemap.h
===================================================================
--- linux-work.orig/include/linux/pagemap.h	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/include/linux/pagemap.h	2008-07-30 13:27:40.000000000 +1000
@@ -142,6 +142,29 @@ static inline int page_cache_get_specula
 	return 1;
 }
 
+/*
+ * Same as above, but add instead of inc (could just be merged)
+ */
+static inline int page_cache_add_speculative(struct page *page, int count)
+{
+	VM_BUG_ON(in_interrupt());
+
+#if !defined(CONFIG_SMP) && defined(CONFIG_CLASSIC_RCU)
+# ifdef CONFIG_PREEMPT
+	VM_BUG_ON(!in_atomic());
+# endif
+	VM_BUG_ON(page_count(page) == 0);
+	atomic_add(count, &page->_count);
+
+#else
+	if (unlikely(!atomic_add_unless(&page->_count, count, 0)))
+		return 0;
+#endif
+	VM_BUG_ON(PageCompound(page) && page != compound_head(page));
+
+	return 1;
+}
+
 static inline int page_freeze_refs(struct page *page, int count)
 {
 	return likely(atomic_cmpxchg(&page->_count, count, 0) == count);


* Re: [PATCH] powerpc/mm: Lockless get_user_pages_fast()
From: Michael Ellerman @ 2008-07-30  5:06 UTC (permalink / raw)
  To: benh; +Cc: Nick Piggin, linuxppc-dev list
In-Reply-To: <1217391656.11188.292.camel@pasglop>

On Wed, 2008-07-30 at 14:20 +1000, Benjamin Herrenschmidt wrote:
> From: Nick Piggin <npiggin@suse.de>
> 
> Implement lockless get_user_pages_fast for powerpc.  Page table existence
> is guaranteed with RCU, and speculative page references are used to take a
> reference to the pages without having a prior existence guarantee on them.

> Index: linux-work/arch/powerpc/mm/gup.c
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-work/arch/powerpc/mm/gup.c	2008-07-30 14:20:00.000000000 +1000
> @@ -0,0 +1,271 @@
> +/*
> + * Lockless get_user_pages_fast for powerpc
> + *
> + * Copyright (C) 2008 Nick Piggin
> + * Copyright (C) 2008 Novell Inc.
> + */
> +#undef DEBUG
> +
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/hugetlb.h>
> +#include <linux/vmstat.h>
> +#include <linux/pagemap.h>
> +#include <linux/rwsem.h>
> +#include <asm/pgtable.h>
> +
> +/*
> + * The performance critical leaf functions are made noinline otherwise gcc
> + * inlines everything into a single function which results in too much
> + * register pressure.
> + */

This strikes me as something that is liable to change for compiler
version n+1, or n with -fsomething - and might leave us shooting
ourselves in the foot, just a thought.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


* Re: [PATCH] powerpc/mm: Lockless get_user_pages_fast()
From: Benjamin Herrenschmidt @ 2008-07-30  5:08 UTC (permalink / raw)
  To: michael; +Cc: Nick Piggin, linuxppc-dev list
In-Reply-To: <1217394382.10646.13.camel@localhost>

On Wed, 2008-07-30 at 15:06 +1000, Michael Ellerman wrote:
> > +
> > +/*
> > + * The performance critical leaf functions are made noinline otherwise gcc
> > + * inlines everything into a single function which results in too much
> > + * register pressure.
> > + */
> 
> This strikes me as something that is liable to change for compiler
> version n+1, or n with -fsomething - and might leave us shooting
> ourselves in the foot, just a thought.
> 

Not that much I'd say... In fact, I wouldn't be too worried on powerpc,
I wonder if that comment is stale from the x86 variant :-) Nick ?

Cheers,
Ben.


* [PATCH] powerpc/mm: Lockless get_user_pages_fast() for 64-bit (v3)
From: Benjamin Herrenschmidt @ 2008-07-30  5:23 UTC (permalink / raw)
  To: linuxppc-dev list; +Cc: Nick Piggin

From: Nick Piggin <npiggin@suse.de>

Implement lockless get_user_pages_fast for 64-bit powerpc.

Page table existence is guaranteed with RCU, and speculative page references
are used to take a reference to the pages without having a prior existence
guarantee on them.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v3.
Fix compile without hugetlbfs

v2.
Fix the Makefile to build gup.o only on 64-bit, and fix a bug with
huge pages where we would oops (NULL dereference) if
huge_pte_offset() returns NULL (i.e. not populated yet).

v1.
I'm going to merge this; I'm sending it to the list for reference. It was
in -mm, minus some changes/fixes I made to resolve conflicts with the
new multiple huge page sizes.
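
For readers unfamiliar with speculative page references, the idea behind
the atomic_add_unless(&page->_count, count, 0) call in the patch can be
sketched in userspace with C11 atomics. This is only an illustration under
stated assumptions: struct fake_page and ref_add_unless_zero() are
hypothetical names, not kernel APIs, and the sketch ignores RCU and the
pte recheck that the real code does after taking the reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount matters here. */
struct fake_page {
	atomic_int count;
};

/*
 * Take 'n' references unless the count is already zero, which would mean
 * the object is on its way to being freed. Returns true on success.
 * This mirrors the idea of atomic_add_unless(..., n, 0), not its kernel
 * implementation.
 */
static bool ref_add_unless_zero(struct fake_page *p, int n)
{
	int old;

	assert(n > 0);

	old = atomic_load(&p->count);
	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + n))
			return true;
	}
	return false;
}
```

A caller that gets false back has to give up on the lockless path for that
page, which is what gup_pte_range() and gup_huge_pte() do by returning 0.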

Index: linux-work/arch/powerpc/Kconfig
===================================================================
--- linux-work.orig/arch/powerpc/Kconfig	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/arch/powerpc/Kconfig	2008-07-30 13:27:40.000000000 +1000
@@ -42,6 +42,9 @@ config GENERIC_HARDIRQS
 	bool
 	default y
 
+config HAVE_GET_USER_PAGES_FAST
+	def_bool PPC64
+
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool PPC64
 
Index: linux-work/arch/powerpc/mm/Makefile
===================================================================
--- linux-work.orig/arch/powerpc/mm/Makefile	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/arch/powerpc/mm/Makefile	2008-07-30 13:42:42.000000000 +1000
@@ -12,7 +12,8 @@ obj-y				:= fault.o mem.o \
 				   mmu_context_$(CONFIG_WORD_SIZE).o
 hash-$(CONFIG_PPC_NATIVE)	:= hash_native_64.o
 obj-$(CONFIG_PPC64)		+= hash_utils_64.o \
-				   slb_low.o slb.o stab.o mmap.o $(hash-y)
+				   slb_low.o slb.o stab.o \
+				   gup.o mmap.o $(hash-y)
 obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o
 obj-$(CONFIG_PPC_STD_MMU)	+= hash_low_$(CONFIG_WORD_SIZE).o \
 				   tlb_$(CONFIG_WORD_SIZE).o
Index: linux-work/arch/powerpc/mm/gup.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-work/arch/powerpc/mm/gup.c	2008-07-30 15:17:03.000000000 +1000
@@ -0,0 +1,280 @@
+/*
+ * Lockless get_user_pages_fast for powerpc
+ *
+ * Copyright (C) 2008 Nick Piggin
+ * Copyright (C) 2008 Novell Inc.
+ */
+#undef DEBUG
+
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>
+#include <linux/vmstat.h>
+#include <linux/pagemap.h>
+#include <linux/rwsem.h>
+#include <asm/pgtable.h>
+
+/*
+ * The performance critical leaf functions are made noinline otherwise gcc
+ * inlines everything into a single function which results in too much
+ * register pressure.
+ */
+static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
+		unsigned long end, int write, struct page **pages, int *nr)
+{
+	unsigned long mask, result;
+	pte_t *ptep;
+
+	result = _PAGE_PRESENT|_PAGE_USER;
+	if (write)
+		result |= _PAGE_RW;
+	mask = result | _PAGE_SPECIAL;
+
+	ptep = pte_offset_kernel(&pmd, addr);
+	do {
+		pte_t pte = *ptep;
+		struct page *page;
+
+		if ((pte_val(pte) & mask) != result)
+			return 0;
+		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+		page = pte_page(pte);
+		if (!page_cache_get_speculative(page))
+			return 0;
+		if (unlikely(pte != *ptep)) {
+			put_page(page);
+			return 0;
+		}
+		pages[*nr] = page;
+		(*nr)++;
+
+	} while (ptep++, addr += PAGE_SIZE, addr != end);
+
+	return 1;
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+static noinline int gup_huge_pte(pte_t *ptep, struct hstate *hstate,
+				 unsigned long *addr, unsigned long end,
+				 int write, struct page **pages, int *nr)
+{
+	unsigned long mask;
+	unsigned long pte_end;
+	struct page *head, *page;
+	pte_t pte;
+	int refs;
+
+	pte_end = (*addr + huge_page_size(hstate)) & huge_page_mask(hstate);
+	if (pte_end < end)
+		end = pte_end;
+
+	pte = *ptep;
+	mask = _PAGE_PRESENT|_PAGE_USER;
+	if (write)
+		mask |= _PAGE_RW;
+	if ((pte_val(pte) & mask) != mask)
+		return 0;
+	/* hugepages are never "special" */
+	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+
+	refs = 0;
+	head = pte_page(pte);
+	page = head + ((*addr & ~huge_page_mask(hstate)) >> PAGE_SHIFT);
+	do {
+		VM_BUG_ON(compound_head(page) != head);
+		pages[*nr] = page;
+		(*nr)++;
+		page++;
+		refs++;
+	} while (*addr += PAGE_SIZE, *addr != end);
+
+	if (!page_cache_add_speculative(head, refs)) {
+		*nr -= refs;
+		return 0;
+	}
+	if (unlikely(pte != *ptep)) {
+		/* Could be optimized better */
+		*nr -= refs;
+		while (refs--)
+			put_page(head);
+		return 0;
+	}
+
+	return 1;
+}
+#endif /* CONFIG_HUGETLB_PAGE */
+
+static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pmd_t *pmdp;
+
+	pmdp = pmd_offset(&pud, addr);
+	do {
+		pmd_t pmd = *pmdp;
+
+		next = pmd_addr_end(addr, end);
+		if (pmd_none(pmd))
+			return 0;
+		if (!gup_pte_range(pmd, addr, next, write, pages, nr))
+			return 0;
+	} while (pmdp++, addr = next, addr != end);
+
+	return 1;
+}
+
+static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pud_t *pudp;
+
+	pudp = pud_offset(&pgd, addr);
+	do {
+		pud_t pud = *pudp;
+
+		next = pud_addr_end(addr, end);
+		if (pud_none(pud))
+			return 0;
+		if (!gup_pmd_range(pud, addr, next, write, pages, nr))
+			return 0;
+	} while (pudp++, addr = next, addr != end);
+
+	return 1;
+}
+
+int get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			struct page **pages)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long addr, len, end;
+	unsigned long next;
+	pgd_t *pgdp;
+	int psize, nr = 0;
+	unsigned int shift;
+
+	pr_debug("%s(%lx,%x,%s)\n", __func__, start, nr_pages, write ? "write" : "read");
+
+	start &= PAGE_MASK;
+	addr = start;
+	len = (unsigned long) nr_pages << PAGE_SHIFT;
+	end = start + len;
+
+	if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
+					start, len)))
+		goto slow_irqon;
+
+	pr_debug("  aligned: %lx .. %lx\n", start, end);
+
+#ifdef CONFIG_HUGETLB_PAGE
+	/* We bail out on slice boundary crossing when hugetlb is
+	 * enabled in order to not have to deal with two different
+	 * page table formats
+	 */
+	if (addr < SLICE_LOW_TOP) {
+		if (end > SLICE_LOW_TOP)
+			goto slow_irqon;
+
+		if (unlikely(GET_LOW_SLICE_INDEX(addr) !=
+			     GET_LOW_SLICE_INDEX(end - 1)))
+			goto slow_irqon;
+	} else {
+		if (unlikely(GET_HIGH_SLICE_INDEX(addr) !=
+			     GET_HIGH_SLICE_INDEX(end - 1)))
+			goto slow_irqon;
+	}
+#endif /* CONFIG_HUGETLB_PAGE */
+
+	/*
+	 * XXX: batch / limit 'nr', to avoid large irq off latency
+	 * needs some instrumenting to determine the common sizes used by
+	 * important workloads (eg. DB2), and whether limiting the batch size
+	 * will decrease performance.
+	 *
+	 * It seems like we're in the clear for the moment. Direct-IO is
+	 * the main guy that batches up lots of get_user_pages, and even
+	 * they are limited to 64-at-a-time which is not so many.
+	 */
+	/*
+	 * This doesn't prevent pagetable teardown, but does prevent
+	 * the pagetables from being freed on powerpc.
+	 *
+	 * So long as we atomically load page table pointers versus teardown,
+	 * we can follow the address down to the page and take a ref on it.
+	 */
+	local_irq_disable();
+
+	psize = get_slice_psize(mm, addr);
+	shift = mmu_psize_defs[psize].shift;
+
+#ifdef CONFIG_HUGETLB_PAGE
+	if (unlikely(mmu_huge_psizes[psize])) {
+		pte_t *ptep;
+		unsigned long a = addr;
+		unsigned long sz = ((1UL) << shift);
+		struct hstate *hstate = size_to_hstate(sz);
+
+		BUG_ON(!hstate);
+		/*
+		 * XXX: could be optimized to avoid hstate
+		 * lookup entirely (just use shift)
+		 */
+
+		do {
+			VM_BUG_ON(shift != mmu_psize_defs[get_slice_psize(mm, a)].shift);
+			ptep = huge_pte_offset(mm, a);
+			pr_debug(" %016lx: huge ptep %p\n", a, ptep);
+			if (!ptep || !gup_huge_pte(ptep, hstate, &a, end, write, pages,
+						   &nr))
+				goto slow;
+		} while (a != end);
+	} else
+#endif /* CONFIG_HUGETLB_PAGE */
+	{
+		pgdp = pgd_offset(mm, addr);
+		do {
+			pgd_t pgd = *pgdp;
+
+			VM_BUG_ON(shift != mmu_psize_defs[get_slice_psize(mm, addr)].shift);
+			pr_debug("  %016lx: normal pgd %p\n", addr, (void *)pgd);
+			next = pgd_addr_end(addr, end);
+			if (pgd_none(pgd))
+				goto slow;
+			if (!gup_pud_range(pgd, addr, next, write, pages, &nr))
+				goto slow;
+		} while (pgdp++, addr = next, addr != end);
+	}
+	local_irq_enable();
+
+	VM_BUG_ON(nr != (end - start) >> PAGE_SHIFT);
+	return nr;
+
+	{
+		int ret;
+
+slow:
+		local_irq_enable();
+slow_irqon:
+		pr_debug("  slow path ! nr = %d\n", nr);
+
+		/* Try to get the remaining pages with get_user_pages */
+		start += nr << PAGE_SHIFT;
+		pages += nr;
+
+		down_read(&mm->mmap_sem);
+		ret = get_user_pages(current, mm, start,
+			(end - start) >> PAGE_SHIFT, write, 0, pages, NULL);
+		up_read(&mm->mmap_sem);
+
+		/* Have to be a bit careful with return values */
+		if (nr > 0) {
+			if (ret < 0)
+				ret = nr;
+			else
+				ret += nr;
+		}
+
+		return ret;
+	}
+}
Index: linux-work/include/asm-powerpc/pgtable-ppc64.h
===================================================================
--- linux-work.orig/include/asm-powerpc/pgtable-ppc64.h	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/include/asm-powerpc/pgtable-ppc64.h	2008-07-30 13:27:40.000000000 +1000
@@ -461,6 +461,8 @@ void pgtable_cache_init(void);
 	return pt;
 }
 
+pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long address);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
Index: linux-work/include/linux/pagemap.h
===================================================================
--- linux-work.orig/include/linux/pagemap.h	2008-07-30 13:17:06.000000000 +1000
+++ linux-work/include/linux/pagemap.h	2008-07-30 13:27:40.000000000 +1000
@@ -142,6 +142,29 @@ static inline int page_cache_get_specula
 	return 1;
 }
 
+/*
+ * Same as above, but add instead of inc (could just be merged)
+ */
+static inline int page_cache_add_speculative(struct page *page, int count)
+{
+	VM_BUG_ON(in_interrupt());
+
+#if !defined(CONFIG_SMP) && defined(CONFIG_CLASSIC_RCU)
+# ifdef CONFIG_PREEMPT
+	VM_BUG_ON(!in_atomic());
+# endif
+	VM_BUG_ON(page_count(page) == 0);
+	atomic_add(count, &page->_count);
+
+#else
+	if (unlikely(!atomic_add_unless(&page->_count, count, 0)))
+		return 0;
+#endif
+	VM_BUG_ON(PageCompound(page) && page != compound_head(page));
+
+	return 1;
+}
+
 static inline int page_freeze_refs(struct page *page, int count)
 {
 	return likely(atomic_cmpxchg(&page->_count, count, 0) == count);
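
The "careful with return values" step in the slow path of
get_user_pages_fast() above can be restated as a small standalone helper.
This is a sketch for illustration only: gup_merge_result() is a
hypothetical name that does not appear in the patch, and it assumes the
fallback returns either a page count or a negative errno, as
get_user_pages() does.

```c
#include <assert.h>

/*
 * Hypothetical helper: combine the number of pages already pinned by the
 * lockless fast path ('fast_nr') with the result of the locked fallback
 * ('slow_ret', a page count or a negative errno).
 */
static int gup_merge_result(int fast_nr, int slow_ret)
{
	assert(fast_nr >= 0);

	if (fast_nr > 0) {
		/* Some pages are already pinned: report them even on error. */
		if (slow_ret < 0)
			return fast_nr;
		return fast_nr + slow_ret;
	}

	/* Nothing pinned by the fast path: pass the fallback result through. */
	return slow_ret;
}
```

The point of the rule is that a caller must always learn how many pages
were pinned, so that it can release them; an error is only reported when
no page was pinned at all.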
