LinuxPPC-Dev Archive on lore.kernel.org
* Re: How to use mpc8xxx_gpio.c device driver
From: Ira W. Snyder @ 2010-08-11 16:25 UTC (permalink / raw)
  To: Ravi Gupta; +Cc: linuxppc-dev, linuxppc-dev
In-Reply-To: <AANLkTimJ1+d7U2SN_SSHepo6u01RJqUEXUDQXUgBfCWK@mail.gmail.com>

On Wed, Aug 11, 2010 at 07:49:40PM +0530, Ravi Gupta wrote:
> Also, when I try to export a gpio in sysfs
> 
> echo 9 > /sys/class/gpio/export
> 
> It gives me an error in dmesg
> gpio_request: gpio-9 (sysfs) status -22
> export_store: status -22
> 
> Here is a look of sysfs on my machine
> 
> # ls /sys/class/gpio/ -la
> drwxr-xr-x    4 root     root            0 Jan  1 00:00 .
> drwxr-xr-x   24 root     root            0 Jan  1 00:00 ..
> --w-------    1 root     root         4096 Jan  1 00:10 export
> drwxr-xr-x    3 root     root            0 Jan  1 00:00 gpiochip192
> drwxr-xr-x    3 root     root            0 Jan  1 00:00 gpiochip224
> --w-------    1 root     root         4096 Jan  1 00:00 unexport


Your GPIO pins are numbered from 192-223 on one GPIO chip, and 224-255
on the next GPIO chip. You should be exporting GPIO pin 200 or 201
(192+8 or 192+9), depending on whether your pins are numbered from zero
or one.

"status -22" is -EINVAL: Invalid Argument. You're doing something which
is invalid, so this makes sense. There is no "pin 9".
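
The numbering rule above can be sketched in C. This is only an illustration,
not code from the driver: the helper name gpio_global_number is hypothetical,
and the base (192) and chip width (32 pins) are read off the gpiochip192 and
gpiochip224 entries in the listing quoted above.

```c
#include <stdio.h>

/*
 * Map a pin offset within one GPIO chip to the global number that
 * /sys/class/gpio/export expects. Returns -1 if the offset does not
 * belong to the chip. (Hypothetical helper, for illustration only.)
 */
int gpio_global_number(int chip_base, int chip_ngpio, int offset)
{
    if (offset < 0 || offset >= chip_ngpio)
        return -1;
    return chip_base + offset;
}

/* Print the number to export for a pin on the first chip (base 192). */
void print_export_number(int offset)
{
    printf("export %d\n", gpio_global_number(192, 32, offset));
}
```

With zero-based pin numbering, pin 9 on the first chip maps to 201, which is
the value to write to /sys/class/gpio/export instead of 9.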

Ira

^ permalink raw reply

* [PATCH] booting-without-of: Remove nonexistent chapters from TOC, fix numbering
From: Anton Vorontsov @ 2010-08-11 16:56 UTC (permalink / raw)
  To: Grant Likely; +Cc: linuxppc-dev

Marvell and GPIO bindings live in their own files, so the TOC should not
mention them.

Also fix chapters numbering.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---
 Documentation/powerpc/booting-without-of.txt |   31 +------------------------
 1 files changed, 2 insertions(+), 29 deletions(-)

diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
index 46d2210..3f454b7 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -49,40 +49,13 @@ Table of Contents
       f) MDIO on GPIOs
       g) SPI busses
 
-  VII - Marvell Discovery mv64[345]6x System Controller chips
-    1) The /system-controller node
-    2) Child nodes of /system-controller
-      a) Marvell Discovery MDIO bus
-      b) Marvell Discovery ethernet controller
-      c) Marvell Discovery PHY nodes
-      d) Marvell Discovery SDMA nodes
-      e) Marvell Discovery BRG nodes
-      f) Marvell Discovery CUNIT nodes
-      g) Marvell Discovery MPSCROUTING nodes
-      h) Marvell Discovery MPSCINTR nodes
-      i) Marvell Discovery MPSC nodes
-      j) Marvell Discovery Watch Dog Timer nodes
-      k) Marvell Discovery I2C nodes
-      l) Marvell Discovery PIC (Programmable Interrupt Controller) nodes
-      m) Marvell Discovery MPP (Multipurpose Pins) multiplexing nodes
-      n) Marvell Discovery GPP (General Purpose Pins) nodes
-      o) Marvell Discovery PCI host bridge node
-      p) Marvell Discovery CPU Error nodes
-      q) Marvell Discovery SRAM Controller nodes
-      r) Marvell Discovery PCI Error Handler nodes
-      s) Marvell Discovery Memory Controller nodes
-
-  VIII - Specifying interrupt information for devices
+  VII - Specifying interrupt information for devices
     1) interrupts property
     2) interrupt-parent property
     3) OpenPIC Interrupt Controllers
     4) ISA Interrupt Controllers
 
-  IX - Specifying GPIO information for devices
-    1) gpios property
-    2) gpio-controller nodes
-
-  X - Specifying device power management information (sleep property)
+  VIII - Specifying device power management information (sleep property)
 
   Appendix A - Sample SOC node for MPC8540
 
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 1/1] powerpc: Clear cpu_sibling_map in cpu_die
From: Brian King @ 2010-08-11 20:34 UTC (permalink / raw)
  To: benh; +Cc: brking, linuxppc-dev


While testing CPU DLPAR, the following problem was discovered.
We were DLPAR removing the first CPU, which in this case was
logical CPUs 0-3. CPUs 0-2 were already marked offline and
we were in the process of offlining CPU 3. After marking
the CPU inactive and offline in cpu_disable, but before the
cpu was completely idle (cpu_die), we ended up in __make_request
on CPU 3. There we looked at the topology map to see which CPU
to complete the I/O on and found no CPUs in the cpu_sibling_map.
This resulted in the block layer setting the completion cpu
to be NR_CPUS, which then caused an oops when we tried to
complete the I/O.

Fix this by delaying the clearing of the offlining cpu's own bit in its
sibling map until cpu_die.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---

 arch/powerpc/kernel/smp.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff -puN arch/powerpc/kernel/smp.c~powerpc_sibling_map_offline arch/powerpc/kernel/smp.c
--- linux-2.6/arch/powerpc/kernel/smp.c~powerpc_sibling_map_offline	2010-08-09 16:49:47.000000000 -0500
+++ linux-2.6-bjking1/arch/powerpc/kernel/smp.c	2010-08-09 16:49:47.000000000 -0500
@@ -598,8 +598,11 @@ int __cpu_disable(void)
 	/* Update sibling maps */
 	base = cpu_first_thread_in_core(cpu);
 	for (i = 0; i < threads_per_core; i++) {
-		cpumask_clear_cpu(cpu, cpu_sibling_mask(base + i));
-		cpumask_clear_cpu(base + i, cpu_sibling_mask(cpu));
+		if ((base + i) != cpu) {
+			cpumask_clear_cpu(cpu, cpu_sibling_mask(base + i));
+			cpumask_clear_cpu(base + i, cpu_sibling_mask(cpu));
+		}
+
 		cpumask_clear_cpu(cpu, cpu_core_mask(base + i));
 		cpumask_clear_cpu(base + i, cpu_core_mask(cpu));
 	}
@@ -641,6 +644,8 @@ void cpu_hotplug_driver_unlock()
 
 void cpu_die(void)
 {
+	cpumask_clear_cpu(smp_processor_id(), cpu_sibling_mask(smp_processor_id()));
+
 	if (ppc_md.cpu_die)
 		ppc_md.cpu_die();
 }
_
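
The effect described above can be reproduced with a toy userspace model. This
is a sketch under stated assumptions, not kernel code: sibling maps are plain
bitmasks, one core has four threads, and all names (sibling_mask, first_cpu,
toy_cpu_disable) are invented for the illustration. first_cpu() returns
NR_CPUS for an empty mask, which is the out-of-range value the description
above says the block layer ended up with.

```c
#define NR_CPUS 4   /* toy value: one core with four threads */

/* sibling_mask[n] has bit m set when CPU m is a sibling of CPU n */
unsigned int sibling_mask[NR_CPUS];

/* Lowest CPU set in mask, or NR_CPUS when the mask is empty */
int first_cpu(unsigned int mask)
{
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return NR_CPUS;
}

/*
 * Model of the sibling-map loop in __cpu_disable(). With keep_self set,
 * the CPU's own bit survives, as in the patched code.
 */
void toy_cpu_disable(int cpu, int keep_self)
{
    int i;

    for (i = 0; i < NR_CPUS; i++) {
        if (keep_self && i == cpu)
            continue;
        sibling_mask[i] &= ~(1u << cpu);
        sibling_mask[cpu] &= ~(1u << i);
    }
}

/* Offline CPUs 0..3 in order, then look up a sibling for the last one. */
int completion_cpu_for_last_thread(int keep_self)
{
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        sibling_mask[cpu] = (1u << NR_CPUS) - 1;
    for (cpu = 0; cpu < NR_CPUS; cpu++)
        toy_cpu_disable(cpu, keep_self);
    return first_cpu(sibling_mask[NR_CPUS - 1]);
}
```

In this model, clearing everything leaves CPU 3 with an empty map, so the
lookup falls off the end; keeping the CPU's own bit makes the lookup return
CPU 3 itself.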

^ permalink raw reply

* Query regarding 2.6.33.5 RT [Ingo's] and non-RT performance
From: Manikandan Ramachandran @ 2010-08-11 22:18 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3340 bytes --]

Hello All,

    I created a very simple program that runs a tight loop at a higher
priority than normal tasks. Under the same test environment I ran this
program on both the non-RT and RT 2.6.33.5 kernels. To my surprise, the
non-RT kernel performed better than the RT kernel: the non-RT kernel took
3 sec and 366156 usec, while the RT kernel took about 3 sec and 418011 usec.
Can someone please explain why the non-RT kernel performs better than the RT
kernel? On the face of the test results, RT seems to have more overhead. Is
there any configuration I could change to bring down the overhead?

Processor:
----------------
processor       : 0
cpu             : 7448
clock           : 996.000000MHz
revision        : 2.2 (pvr 8004 0202)
bogomips        : 83.10
processor       : 1
cpu             : 7448
clock           : 996.000000MHz
revision        : 2.2 (pvr 8004 0202)
bogomips        : 83.10

CFS optimization:
--------------------------
# cat /proc/sys/kernel/sched_rt_runtime_us
1000000
# cat /proc/sys/kernel/sched_rt_period_us
1000000
# cat /proc/sys/kernel/sched_compat_yield
1

Test Program:
---------------------

main()
{

    int sched_rr_min,sched_rr_max;
    struct sched_param scheduling_parameters;
    struct timeval tv,late_tv;
    suseconds_t usec_diff,avg_usec = 0;
    time_t sec_diff, avg_sec = 0;
    int i;
    long count = 1;

    sched_rr_min = sched_get_priority_min(SCHED_RR);
    sched_rr_max = sched_get_priority_max(SCHED_RR);
    scheduling_parameters.sched_priority = sched_rr_min+4;
    /* Run the process with the given priority */
    sched_setscheduler(0, SCHED_RR, &scheduling_parameters);


    for(i = 0 ; i < 150 ; i++) {
       gettimeofday(&tv, NULL);
       while(count > 0){
        //printf(".");
        count++;
       }
       gettimeofday(&late_tv, NULL);
       count = 1;
       sec_diff = (late_tv.tv_sec - tv.tv_sec);
       avg_sec += sec_diff;
       usec_diff = (late_tv.tv_usec > tv.tv_usec) ?
                   (late_tv.tv_usec - tv.tv_usec) :
                   (tv.tv_usec - late_tv.tv_usec);
       avg_usec += usec_diff;
       printf("Iteration #%d sec %x usec %x\n",i,(sec_diff),(usec_diff));
    }
       printf("Average of #%d sec %x usec %x\n",i,(avg_sec/i),(avg_usec)/i);
}

Partial Result of non-rt kernel:
-------------------------------------------

Iteration #140 sec 3 usec 3aef8
Iteration #141 sec 3 usec 3aefe
Iteration #142 sec 3 usec 3aee4
*Iteration #143 sec 4 usec b935b  [Why is there this periodic bump? Scheduler at work?]*
Iteration #144 sec 3 usec 3aef2
Iteration #145 sec 3 usec 3aef0
Iteration #146 sec 3 usec 3aef4
*Iteration #147 sec 4 usec b934b*
Iteration #148 sec 3 usec 3aeed
Iteration #149 sec 3 usec 3aef9

 Partial Result of rt kernel:
-------------------------------------------
Iteration #135 sec 3 usec 47328
*Iteration #136 sec 4 usec ac4fd*
Iteration #137 sec 3 usec 48b0b
Iteration #138 sec 3 usec 4738c
Iteration #139 sec 4 usec ac4d5
Iteration #140 sec 3 usec 483cb
Iteration #141 sec 3 usec 48500
*Iteration #142 sec 4 usec acc49*
Iteration #143 sec 3 usec 47c1f
Iteration #144 sec 3 usec 478c2
Iteration #145 sec 3 usec 47e48
Iteration #146 sec 4 usec ac9b5
Iteration #147 sec 3 usec 48de4
Iteration #148 sec 3 usec 46fbe
Iteration #149 sec 4 usec ac52e
Average of #150 sec 3 usec 660db

Thanks,
Mani


-- 
Thanks,
Manik

Think twice about a tree before you take a printout

[-- Attachment #2: Type: text/html, Size: 6395 bytes --]

^ permalink raw reply

* [039/111] irq: Add new IRQ flag IRQF_NO_SUSPEND
From: Greg KH @ 2010-08-11 23:54 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jeremy Fitzhardinge, xen-devel, Thomas Gleixner, Ian Campbell,
	devicetree-discuss, Dmitry Torokhov, linuxppc-dev, Paul Mackerras,
	linux-input, akpm, torvalds, stable-review, alan
In-Reply-To: <20100811235623.GA24440@kroah.com>

2.6.32-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Ian Campbell <ian.campbell@citrix.com>

commit 685fd0b4ea3f0f1d5385610b0d5b57775a8d5842 upstream.

A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.

Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/linux/interrupt.h |    7 ++++++-
 kernel/irq/manage.c       |    2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -52,16 +52,21 @@
  * IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
  *                Used by threaded interrupts which need to keep the
  *                irq line disabled until the threaded handler has been run.
+ * IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ *
  */
 #define IRQF_DISABLED		0x00000020
 #define IRQF_SAMPLE_RANDOM	0x00000040
 #define IRQF_SHARED		0x00000080
 #define IRQF_PROBE_SHARED	0x00000100
-#define IRQF_TIMER		0x00000200
+#define __IRQF_TIMER		0x00000200
 #define IRQF_PERCPU		0x00000400
 #define IRQF_NOBALANCING	0x00000800
 #define IRQF_IRQPOLL		0x00001000
 #define IRQF_ONESHOT		0x00002000
+#define IRQF_NO_SUSPEND		0x00004000
+
+#define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)
 
 /*
  * Bits used by threaded handlers:
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -200,7 +200,7 @@ static inline int setup_affinity(unsigne
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
 	if (suspend) {
-		if (!desc->action || (desc->action->flags & IRQF_TIMER))
+		if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND))
 			return;
 		desc->status |= IRQ_SUSPENDED;
 	}
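
The flag arithmetic in this patch can be checked with a small userspace
sketch. The values are copied from the hunk above; IRQF_TIMER_BIT stands in
for the kernel's __IRQF_TIMER (renamed here only to avoid a reserved
identifier in userspace), and stays_enabled_on_suspend() is a hypothetical
helper that mirrors the patched test in __disable_irq().

```c
/* Flag values copied from the patch; IRQF_TIMER_BIT models __IRQF_TIMER */
#define IRQF_SHARED      0x00000080
#define IRQF_TIMER_BIT   0x00000200
#define IRQF_NO_SUSPEND  0x00004000
#define IRQF_TIMER       (IRQF_TIMER_BIT | IRQF_NO_SUSPEND)

/*
 * Nonzero when an action with these flags is skipped by the suspend path,
 * i.e. the IRQ is left enabled across suspend.
 */
int stays_enabled_on_suspend(unsigned long flags)
{
    return (flags & IRQF_NO_SUSPEND) != 0;
}
```

Existing IRQF_TIMER users keep their behaviour because the composite flag
includes IRQF_NO_SUSPEND, while a non-timer interrupt can now ask for the same
treatment by passing IRQF_NO_SUSPEND alone.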

^ permalink raw reply

* [48/54] irq: Add new IRQ flag IRQF_NO_SUSPEND
From: Greg KH @ 2010-08-12  0:01 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jeremy Fitzhardinge, xen-devel, Thomas Gleixner, Ian Campbell,
	devicetree-discuss, Dmitry Torokhov, linuxppc-dev, Paul Mackerras,
	linux-input, akpm, torvalds, stable-review, alan
In-Reply-To: <20100812000249.GA30948@kroah.com>

2.6.34-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Ian Campbell <ian.campbell@citrix.com>

commit 685fd0b4ea3f0f1d5385610b0d5b57775a8d5842 upstream.

A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.

Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/linux/interrupt.h |    7 ++++++-
 kernel/irq/manage.c       |    2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -52,16 +52,21 @@
  * IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
  *                Used by threaded interrupts which need to keep the
  *                irq line disabled until the threaded handler has been run.
+ * IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ *
  */
 #define IRQF_DISABLED		0x00000020
 #define IRQF_SAMPLE_RANDOM	0x00000040
 #define IRQF_SHARED		0x00000080
 #define IRQF_PROBE_SHARED	0x00000100
-#define IRQF_TIMER		0x00000200
+#define __IRQF_TIMER		0x00000200
 #define IRQF_PERCPU		0x00000400
 #define IRQF_NOBALANCING	0x00000800
 #define IRQF_IRQPOLL		0x00001000
 #define IRQF_ONESHOT		0x00002000
+#define IRQF_NO_SUSPEND		0x00004000
+
+#define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)
 
 /*
  * Bits used by threaded handlers:
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -200,7 +200,7 @@ static inline int setup_affinity(unsigne
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
 	if (suspend) {
-		if (!desc->action || (desc->action->flags & IRQF_TIMER))
+		if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND))
 			return;
 		desc->status |= IRQ_SUSPENDED;
 	}

^ permalink raw reply

* [64/67] irq: Add new IRQ flag IRQF_NO_SUSPEND
From: Greg KH @ 2010-08-12  0:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jeremy Fitzhardinge, xen-devel, Thomas Gleixner, Ian Campbell,
	devicetree-discuss, Dmitry Torokhov, linuxppc-dev, Paul Mackerras,
	linux-input, akpm, torvalds, stable-review, alan
In-Reply-To: <20100812000641.GA6348@kroah.com>

2.6.35-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Ian Campbell <ian.campbell@citrix.com>

commit 685fd0b4ea3f0f1d5385610b0d5b57775a8d5842 upstream.

A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.

Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/linux/interrupt.h |    7 ++++++-
 kernel/irq/manage.c       |    2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -53,16 +53,21 @@
  * IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
  *                Used by threaded interrupts which need to keep the
  *                irq line disabled until the threaded handler has been run.
+ * IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ *
  */
 #define IRQF_DISABLED		0x00000020
 #define IRQF_SAMPLE_RANDOM	0x00000040
 #define IRQF_SHARED		0x00000080
 #define IRQF_PROBE_SHARED	0x00000100
-#define IRQF_TIMER		0x00000200
+#define __IRQF_TIMER		0x00000200
 #define IRQF_PERCPU		0x00000400
 #define IRQF_NOBALANCING	0x00000800
 #define IRQF_IRQPOLL		0x00001000
 #define IRQF_ONESHOT		0x00002000
+#define IRQF_NO_SUSPEND		0x00004000
+
+#define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)
 
 /*
  * Bits used by threaded handlers:
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -216,7 +216,7 @@ static inline int setup_affinity(unsigne
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
 	if (suspend) {
-		if (!desc->action || (desc->action->flags & IRQF_TIMER))
+		if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND))
 			return;
 		desc->status |= IRQ_SUSPENDED;
 	}

^ permalink raw reply

* [PATCH] powerpc: Fix bogus it_blocksize in VIO iommu code
From: Anton Blanchard @ 2010-08-12  2:42 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev


When looking at some issues with the virtual ethernet driver I noticed
that TCE allocation was following a very strange pattern:

address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048

Huge unexplained gaps in what should be an empty TCE table. It turns out
it_blocksize, the amount we want to align the next allocation to, was
c0000000fe903b20. Completely bogus.

Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc
everywhere to protect against this when we next add a non-compulsory
field to the iommu code and forget to initialise it.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: powerpc.git/arch/powerpc/kernel/vio.c
===================================================================
--- powerpc.git.orig/arch/powerpc/kernel/vio.c	2010-08-12 12:27:58.674490962 +1000
+++ powerpc.git/arch/powerpc/kernel/vio.c	2010-08-12 12:28:18.660741428 +1000
@@ -1059,7 +1059,7 @@ static struct iommu_table *vio_build_iom
 	if (!dma_window)
 		return NULL;
 
-	tbl = kmalloc(sizeof(*tbl), GFP_KERNEL);
+	tbl = kzalloc(sizeof(*tbl), GFP_KERNEL);
 	if (tbl == NULL)
 		return NULL;
 
@@ -1072,6 +1072,7 @@ static struct iommu_table *vio_build_iom
 	tbl->it_offset = offset >> IOMMU_PAGE_SHIFT;
 	tbl->it_busno = 0;
 	tbl->it_type = TCE_VB;
+	tbl->it_blocksize = 16;
 
 	return iommu_init_table(tbl, -1);
 }
Index: powerpc.git/arch/powerpc/platforms/iseries/iommu.c
===================================================================
--- powerpc.git.orig/arch/powerpc/platforms/iseries/iommu.c	2010-08-12 12:29:35.473241172 +1000
+++ powerpc.git/arch/powerpc/platforms/iseries/iommu.c	2010-08-12 12:29:50.190890563 +1000
@@ -184,7 +184,7 @@ static void pci_dma_dev_setup_iseries(st
 
 	BUG_ON(lsn == NULL);
 
-	tbl = kmalloc(sizeof(struct iommu_table), GFP_KERNEL);
+	tbl = kzalloc(sizeof(struct iommu_table), GFP_KERNEL);
 
 	iommu_table_getparms_iSeries(pdn->busno, *lsn, 0, tbl);
 
Index: powerpc.git/arch/powerpc/platforms/pseries/iommu.c
===================================================================
--- powerpc.git.orig/arch/powerpc/platforms/pseries/iommu.c	2010-08-12 12:28:45.340756738 +1000
+++ powerpc.git/arch/powerpc/platforms/pseries/iommu.c	2010-08-12 12:29:15.401118951 +1000
@@ -403,7 +403,7 @@ static void pci_dma_bus_setup_pSeries(st
 	pci->phb->dma_window_size = 0x8000000ul;
 	pci->phb->dma_window_base_cur = 0x8000000ul;
 
-	tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 			   pci->phb->node);
 
 	iommu_table_setparms(pci->phb, dn, tbl);
@@ -448,7 +448,7 @@ static void pci_dma_bus_setup_pSeriesLP(
 		 pdn->full_name, ppci->iommu_table);
 
 	if (!ppci->iommu_table) {
-		tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 				   ppci->phb->node);
 		iommu_table_setparms_lpar(ppci->phb, pdn, tbl, dma_window,
 			bus->number);
@@ -478,7 +478,7 @@ static void pci_dma_dev_setup_pSeries(st
 		struct pci_controller *phb = PCI_DN(dn)->phb;
 
 		pr_debug(" --> first child, no bridge. Allocating iommu table.\n");
-		tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 				   phb->node);
 		iommu_table_setparms(phb, dn, tbl);
 		PCI_DN(dn)->iommu_table = iommu_init_table(tbl, phb->node);
@@ -544,7 +544,7 @@ static void pci_dma_dev_setup_pSeriesLP(
 
 	pci = PCI_DN(pdn);
 	if (!pci->iommu_table) {
-		tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 				   pci->phb->node);
 		iommu_table_setparms_lpar(pci->phb, pdn, tbl, dma_window,
 			pci->phb->bus->number);
Index: powerpc.git/arch/powerpc/platforms/cell/iommu.c
===================================================================
--- powerpc.git.orig/arch/powerpc/platforms/cell/iommu.c	2010-08-12 12:31:27.040741891 +1000
+++ powerpc.git/arch/powerpc/platforms/cell/iommu.c	2010-08-12 12:31:34.641324320 +1000
@@ -477,7 +477,7 @@ cell_iommu_setup_window(struct cbe_iommu
 
 	ioid = cell_iommu_get_ioid(np);
 
-	window = kmalloc_node(sizeof(*window), GFP_KERNEL, iommu->nid);
+	window = kzalloc_node(sizeof(*window), GFP_KERNEL, iommu->nid);
 	BUG_ON(window == NULL);
 
 	window->offset = offset;
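
The reason for switching to zeroing allocators can be shown with a userspace
analogue. This is a sketch under stated assumptions: malloc and calloc stand
in for kmalloc and kzalloc, the 0xA5 fill imitates stale heap contents, and
struct toy_table with its setup function is invented for the illustration and
is not the kernel's struct iommu_table.

```c
#include <stdlib.h>
#include <string.h>

/* Toy table with one field that the setup code forgets about */
struct toy_table {
    unsigned long it_offset;
    unsigned long it_size;
    unsigned long it_blocksize;
};

/* Setup routine that initialises only the fields it knows about */
void toy_setparms(struct toy_table *tbl)
{
    tbl->it_offset = 0;
    tbl->it_size = 512;
}

/*
 * Allocate a table, run the setup routine and report it_blocksize.
 * zeroed != 0 models kzalloc; zeroed == 0 models kmalloc handing back
 * memory with stale contents (imitated here with a 0xA5 fill).
 */
unsigned long blocksize_after_setup(int zeroed)
{
    struct toy_table *tbl;
    unsigned long blocksize;

    if (zeroed) {
        tbl = calloc(1, sizeof(*tbl));
        if (tbl == NULL)
            return 0;
    } else {
        tbl = malloc(sizeof(*tbl));
        if (tbl == NULL)
            return 0;
        memset(tbl, 0xA5, sizeof(*tbl));
    }

    toy_setparms(tbl);
    blocksize = tbl->it_blocksize;
    free(tbl);
    return blocksize;
}
```

With the zeroing allocator a forgotten field reads as 0, which the rest of the
code can treat as "not set"; without it the field holds whatever was in memory
before, as with the bogus it_blocksize value quoted in the description.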

^ permalink raw reply

* RE: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
From: Zang Roy-R61911 @ 2010-08-12  4:00 UTC (permalink / raw)
  To: Zang Roy-R61911, akpm, linux-mmc; +Cc: linuxppc-dev, mirqus


> -----Original Message-----
> From: Zang Roy-R61911
> Sent: Wednesday, August 11, 2010 12:47 PM
> To: Zang Roy-R61911; akpm@linux-foundation.org;
> linux-mmc@vger.kernel.org
> Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;
> cbouatmailru@gmail.com; grant.likely@secretlab.ca
> Subject: RE: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for
> more clear
>
>
>
> > -----Original Message-----
> > From: Zang Roy-R61911
> > Sent: Tuesday, August 10, 2010 17:47 PM
> > To: akpm@linux-foundation.org; linux-mmc@vger.kernel.org
> > Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;
> > cbouatmailru@gmail.com; grant.likely@secretlab.ca
> > Subject: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
> >
> > Change ACMD12 to AUTO_CMD12 to reduce the confusion.
> > ACMD12 might be confused with MMC/SD App CMD 12 (CMD55+CMD12 combo).
> >
> > Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> > ---
> >  drivers/mmc/host/sdhci-of-core.c |    2 +-
> >  drivers/mmc/host/sdhci.c         |    8 ++++----
> >  drivers/mmc/host/sdhci.h         |   10 +++++-----
> >  3 files changed, 10 insertions(+), 10 deletions(-)
> Andrew,
> Could you help to pick up this minor fix?
> Thanks.
> Roy
Any update?
Thanks.
Roy

^ permalink raw reply

* Re: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
From: Grant Likely @ 2010-08-12  4:21 UTC (permalink / raw)
  To: Zang Roy-R61911; +Cc: linux-mmc, linuxppc-dev, mirqus, akpm
In-Reply-To: <3850A844E6A3854C827AC5C0BEC7B60A0565B8@zch01exm23.fsl.freescale.net>

On Wed, Aug 11, 2010 at 10:00 PM, Zang Roy-R61911 <r61911@freescale.com> wrote:
>
>
>> -----Original Message-----
>> From: Zang Roy-R61911
>> Sent: Wednesday, August 11, 2010 12:47 PM
>> To: Zang Roy-R61911; akpm@linux-foundation.org;
>> linux-mmc@vger.kernel.org
>> Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;
>> cbouatmailru@gmail.com; grant.likely@secretlab.ca
>> Subject: RE: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for
>> more clear
>>
>>
>>
>> > -----Original Message-----
>> > From: Zang Roy-R61911
>> > Sent: Tuesday, August 10, 2010 17:47 PM
>> > To: akpm@linux-foundation.org; linux-mmc@vger.kernel.org
>> > Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;
>> > cbouatmailru@gmail.com; grant.likely@secretlab.ca
>> > Subject: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
>> >
>> > Change ACMD12 to AUTO_CMD12 to reduce the confusion.
>> > ACMD12 might be confused with MMC/SD App CMD 12 (CMD55+CMD12 combo).
>> >
>> > Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
>> > ---
>> >  drivers/mmc/host/sdhci-of-core.c |    2 +-
>> >  drivers/mmc/host/sdhci.c         |    8 ++++----
>> >  drivers/mmc/host/sdhci.h         |   10 +++++-----
>> >  3 files changed, 10 insertions(+), 10 deletions(-)
>> Andrew,
>> Could you help to pick up this minor fix?
>> Thanks.
>> Roy
> Any update?
> Thanks.
> Roy

Patience Roy.  You only sent the patch 1 day ago.

g.

^ permalink raw reply

* Looking for a tutorial on the use of the new of_??? init functions
From: LEROY Christophe @ 2010-08-12  8:13 UTC (permalink / raw)
  To: LinuxPPC-dev

Hello,

Is there a tutorial or a HOWTO somewhere explaining the use of
those new of_platform_xxx() and other of_xxx() functions in the init of
drivers? It looks like a very nice way to write drivers in Linux 2.6,
but a little help would be welcome.

Regards
Christophe

^ permalink raw reply

* [PATCH] fs-enet/mac-fec: Restore multicast and promiscuous settings during restart
From: Wolfgang Ocker @ 2010-08-12  8:26 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Wolfgang Ocker, Vitaly Bordug

Signed-off-by: Wolfgang Ocker <weo@reccoware.de>
---
 drivers/net/fs_enet/mac-fec.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fs_enet/mac-fec.c b/drivers/net/fs_enet/mac-fec.c
index 7ca1642..05f4bb1 100644
--- a/drivers/net/fs_enet/mac-fec.c
+++ b/drivers/net/fs_enet/mac-fec.c
@@ -344,6 +344,9 @@ static void restart(struct net_device *dev)
 	FW(fecp, imask, FEC_ENET_TXF | FEC_ENET_TXB |
 	   FEC_ENET_RXF | FEC_ENET_RXB);
 
+	/* Restore multicast and promiscuous settings */
+	set_multicast_list(dev);
+
 	/*
 	 * And last, enable the transmit and receive processing.
 	 */
-- 
1.7.2.1

^ permalink raw reply related

* Re: How to use mpc8xxx_gpio.c device driver
From: Ravi Gupta @ 2010-08-12 10:25 UTC (permalink / raw)
  To: MJ embd; +Cc: linuxppc-dev, linuxppc-dev
In-Reply-To: <AANLkTikNt7hrxA+QR-omUiiLKVBnjqCw+HTDQh_5B5Ff@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3086 bytes --]

On Wed, Aug 11, 2010 at 9:45 PM, MJ embd <mj.embd@gmail.com> wrote:

> u can directly access GPIO registers in kernel, by ioremap of GPIO
> memory mapped registers.
> you might need to check
> - muxing of gpio
>
> -mj
>

Hi MJ,

Thanks for the reply.
I tried memory mapping but it fails, here is my code :

#include <linux/module.h>
#include <linux/errno.h>    /* error codes */
#include <linux/mm.h>

void __iomem *ioaddr = NULL;

static __init int sample_module_init(void)
{
    ioaddr = ioremap(0xFF400C00, 0x24);
    if(ioaddr == NULL) {
        printk(KERN_WARNING "ioremap failed\n");
    }
    printk(KERN_WARNING "ioremap successed\n");
    printk(KERN_WARNING "GP1DIR = %u\n", ioread32(ioaddr));
    return 0;
}

static __exit void sample_module_exit(void)
{
    iounmap(ioaddr);
}

MODULE_LICENSE("GPL");
module_init(sample_module_init);
module_exit(sample_module_exit);

As per the MPC8377ERDB data sheet, the default IMMRBAR address is 0xFF40_0000
and the offset of GPIO1 is 0x0C00. Each GPIO block has programmable registers
that occupy 24 bytes of memory-mapped space, so I mapped 24 bytes (0x18)
starting from address 0xFF40_0C00. But when I try to read values from the
mapped memory I get the following errors. Is there something I am missing?
Any help with reference to the MPC8377ERDB board would be highly appreciated.

# tftp -l ~/immrbar.ko -r immrbar.ko -g 10.20.50.70
# insmod ./immrbar.ko
[  717.825241] ioremap successed
[  717.849215] Machine check in kernel mode.
[  717.853220] Caused by (from SRR1=41000): Transfer error ack signal
[  717.859405] Oops: Machine check, sig: 7 [#1]
[  717.863668] MPC837x RDB
[  717.866106] Modules linked in: immrbar(+)
[  717.870119] NIP: 00000900 LR: d1034054 CTR: c0014d50
[  717.875079] REGS: cf895d00 TRAP: 0200   Not tainted  (2.6.28.9)
[  717.880992] MSR: 00041000 <ME>  CR: 24000082  XER: 20000000
[  717.886578] TASK = cf8e8640[647] 'insmod' THREAD: cf894000
[  717.891882] GPR00: d103404c cf895db0 cf8e8640 00000000 000023d5 ffffffff c01e04f4 00020000
[  717.900265] GPR08: 00000001 c0383f3c 000023d5 c0014d50 4c72ff56 10019100 100777e0 1007ea98
[  717.908650] GPR16: 10077834 100a0000 100a0000 100a0000 bfaf4828 00000000 1009f23c 10000cfc
[  717.917034] GPR24: 10000d00 10000d24 10012008 c03650e8 00000000 d1034000 10012018 d1030000
[  717.925598] NIP [00000900] 0x900
[  717.928828] LR [d1034054] sample_module_init+0x54/0xc0 [immrbar]
[  717.934828] Call Trace:
[  717.937273] [cf895db0] [d103404c] sample_module_init+0x4c/0xc0 [immrbar] (unreliable)
[  717.945115] [cf895dc0] [c00038a0] do_one_initcall+0x64/0x18c
[  717.950780] [cf895f20] [c004d7b8] sys_init_module+0xac/0x19c
[  717.956441] [cf895f40] [c00122f0] ret_from_syscall+0x0/0x38
[  717.962013] --- Exception: c01 at 0x48043f6c
[  717.962017]     LR = 0x100009cc
[  717.969407] Instruction dump:
[  717.972370] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 00000000 XXXXXXXX XXXXXXXX XXXXXXXX
[  717.980140] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 7d5043a6 XXXXXXXX XXXXXXXX XXXXXXXX
[  717.987919] ---[ end trace a47be794e2873cef ]---

Thanks in advance
Ravi Gupta

[-- Attachment #2: Type: text/html, Size: 3680 bytes --]

^ permalink raw reply

* Running out of SDHCI quirk space (Re: [PATCH 1/3 v2] sdhci: Add auto CMD12 support for eSDHC driver)
From: Matt Fleming @ 2010-08-12 11:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ben Dooks, linux-mmc, linuxppc-dev

On Tue, Aug 03, 2010 at 04:43:46PM -0700, Andrew Morton wrote:
> On Tue, 3 Aug 2010 11:11:10 +0800
> Roy Zang <tie-fei.zang@freescale.com> wrote:
> 
> > --- a/drivers/mmc/host/sdhci.h
> > +++ b/drivers/mmc/host/sdhci.h
> > @@ -240,6 +240,8 @@ struct sdhci_host {
> >  #define SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN		(1<<25)
> >  /* Controller cannot support End Attribute in NOP ADMA descriptor */
> >  #define SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC		(1<<26)
> > +/* Controller uses Auto CMD12 command to stop the transfer */
> > +#define SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12		(1<<27)
> 
> This becomes 1<<29 in my tree.
> 
> We're about to run out.  What happens then?

I've been wondering for a while now if many of the quirks should be
hidden behind function pointers. While we could of course extend the
quirk space, I think that's kinda missing the point that quirks are
being used too liberally. Take SDHCI_QUIRK_SINGLE_POWER_WRITE in
drivers/mmc/host/sdhci.c:sdhci_set_power(). Really, that quirk should
probably be hidden inside a set_power() function in the sdhci_ops
structure.
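
To make the idea concrete, here is a minimal userspace sketch of the two
styles. It is a model only, not the sdhci code: every model_* name, the
quirk bit value and the "power register" are assumptions made up for the
illustration. Only the idea of moving a quirk test into an optional
set_power() hook in an ops structure comes from the paragraph above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; none of these names exist in drivers/mmc/host. */
#define MODEL_QUIRK_SINGLE_POWER_WRITE (1u << 0)

struct model_host;

struct model_ops {
	/* Optional: a controller with odd power sequencing overrides this. */
	void (*set_power)(struct model_host *host, uint8_t pwr);
};

struct model_host {
	uint32_t quirks;		/* a 32-bit field holds at most 32 quirks */
	const struct model_ops *ops;
	uint8_t power_reg;		/* stands in for the power control register */
	unsigned int reg_writes;	/* number of register writes performed */
};

static void model_write_power(struct model_host *host, uint8_t val)
{
	host->power_reg = val;
	host->reg_writes++;
}

/* Quirk style: the generic code tests a flag bit for every oddity. */
void set_power_with_quirk(struct model_host *host, uint8_t pwr)
{
	if (!(host->quirks & MODEL_QUIRK_SINGLE_POWER_WRITE))
		model_write_power(host, 0);	/* default: clear first */
	model_write_power(host, pwr);
}

/* Default sequence used when the driver provides no callback. */
static void default_set_power(struct model_host *host, uint8_t pwr)
{
	model_write_power(host, 0);
	model_write_power(host, pwr);
}

/* Ops style: generic code calls the hook if present, else the default. */
void set_power_with_ops(struct model_host *host, uint8_t pwr)
{
	if (host->ops && host->ops->set_power)
		host->ops->set_power(host, pwr);
	else
		default_set_power(host, pwr);
}

/* What a driver that needs a single write would supply as its hook. */
void single_write_set_power(struct model_host *host, uint8_t pwr)
{
	model_write_power(host, pwr);
}
```

In the ops style the generic path no longer spends a bit of the quirks
word on this behaviour; the odd controller just installs
single_write_set_power() as its hook.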

I'm gonna have a go at trying to remove some of the quirks that don't
make sense being quirks. I'll post the series when I'm done.

Does anyone think that this approach is crazy?


* Flash Programmer Problem in Code Warrior
From: Naresh Reddy Sankapelly @ 2010-08-12 13:40 UTC (permalink / raw)
  To: linuxppc-dev

Hi,
I am trying to program NOR flash (M29DW323DT) on an MPC8321 board. I have
imported the details of the flash into FPDeviceConfig.xml. When I run
Program/Verify Flash, it takes a very long time (hours). I could not
figure out the reason for that. Kindly let me know how to troubleshoot
this.

-- 
Thanks and Regards
Naresh Reddy S.
Noida, 9873240342


* Re: How to use mpc8xxx_gpio.c device driver
From: Ira W. Snyder @ 2010-08-12 15:36 UTC (permalink / raw)
  To: Ravi Gupta; +Cc: linuxppc-dev, MJ embd, linuxppc-dev
In-Reply-To: <AANLkTi=-YiWtBqCGSt-h8dd3WwTMVJcDitvROCDBEMdw@mail.gmail.com>

On Thu, Aug 12, 2010 at 03:55:49PM +0530, Ravi Gupta wrote:
> On Wed, Aug 11, 2010 at 9:45 PM, MJ embd <mj.embd@gmail.com> wrote:
> 
> > You can directly access the GPIO registers in the kernel by ioremap()ing
> > the memory-mapped GPIO registers.
> > You might also need to check
> > - muxing of gpio
> >
> > -mj
> >
> 
> Hi MJ,
> 
> Thanks for the reply.
> I tried memory mapping but it fails, here is my code :
> 
> #include <linux/module.h>
> #include <linux/init.h>
> #include <linux/kernel.h>
> #include <linux/errno.h>    /* error codes */
> #include <linux/io.h>       /* ioremap, iounmap, ioread32 */
>
> static void __iomem *ioaddr;
>
> static __init int sample_module_init(void)
> {
>     /* GPIO1 block: default IMMRBAR (0xFF400000) + 0xC00, 24 bytes (0x18) */
>     ioaddr = ioremap(0xFF400C00, 0x18);
>     if (ioaddr == NULL) {
>         printk(KERN_WARNING "ioremap failed\n");
>         return -ENOMEM;    /* do not read through a NULL mapping */
>     }
>     printk(KERN_WARNING "ioremap successed\n");
>     printk(KERN_WARNING "GP1DIR = %u\n", ioread32(ioaddr));
>     return 0;
> }
>
> static __exit void sample_module_exit(void)
> {
>     iounmap(ioaddr);
> }
>
> MODULE_LICENSE("GPL");
> module_init(sample_module_init);
> module_exit(sample_module_exit);
> 
> As per the MPC8377ERDB data sheet, the default IMMRBAR address is 0xFF40_0000,
> the offset of GPIO1 is 0x0C00, and each GPIO block has programmable registers
> that occupy 24 bytes of memory-mapped space, so I mapped 24 bytes (0x18)
> starting at address 0xFF40_0C00. But when I try to read a value from the
> mapped memory I get the following errors. Is there something I am missing?
> Any help with reference to the MPC8377ERDB board will be highly appreciated.
> 
> # tftp -l ~/immrbar.ko -r immrbar.ko -g 10.20.50.70
> # insmod ./immrbar.ko
> [  717.825241] ioremap successed
> [  717.849215] Machine check in kernel mode.
> [  717.853220] Caused by (from SRR1=41000): Transfer error ack signal
> [  717.859405] Oops: Machine check, sig: 7 [#1]
> [  717.863668] MPC837x RDB
> [  717.866106] Modules linked in: immrbar(+)
> [  717.870119] NIP: 00000900 LR: d1034054 CTR: c0014d50
> [  717.875079] REGS: cf895d00 TRAP: 0200   Not tainted  (2.6.28.9)
> [  717.880992] MSR: 00041000 <ME>  CR: 24000082  XER: 20000000
> [  717.886578] TASK = cf8e8640[647] 'insmod' THREAD: cf894000
> [  717.891882] GPR00: d103404c cf895db0 cf8e8640 00000000 000023d5 ffffffff c01e04f4 00020000
> [  717.900265] GPR08: 00000001 c0383f3c 000023d5 c0014d50 4c72ff56 10019100 100777e0 1007ea98
> [  717.908650] GPR16: 10077834 100a0000 100a0000 100a0000 bfaf4828 00000000 1009f23c 10000cfc
> [  717.917034] GPR24: 10000d00 10000d24 10012008 c03650e8 00000000 d1034000 10012018 d1030000
> [  717.925598] NIP [00000900] 0x900
> [  717.928828] LR [d1034054] sample_module_init+0x54/0xc0 [immrbar]
> [  717.934828] Call Trace:
> [  717.937273] [cf895db0] [d103404c] sample_module_init+0x4c/0xc0 [immrbar] (unreliable)
> [  717.945115] [cf895dc0] [c00038a0] do_one_initcall+0x64/0x18c
> [  717.950780] [cf895f20] [c004d7b8] sys_init_module+0xac/0x19c
> [  717.956441] [cf895f40] [c00122f0] ret_from_syscall+0x0/0x38
> [  717.962013] --- Exception: c01 at 0x48043f6c
> [  717.962017]     LR = 0x100009cc
> [  717.969407] Instruction dump:
> [  717.972370] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 00000000 XXXXXXXX XXXXXXXX XXXXXXXX
> [  717.980140] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 7d5043a6 XXXXXXXX XXXXXXXX XXXXXXXX
> [  717.987919] ---[ end trace a47be794e2873cef ]---
> 

Looking at the device tree for this board, it appears U-Boot remaps the
IMMR registers to 0xe0000000. They are no longer accessible at
0xff400000.

I would recommend studying arch/powerpc/boot/dts/mpc8377_rdb.dts in the
Linux source code. That describes the device layout on your board after
U-Boot has run.

A wonderful tool for testing devices from userspace is "busybox devmem".
It allows you to poke any physical address with any value. The output of
"busybox devmem --help" should get you started. As a quick example,
"busybox devmem 0xe0000c00 w 0x1" will write the 32-bit value 0x1 to
address 0xe0000c00.
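
For reference, a rough sketch of what such a devmem-style read looks like
in C. It assumes the IMMR base of 0xe0000000 mentioned above and the GPIO1
offset of 0xc00 from the original question; the names immr_phys() and
read_phys32() are made up for this illustration. It needs root and a
kernel that permits /dev/mem access to that range, so treat it as a
starting point rather than a tested tool.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Values taken from this thread; check them against your board's .dts. */
#define IMMR_BASE_AFTER_UBOOT	0xe0000000u
#define GPIO1_OFFSET		0x0c00u

/* Physical address of a register block inside the IMMR window. */
uint32_t immr_phys(uint32_t immr_base, uint32_t offset)
{
	return immr_base + offset;
}

/*
 * Read one 32-bit register through /dev/mem, roughly what
 * "busybox devmem ADDR" does. Returns 0 on success, -1 on error.
 */
int read_phys32(uint32_t phys, uint32_t *out)
{
	long page = sysconf(_SC_PAGESIZE);
	off_t base = (off_t)(phys & ~((uint32_t)page - 1));
	size_t delta = phys - (uint32_t)base;
	int fd = open("/dev/mem", O_RDONLY | O_SYNC);
	void *map;

	if (fd < 0)
		return -1;
	map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, base);
	close(fd);		/* the mapping stays valid after close */
	if (map == MAP_FAILED)
		return -1;
	*out = *(volatile uint32_t *)((char *)map + delta);
	munmap(map, (size_t)page);
	return 0;
}
```

A call such as read_phys32(immr_phys(IMMR_BASE_AFTER_UBOOT, GPIO1_OFFSET),
&val) would then read the first GPIO1 register at 0xe0000c00.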

I would also recommend using the built-in Linux GPIO API. It works, you
just need to figure out how to use it. It will be much easier to get
your code upstream if you use the provided APIs.

The Documentation/gpio.txt file should help you in understanding the
in-kernel Linux GPIO API. I'm afraid I don't have much experience other
than accessing it via sysfs from userspace.
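
As a small illustration of the sysfs route, the sketch below computes the
global GPIO number the way it was worked out earlier in this thread (chip
base 192 plus the pin offset) and writes it to an export file. The
function names are invented for the example, and the export path is passed
in as a parameter so the code can be tried against a scratch file first.

```c
#include <stdio.h>

/* Global GPIO number = base of the gpiochip + pin offset within it. */
int gpio_global_number(int chip_base, int pin_offset)
{
	return chip_base + pin_offset;
}

/*
 * Write a GPIO number to an export file such as /sys/class/gpio/export.
 * Returns 0 on success, -1 on failure.
 */
int gpio_export(const char *export_path, int gpio)
{
	FILE *f = fopen(export_path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", gpio) < 0) {
		fclose(f);
		return -1;
	}
	/* sysfs reports a rejected number (e.g. -EINVAL) when flushed */
	return fclose(f) == 0 ? 0 : -1;
}
```

With gpiochip192 as the first chip, pin 9 would be exported as
gpio_export("/sys/class/gpio/export", gpio_global_number(192, 9)), that
is, GPIO 201.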

Ira


* Re: Query regarding 2.6.335 RT[Ingo's] and Non-RT performance
From: Jeff Angielski @ 2010-08-12 17:53 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <AANLkTikrpqFQsK=YLkHeWc1ZC=_Gz2rWStJrbQ8O-SrZ@mail.gmail.com>

On 08/11/2010 06:18 PM, Manikandan Ramachandran wrote:
> Hello All,
>      I created a very simple program which has higher priority than
> normal tasks and runs a tight loop. Under same test environment I ran
> this program on both non-rt and rt 2.6.33.5 kernel.  To my surprise I see
> that performance of non-RT kernel is better than RT. non-RT kernel took
> 3 sec and 366156 usec while RT kernel took about 3 sec and 418011
> usec. Can someone please explain why the performance of non-rt kernel is
> better than rt kernel? From the face of the test result, I feel RT has
> more overhead. Is there any configuration that I could do to bring down
> the overhead?

Your "surprise" is due to your definition of "performance".

The purpose of the -rt kernels is to reduce the kernel latency.  This is
important for servicing hardware.  Normal users find the -rt useful for
audio/video applications.  Engineering and scientific users find the -rt
beneficial for servicing hardware like sensors or control systems.

If you are just trying to run calculations as fast as you can in user 
space, you'd be better off using the non-rt variants.
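
To put the two quoted run times in perspective, here is the arithmetic as
a throwaway helper (the function name is made up for this note):

```c
/* Relative slowdown of the slower run versus the faster one, in percent. */
double relative_overhead_pct(double fast_sec, double slow_sec)
{
	return (slow_sec - fast_sec) / fast_sec * 100.0;
}
```

With the figures above, relative_overhead_pct(3.366156, 3.418011) works
out to roughly 1.5%, which is the throughput given up in that test in
exchange for lower latency.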


-- 
Jeff Angielski
The PTR Group
www.theptrgroup.com


* Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
From: Andrew Morton @ 2010-08-12 19:08 UTC (permalink / raw)
  To: Nathan Fontenot
  Cc: linuxppc-dev, Greg KH, linux-kernel, Dave Hansen, linux-mm,
	KAMEZAWA Hiroyuki
In-Reply-To: <4C60407C.2080608@austin.ibm.com>

On Mon, 09 Aug 2010 12:53:00 -0500
Nathan Fontenot <nfont@austin.ibm.com> wrote:

> This set of patches de-couples the idea that there is a single
> directory in sysfs for each memory section.  The intent of the
> patches is to reduce the number of sysfs directories created to
> resolve a boot-time performance issue.  On very large systems
> boot times are getting very long (as seen on powerpc hardware)
> due to the enormous number of sysfs directories being created.
> On a system with 1 TB of memory we create ~63,000 directories.
> For even larger systems boot times are being measured in hours.

And those "hours" are mainly due to this problem, I assume.

> This set of patches allows for each directory created in sysfs
> to cover more than one memory section.  The default behavior for
> sysfs directory creation is the same, in that each directory
> represents a single memory section.  A new file 'end_phys_index'
> in each directory contains the physical_id of the last memory
> section covered by the directory so that users can easily
> determine the memory section range of a directory.

What you're proposing appears to be a non-back-compatible
userspace-visible change.  This is a big issue!

It's not an unresolvable issue, as this is a must-fix problem.  But you
should tell us what your proposal is to prevent breakage of existing
installations.  A Kconfig option would be good, but a boot-time kernel
command line option which selects the new format would be much better.

However you didn't mention this issue at all, and it's the most
important one.


> Updates for version 5 of the patchset include the following:
> 
> Patch 4/8 Add mutex for add/remove of memory blocks
> - Define the mutex using DEFINE_MUTEX macro.
> 
> Patch 8/8 Update memory-hotplug documentation
> - Add information concerning memory holes in phys_index..end_phys_index.

And you forgot to tell us how long those machines take to boot with the
patchset applied, which is the entire point of the patchset!


* Question about dma_direct_ops in PowerPC.
From: Fushen Chen @ 2010-08-12 19:28 UTC (permalink / raw)
  To: linuxppc-dev

We have a board with a PCI device driver that calls
pci_dma_sync_single_for_device.
This driver used to work with Linux kernel 2.6.25.

We ported the driver to Linux kernel 2.6.32, and it no longer works.
The following call chain shows why the PCI driver won't work in kernel
2.6.32.
1. In include/asm-generic/pci-dma-compat.h
    pci_dma_sync_single_for_device calls dma_sync_single_for_device
2. In include/asm-generic/dma-mapping-common.h
    dma_sync_single_for_device calls ops->sync_single_for_device
    (likewise dma_sync_single_for_cpu calls ops->sync_single_for_cpu)
3. In arch/powerpc/kernel/dma.c
struct dma_map_ops dma_direct_ops = {
        .alloc_coherent = dma_direct_alloc_coherent,
        .free_coherent  = dma_direct_free_coherent,
        .map_sg         = dma_direct_map_sg,
        .unmap_sg       = dma_direct_unmap_sg,
        .dma_supported  = dma_direct_dma_supported,
        .map_page       = dma_direct_map_page,
        .unmap_page     = dma_direct_unmap_page,
#ifdef CONFIG_NOT_COHERENT_CACHE
        .sync_single_range_for_cpu      = dma_direct_sync_single_range,
        .sync_single_range_for_device   = dma_direct_sync_single_range,
        .sync_sg_for_cpu                = dma_direct_sync_sg,
        .sync_sg_for_device             = dma_direct_sync_sg,
#endif
};
There is no op defined for sync_single_for_cpu or sync_single_for_device.
The pci_dma_sync_single_for_device is a no-op.
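
The effect can be modelled in a few lines of plain C. This is a simplified
stand-in, not the kernel code: the model_* names and the counter are made
up for illustration, and only one hook is modelled. The point is that the
generic wrapper treats the hook as optional, so a missing entry in the ops
table turns the sync into a silent no-op.

```c
#include <stddef.h>

/* Simplified stand-in for struct dma_map_ops; only one hook is modelled. */
struct model_dma_ops {
	void (*sync_single_for_cpu)(void *buf, size_t size);
};

static unsigned int syncs_done;	/* counts cache-maintenance calls */

void model_sync_single(void *buf, size_t size)
{
	(void)buf;
	(void)size;
	syncs_done++;	/* a real hook would flush/invalidate the cache here */
}

/*
 * Same shape as the generic wrapper: the hook is optional, so a NULL
 * pointer means the call silently does nothing.
 */
void model_dma_sync_single_for_cpu(const struct model_dma_ops *ops,
				   void *buf, size_t size)
{
	if (ops->sync_single_for_cpu)
		ops->sync_single_for_cpu(buf, size);
}

unsigned int model_syncs_done(void)
{
	return syncs_done;
}
```

On a non-coherent platform that silent no-op means the cache is never
flushed or invalidated around the transfer, which would match a driver
that stops working without any error being reported.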

However Linux kernel 2.6.35.1 from kernel.org has the  .sync_single_for_cpu
for dma_direct_ops.
in arch/powerpc/kernel/dma.c
#ifdef CONFIG_NOT_COHERENT_CACHE
        .sync_single_for_cpu            = dma_direct_sync_single,
        .sync_single_for_device         = dma_direct_sync_single,
        .sync_sg_for_cpu                = dma_direct_sync_sg,
        .sync_sg_for_device             = dma_direct_sync_sg,
#endif


We won't move to Linux kernel 2.6.35 anytime soon.
My questions:
1. Is there any side effect for adding .sync_single_for_cpu to
dma_direct_ops in 2.6.32?
2. What will be the future development here?


Best regards & Thanks,
Fushen


* Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
From: Dave Hansen @ 2010-08-12 20:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linuxppc-dev, Greg KH, linux-kernel, linux-mm, KAMEZAWA Hiroyuki
In-Reply-To: <20100812120816.e97d8b9e.akpm@linux-foundation.org>

On Thu, 2010-08-12 at 12:08 -0700, Andrew Morton wrote:
> > This set of patches allows for each directory created in sysfs
> > to cover more than one memory section.  The default behavior for
> > sysfs directory creation is the same, in that each directory
> > represents a single memory section.  A new file 'end_phys_index'
> > in each directory contains the physical_id of the last memory
> > section covered by the directory so that users can easily
> > determine the memory section range of a directory.
> 
> What you're proposing appears to be a non-back-compatible
> userspace-visible change.  This is a big issue! 

Nathan, one thought to get around this at the moment would be to bump up
the size that we export in /sys/devices/system/memory/block_size_bytes.
I think you have already done most of the hard work to accomplish
this.  

You can still add the end_phys_index stuff.  But, for now, it would
always be equal to start_phys_index.
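
A quick sketch of the arithmetic behind this. It assumes a 16 MB memory
section size, which is an assumption here (it gives 65,536 sections for
1 TB, the same order as the ~63,000 directories quoted for that system);
the function names are invented for the example.

```c
#include <stdint.h>

/* Number of sysfs memory directories needed to cover mem_bytes. */
uint64_t memory_dirs(uint64_t mem_bytes, uint64_t section_bytes,
		     uint64_t sections_per_dir)
{
	uint64_t block = section_bytes * sections_per_dir;

	return (mem_bytes + block - 1) / block;	/* round up */
}

/* Last section index covered by a directory starting at start_index. */
uint64_t end_phys_index(uint64_t start_index, uint64_t sections_per_dir)
{
	return start_index + sections_per_dir - 1;
}
```

With one section per directory, end_phys_index() equals the start index,
as described above. Covering 16 sections per directory would cut the 1 TB
case from 65,536 directories to 4,096 under the same assumption.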

-- Dave


* Re: Query regarding 2.6.335 RT[Ingo's] and Non-RT performance
From: Xianghua Xiao @ 2010-08-13  2:18 UTC (permalink / raw)
  To: Jeff Angielski; +Cc: linuxppc-dev
In-Reply-To: <4C64352F.4090005@theptrgroup.com>

On Thu, Aug 12, 2010 at 12:53 PM, Jeff Angielski <jeff@theptrgroup.com> wrote:
> On 08/11/2010 06:18 PM, Manikandan Ramachandran wrote:
>>
>> Hello All,
>>     I created a very simple program which has higher priority than
>> normal tasks and runs a tight loop. Under same test environment I ran
>> this program on both non-rt and rt 2.6.33.5 kernel.  To my surprise I see
>> that performance of non-RT kernel is better than RT. non-RT kernel took
>> 3 sec and 366156 usec while RT kernel took about 3 sec and 418011
>> usec. Can someone please explain why the performance of non-rt kernel is
>> better than rt kernel? From the face of the test result, I feel RT has
>> more overhead. Is there any configuration that I could do to bring down
>> the overhead?
>
> Your "surprise" is due to your definition of "performance".
>
> The purpose of the -rt kernels is to reduce the kernel latency.  This is
> important for servicing hardware.  Normal users find the -rt useful for
> audio/video applications.  Engineering and scientific users find the -rt
> beneficial for servicing hardware like sensors or control systems.
>
> If you are just trying to run calculations as fast as you can in user space,
> you'd be better off using the non-rt variants.
>
>
> --
> Jeff Angielski
> The PTR Group
> www.theptrgroup.com
>

True, in most cases non-rt will have better performance/throughput,
while rt's major goal is to have better latency for high-priority
tasks. It is also true that the rt kernel will have more overhead.

xianghua


* [PATCH] powerpc: Add support for popcnt instructions
From: Anton Blanchard @ 2010-08-13  2:28 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev


POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.
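
For readers who want to check the popcntb-based sequences without POWER
hardware, here is a small C model. It is a sketch only: model_popcntb()
emulates the instruction in software (each result byte holds the bit count
of the matching input byte), and the model_* names are invented for this
note. The shifts and adds follow the popcntb paths in the assembly of this
patch.

```c
#include <stdint.h>

/*
 * Software model of popcntb: each byte of the result holds the
 * population count of the corresponding byte of the input.
 */
uint64_t model_popcntb(uint64_t x)
{
	x = x - ((x >> 1) & 0x5555555555555555ULL);
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	return (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
}

/* hweight64 built from popcntb using shift/add/mask steps. */
unsigned int model_hweight64(uint64_t x)
{
	uint64_t r = model_popcntb(x);

	r += r >> 32;	/* srdi r4,r3,32 ; add r3,r4,r3 */
	r += r >> 16;	/* srdi r4,r3,16 ; add r3,r4,r3 */
	r += r >> 8;	/* srdi r4,r3,8  ; add r3,r4,r3 */
	return (unsigned int)(r & 0xff);	/* clrldi r3,r3,64-8 */
}

/* hweight16 the same way; only the two low bytes reach the result. */
unsigned int model_hweight16(uint64_t x)
{
	uint64_t r = model_popcntb(x);

	r += r >> 8;	/* srdi r4,r3,8 ; add r3,r4,r3 */
	return (unsigned int)(r & 0xff);
}
```

Because every per-byte count is at most 8, the partial sums never carry
into a neighbouring byte, so masking the low byte at the end is enough.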

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: powerpc.git/arch/powerpc/include/asm/cputable.h
===================================================================
--- powerpc.git.orig/arch/powerpc/include/asm/cputable.h	2010-08-13 11:19:42.691991439 +1000
+++ powerpc.git/arch/powerpc/include/asm/cputable.h	2010-08-13 11:24:55.510741618 +1000
@@ -199,6 +199,8 @@ extern const char *powerpc_base_platform
 #define CPU_FTR_UNALIGNED_LD_STD	LONG_ASM_CONST(0x0080000000000000)
 #define CPU_FTR_ASYM_SMT		LONG_ASM_CONST(0x0100000000000000)
 #define CPU_FTR_STCX_CHECKS_ADDRESS	LONG_ASM_CONST(0x0200000000000000)
+#define CPU_FTR_POPCNTB			LONG_ASM_CONST(0x0400000000000000)
+#define CPU_FTR_POPCNTD			LONG_ASM_CONST(0x0800000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -403,21 +405,22 @@ extern const char *powerpc_base_platform
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
-	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS | \
+	    CPU_FTR_POPCNTB)
 #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB)
 #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD)
 #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
Index: powerpc.git/arch/powerpc/lib/Makefile
===================================================================
--- powerpc.git.orig/arch/powerpc/lib/Makefile	2010-08-13 11:19:43.653241065 +1000
+++ powerpc.git/arch/powerpc/lib/Makefile	2010-08-13 11:19:45.930743841 +1000
@@ -18,7 +18,7 @@ obj-$(CONFIG_HAS_IOMEM)	+= devres.o
 
 obj-$(CONFIG_PPC64)	+= copypage_64.o copyuser_64.o \
 			   memcpy_64.o usercopy_64.o mem_64.o string.o \
-			   checksum_wrappers_64.o
+			   checksum_wrappers_64.o hweight_64.o
 obj-$(CONFIG_XMON)	+= sstep.o ldstfp.o
 obj-$(CONFIG_KPROBES)	+= sstep.o ldstfp.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= sstep.o ldstfp.o
Index: powerpc.git/arch/powerpc/lib/hweight_64.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ powerpc.git/arch/powerpc/lib/hweight_64.S	2010-08-13 11:19:45.940741462 +1000
@@ -0,0 +1,110 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2010
+ *
+ * Author: Anton Blanchard <anton@au.ibm.com>
+ */
+#include <asm/processor.h>
+#include <asm/ppc_asm.h>
+
+/* Note: This code relies on -mminimal-toc */
+
+_GLOBAL(__arch_hweight8)
+BEGIN_FTR_SECTION
+	b .__sw_hweight8
+	nop
+	nop
+FTR_SECTION_ELSE
+	popcntb	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight16)
+BEGIN_FTR_SECTION
+	b .__sw_hweight16
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(50)
+	popcntb r3,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(50)
+	clrlwi  r3,r3,16
+	popcntw	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 50)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight32)
+BEGIN_FTR_SECTION
+	b .__sw_hweight32
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(51)
+	popcntb r3,r3
+	srdi	r4,r3,16
+	add	r3,r4,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(51)
+	popcntw	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 51)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight64)
+BEGIN_FTR_SECTION
+	b .__sw_hweight64
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(52)
+	popcntb r3,r3
+	srdi	r4,r3,32
+	add	r3,r4,r3
+	srdi	r4,r3,16
+	add	r3,r4,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(52)
+	popcntd	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 52)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
Index: powerpc.git/arch/powerpc/include/asm/bitops.h
===================================================================
--- powerpc.git.orig/arch/powerpc/include/asm/bitops.h	2010-08-13 11:06:20.991992998 +1000
+++ powerpc.git/arch/powerpc/include/asm/bitops.h	2010-08-13 11:19:45.940741462 +1000
@@ -267,7 +267,16 @@ static __inline__ int fls64(__u64 x)
 #include <asm-generic/bitops/fls64.h>
 #endif /* __powerpc64__ */
 
+#ifdef CONFIG_PPC64
+unsigned int __arch_hweight8(unsigned int w);
+unsigned int __arch_hweight16(unsigned int w);
+unsigned int __arch_hweight32(unsigned int w);
+unsigned long __arch_hweight64(__u64 w);
+#include <asm-generic/bitops/const_hweight.h>
+#else
 #include <asm-generic/bitops/hweight.h>
+#endif
+
 #include <asm-generic/bitops/find.h>
 
 /* Little-endian versions */
Index: powerpc.git/arch/powerpc/kernel/ppc_ksyms.c
===================================================================
--- powerpc.git.orig/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:06:21.011991745 +1000
+++ powerpc.git/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:19:45.940741462 +1000
@@ -186,3 +186,10 @@ EXPORT_SYMBOL(__mtdcr);
 EXPORT_SYMBOL(__mfdcr);
 #endif
 EXPORT_SYMBOL(empty_zero_page);
+
+#ifdef CONFIG_PPC64
+EXPORT_SYMBOL(__arch_hweight8);
+EXPORT_SYMBOL(__arch_hweight16);
+EXPORT_SYMBOL(__arch_hweight32);
+EXPORT_SYMBOL(__arch_hweight64);
+#endif


* Re: [PATCH] powerpc: Add support for popcnt instructions
From: Benjamin Herrenschmidt @ 2010-08-13  4:29 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: linuxppc-dev
In-Reply-To: <20100813022809.GY29316@kryten>

On Fri, 2010-08-13 at 12:28 +1000, Anton Blanchard wrote:
> POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
> this patch does all the work out of line, but it would be nice to implement
> them as inlines with an out of line fallback.
> 
> The performance issue with hweight was noticed when disabling SMT on a large
> (192 thread) POWER7 box. The patch improves that testcase by about 8%.

Especially from modules it will suck big time. If kept out of line they
should probably be linked-in with each module, but I'd rather have them
inlined.

Cheers,
Ben.



* Re: [PATCH] powerpc: Add support for popcnt instructions
From: Anton Blanchard @ 2010-08-13  5:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1281673744.2987.362.camel@pasglop>

 
Hi,

> Especially from modules it will suck big time. If kept out of line they
> should probably be linked-in with each module, but I'd rather have them
> inlined.

Inlining would be good, but this is as far as I can take this for now.
If someone else is interested go for it :)

Anton


* [PATCH 1/2] powerpc: Abstract indexing of lppaca structs
From: Paul Mackerras @ 2010-08-13  6:18 UTC (permalink / raw)
  To: linuxppc-dev

Currently we have the lppaca structs as a simple array of NR_CPUS
entries, taking up space in the data section of the kernel image.
In future we would like to allocate them dynamically, so this
abstracts out the accesses to the array, making it easier to
change how we locate the lppaca for a given cpu in future.
Specifically, lppaca[cpu] changes to lppaca_of(cpu).
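
As a standalone illustration of the pattern (a userspace model, not the
kernel code: the model_* names, the array size and the single field are
assumptions for the example), callers go through an accessor so the
storage behind it can change later without touching them:

```c
/* Cut-down stand-in for struct lppaca; the single field is illustrative. */
struct model_lppaca {
	unsigned long yield_count;
};

#define MODEL_NR_CPUS 4

/* Current layout: one static array, indexed directly by cpu number. */
static struct model_lppaca model_lppaca[MODEL_NR_CPUS];
#define model_lppaca_of(cpu)	(model_lppaca[cpu])

/* Callers only ever go through the accessor, for reads ... */
unsigned long total_yields(int ncpus)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		sum += model_lppaca_of(cpu).yield_count;
	return sum;
}

/* ... and for writes, since the macro expands to an lvalue. */
void set_yields(int cpu, unsigned long n)
{
	model_lppaca_of(cpu).yield_count = n;
}
```

If the structs were later allocated dynamically, only the macro would need
to change (for instance to dereference a per-cpu pointer); that follow-up
is hypothetical here and not part of this patch.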

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/lppaca.h     |    2 ++
 arch/powerpc/kernel/lparcfg.c         |   14 +++++++-------
 arch/powerpc/lib/locks.c              |    4 ++--
 arch/powerpc/platforms/iseries/dt.c   |    4 ++--
 arch/powerpc/platforms/iseries/smp.c  |    2 +-
 arch/powerpc/platforms/pseries/dtl.c  |    8 ++++----
 arch/powerpc/platforms/pseries/lpar.c |    4 ++--
 7 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 14b592d..6b73554 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -153,6 +153,8 @@ struct lppaca {
 
 extern struct lppaca lppaca[];
 
+#define lppaca_of(cpu)	(lppaca[cpu])
+
 /*
  * SLB shadow buffer structure as defined in the PAPR.  The save_area
  * contains adjacent ESID and VSID pairs for each shadowed SLB.  The
diff --git a/arch/powerpc/kernel/lparcfg.c b/arch/powerpc/kernel/lparcfg.c
index 50362b6..8d9e3b9 100644
--- a/arch/powerpc/kernel/lparcfg.c
+++ b/arch/powerpc/kernel/lparcfg.c
@@ -56,7 +56,7 @@ static unsigned long get_purr(void)
 
 	for_each_possible_cpu(cpu) {
 		if (firmware_has_feature(FW_FEATURE_ISERIES))
-			sum_purr += lppaca[cpu].emulated_time_base;
+			sum_purr += lppaca_of(cpu).emulated_time_base;
 		else {
 			struct cpu_usage *cu;
 
@@ -263,7 +263,7 @@ static void parse_ppp_data(struct seq_file *m)
 	           ppp_data.active_system_procs);
 
 	/* pool related entries are apropriate for shared configs */
-	if (lppaca[0].shared_proc) {
+	if (lppaca_of(0).shared_proc) {
 		unsigned long pool_idle_time, pool_procs;
 
 		seq_printf(m, "pool=%d\n", ppp_data.pool_num);
@@ -460,8 +460,8 @@ static void pseries_cmo_data(struct seq_file *m)
 		return;
 
 	for_each_possible_cpu(cpu) {
-		cmo_faults += lppaca[cpu].cmo_faults;
-		cmo_fault_time += lppaca[cpu].cmo_fault_time;
+		cmo_faults += lppaca_of(cpu).cmo_faults;
+		cmo_fault_time += lppaca_of(cpu).cmo_fault_time;
 	}
 
 	seq_printf(m, "cmo_faults=%lu\n", cmo_faults);
@@ -479,8 +479,8 @@ static void splpar_dispatch_data(struct seq_file *m)
 	unsigned long dispatch_dispersions = 0;
 
 	for_each_possible_cpu(cpu) {
-		dispatches += lppaca[cpu].yield_count;
-		dispatch_dispersions += lppaca[cpu].dispersion_count;
+		dispatches += lppaca_of(cpu).yield_count;
+		dispatch_dispersions += lppaca_of(cpu).dispersion_count;
 	}
 
 	seq_printf(m, "dispatches=%lu\n", dispatches);
@@ -545,7 +545,7 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
 	seq_printf(m, "partition_potential_processors=%d\n",
 		   partition_potential_processors);
 
-	seq_printf(m, "shared_processor_mode=%d\n", lppaca[0].shared_proc);
+	seq_printf(m, "shared_processor_mode=%d\n", lppaca_of(0).shared_proc);
 
 	seq_printf(m, "slb_size=%d\n", mmu_slb_size);
 
diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index 58e14fb..9b8182e 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -34,7 +34,7 @@ void __spin_yield(arch_spinlock_t *lock)
 		return;
 	holder_cpu = lock_value & 0xffff;
 	BUG_ON(holder_cpu >= NR_CPUS);
-	yield_count = lppaca[holder_cpu].yield_count;
+	yield_count = lppaca_of(holder_cpu).yield_count;
 	if ((yield_count & 1) == 0)
 		return;		/* virtual cpu is currently running */
 	rmb();
@@ -65,7 +65,7 @@ void __rw_yield(arch_rwlock_t *rw)
 		return;		/* no write lock at present */
 	holder_cpu = lock_value & 0xffff;
 	BUG_ON(holder_cpu >= NR_CPUS);
-	yield_count = lppaca[holder_cpu].yield_count;
+	yield_count = lppaca_of(holder_cpu).yield_count;
 	if ((yield_count & 1) == 0)
 		return;		/* virtual cpu is currently running */
 	rmb();
diff --git a/arch/powerpc/platforms/iseries/dt.c b/arch/powerpc/platforms/iseries/dt.c
index 7f45a51..fdb7384 100644
--- a/arch/powerpc/platforms/iseries/dt.c
+++ b/arch/powerpc/platforms/iseries/dt.c
@@ -243,7 +243,7 @@ static void __init dt_cpus(struct iseries_flat_dt *dt)
 	pft_size[1] = __ilog2(HvCallHpt_getHptPages() * HW_PAGE_SIZE);
 
 	for (i = 0; i < NR_CPUS; i++) {
-		if (lppaca[i].dyn_proc_status >= 2)
+		if (lppaca_of(i).dyn_proc_status >= 2)
 			continue;
 
 		snprintf(p, 32 - (p - buf), "@%d", i);
@@ -251,7 +251,7 @@ static void __init dt_cpus(struct iseries_flat_dt *dt)
 
 		dt_prop_str(dt, "device_type", device_type_cpu);
 
-		index = lppaca[i].dyn_hv_phys_proc_index;
+		index = lppaca_of(i).dyn_hv_phys_proc_index;
 		d = &xIoHriProcessorVpd[index];
 
 		dt_prop_u32(dt, "i-cache-size", d->xInstCacheSize * 1024);
diff --git a/arch/powerpc/platforms/iseries/smp.c b/arch/powerpc/platforms/iseries/smp.c
index 6590850..6c60299 100644
--- a/arch/powerpc/platforms/iseries/smp.c
+++ b/arch/powerpc/platforms/iseries/smp.c
@@ -91,7 +91,7 @@ static void smp_iSeries_kick_cpu(int nr)
 	BUG_ON((nr < 0) || (nr >= NR_CPUS));
 
 	/* Verify that our partition has a processor nr */
-	if (lppaca[nr].dyn_proc_status >= 2)
+	if (lppaca_of(nr).dyn_proc_status >= 2)
 		return;
 
 	/* The processor is currently spinning, waiting
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index a00addb..adfd544 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -107,14 +107,14 @@ static int dtl_enable(struct dtl *dtl)
 	}
 
 	/* set our initial buffer indices */
-	dtl->last_idx = lppaca[dtl->cpu].dtl_idx = 0;
+	dtl->last_idx = lppaca_of(dtl->cpu).dtl_idx = 0;
 
 	/* ensure that our updates to the lppaca fields have occurred before
 	 * we actually enable the logging */
 	smp_wmb();
 
 	/* enable event logging */
-	lppaca[dtl->cpu].dtl_enable_mask = dtl_event_mask;
+	lppaca_of(dtl->cpu).dtl_enable_mask = dtl_event_mask;
 
 	return 0;
 }
@@ -123,7 +123,7 @@ static void dtl_disable(struct dtl *dtl)
 {
 	int hwcpu = get_hard_smp_processor_id(dtl->cpu);
 
-	lppaca[dtl->cpu].dtl_enable_mask = 0x0;
+	lppaca_of(dtl->cpu).dtl_enable_mask = 0x0;
 
 	unregister_dtl(hwcpu, __pa(dtl->buf));
 
@@ -171,7 +171,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 	/* actual number of entries read */
 	n_read = 0;
 
-	cur_idx = lppaca[dtl->cpu].dtl_idx;
+	cur_idx = lppaca_of(dtl->cpu).dtl_idx;
 	last_idx = dtl->last_idx;
 
 	if (cur_idx - last_idx > dtl->buf_entries) {
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index cf79b46..a17fe4a 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -250,9 +250,9 @@ void vpa_init(int cpu)
 	long ret;
 
 	if (cpu_has_feature(CPU_FTR_ALTIVEC))
-		lppaca[cpu].vmxregs_in_use = 1;
+		lppaca_of(cpu).vmxregs_in_use = 1;
 
-	addr = __pa(&lppaca[cpu]);
+	addr = __pa(&lppaca_of(cpu));
 	ret = register_vpa(hwcpu, addr);
 
 	if (ret) {
-- 
1.7.1

^ permalink raw reply related

