LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH v5 04/45] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Srivatsa S. Bhat @ 2013-01-23 19:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-doc, peterz, fweisbec, linux-kernel, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rostedt, rjw, namhyung, tglx, linux-arm-kernel, netdev, oleg, sbw,
	akpm, linuxppc-dev
In-Reply-To: <20130123185522.GG2373@mtj.dyndns.org>

On 01/24/2013 12:25 AM, Tejun Heo wrote:
> Hello, Srivatsa.
> 
> First of all, I'm not sure whether we need to be this step-by-step
> when introducing something new.  It's not like we're transforming an
> existing implementation and it doesn't seem to help understanding the
> series that much either.
> 

Hmm.. I split it up into steps to help explain the reasoning behind
the code sufficiently, rather than spring all of the intricacies at
one go (which would make it very hard to write the changelog/comments
also). The split made it easier for me to document it well in the
changelog, because I could deal with reasonable chunks of code/complexity
at a time. IMHO that helps people reading it for the first time to
understand the logic easily.

> On Tue, Jan 22, 2013 at 01:03:53PM +0530, Srivatsa S. Bhat wrote:
>> Using global rwlocks as the backend for per-CPU rwlocks helps us avoid many
>> lock-ordering related problems (unlike per-cpu locks). However, global
> 
> So, unfortunately, this already seems broken, right?  The problem here
> seems to be that previously, say, read_lock() implied
> preempt_disable() but as this series aims to move away from it, it
> introduces the problem of locking order between such locks and the new
> construct.
>

Not sure I got your point correctly. Are you referring to Steve's comment
that rwlocks are probably fair now (and hence not really safe when used
like this)? If yes, I haven't actually verified that yet, but yes, that
will make this hard to use, since we need to take care of locking rules.

But suppose rwlocks are unfair (as I had assumed them to be), then we
have absolutely no problems and no lock-ordering to worry about.
 
> The only two options are either punishing writers or identifying and
> updating all such possible deadlocks.  percpu_rwsem does the former,
> right?  I don't know how feasible the latter would be.

I don't think we can avoid looking into all the possible deadlocks,
as long as we use rwlocks inside get/put_online_cpus_atomic() (assuming
rwlocks are fair). Even with Oleg's idea of using synchronize_sched()
at the writer, we still need to take care of locking rules, because the
synchronize_sched() only helps avoid the memory barriers at the reader,
and doesn't help get rid of the rwlocks themselves.
So in short, I don't see how we can punish the writers and thereby somehow
avoid looking into possible deadlocks (if rwlocks are fair).

>  Srivatsa,
> you've been looking at all the places which would require conversion,
> how difficult would doing the latter be?
> 

The problem is that some APIs like smp_call_function() will need to use
get/put_online_cpus_atomic(). That is when the locking becomes tricky
in the subsystem which invokes these APIs with other (subsystem-specific,
internal) locks held. So we could potentially use a convention such as
"Make get/put_online_cpus_atomic() your outer-most calls, within which
you nest the other locks" to rule out all ABBA deadlock possibilities...
But we might still hit some hard-to-convert places.. 

BTW, Steve, fair rwlocks don't mean the following scenario will result
in a deadlock, right?

CPU 0                          CPU 1

read_lock(&rwlock)

                              write_lock(&rwlock) //spins, because CPU 0
                              //has acquired the lock for read

read_lock(&rwlock)
   ^^^^^
What happens here? Does CPU 0 start spinning (and hence deadlock) or will
it continue realizing that it already holds the rwlock for read?

If the above ends in a deadlock, then it's next to impossible to convert
all the places safely (because the above mentioned convention will simply
fall apart).
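
To make the question concrete, here is a toy single-threaded model of the
scenario above. It is only a sketch: it assumes that "fair" means a spinning
writer blocks new readers, and the toy_* names are made up for illustration.
It is not the kernel's rwlock implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded model of a spinning rwlock; not kernel code. */
struct toy_rwlock {
	int  readers;         /* number of read-side holders */
	bool writer_waiting;  /* a writer is spinning in write_lock() */
	bool fair;            /* true: new readers queue behind a waiting writer */
};

/* Returns true if read_lock() would succeed now, false if it would spin. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->fair && l->writer_waiting)
		return false;   /* assumed fair behaviour: wait for the writer */
	l->readers++;
	return true;
}

/* Returns true if write_lock() would succeed now, false if it would spin. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	if (l->readers > 0) {
		l->writer_waiting = true;
		return false;
	}
	return true;
}

/*
 * Replays the scenario: CPU 0 read-locks, CPU 1 tries to write-lock,
 * CPU 0 read-locks again. Returns true if the sequence deadlocks.
 */
static bool nested_read_deadlocks(bool fair)
{
	struct toy_rwlock l = { .readers = 0, .writer_waiting = false, .fair = fair };
	bool ok;

	ok = toy_read_trylock(&l);      /* CPU 0: first read_lock() */
	assert(ok);
	ok = toy_write_trylock(&l);     /* CPU 1: write_lock() spins */
	assert(!ok);

	/*
	 * CPU 0: nested read_lock(). If it spins, nobody makes progress:
	 * the writer waits for CPU 0, and CPU 0 waits for the writer.
	 */
	return !toy_read_trylock(&l);
}
```

Under these assumptions the nested read_lock() only gets stuck in the fair
case, which is exactly the case the outer-most-call convention cannot cover.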

>> +#define reader_uses_percpu_refcnt(pcpu_rwlock, cpu)			\
>> +		(ACCESS_ONCE(per_cpu(*((pcpu_rwlock)->reader_refcnt), cpu)))
>> +
>> +#define reader_nested_percpu(pcpu_rwlock)				\
>> +			(__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) > 1)
>> +
>> +#define writer_active(pcpu_rwlock)					\
>> +			(__this_cpu_read(*((pcpu_rwlock)->writer_signal)))
> 
> Why are these in the public header file?  Are they gonna be used to
> inline something?
>

No, I can put it in the .c file itself. Will do.
 
>> +static inline void raise_writer_signal(struct percpu_rwlock *pcpu_rwlock,
>> +				       unsigned int cpu)
>> +{
>> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = true;
>> +}
>> +
>> +static inline void drop_writer_signal(struct percpu_rwlock *pcpu_rwlock,
>> +				      unsigned int cpu)
>> +{
>> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = false;
>> +}
>> +
>> +static void announce_writer_active(struct percpu_rwlock *pcpu_rwlock)
>> +{
>> +	unsigned int cpu;
>> +
>> +	for_each_online_cpu(cpu)
>> +		raise_writer_signal(pcpu_rwlock, cpu);
>> +
>> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
>> +}
>> +
>> +static void announce_writer_inactive(struct percpu_rwlock *pcpu_rwlock)
>> +{
>> +	unsigned int cpu;
>> +
>> +	drop_writer_signal(pcpu_rwlock, smp_processor_id());
>> +
>> +	for_each_online_cpu(cpu)
>> +		drop_writer_signal(pcpu_rwlock, cpu);
>> +
>> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
>> +}
> 
> It could be just personal preference but I find the above one line
> wrappers more obfuscating than anything else.  What's the point of
> wrapping writer_signal = true/false into a separate function?  These
> simple wrappers just add layers that people have to dig through to
> figure out what's going on without adding anything of value.  I'd much
> prefer collapsing these into the percpu_write_[un]lock().
>

Sure, I see your point. I'll change that.
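
For reference, a rough userspace sketch of what the collapsed form could
look like. A plain array stands in for the per-CPU variable, a C11 fence
stands in for smp_mb(), and the *_sketch names and the fixed CPU count are
assumptions made for illustration; the rest of the lock/unlock paths is
left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4   /* hypothetical fixed CPU count for the sketch */

struct percpu_rwlock_sketch {
	bool writer_signal[NR_CPUS_SKETCH];  /* stands in for the per-CPU variable */
};

/* Writer side, with the one-line wrappers folded in. */
static void sketch_write_lock(struct percpu_rwlock_sketch *l)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		l->writer_signal[cpu] = true;

	/* Stand-in for smp_mb(): order the stores above before what follows. */
	atomic_thread_fence(memory_order_seq_cst);

	/* ... the remainder of the write-lock path is omitted ... */
}

static void sketch_write_unlock(struct percpu_rwlock_sketch *l)
{
	/* ... the remainder of the write-unlock path is omitted ... */

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		l->writer_signal[cpu] = false;

	atomic_thread_fence(memory_order_seq_cst);
}

/* What a reader on 'cpu' would observe. */
static bool sketch_writer_active(const struct percpu_rwlock_sketch *l, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return l->writer_signal[cpu];
}
```

With the loops inline, the store to writer_signal and the barrier that
publishes it sit next to each other, which is the readability gain Tejun is
asking for.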

Thanks a lot for your feedback Tejun!

Regards,
Srivatsa S. Bhat


* Re: [PATCH] perf: Fix compile warnings in tests/attr.c
From: Sukadev Bhattiprolu @ 2013-01-23 18:57 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Anton Blanchard, linux-kernel, linuxppc-dev, paulus, acme, mingo,
	Jiri Olsa
In-Reply-To: <1358898333.29791.14.camel@concordia>

Michael Ellerman [michael@ellerman.id.au] wrote:
| > | make: *** [tests/attr.o] Error 1
| > | 
| > | i386 compiles fine
| > 
| > __u64 is 'unsigned long long' on x86 and PRIu64 is 'llu' which is fine.
| > 
| > __u64 is 'unsigned long' on Power and PRIu64 is 'lu' which is again fine.
| > 
| > But __u64 is 'unsigned long long' on x86_64, but PRIu64 is '%lu' bc __WORDSIZE
| > is 64.
| 
| 
| This is a bit of a mess, but let me see if I can help explain it.

Yes it is :-) thanks for explaining it.

| 
| The root of the problem is that you're mixing up the kernel type __u64,
| with the userspace format specifier PRIu64.

struct perf_event_attr is shared with user space and uses __u64. Should
it use uint64_t instead?

| 
| PRIu64 is the format specifier for printing a uint64_t, it _may_ also be
| the right specifier for a __u64, but there's no guarantee of that - as
| you have discovered.
| 
| Inside the kernel both x86 and powerpc use unsigned long long always, in
| 32-bit and 64-bit code. That means in the kernel we can always use %llu.
| 
| On x86 that definition is also exported to userspace, so on x86 __u64 is
| always unsigned long long. As you noticed this potentially differs from
| uint64_t, which can be confusing. However it means in x86 userspace code
| you can always print a __u64 with %llu.
| 
| On powerpc we default to using definitions that match userspace, so
| __u64 changes depending on your wordsize, and so you must use PRIu64
| etc. to print them.

Well, using __u64 and PRIu64 seems to break x86-64...

| 
| There is however support in recent powerpc kernels to switch to using
| unsigned long long even on 64-bit. See commit 2c9c6ce.
| 
| You need to define __SANE_USERSPACE_TYPES__ before including types.h.
| Then you can always use %llu to print __u64.

But __SANE_USERSPACE_TYPES__ with __u64 and %llu seems to work on x86,
x86-64 and powerpc.

Will modify my patch to add __SANE_USERSPACE_TYPES__ but leave the %llu
as is.
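
A small userspace sketch of the two pairings discussed above: __u64 with
%llu (after defining __SANE_USERSPACE_TYPES__) and uint64_t with PRIu64.
The helper names are made up, and the non-Linux fallback typedef is an
assumption added only so the sketch builds where <linux/types.h> is absent.

```c
#define __SANE_USERSPACE_TYPES__   /* must come before <linux/types.h> */
#ifdef __linux__
#include <linux/types.h>
#else
typedef unsigned long long __u64;  /* fallback so the sketch builds elsewhere */
#endif

#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Kernel-exported type: with the define above, %llu is the matching specifier. */
static int format_kernel_u64(char *buf, size_t len, __u64 v)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%llu", v);
}

/* C99 type: PRIu64 is the matching specifier on every platform. */
static int format_c99_u64(char *buf, size_t len, uint64_t v)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%" PRIu64, v);
}

/* Both routes should produce the same text for the same value. */
static bool u64_formats_agree(uint64_t v)
{
	char a[32], b[32];

	format_kernel_u64(a, sizeof(a), v);
	format_c99_u64(b, sizeof(b), v);
	return strcmp(a, b) == 0;
}
```

The point is only that each type is printed with its own specifier; mixing
__u64 with PRIu64 is what produced the warning.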

Sukadev


* Re: [PATCH v5 04/45] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Tejun Heo @ 2013-01-23 18:55 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: linux-doc, peterz, fweisbec, linux-kernel, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rostedt, rjw, namhyung, tglx, linux-arm-kernel, netdev, oleg, sbw,
	akpm, linuxppc-dev
In-Reply-To: <20130122073347.13822.85876.stgit@srivatsabhat.in.ibm.com>

Hello, Srivatsa.

First of all, I'm not sure whether we need to be this step-by-step
when introducing something new.  It's not like we're transforming an
existing implementation and it doesn't seem to help understanding the
series that much either.

On Tue, Jan 22, 2013 at 01:03:53PM +0530, Srivatsa S. Bhat wrote:
> Using global rwlocks as the backend for per-CPU rwlocks helps us avoid many
> lock-ordering related problems (unlike per-cpu locks). However, global

So, unfortunately, this already seems broken, right?  The problem here
seems to be that previously, say, read_lock() implied
preempt_disable() but as this series aims to move away from it, it
introduces the problem of locking order between such locks and the new
construct.

The only two options are either punishing writers or identifying and
updating all such possible deadlocks.  percpu_rwsem does the former,
right?  I don't know how feasible the latter would be.  Srivatsa,
you've been looking at all the places which would require conversion,
how difficult would doing the latter be?

> +#define reader_uses_percpu_refcnt(pcpu_rwlock, cpu)			\
> +		(ACCESS_ONCE(per_cpu(*((pcpu_rwlock)->reader_refcnt), cpu)))
> +
> +#define reader_nested_percpu(pcpu_rwlock)				\
> +			(__this_cpu_read(*((pcpu_rwlock)->reader_refcnt)) > 1)
> +
> +#define writer_active(pcpu_rwlock)					\
> +			(__this_cpu_read(*((pcpu_rwlock)->writer_signal)))

Why are these in the public header file?  Are they gonna be used to
inline something?

> +static inline void raise_writer_signal(struct percpu_rwlock *pcpu_rwlock,
> +				       unsigned int cpu)
> +{
> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = true;
> +}
> +
> +static inline void drop_writer_signal(struct percpu_rwlock *pcpu_rwlock,
> +				      unsigned int cpu)
> +{
> +	per_cpu(*pcpu_rwlock->writer_signal, cpu) = false;
> +}
> +
> +static void announce_writer_active(struct percpu_rwlock *pcpu_rwlock)
> +{
> +	unsigned int cpu;
> +
> +	for_each_online_cpu(cpu)
> +		raise_writer_signal(pcpu_rwlock, cpu);
> +
> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
> +}
> +
> +static void announce_writer_inactive(struct percpu_rwlock *pcpu_rwlock)
> +{
> +	unsigned int cpu;
> +
> +	drop_writer_signal(pcpu_rwlock, smp_processor_id());
> +
> +	for_each_online_cpu(cpu)
> +		drop_writer_signal(pcpu_rwlock, cpu);
> +
> +	smp_mb(); /* Paired with smp_rmb() in percpu_read_[un]lock() */
> +}

It could be just personal preference but I find the above one line
wrappers more obfuscating than anything else.  What's the point of
wrapping writer_signal = true/false into a separate function?  These
simple wrappers just add layers that people have to dig through to
figure out what's going on without adding anything of value.  I'd much
prefer collapsing these into the percpu_write_[un]lock().

Thanks.

-- 
tejun


* Re: Freescale P2020 CPU Freeze over PCIe abort signal
From: siva kumar @ 2013-01-23 17:41 UTC (permalink / raw)
  To: linuxppc-dev

Hi,

Is there any update on this? I am getting into the same state:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2010-October/086680.html

Thanks,
Sivakumar


* Re: [PATCH 1/2] powerpc/mpic: allow coreint to be determined by MPIC version
From: Scott Wood @ 2013-01-23 17:28 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev
In-Reply-To: <AAD002DB-C259-4054-A126-950B69BF1B5D@kernel.crashing.org>

On 01/23/2013 11:23:51 AM, Kumar Gala wrote:
>
> On Jan 21, 2013, at 7:56 PM, Scott Wood wrote:
>
> > This will be used by the qemu-e500 platform, as the MPIC version (and
> > thus whether we have coreint) depends on how QEMU is configured.
> >
> > Signed-off-by: Scott Wood <scottwood@freescale.com>
> > ---
> > arch/powerpc/sysdev/mpic.c |   26 +++++++++++++++++++++++---
> > 1 file changed, 23 insertions(+), 3 deletions(-)
>
> Is the idea that we'd set mpic->flags such that MPIC_ENABLE_COREINT
> was set, but based on the controller version we'd ignore the flag?

Yes, at least for platforms like qemu-e500 where both are possible
(see patch 2/2).

-Scott


* Re: [PATCH 2/2] powerpc/85xx: describe the PAMU topology in the device tree
From: Gala Kumar-B11780 @ 2013-01-23 17:27 UTC (permalink / raw)
  To: Wood Scott-B07421, Yoder Stuart-B08248
  Cc: linuxppc-dev@ozlabs.org list, Timur Tabi
In-Reply-To: <1358462073-2558-2-git-send-email-timur@tabi.org>


On Jan 17, 2013, at 4:34 PM, Timur Tabi wrote:

> From: Timur Tabi <timur@freescale.com>
>
> The PAMU caches use the LIODNs to determine which cache lines hold the
> entries for the corresponding LIODs.  The LIODNs must therefore be
> carefully assigned to avoid cache thrashing -- two active LIODs with
> LIODNs that put them in the same cache line.
>
> Currently, LIODNs are statically assigned by U-Boot, but this has
> limitations.  LIODNs are assigned even for devices that may be disabled
> or unused by the kernel.  Static assignments also do not allow for device
> drivers which may know which LIODs can be used simultaneously.  In
> other words, we really should assign LIODNs dynamically in Linux.
>
> To do that, we need to describe the PAMU device and cache topologies in
> the device trees.
>
> Signed-off-by: Timur Tabi <timur@freescale.com>
> ---
> .../devicetree/bindings/powerpc/fsl/guts.txt       |   14 ++-
> .../devicetree/bindings/powerpc/fsl/pamu.txt       |  142 ++++++++++++++++++++
> arch/powerpc/boot/dts/fsl/p2041si-post.dtsi        |   87 +++++++++++--
> arch/powerpc/boot/dts/fsl/p3041si-post.dtsi        |   87 +++++++++++--
> arch/powerpc/boot/dts/fsl/p4080si-post.dtsi        |   68 +++++++++-
> arch/powerpc/boot/dts/fsl/p5020si-post.dtsi        |   92 +++++++++++--
> arch/powerpc/boot/dts/fsl/p5040si-post.dtsi        |   92 +++++++++++--
> 7 files changed, 533 insertions(+), 49 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/pamu.txt

Scott, Stuart, does this have your Ack?

- k


* Re: [PATCH 1/2] powerpc/mpic: allow coreint to be determined by MPIC version
From: Kumar Gala @ 2013-01-23 17:23 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <1358819804-28665-1-git-send-email-scottwood@freescale.com>


On Jan 21, 2013, at 7:56 PM, Scott Wood wrote:

> This will be used by the qemu-e500 platform, as the MPIC version (and
> thus whether we have coreint) depends on how QEMU is configured.
>
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> arch/powerpc/sysdev/mpic.c |   26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)

Is the idea that we'd set mpic->flags such that MPIC_ENABLE_COREINT was
set, but based on the controller version we'd ignore the flag?

- k


* Re: [PATCH Bug fix 0/5] Bug fix for physical memory hot-remove.
From: Tang Chen @ 2013-01-23 13:17 UTC (permalink / raw)
  To: Simon Jeons
  Cc: linux-mm, paulus, hpa, cl, sfr, x86, linux-acpi, isimatu.yasuaki,
	linfeng, mgorman, kosaki.motohiro, rientjes, len.brown, jiang.liu,
	wency, julian.calaby, glommer, wujianguo, yinghai, laijs,
	linux-kernel, minchan.kim, akpm, linuxppc-dev
In-Reply-To: <1358944171.3351.1.camel@kernel>

On 01/23/2013 08:29 PM, Simon Jeons wrote:
> Hi Tang,
>
> I remember your big physical memory hot-remove patchset has already
> been merged by Andrew, but where can I find it? Could you give me the
> git tree address?

Hi Simon,

You can find all the physical memory hot-remove patches and the related
bugfix patches at the following URL:

git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm


Thanks. :)


* Re: [PATCH Bug fix 0/5] Bug fix for physical memory hot-remove.
From: Simon Jeons @ 2013-01-23 12:29 UTC (permalink / raw)
  To: Tang Chen
  Cc: linux-mm, paulus, hpa, cl, sfr, x86, linux-acpi, isimatu.yasuaki,
	linfeng, mgorman, kosaki.motohiro, rientjes, len.brown, jiang.liu,
	wency, julian.calaby, glommer, wujianguo, yinghai, laijs,
	linux-kernel, minchan.kim, akpm, linuxppc-dev
In-Reply-To: <1358854984-6073-1-git-send-email-tangchen@cn.fujitsu.com>

On Tue, 2013-01-22 at 19:42 +0800, Tang Chen wrote:
> Here are some bug fix patches for physical memory hot-remove. All these
> patches are based on the latest -mm tree.
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm
> 
> And patch1 and patch3 are very important.
> patch1: free compound pages when freeing memmap, otherwise the kernel
>         will panic the next time memory is hot-added.
> patch3: the old way of freeing pagetable pages was wrong. We should never
>         split larger pages into small ones.
> 
> 

Hi Tang,

I remember your big physical memory hot-remove patchset has already
been merged by Andrew, but where can I find it? Could you give me the
git tree address?

> Lai Jiangshan (1):
>   Bug-fix: mempolicy: fix is_valid_nodemask()
> 
> Tang Chen (3):
>   Bug fix: Do not split pages when freeing pagetable pages.
>   Bug fix: Fix section mismatch problem of
>     release_firmware_map_entry().
>   Bug fix: Fix the doc format in drivers/firmware/memmap.c
> 
> Wen Congyang (1):
>   Bug fix: consider compound pages when free memmap
> 
>  arch/x86/mm/init_64.c     |  148 ++++++++++++++-------------------------------
>  drivers/firmware/memmap.c |   16 +++---
>  mm/mempolicy.c            |   36 +++++++----
>  mm/sparse.c               |    2 +-
>  4 files changed, 77 insertions(+), 125 deletions(-)
> 


* [PATCH] powerpc: make irq_stat.timers_irqs counting more specific
From: Fan Du @ 2013-01-23  8:06 UTC (permalink / raw)
  To: benh, paulus; +Cc: linuxppc-dev, fan.du

The current irq_stat.timer_irqs counter doesn't distinguish between timer
event handler invocations and other timer interrupts (such as
arch_irq_work_raise). Sometimes we need to know exactly how many interrupts
invoked the timer event handler, so let's count the two cases separately.

Signed-off-by: Fan Du <fan.du@windriver.com>
---
 arch/powerpc/include/asm/hardirq.h |    3 ++-
 arch/powerpc/kernel/irq.c          |   12 +++++++++---
 arch/powerpc/kernel/time.c         |    3 ++-
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/hardirq.h b/arch/powerpc/include/asm/hardirq.h
index 3147a29..ff8bfdc 100644
--- a/arch/powerpc/include/asm/hardirq.h
+++ b/arch/powerpc/include/asm/hardirq.h
@@ -6,7 +6,8 @@
 
 typedef struct {
 	unsigned int __softirq_pending;
-	unsigned int timer_irqs;
+	unsigned int timer_irqs_event;
+	unsigned int timer_irqs_others;
 	unsigned int pmu_irqs;
 	unsigned int mce_exceptions;
 	unsigned int spurious_irqs;
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 71413f4..101127d 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -349,8 +349,13 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 
 	seq_printf(p, "%*s: ", prec, "LOC");
 	for_each_online_cpu(j)
-		seq_printf(p, "%10u ", per_cpu(irq_stat, j).timer_irqs);
-        seq_printf(p, "  Local timer interrupts\n");
+		seq_printf(p, "%10u ", per_cpu(irq_stat, j).timer_irqs_event);
+        seq_printf(p, "  Local timer interrupts for timer event device\n");
+
+	seq_printf(p, "%*s: ", prec, "LOC");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ", per_cpu(irq_stat, j).timer_irqs_others);
+        seq_printf(p, "  Local timer interrupts for others\n");
 
 	seq_printf(p, "%*s: ", prec, "SPU");
 	for_each_online_cpu(j)
@@ -375,11 +380,12 @@ int arch_show_interrupts(struct seq_file *p, int prec)
  */
 u64 arch_irq_stat_cpu(unsigned int cpu)
 {
-	u64 sum = per_cpu(irq_stat, cpu).timer_irqs;
+	u64 sum = per_cpu(irq_stat, cpu).timer_irqs_event;
 
 	sum += per_cpu(irq_stat, cpu).pmu_irqs;
 	sum += per_cpu(irq_stat, cpu).mce_exceptions;
 	sum += per_cpu(irq_stat, cpu).spurious_irqs;
+	sum += per_cpu(irq_stat, cpu).timer_irqs_others;
 
 	return sum;
 }
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index b3b1435..7e7c553 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -504,7 +504,6 @@ void timer_interrupt(struct pt_regs * regs)
 	 */
 	may_hard_irq_enable();
 
-	__get_cpu_var(irq_stat).timer_irqs++;
 
 #if defined(CONFIG_PPC32) && defined(CONFIG_PMAC)
 	if (atomic_read(&ppc_n_lost_interrupts) != 0)
@@ -526,10 +525,12 @@ void timer_interrupt(struct pt_regs * regs)
 		*next_tb = ~(u64)0;
 		if (evt->event_handler)
 			evt->event_handler(evt);
+		__get_cpu_var(irq_stat).timer_irqs_event++;
 	} else {
 		now = *next_tb - now;
 		if (now <= DECREMENTER_MAX)
 			set_dec((int)now);
+		__get_cpu_var(irq_stat).timer_irqs_others++;
 	}
 
 #ifdef CONFIG_PPC64
-- 
1.7.1
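
For illustration only (not part of the patch), a userspace model of the
accounting the patch introduces. The struct mirrors the fields touched in
hardirq.h; the event_due flag is a hypothetical stand-in for the branch in
timer_interrupt() that calls evt->event_handler().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the per-CPU irq_stat fields used below. */
struct irq_stat_sketch {
	unsigned int timer_irqs_event;   /* interrupts that ran the event handler */
	unsigned int timer_irqs_others;  /* all other decrementer interrupts */
	unsigned int pmu_irqs;
	unsigned int mce_exceptions;
	unsigned int spurious_irqs;
};

/* Count one timer interrupt in the bucket matching the branch taken. */
static void account_timer_irq(struct irq_stat_sketch *s, bool event_due)
{
	assert(s != NULL);
	if (event_due)
		s->timer_irqs_event++;
	else
		s->timer_irqs_others++;
}

/* Same shape as arch_irq_stat_cpu(): both timer buckets go into the sum. */
static uint64_t irq_stat_sum(const struct irq_stat_sketch *s)
{
	uint64_t sum = s->timer_irqs_event;

	sum += s->timer_irqs_others;
	sum += s->pmu_irqs;
	sum += s->mce_exceptions;
	sum += s->spurious_irqs;
	return sum;
}
```

The total stays what the single timer_irqs counter would have reported;
only the breakdown is new.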


* [PATCHv1] crypto: caam - Added property fsl, sec-era in SEC4.0 device tree binding.
From: Vakul Garg @ 2013-01-23  7:21 UTC (permalink / raw)
  To: linuxppc-dev, devicetree-discuss, linux-crypto

This new property defines the era of the particular SEC version.
The compatible property in device tree "crypto" node has been updated
not to contain SEC era numbers.

Signed-off-by: Vakul Garg <vakul@freescale.com>
---
Changelog:
	1. Marked fsl,sec-era as 'optional'.

 .../devicetree/bindings/crypto/fsl-sec4.txt        |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
index fc9ce6f..dc40055 100644
--- a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
+++ b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
@@ -54,8 +54,13 @@ PROPERTIES
    - compatible
       Usage: required
       Value type: <string>
-      Definition: Must include "fsl,sec-v4.0". Also includes SEC
-           ERA versions (optional) with which the device is compatible.
+      Definition: Must include "fsl,sec-v4.0"
+
+   - fsl,sec-era
+      Usage: optional
+      Value type: <u32>
+      Definition: A standard property. Define the 'ERA' of the SEC
+          device.
 
    - #address-cells
        Usage: required
@@ -107,7 +112,8 @@ PROPERTIES
 
 EXAMPLE
 	crypto@300000 {
-		compatible = "fsl,sec-v4.0", "fsl,sec-era-v2.0";
+		compatible = "fsl,sec-v4.0";
+		fsl,sec-era = <0x2>;
 		#address-cells = <1>;
 		#size-cells = <1>;
 		reg = <0x300000 0x10000>;
-- 
1.7.7.6
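
For illustration only (not part of the patch), a sketch of how a consumer
might decode the optional property. Device tree cells are stored as
big-endian 32-bit values, so <0x2> is the bytes 00 00 00 02. The function
names and the fallback-on-absence behaviour are assumptions, not something
the binding specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one 32-bit device tree cell (big-endian in the blob). */
static uint32_t dt_cell_to_u32(const uint8_t cell[4])
{
	assert(cell != NULL);
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/*
 * Return the SEC era from the raw bytes of "fsl,sec-era".
 * The property is optional, so a missing (NULL) or malformed value
 * yields the caller-supplied fallback (assumed policy).
 */
static uint32_t sec_era_or_default(const uint8_t *prop, size_t len,
				   uint32_t fallback)
{
	if (prop == NULL || len != 4)
		return fallback;
	return dt_cell_to_u32(prop);
}
```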


* [PATCH] crypto: caam - Added property fsl, sec-era in SEC4.0 device tree binding.
From: Vakul Garg @ 2013-01-23  6:47 UTC (permalink / raw)
  To: linuxppc-dev

This new property defines the era of the particular SEC version.
The compatible property in device tree "crypto" node has been updated
not to contain SEC era numbers.

Signed-off-by: Vakul Garg <vakul@freescale.com>
---
 .../devicetree/bindings/crypto/fsl-sec4.txt        |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
index fc9ce6f..dc40055 100644
--- a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
+++ b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
@@ -54,8 +54,13 @@ PROPERTIES
    - compatible
       Usage: required
       Value type: <string>
-      Definition: Must include "fsl,sec-v4.0". Also includes SEC
-           ERA versions (optional) with which the device is compatible.
+      Definition: Must include "fsl,sec-v4.0"
+
+   - fsl,sec-era
+      Usage: required
+      Value type: <u32>
+      Definition: A standard property. Define the 'ERA' of the SEC
+          device.
 
    - #address-cells
        Usage: required
@@ -107,7 +112,8 @@ PROPERTIES
 
 EXAMPLE
 	crypto@300000 {
-		compatible = "fsl,sec-v4.0", "fsl,sec-era-v2.0";
+		compatible = "fsl,sec-v4.0";
+		fsl,sec-era = <0x2>;
 		#address-cells = <1>;
 		#size-cells = <1>;
 		reg = <0x300000 0x10000>;
-- 
1.7.7.6


* [PATCH 6/6][v4]: perf: Document the ABI of perf sysfs entries
From: Sukadev Bhattiprolu @ 2013-01-23  6:26 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar
  Cc: Andi Kleen, robert.richter, Anton Blanchard, linux-kernel,
	Stephane Eranian, linuxppc-dev, Arnaldo Carvalho de Melo,
	Jiri Olsa
In-Reply-To: <20130123062201.GA13720@us.ibm.com>


[PATCH 6/6][v4]: perf: Document the ABI of perf sysfs entries

This patchset adds two new sets of files to sysfs for POWER architecture.

	- perf event config format in /sys/devices/cpu/format/event
	- generic and POWER-specific perf events in /sys/devices/cpu/events/

The format of the first file is already documented in:

	sysfs-bus-event_source-devices-format

Document the format of the second set of files '/sys/devices/cpu/events/*'
which would also become part of the ABI.

Changelog[v4]:
	[Jiri Olsa]: Mention that multiple event= like terms can be specified
	in the 'events' file.
	[Jiri Olsa]: Remove the documentation for the 'config format' file
	as it is already documented in 'Documentation/ABI/testing/'.
	[Jiri Olsa]: Move ABI documentation from 'stable/' to 'testing/'

Changelog[v3]:
	[Greg KH] Include ABI documentation.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
---
 .../testing/sysfs-bus-event_source-devices-events  |   62 ++++++++++++++++++++
 1 files changed, 62 insertions(+), 0 deletions(-)
 delete mode 100644 Documentation/ABI/stable/sysfs-devices-cpu-events
 create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-events

diff --git a/Documentation/ABI/stable/sysfs-devices-cpu-events b/Documentation/ABI/stable/sysfs-devices-cpu-events
deleted file mode 100644
index e69de29..0000000
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
new file mode 100644
index 0000000..0adeb52
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
@@ -0,0 +1,62 @@
+What:		/sys/devices/cpu/events/
+		/sys/devices/cpu/events/branch-misses
+		/sys/devices/cpu/events/cache-references
+		/sys/devices/cpu/events/cache-misses
+		/sys/devices/cpu/events/stalled-cycles-frontend
+		/sys/devices/cpu/events/branch-instructions
+		/sys/devices/cpu/events/stalled-cycles-backend
+		/sys/devices/cpu/events/instructions
+		/sys/devices/cpu/events/cpu-cycles
+
+Date:		2013/01/08
+
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+
+Description:	Generic performance monitoring events
+
+		A collection of performance monitoring events that may be
+		supported by many/most CPUs. These events can be monitored
+		using the 'perf(1)' tool.
+
+		The contents of each file would look like:
+
+			event=0xNNNN
+
+		where 'N' is a hex digit and the number '0xNNNN' shows the
+		"raw code" for the perf event identified by the file's
+		"basename".
+
+
+What: 		/sys/devices/cpu/events/PM_LD_MISS_L1
+		/sys/devices/cpu/events/PM_LD_REF_L1
+		/sys/devices/cpu/events/PM_CYC
+		/sys/devices/cpu/events/PM_BRU_FIN
+		/sys/devices/cpu/events/PM_GCT_NOSLOT_CYC
+		/sys/devices/cpu/events/PM_BRU_MPRED
+		/sys/devices/cpu/events/PM_INST_CMPL
+		/sys/devices/cpu/events/PM_CMPLU_STALL
+
+Date:		2013/01/08
+
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+		Linux Powerpc mailing list <linuxppc-dev@ozlabs.org>
+
+Description:	POWER-systems specific performance monitoring events
+
+		A collection of performance monitoring events that may be
+		supported by the POWER CPU. These events can be monitored
+		using the 'perf(1)' tool.
+
+		These events may not be supported by other CPUs.
+
+		The contents of each file would look like:
+
+			event=0xNNNN
+
+		where 'N' is a hex digit and the number '0xNNNN' shows the
+		"raw code" for the perf event identified by the file's
+		"basename".
+
+		Further, multiple terms like 'event=0xNNNN' can be specified
+		and separated with comma. All available terms are defined in
+		the /sys/bus/event_source/devices/<dev>/format file.
-- 
1.7.1
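
For illustration only (not part of the patch), a sketch of how a tool could
read the documented file contents: one or more comma-separated name=value
terms such as "event=0xNNNN". The function name is made up, and any term
name other than "event" used with it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up 'name' in a comma-separated list of name=value terms and
 * store its value in *out. Returns true if the term was found and its
 * value parsed completely.
 */
static bool parse_event_term(const char *text, const char *name, uint64_t *out)
{
	size_t name_len;
	const char *p = text;

	assert(text != NULL && name != NULL && out != NULL);
	name_len = strlen(name);

	while (*p != '\0') {
		size_t term_len = strcspn(p, ",\n");   /* length of this term */

		if (term_len > name_len + 1 &&
		    strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
			char *end;
			unsigned long long v = strtoull(p + name_len + 1, &end, 0);

			if (end != p + term_len)
				return false;   /* trailing junk in the value */
			*out = v;
			return true;
		}
		p += term_len;
		if (*p == ',' || *p == '\n')
			p++;                    /* skip the separator */
	}
	return false;
}
```

Base 0 in strtoull() accepts the "0x" prefix used by the events files as
well as plain decimal values.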


* [PATCH 5/6][v4]: perf: Create a sysfs entry for Power event format
From: Sukadev Bhattiprolu @ 2013-01-23  6:26 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar
  Cc: Andi Kleen, robert.richter, Anton Blanchard, linux-kernel,
	Stephane Eranian, linuxppc-dev, Arnaldo Carvalho de Melo,
	Jiri Olsa
In-Reply-To: <20130123062201.GA13720@us.ibm.com>


[PATCH 5/6][v4]: perf: Create a sysfs entry for Power event format

Create a sysfs entry, '/sys/bus/event_source/devices/cpu/format/event'
which describes the format of a perf event on a POWER cpu.

The format of the event is the same for all POWER cpus (at least in
Power6 and Power7), so the bulk of this change is in the code common
to POWER cpus.

This code is based on corresponding code in x86.

Changelog[v2]: [Jiri Olsa] Use PMU_FORMAT_ATTR() rather than duplicating it.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h |    6 ++++++
 arch/powerpc/perf/core-book3s.c              |   12 ++++++++++++
 arch/powerpc/perf/power7-pmu.c               |    1 +
 3 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index b29fcc6..ee63205 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -135,3 +135,9 @@ extern ssize_t power_events_sysfs_show(struct device *dev,
 
 #define	POWER_EVENT_ATTR(_name, _id)	EVENT_ATTR(PM_##_name, _id, _p)
 #define	POWER_EVENT_PTR(_id)		EVENT_PTR(_id, _p)
+
+/*
+ * Format of a perf event is the same on all POWER cpus. Declare a
+ * common sysfs attribute group that individual POWER cpus can share.
+ */
+extern struct attribute_group power_pmu_format_group;
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index fa476d5..4ae044b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1315,6 +1315,18 @@ ssize_t power_events_sysfs_show(struct device *dev,
 	return sprintf(page, "event=0x%02llx\n", pmu_attr->id);
 }
 
+PMU_FORMAT_ATTR(event, "config:0-20");
+
+static struct attribute *power_pmu_format_attr[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+struct attribute_group power_pmu_format_group = {
+	.name = "format",
+	.attrs = power_pmu_format_attr,
+};
+
 struct pmu power_pmu = {
 	.pmu_enable	= power_pmu_enable,
 	.pmu_disable	= power_pmu_disable,
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index 5627940..5fb3c9b 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -410,6 +410,7 @@ static struct attribute_group power7_pmu_events_group = {
 };
 
 static const struct attribute_group *power7_pmu_attr_groups[] = {
+	&power_pmu_format_group,
 	&power7_pmu_events_group,
 	NULL,
 };
-- 
1.7.1

* [PATCH 4/6][v4]: perf/POWER7: Make some POWER7 events available in sysfs
From: Sukadev Bhattiprolu @ 2013-01-23  6:25 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar
  Cc: Andi Kleen, robert.richter, Anton Blanchard, linux-kernel,
	Stephane Eranian, linuxppc-dev, Arnaldo Carvalho de Melo,
	Jiri Olsa
In-Reply-To: <20130123062201.GA13720@us.ibm.com>


[PATCH 4/6][v4]: perf/POWER7: Make some POWER7 events available in sysfs

Make some POWER7-specific perf events available in sysfs.

	$ /bin/ls -1 /sys/bus/event_source/devices/cpu/events/
	branch-instructions
	branch-misses
	cache-misses
	cache-references
	cpu-cycles
	instructions
	PM_BRU_FIN
	PM_BRU_MPRED
	PM_CMPLU_STALL
	PM_CYC
	PM_GCT_NOSLOT_CYC
	PM_INST_CMPL
	PM_LD_MISS_L1
	PM_LD_REF_L1
	stalled-cycles-backend
	stalled-cycles-frontend

where the 'PM_*' events are POWER specific and the others are the
generic events.

This will enable users to specify these events with their symbolic
names rather than with their raw code.

	perf stat -e 'cpu/PM_CYC/' ...

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h |    2 ++
 arch/powerpc/perf/power7-pmu.c               |   18 ++++++++++++++++++
 2 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 3f21d89..b29fcc6 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -133,3 +133,5 @@ extern ssize_t power_events_sysfs_show(struct device *dev,
 #define	GENERIC_EVENT_ATTR(_name, _id)	EVENT_ATTR(_name, _id, _g)
 #define	GENERIC_EVENT_PTR(_id)		EVENT_PTR(_id, _g)
 
+#define	POWER_EVENT_ATTR(_name, _id)	EVENT_ATTR(PM_##_name, _id, _p)
+#define	POWER_EVENT_PTR(_id)		EVENT_PTR(_id, _p)
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index ae5d757..5627940 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -373,6 +373,15 @@ GENERIC_EVENT_ATTR(cache-misses,		LD_MISS_L1);
 GENERIC_EVENT_ATTR(branch-instructions,		BRU_FIN);
 GENERIC_EVENT_ATTR(branch-misses,		BRU_MPRED);
 
+POWER_EVENT_ATTR(CYC,				CYC);
+POWER_EVENT_ATTR(GCT_NOSLOT_CYC,		GCT_NOSLOT_CYC);
+POWER_EVENT_ATTR(CMPLU_STALL,			CMPLU_STALL);
+POWER_EVENT_ATTR(INST_CMPL,			INST_CMPL);
+POWER_EVENT_ATTR(LD_REF_L1,			LD_REF_L1);
+POWER_EVENT_ATTR(LD_MISS_L1,			LD_MISS_L1);
+POWER_EVENT_ATTR(BRU_FIN,			BRU_FIN);
+POWER_EVENT_ATTR(BRU_MPRED,			BRU_MPRED);
+
 static struct attribute *power7_events_attr[] = {
 	GENERIC_EVENT_PTR(CYC),
 	GENERIC_EVENT_PTR(GCT_NOSLOT_CYC),
@@ -382,6 +391,15 @@ static struct attribute *power7_events_attr[] = {
 	GENERIC_EVENT_PTR(LD_MISS_L1),
 	GENERIC_EVENT_PTR(BRU_FIN),
 	GENERIC_EVENT_PTR(BRU_MPRED),
+
+	POWER_EVENT_PTR(CYC),
+	POWER_EVENT_PTR(GCT_NOSLOT_CYC),
+	POWER_EVENT_PTR(CMPLU_STALL),
+	POWER_EVENT_PTR(INST_CMPL),
+	POWER_EVENT_PTR(LD_REF_L1),
+	POWER_EVENT_PTR(LD_MISS_L1),
+	POWER_EVENT_PTR(BRU_FIN),
+	POWER_EVENT_PTR(BRU_MPRED),
 	NULL
 };
 
-- 
1.7.1

* [PATCH 3/6][v4]: perf/POWER7: Make generic event translations available in sysfs
From: Sukadev Bhattiprolu @ 2013-01-23  6:24 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar
  Cc: Andi Kleen, robert.richter, Anton Blanchard, linux-kernel,
	Stephane Eranian, linuxppc-dev, Arnaldo Carvalho de Melo,
	Jiri Olsa
In-Reply-To: <20130123062201.GA13720@us.ibm.com>


[PATCH 3/6][v4]: perf/POWER7: Make generic event translations available in sysfs

Make the generic perf events in POWER7 available via sysfs.

	$ ls /sys/bus/event_source/devices/cpu/events
	branch-instructions
	branch-misses
	cache-misses
	cache-references
	cpu-cycles
	instructions
	stalled-cycles-backend
	stalled-cycles-frontend

	$ cat /sys/bus/event_source/devices/cpu/events/cache-misses
	event=0x400f0
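
A minimal sketch of how a userspace tool might read one of these files,
assuming the file holds a single "event=0x..." term as shown above. The
helper name parse_event_term() is hypothetical; it is not how the perf
tool itself parses these files.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse a single "event=0x..." term as printed by
 * the sysfs show routine. Returns 0 on success, -1 on malformed input.
 */
int parse_event_term(const char *term, unsigned long long *code)
{
	/* The literal "event=0x" must match before the hex digits are read. */
	return sscanf(term, "event=0x%llx", code) == 1 ? 0 : -1;
}
```

A file that lists several comma-separated terms would need a loop around
this; the sketch deliberately handles only the single-term case.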

This patch is based on commits that implement this functionality on x86.
Eg:
	commit a47473939db20e3961b200eb00acf5fcf084d755
	Author: Jiri Olsa <jolsa@redhat.com>
	Date:   Wed Oct 10 14:53:11 2012 +0200

	    perf/x86: Make hardware event translations available in sysfs

Changelog:[v2]
	[Jiri Olsa] Drop EVENT_ID() macro since it is only used once.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h      |   24 ++++++++++++++
 arch/powerpc/perf/core-book3s.c                   |   12 +++++++
 arch/powerpc/perf/power7-pmu.c                    |   34 +++++++++++++++++++++
 3 files changed, 70 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/stable/sysfs-devices-cpu-events

diff --git a/Documentation/ABI/stable/sysfs-devices-cpu-events b/Documentation/ABI/stable/sysfs-devices-cpu-events
new file mode 100644
index 0000000..e69de29
diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 9710be3..3f21d89 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -11,6 +11,7 @@
 
 #include <linux/types.h>
 #include <asm/hw_irq.h>
+#include <linux/device.h>
 
 #define MAX_HWEVENTS		8
 #define MAX_EVENT_ALTERNATIVES	8
@@ -35,6 +36,7 @@ struct power_pmu {
 	void		(*disable_pmc)(unsigned int pmc, unsigned long mmcr[]);
 	int		(*limited_pmc_event)(u64 event_id);
 	u32		flags;
+	const struct attribute_group	**attr_groups;
 	int		n_generic;
 	int		*generic_events;
 	int		(*cache_events)[PERF_COUNT_HW_CACHE_MAX]
@@ -109,3 +111,25 @@ extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
  * If an event_id is not subject to the constraint expressed by a particular
  * field, then it will have 0 in both the mask and value for that field.
  */
+
+extern ssize_t power_events_sysfs_show(struct device *dev,
+				struct device_attribute *attr, char *page);
+
+/*
+ * EVENT_VAR() is same as PMU_EVENT_VAR with a suffix.
+ *
+ * Having a suffix allows us to have aliases in sysfs - eg: the generic
+ * event 'cpu-cycles' can have two entries in sysfs: 'cpu-cycles' and
+ * 'PM_CYC' where the latter is the name by which the event is known in
+ * POWER CPU specification.
+ */
+#define	EVENT_VAR(_id, _suffix)		event_attr_##_id##_suffix
+#define	EVENT_PTR(_id, _suffix)		&EVENT_VAR(_id, _suffix)
+
+#define	EVENT_ATTR(_name, _id, _suffix)					\
+	PMU_EVENT_ATTR(_name, EVENT_VAR(_id, _suffix), PME_PM_##_id,	\
+			power_events_sysfs_show)
+
+#define	GENERIC_EVENT_ATTR(_name, _id)	EVENT_ATTR(_name, _id, _g)
+#define	GENERIC_EVENT_PTR(_id)		EVENT_PTR(_id, _g)
+
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index aa2465e..fa476d5 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1305,6 +1305,16 @@ static int power_pmu_event_idx(struct perf_event *event)
 	return event->hw.idx;
 }
 
+ssize_t power_events_sysfs_show(struct device *dev,
+				struct device_attribute *attr, char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr;
+
+	pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+
+	return sprintf(page, "event=0x%02llx\n", pmu_attr->id);
+}
+
 struct pmu power_pmu = {
 	.pmu_enable	= power_pmu_enable,
 	.pmu_disable	= power_pmu_disable,
@@ -1537,6 +1547,8 @@ int __cpuinit register_power_pmu(struct power_pmu *pmu)
 	pr_info("%s performance monitor hardware support registered\n",
 		pmu->name);
 
+	power_pmu.attr_groups = ppmu->attr_groups;
+
 #ifdef MSR_HV
 	/*
 	 * Use FCHV to ignore kernel events if MSR.HV is set.
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index 44e70d2..ae5d757 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -363,6 +363,39 @@ static int power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	},
 };
 
+
+GENERIC_EVENT_ATTR(cpu-cycles,			CYC);
+GENERIC_EVENT_ATTR(stalled-cycles-frontend,	GCT_NOSLOT_CYC);
+GENERIC_EVENT_ATTR(stalled-cycles-backend,	CMPLU_STALL);
+GENERIC_EVENT_ATTR(instructions,		INST_CMPL);
+GENERIC_EVENT_ATTR(cache-references,		LD_REF_L1);
+GENERIC_EVENT_ATTR(cache-misses,		LD_MISS_L1);
+GENERIC_EVENT_ATTR(branch-instructions,		BRU_FIN);
+GENERIC_EVENT_ATTR(branch-misses,		BRU_MPRED);
+
+static struct attribute *power7_events_attr[] = {
+	GENERIC_EVENT_PTR(CYC),
+	GENERIC_EVENT_PTR(GCT_NOSLOT_CYC),
+	GENERIC_EVENT_PTR(CMPLU_STALL),
+	GENERIC_EVENT_PTR(INST_CMPL),
+	GENERIC_EVENT_PTR(LD_REF_L1),
+	GENERIC_EVENT_PTR(LD_MISS_L1),
+	GENERIC_EVENT_PTR(BRU_FIN),
+	GENERIC_EVENT_PTR(BRU_MPRED),
+	NULL
+};
+
+
+static struct attribute_group power7_pmu_events_group = {
+	.name = "events",
+	.attrs = power7_events_attr,
+};
+
+static const struct attribute_group *power7_pmu_attr_groups[] = {
+	&power7_pmu_events_group,
+	NULL,
+};
+
 static struct power_pmu power7_pmu = {
 	.name			= "POWER7",
 	.n_counter		= 6,
@@ -374,6 +407,7 @@ static struct power_pmu power7_pmu = {
 	.get_alternatives	= power7_get_alternatives,
 	.disable_pmc		= power7_disable_pmc,
 	.flags			= PPMU_ALT_SIPR,
+	.attr_groups		= power7_pmu_attr_groups,
 	.n_generic		= ARRAY_SIZE(power7_generic_events),
 	.generic_events		= power7_generic_events,
 	.cache_events		= &power7_cache_events,
-- 
1.7.1

* [PATCH 2/6][v4]: perf: Make EVENT_ATTR global
From: Sukadev Bhattiprolu @ 2013-01-23  6:24 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar
  Cc: Andi Kleen, robert.richter, Anton Blanchard, linux-kernel,
	Stephane Eranian, linuxppc-dev, Arnaldo Carvalho de Melo,
	Jiri Olsa
In-Reply-To: <20130123062201.GA13720@us.ibm.com>


[PATCH 2/6][v4]: perf: Make EVENT_ATTR global

Rename EVENT_ATTR() to PMU_EVENT_ATTR() and make it global so it is
available to all architectures.

Further, to give architectures flexibility, have PMU_EVENT_ATTR() take
the variable name as a parameter.
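
A userspace sketch of why passing the variable name matters, assuming a
stand-in struct in place of struct perf_pmu_events_attr (the real one
embeds a struct device_attribute). All DEMO_* and demo_* names are
hypothetical; the suffix scheme mirrors what patch 3/6 of this series
builds on top of PMU_EVENT_ATTR().

```c
#include <stdint.h>

/* Stand-in for struct perf_pmu_events_attr, reduced to a name and an id. */
struct demo_events_attr {
	const char *name;
	uint64_t id;
};

/* The caller picks the variable name, so one event id can back several aliases. */
#define DEMO_EVENT_ATTR(_name, _var, _id)			\
static struct demo_events_attr _var = {				\
	.name = #_name,						\
	.id   = _id,						\
}

/* Build a distinct variable name per alias by pasting on a suffix. */
#define DEMO_VAR(_id, _suffix)	event_attr_##_id##_suffix

/* Two entries for the same event code: a generic name and a POWER name. */
DEMO_EVENT_ATTR(cpu-cycles, DEMO_VAR(CYC, _g), 0x1e);
DEMO_EVENT_ATTR(PM_CYC,     DEMO_VAR(CYC, _p), 0x1e);
```

If the macro derived the variable name from the id alone, the second
definition would collide with the first.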

Changelog[v2]
	- [Jiri Olsa] No need to define PMU_EVENT_PTR()

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
---
 arch/x86/kernel/cpu/perf_event.c |   13 +++----------
 include/linux/perf_event.h       |   11 +++++++++++
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 4428fd1..59a1238 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1316,11 +1316,6 @@ static struct attribute_group x86_pmu_format_group = {
 	.attrs = NULL,
 };
 
-struct perf_pmu_events_attr {
-	struct device_attribute attr;
-	u64 id;
-};
-
 /*
  * Remove all undefined events (x86_pmu.event_map(id) == 0)
  * out of events_attr attributes.
@@ -1354,11 +1349,9 @@ static ssize_t events_sysfs_show(struct device *dev, struct device_attribute *at
 #define EVENT_VAR(_id)  event_attr_##_id
 #define EVENT_PTR(_id) &event_attr_##_id.attr.attr
 
-#define EVENT_ATTR(_name, _id)					\
-static struct perf_pmu_events_attr EVENT_VAR(_id) = {		\
-	.attr = __ATTR(_name, 0444, events_sysfs_show, NULL),	\
-	.id   =  PERF_COUNT_HW_##_id,				\
-};
+#define EVENT_ATTR(_name, _id)						\
+	PMU_EVENT_ATTR(_name, EVENT_VAR(_id), PERF_COUNT_HW_##_id,	\
+			events_sysfs_show)
 
 EVENT_ATTR(cpu-cycles,			CPU_CYCLES		);
 EVENT_ATTR(instructions,		INSTRUCTIONS		);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 6bfb2fa..42adf01 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -817,6 +817,17 @@ do {									\
 } while (0)
 
 
+struct perf_pmu_events_attr {
+	struct device_attribute attr;
+	u64 id;
+};
+
+#define PMU_EVENT_ATTR(_name, _var, _id, _show)				\
+static struct perf_pmu_events_attr _var = {				\
+	.attr = __ATTR(_name, 0444, _show, NULL),			\
+	.id   =  _id,							\
+};
+
 #define PMU_FORMAT_ATTR(_name, _format)					\
 static ssize_t								\
 _name##_show(struct device *dev,					\
-- 
1.7.1

* [PATCH 1/6][v4]: perf/Power7: Use macros to identify perf events
From: Sukadev Bhattiprolu @ 2013-01-23  6:23 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar
  Cc: Andi Kleen, robert.richter, Anton Blanchard, linux-kernel,
	Stephane Eranian, linuxppc-dev, Arnaldo Carvalho de Melo,
	Jiri Olsa
In-Reply-To: <20130123062201.GA13720@us.ibm.com>


[PATCH 1/6][v4]: perf/Power7: Use macros to identify perf events

Define and use macros to identify perf event codes. This makes the code
easier to read when these event codes need to be used in more than one
place.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
---
 arch/powerpc/perf/power7-pmu.c |   28 ++++++++++++++++++++--------
 1 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index 441af08..44e70d2 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -51,6 +51,18 @@
 #define MMCR1_PMCSEL_MSK	0xff
 
 /*
+ * Power7 event codes.
+ */
+#define	PME_PM_CYC			0x1e
+#define	PME_PM_GCT_NOSLOT_CYC		0x100f8
+#define	PME_PM_CMPLU_STALL		0x4000a
+#define	PME_PM_INST_CMPL		0x2
+#define	PME_PM_LD_REF_L1		0xc880
+#define	PME_PM_LD_MISS_L1		0x400f0
+#define	PME_PM_BRU_FIN			0x10068
+#define	PME_PM_BRU_MPRED		0x400f6
+
+/*
  * Layout of constraint bits:
  * 6666555555555544444444443333333333222222222211111111110000000000
  * 3210987654321098765432109876543210987654321098765432109876543210
@@ -296,14 +308,14 @@ static void power7_disable_pmc(unsigned int pmc, unsigned long mmcr[])
 }
 
 static int power7_generic_events[] = {
-	[PERF_COUNT_HW_CPU_CYCLES] = 0x1e,
-	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x100f8, /* GCT_NOSLOT_CYC */
-	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x4000a,  /* CMPLU_STALL */
-	[PERF_COUNT_HW_INSTRUCTIONS] = 2,
-	[PERF_COUNT_HW_CACHE_REFERENCES] = 0xc880,	/* LD_REF_L1_LSU*/
-	[PERF_COUNT_HW_CACHE_MISSES] = 0x400f0,		/* LD_MISS_L1	*/
-	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x10068,	/* BRU_FIN	*/
-	[PERF_COUNT_HW_BRANCH_MISSES] = 0x400f6,	/* BR_MPRED	*/
+	[PERF_COUNT_HW_CPU_CYCLES] =			PME_PM_CYC,
+	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =	PME_PM_GCT_NOSLOT_CYC,
+	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =	PME_PM_CMPLU_STALL,
+	[PERF_COUNT_HW_INSTRUCTIONS] =			PME_PM_INST_CMPL,
+	[PERF_COUNT_HW_CACHE_REFERENCES] =		PME_PM_LD_REF_L1,
+	[PERF_COUNT_HW_CACHE_MISSES] =			PME_PM_LD_MISS_L1,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] =		PME_PM_BRU_FIN,
+	[PERF_COUNT_HW_BRANCH_MISSES] =			PME_PM_BRU_MPRED,
 };
 
 #define C(x)	PERF_COUNT_HW_CACHE_##x
-- 
1.7.1

* [PATCH 0/6][v4]: perf: Make POWER7 events available in sysfs
From: Sukadev Bhattiprolu @ 2013-01-23  6:22 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar
  Cc: Andi Kleen, robert.richter, Anton Blanchard, linux-kernel,
	Stephane Eranian, linuxppc-dev, Arnaldo Carvalho de Melo,
	Jiri Olsa

[PATCH 0/6][v4]: perf: Make POWER7 events available in sysfs

Make the generic and some POWER7-specific perf events available in sysfs.
These changes mainly extend similar functionality implemented in x86 to
work on the POWER architecture as well.

Thanks to input from Stephane Eranian, Robert Richter, Peter Zijlstra
and Jiri Olsa.

Changelog[v4]:
	[Jiri Olsa]: Document that multiple event= like terms can be specified
	in the 'events' file.
	[Jiri Olsa]: Remove the documentation for the 'config format' file
	as it is already documented in 'Documentation/ABI/testing/'
	[Jiri Olsa]: Move the ABI documentation from 'stable/' to 'testing/'.

Changelog[v3]:
	[Jiri Olsa]: No need to define EVENT_ID, PMU_EVENT_PTR() if used only
	once
	[Greg KH]: Document the new sysfs interfaces in Documentation/ABI

Changelog[v2]:
	[Jiri Olsa] Use PMU_FORMAT_ATTR() rather than duplicating code.

Sukadev Bhattiprolu (6):
  perf/Power7: Use macros to identify perf events
  perf: Make EVENT_ATTR global
  perf/POWER7: Make generic event translations available in sysfs
  perf/POWER7: Make some POWER7 events available in sysfs
  perf: Create a sysfs entry for Power event format
  perf: Document the ABI of perf sysfs entries

 .../testing/sysfs-bus-event_source-devices-events  |   62 +++++++++++++++
 arch/powerpc/include/asm/perf_event_server.h       |   32 ++++++++
 arch/powerpc/perf/core-book3s.c                    |   24 ++++++
 arch/powerpc/perf/power7-pmu.c                     |   81 ++++++++++++++++++--
 arch/x86/kernel/cpu/perf_event.c                   |   13 +---
 include/linux/perf_event.h                         |   11 +++
 6 files changed, 205 insertions(+), 18 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-events

* Re: [PATCH 1/6][v3] perf/Power7: Use macros to identify perf events
From: Michael Ellerman @ 2013-01-23  3:50 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Andi Kleen, Peter Zijlstra, robert.richter, Anton Blanchard,
	linux-kernel, Stephane Eranian, linuxppc-dev, Ingo Molnar,
	Paul Mackerras, Arnaldo Carvalho de Melo, Jiri Olsa
In-Reply-To: <20130110010347.GA32590@us.ibm.com>

On Wed, 2013-01-09 at 17:03 -0800, Sukadev Bhattiprolu wrote:
> [PATCH 1/6][v3] perf/Power7: Use macros to identify perf events
> 
> Define and use macros to identify perf events codes. This would make it
> easier and more readable when these event codes need to be used in more
> than one place.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/power7-pmu.c |   28 ++++++++++++++++++++--------
>  1 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
> index 441af08..44e70d2 100644
> --- a/arch/powerpc/perf/power7-pmu.c
> +++ b/arch/powerpc/perf/power7-pmu.c
> @@ -51,6 +51,18 @@
>  #define MMCR1_PMCSEL_MSK	0xff
>  
>  /*
> + * Power7 event codes.
> + */
> +#define	PME_PM_CYC			0x1e
> +#define	PME_PM_GCT_NOSLOT_CYC		0x100f8
> +#define	PME_PM_CMPLU_STALL		0x4000a
> +#define	PME_PM_INST_CMPL		0x2
> +#define	PME_PM_LD_REF_L1		0xc880
> +#define	PME_PM_LD_MISS_L1		0x400f0
> +#define	PME_PM_BRU_FIN			0x10068
> +#define	PME_PM_BRU_MPRED		0x400f6
> +
> +/*
>   * Layout of constraint bits:
>   * 6666555555555544444444443333333333222222222211111111110000000000
>   * 3210987654321098765432109876543210987654321098765432109876543210
> @@ -296,14 +308,14 @@ static void power7_disable_pmc(unsigned int pmc, unsigned long mmcr[])
>  }
>  
>  static int power7_generic_events[] = {
> -	[PERF_COUNT_HW_CPU_CYCLES] = 0x1e,
> -	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x100f8, /* GCT_NOSLOT_CYC */
> -	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x4000a,  /* CMPLU_STALL */
> -	[PERF_COUNT_HW_INSTRUCTIONS] = 2,
> -	[PERF_COUNT_HW_CACHE_REFERENCES] = 0xc880,	/* LD_REF_L1_LSU*/
> -	[PERF_COUNT_HW_CACHE_MISSES] = 0x400f0,		/* LD_MISS_L1	*/
> -	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x10068,	/* BRU_FIN	*/
> -	[PERF_COUNT_HW_BRANCH_MISSES] = 0x400f6,	/* BR_MPRED	*/
> +	[PERF_COUNT_HW_CPU_CYCLES] =			PME_PM_CYC,
> +	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =	PME_PM_GCT_NOSLOT_CYC,
> +	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =	PME_PM_CMPLU_STALL,
> +	[PERF_COUNT_HW_INSTRUCTIONS] =			PME_PM_INST_CMPL,
> +	[PERF_COUNT_HW_CACHE_REFERENCES] =		PME_PM_LD_REF_L1,
> +	[PERF_COUNT_HW_CACHE_MISSES] =			PME_PM_LD_MISS_L1,


Your patch is good, but it raises the question of why we're using L1
events for HW_CACHE.

AFAICS on Intel they use 0x4f2e/0x412e, which are last-level-cache (LLC)
events.
        
        PMU name : ix86arch (Intel X86 architectural PMU)
        Name     : LLC_REFERENCES
        Desc     : count each request originating from the core to
        reference a cache line in the last level cache. The count may
        include speculation, but excludes cache line fills due to
        hardware prefetch
        Code     : 0x4f2e

        PMU name : ix86arch (Intel X86 architectural PMU)
        Name     : LLC_MISSES
        Desc     : count each cache miss condition for references to the
        last level cache. The event count may include speculation, but
        excludes cache line fills due to hardware prefetch
        Code     : 0x412e


That would seem to more closely match our PM_L3_LD_HIT/MISS?
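
A small sketch of how the two Intel codes above break down, assuming the
usual raw encoding where bits 0-7 are the event select and bits 8-15 are
the unit mask. The helper names are hypothetical.

```c
#include <stdint.h>

/* Bits 0-7 of an Intel raw event code: the event select. */
static inline uint8_t raw_event_select(uint64_t raw)
{
	return raw & 0xff;
}

/* Bits 8-15 of an Intel raw event code: the unit mask. */
static inline uint8_t raw_umask(uint64_t raw)
{
	return (raw >> 8) & 0xff;
}
```

Both 0x4f2e and 0x412e decode to event select 0x2e and differ only in
the unit mask (0x4f for references, 0x41 for misses).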

cheers

* [PATCH][v2] KVM: PPC: add paravirt idle loop for 64-bit book E
From: Stuart Yoder @ 2013-01-22 23:54 UTC (permalink / raw)
  To: agraf, benh; +Cc: linuxppc-dev, kvm-ppc, kvm, Stuart Yoder

From: Stuart Yoder <stuart.yoder@freescale.com>

Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
---

-v2
   -macro'ized loop in idle_book3e.S to avoid code 
    duplication, paravirt loop is now in idle_book3e.S

 arch/powerpc/kernel/epapr_hcalls.S |    2 ++
 arch/powerpc/kernel/idle_book3e.S  |   30 ++++++++++++++++++++++++++++--
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/epapr_hcalls.S b/arch/powerpc/kernel/epapr_hcalls.S
index 62c0dc2..9f1ebf7 100644
--- a/arch/powerpc/kernel/epapr_hcalls.S
+++ b/arch/powerpc/kernel/epapr_hcalls.S
@@ -17,6 +17,7 @@
 #include <asm/asm-compat.h>
 #include <asm/asm-offsets.h>
 
+#ifndef CONFIG_PPC64
 /* epapr_ev_idle() was derived from e500_idle() */
 _GLOBAL(epapr_ev_idle)
 	CURRENT_THREAD_INFO(r3, r1)
@@ -42,6 +43,7 @@ epapr_ev_idle_start:
 	 * _TLF_NAPPING.
 	 */
 	b	idle_loop
+#endif
 
 /* Hypercall entry point. Will be patched with device tree instructions. */
 .global epapr_hypercall_start
diff --git a/arch/powerpc/kernel/idle_book3e.S b/arch/powerpc/kernel/idle_book3e.S
index 4c7cb400..e1c9acd 100644
--- a/arch/powerpc/kernel/idle_book3e.S
+++ b/arch/powerpc/kernel/idle_book3e.S
@@ -16,11 +16,13 @@
 #include <asm/ppc-opcode.h>
 #include <asm/processor.h>
 #include <asm/thread_info.h>
+#include <asm/epapr_hcalls.h>
 
 /* 64-bit version only for now */
 #ifdef CONFIG_PPC64
 
-_GLOBAL(book3e_idle)
+.macro BOOK3E_IDLE name loop
+_GLOBAL(\name)
 	/* Save LR for later */
 	mflr	r0
 	std	r0,16(r1)
@@ -67,7 +69,31 @@ _GLOBAL(book3e_idle)
 
 	/* We can now re-enable hard interrupts and go to sleep */
 	wrteei	1
-1:	PPC_WAIT(0)
+	\loop
+
+.endm
+
+.macro BOOK3E_IDLE_LOOP
+1:
+	PPC_WAIT(0)
 	b	1b
+.endm
+
+.macro EPAPR_EV_IDLE_LOOP
+idle_loop:
+       LOAD_REG_IMMEDIATE(r11, EV_HCALL_TOKEN(EV_IDLE))
+
+.global epapr_ev_idle_start
+epapr_ev_idle_start:
+       li      r3, -1
+       nop
+       nop
+       nop
+       b       idle_loop
+.endm
+
+BOOK3E_IDLE epapr_ev_idle, EPAPR_EV_IDLE_LOOP
+
+BOOK3E_IDLE book3e_idle BOOK3E_IDLE_LOOP
 
 #endif /* CONFIG_PPC64 */
-- 
1.7.9.7

* Re: [PATCH] perf: Fix compile warnings in tests/attr.c
From: Michael Ellerman @ 2013-01-22 23:45 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Anton Blanchard, linux-kernel, linuxppc-dev, paulus, acme, mingo,
	Jiri Olsa
In-Reply-To: <20130121213823.GA4774@us.ibm.com>

On Mon, 2013-01-21 at 13:38 -0800, Sukadev Bhattiprolu wrote:
> Jiri Olsa [jolsa@redhat.com] wrote:
> | On Fri, Jan 18, 2013 at 05:30:52PM -0800, Sukadev Bhattiprolu wrote:
> | > From 4d266e5040c33103f5d226b0d16b89f8ef79e3ad Mon Sep 17 00:00:00 2001
> | > From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> | > Date: Fri, 18 Jan 2013 11:14:28 -0800
> | > Subject: [PATCH] perf: Fix compile warnings in tests/attr.c
> | > 
> | > Replace '%llu' in printf()s with 'PRIu64' in 'tools/perf/tests/attr.c'
> | > to fix compile warnings (which become errors due to -Werror).
> | 
> | i386 and x86_64 compiles fine for me with gcc versions 4.6.3-2 and 4.7.2-2
> 
> But is broken on Power for 64bit :-( I am trying to fix that and thought
> that use of format specifiers like 'PRIu64' was the way to go.
> 
> | 
> | with your patch for x86_64 I'm getting following warnings/errors:
> 
> | 
> |     CC tests/attr.o
> | tests/attr.c: In function ‘store_event’:
> | tests/attr.c:69:4: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘__u64’ [-Werror=format]
> 
> Here is what I see on an x86_64 box, RHEL6.2 box:
> 
> 	$ rpm -qf /usr/include/linux/types.h
> 	kernel-headers-2.6.32-220.4.2.el6.x86_64
> 
> 	$ cat foo.c
> 	#include <linux/types.h>
> 
> 	$ cc -Werror -Wall foo.c
> 	In file included from /usr/include/asm-generic/types.h:7,
> 			 from /usr/include/asm/types.h:6,
> 			 from /usr/include/linux/types.h:4,
> 			 from foo5.c:1:
> 	/usr/include/asm-generic/int-ll64.h:31:2: error: #error __u64 defined as unsigned long long
> 
> where the #error is my debug message.
> 
> <snip>
> 
> | make: *** [tests/attr.o] Error 1
> | 
> | i386 compiles fine
> 
> __u64 is 'unsigned long long' on x86 and PRIu64 is 'llu' which is fine.
> 
> __u64 is 'unsigned long' on Power and PRIu64 is 'lu' which is again fine.
> 
> But __u64 is 'unsigned long long' on x86_64, but PRIu64 is '%lu' bc __WORDSIZE
> is 64.


This is a bit of a mess, but let me see if I can help explain it.

The root of the problem is that you're mixing up the kernel type __u64
with the userspace format specifier PRIu64.

PRIu64 is the format specifier for printing a uint64_t. It _may_ also be
the right specifier for a __u64, but there's no guarantee of that, as
you have discovered.

Inside the kernel both x86 and powerpc use unsigned long long always, in
32-bit and 64-bit code. That means in the kernel we can always use %llu.

On x86 that definition is also exported to userspace, so on x86 __u64 is
always unsigned long long. As you noticed this potentially differs from
uint64_t, which can be confusing. However it means in x86 userspace code
you can always print a __u64 with %llu.

On powerpc we default to using definitions that match userspace, so
__u64 changes depending on your wordsize, and so you must use PRIu64
etc. to print them.

There is however support in recent powerpc kernels to switch to using
unsigned long long even on 64-bit. See commit 2c9c6ce.

You need to define __SANE_USERSPACE_TYPES__ before including types.h.
Then you can always use %llu to print __u64.
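
A minimal sketch of that approach, assuming a Linux userspace build with
the kernel's exported headers installed. The two function names are
hypothetical and only illustrate which specifier goes with which type.

```c
/* Must be defined before anything pulls in <linux/types.h>. */
#define __SANE_USERSPACE_TYPES__
#include <linux/types.h>
#include <inttypes.h>
#include <stdio.h>

/* With __SANE_USERSPACE_TYPES__, __u64 is unsigned long long, so %llu matches. */
int format_kernel_u64(char *buf, size_t len, __u64 v)
{
	return snprintf(buf, len, "%llu", v);
}

/* uint64_t is the C library's type; PRIu64 is its matching specifier. */
int format_libc_u64(char *buf, size_t len, uint64_t v)
{
	return snprintf(buf, len, "%" PRIu64, v);
}
```

On architectures where __u64 is already unsigned long long in userspace
the define should be harmless, so the same source builds in both cases.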

cheers

* Re: [PATCH 1/2] powerpc/5200: Fix size to request_mem_region() call
From: Anatolij Gustschin @ 2013-01-22 21:10 UTC (permalink / raw)
  To: Grant Likely; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1358473200-17886-1-git-send-email-grant.likely@secretlab.ca>

Hi Grant,

On Fri, 18 Jan 2013 01:39:59 +0000
Grant Likely <grant.likely@secretlab.ca> wrote:

> The Bestcomm driver requests a memory region larger than the one
> described in the device tree. This is due to an extra undocumented field
> in the bestcomm register structure. This hasn't been a problem up to
> now, but there is a patch pending to make the DT platform_bus support
> code use platform_device_add() which tightens the rules and provides
> extra checks for drivers to stay within the specified register regions.
> 
> Alternately, I could have removed the extra field from the structure,
> but I'm not sure if it is still needed for resume to work. Better be
> safe and leave it in.
> 
> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Anatolij Gustschin <agust@denx.de>
> ---
>  arch/powerpc/sysdev/bestcomm/bestcomm.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

There is a patch moving this driver to drivers/dma:

http://patchwork.ozlabs.org/patch/191153/

I've applied it to my 5xxx next branch.

> diff --git a/arch/powerpc/sysdev/bestcomm/bestcomm.c b/arch/powerpc/sysdev/bestcomm/bestcomm.c
> index d913063..81c3314 100644
> --- a/arch/powerpc/sysdev/bestcomm/bestcomm.c
> +++ b/arch/powerpc/sysdev/bestcomm/bestcomm.c
> @@ -414,7 +414,7 @@ static int mpc52xx_bcom_probe(struct platform_device *op)
>  		goto error_sramclean;
>  	}
>  
> -	if (!request_mem_region(res_bcom.start, sizeof(struct mpc52xx_sdma),
> +	if (!request_mem_region(res_bcom.start, resource_size(&res_bcom),
>  				DRIVER_NAME)) {
>  		printk(KERN_ERR DRIVER_NAME ": "
>  			"Can't request registers region\n");

A similar change is needed for release_mem_region() in the error path
and in the driver's remove() function.
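
To spell out why: the length passed to release_mem_region() has to match
the length that was requested. A userspace sketch of the size
computation, using a stand-in struct rather than the kernel's struct
resource (the demo_* names are hypothetical):

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct resource (only the fields used here). */
struct demo_resource {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Mirrors the kernel's resource_size(): end is inclusive, hence the +1. */
static inline uint64_t demo_resource_size(const struct demo_resource *res)
{
	return res->end - res->start + 1;
}
```

Both the request and every release would then use the same
resource_size(&res_bcom) value instead of sizeof(struct mpc52xx_sdma).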

Thanks,

Anatolij

* Re: [PATCH v5 01/45] percpu_rwlock: Introduce the global reader-writer lock backend
From: Steven Rostedt @ 2013-01-22 20:54 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: linux-doc, peterz, fweisbec, linux-kernel, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rjw, namhyung, tglx, linux-arm-kernel, netdev, oleg, sbw, tj,
	akpm, linuxppc-dev
In-Reply-To: <50FEEF5D.6080302@linux.vnet.ibm.com>

On Wed, 2013-01-23 at 01:28 +0530, Srivatsa S. Bhat wrote:

> > I thought global locks are now fair. That is, a reader will block if a
> > writer is waiting. Hence, the above should deadlock on the current
> > rwlock_t types.
> > 
> 
> Oh is it? Last I checked, lockdep didn't complain about this ABBA scenario!

It doesn't and Peter Zijlstra said we need to fix that ;-)  It only
recently became an issue with the new "fair" locking of rwlocks.
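
A toy model of the scenario, assuming a strictly fair rwlock in which a
new reader queues behind any waiting writer. This is not the kernel's
rwlock_t; every name below is hypothetical.

```c
#include <stdbool.h>

/* Toy model of an rwlock's state; not the kernel's rwlock_t. */
struct demo_rwlock {
	int  readers;		/* readers currently holding the lock */
	bool writer_waiting;	/* a writer is queued behind those readers */
	bool fair;		/* fair: new readers queue behind a waiting writer */
};

/* Can a new reader take the lock right now? (No writer holds it in this model.) */
static bool demo_reader_may_enter(const struct demo_rwlock *l)
{
	if (l->fair && l->writer_waiting)
		return false;
	return true;
}

/*
 * CPU 1 holds the read lock and wants random_lock. CPU 0 holds
 * random_lock and wants the read lock. CPU 2 is waiting to write.
 * CPU 1 cannot proceed until CPU 0 drops random_lock, so if CPU 0
 * cannot enter as a reader, nobody makes progress.
 */
static bool demo_scenario_deadlocks(bool fair)
{
	struct demo_rwlock l = {
		.readers = 1,
		.writer_waiting = true,
		.fair = fair,
	};

	return !demo_reader_may_enter(&l);
}
```

With an unfair lock CPU 0 slips in as a second reader and the cycle never
forms; with a fair lock it waits behind the writer and the ABBA cycle
closes.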

-- Steve

* Re: [PATCH v5 01/45] percpu_rwlock: Introduce the global reader-writer lock backend
From: Srivatsa S. Bhat @ 2013-01-22 19:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-doc, peterz, fweisbec, linux-kernel, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rjw, namhyung, tglx, linux-arm-kernel, netdev, oleg, sbw, tj,
	akpm, linuxppc-dev
In-Reply-To: <1358883152.21576.55.camel@gandalf.local.home>

On 01/23/2013 01:02 AM, Steven Rostedt wrote:
> On Tue, 2013-01-22 at 13:03 +0530, Srivatsa S. Bhat wrote:
>> A straight-forward (and obvious) algorithm to implement Per-CPU Reader-Writer
>> locks can also lead to too many deadlock possibilities which can make it very
>> hard/impossible to use. This is explained in the example below, which helps
>> justify the need for a different algorithm to implement flexible Per-CPU
>> Reader-Writer locks.
>>
>> We can use global rwlocks as shown below safely, without fear of deadlocks:
>>
>> Readers:
>>
>>          CPU 0                                CPU 1
>>          ------                               ------
>>
>> 1.    spin_lock(&random_lock);             read_lock(&my_rwlock);
>>
>>
>> 2.    read_lock(&my_rwlock);               spin_lock(&random_lock);
>>
>>
>> Writer:
>>
>>          CPU 2:
>>          ------
>>
>>        write_lock(&my_rwlock);
>>
> 
> I thought global locks are now fair. That is, a reader will block if a
> writer is waiting. Hence, the above should deadlock on the current
> rwlock_t types.
> 

Oh is it? Last I checked, lockdep didn't complain about this ABBA scenario!

> We need to fix those locations (or better yet, remove all rwlocks ;-)
> 

:-)

The challenge with stop_machine() removal is that the replacement on the
reader side must have the (locking) flexibility comparable to preempt_disable().
Otherwise, that solution most likely won't be viable, because we'll hit way
too many locking problems and go crazy by the time we convert them over
(if we can, that is!).

Regards,
Srivatsa S. Bhat
