LinuxPPC-Dev Archive on lore.kernel.org


* Re: [PATCH 2/2] powerpc - Make the irq reverse mapping radix tree lockless
From: Benjamin Herrenschmidt @ 2008-08-20  5:23 UTC (permalink / raw)
  To: Sebastien Dugue
  Cc: dwalker, tinytim, linux-rt-users, linux-kernel, rostedt,
	jean-pierre.dion, linuxppc-dev, paulus, gilles.carry, tglx
In-Reply-To: <1218029429-21114-3-git-send-email-sebastien.dugue@bull.net>

BTW. It would be good to try to turn the GFP_ATOMIC into GFP_KERNEL,
maybe using a semaphore instead of a lock to protect insertion vs.
initialisation. The old scheme was fine because if the atomic allocation
failed, it could fall back to the linear search and try again on the next
interrupt. Not anymore.

Ben.

^ permalink raw reply

* Re: [PATCH v2] powerpc: Improve message for vio bus entitlement panic
From: Paul Mackerras @ 2008-08-20  6:00 UTC (permalink / raw)
  To: Robert Jennings; +Cc: linuxppc-dev list
In-Reply-To: <20080818153446.GB27342@austin.ibm.com>

Robert Jennings writes:

> Add information regarding the available and required entitlement amounts
> to the message displayed for the panic when insufficient entitlement is
> provided at boot.

I'll queue this up for 2.6.28, unless you tell me it's needed for
2.6.27.

Paul.

^ permalink raw reply

* Re: [PATCH 4/5] powerpc: Make the 64-bit kernel as a position-independent executable
From: Paul Mackerras @ 2008-08-20  6:34 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: linuxppc-dev
In-Reply-To: <Pine.LNX.4.64.0808191816450.25348@vixen.sonytel.be>

Geert Uytterhoeven writes:

> This part broke ppc32:
> 
> | arch/powerpc/kernel/prom.c: In function 'early_init_devtree':
> | arch/powerpc/kernel/prom.c:1166: error: '__end_interrupts' undeclared (first use in this function)
> | arch/powerpc/kernel/prom.c:1166: error: (Each undeclared identifier is reported only once
> | arch/powerpc/kernel/prom.c:1166: error: for each function it appears in.)

I'm not totally surprised I broke ppc32 somewhere along the line
there. :)  I think we will end up needing an #ifdef
CONFIG_RELOCATABLE in there, or maybe a test on PHYSICAL_START.  It
will depend a bit on whether we want to make relocatable 32-bit
kernels PIEs as well.

Paul.

^ permalink raw reply

* Re: [PATCH] powerpc: fix memory leaks in QE library
From: Tony Breeds @ 2008-08-20  6:38 UTC (permalink / raw)
  To: Timur Tabi; +Cc: linuxppc-dev
In-Reply-To: <1219093928-5349-1-git-send-email-timur@freescale.com>

On Mon, Aug 18, 2008 at 04:12:08PM -0500, Timur Tabi wrote:
> Fix two memory leaks in the Freescale QE library: add a missing kfree() in
> ucc_fast_init() if the ioremap() fails, and update ucc_fast_free() to call
> iounmap() on uf_regs.

It's been pointed out in
http://bugzilla.kernel.org/show_bug.cgi?id=11371
that ucc_slow suffers from the same (well, close enough) 2 problems.
Care to fix them as well?

Yours Tony

  linux.conf.au    http://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!

^ permalink raw reply

* libfdt: Add function to explicitly expand aliases
From: David Gibson @ 2008-08-20  6:55 UTC (permalink / raw)
  To: Jon Loeliger; +Cc: linuxppc-dev, devicetree-discuss

Kumar has already added alias expansion to fdt_path_offset().
However, in some circumstances it may be convenient for the user of
libfdt to explicitly get the string expansion of an alias.  This patch
adds a function to do this, fdt_get_alias(), and uses it to implement
fdt_path_offset().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Index: dtc/libfdt/fdt_ro.c
===================================================================
--- dtc.orig/libfdt/fdt_ro.c	2008-08-20 15:59:56.000000000 +1000
+++ dtc/libfdt/fdt_ro.c	2008-08-20 15:59:56.000000000 +1000
@@ -141,17 +141,12 @@ int fdt_path_offset(const void *fdt, con
 
 	/* see if we have an alias */
 	if (*path != '/') {
-		const char *q;
-		int aliasoffset = fdt_path_offset(fdt, "/aliases");
-
-		if (aliasoffset < 0)
-			return -FDT_ERR_BADPATH;
+		const char *q = strchr(path, '/');
 
-		q = strchr(path, '/');
 		if (!q)
 			q = end;
 
-		p = fdt_getprop_namelen(fdt, aliasoffset, path, q - p, NULL);
+		p = fdt_get_alias_namelen(fdt, p, q - p);
 		if (!p)
 			return -FDT_ERR_BADPATH;
 		offset = fdt_path_offset(fdt, p);
@@ -302,6 +297,23 @@ uint32_t fdt_get_phandle(const void *fdt
 	return fdt32_to_cpu(*php);
 }
 
+const char *fdt_get_alias_namelen(const void *fdt,
+				  const char *name, int namelen)
+{
+	int aliasoffset;
+
+	aliasoffset = fdt_path_offset(fdt, "/aliases");
+	if (aliasoffset < 0)
+		return NULL;
+
+	return fdt_getprop_namelen(fdt, aliasoffset, name, namelen, NULL);
+}
+
+const char *fdt_get_alias(const void *fdt, const char *name)
+{
+	return fdt_get_alias_namelen(fdt, name, strlen(name));
+}
+
 int fdt_get_path(const void *fdt, int nodeoffset, char *buf, int buflen)
 {
 	int pdepth = 0, p = 0;
Index: dtc/libfdt/libfdt.h
===================================================================
--- dtc.orig/libfdt/libfdt.h	2008-08-20 15:59:56.000000000 +1000
+++ dtc/libfdt/libfdt.h	2008-08-20 15:59:56.000000000 +1000
@@ -459,6 +459,32 @@ static inline void *fdt_getprop_w(void *
 uint32_t fdt_get_phandle(const void *fdt, int nodeoffset);
 
 /**
+ * fdt_get_alias_namelen - get alias based on substring
+ * @fdt: pointer to the device tree blob
+ * @name: name of the alias to look up
+ * @namelen: number of characters of name to consider
+ *
+ * Identical to fdt_get_alias(), but only examine the first namelen
+ * characters of name for matching the alias name.
+ */
+const char *fdt_get_alias_namelen(const void *fdt,
+				  const char *name, int namelen);
+
+/**
+ * fdt_get_alias - retrieve the path referenced by a given alias
+ * @fdt: pointer to the device tree blob
+ * @name: name of the alias to look up
+ *
+ * fdt_get_alias() retrieves the value of a given alias.  That is, the
+ * value of the property named 'name' in the node /aliases.
+ *
+ * returns:
+ *	a pointer to the expansion of the alias named 'name', if it exists
+ *	NULL, if the given alias or the /aliases node does not exist
+ */
+const char *fdt_get_alias(const void *fdt, const char *name);
+
+/**
  * fdt_get_path - determine the full path of a node
  * @fdt: pointer to the device tree blob
  * @nodeoffset: offset of the node whose path to find
Index: dtc/tests/get_alias.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ dtc/tests/get_alias.c	2008-08-20 15:59:56.000000000 +1000
@@ -0,0 +1,58 @@
+/*
+ * libfdt - Flat Device Tree manipulation
+ *	Testcase for fdt_get_alias()
+ * Copyright (C) 2006 David Gibson, IBM Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+
+#include <fdt.h>
+#include <libfdt.h>
+
+#include "tests.h"
+#include "testdata.h"
+
+void check_alias(void *fdt, const char *path, const char *alias)
+{
+	const char *aliaspath;
+
+	aliaspath = fdt_get_alias(fdt, alias);
+
+	if (path && !aliaspath)
+		FAIL("fdt_get_alias(%s) failed\n", alias);
+
+	if (strcmp(aliaspath, path) != 0)
+		FAIL("fdt_get_alias(%s) returned %s instead of %s\n",
+		     alias, aliaspath, path);
+}
+
+int main(int argc, char *argv[])
+{
+	void *fdt;
+
+	test_init(argc, argv);
+	fdt = load_blob_arg(argc, argv);
+
+	check_alias(fdt, "/subnode@1", "s1");
+	check_alias(fdt, "/subnode@1/subsubnode", "ss1");
+	check_alias(fdt, "/subnode@1/subsubnode/subsubsubnode", "sss1");
+
+	PASS();
+}
Index: dtc/tests/Makefile.tests
===================================================================
--- dtc.orig/tests/Makefile.tests	2008-08-20 15:59:56.000000000 +1000
+++ dtc/tests/Makefile.tests	2008-08-20 15:59:56.000000000 +1000
@@ -4,6 +4,7 @@ LIB_TESTS_L = get_mem_rsv \
 	get_path supernode_atdepth_offset parent_offset \
 	node_offset_by_prop_value node_offset_by_phandle \
 	node_check_compatible node_offset_by_compatible \
+	get_alias \
 	notfound \
 	setprop_inplace nop_property nop_node \
 	sw_tree1 \
Index: dtc/tests/run_tests.sh
===================================================================
--- dtc.orig/tests/run_tests.sh	2008-08-20 15:59:56.000000000 +1000
+++ dtc/tests/run_tests.sh	2008-08-20 15:59:56.000000000 +1000
@@ -216,6 +216,7 @@ dtc_tests () {
 
     # Check aliases support in fdt_path_offset
     run_dtc_test -I dts -O dtb -o aliases.dtb aliases.dts
+    run_test get_alias aliases.dtb
     run_test path_offset_aliases aliases.dtb
 
     # Check /include/ directive

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
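
As a rough illustration of the alias handling in fdt_path_offset() above,
the following standalone sketch splits a path such as "s1/subsubnode" into
an alias name and a remainder, using strchr() and a length-limited lookup
much as the patch does. It does not use libfdt at all: the table
alias_table and the helpers get_alias_namelen() and expand_path() are
hypothetical stand-ins for the /aliases node and for
fdt_get_alias_namelen(), and the real library returns a node offset
rather than building a string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the properties of the /aliases node. */
struct alias {
	const char *name;
	const char *path;
};

static const struct alias alias_table[] = {
	{ "s1",  "/subnode@1" },
	{ "ss1", "/subnode@1/subsubnode" },
};

/* Look up an alias using only the first namelen characters of name,
 * mirroring the role of fdt_get_alias_namelen() in the patch. */
static const char *get_alias_namelen(const char *name, int namelen)
{
	size_t i;

	for (i = 0; i < sizeof(alias_table) / sizeof(alias_table[0]); i++) {
		const char *n = alias_table[i].name;

		if ((int)strlen(n) == namelen && memcmp(n, name, namelen) == 0)
			return alias_table[i].path;
	}
	return NULL;
}

/* Expand a leading alias in path into buf.  Returns 0 on success,
 * -1 if the alias is unknown or the result does not fit in buf. */
static int expand_path(const char *path, char *buf, size_t buflen)
{
	const char *q, *expansion;
	int n;

	if (*path == '/') {	/* already absolute: copy unchanged */
		n = snprintf(buf, buflen, "%s", path);
		return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
	}

	/* The alias name ends at the first '/' or at the end of the string. */
	q = strchr(path, '/');
	if (!q)
		q = path + strlen(path);

	expansion = get_alias_namelen(path, (int)(q - path));
	if (!expansion)
		return -1;

	n = snprintf(buf, buflen, "%s%s", expansion, q);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With this table, expand_path("s1/subsubnode", ...) and expand_path("ss1", ...)
both produce "/subnode@1/subsubnode", while an unknown alias is rejected.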

^ permalink raw reply

* Re: [gmail] DMA Cache problem on PPC8248.
From: Marc Leeman @ 2008-08-20  6:55 UTC (permalink / raw)
  To: Jayasri Sangu; +Cc: 'linuxppc-dev@ozlabs.org'
In-Reply-To: <2DE723F44CAFD24A9B6BF606813F010C0945BE21@aae-exch>

[-- Attachment #1: Type: text/plain, Size: 790 bytes --]

>    We are using PPC8248 provided by freescale and the kernel  2.6.10. I believe we have cache problem.
>   We are allocating DMA buffers in our driver. If we increase the DMA buffer size the board hangs/halts(when try to load more applications).
>   If we reduce the DMA buffer size, then we can run all our applications.
>    Is there any way I can get around this problem.

Have a look at the u-boot code for the 8245; I once submitted a patch
for such a problem on an XPC8245 (experimental MPC8245). It's still in
the code; but disabled right now.

It might be the same silicon bug.

-- 
  greetz, marc
Yeah, yeah, yeah nothing like a bomb to sober me up, I'm fine.
	Crichton - Suns and Lovers
crichton 2.6.26 #1 PREEMPT Tue Jul 29 21:17:59 CDT 2008 armv5tel GNU/Linux

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
From: Benjamin Herrenschmidt @ 2008-08-20  7:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Eran Liberty, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Alan Modra, Scott Wood, Paul E. McKenney
In-Reply-To: <1219119431.8062.35.camel@pasglop>

Found the problem (or at least -a- problem), it's a gcc bug.

Well, first I must say the code generated by -pg is just plain
horrible :-)

Apart from that, look at the exit of, for example, __d_lookup, as
generated by gcc when ftrace is enabled:

c00c0498:       38 60 00 00     li      r3,0
c00c049c:       81 61 00 00     lwz     r11,0(r1)
c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
c00c04a4:       7d 61 5b 78     mr      r1,r11
c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
c00c04ac:       7c 08 03 a6     mtlr    r0
c00c04b0:       4e 80 00 20     blr

As you can see, it restores r1 -before- it pops r24..r31 off
the stack! I'll let you imagine what happens if an interrupt happens
just in between those two instructions (mr and lmw). We don't do
redzones on our ABI, so basically, the registers end up corrupted
by the interrupt.

Cheers,
Ben.
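
The ordering problem in the listing above can be reproduced with a toy
model in plain C. This is only a sketch under simplifying assumptions: the
"stack" is an array of words, the "interrupt" simply overwrites everything
below the current stack pointer, and struct cpu, enter(), leave() and
regs_survive() are made-up names. It is not a model of real PowerPC
semantics, only of why restoring the stack pointer before reloading the
saved registers is unsafe when there is no red zone.

```c
#include <string.h>

#define STACK_WORDS 64
#define NSAVED 8		/* stands in for r24..r31 */

struct cpu {
	unsigned int stack[STACK_WORDS];
	int sp;			/* lowest live word; the stack grows down */
	unsigned int regs[NSAVED];
};

/* Hypothetical interrupt: free to use everything below sp. */
static void interrupt(struct cpu *c)
{
	int i;

	for (i = 1; i <= 16 && c->sp - i >= 0; i++)
		c->stack[c->sp - i] = 0xdeadbeef;
}

/* Prologue: allocate a frame and save the registers at its top. */
static int enter(struct cpu *c)
{
	int caller_sp = c->sp;
	int i;

	c->sp -= 16;
	for (i = 0; i < NSAVED; i++)
		c->stack[caller_sp - NSAVED + i] = c->regs[i];
	return caller_sp;
}

/* Epilogue.  If sp_first is nonzero, sp is restored before the saved
 * registers are reloaded, which is the order seen in the listing. */
static void leave(struct cpu *c, int caller_sp, int sp_first)
{
	int i;

	if (sp_first) {
		c->sp = caller_sp;	/* like: mr r1,r11 */
		interrupt(c);		/* interrupt arrives right here */
	}
	for (i = 0; i < NSAVED; i++)	/* like: lmw r24,-32(r11) */
		c->regs[i] = c->stack[caller_sp - NSAVED + i];
	if (!sp_first) {
		c->sp = caller_sp;
		interrupt(c);		/* harmless: registers already reloaded */
	}
}

/* Returns 1 if the callee-saved registers survive the call, else 0. */
static int regs_survive(int sp_first)
{
	struct cpu c;
	int i, caller_sp;

	memset(&c, 0, sizeof(c));
	c.sp = STACK_WORDS;
	for (i = 0; i < NSAVED; i++)
		c.regs[i] = 24 + i;

	caller_sp = enter(&c);
	for (i = 0; i < NSAVED; i++)
		c.regs[i] = 0;		/* the function body clobbers them */
	leave(&c, caller_sp, sp_first);

	for (i = 0; i < NSAVED; i++)
		if (c.regs[i] != 24u + i)
			return 0;
	return 1;
}
```

In this model regs_survive(1) reports corruption, because the interrupt
overwrites the save area once it lies below sp, while regs_survive(0)
reports that the registers are intact.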

^ permalink raw reply

* Re: CONFIG_BOOTX_TEXT breaks PowerBook 3400 with BootX
From: Finn Thain @ 2008-08-20  8:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1219205249.21386.7.camel@pasglop>



On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:

> On Wed, 2008-08-20 at 03:23 +1000, Finn Thain wrote:
> >...
> > 
> > The Debian Sarge 2.6.8-4 kernel does not seem to have this problem 
> > though kernel.org 2.6.16 and later versions do.
> > 
> > Any ideas?
> 
> How "late" have you tried ? I remember fixing something at one point...

I've tried 2.6.26.2 (I think I also tried 2.6.22 and 2.6.25 a while ago).

I did some more tests today. I found that 2.6.15 (ARCH=ppc) is fine, but 
2.6.16 (ARCH=powerpc) has the bug.

Finn

> 
> Ben.
> 
> 
> 

^ permalink raw reply

* Re: [alsa-devel] [PATCH v2] duplicate SNDRV_PCM_FMTBIT_S{16,24}_BE
From: Takashi Iwai @ 2008-08-20  8:42 UTC (permalink / raw)
  To: roel kluin; +Cc: linuxppc-dev, Johannes Berg, alsa-devel, linux-kernel
In-Reply-To: <48AAEC92.8050108@gmail.com>

At Tue, 19 Aug 2008 11:53:54 -0400,
roel kluin wrote:
> 
> Takashi Iwai wrote:
> > At Tue, 19 Aug 2008 08:15:05 +0200 (CEST),
> > Johannes Berg wrote:
> >> roel kluin wrote:
> >>> untested, is it correct?
> >> not a clue, do you know how long ago that was? :)
> >> does the driver check endianness anywhere?
> > 
> > AFAIK snd-aoa supports only big-endian formats (at least in
> > sound/aoa/soundbus/i2sbus-pcm.c), so this addition makes little
> > sense.
> > 
> > Better to drop the duplicated words there.
> 
> Thanks Johannes and Takashi,
> 
> FWIW this removes the duplicates.
> ---
> Remove duplicate assignment of SNDRV_PCM_FMTBIT_S{16,24}_BE bits
> 
> Signed-off-by: Roel Kluin <roel.kluin@gmail.com>

Thanks, applied this one now.


Takashi

> ---
> diff --git a/sound/aoa/codecs/snd-aoa-codec-tas.c b/sound/aoa/codecs/snd-aoa-codec-tas.c
> index 7a16a33..6c515b2 100644
> --- a/sound/aoa/codecs/snd-aoa-codec-tas.c
> +++ b/sound/aoa/codecs/snd-aoa-codec-tas.c
> @@ -654,15 +654,13 @@ static struct snd_kcontrol_new bass_control = {
>  static struct transfer_info tas_transfers[] = {
>  	{
>  		/* input */
> -		.formats = SNDRV_PCM_FMTBIT_S16_BE | SNDRV_PCM_FMTBIT_S16_BE |
> -			   SNDRV_PCM_FMTBIT_S24_BE | SNDRV_PCM_FMTBIT_S24_BE,
> +		.formats = SNDRV_PCM_FMTBIT_S16_BE | SNDRV_PCM_FMTBIT_S24_BE,
>  		.rates = SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_44100 | SNDRV_PCM_RATE_48000,
>  		.transfer_in = 1,
>  	},
>  	{
>  		/* output */
> -		.formats = SNDRV_PCM_FMTBIT_S16_BE | SNDRV_PCM_FMTBIT_S16_BE |
> -			   SNDRV_PCM_FMTBIT_S24_BE | SNDRV_PCM_FMTBIT_S24_BE,
> +		.formats = SNDRV_PCM_FMTBIT_S16_BE | SNDRV_PCM_FMTBIT_S24_BE,
>  		.rates = SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_44100 | SNDRV_PCM_RATE_48000,
>  		.transfer_in = 0,
>  	},
> 

^ permalink raw reply

* Re: CONFIG_BOOTX_TEXT breaks PowerBook 3400 with BootX
From: Benjamin Herrenschmidt @ 2008-08-20  9:04 UTC (permalink / raw)
  To: Finn Thain; +Cc: linuxppc-dev
In-Reply-To: <Pine.LNX.4.64.0808201819500.22930@loopy.telegraphics.com.au>

On Wed, 2008-08-20 at 18:25 +1000, Finn Thain wrote:
> 
> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
> 
> > On Wed, 2008-08-20 at 03:23 +1000, Finn Thain wrote:
> > >...
> > > 
> > > The Debian Sarge 2.6.8-4 kernel does not seem to have this problem 
> > > though kernel.org 2.6.16 and later versions do.
> > > 
> > > Any ideas?
> > 
> > How "late" have you tried ? I remember fixing something at one point...
> 
> I've tried 2.6.26.2 (I think I also tried 2.6.22 and 2.6.25 a while ago).
> 
> I did some more tests today. I found that 2.6.15 (ARCH=ppc) is fine, but 
> 2.6.16 (ARCH=powerpc) has the bug.

Thanks. I can get access to a 3400, I'll give it a try.

Cheers,
Ben.

^ permalink raw reply

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
From: Nick Piggin @ 2008-08-20  9:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Scott Wood, Eran Liberty
In-Reply-To: <alpine.DEB.1.10.0808191705590.4957@gandalf.stny.rr.com>

On Wednesday 20 August 2008 07:08, Steven Rostedt wrote:
> On Tue, 19 Aug 2008, Mathieu Desnoyers wrote:
> > Ok, there are two cases where it's ok :
> >
> > 1 - in stop_machine, considering we are not touching code executed in
> > NMI handlers.
> > 2 - when using my replace_instruction_safe() which uses a temporary
> > breakpoint when doing the instruction replacement.
> >
> > In those cases you could use text_poke_early().
> >
> > See
> > http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=blob;f=arch/x86/kernel/immediate.c;h=7789e2c75bf03e645f15759d5dff0c1698493f92;hb=HEAD
> >
> > For a use example. Basically it looks like :
> >
> >
> >                 pages[0] = virt_to_page((void *)bypass_eip);
> >                 vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
> >                 BUG_ON(!vaddr);
> >                 text_poke_early(&vaddr[bypass_eip & ~PAGE_MASK],
> >                         (void *)addr, size);
> >                 /*
> >                  * Fill the rest with nops.
> >                  */
> >                 len = NR_NOPS - size;
> >                 add_nops((void *)
> >                         &vaddr[(bypass_eip & ~PAGE_MASK) + size],
> >                         len);
> >                 print_dbg_bytes("inserted nops",
> >                         &vaddr[(bypass_eip & ~PAGE_MASK) + size],
> >                         len);
> >                 vunmap(vaddr);
>
> vunmap can not be called with interrupts disabled, and this is exactly
> what my code does.

It could be after the vmap rewrite (with a few other small tweaks).
But a) it would be less robust when called from interrupt context
and this code looks broken as it is WRT error handling; and b) it
still costs several thousand cycles to vmap+touch+vunmap...

^ permalink raw reply

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
From: Eran Liberty @ 2008-08-20 11:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linuxppc-dev, Steven Rostedt, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel
In-Reply-To: <alpine.DEB.1.10.0808191614380.4957@gandalf.stny.rr.com>

Steven Rostedt wrote:
> On Tue, 19 Aug 2008, Eran Liberty wrote:
>
>   
>> Steven Rostedt wrote:
>>     
>>>   
>>>       
>>>> Testing tracer sched_switch: PASSED
>>>> Testing tracer ftrace: PASSED
>>>> Testing dynamic ftrace: PASSED
>>>>         
>
> Do you have PREEMPT_TRACER enabled, or any other tracer for that matter?
>
> -- Steve
>   
I can see stack trace & context trace, but they are derived from other 
choices (I cannot deselect them)

cat .config | grep TRACE
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
# CONFIG_BLK_DEV_IO_TRACE is not set
CONFIG_STACKTRACE=y
# CONFIG_BACKTRACE_SELF_TEST is not set
CONFIG_HAVE_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_FTRACE=y
# CONFIG_SCHED_TRACER is not set
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_SELFTEST=y
CONFIG_FTRACE_STARTUP_TEST=y
>   
>>>> Oops: Exception in kernel mode, sig: 11 [#1]
>>>> Exsw1600
>>>> Modules linked in:
>>>> NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
>>>>     
>>>>         
>
>
>   

^ permalink raw reply

* Re: [PATCH] powerpc: fix memory leaks in QE library
From: Timur Tabi @ 2008-08-20 11:28 UTC (permalink / raw)
  To: Tony Breeds; +Cc: linuxppc-dev
In-Reply-To: <20080820063850.GK934@bakeyournoodle.com>

On Wed, Aug 20, 2008 at 1:38 AM, Tony Breeds <tony@bakeyournoodle.com> wrote:

> It's been pointed out in
> http://bugzilla.kernel.org/show_bug.cgi?id=11371
> that ucc_slow suffers from the same (well, close enough) 2 problems.
> Care to fix them as well?

Sure.  I'll do a sweep of the QE code just to make sure there aren't even more.

Of course, it would have been nice if someone had looked up the
maintainer of the QE library in the MAINTAINERS file and notified me
about these bugs.

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* Re: powerpc/cell/oprofile: fix mutex locking for spu-oprofile
From: Robert Richter @ 2008-08-20 11:57 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linuxppc-dev, Paul Mackerras, oprofile-list, cel,
	cbe-oss-dev
In-Reply-To: <200808110925.08485.arnd@arndb.de>

I am fine with the changes, with the exception of removing
add_event_entry() from include/linux/oprofile.h. Although the function
is no longer used by other architectures either, this change
in the API should be discussed on the oprofile mailing list. Please
split the change into a separate patch and submit it to the mailing
list. If there are no objections, this change can go upstream as
well.

-Robert
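
For readers following the quoted patch below, the full/empty rule used by
spu_buff_add() (one slot is always left unused, so head == tail can only
mean "empty") can be sketched in standalone C. This is a simplified model
and not the patch code: struct ring, RING_SIZE, ring_add() and
ring_drain() are made-up names, the full test is written with modulo
arithmetic instead of the two-branch comparison in the patch, and the
locking around head and tail is left out.

```c
#include <stddef.h>

#define RING_SIZE 8	/* stands in for max_spu_buff */

struct ring {
	unsigned long buff[RING_SIZE];
	unsigned int head;	/* index where the next value is stored */
	unsigned int tail;	/* index of the oldest unread value */
	unsigned int lost;	/* values dropped because the ring was full */
};

/* Store a value at head.  Returns 1 on success, 0 if the ring was full. */
static int ring_add(struct ring *r, unsigned long value)
{
	unsigned int next = (r->head + 1) % RING_SIZE;

	if (next == r->tail) {	/* only one free slot left: treat as full */
		r->lost++;
		return 0;
	}
	r->buff[r->head] = value;
	r->head = next;
	return 1;
}

/* Copy the entries between tail and head into out (at most max of them)
 * and advance tail past them.  Returns the number of entries copied. */
static size_t ring_drain(struct ring *r, unsigned long *out, size_t max)
{
	size_t n = 0;

	while (r->tail != r->head && n < max) {
		out[n++] = r->buff[r->tail];
		r->tail = (r->tail + 1) % RING_SIZE;
	}
	return n;
}
```

A ring of RING_SIZE slots therefore holds at most RING_SIZE - 1 values;
further additions are only counted in lost, which plays the role the
patch gives to oprofile_cpu_buffer_inc_smpl_lost().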

On 11.08.08 09:25:07, Arnd Bergmann wrote:
> From: Carl Love <cel@us.ibm.com>
> 
> The issue is the SPU code is not holding the kernel mutex lock while
> adding samples to the kernel buffer.
> 
> This patch creates per SPU buffers to hold the data.  Data
> is added to the buffers from interrupt context.  The data
> is periodically pushed to the kernel buffer via a new Oprofile
> function oprofile_put_buff(). The oprofile_put_buff() function
> is called via a work queue enabling the function to acquire the
> mutex lock.
> 
> The existing user controls for adjusting the per CPU buffer
> size is used to control the size of the per SPU buffers.
> Similarly, overflows of the SPU buffers are reported by
> incrementing the per CPU buffer stats.  This eliminates the
> need to have architecture specific controls for the per SPU
> buffers which is not acceptable to the OProfile user tool
> maintainer.
> 
> The export of the oprofile add_event_entry() is removed as it
> is no longer needed given this patch.
> 
> Note, this patch has not addressed the issue of indexing arrays
> by the spu number.  This still needs to be fixed as the spu
> numbering is not guaranteed to be 0 to max_num_spus-1.
> 
> Signed-off-by: Carl Love <carll@us.ibm.com>
> Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/powerpc/oprofile/cell/pr_util.h       |   13 ++
>  arch/powerpc/oprofile/cell/spu_profiler.c  |    4 +-
>  arch/powerpc/oprofile/cell/spu_task_sync.c |  236 +++++++++++++++++++++++++---
>  drivers/oprofile/buffer_sync.c             |   24 +++
>  drivers/oprofile/cpu_buffer.c              |   15 ++-
>  drivers/oprofile/event_buffer.h            |    7 +
>  include/linux/oprofile.h                   |   16 +-
>  7 files changed, 279 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/powerpc/oprofile/cell/pr_util.h b/arch/powerpc/oprofile/cell/pr_util.h
> index 22e4e8d..628009c 100644
> --- a/arch/powerpc/oprofile/cell/pr_util.h
> +++ b/arch/powerpc/oprofile/cell/pr_util.h
> @@ -24,6 +24,11 @@
>  #define SKIP_GENERIC_SYNC 0
>  #define SYNC_START_ERROR -1
>  #define DO_GENERIC_SYNC 1
> +#define SPUS_PER_NODE   8
> +#define DEFAULT_TIMER_EXPIRE  (HZ / 10)
> +
> +extern struct delayed_work spu_work;
> +extern int spu_prof_running;
>  
>  struct spu_overlay_info {	/* map of sections within an SPU overlay */
>  	unsigned int vma;	/* SPU virtual memory address from elf */
> @@ -62,6 +67,14 @@ struct vma_to_fileoffset_map {	/* map of sections within an SPU program */
>  
>  };
>  
> +struct spu_buffer {
> +	int last_guard_val;
> +	int ctx_sw_seen;
> +	unsigned long *buff;
> +	unsigned int head, tail;
> +};
> +
> +
>  /* The three functions below are for maintaining and accessing
>   * the vma-to-fileoffset map.
>   */
> diff --git a/arch/powerpc/oprofile/cell/spu_profiler.c b/arch/powerpc/oprofile/cell/spu_profiler.c
> index 380d7e2..6edaebd 100644
> --- a/arch/powerpc/oprofile/cell/spu_profiler.c
> +++ b/arch/powerpc/oprofile/cell/spu_profiler.c
> @@ -23,12 +23,11 @@
>  
>  static u32 *samples;
>  
> -static int spu_prof_running;
> +int spu_prof_running;
>  static unsigned int profiling_interval;
>  
>  #define NUM_SPU_BITS_TRBUF 16
>  #define SPUS_PER_TB_ENTRY   4
> -#define SPUS_PER_NODE	     8
>  
>  #define SPU_PC_MASK	     0xFFFF
>  
> @@ -208,6 +207,7 @@ int start_spu_profiling(unsigned int cycles_reset)
>  
>  	spu_prof_running = 1;
>  	hrtimer_start(&timer, kt, HRTIMER_MODE_REL);
> +	schedule_delayed_work(&spu_work, DEFAULT_TIMER_EXPIRE);
>  
>  	return 0;
>  }
> diff --git a/arch/powerpc/oprofile/cell/spu_task_sync.c b/arch/powerpc/oprofile/cell/spu_task_sync.c
> index 2a9b4a0..2949126 100644
> --- a/arch/powerpc/oprofile/cell/spu_task_sync.c
> +++ b/arch/powerpc/oprofile/cell/spu_task_sync.c
> @@ -35,7 +35,102 @@ static DEFINE_SPINLOCK(buffer_lock);
>  static DEFINE_SPINLOCK(cache_lock);
>  static int num_spu_nodes;
>  int spu_prof_num_nodes;
> -int last_guard_val[MAX_NUMNODES * 8];
> +
> +struct spu_buffer spu_buff[MAX_NUMNODES * SPUS_PER_NODE];
> +struct delayed_work spu_work;
> +static unsigned max_spu_buff;
> +
> +static void spu_buff_add(unsigned long int value, int spu)
> +{
> +	/* spu buff is a circular buffer.  Add entries to the
> +	 * head.  Head is the index to store the next value.
> +	 * The buffer is full when there is one available entry
> +	 * in the queue, i.e. head and tail can't be equal.
> +	 * That way we can tell the difference between the
> +	 * buffer being full versus empty.
> +	 *
> +	 *  ASSUMPTION: the buffer_lock is held when this function
> +	 *             is called to lock the buffer, head and tail.
> +	 */
> +	int full = 1;
> +
> +	if (spu_buff[spu].head >= spu_buff[spu].tail) {
> +		if ((spu_buff[spu].head - spu_buff[spu].tail)
> +		    <  (max_spu_buff - 1))
> +			full = 0;
> +
> +	} else if (spu_buff[spu].tail > spu_buff[spu].head) {
> +		if ((spu_buff[spu].tail - spu_buff[spu].head)
> +		    > 1)
> +			full = 0;
> +	}
> +
> +	if (!full) {
> +		spu_buff[spu].buff[spu_buff[spu].head] = value;
> +		spu_buff[spu].head++;
> +
> +		if (spu_buff[spu].head >= max_spu_buff)
> +			spu_buff[spu].head = 0;
> +	} else {
> +		/* From the user's perspective make the SPU buffer
> +		 * size management/overflow look like we are using
> +		 * per cpu buffers.  The user uses the same
> +		 * per cpu parameter to adjust the SPU buffer size.
> +		 * Increment the sample_lost_overflow to inform
> +		 * the user the buffer size needs to be increased.
> +		 */
> +		oprofile_cpu_buffer_inc_smpl_lost();
> +	}
> +}
> +
> +/* This function copies the per SPU buffers to the
> + * OProfile kernel buffer.
> + */
> +void sync_spu_buff(void)
> +{
> +	int spu;
> +	unsigned long flags;
> +	int curr_head;
> +
> +	for (spu = 0; spu < num_spu_nodes; spu++) {
> +		/* In case there was an issue and the buffer didn't
> +		 * get created skip it.
> +		 */
> +		if (spu_buff[spu].buff == NULL)
> +			continue;
> +
> +		/* Hold the lock to make sure the head/tail
> +		 * doesn't change while spu_buff_add() is
> +		 * deciding if the buffer is full or not.
> +		 * Being a little paranoid.
> +		 */
> +		spin_lock_irqsave(&buffer_lock, flags);
> +		curr_head = spu_buff[spu].head;
> +		spin_unlock_irqrestore(&buffer_lock, flags);
> +
> +		/* Transfer the current contents to the kernel buffer.
> +		 * data can still be added to the head of the buffer.
> +		 */
> +		oprofile_put_buff(spu_buff[spu].buff,
> +				  spu_buff[spu].tail,
> +				  curr_head, max_spu_buff);
> +
> +		spin_lock_irqsave(&buffer_lock, flags);
> +		spu_buff[spu].tail = curr_head;
> +		spin_unlock_irqrestore(&buffer_lock, flags);
> +	}
> +
> +}
> +
> +static void wq_sync_spu_buff(struct work_struct *work)
> +{
> +	/* move data from spu buffers to kernel buffer */
> +	sync_spu_buff();
> +
> +	/* only reschedule if profiling is not done */
> +	if (spu_prof_running)
> +		schedule_delayed_work(&spu_work, DEFAULT_TIMER_EXPIRE);
> +}
>  
>  /* Container for caching information about an active SPU task. */
>  struct cached_info {
> @@ -305,14 +400,21 @@ static int process_context_switch(struct spu *spu, unsigned long objectId)
>  
>  	/* Record context info in event buffer */
>  	spin_lock_irqsave(&buffer_lock, flags);
> -	add_event_entry(ESCAPE_CODE);
> -	add_event_entry(SPU_CTX_SWITCH_CODE);
> -	add_event_entry(spu->number);
> -	add_event_entry(spu->pid);
> -	add_event_entry(spu->tgid);
> -	add_event_entry(app_dcookie);
> -	add_event_entry(spu_cookie);
> -	add_event_entry(offset);
> +	spu_buff_add(ESCAPE_CODE, spu->number);
> +	spu_buff_add(SPU_CTX_SWITCH_CODE, spu->number);
> +	spu_buff_add(spu->number, spu->number);
> +	spu_buff_add(spu->pid, spu->number);
> +	spu_buff_add(spu->tgid, spu->number);
> +	spu_buff_add(app_dcookie, spu->number);
> +	spu_buff_add(spu_cookie, spu->number);
> +	spu_buff_add(offset, spu->number);
> +
> +	/* Set flag to indicate SPU PC data can now be written out.  If
> +	 * the SPU program counter data is seen before an SPU context
> +	 * record is seen, the postprocessing will fail.
> +	 */
> +	spu_buff[spu->number].ctx_sw_seen = 1;
> +
>  	spin_unlock_irqrestore(&buffer_lock, flags);
>  	smp_wmb();	/* insure spu event buffer updates are written */
>  			/* don't want entries intermingled... */
> @@ -360,6 +462,47 @@ static int number_of_online_nodes(void)
>          return nodes;
>  }
>  
> +static int oprofile_spu_buff_create(void)
> +{
> +	int spu;
> +
> +	max_spu_buff = oprofile_get_cpu_buffer_size();
> +
> +	for (spu = 0; spu < num_spu_nodes; spu++) {
> +		/* create circular buffers to store the data in.
> +		 * use locks to manage accessing the buffers
> +		 */
> +		spu_buff[spu].head = 0;
> +		spu_buff[spu].tail = 0;
> +
> +		/*
> +		 * Create a buffer for each SPU.  Can't reliably
> +		 * create a single buffer for all spus due to not
> +		 * enough contiguous kernel memory.
> +		 */
> +
> +		spu_buff[spu].buff = kzalloc((max_spu_buff
> +					      * sizeof(unsigned long)),
> +					     GFP_KERNEL);
> +
> +		if (!spu_buff[spu].buff) {
> +			printk(KERN_ERR "SPU_PROF: "
> +			       "%s, line %d:  oprofile_spu_buff_create "
> +		       "failed to allocate spu buffer %d.\n",
> +			       __func__, __LINE__, spu);
> +
> +			/* release the spu buffers that have been allocated */
> +			while (spu >= 0) {
> +				kfree(spu_buff[spu].buff);
> +				spu_buff[spu].buff = 0;
> +				spu--;
> +			}
> +			return -ENOMEM;
> +		}
> +	}
> +	return 0;
> +}
> +
>  /* The main purpose of this function is to synchronize
>   * OProfile with SPUFS by registering to be notified of
>   * SPU task switches.
> @@ -372,20 +515,35 @@ static int number_of_online_nodes(void)
>   */
>  int spu_sync_start(void)
>  {
> -	int k;
> +	int spu;
>  	int ret = SKIP_GENERIC_SYNC;
>  	int register_ret;
>  	unsigned long flags = 0;
>  
>  	spu_prof_num_nodes = number_of_online_nodes();
>  	num_spu_nodes = spu_prof_num_nodes * 8;
> +	INIT_DELAYED_WORK(&spu_work, wq_sync_spu_buff);
> +
> +	/* create buffer for storing the SPU data to put in
> +	 * the kernel buffer.
> +	 */
> +	ret = oprofile_spu_buff_create();
> +	if (ret)
> +		goto out;
>  
>  	spin_lock_irqsave(&buffer_lock, flags);
> -	add_event_entry(ESCAPE_CODE);
> -	add_event_entry(SPU_PROFILING_CODE);
> -	add_event_entry(num_spu_nodes);
> +	for (spu = 0; spu < num_spu_nodes; spu++) {
> +		spu_buff_add(ESCAPE_CODE, spu);
> +		spu_buff_add(SPU_PROFILING_CODE, spu);
> +		spu_buff_add(num_spu_nodes, spu);
> +	}
>  	spin_unlock_irqrestore(&buffer_lock, flags);
>  
> +	for (spu = 0; spu < num_spu_nodes; spu++) {
> +		spu_buff[spu].ctx_sw_seen = 0;
> +		spu_buff[spu].last_guard_val = 0;
> +	}
> +
>  	/* Register for SPU events  */
>  	register_ret = spu_switch_event_register(&spu_active);
>  	if (register_ret) {
> @@ -393,8 +551,6 @@ int spu_sync_start(void)
>  		goto out;
>  	}
>  
> -	for (k = 0; k < (MAX_NUMNODES * 8); k++)
> -		last_guard_val[k] = 0;
>  	pr_debug("spu_sync_start -- running.\n");
>  out:
>  	return ret;
> @@ -446,13 +602,20 @@ void spu_sync_buffer(int spu_num, unsigned int *samples,
>  		 * use.	 We need to discard samples taken during the time
>  		 * period which an overlay occurs (i.e., guard value changes).
>  		 */
> -		if (grd_val && grd_val != last_guard_val[spu_num]) {
> -			last_guard_val[spu_num] = grd_val;
> +		if (grd_val && grd_val != spu_buff[spu_num].last_guard_val) {
> +			spu_buff[spu_num].last_guard_val = grd_val;
>  			/* Drop the rest of the samples. */
>  			break;
>  		}
>  
> -		add_event_entry(file_offset | spu_num_shifted);
> +		/* We must ensure that the SPU context switch has been written
> +		 * out before samples for the SPU.  Otherwise, the SPU context
> +		 * information is not available and the postprocessing of the
> +		 * SPU PC will fail with no available anonymous map information.
> +		 */
> +		if (spu_buff[spu_num].ctx_sw_seen)
> +			spu_buff_add((file_offset | spu_num_shifted),
> +					 spu_num);
>  	}
>  	spin_unlock(&buffer_lock);
>  out:
> @@ -463,20 +626,41 @@ out:
>  int spu_sync_stop(void)
>  {
>  	unsigned long flags = 0;
> -	int ret = spu_switch_event_unregister(&spu_active);
> -	if (ret) {
> +	int ret;
> +	int k;
> +
> +	ret = spu_switch_event_unregister(&spu_active);
> +
> +	if (ret)
>  		printk(KERN_ERR "SPU_PROF: "
> -			"%s, line %d: spu_switch_event_unregister returned %d\n",
> -			__func__, __LINE__, ret);
> -		goto out;
> -	}
> +		       "%s, line %d: spu_switch_event_unregister "	\
> +		       "returned %d\n",
> +		       __func__, __LINE__, ret);
> +
> +	/* flush any remaining data in the per SPU buffers */
> +	sync_spu_buff();
>  
>  	spin_lock_irqsave(&cache_lock, flags);
>  	ret = release_cached_info(RELEASE_ALL);
>  	spin_unlock_irqrestore(&cache_lock, flags);
> -out:
> +
> +	/* remove scheduled work queue item rather then waiting
> +	 * for every queued entry to execute.  Then flush pending
> +	 * system wide buffer to event buffer.
> +	 */
> +	cancel_delayed_work(&spu_work);
> +
> +	for (k = 0; k < num_spu_nodes; k++) {
> +		spu_buff[k].ctx_sw_seen = 0;
> +
> +		/*
> +		 * spu_sys_buff will be null if there was a problem
> +		 * allocating the buffer.  Only delete if it exists.
> +		 */
> +		kfree(spu_buff[k].buff);
> +		spu_buff[k].buff = 0;
> +	}
>  	pr_debug("spu_sync_stop -- done.\n");
>  	return ret;
>  }
>  
> -
> diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
> index 9304c45..4a70180 100644
> --- a/drivers/oprofile/buffer_sync.c
> +++ b/drivers/oprofile/buffer_sync.c
> @@ -551,3 +551,27 @@ void sync_buffer(int cpu)
>  
>  	mutex_unlock(&buffer_mutex);
>  }
> +
> +/* The function can be used to add a buffer worth of data directly to
> + * the kernel buffer. The buffer is assumed to be a circular buffer.
> + * Take the entries from index start and end at index end, wrapping
> + * at max_entries.
> + */
> +void oprofile_put_buff(unsigned long *buf, unsigned int start,
> +		       unsigned int stop, unsigned int max)
> +{
> +	int i;
> +
> +	i = start;
> +
> +	mutex_lock(&buffer_mutex);
> +	while (i != stop) {
> +		add_event_entry(buf[i++]);
> +
> +		if (i >= max)
> +			i = 0;
> +	}
> +
> +	mutex_unlock(&buffer_mutex);
> +}
> +
> diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
> index 2450b3a..b8601dc 100644
> --- a/drivers/oprofile/cpu_buffer.c
> +++ b/drivers/oprofile/cpu_buffer.c
> @@ -37,11 +37,24 @@ static int work_enabled;
>  void free_cpu_buffers(void)
>  {
>  	int i;
> - 
> +
>  	for_each_online_cpu(i)
>  		vfree(per_cpu(cpu_buffer, i).buffer);
>  }
>  
> +unsigned long oprofile_get_cpu_buffer_size(void)
> +{
> +	return fs_cpu_buffer_size;
> +}
> +
> +void oprofile_cpu_buffer_inc_smpl_lost(void)
> +{
> +	struct oprofile_cpu_buffer *cpu_buf
> +		= &__get_cpu_var(cpu_buffer);
> +
> +	cpu_buf->sample_lost_overflow++;
> +}
> +
>  int alloc_cpu_buffers(void)
>  {
>  	int i;
> diff --git a/drivers/oprofile/event_buffer.h b/drivers/oprofile/event_buffer.h
> index 5076ed1..84bf324 100644
> --- a/drivers/oprofile/event_buffer.h
> +++ b/drivers/oprofile/event_buffer.h
> @@ -17,6 +17,13 @@ int alloc_event_buffer(void);
>  
>  void free_event_buffer(void);
>   
> +/**
> + * Add data to the event buffer.
> + * The data passed is free-form, but typically consists of
> + * file offsets, dcookies, context information, and ESCAPE codes.
> + */
> +void add_event_entry(unsigned long data);
> +
>  /* wake up the process sleeping on the event file */
>  void wake_up_buffer_waiter(void);
>  
> diff --git a/include/linux/oprofile.h b/include/linux/oprofile.h
> index 041bb31..1ef7fce 100644
> --- a/include/linux/oprofile.h
> +++ b/include/linux/oprofile.h
> @@ -84,13 +84,6 @@ int oprofile_arch_init(struct oprofile_operations * ops);
>  void oprofile_arch_exit(void);
>  
>  /**
> - * Add data to the event buffer.
> - * The data passed is free-form, but typically consists of
> - * file offsets, dcookies, context information, and ESCAPE codes.
> - */
> -void add_event_entry(unsigned long data);
> -
> -/**
>   * Add a sample. This may be called from any context. Pass
>   * smp_processor_id() as cpu.
>   */
> @@ -160,5 +153,14 @@ int oprofilefs_ulong_from_user(unsigned long * val, char const __user * buf, siz
>  
>  /** lock for read/write safety */
>  extern spinlock_t oprofilefs_lock;
> +
> +/**
> + * Add the contents of a circular buffer to the event buffer.
> + */
> +void oprofile_put_buff(unsigned long *buf, unsigned int start,
> +			unsigned int stop, unsigned int max);
> +
> +unsigned long oprofile_get_cpu_buffer_size(void);
> +void oprofile_cpu_buffer_inc_smpl_lost(void);
>   
>  #endif /* OPROFILE_H */
> -- 
> 1.5.4.3
> 
> 

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com

^ permalink raw reply

* Re: powerpc/cell/oprofile: fix mutex locking for spu-oprofile
From: Arnd Bergmann @ 2008-08-20 12:05 UTC (permalink / raw)
  To: Robert Richter
  Cc: linux-kernel, linuxppc-dev, Paul Mackerras, oprofile-list, cel,
	cbe-oss-dev
In-Reply-To: <20080820115750.GO13011@erda.amd.com>

On Wednesday 20 August 2008, Robert Richter wrote:
> I am fine with the changes with the exception of removing
> add_event_entry() from include/linux/oprofile.h. Though there is no
> usage of the function also in other architectures anymore, this change
> in the API should be discussed on the oprofile mailing list. Please
> separate the change in a different patch and submit it to the mailing
> list. If there are no objections then, this change can go upstream as
> well.

As an explanation, the removal of add_event_entry is the whole point
of this patch. add_event_entry must only be called with buffer_mutex
held, but buffer_mutex itself is not exported.
I'm pretty sure that no other user of add_event_entry exists, as it
was exported specifically for the SPU support and that never worked.
Any other (theoretical) code using it would be broken in the same way
and need a corresponding fix.

We can easily leave the declaration in place, but I'd recommend removing
it eventually. If you prefer to keep it, how about marking it as
__deprecated?

	Arnd <><
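
The locking rule described above (the event buffer may only be appended to while the mutex is held, so producers in interrupt context first write into a private ring that is drained later from a context that can sleep) can be sketched in userspace C. This is only an illustration under stated assumptions: `event_buf`, `event_mutex`, `event_add_locked()` and `ring_drain()` are hypothetical stand-ins for the kernel event buffer, `buffer_mutex`, `add_event_entry()` and `oprofile_put_buff()`, and a pthread mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define EVENT_CAP 64

/* Hypothetical stand-ins for the kernel event buffer and buffer_mutex. */
static unsigned long event_buf[EVENT_CAP];
static size_t event_len;
static pthread_mutex_t event_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Must only be called with event_mutex held, like add_event_entry(). */
static void event_add_locked(unsigned long value)
{
	if (event_len < EVENT_CAP)
		event_buf[event_len++] = value;
}

/*
 * Copy ring[start] up to (but not including) ring[stop] into the event
 * buffer, wrapping at max.  The mutex is taken once around the whole
 * copy, so callers never touch the event buffer unlocked.
 * Returns the new tail index, which equals stop.
 */
static unsigned int ring_drain(const unsigned long *ring, unsigned int start,
			       unsigned int stop, unsigned int max)
{
	unsigned int i = start;

	assert(max > 0 && start < max && stop < max);

	pthread_mutex_lock(&event_mutex);
	while (i != stop) {
		event_add_locked(ring[i++]);
		if (i >= max)
			i = 0;
	}
	pthread_mutex_unlock(&event_mutex);

	return i;
}
```

In this sketch only `ring_drain()` is visible to a producer; `event_add_locked()` stays private, which mirrors the argument for dropping the `add_event_entry()` declaration from the public header.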

^ permalink raw reply

* Re: U-boot and Xilinx emaclite, TX_PING_PONG -> stops working
From: Michal Simek @ 2008-08-20 12:13 UTC (permalink / raw)
  To: Philipp Hachtmann; +Cc: linuxppc-embedded
In-Reply-To: <48A7302F.8030707@hachti.de>

Hi Philipp,

I wrote the U-Boot driver for emaclite. You can enable either ping-pong mode;
all combinations are supported. I tested this driver with MicroBlaze and it
works well. I use this driver almost every day and have had no problems with
it. If you have a problem with this driver, please send an email to the U-Boot
mailing list.

Regards,
Michal


> Hi all,
> 
> I use a Xilinx Virtex-4FX (with powerpc) Board (Ml403) with U-boot and
> Linux.
> 
> I am currently wondering if someone has experience with the emaclite
> driver. When I unset the "ping pong" (double buffering) parameters of
> the core, I can use it under Linux and U-boot. But the performance is
> near to non-functional (wget tells me about 10kByte/sec).
> Then I switched to double buffering. Now it runs very well - but only
> under Linux.
> 
> I found a driver (xilinx_emaclite.c) in the U-boot git. That completely
> refuses to work correctly (of course with parameters corrected). The
> other driver I once found "somewhere" (in the Petalinux distribution).
> It at least worked without the double buffering. Both drivers talk about
> the ping pong options in their source, so I think they *should* know
> about it.
> 
> 
> Does anyone know that problem? Is there an improved driver around?
> 
> Best wishes,
> 
> Philipp
> 
> 
> 
> 
> 

^ permalink raw reply

* Re: powerpc/cell/oprofile: fix mutex locking for spu-oprofile
From: Robert Richter @ 2008-08-20 12:39 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linuxppc-dev, Paul Mackerras, oprofile-list, cel,
	cbe-oss-dev
In-Reply-To: <200808201405.32101.arnd@arndb.de>

On 20.08.08 14:05:31, Arnd Bergmann wrote:
> On Wednesday 20 August 2008, Robert Richter wrote:
> > I am fine with the changes with the exception of removing
> > add_event_entry() from include/linux/oprofile.h. Though there is no
> > usage of the function also in other architectures anymore, this change
> > in the API should be discussed on the oprofile mailing list. Please
> > separate the change in a different patch and submit it to the mailing
> > list. If there are no objections then, this change can go upstream as
> > well.
> 
> As an explanation, the removal of add_event_entry is the whole point
> of this patch. add_event_entry must only be called with buffer_mutex
> held, but buffer_mutex itself is not exported.

Thanks for pointing this out.

> I'm pretty sure that no other user of add_event_entry exists, as it
> was exported specifically for the SPU support and that never worked.
> Any other (theoretical) code using it would be broken in the same way
> and need a corresponding fix.
> 
> We can easily leave the declaration in place, but I'd recommend removing
> it eventually. If you prefer to keep it, how about marking it as
> __deprecated?

No, since this is broken by design we remove it. The patch can go
upstream as it is.

Thanks,

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com

^ permalink raw reply

* Re: powerpc/cell/oprofile: fix mutex locking for spu-oprofile
From: Robert Richter @ 2008-08-20 12:39 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linuxppc-dev, Paul Mackerras, oprofile-list, cel,
	cbe-oss-dev
In-Reply-To: <200808110925.08485.arnd@arndb.de>

On 11.08.08 09:25:07, Arnd Bergmann wrote:
> From: Carl Love <cel@us.ibm.com>
> 
> The issue is the SPU code is not holding the kernel mutex lock while
> adding samples to the kernel buffer.
> 
> This patch creates per SPU buffers to hold the data.  Data
> is added to the buffers from interrupt context.  The data
> is periodically pushed to the kernel buffer via a new Oprofile
> function oprofile_put_buff(). The oprofile_put_buff() function
> is called via a work queue enabling the function to acquire the
> mutex lock.
> 
> The existing user controls for adjusting the per CPU buffer
> size are used to control the size of the per SPU buffers.
> Similarly, overflows of the SPU buffers are reported by
> incrementing the per CPU buffer stats.  This eliminates the
> need to have architecture specific controls for the per SPU
> buffers which is not acceptable to the OProfile user tool
> maintainer.
> 
> The export of the oprofile add_event_entry() is removed as it
> is no longer needed given this patch.
> 
> Note, this patch has not addressed the issue of indexing arrays
> by the spu number.  This still needs to be fixed as the spu
> numbering is not guaranteed to be 0 to max_num_spus-1.
> 
> Signed-off-by: Carl Love <carll@us.ibm.com>
> Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Acked-by: Robert Richter <robert.richter@amd.com>

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com

^ permalink raw reply

* Re: [PATCH 2/4] kvmppc: add hypercall infrastructure - guest part
From: Christian Ehrhardt @ 2008-08-20 12:41 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: jimix, linuxppc-dev, hollisb, kvm-ppc
In-Reply-To: <200808191328.16934.arnd@arndb.de>

Arnd Bergmann wrote:
> On Tuesday 19 August 2008, ehrhardt@linux.vnet.ibm.com wrote:
>   
>> +static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
>> +{
>> +       register unsigned long hcall asm ("r0") = nr;
>> +       register unsigned long arg1 asm ("r3") = p1;
>> +       register long ret asm ("r11");
>> +
>> +       asm volatile(".long %1"
>> +                       : "=r"(ret)
>> +                       : "i"(KVM_HYPERCALL_BIN), "r"(hcall), "r"(arg1)
>> +                       : "r4", "r5", "r6", "r7", "r8",
>> +                         "r9", "r10", "r12", "cc");
>> +       return ret;
>> +}
>>     
>
> What is the reasoning for making the calling convention different from
> all the existing hcall interfaces here?
>
> pseries uses r3 for the hcall number, lv1 and beat use r11, so using
> r0 just for the sake of being different seems counterintuitive.
>
> 	Arnd <><
>   
Some documentation is here:
http://kvm.qumranet.com/kvmwiki/PowerPC_Hypercall_ABI
As far as I remember, it was modeled on the system call convention; from my
point of view we can still change it at the moment.
When we discussed that, I was too new to the Power architecture to
really get all the details, but I assume Hollis and Jimi can answer that
for you.


-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization

^ permalink raw reply

* Re: [PATCH 4/4] kvmppc: convert wrteei to wrtee as kvm guest optimization
From: Christian Ehrhardt @ 2008-08-20 12:53 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linuxppc-dev, hollisb, kvm-ppc
In-Reply-To: <200808191342.29918.arnd@arndb.de>

Arnd Bergmann wrote:
> On Tuesday 19 August 2008, ehrhardt@linux.vnet.ibm.com wrote:
>   
>> Dependent on the already existing CONFIG_KVM_GUEST config option this patch
>> changes wrteei to wrtee allowing the hypervisor to rewrite those to nontrapping
>> instructions. Maybe we should split the kvm guest optimizations in two parts
>> one for the overhead free optimizations and one for the rest that might add
>> some complexity for non virtualized execution (like this one).
>>
>> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>     
>
> How significant is the performance impact of this change for non-virtualized
> systems? If it's very low, maybe you should not bother with the #ifdef, and
> if it's noticeable, you might be better off using dynamic patching for this.
>
> 	Arnd <><
>   
To be honest, I unfortunately don't know how big the impact on
non-virtualized systems is. I would like to test it, but without
hardware performance counters on the core I have, I'm not sure (yet) how
to measure that in a good way; any suggestions are welcome.
I'm really sure that any jump-based dynamic patching in the
guest, such as function pointers, will be slower than just leaving the load
in place. Unfortunately, I cannot rewrite it from the hypervisor, because
for "wrteei" I would need a "stwi" to rewrite it in one instruction.
The patch as it is today lets you choose between a 10% benefit for
virtualized guests and an unknown but surely very small overhead on native
hardware.

-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization

^ permalink raw reply

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
From: Steven Rostedt @ 2008-08-20 13:12 UTC (permalink / raw)
  To: Eran Liberty
  Cc: linuxppc-dev, Steven Rostedt, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel
In-Reply-To: <48ABFD77.6030701@extricom.com>


On Wed, 20 Aug 2008, Eran Liberty wrote:

> Steven Rostedt wrote:
> > On Tue, 19 Aug 2008, Eran Liberty wrote:
> > 
> >   
> > > Steven Rostedt wrote:
> > >     
> > > >         
> > > > > Testing tracer sched_switch: PASSED
> > > > > Testing tracer ftrace: PASSED
> > > > > Testing dynamic ftrace: PASSED
> > > > >         
> > 
> > Do you have PREEMPT_TRACER enabled, or any other tracer for that matter?
> > 
> > -- Steve
> >   
> I can see stack trace & context trace, but they are derived from other choices
> (I can not un select them)

Yeah, those are not bad.

> 
> cat .config | grep TRACE
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_HAVE_ARCH_TRACEHOOK=y
> # CONFIG_BLK_DEV_IO_TRACE is not set
> CONFIG_STACKTRACE=y
> # CONFIG_BACKTRACE_SELF_TEST is not set
> CONFIG_HAVE_FTRACE=y
> CONFIG_HAVE_DYNAMIC_FTRACE=y
> CONFIG_FTRACE=y
> # CONFIG_SCHED_TRACER is not set
> CONFIG_CONTEXT_SWITCH_TRACER=y
> CONFIG_DYNAMIC_FTRACE=y
> CONFIG_FTRACE_SELFTEST=y
> CONFIG_FTRACE_STARTUP_TEST=y

You must not have PREEMPT on, so the PREEMPT_TRACER will not show up.

The reason I asked, is that my PowerBook runs fine without PREEMPT_TRACER 
but is very unstable when I have PREEMPT_TRACER enabled.

-- Steve


> >   
> > > > > Oops: Exception in kernel mode, sig: 11 [#1]
> > > > > Exsw1600
> > > > > Modules linked in:
> > > > > NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
> > > > >             
> > 
> > 
> >   
> 
> 

^ permalink raw reply

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
From: Steven Rostedt @ 2008-08-20 13:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Eran Liberty, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Alan Modra, Scott Wood, Paul E. McKenney
In-Reply-To: <1219216705.21386.46.camel@pasglop>


On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:

> Found the problem (or at least -a- problem), it's a gcc bug.
> 
> Well, first I must say the code generated by -pg is just plain
> horrible :-)
> 
> Apart from that, look at the exit of, for example, __d_lookup, as
> generated by gcc when ftrace is enabled:
> 
> c00c0498:       38 60 00 00     li      r3,0
> c00c049c:       81 61 00 00     lwz     r11,0(r1)
> c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
> c00c04a4:       7d 61 5b 78     mr      r1,r11
> c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
> c00c04ac:       7c 08 03 a6     mtlr    r0
> c00c04b0:       4e 80 00 20     blr
> 
> As you can see, it restores r1 -before- it pops r24..r31 off
> the stack ! I let you imagine what happens if an interrupt happens
> just in between those two instructions (mr and lmw). We don't do
> redzones on our ABI, so basically, the registers end up corrupted
> by the interrupt.

Ouch!  You've disassembled this without -pg too, and it does not have this 
bug? What version of gcc do you have?

-- Steve

^ permalink raw reply

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
From: Steven Rostedt @ 2008-08-20 13:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Eran Liberty, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Alan Modra, Scott Wood, Paul E. McKenney
In-Reply-To: <alpine.DEB.1.10.0808200913070.13132@gandalf.stny.rr.com>


On Wed, 20 Aug 2008, Steven Rostedt wrote:

> 
> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
> 
> > Found the problem (or at least -a- problem), it's a gcc bug.
> > 
> > Well, first I must say the code generated by -pg is just plain
> > horrible :-)
> > 
> > Apart from that, look at the exit of, for example, __d_lookup, as
> > generated by gcc when ftrace is enabled:
> > 
> > c00c0498:       38 60 00 00     li      r3,0
> > c00c049c:       81 61 00 00     lwz     r11,0(r1)
> > c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
> > c00c04a4:       7d 61 5b 78     mr      r1,r11
> > c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
> > c00c04ac:       7c 08 03 a6     mtlr    r0
> > c00c04b0:       4e 80 00 20     blr
> > 
> > As you can see, it restores r1 -before- it pops r24..r31 off
> > the stack ! I let you imagine what happens if an interrupt happens
> > just in between those two instructions (mr and lmw). We don't do
> > redzones on our ABI, so basically, the registers end up corrupted
> > by the interrupt.
> 
> Ouch!  You've disassembled this without -pg too, and it does not have this 
> bug? What version of gcc do you have?
> 

I have:
 gcc (Debian 4.3.1-2) 4.3.1

c00c64c8:       81 61 00 00     lwz     r11,0(r1)
c00c64cc:       7f 83 e3 78     mr      r3,r28
c00c64d0:       80 0b 00 04     lwz     r0,4(r11)
c00c64d4:       ba eb ff dc     lmw     r23,-36(r11)
c00c64d8:       7d 61 5b 78     mr      r1,r11
c00c64dc:       7c 08 03 a6     mtlr    r0
c00c64e0:       4e 80 00 20     blr


My version looks fine.  I'm thinking that this is a separate issue from
what Eran is seeing.

Eran, can you do an "objdump -dr vmlinux", search for __d_lookup, and
print out the end of the function dump?

Thanks,

-- Steve
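
The difference between the two epilogues quoted above comes down to instruction order: in the reported listing `mr r1,r11` comes before `lmw`, so the saved registers are reloaded from below the already-restored stack pointer, while in the gcc 4.3.1 listing `lmw` runs first. A minimal sketch of that ordering check follows, assuming the disassembly has already been split into mnemonic and operand strings; `struct insn` and `sp_restored_before_lmw()` are hypothetical helpers written for this illustration, not part of objdump or the kernel, and the check only looks for this one pattern.

```c
#include <stddef.h>
#include <string.h>

/* One disassembled instruction, already split into its two text fields. */
struct insn {
	const char *mnemonic;
	const char *operands;
};

/*
 * Return 1 if the stack pointer (r1) is restored with "mr r1,..." before
 * an "lmw" reloads the saved registers, which leaves those stack slots
 * exposed to an interrupt; return 0 otherwise.
 */
static int sp_restored_before_lmw(const struct insn *code, size_t n)
{
	int sp_restored = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(code[i].mnemonic, "mr") == 0 &&
		    strncmp(code[i].operands, "r1,", 3) == 0)
			sp_restored = 1;
		else if (strcmp(code[i].mnemonic, "lmw") == 0 && sp_restored)
			return 1;
	}
	return 0;
}

/* Epilogue of __d_lookup as reported by Benjamin Herrenschmidt. */
static const struct insn epilogue_reported[] = {
	{ "li",   "r3,0" },
	{ "lwz",  "r11,0(r1)" },
	{ "lwz",  "r0,4(r11)" },
	{ "mr",   "r1,r11" },
	{ "lmw",  "r24,-32(r11)" },
	{ "mtlr", "r0" },
	{ "blr",  "" },
};

/* Epilogue produced by gcc (Debian 4.3.1-2) 4.3.1, from Steven's build. */
static const struct insn epilogue_gcc_4_3_1[] = {
	{ "lwz",  "r11,0(r1)" },
	{ "mr",   "r3,r28" },
	{ "lwz",  "r0,4(r11)" },
	{ "lmw",  "r23,-36(r11)" },
	{ "mr",   "r1,r11" },
	{ "mtlr", "r0" },
	{ "blr",  "" },
};
```

Run against the two listings, the check flags the reported epilogue and accepts the gcc 4.3.1 one.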

^ permalink raw reply

* Re: powerpc/cell/oprofile: fix mutex locking for spu-oprofile
From: Arnd Bergmann @ 2008-08-20 13:19 UTC (permalink / raw)
  To: Robert Richter
  Cc: linux-kernel, linuxppc-dev, Paul Mackerras, oprofile-list, cel,
	cbe-oss-dev
In-Reply-To: <20080820123944.GQ13011@erda.amd.com>

On Wednesday 20 August 2008, Robert Richter wrote:

> > Signed-off-by: Carl Love <carll@us.ibm.com>
> > Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> 
> Acked-by: Robert Richter <robert.richter@amd.com>
> 

Thanks Robert.

Paul, any chance we can still get this into 2.6.27?

I've added the Ack and uploaded it again for you to
pull from

 master.kernel.org:/pub/scm/linux/kernel/git/arnd/cell-2.6.git merge

	Arnd <><

^ permalink raw reply

* Re: [PATCH 1/9] powerpc/44x: Add PowerPC 44x simple platform support
From: Arnd Bergmann @ 2008-08-20 13:33 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <496103659f7b122a8301703b055ef4c6bd3092af.1219160188.git.jwboyer@linux.vnet.ibm.com>

On Tuesday 19 August 2008, Josh Boyer wrote:
> This adds a common board file for almost all of the "simple" PowerPC 44x
> boards that exist today.  This is intended to be a single place to add
> support for boards that do not differ in platform support from most of the
> evaluation boards that are used as reference platforms.  Boards that have
> specific requirements or custom hardware setup should still have their own
> board.c file.

The code looks correct, but since this is going to be example code
that may get copied into other platforms, I would take extra care
with coding style:

> +#include <linux/init.h>
> +#include <linux/of_platform.h>
> +
> +#include <asm/machdep.h>
> +#include <asm/prom.h>
> +#include <asm/udbg.h>
> +#include <asm/time.h>
> +#include <asm/uic.h>
> +#include <asm/pci-bridge.h>
> +#include <asm/ppc4xx.h>

#include lines should be ordered alphabetically in an ideal world.

> +static char *board[] __initdata = {
> +	"amcc,bamboo",
> +	"amcc,cayonlands",
> +	"ibm,ebony",
> +	"amcc,katmai",
> +	"amcc,rainier",
> +	"amcc,sequoia",
> +	"amcc,taishan",
> +	NULL
> +};

You don't need the NULL termination here, since the array is only
used statically and you can use ARRAY_SIZE().

> +static int __init ppc44x_probe(void)
> +{
> +	unsigned long root = of_get_flat_dt_root();
> +	int i = 0;
> +
> +	while (board[i]) {
> +		if (of_flat_dt_is_compatible(root, board[i]))
> +			break;
> +		i++;
> +	}

This looks like a for() loop in disguise, so it is better written
as

	int i;
	for (i = 0; i < ARRAY_SIZE(board); i++) {
		if (of_flat_dt_is_compatible(root, board[i])) {
			ppc_pci_flags = PPC_PCI_REASSIGN_ALL_RSRC;
			return 1;
		}
	}
	return 0;


	Arnd <><
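
The suggestion above (drop the NULL sentinel and bound the loop with ARRAY_SIZE()) can be tried outside the kernel with a small sketch. The assumptions are labeled in the code: `ARRAY_SIZE` is defined locally rather than taken from the kernel headers, and `board_is_supported()` with a plain `strcmp()` is a hypothetical stand-in for the `of_flat_dt_is_compatible()` lookup against the flattened device tree. The board strings are copied as they appear in the quoted patch.

```c
#include <stddef.h>
#include <string.h>

/* Local definition for this sketch; the kernel provides its own macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* No NULL terminator: the array is only walked with ARRAY_SIZE(). */
static const char *const board[] = {
	"amcc,bamboo",
	"amcc,cayonlands",
	"ibm,ebony",
	"amcc,katmai",
	"amcc,rainier",
	"amcc,sequoia",
	"amcc,taishan",
};

/*
 * Hypothetical stand-in for the probe: returns 1 if compat names one of
 * the supported boards, 0 otherwise.
 */
static int board_is_supported(const char *compat)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(board); i++) {
		if (strcmp(compat, board[i]) == 0)
			return 1;
	}
	return 0;
}
```

Because the bound is computed from the array itself, adding a board to the table needs no other change, and a forgotten terminator can no longer send the loop past the end.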

^ permalink raw reply
