LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH] powerpc: Initialise paca->kstack before early_setup_secondary
From: Matt Evans @ 2010-08-13  6:58 UTC
  To: linuxppc-dev; +Cc: mjwolf

As early setup calls down to slb_initialize(), we must have kstack
initialised before checking "should we add a bolted SLB entry for our kstack?"

Failing to do so means stack access requires an SLB miss exception to refill
an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text
& static data).  It's not always allowable to take such a miss, and
intermittent crashes will result.

Primary CPUs don't have this issue; an SLB entry is not bolted for their
stack anyway (as that lives within SLB(0)).  This patch therefore only
affects the init of secondaries.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Cc: stable <stable@kernel.org>
---
 arch/powerpc/kernel/head_64.S |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 844a44b..4d6681d 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -572,9 +572,6 @@ __secondary_start:
 	/* Set thread priority to MEDIUM */
 	HMT_MEDIUM
 
-	/* Do early setup for that CPU (stab, slb, hash table pointer) */
-	bl	.early_setup_secondary
-
 	/* Initialize the kernel stack.  Just a repeat for iSeries.	 */
 	LOAD_REG_ADDR(r3, current_set)
 	sldi	r28,r24,3		/* get current_set[cpu#]	 */
@@ -582,6 +579,9 @@ __secondary_start:
 	addi	r1,r1,THREAD_SIZE-STACK_FRAME_OVERHEAD
 	std	r1,PACAKSAVE(r13)
 
+	/* Do early setup for that CPU (stab, slb, hash table pointer) */
+	bl	.early_setup_secondary
+
 	/* Clear backchain so we get nice backtraces */
 	li	r7,0
 	mtlr	r7
-- 
1.7.0.4
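
The ordering problem described in the commit message can be sketched in plain C. This is only a userspace model, not the kernel code: struct paca_model, KERNEL_BASE, SEGMENT_SIZE and stack_needs_bolted_entry() are hypothetical names, 256MB segments are assumed, and an uninitialised kstack is assumed to read as 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for illustration only; the real kernel derives
 * these from its MMU configuration. */
#define KERNEL_BASE   0xc000000000000000ULL  /* start of the kernel linear map */
#define SEGMENT_SIZE  (1ULL << 28)           /* assumption: 256MB segments */

/* Minimal stand-in for the per-CPU paca; only the field discussed here. */
struct paca_model {
	uint64_t kstack;   /* kernel stack pointer, assumed 0 until initialised */
};

/* True if the stack lies above the first kernel segment, i.e. it is not
 * covered by SLB(0) and would need its own bolted entry. */
static bool stack_needs_bolted_entry(const struct paca_model *paca)
{
	uint64_t segment_start = paca->kstack & ~(SEGMENT_SIZE - 1);

	return segment_start > KERNEL_BASE;
}

static void ordering_selfcheck(void)
{
	struct paca_model paca = { .kstack = 0 };
	uint64_t stack = KERNEL_BASE + 5 * SEGMENT_SIZE + 0x3e00;

	/* Old order: the check runs before kstack is stored, so nothing is bolted. */
	assert(!stack_needs_bolted_entry(&paca));

	/* New order: kstack is stored first, so the check sees the real stack. */
	paca.kstack = stack;
	assert(stack_needs_bolted_entry(&paca));
}
```

In this model the decision only comes out right once kstack has been stored, which is why the patch moves the call below the std to PACAKSAVE.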


* Re: 2.6.35-stable/ppc64/p7: suspicious rcu_dereference_check() usage detected during 2.6.35-stable boot
From: Subrata Modak @ 2010-08-13  6:55 UTC
  To: paulmck, Ingo Molnar, balbir
  Cc: sachinp, Peter Zijlstra, Li Zefan, linux-kernel, Linuxppc-dev,
	Ingo Molnar, DIVYA PRAKASH
In-Reply-To: <20100809161200.GC3026@linux.vnet.ibm.com>

Hi Paul,

Is there any specific person(s) to whom we should direct this mail?
We have not received any response from the cgroup developers on this.
Kindly let me know whom to contact. I am adding a few more people
I know :-)

Regards--
Subrata

On Mon, 2010-08-09 at 09:12 -0700, Paul E. McKenney wrote:
> On Mon, Aug 02, 2010 at 02:22:12PM +0530, Subrata Modak wrote:
> > Hi,
> > 
> > The following suspicious rcu_dereference_check() usage is detected
> > during 2.6.35-stable boot on my ppc64/p7 machine:
> > 
> > ==================================================
> > [ INFO: suspicious rcu_dereference_check() usage. ]
> > ---------------------------------------------------
> > kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> > other info that might help us debug this:
> > 
> > rcu_scheduler_active = 1, debug_locks = 0
> > 1 lock held by swapper/1:
> >  #0:  (&rq->lock){-.....}, at: [<c0000000007ca2f8>] .init_idle+0x78/0x4a8
> > stack backtrace:
> > Call Trace:
> > [c000000f392bf990] [c000000000014f04] .show_stack+0xb0/0x1a0 (unreliable)
> > [c000000f392bfa50] [c0000000007c87b4] .dump_stack+0x28/0x3c
> > [c000000f392bfad0] [c000000000103e1c] .lockdep_rcu_dereference+0xbc/0xe4
> > [c000000f392bfb70] [c0000000007ca434] .init_idle+0x1b4/0x4a8
> > [c000000f392bfc30] [c0000000007cad04] .fork_idle+0xa4/0xd0
> > [c000000f392bfe30] [c000000000aefaac] .smp_prepare_cpus+0x23c/0x2f4
> > [c000000f392bfed0] [c000000000ae1424] .kernel_init+0xec/0x32c
> > [c000000f392bff90] [c000000000033f40] .kernel_thread+0x54/0x70
> > ==================================================
> > 
> > Please note that this was reported earlier on 2.6.34-rc6:
> > http://marc.info/?l=linux-kernel&m=127313031922395&w=2,
> > The issue was fixed with:
> > 	commit 1ce7e4ff24fe338438bc7837e02780f202bf202b
> > 	Author: Li Zefan <lizf@cn.fujitsu.com>
> > 	Date:   Fri Apr 23 10:35:52 2010 +0800
> > 	cgroup: Check task_lock in task_subsys_state()
> > 
> > According to:
> > 	http://lkml.org/lkml/2010/7/1/883,
> > 	commit dc61b1d65e353d638b2445f71fb8e5b5630f2415
> > 	Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > 	Date:   Tue Jun 8 11:40:42 2010 +0200
> > 	sched: Fix PROVE_RCU vs cpu_cgroup
> > should have fixed this. But this is reproducible on 2.6.35-stable.
> > 
> > Please also see the config file attached.
> 
> Hello, Subrata,
> 
> Thank you for locating this one!  This looks like the same issue that
> Ilia Mirkin located.  Please see below for my analysis -- no fix yet,
> as I need confirmation from cgroups experts.  I can easily create a
> patch that suppresses the warning, but I don't yet know whether this is
> the right thing to do.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> On Thu, Aug 05, 2010 at 01:31:10PM -0400, Ilia Mirkin wrote:
> > On Thu, Jul 1, 2010 at 6:18 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > > On Thu, Jul 01, 2010 at 08:21:43AM -0400, Miles Lane wrote:
> > >> [ INFO: suspicious rcu_dereference_check() usage. ]
> > >> ---------------------------------------------------
> > >> kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> > >>
> > >> other info that might help us debug this:
> > >>
> > >> rcu_scheduler_active = 1, debug_locks = 1
> > >> 3 locks held by swapper/1:
> > >>   #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81042914>] cpu_maps_update_begin+0x12/0x14
> > >>   #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8104294f>] cpu_hotplug_begin+0x27/0x4e
> > >>   #2:  (&rq->lock){-.-...}, at: [<ffffffff812f8502>] init_idle+0x2b/0x114
> > >
> > > Hello, Miles!
> > >
> > > I believe that this one is fixed by commit dc61b1d6 in -tip.
> > 
> > Hi Paul,
> > 
> > Looks like that commit made it into 2.6.35:
> > 
> > git tag -l --contains dc61b1d65e353d638b2445f71fb8e5b5630f2415 v2.6.35*
> > v2.6.35
> > v2.6.35-rc4
> > v2.6.35-rc5
> > v2.6.35-rc6
> > 
> > However I still get:
> > 
> > [    0.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping 03
> > [    0.052999] lockdep: fixing up alternatives.
> > [    0.054105]
> > [    0.054106] ===================================================
> > [    0.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
> > [    0.054999] ---------------------------------------------------
> > [    0.054999] kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> > [    0.054999]
> > [    0.054999] other info that might help us debug this:
> > [    0.054999]
> > [    0.054999]
> > [    0.054999] rcu_scheduler_active = 1, debug_locks = 1
> > [    0.054999] 3 locks held by swapper/1:
> > [    0.054999]  #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff814be933>] cpu_up+0x42/0x6a
> > [    0.054999]  #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810400d8>] cpu_hotplug_begin+0x2a/0x51
> > [    0.054999]  #2:  (&rq->lock){-.-...}, at: [<ffffffff814be2f7>] init_idle+0x2f/0x113
> > [    0.054999]
> > [    0.054999] stack backtrace:
> > [    0.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
> > [    0.054999] Call Trace:
> > [    0.054999]  [<ffffffff81068054>] lockdep_rcu_dereference+0x9b/0xa3
> > [    0.054999]  [<ffffffff810325c3>] task_group+0x7b/0x8a
> > [    0.054999]  [<ffffffff810325e5>] set_task_rq+0x13/0x40
> > [    0.054999]  [<ffffffff814be39a>] init_idle+0xd2/0x113
> > [    0.054999]  [<ffffffff814be78a>] fork_idle+0xb8/0xc7
> > [    0.054999]  [<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
> > [    0.054999]  [<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
> > [    0.054999]  [<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
> > [    0.054999]  [<ffffffff814bcea6>] ? do_fork_idle+0x0/0x2b
> > [    0.054999]  [<ffffffff814be876>] _cpu_up+0xac/0x127
> > [    0.054999]  [<ffffffff814be946>] cpu_up+0x55/0x6a
> > [    0.054999]  [<ffffffff81ab562a>] kernel_init+0xe1/0x1ff
> > [    0.054999]  [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
> > [    0.054999]  [<ffffffff814c353c>] ? restore_args+0x0/0x30
> > [    0.054999]  [<ffffffff81ab5549>] ? kernel_init+0x0/0x1ff
> > [    0.054999]  [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
> > [    0.056074] Booting Node   0, Processors  #1lockdep: fixing up alternatives.
> > [    0.130045]  #2lockdep: fixing up alternatives.
> > [    0.203089]  #3 Ok.
> > [    0.275286] Brought up 4 CPUs
> > [    0.276005] Total of 4 processors activated (16017.17 BogoMIPS).
> 
> This does look like a new one, thank you for reporting it!
> 
> Here is my analysis, which should at least provide some humor value to
> those who understand the code better than I do.  ;-)
> 
> So the corresponding rcu_dereference_check() is in
> task_subsys_state_check(), and is fetching the cpu_cgroup_subsys_id
> element of the newly created task's task->cgroups->subsys[] array.
> The "git grep" command finds only three uses of cpu_cgroup_subsys_id,
> but no definition.
> 
> Now, fork_idle() invokes copy_process(), which invokes cgroup_fork(),
> which sets the child process's ->cgroups pointer to that of the parent,
> also invoking get_css_set(), which increments the corresponding reference
> count, doing both operations under task_lock() protection (->alloc_lock).
> Because fork_idle() does not specify any of CLONE_NEWNS, CLONE_NEWUTS,
> CLONE_NEWIPC, CLONE_NEWPID, or CLONE_NEWNET, copy_namespaces() should
> not create a new namespace, and so there should be no ns_cgroup_clone().
> We should thus retain the parent's ->cgroups pointer.  And copy_process()
> installs the new task in the various lists, so that the task is externally
> accessible upon return.
> 
> After a non-error return from copy_process(), fork_idle() invokes
> init_idle_pid(), which does not appear to affect the task's cgroup
> state.  Next fork_idle() invokes init_idle(), which in turn invokes
> __set_task_cpu(), which invokes set_task_rq(), which calls task_group()
> several times, which calls task_subsys_state_check(), which calls the
> rcu_dereference_check() that complained above.
> 
> However, the result returned by rcu_dereference_check() is stored into
> the task structure:
> 
> 	p->se.cfs_rq = task_group(p)->cfs_rq[cpu];
> 	p->se.parent = task_group(p)->se[cpu];
> 
> This means that the corresponding structure must have been tied down with
> a reference count or some such.  If such a reference has been taken, then
> this complaint is a false positive, and could be suppressed by putting
> rcu_read_lock() and rcu_read_unlock() around the call to init_idle()
> from fork_idle().  However, although a reference to the enclosing ->cgroups
> struct css_set is held, it is not clear to me that this reference applies
> to the structures pointed to by the ->subsys[] array, especially given
> that the cgroup_subsys_state structures referenced by this array have
> their own reference count, which does not appear to me to be acquired
> by this code path.
> 
> Or are the cgroup_subsys_state structures referenced by idle tasks
> never freed or some such?
> 
> 							Thanx, Paul
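
The suppression discussed in the analysis above (wrapping the init_idle() call in rcu_read_lock()/rcu_read_unlock()) can be illustrated with a toy model. Everything here is hypothetical userspace code, not kernel API: the model_* functions and the counters only mimic the shape of the check, and it is an assumption that the check accepts either an RCU read-side section or task_lock, as the commit titles quoted earlier suggest. The model says nothing about whether the suppression is actually safe, which is the open question in the mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy state standing in for what lockdep tracks; all names are hypothetical. */
static int rcu_read_depth;     /* nesting depth of RCU read-side sections */
static bool task_lock_held;    /* models task_lock() (->alloc_lock) */
static bool rq_lock_held;      /* models rq->lock; not consulted by the check */
static int warnings;           /* number of "suspicious usage" reports */

static void model_rcu_read_lock(void)   { rcu_read_depth++; }
static void model_rcu_read_unlock(void) { rcu_read_depth--; }

/* Return the pointer, counting a warning if no accepted protection is held. */
static void *model_rcu_dereference_check(void *p)
{
	if (rcu_read_depth == 0 && !task_lock_held)
		warnings++;
	return p;
}

/* Models init_idle(): takes rq->lock, then looks up the task's group. */
static void *model_init_idle(void *cgroup_state)
{
	void *ret;

	rq_lock_held = true;
	ret = model_rcu_dereference_check(cgroup_state);
	rq_lock_held = false;
	return ret;
}

static void rcu_selfcheck(void)
{
	int dummy = 0;

	warnings = 0;
	model_init_idle(&dummy);       /* bare call: the check complains */
	assert(warnings == 1);

	model_rcu_read_lock();         /* the suppression discussed above */
	model_init_idle(&dummy);
	model_rcu_read_unlock();
	assert(warnings == 1);         /* no new warning was counted */
}
```

Holding rq->lock alone does not satisfy the modelled check, which matches the reports above where &rq->lock is listed as held and the warning still fires.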


* [PATCH 2/2] powerpc: Dynamically allocate most lppaca structs
From: Paul Mackerras @ 2010-08-13  6:18 UTC
  To: linuxppc-dev
In-Reply-To: <20100813061815.GA30234@drongo>

This arranges for the lppaca structs for most cpus to be dynamically
allocated in the same manner as the paca structs.  If we don't include
support for legacy iSeries, only the first lppaca is statically
allocated; the rest are dynamically allocated.  If we include legacy
iSeries support, then we statically allocate the first 64 lppaca
structs, since the iSeries hypervisor requires that the lppaca
structs be present in the data section of the kernel image, but
legacy iSeries supports at most 64 cpus.

With CONFIG_NR_CPUS=1024, the kernel image size for a typical pSeries config
went from:

   text    data     bss     dec     hex filename
9524478 4734564 8469944 22728986        15ad11a ../test-1024/vmlinux

to:

   text    data     bss     dec     hex filename
9524482 3751508 8469944 21745934        14bd10e ../test-1024/vmlinux

a reduction of 983052 bytes overall.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/lppaca.h |    2 +-
 arch/powerpc/kernel/paca.c        |   70 +++++++++++++++++++++++++++++++++++-
 2 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 6b73554..6d02624 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -153,7 +153,7 @@ struct lppaca {
 
 extern struct lppaca lppaca[];
 
-#define lppaca_of(cpu)	(lppaca[cpu])
+#define lppaca_of(cpu)	(*paca[cpu].lppaca_ptr)
 
 /*
  * SLB shadow buffer structure as defined in the PAPR.  The save_area
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index d0a26f1..1e068a4 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -27,6 +27,20 @@ extern unsigned long __toc_start;
 #ifdef CONFIG_PPC_BOOK3S
 
 /*
+ * We only have to have statically allocated lppaca structs on
+ * legacy iSeries, which supports at most 64 cpus.
+ */
+#ifdef CONFIG_PPC_ISERIES
+#if NR_CPUS < 64
+#define NR_LPPACAS	NR_CPUS
+#else
+#define NR_LPPACAS	64
+#endif
+#else /* not iSeries */
+#define NR_LPPACAS	1
+#endif
+
+/*
  * The structure which the hypervisor knows about - this structure
  * should not cross a page boundary.  The vpa_init/register_vpa call
  * is now known to fail if the lppaca structure crosses a page
@@ -36,7 +50,7 @@ extern unsigned long __toc_start;
  * will suffice to ensure that it doesn't cross a page boundary.
  */
 struct lppaca lppaca[] = {
-	[0 ... (NR_CPUS-1)] = {
+	[0 ... (NR_LPPACAS-1)] = {
 		.desc = 0xd397d781,	/* "LpPa" */
 		.size = sizeof(struct lppaca),
 		.dyn_proc_status = 2,
@@ -49,6 +63,54 @@ struct lppaca lppaca[] = {
 	},
 };
 
+static struct lppaca *extra_lppacas;
+static long __initdata lppaca_size;
+
+static void allocate_lppacas(int nr_cpus, unsigned long limit)
+{
+	if (nr_cpus <= NR_LPPACAS)
+		return;
+
+	lppaca_size = PAGE_ALIGN(sizeof(struct lppaca) *
+				 (nr_cpus - NR_LPPACAS));
+	extra_lppacas = __va(memblock_alloc_base(lppaca_size,
+						 PAGE_SIZE, limit));
+}
+
+static struct lppaca *new_lppaca(int cpu)
+{
+	struct lppaca *lp;
+
+	if (cpu < NR_LPPACAS)
+		return &lppaca[cpu];
+
+	lp = extra_lppacas + (cpu - NR_LPPACAS);
+	*lp = lppaca[0];
+
+	return lp;
+}
+
+static void free_lppacas(void)
+{
+	long new_size = 0, nr;
+
+	if (!lppaca_size)
+		return;
+	nr = num_possible_cpus() - NR_LPPACAS;
+	if (nr > 0)
+		new_size = PAGE_ALIGN(nr * sizeof(struct lppaca));
+	if (new_size >= lppaca_size)
+		return;
+
+	memblock_free(__pa(extra_lppacas) + new_size, lppaca_size - new_size);
+	lppaca_size = new_size;
+}
+
+#else
+
+static inline void allocate_lppacas(int nr_cpus, unsigned long limit) { }
+static inline void free_lppacas(void) { }
+
 #endif /* CONFIG_PPC_BOOK3S */
 
 #ifdef CONFIG_PPC_STD_MMU_64
@@ -88,7 +150,7 @@ void __init initialise_paca(struct paca_struct *new_paca, int cpu)
 	unsigned long kernel_toc = (unsigned long)(&__toc_start) + 0x8000UL;
 
 #ifdef CONFIG_PPC_BOOK3S
-	new_paca->lppaca_ptr = &lppaca[cpu];
+	new_paca->lppaca_ptr = new_lppaca(cpu);
 #else
 	new_paca->kernel_pgd = swapper_pg_dir;
 #endif
@@ -144,6 +206,8 @@ void __init allocate_pacas(void)
 	printk(KERN_DEBUG "Allocated %u bytes for %d pacas at %p\n",
 		paca_size, nr_cpus, paca);
 
+	allocate_lppacas(nr_cpus, limit);
+
 	/* Can't use for_each_*_cpu, as they aren't functional yet */
 	for (cpu = 0; cpu < nr_cpus; cpu++)
 		initialise_paca(&paca[cpu], cpu);
@@ -164,4 +228,6 @@ void __init free_unused_pacas(void)
 		paca_size - new_size);
 
 	paca_size = new_size;
+
+	free_lppacas();
 }
-- 
1.7.1
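
The split between statically and dynamically allocated entries can be modelled in userspace as follows. This is a sketch under stated assumptions, not the kernel code: the model_* names and struct lppaca_model are hypothetical, calloc() stands in for memblock_alloc_base(), and NR_STATIC is fixed at 64 as in the iSeries case.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_STATIC    64             /* models NR_LPPACAS with iSeries support */
#define LPPACA_DESC  0xd397d781u    /* "LpPa", as in the patch */

struct lppaca_model {
	unsigned int desc;
};

static struct lppaca_model static_pool[NR_STATIC];
static struct lppaca_model *extra_pool;   /* covers cpus >= NR_STATIC */

/* Allocate room for the cpus the static pool cannot cover; 0 on success.
 * The kernel initialises the static array at build time; a loop does it here. */
static int model_allocate(int nr_cpus)
{
	for (int i = 0; i < NR_STATIC; i++)
		static_pool[i].desc = LPPACA_DESC;
	if (nr_cpus <= NR_STATIC)
		return 0;
	extra_pool = calloc((size_t)(nr_cpus - NR_STATIC), sizeof(*extra_pool));
	return extra_pool ? 0 : -1;
}

/* Same shape as new_lppaca(): static entry if possible, else a copy of entry 0. */
static struct lppaca_model *model_new(int cpu)
{
	struct lppaca_model *lp;

	if (cpu < NR_STATIC)
		return &static_pool[cpu];
	lp = extra_pool + (cpu - NR_STATIC);
	*lp = static_pool[0];
	return lp;
}

/* Bytes no longer reserved in the image for a given per-entry size. */
static unsigned long model_bytes_saved(unsigned long nr_cpus,
				       unsigned long entry_size)
{
	return nr_cpus > NR_STATIC ? (nr_cpus - NR_STATIC) * entry_size : 0;
}

static void lppaca_selfcheck(void)
{
	assert(model_allocate(NR_STATIC + 2) == 0);
	assert(model_new(NR_STATIC + 1)->desc == LPPACA_DESC);
	free(extra_pool);
	extra_pool = NULL;
}
```

If each entry occupies 1024 bytes (an assumption about the struct's size and alignment) and the measured config kept 64 static entries, model_bytes_saved(1024, 1024) gives 983040, which is close to the 983056-byte drop in the data column quoted in the commit message.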


* [PATCH 1/2] powerpc: Abstract indexing of lppaca structs
From: Paul Mackerras @ 2010-08-13  6:18 UTC
  To: linuxppc-dev

Currently we have the lppaca structs as a simple array of NR_CPUS
entries, taking up space in the data section of the kernel image.
In future we would like to allocate them dynamically, so this
abstracts out the accesses to the array, making it easier to
change how we locate the lppaca for a given cpu in future.
Specifically, lppaca[cpu] changes to lppaca_of(cpu).

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/lppaca.h     |    2 ++
 arch/powerpc/kernel/lparcfg.c         |   14 +++++++-------
 arch/powerpc/lib/locks.c              |    4 ++--
 arch/powerpc/platforms/iseries/dt.c   |    4 ++--
 arch/powerpc/platforms/iseries/smp.c  |    2 +-
 arch/powerpc/platforms/pseries/dtl.c  |    8 ++++----
 arch/powerpc/platforms/pseries/lpar.c |    4 ++--
 7 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 14b592d..6b73554 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -153,6 +153,8 @@ struct lppaca {
 
 extern struct lppaca lppaca[];
 
+#define lppaca_of(cpu)	(lppaca[cpu])
+
 /*
  * SLB shadow buffer structure as defined in the PAPR.  The save_area
  * contains adjacent ESID and VSID pairs for each shadowed SLB.  The
diff --git a/arch/powerpc/kernel/lparcfg.c b/arch/powerpc/kernel/lparcfg.c
index 50362b6..8d9e3b9 100644
--- a/arch/powerpc/kernel/lparcfg.c
+++ b/arch/powerpc/kernel/lparcfg.c
@@ -56,7 +56,7 @@ static unsigned long get_purr(void)
 
 	for_each_possible_cpu(cpu) {
 		if (firmware_has_feature(FW_FEATURE_ISERIES))
-			sum_purr += lppaca[cpu].emulated_time_base;
+			sum_purr += lppaca_of(cpu).emulated_time_base;
 		else {
 			struct cpu_usage *cu;
 
@@ -263,7 +263,7 @@ static void parse_ppp_data(struct seq_file *m)
 	           ppp_data.active_system_procs);
 
 	/* pool related entries are apropriate for shared configs */
-	if (lppaca[0].shared_proc) {
+	if (lppaca_of(0).shared_proc) {
 		unsigned long pool_idle_time, pool_procs;
 
 		seq_printf(m, "pool=%d\n", ppp_data.pool_num);
@@ -460,8 +460,8 @@ static void pseries_cmo_data(struct seq_file *m)
 		return;
 
 	for_each_possible_cpu(cpu) {
-		cmo_faults += lppaca[cpu].cmo_faults;
-		cmo_fault_time += lppaca[cpu].cmo_fault_time;
+		cmo_faults += lppaca_of(cpu).cmo_faults;
+		cmo_fault_time += lppaca_of(cpu).cmo_fault_time;
 	}
 
 	seq_printf(m, "cmo_faults=%lu\n", cmo_faults);
@@ -479,8 +479,8 @@ static void splpar_dispatch_data(struct seq_file *m)
 	unsigned long dispatch_dispersions = 0;
 
 	for_each_possible_cpu(cpu) {
-		dispatches += lppaca[cpu].yield_count;
-		dispatch_dispersions += lppaca[cpu].dispersion_count;
+		dispatches += lppaca_of(cpu).yield_count;
+		dispatch_dispersions += lppaca_of(cpu).dispersion_count;
 	}
 
 	seq_printf(m, "dispatches=%lu\n", dispatches);
@@ -545,7 +545,7 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
 	seq_printf(m, "partition_potential_processors=%d\n",
 		   partition_potential_processors);
 
-	seq_printf(m, "shared_processor_mode=%d\n", lppaca[0].shared_proc);
+	seq_printf(m, "shared_processor_mode=%d\n", lppaca_of(0).shared_proc);
 
 	seq_printf(m, "slb_size=%d\n", mmu_slb_size);
 
diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index 58e14fb..9b8182e 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -34,7 +34,7 @@ void __spin_yield(arch_spinlock_t *lock)
 		return;
 	holder_cpu = lock_value & 0xffff;
 	BUG_ON(holder_cpu >= NR_CPUS);
-	yield_count = lppaca[holder_cpu].yield_count;
+	yield_count = lppaca_of(holder_cpu).yield_count;
 	if ((yield_count & 1) == 0)
 		return;		/* virtual cpu is currently running */
 	rmb();
@@ -65,7 +65,7 @@ void __rw_yield(arch_rwlock_t *rw)
 		return;		/* no write lock at present */
 	holder_cpu = lock_value & 0xffff;
 	BUG_ON(holder_cpu >= NR_CPUS);
-	yield_count = lppaca[holder_cpu].yield_count;
+	yield_count = lppaca_of(holder_cpu).yield_count;
 	if ((yield_count & 1) == 0)
 		return;		/* virtual cpu is currently running */
 	rmb();
diff --git a/arch/powerpc/platforms/iseries/dt.c b/arch/powerpc/platforms/iseries/dt.c
index 7f45a51..fdb7384 100644
--- a/arch/powerpc/platforms/iseries/dt.c
+++ b/arch/powerpc/platforms/iseries/dt.c
@@ -243,7 +243,7 @@ static void __init dt_cpus(struct iseries_flat_dt *dt)
 	pft_size[1] = __ilog2(HvCallHpt_getHptPages() * HW_PAGE_SIZE);
 
 	for (i = 0; i < NR_CPUS; i++) {
-		if (lppaca[i].dyn_proc_status >= 2)
+		if (lppaca_of(i).dyn_proc_status >= 2)
 			continue;
 
 		snprintf(p, 32 - (p - buf), "@%d", i);
@@ -251,7 +251,7 @@ static void __init dt_cpus(struct iseries_flat_dt *dt)
 
 		dt_prop_str(dt, "device_type", device_type_cpu);
 
-		index = lppaca[i].dyn_hv_phys_proc_index;
+		index = lppaca_of(i).dyn_hv_phys_proc_index;
 		d = &xIoHriProcessorVpd[index];
 
 		dt_prop_u32(dt, "i-cache-size", d->xInstCacheSize * 1024);
diff --git a/arch/powerpc/platforms/iseries/smp.c b/arch/powerpc/platforms/iseries/smp.c
index 6590850..6c60299 100644
--- a/arch/powerpc/platforms/iseries/smp.c
+++ b/arch/powerpc/platforms/iseries/smp.c
@@ -91,7 +91,7 @@ static void smp_iSeries_kick_cpu(int nr)
 	BUG_ON((nr < 0) || (nr >= NR_CPUS));
 
 	/* Verify that our partition has a processor nr */
-	if (lppaca[nr].dyn_proc_status >= 2)
+	if (lppaca_of(nr).dyn_proc_status >= 2)
 		return;
 
 	/* The processor is currently spinning, waiting
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index a00addb..adfd544 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -107,14 +107,14 @@ static int dtl_enable(struct dtl *dtl)
 	}
 
 	/* set our initial buffer indices */
-	dtl->last_idx = lppaca[dtl->cpu].dtl_idx = 0;
+	dtl->last_idx = lppaca_of(dtl->cpu).dtl_idx = 0;
 
 	/* ensure that our updates to the lppaca fields have occurred before
 	 * we actually enable the logging */
 	smp_wmb();
 
 	/* enable event logging */
-	lppaca[dtl->cpu].dtl_enable_mask = dtl_event_mask;
+	lppaca_of(dtl->cpu).dtl_enable_mask = dtl_event_mask;
 
 	return 0;
 }
@@ -123,7 +123,7 @@ static void dtl_disable(struct dtl *dtl)
 {
 	int hwcpu = get_hard_smp_processor_id(dtl->cpu);
 
-	lppaca[dtl->cpu].dtl_enable_mask = 0x0;
+	lppaca_of(dtl->cpu).dtl_enable_mask = 0x0;
 
 	unregister_dtl(hwcpu, __pa(dtl->buf));
 
@@ -171,7 +171,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 	/* actual number of entries read */
 	n_read = 0;
 
-	cur_idx = lppaca[dtl->cpu].dtl_idx;
+	cur_idx = lppaca_of(dtl->cpu).dtl_idx;
 	last_idx = dtl->last_idx;
 
 	if (cur_idx - last_idx > dtl->buf_entries) {
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index cf79b46..a17fe4a 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -250,9 +250,9 @@ void vpa_init(int cpu)
 	long ret;
 
 	if (cpu_has_feature(CPU_FTR_ALTIVEC))
-		lppaca[cpu].vmxregs_in_use = 1;
+		lppaca_of(cpu).vmxregs_in_use = 1;
 
-	addr = __pa(&lppaca[cpu]);
+	addr = __pa(&lppaca_of(cpu));
 	ret = register_vpa(hwcpu, addr);
 
 	if (ret) {
-- 
1.7.1
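
The reason an accessor macro is enough here is that both forms expand to an lvalue, so callers can read, assign and take the address through it without caring how the struct is located. A minimal userspace sketch, with hypothetical model types and a _v1/_v2 suffix added only so both definitions can coexist in one file:

```c
#include <assert.h>

struct lppaca_model { unsigned long yield_count; };
struct paca_model   { struct lppaca_model *lppaca_ptr; };

#define NR_MODEL_CPUS 4

static struct lppaca_model lppaca_array[NR_MODEL_CPUS];
static struct paca_model   paca_array[NR_MODEL_CPUS];

/* Patch 1/2 form: direct array indexing. */
#define lppaca_of_v1(cpu)  (lppaca_array[cpu])
/* Patch 2/2 form: indirection through the per-cpu paca. */
#define lppaca_of_v2(cpu)  (*paca_array[cpu].lppaca_ptr)

static void accessor_selfcheck(void)
{
	for (int i = 0; i < NR_MODEL_CPUS; i++)
		paca_array[i].lppaca_ptr = &lppaca_array[i];

	lppaca_of_v1(2).yield_count = 7;               /* assignment through the macro */
	assert(lppaca_of_v2(2).yield_count == 7);      /* both forms name the same object */
	assert(&lppaca_of_v1(2) == &lppaca_of_v2(2));  /* address-of works on both */
}
```

Call sites such as __pa(&lppaca_of(cpu)) in vpa_init() rely on exactly this property, so they need no further change when the definition switches to the pointer form.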


* Re: [PATCH] powerpc: Add support for popcnt instructions
From: Anton Blanchard @ 2010-08-13  5:38 UTC
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1281673744.2987.362.camel@pasglop>

 
Hi,

> Especially from modules it will suck big time. If kept out of line they
> should probably be linked-in with each module, but I'd rather have them
> inlined.

Inlining would be good, but this is as far as I can take this for now.
If someone else is interested go for it :)

Anton
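
For readers following the inline vs out-of-line point, the general shape of "inline with an out of line fallback" looks roughly like this in C. It is a sketch only: cpu_has_popcnt is a hypothetical runtime flag (the kernel would patch code based on CPU feature bits instead), the _model names are made up, and __builtin_popcount (a GCC/Clang builtin) stands in for the popcntw instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical runtime flag; not how the kernel selects the path. */
static bool cpu_has_popcnt;

/* Software fallback (classic parallel bit count). */
static unsigned int sw_hweight32_model(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	return (w * 0x01010101u) >> 24;
}

/* Inline wrapper: fast path at the call site, fallback via a function call. */
static inline unsigned int hweight32_model(uint32_t w)
{
	if (cpu_has_popcnt)
		return (unsigned int)__builtin_popcount(w);  /* stand-in for popcntw */
	return sw_hweight32_model(w);
}

static void hweight_selfcheck(void)
{
	const uint32_t samples[] = { 0u, 1u, 0xf0f0u, 0x80000001u, 0xffffffffu };

	for (unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		cpu_has_popcnt = false;
		unsigned int slow = hweight32_model(samples[i]);
		cpu_has_popcnt = true;
		unsigned int fast = hweight32_model(samples[i]);
		assert(slow == fast);   /* both paths must agree */
	}
}
```

With this shape a module pays a function call only on CPUs that lack the instruction, which is the cost the thread is concerned about.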


* Re: [PATCH] powerpc: Add support for popcnt instructions
From: Benjamin Herrenschmidt @ 2010-08-13  4:29 UTC
  To: Anton Blanchard; +Cc: linuxppc-dev
In-Reply-To: <20100813022809.GY29316@kryten>

On Fri, 2010-08-13 at 12:28 +1000, Anton Blanchard wrote:
> POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
> this patch does all the work out of line, but it would be nice to implement
> them as inlines with an out of line fallback.
> 
> The performance issue with hweight was noticed when disabling SMT on a large
> (192 thread) POWER7 box. The patch improves that testcase by about 8%.

Especially from modules it will suck big time. If kept out of line they
should probably be linked-in with each module, but I'd rather have them
inlined.

Cheers,
Ben.

> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
> 
> Index: powerpc.git/arch/powerpc/include/asm/cputable.h
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/include/asm/cputable.h	2010-08-13 11:19:42.691991439 +1000
> +++ powerpc.git/arch/powerpc/include/asm/cputable.h	2010-08-13 11:24:55.510741618 +1000
> @@ -199,6 +199,8 @@ extern const char *powerpc_base_platform
>  #define CPU_FTR_UNALIGNED_LD_STD	LONG_ASM_CONST(0x0080000000000000)
>  #define CPU_FTR_ASYM_SMT		LONG_ASM_CONST(0x0100000000000000)
>  #define CPU_FTR_STCX_CHECKS_ADDRESS	LONG_ASM_CONST(0x0200000000000000)
> +#define CPU_FTR_POPCNTB			LONG_ASM_CONST(0x0400000000000000)
> +#define CPU_FTR_POPCNTD			LONG_ASM_CONST(0x0800000000000000)
>  
>  #ifndef __ASSEMBLY__
>  
> @@ -403,21 +405,22 @@ extern const char *powerpc_base_platform
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
>  	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
> -	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS)
> +	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS | \
> +	    CPU_FTR_POPCNTB)
>  #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
>  	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
>  	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
>  	    CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
> -	    CPU_FTR_STCX_CHECKS_ADDRESS)
> +	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB)
>  #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
>  	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
>  	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
>  	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
> -	    CPU_FTR_STCX_CHECKS_ADDRESS)
> +	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD)
>  #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> Index: powerpc.git/arch/powerpc/lib/Makefile
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/lib/Makefile	2010-08-13 11:19:43.653241065 +1000
> +++ powerpc.git/arch/powerpc/lib/Makefile	2010-08-13 11:19:45.930743841 +1000
> @@ -18,7 +18,7 @@ obj-$(CONFIG_HAS_IOMEM)	+= devres.o
>  
>  obj-$(CONFIG_PPC64)	+= copypage_64.o copyuser_64.o \
>  			   memcpy_64.o usercopy_64.o mem_64.o string.o \
> -			   checksum_wrappers_64.o
> +			   checksum_wrappers_64.o hweight_64.o
>  obj-$(CONFIG_XMON)	+= sstep.o ldstfp.o
>  obj-$(CONFIG_KPROBES)	+= sstep.o ldstfp.o
>  obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= sstep.o ldstfp.o
> Index: powerpc.git/arch/powerpc/lib/hweight_64.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ powerpc.git/arch/powerpc/lib/hweight_64.S	2010-08-13 11:19:45.940741462 +1000
> @@ -0,0 +1,110 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> + *
> + * Copyright (C) IBM Corporation, 2010
> + *
> + * Author: Anton Blanchard <anton@au.ibm.com>
> + */
> +#include <asm/processor.h>
> +#include <asm/ppc_asm.h>
> +
> +/* Note: This code relies on -mminimal-toc */
> +
> +_GLOBAL(__arch_hweight8)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight8
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +	popcntb	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> +
> +_GLOBAL(__arch_hweight16)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight16
> +	nop
> +	nop
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +  BEGIN_FTR_SECTION_NESTED(50)
> +	popcntb r3,r3
> +	srdi	r4,r3,8
> +	add	r3,r4,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  FTR_SECTION_ELSE_NESTED(50)
> +	clrlwi  r3,r3,16
> +	popcntw	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 50)
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> +
> +_GLOBAL(__arch_hweight32)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight32
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +  BEGIN_FTR_SECTION_NESTED(51)
> +	popcntb r3,r3
> +	srdi	r4,r3,16
> +	add	r3,r4,r3
> +	srdi	r4,r3,8
> +	add	r3,r4,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  FTR_SECTION_ELSE_NESTED(51)
> +	popcntw	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 51)
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> +
> +_GLOBAL(__arch_hweight64)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight64
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +  BEGIN_FTR_SECTION_NESTED(52)
> +	popcntb r3,r3
> +	srdi	r4,r3,32
> +	add	r3,r4,r3
> +	srdi	r4,r3,16
> +	add	r3,r4,r3
> +	srdi	r4,r3,8
> +	add	r3,r4,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  FTR_SECTION_ELSE_NESTED(52)
> +	popcntd	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 52)
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> Index: powerpc.git/arch/powerpc/include/asm/bitops.h
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/include/asm/bitops.h	2010-08-13 11:06:20.991992998 +1000
> +++ powerpc.git/arch/powerpc/include/asm/bitops.h	2010-08-13 11:19:45.940741462 +1000
> @@ -267,7 +267,16 @@ static __inline__ int fls64(__u64 x)
>  #include <asm-generic/bitops/fls64.h>
>  #endif /* __powerpc64__ */
>  
> +#ifdef CONFIG_PPC64
> +unsigned int __arch_hweight8(unsigned int w);
> +unsigned int __arch_hweight16(unsigned int w);
> +unsigned int __arch_hweight32(unsigned int w);
> +unsigned long __arch_hweight64(__u64 w);
> +#include <asm-generic/bitops/const_hweight.h>
> +#else
>  #include <asm-generic/bitops/hweight.h>
> +#endif
> +
>  #include <asm-generic/bitops/find.h>
>  
>  /* Little-endian versions */
> Index: powerpc.git/arch/powerpc/kernel/ppc_ksyms.c
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:06:21.011991745 +1000
> +++ powerpc.git/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:19:45.940741462 +1000
> @@ -186,3 +186,10 @@ EXPORT_SYMBOL(__mtdcr);
>  EXPORT_SYMBOL(__mfdcr);
>  #endif
>  EXPORT_SYMBOL(empty_zero_page);
> +
> +#ifdef CONFIG_PPC64
> +EXPORT_SYMBOL(__arch_hweight8);
> +EXPORT_SYMBOL(__arch_hweight16);
> +EXPORT_SYMBOL(__arch_hweight32);
> +EXPORT_SYMBOL(__arch_hweight64);
> +#endif
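
The popcntb path of __arch_hweight64 in the quoted patch can be traced in C. popcntb_model() below is a hypothetical emulation of the instruction (per-byte population counts), and hweight64_fold_model() mirrors the shift-and-add sequence one step per line; neither is kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates popcntb: each byte of the result holds the popcount of the
 * corresponding byte of the input. */
static uint64_t popcntb_model(uint64_t x)
{
	uint64_t r = 0;

	for (int byte = 0; byte < 8; byte++) {
		uint64_t b = (x >> (8 * byte)) & 0xff;
		uint64_t count = 0;

		while (b) {
			count += b & 1;
			b >>= 1;
		}
		r |= count << (8 * byte);
	}
	return r;
}

/* Mirrors the popcntb path of __arch_hweight64 in the quoted patch. */
static unsigned long hweight64_fold_model(uint64_t w)
{
	uint64_t r3 = popcntb_model(w);     /* popcntb r3,r3                */

	r3 += r3 >> 32;                     /* srdi r4,r3,32 ; add r3,r4,r3 */
	r3 += r3 >> 16;                     /* srdi r4,r3,16 ; add r3,r4,r3 */
	r3 += r3 >> 8;                      /* srdi r4,r3,8  ; add r3,r4,r3 */
	return (unsigned long)(r3 & 0xff);  /* clrldi r3,r3,64-8            */
}

static void fold_selfcheck(void)
{
	assert(hweight64_fold_model(0) == 0);
	assert(hweight64_fold_model(~0ULL) == 64);
}
```

The low byte never overflows because the largest possible total is 64, so the final mask to 8 bits is sufficient.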


* [PATCH] powerpc: Add support for popcnt instructions
From: Anton Blanchard @ 2010-08-13  2:28 UTC
  To: benh; +Cc: linuxppc-dev


POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: powerpc.git/arch/powerpc/include/asm/cputable.h
===================================================================
--- powerpc.git.orig/arch/powerpc/include/asm/cputable.h	2010-08-13 11:19:42.691991439 +1000
+++ powerpc.git/arch/powerpc/include/asm/cputable.h	2010-08-13 11:24:55.510741618 +1000
@@ -199,6 +199,8 @@ extern const char *powerpc_base_platform
 #define CPU_FTR_UNALIGNED_LD_STD	LONG_ASM_CONST(0x0080000000000000)
 #define CPU_FTR_ASYM_SMT		LONG_ASM_CONST(0x0100000000000000)
 #define CPU_FTR_STCX_CHECKS_ADDRESS	LONG_ASM_CONST(0x0200000000000000)
+#define CPU_FTR_POPCNTB			LONG_ASM_CONST(0x0400000000000000)
+#define CPU_FTR_POPCNTD			LONG_ASM_CONST(0x0800000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -403,21 +405,22 @@ extern const char *powerpc_base_platform
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
-	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS | \
+	    CPU_FTR_POPCNTB)
 #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB)
 #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD)
 #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
Index: powerpc.git/arch/powerpc/lib/Makefile
===================================================================
--- powerpc.git.orig/arch/powerpc/lib/Makefile	2010-08-13 11:19:43.653241065 +1000
+++ powerpc.git/arch/powerpc/lib/Makefile	2010-08-13 11:19:45.930743841 +1000
@@ -18,7 +18,7 @@ obj-$(CONFIG_HAS_IOMEM)	+= devres.o
 
 obj-$(CONFIG_PPC64)	+= copypage_64.o copyuser_64.o \
 			   memcpy_64.o usercopy_64.o mem_64.o string.o \
-			   checksum_wrappers_64.o
+			   checksum_wrappers_64.o hweight_64.o
 obj-$(CONFIG_XMON)	+= sstep.o ldstfp.o
 obj-$(CONFIG_KPROBES)	+= sstep.o ldstfp.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= sstep.o ldstfp.o
Index: powerpc.git/arch/powerpc/lib/hweight_64.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ powerpc.git/arch/powerpc/lib/hweight_64.S	2010-08-13 11:19:45.940741462 +1000
@@ -0,0 +1,110 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2010
+ *
+ * Author: Anton Blanchard <anton@au.ibm.com>
+ */
+#include <asm/processor.h>
+#include <asm/ppc_asm.h>
+
+/* Note: This code relies on -mminimal-toc */
+
+_GLOBAL(__arch_hweight8)
+BEGIN_FTR_SECTION
+	b .__sw_hweight8
+	nop
+	nop
+FTR_SECTION_ELSE
+	popcntb	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight16)
+BEGIN_FTR_SECTION
+	b .__sw_hweight16
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(50)
+	popcntb r3,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(50)
+	clrlwi  r3,r3,16
+	popcntw	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 50)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight32)
+BEGIN_FTR_SECTION
+	b .__sw_hweight32
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(51)
+	popcntb r3,r3
+	srdi	r4,r3,16
+	add	r3,r4,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(51)
+	popcntw	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 51)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight64)
+BEGIN_FTR_SECTION
+	b .__sw_hweight64
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(52)
+	popcntb r3,r3
+	srdi	r4,r3,32
+	add	r3,r4,r3
+	srdi	r4,r3,16
+	add	r3,r4,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(52)
+	popcntd	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 52)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
Index: powerpc.git/arch/powerpc/include/asm/bitops.h
===================================================================
--- powerpc.git.orig/arch/powerpc/include/asm/bitops.h	2010-08-13 11:06:20.991992998 +1000
+++ powerpc.git/arch/powerpc/include/asm/bitops.h	2010-08-13 11:19:45.940741462 +1000
@@ -267,7 +267,16 @@ static __inline__ int fls64(__u64 x)
 #include <asm-generic/bitops/fls64.h>
 #endif /* __powerpc64__ */
 
+#ifdef CONFIG_PPC64
+unsigned int __arch_hweight8(unsigned int w);
+unsigned int __arch_hweight16(unsigned int w);
+unsigned int __arch_hweight32(unsigned int w);
+unsigned long __arch_hweight64(__u64 w);
+#include <asm-generic/bitops/const_hweight.h>
+#else
 #include <asm-generic/bitops/hweight.h>
+#endif
+
 #include <asm-generic/bitops/find.h>
 
 /* Little-endian versions */
Index: powerpc.git/arch/powerpc/kernel/ppc_ksyms.c
===================================================================
--- powerpc.git.orig/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:06:21.011991745 +1000
+++ powerpc.git/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:19:45.940741462 +1000
@@ -186,3 +186,10 @@ EXPORT_SYMBOL(__mtdcr);
 EXPORT_SYMBOL(__mfdcr);
 #endif
 EXPORT_SYMBOL(empty_zero_page);
+
+#ifdef CONFIG_PPC64
+EXPORT_SYMBOL(__arch_hweight8);
+EXPORT_SYMBOL(__arch_hweight16);
+EXPORT_SYMBOL(__arch_hweight32);
+EXPORT_SYMBOL(__arch_hweight64);
+#endif
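
For readers following the popcntb path in hweight_64.S above, here is a
minimal C sketch of the same reduction. It is an illustration only and not
part of the patch: popcntb_model() is a hypothetical software stand-in for
the instruction (it leaves a per-byte bit count in each byte), and
hweight64_via_popcntb() mirrors the srdi/add/clrldi sequence that folds
those eight counts into the low byte.

```c
#include <stdint.h>

/*
 * Hypothetical software model of popcntb: every byte of the result holds
 * the number of set bits in the corresponding byte of x (0..8).
 */
uint64_t popcntb_model(uint64_t x)
{
	x = x - ((x >> 1) & 0x5555555555555555ULL);
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	return x;
}

/*
 * Fold the eight per-byte counts into the low byte, following the
 * srdi 32 / add, srdi 16 / add, srdi 8 / add, clrldi 64-8 sequence.
 * No byte can overflow: the largest possible sum is 64.
 */
unsigned int hweight64_via_popcntb(uint64_t w)
{
	uint64_t r = popcntb_model(w);

	r += r >> 32;
	r += r >> 16;
	r += r >> 8;
	return (unsigned int)(r & 0xff);
}
```

On a CPU with popcntd the fold disappears entirely, which is presumably
why the patch keeps a separate nested feature section for it.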

^ permalink raw reply

* Re: Query regarding 2.6.335 RT[Ingo's] and Non-RT performance
From: Xianghua Xiao @ 2010-08-13  2:18 UTC (permalink / raw)
  To: Jeff Angielski; +Cc: linuxppc-dev
In-Reply-To: <4C64352F.4090005@theptrgroup.com>

On Thu, Aug 12, 2010 at 12:53 PM, Jeff Angielski <jeff@theptrgroup.com> wrote:
> On 08/11/2010 06:18 PM, Manikandan Ramachandran wrote:
>>
>> Hello All,
>>     I created a very simple program which has higher priority than
>> normal tasks and runs a tight loop. Under same test environment I ran
>> this program on both non-rt and rt 2.6.33.5 kernel.  To my surprise I see
>> that performance of non-RT kernel is better than RT. non-RT kernel took
>> 3 sec and 366156 usec while RT kernel took about 3 sec and 418011
>> usec. Can someone please explain why the performance of non-rt kernel is
>> better than rt kernel? From the face of the test result, I feel RT has
>> more overhead. Is there any configuration that I could do to bring down
>> the overhead?
>
> Your "surprise" is due to your definition of "performance".
>
> The purpose of the -rt kernels is to reduce the kernel latency.  This is
> important for servicing hardware.  Normal users find the -rt useful for
> audio/video applications.  Engineering and scientific users find the -rt
> beneficial for servicing hardware like sensors or control systems.
>
> If you are just trying to run calculations as fast as you can in user
> space, you'd be better off using the non-rt variants.
>
>
> --
> Jeff Angielski
> The PTR Group
> www.theptrgroup.com

True, in most cases non-rt will have better performance/throughput,
while rt's major goal is to have better latency for high priority
tasks. It is also true that the rt kernel will have more overhead.

xianghua

^ permalink raw reply

* Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
From: Dave Hansen @ 2010-08-12 20:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linuxppc-dev, Greg KH, linux-kernel, linux-mm, KAMEZAWA Hiroyuki
In-Reply-To: <20100812120816.e97d8b9e.akpm@linux-foundation.org>

On Thu, 2010-08-12 at 12:08 -0700, Andrew Morton wrote:
> > This set of patches allows for each directory created in sysfs
> > to cover more than one memory section.  The default behavior for
> > sysfs directory creation is the same, in that each directory
> > represents a single memory section.  A new file 'end_phys_index'
> > in each directory contains the physical_id of the last memory
> > section covered by the directory so that users can easily
> > determine the memory section range of a directory.
> 
> What you're proposing appears to be a non-back-compatible
> userspace-visible change.  This is a big issue! 

Nathan, one thought to get around this at the moment would be to bump up
the size that we export in /sys/devices/system/memory/block_size_bytes.
I think you have already done most of the hard work to accomplish
this.  

You can still add the end_phys_index stuff.  But, for now, it would
always be equal to start_phys_index.

-- Dave

^ permalink raw reply

* Question about dma_direct_ops in PowerPC.
From: Fushen Chen @ 2010-08-12 19:28 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 2011 bytes --]

We have a board with a PCI device driver that calls
pci_dma_sync_single_for_device.
This driver used to work with Linux kernel 2.6.25.

We ported the driver to Linux kernel 2.6.32. The PCI device driver
doesn't work anymore.
The following call trace shows why the PCI driver won't work in kernel
2.6.32.
1. In include/asm-generic/pci-dma-compat.h
    pci_dma_sync_single_for_device calls dma_sync_single_for_cpu
2. In include/asm-generic/dma-mapping-common.h
    dma_sync_single_for_cpu calls ops->sync_single_for_cpu
3. In arch/powerpc/kernel/dma.c
struct dma_map_ops dma_direct_ops = {
        .alloc_coherent = dma_direct_alloc_coherent,
        .free_coherent  = dma_direct_free_coherent,
        .map_sg         = dma_direct_map_sg,
        .unmap_sg       = dma_direct_unmap_sg,
        .dma_supported  = dma_direct_dma_supported,
        .map_page       = dma_direct_map_page,
        .unmap_page     = dma_direct_unmap_page,
#ifdef CONFIG_NOT_COHERENT_CACHE
        .sync_single_range_for_cpu      = dma_direct_sync_single_range,
        .sync_single_range_for_device   = dma_direct_sync_single_range,
        .sync_sg_for_cpu                = dma_direct_sync_sg,
        .sync_sg_for_device             = dma_direct_sync_sg,
#endif
};
No op is defined for sync_single_for_cpu, so
pci_dma_sync_single_for_device is a no-op.

However, Linux kernel 2.6.35.1 from kernel.org does define
.sync_single_for_cpu for dma_direct_ops in arch/powerpc/kernel/dma.c:
#ifdef CONFIG_NOT_COHERENT_CACHE
        .sync_single_for_cpu            = dma_direct_sync_single,
        .sync_single_for_device         = dma_direct_sync_single,
        .sync_sg_for_cpu                = dma_direct_sync_sg,
        .sync_sg_for_device             = dma_direct_sync_sg,
#endif


We won't move to Linux kernel 2.6.35 anytime soon.
My questions:
1. Is there any side effect of adding .sync_single_for_cpu to
dma_direct_ops in 2.6.32?
2. What will be the future development here?


Best regards & Thanks,
Fushen
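
The no-op behaviour described above can be reproduced with a small
userspace C sketch. Everything here is hypothetical and cut down for
illustration (struct demo_dma_ops, the demo_* helpers and the two ops
tables are not kernel names). It assumes the generic wrapper only calls
a hook when the ops table provides one, which is what the observed no-op
suggests.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct dma_map_ops. */
struct demo_dma_ops {
	void (*sync_single_for_cpu)(void *buf, size_t len);
	void (*sync_single_range_for_cpu)(void *buf, size_t off, size_t len);
};

static int sync_calls;	/* counts how often a real sync hook ran */

static void demo_sync_single(void *buf, size_t len)
{
	(void)buf; (void)len;
	sync_calls++;
}

static void demo_sync_range(void *buf, size_t off, size_t len)
{
	(void)buf; (void)off; (void)len;
	sync_calls++;
}

/* Assumed wrapper pattern: silently skip the hook when it is NULL. */
void demo_dma_sync_single_for_cpu(const struct demo_dma_ops *ops,
				  void *buf, size_t len)
{
	if (ops->sync_single_for_cpu)
		ops->sync_single_for_cpu(buf, len);
}

/* Table shaped like the 2.6.32 listing: only the _range_ hook is set. */
const struct demo_dma_ops ops_2_6_32_like = {
	.sync_single_range_for_cpu = demo_sync_range,
};

/* Table shaped like the 2.6.35.1 listing: the plain hook is set. */
const struct demo_dma_ops ops_2_6_35_like = {
	.sync_single_for_cpu = demo_sync_single,
};
```

With the first table the wrapper returns without touching the cache,
which on a non-coherent platform would leave the buffer stale.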

[-- Attachment #2: Type: text/html, Size: 2235 bytes --]

^ permalink raw reply

* Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
From: Andrew Morton @ 2010-08-12 19:08 UTC (permalink / raw)
  To: Nathan Fontenot
  Cc: linuxppc-dev, Greg KH, linux-kernel, Dave Hansen, linux-mm,
	KAMEZAWA Hiroyuki
In-Reply-To: <4C60407C.2080608@austin.ibm.com>

On Mon, 09 Aug 2010 12:53:00 -0500
Nathan Fontenot <nfont@austin.ibm.com> wrote:

> This set of patches de-couples the idea that there is a single
> directory in sysfs for each memory section.  The intent of the
> patches is to reduce the number of sysfs directories created to
> resolve a boot-time performance issue.  On very large systems
> boot time are getting very long (as seen on powerpc hardware)
> due to the enormous number of sysfs directories being created.
> On a system with 1 TB of memory we create ~63,000 directories.
> For even larger systems boot times are being measured in hours.

And those "hours" are mainly due to this problem, I assume.

> This set of patches allows for each directory created in sysfs
> to cover more than one memory section.  The default behavior for
> sysfs directory creation is the same, in that each directory
> represents a single memory section.  A new file 'end_phys_index'
> in each directory contains the physical_id of the last memory
> section covered by the directory so that users can easily
> determine the memory section range of a directory.

What you're proposing appears to be a non-back-compatible
userspace-visible change.  This is a big issue!

It's not an unresolvable issue, as this is a must-fix problem.  But you
should tell us what your proposal is to prevent breakage of existing
installations.  A Kconfig option would be good, but a boot-time kernel
command line option which selects the new format would be much better.

However you didn't mention this issue at all, and it's the most
important one.


> Updates for version 5 of the patchset include the following:
> 
> Patch 4/8 Add mutex for add/remove of memory blocks
> - Define the mutex using DEFINE_MUTEX macro.
> 
> Patch 8/8 Update memory-hotplug documentation
> - Add information concerning memory holes in phys_index..end_phys_index.

And you forgot to tell us how long those machines take to boot with the
patchset applied, which is the entire point of the patchset!

^ permalink raw reply

* Re: Query regarding 2.6.335 RT[Ingo's] and Non-RT performance
From: Jeff Angielski @ 2010-08-12 17:53 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <AANLkTikrpqFQsK=YLkHeWc1ZC=_Gz2rWStJrbQ8O-SrZ@mail.gmail.com>

On 08/11/2010 06:18 PM, Manikandan Ramachandran wrote:
> Hello All,
>      I created a very simple program which has higher priority than
> normal tasks and runs a tight loop. Under same test environment I ran
> this program on both non-rt and rt 2.6.33.5 kernel.  To my surprise I see
> that performance of non-RT kernel is better than RT. non-RT kernel took
> 3 sec and 366156 usec while RT kernel took about 3 sec and 418011
> usec. Can someone please explain why the performance of non-rt kernel is
> better than rt kernel? From the face of the test result, I feel RT has
> more overhead. Is there any configuration that I could do to bring down
> the overhead?

Your "surprise" is due to your definition of "performance".

The purpose of the -rt kernels is to reduce the kernel latency.  This is 
important for servicing hardware.  Normal users find the -rt useful for 
audio/video applications.  Engineering and scientific users find the -rt
beneficial for servicing hardware like sensors or control systems.

If you are just trying to run calculations as fast as you can in user 
space, you'd be better off using the non-rt variants.


-- 
Jeff Angielski
The PTR Group
www.theptrgroup.com

^ permalink raw reply

* Re: How to use mpc8xxx_gpio.c device driver
From: Ira W. Snyder @ 2010-08-12 15:36 UTC (permalink / raw)
  To: Ravi Gupta; +Cc: linuxppc-dev, MJ embd, linuxppc-dev
In-Reply-To: <AANLkTi=-YiWtBqCGSt-h8dd3WwTMVJcDitvROCDBEMdw@mail.gmail.com>

On Thu, Aug 12, 2010 at 03:55:49PM +0530, Ravi Gupta wrote:
> On Wed, Aug 11, 2010 at 9:45 PM, MJ embd <mj.embd@gmail.com> wrote:
> 
> > u can directly access GPIO registers in kernel, by ioremap of GPIO
> > memory mapped registers.
> > you might need to check
> > - muxing of gpio
> >
> > -mj
> >
> 
> Hi MJ,
> 
> Thanks for the reply.
> I tried memory mapping but it fails, here is my code :
> 
> #include <linux/module.h>
> #include <linux/errno.h>    /* error codes */
> #include <linux/mm.h>
> 
> void __iomem *ioaddr = NULL;
> 
> static __init int sample_module_init(void)
> {
>     ioaddr = ioremap(0xFF400C00, 0x24);
>     if(ioaddr == NULL) {
>         printk(KERN_WARNING "ioremap failed\n");
>     }
>     printk(KERN_WARNING "ioremap successed\n");
>     printk(KERN_WARNING "GP1DIR = %u\n", ioread32(ioaddr));
>     return 0;
> }
> 
> static __exit void sample_module_exit(void)
> {
>     iounmap(ioaddr);
> }
> 
> MODULE_LICENSE("GPL");
> module_init(sample_module_init);
> module_exit(sample_module_exit);
> 
> As per the MPC8377ERDB data sheet, the default IMMRBAR address is 0xFF40_0000,
> the offset of GPIO1 is 0x0C00, and each GPIO has programmable registers that
> occupy 24 bytes of memory-mapped space, so I mapped 24 bytes (0x18) starting
> from address 0xFF40_0C00. But when I try to read values from the mapped
> memory I get the following errors. Is there something I am missing?
> Any help with reference to the MPC8377ERDB board will be highly appreciated.
> 
> # tftp -l ~/immrbar.ko -r immrbar.ko -g 10.20.50.70
> # insmod ./immrbar.ko
> [  717.825241] ioremap successed
> [  717.849215] Machine check in kernel mode.
> [  717.853220] Caused by (from SRR1=41000): Transfer error ack signal
> [  717.859405] Oops: Machine check, sig: 7 [#1]
> [  717.863668] MPC837x RDB
> [  717.866106] Modules linked in: immrbar(+)
> [  717.870119] NIP: 00000900 LR: d1034054 CTR: c0014d50
> [  717.875079] REGS: cf895d00 TRAP: 0200   Not tainted  (2.6.28.9)
> [  717.880992] MSR: 00041000 <ME>  CR: 24000082  XER: 20000000
> [  717.886578] TASK = cf8e8640[647] 'insmod' THREAD: cf894000
> [  717.891882] GPR00: d103404c cf895db0 cf8e8640 00000000 000023d5 ffffffff c01e04f4 00020000
> [  717.900265] GPR08: 00000001 c0383f3c 000023d5 c0014d50 4c72ff56 10019100 100777e0 1007ea98
> [  717.908650] GPR16: 10077834 100a0000 100a0000 100a0000 bfaf4828 00000000 1009f23c 10000cfc
> [  717.917034] GPR24: 10000d00 10000d24 10012008 c03650e8 00000000 d1034000 10012018 d1030000
> [  717.925598] NIP [00000900] 0x900
> [  717.928828] LR [d1034054] sample_module_init+0x54/0xc0 [immrbar]
> [  717.934828] Call Trace:
> [  717.937273] [cf895db0] [d103404c] sample_module_init+0x4c/0xc0 [immrbar] (unreliable)
> [  717.945115] [cf895dc0] [c00038a0] do_one_initcall+0x64/0x18c
> [  717.950780] [cf895f20] [c004d7b8] sys_init_module+0xac/0x19c
> [  717.956441] [cf895f40] [c00122f0] ret_from_syscall+0x0/0x38
> [  717.962013] --- Exception: c01 at 0x48043f6c
> [  717.962017]     LR = 0x100009cc
> [  717.969407] Instruction dump:
> [  717.972370] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 00000000 XXXXXXXX XXXXXXXX XXXXXXXX
> [  717.980140] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 7d5043a6 XXXXXXXX XXXXXXXX XXXXXXXX
> [  717.987919] ---[ end trace a47be794e2873cef ]---
> 

Looking at the device tree for this board, it appears U-Boot remaps the
IMMR registers to 0xe0000000. They are no longer accessible at
0xff400000.

I would recommend studying arch/powerpc/boot/dts/mpc8377_rdb.dts in the
Linux source code. That describes the device layout on your board after
U-Boot has run.

A wonderful tool for testing devices from userspace is "busybox devmem".
It allows you to poke any physical address with any value. The output of
"busybox devmem --help" should get you started. As a quick example,
"busybox devmem 0xe0000c00 w 0x1" will write the 32-bit value 0x1 to
address 0xe0000c00.

I would also recommend using the built-in Linux GPIO API. It works, you
just need to figure out how to use it. It will be much easier to get
your code upstream if you use the provided APIs.

The Documentation/gpio.txt file should help you in understanding the
in-kernel Linux GPIO API. I'm afraid I don't have much experience other
than accessing it via sysfs from userspace.

Ira
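
As a rough illustration of what a devmem-style tool has to do for the
0xe0000c00 example above, here is a userspace C sketch. The names
split_phys_addr() and devmem_read32() are hypothetical, and the use of
mmap() on /dev/mem is an assumption about how such tools are typically
written, not a description of busybox internals. It needs root and a
kernel that permits /dev/mem access; on a 32-bit build it also assumes
_FILE_OFFSET_BITS=64 so that off_t can hold the address.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* A physical address split into its page base and offset in that page. */
struct phys_split {
	uint64_t page_base;
	size_t offset;
};

/* mmap() offsets must be page aligned, so separate the two parts. */
struct phys_split split_phys_addr(uint64_t phys, size_t page_size)
{
	struct phys_split s;

	s.page_base = phys & ~((uint64_t)page_size - 1);
	s.offset = (size_t)(phys - s.page_base);
	return s;
}

/* Read one 32-bit register via /dev/mem. Returns 0 on success, -1 on error. */
int devmem_read32(uint64_t phys, uint32_t *out)
{
	long page_size = sysconf(_SC_PAGESIZE);
	struct phys_split s;
	void *map;
	int fd;

	if (page_size <= 0)
		return -1;
	s = split_phys_addr(phys, (size_t)page_size);

	fd = open("/dev/mem", O_RDONLY | O_SYNC);
	if (fd < 0)
		return -1;

	map = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd,
		   (off_t)s.page_base);
	close(fd);	/* the mapping stays valid after close() */
	if (map == MAP_FAILED)
		return -1;

	*out = *(volatile uint32_t *)((char *)map + s.offset);
	munmap(map, (size_t)page_size);
	return 0;
}
```

Reading an address that no device decodes can still raise a machine
check, so the base has to come from the device tree, not the data sheet
default.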

^ permalink raw reply

* Flash Programmer Problem in Code Warrior
From: Naresh Reddy Sankapelly @ 2010-08-12 13:40 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 377 bytes --]

Hi,
I am trying to program NOR flash (M29DW323DT) on MPC8321 board. I have
imported the details of the flash into FPDeviceConfig.xml. When I try to run
Program/verify flash, it is taking large amount of time(in hours). I could
not figure out the reason for that. Kindly let me know the troubleshooting
method for this.

-- 
Thanks and Regards
Naresh Reddy S.
Noida, 9873240342

[-- Attachment #2: Type: text/html, Size: 408 bytes --]

^ permalink raw reply

* Running out of SDHCI quirk space (Re: [PATCH 1/3 v2] sdhci: Add auto CMD12 support for eSDHC driver)
From: Matt Fleming @ 2010-08-12 11:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ben Dooks, linux-mmc, linuxppc-dev

On Tue, Aug 03, 2010 at 04:43:46PM -0700, Andrew Morton wrote:
> On Tue, 3 Aug 2010 11:11:10 +0800
> Roy Zang <tie-fei.zang@freescale.com> wrote:
> 
> > --- a/drivers/mmc/host/sdhci.h
> > +++ b/drivers/mmc/host/sdhci.h
> > @@ -240,6 +240,8 @@ struct sdhci_host {
> >  #define SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN		(1<<25)
> >  /* Controller cannot support End Attribute in NOP ADMA descriptor */
> >  #define SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC		(1<<26)
> > +/* Controller uses Auto CMD12 command to stop the transfer */
> > +#define SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12		(1<<27)
> 
> This becomes 1<<29 in my tree.
> 
> We're about to run out.  What happens then?

I've been wondering for a while now if many of the quirks should be
hidden behind function pointers. While we could of course extend the
quirk space, I think that's kinda missing the point that quirks are
being used too liberally. Take SDHCI_QUIRK_SINGLE_POWER_WRITE in
drivers/mmc/host/sdhci.c:sdhci_set_power(). Really, that quirk should
probably be hidden inside a set_power() function in the sdhci_ops
structure.

I'm gonna have a go at trying to remove some of the quirks that don't
make sense being quirks. I'll post the series when I'm done.

Does anyone think that this approach is crazy?
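
To make the quirk-versus-callback idea concrete, here is a small
userspace C sketch. All names (demo_host, demo_ops,
DEMO_QUIRK_SINGLE_POWER_WRITE and the helpers) are hypothetical and do
not match the real sdhci structures. It assumes the single-power-write
quirk means "skip the clearing write before setting the power register",
and it simply counts register writes in place of real I/O.

```c
#include <stddef.h>

#define DEMO_QUIRK_SINGLE_POWER_WRITE	(1u << 0)	/* hypothetical bit */

struct demo_host;

struct demo_ops {
	/* Optional override; NULL means "use the core default". */
	void (*set_power)(struct demo_host *host, unsigned char pwr);
};

struct demo_host {
	unsigned int quirks;
	const struct demo_ops *ops;
	int reg_writes;		/* simulated power register writes */
};

/* Quirk style: the core code branches on a per-controller flag. */
void demo_set_power_quirk(struct demo_host *host, unsigned char pwr)
{
	(void)pwr;
	if (!(host->quirks & DEMO_QUIRK_SINGLE_POWER_WRITE))
		host->reg_writes++;	/* clear the register first */
	host->reg_writes++;		/* then write the new value */
}

/* Core default: clear, then set. */
static void demo_default_set_power(struct demo_host *host, unsigned char pwr)
{
	(void)pwr;
	host->reg_writes += 2;
}

/* Controller-specific variant: one write only. */
static void demo_single_write_set_power(struct demo_host *host,
					unsigned char pwr)
{
	(void)pwr;
	host->reg_writes += 1;
}

/* Ops style: the core stays generic, the odd controller supplies a hook. */
void demo_set_power_ops(struct demo_host *host, unsigned char pwr)
{
	if (host->ops && host->ops->set_power)
		host->ops->set_power(host, pwr);
	else
		demo_default_set_power(host, pwr);
}

const struct demo_ops demo_single_write_ops = {
	.set_power = demo_single_write_set_power,
};
```

The ops variant uses no quirk bit at all, at the cost of one more
function pointer per overridable operation.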

^ permalink raw reply

* Re: How to use mpc8xxx_gpio.c device driver
From: Ravi Gupta @ 2010-08-12 10:25 UTC (permalink / raw)
  To: MJ embd; +Cc: linuxppc-dev, linuxppc-dev
In-Reply-To: <AANLkTikNt7hrxA+QR-omUiiLKVBnjqCw+HTDQh_5B5Ff@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3086 bytes --]

On Wed, Aug 11, 2010 at 9:45 PM, MJ embd <mj.embd@gmail.com> wrote:

> u can directly access GPIO registers in kernel, by ioremap of GPIO
> memory mapped registers.
> you might need to check
> - muxing of gpio
>
> -mj
>

Hi MJ,

Thanks for the reply.
I tried memory mapping but it fails, here is my code :

#include <linux/module.h>
#include <linux/errno.h>    /* error codes */
#include <linux/mm.h>

void __iomem *ioaddr = NULL;

static __init int sample_module_init(void)
{
    ioaddr = ioremap(0xFF400C00, 0x24);
    if(ioaddr == NULL) {
        printk(KERN_WARNING "ioremap failed\n");
    }
    printk(KERN_WARNING "ioremap successed\n");
    printk(KERN_WARNING "GP1DIR = %u\n", ioread32(ioaddr));
    return 0;
}

static __exit void sample_module_exit(void)
{
    iounmap(ioaddr);
}

MODULE_LICENSE("GPL");
module_init(sample_module_init);
module_exit(sample_module_exit);

As per the MPC8377ERDB data sheet, the default IMMRBAR address is 0xFF40_0000,
the offset of GPIO1 is 0x0C00, and each GPIO has programmable registers that
occupy 24 bytes of memory-mapped space, so I mapped 24 bytes (0x18) starting
from address 0xFF40_0C00. But when I try to read values from the mapped
memory I get the following errors. Is there something I am missing?
Any help with reference to the MPC8377ERDB board will be highly appreciated.

# tftp -l ~/immrbar.ko -r immrbar.ko -g 10.20.50.70
# insmod ./immrbar.ko
[  717.825241] ioremap successed
[  717.849215] Machine check in kernel mode.
[  717.853220] Caused by (from SRR1=41000): Transfer error ack signal
[  717.859405] Oops: Machine check, sig: 7 [#1]
[  717.863668] MPC837x RDB
[  717.866106] Modules linked in: immrbar(+)
[  717.870119] NIP: 00000900 LR: d1034054 CTR: c0014d50
[  717.875079] REGS: cf895d00 TRAP: 0200   Not tainted  (2.6.28.9)
[  717.880992] MSR: 00041000 <ME>  CR: 24000082  XER: 20000000
[  717.886578] TASK = cf8e8640[647] 'insmod' THREAD: cf894000
[  717.891882] GPR00: d103404c cf895db0 cf8e8640 00000000 000023d5 ffffffff c01e04f4 00020000
[  717.900265] GPR08: 00000001 c0383f3c 000023d5 c0014d50 4c72ff56 10019100 100777e0 1007ea98
[  717.908650] GPR16: 10077834 100a0000 100a0000 100a0000 bfaf4828 00000000 1009f23c 10000cfc
[  717.917034] GPR24: 10000d00 10000d24 10012008 c03650e8 00000000 d1034000 10012018 d1030000
[  717.925598] NIP [00000900] 0x900
[  717.928828] LR [d1034054] sample_module_init+0x54/0xc0 [immrbar]
[  717.934828] Call Trace:
[  717.937273] [cf895db0] [d103404c] sample_module_init+0x4c/0xc0 [immrbar] (unreliable)
[  717.945115] [cf895dc0] [c00038a0] do_one_initcall+0x64/0x18c
[  717.950780] [cf895f20] [c004d7b8] sys_init_module+0xac/0x19c
[  717.956441] [cf895f40] [c00122f0] ret_from_syscall+0x0/0x38
[  717.962013] --- Exception: c01 at 0x48043f6c
[  717.962017]     LR = 0x100009cc
[  717.969407] Instruction dump:
[  717.972370] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 00000000 XXXXXXXX XXXXXXXX XXXXXXXX
[  717.980140] 00000000 XXXXXXXX XXXXXXXX XXXXXXXX 7d5043a6 XXXXXXXX XXXXXXXX XXXXXXXX
[  717.987919] ---[ end trace a47be794e2873cef ]---

Thanks in advance
Ravi Gupta

[-- Attachment #2: Type: text/html, Size: 3680 bytes --]

^ permalink raw reply

* [PATCH] fs-enet/mac-fec: Restore multicast and promiscous settings during restart
From: Wolfgang Ocker @ 2010-08-12  8:26 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Wolfgang Ocker, Vitaly Bordug

Signed-off-by: Wolfgang Ocker <weo@reccoware.de>
---
 drivers/net/fs_enet/mac-fec.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fs_enet/mac-fec.c b/drivers/net/fs_enet/mac-fec.c
index 7ca1642..05f4bb1 100644
--- a/drivers/net/fs_enet/mac-fec.c
+++ b/drivers/net/fs_enet/mac-fec.c
@@ -344,6 +344,9 @@ static void restart(struct net_device *dev)
 	FW(fecp, imask, FEC_ENET_TXF | FEC_ENET_TXB |
 	   FEC_ENET_RXF | FEC_ENET_RXB);
 
+	/* Restore multicast and promiscuous settings */
+	set_multicast_list(dev);
+
 	/*
 	 * And last, enable the transmit and receive processing.
 	 */
-- 
1.7.2.1

^ permalink raw reply related

* Looking for a tutorial on the use of the new of_??? init functions
From: LEROY Christophe @ 2010-08-12  8:13 UTC (permalink / raw)
  To: LinuxPPC-dev

Hello,

Is there a tutorial or a HOWTO out somewhere explaining the use of
those new of_platform_xxx() and other of_xxx() functions in the init of
drivers? It looks like a very nice way to write drivers in Linux 2.6,
but a little help would be welcome.

Regards
Christophe

^ permalink raw reply

* Re: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
From: Grant Likely @ 2010-08-12  4:21 UTC (permalink / raw)
  To: Zang Roy-R61911; +Cc: linux-mmc, linuxppc-dev, mirqus, akpm
In-Reply-To: <3850A844E6A3854C827AC5C0BEC7B60A0565B8@zch01exm23.fsl.freescale.net>

On Wed, Aug 11, 2010 at 10:00 PM, Zang Roy-R61911 <r61911@freescale.com> wrote:
>
>
>> -----Original Message-----
>> From: Zang Roy-R61911
>> Sent: Wednesday, August 11, 2010 12:47 PM
>> To: Zang Roy-R61911; akpm@linux-foundation.org;
>> linux-mmc@vger.kernel.org
>> Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;
>> cbouatmailru@gmail.com; grant.likely@secretlab.ca
>> Subject: RE: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for
>> more clear
>>
>>
>>
>> > -----Original Message-----
>> > From: Zang Roy-R61911
>> > Sent: Tuesday, August 10, 2010 17:47 PM
>> > To: akpm@linux-foundation.org; linux-mmc@vger.kernel.org
>> > Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;
>> > cbouatmailru@gmail.com; grant.likely@secretlab.ca
>> > Subject: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
>> >
>> > Change ACMD12 to AUTO_CMD12 to reduce the confusion.
>> > ACMD12 might be confused with MMC/SD App CMD 12 (CMD55+CMD12 combo).
>> >
>> > Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
>> > ---
>> >  drivers/mmc/host/sdhci-of-core.c |    2 +-
>> >  drivers/mmc/host/sdhci.c         |    8 ++++----
>> >  drivers/mmc/host/sdhci.h         |   10 +++++-----
>> >  3 files changed, 10 insertions(+), 10 deletions(-)
>> Andrew,
>> Could you help to pick up this minor fix?
>> Thanks.
>> Roy
> Any update?
> Thanks.
> Roy

Patience Roy.  You only sent the patch 1 day ago.

g.

^ permalink raw reply

* RE: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
From: Zang Roy-R61911 @ 2010-08-12  4:00 UTC (permalink / raw)
  To: Zang Roy-R61911, akpm, linux-mmc; +Cc: linuxppc-dev, mirqus


> -----Original Message-----
> From: Zang Roy-R61911=20
> Sent: Wednesday, August 11, 2010 12:47 PM
> To: Zang Roy-R61911; akpm@linux-foundation.org;=20
> linux-mmc@vger.kernel.org
> Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;=20
> cbouatmailru@gmail.com; grant.likely@secretlab.ca
> Subject: RE: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for=20
> more clear
>=20
> =20
>=20
> > -----Original Message-----
> > From: Zang Roy-R61911=20
> > Sent: Tuesday, August 10, 2010 17:47 PM
> > To: akpm@linux-foundation.org; linux-mmc@vger.kernel.org
> > Cc: linuxppc-dev@ozlabs.org; mirqus@gmail.com;=20
> > cbouatmailru@gmail.com; grant.likely@secretlab.ca
> > Subject: [PATCH 1/2] mmc: change ACMD12 to AUTO_CMD12 for more clear
> >=20
> > Change ACMD12 to AUTO_CMD12 to reduce the confusion.
> > ACMD12 might be confused with MMC/SD App CMD 12 (CMD55+CMD12 combo).
> >=20
> > Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> > ---
> >  drivers/mmc/host/sdhci-of-core.c |    2 +-
> >  drivers/mmc/host/sdhci.c         |    8 ++++----
> >  drivers/mmc/host/sdhci.h         |   10 +++++-----
> >  3 files changed, 10 insertions(+), 10 deletions(-)
> Andrew,=20
> Could you help to pick up this minor fix?
> Thanks.
> Roy
Any update?
Thanks.
Roy

^ permalink raw reply

* [PATCH] powerpc: Fix bogus it_blocksize in VIO iommu code
From: Anton Blanchard @ 2010-08-12  2:42 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev


When looking at some issues with the virtual ethernet driver I noticed
that TCE allocation was following a very strange pattern:

address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048

Huge unexplained gaps in what should be an empty TCE table. It turns out
it_blocksize, the amount we want to align the next allocation to, was
c0000000fe903b20. Completely bogus.

Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc
everywhere to protect against this when we next add a non-compulsory
field to the iommu code and forget to initialise it.
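
The defensive pattern described above can be illustrated in user space. This
is only a sketch, not kernel code: struct demo_table and demo_table_alloc()
are hypothetical names, and calloc() stands in for kzalloc(). The point is
that a zeroed allocation gives any field the setup code forgets a predictable
value of 0 instead of leftover heap contents such as c0000000fe903b20.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct iommu_table; not the kernel definition. */
struct demo_table {
	unsigned long offset;     /* always set by the caller */
	unsigned long size;       /* always set by the caller */
	unsigned long blocksize;  /* optional; easy to forget */
};

/*
 * Allocate a zeroed table and fill in only the compulsory fields.
 * calloc() plays the role of kzalloc(): blocksize is never assigned
 * here, yet it reads back as 0 rather than as stale heap data.
 */
struct demo_table *demo_table_alloc(unsigned long offset, unsigned long size)
{
	struct demo_table *tbl = calloc(1, sizeof(*tbl));

	if (tbl == NULL)
		return NULL;

	tbl->offset = offset;
	tbl->size = size;
	return tbl;
}
```

A zero default only makes the failure predictable; a caller that needs a
particular alignment still has to set the field explicitly, as the VIO hunk
in this patch does with it_blocksize = 16.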

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: powerpc.git/arch/powerpc/kernel/vio.c
===================================================================
--- powerpc.git.orig/arch/powerpc/kernel/vio.c	2010-08-12 12:27:58.674490962 +1000
+++ powerpc.git/arch/powerpc/kernel/vio.c	2010-08-12 12:28:18.660741428 +1000
@@ -1059,7 +1059,7 @@ static struct iommu_table *vio_build_iom
 	if (!dma_window)
 		return NULL;
 
-	tbl = kmalloc(sizeof(*tbl), GFP_KERNEL);
+	tbl = kzalloc(sizeof(*tbl), GFP_KERNEL);
 	if (tbl == NULL)
 		return NULL;
 
@@ -1072,6 +1072,7 @@ static struct iommu_table *vio_build_iom
 	tbl->it_offset = offset >> IOMMU_PAGE_SHIFT;
 	tbl->it_busno = 0;
 	tbl->it_type = TCE_VB;
+	tbl->it_blocksize = 16;
 
 	return iommu_init_table(tbl, -1);
 }
Index: powerpc.git/arch/powerpc/platforms/iseries/iommu.c
===================================================================
--- powerpc.git.orig/arch/powerpc/platforms/iseries/iommu.c	2010-08-12 12:29:35.473241172 +1000
+++ powerpc.git/arch/powerpc/platforms/iseries/iommu.c	2010-08-12 12:29:50.190890563 +1000
@@ -184,7 +184,7 @@ static void pci_dma_dev_setup_iseries(st
 
 	BUG_ON(lsn == NULL);
 
-	tbl = kmalloc(sizeof(struct iommu_table), GFP_KERNEL);
+	tbl = kzalloc(sizeof(struct iommu_table), GFP_KERNEL);
 
 	iommu_table_getparms_iSeries(pdn->busno, *lsn, 0, tbl);
 
Index: powerpc.git/arch/powerpc/platforms/pseries/iommu.c
===================================================================
--- powerpc.git.orig/arch/powerpc/platforms/pseries/iommu.c	2010-08-12 12:28:45.340756738 +1000
+++ powerpc.git/arch/powerpc/platforms/pseries/iommu.c	2010-08-12 12:29:15.401118951 +1000
@@ -403,7 +403,7 @@ static void pci_dma_bus_setup_pSeries(st
 	pci->phb->dma_window_size = 0x8000000ul;
 	pci->phb->dma_window_base_cur = 0x8000000ul;
 
-	tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 			   pci->phb->node);
 
 	iommu_table_setparms(pci->phb, dn, tbl);
@@ -448,7 +448,7 @@ static void pci_dma_bus_setup_pSeriesLP(
 		 pdn->full_name, ppci->iommu_table);
 
 	if (!ppci->iommu_table) {
-		tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 				   ppci->phb->node);
 		iommu_table_setparms_lpar(ppci->phb, pdn, tbl, dma_window,
 			bus->number);
@@ -478,7 +478,7 @@ static void pci_dma_dev_setup_pSeries(st
 		struct pci_controller *phb = PCI_DN(dn)->phb;
 
 		pr_debug(" --> first child, no bridge. Allocating iommu table.\n");
-		tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 				   phb->node);
 		iommu_table_setparms(phb, dn, tbl);
 		PCI_DN(dn)->iommu_table = iommu_init_table(tbl, phb->node);
@@ -544,7 +544,7 @@ static void pci_dma_dev_setup_pSeriesLP(
 
 	pci = PCI_DN(pdn);
 	if (!pci->iommu_table) {
-		tbl = kmalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
+		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 				   pci->phb->node);
 		iommu_table_setparms_lpar(pci->phb, pdn, tbl, dma_window,
 			pci->phb->bus->number);
Index: powerpc.git/arch/powerpc/platforms/cell/iommu.c
===================================================================
--- powerpc.git.orig/arch/powerpc/platforms/cell/iommu.c	2010-08-12 12:31:27.040741891 +1000
+++ powerpc.git/arch/powerpc/platforms/cell/iommu.c	2010-08-12 12:31:34.641324320 +1000
@@ -477,7 +477,7 @@ cell_iommu_setup_window(struct cbe_iommu
 
 	ioid = cell_iommu_get_ioid(np);
 
-	window = kmalloc_node(sizeof(*window), GFP_KERNEL, iommu->nid);
+	window = kzalloc_node(sizeof(*window), GFP_KERNEL, iommu->nid);
 	BUG_ON(window == NULL);
 
 	window->offset = offset;

^ permalink raw reply

* [64/67] irq: Add new IRQ flag IRQF_NO_SUSPEND
From: Greg KH @ 2010-08-12  0:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jeremy Fitzhardinge, xen-devel, Thomas Gleixner, Ian Campbell,
	devicetree-discuss, Dmitry Torokhov, linuxppc-dev, Paul Mackerras,
	linux-input, akpm, torvalds, stable-review, alan
In-Reply-To: <20100812000641.GA6348@kroah.com>

2.6.35-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Ian Campbell <ian.campbell@citrix.com>

commit 685fd0b4ea3f0f1d5385610b0d5b57775a8d5842 upstream.

A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.

Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.
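
The flag arithmetic can be checked in isolation. The sketch below is a
user-space illustration, not the kernel header: the DEMO_ names and the
helper demo_stays_enabled_on_suspend() are hypothetical, while the bit values
are copied from the patch. It shows that existing IRQF_TIMER users keep the
no-suspend behaviour, and that the timer bit on its own no longer implies it.

```c
#include <stdbool.h>

/*
 * Bit values copied from the patch. The DEMO_ prefix marks these as a
 * user-space illustration rather than the kernel's own definitions.
 */
#define DEMO_IRQF_TIMER_BIT   0x00000200u  /* __IRQF_TIMER in the patch */
#define DEMO_IRQF_NO_SUSPEND  0x00004000u
#define DEMO_IRQF_TIMER       (DEMO_IRQF_TIMER_BIT | DEMO_IRQF_NO_SUSPEND)

/*
 * Mirrors the flag test in the patched __disable_irq(): returns true
 * when the interrupt should be left enabled across suspend.
 */
bool demo_stays_enabled_on_suspend(unsigned long flags)
{
	return (flags & DEMO_IRQF_NO_SUSPEND) != 0;
}
```

With these definitions a non-timer interrupt can request
DEMO_IRQF_NO_SUSPEND directly instead of borrowing the timer flag.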

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/linux/interrupt.h |    7 ++++++-
 kernel/irq/manage.c       |    2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -53,16 +53,21 @@
  * IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
  *                Used by threaded interrupts which need to keep the
  *                irq line disabled until the threaded handler has been run.
+ * IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ *
  */
 #define IRQF_DISABLED		0x00000020
 #define IRQF_SAMPLE_RANDOM	0x00000040
 #define IRQF_SHARED		0x00000080
 #define IRQF_PROBE_SHARED	0x00000100
-#define IRQF_TIMER		0x00000200
+#define __IRQF_TIMER		0x00000200
 #define IRQF_PERCPU		0x00000400
 #define IRQF_NOBALANCING	0x00000800
 #define IRQF_IRQPOLL		0x00001000
 #define IRQF_ONESHOT		0x00002000
+#define IRQF_NO_SUSPEND		0x00004000
+
+#define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)
 
 /*
  * Bits used by threaded handlers:
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -216,7 +216,7 @@ static inline int setup_affinity(unsigne
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
 	if (suspend) {
-		if (!desc->action || (desc->action->flags & IRQF_TIMER))
+		if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND))
 			return;
 		desc->status |= IRQ_SUSPENDED;
 	}

^ permalink raw reply

* [48/54] irq: Add new IRQ flag IRQF_NO_SUSPEND
From: Greg KH @ 2010-08-12  0:01 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jeremy Fitzhardinge, xen-devel, Thomas Gleixner, Ian Campbell,
	devicetree-discuss, Dmitry Torokhov, linuxppc-dev, Paul Mackerras,
	linux-input, akpm, torvalds, stable-review, alan
In-Reply-To: <20100812000249.GA30948@kroah.com>

2.6.34-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Ian Campbell <ian.campbell@citrix.com>

commit 685fd0b4ea3f0f1d5385610b0d5b57775a8d5842 upstream.

A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.

Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/linux/interrupt.h |    7 ++++++-
 kernel/irq/manage.c       |    2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -52,16 +52,21 @@
  * IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
  *                Used by threaded interrupts which need to keep the
  *                irq line disabled until the threaded handler has been run.
+ * IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ *
  */
 #define IRQF_DISABLED		0x00000020
 #define IRQF_SAMPLE_RANDOM	0x00000040
 #define IRQF_SHARED		0x00000080
 #define IRQF_PROBE_SHARED	0x00000100
-#define IRQF_TIMER		0x00000200
+#define __IRQF_TIMER		0x00000200
 #define IRQF_PERCPU		0x00000400
 #define IRQF_NOBALANCING	0x00000800
 #define IRQF_IRQPOLL		0x00001000
 #define IRQF_ONESHOT		0x00002000
+#define IRQF_NO_SUSPEND		0x00004000
+
+#define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)
 
 /*
  * Bits used by threaded handlers:
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -200,7 +200,7 @@ static inline int setup_affinity(unsigne
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
 	if (suspend) {
-		if (!desc->action || (desc->action->flags & IRQF_TIMER))
+		if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND))
 			return;
 		desc->status |= IRQ_SUSPENDED;
 	}

^ permalink raw reply

* [039/111] irq: Add new IRQ flag IRQF_NO_SUSPEND
From: Greg KH @ 2010-08-11 23:54 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jeremy Fitzhardinge, xen-devel, Thomas Gleixner, Ian Campbell,
	devicetree-discuss, Dmitry Torokhov, linuxppc-dev, Paul Mackerras,
	linux-input, akpm, torvalds, stable-review, alan
In-Reply-To: <20100811235623.GA24440@kroah.com>

2.6.32-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Ian Campbell <ian.campbell@citrix.com>

commit 685fd0b4ea3f0f1d5385610b0d5b57775a8d5842 upstream.

A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.

Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/linux/interrupt.h |    7 ++++++-
 kernel/irq/manage.c       |    2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -52,16 +52,21 @@
  * IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
  *                Used by threaded interrupts which need to keep the
  *                irq line disabled until the threaded handler has been run.
+ * IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ *
  */
 #define IRQF_DISABLED		0x00000020
 #define IRQF_SAMPLE_RANDOM	0x00000040
 #define IRQF_SHARED		0x00000080
 #define IRQF_PROBE_SHARED	0x00000100
-#define IRQF_TIMER		0x00000200
+#define __IRQF_TIMER		0x00000200
 #define IRQF_PERCPU		0x00000400
 #define IRQF_NOBALANCING	0x00000800
 #define IRQF_IRQPOLL		0x00001000
 #define IRQF_ONESHOT		0x00002000
+#define IRQF_NO_SUSPEND		0x00004000
+
+#define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)
 
 /*
  * Bits used by threaded handlers:
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -200,7 +200,7 @@ static inline int setup_affinity(unsigne
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
 	if (suspend) {
-		if (!desc->action || (desc->action->flags & IRQF_TIMER))
+		if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND))
 			return;
 		desc->status |= IRQ_SUSPENDED;
 	}

^ permalink raw reply

* Query regarding 2.6.33.5 RT [Ingo's] and Non-RT performance
From: Manikandan Ramachandran @ 2010-08-11 22:18 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3340 bytes --]

Hello All,

    I created a very simple program that runs at a higher priority than
normal tasks and spins in a tight loop. Under the same test environment I ran
this program on both the non-RT and RT 2.6.33.5 kernels.  To my surprise, the
non-RT kernel performs better than the RT kernel: the non-RT kernel took
3 sec and 366156 usec, while the RT kernel took about 3 sec and 418011 usec.
Can someone please explain why the non-RT kernel performs better than the RT
kernel? On the face of the test results, I feel RT has more overhead. Is
there any configuration I could change to bring down the overhead?

Processor:
----------------
processor       : 0
cpu             : 7448
clock           : 996.000000MHz
revision        : 2.2 (pvr 8004 0202)
bogomips        : 83.10
processor       : 1
cpu             : 7448
clock           : 996.000000MHz
revision        : 2.2 (pvr 8004 0202)
bogomips        : 83.10

CFS optimization:
--------------------------
# cat /proc/sys/kernel/sched_rt_runtime_us
1000000
# cat /proc/sys/kernel/sched_rt_period_us
1000000
# cat /proc/sys/kernel/sched_compat_yield
1

Test Program:
---------------------

#include <sched.h>
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    int sched_rr_min, sched_rr_max;
    struct sched_param scheduling_parameters;
    struct timeval tv, late_tv;
    suseconds_t usec_diff, avg_usec = 0;
    time_t sec_diff, avg_sec = 0;
    int i;
    long count = 1;

    sched_rr_min = sched_get_priority_min(SCHED_RR);
    sched_rr_max = sched_get_priority_max(SCHED_RR);
    scheduling_parameters.sched_priority = sched_rr_min + 4;
    /* Run the process with the given priority */
    sched_setscheduler(0, SCHED_RR, &scheduling_parameters);

    for (i = 0; i < 150; i++) {
        gettimeofday(&tv, NULL);
        /* Tight loop: spins until count wraps around to a negative value */
        while (count > 0) {
            //printf(".");
            count++;
        }
        gettimeofday(&late_tv, NULL);
        count = 1;
        sec_diff = (late_tv.tv_sec - tv.tv_sec);
        avg_sec += sec_diff;
        usec_diff = ((late_tv.tv_usec > tv.tv_usec) ?
                     (late_tv.tv_usec - tv.tv_usec) :
                     (tv.tv_usec - late_tv.tv_usec));
        avg_usec += usec_diff;
        printf("Iteration #%d sec %lx usec %lx\n", i,
               (unsigned long)sec_diff, (unsigned long)usec_diff);
    }
    printf("Average of #%d sec %lx usec %lx\n", i,
           (unsigned long)(avg_sec / i), (unsigned long)(avg_usec / i));
    return 0;
}

Partial Result of non-rt kernel:
-------------------------------------------

Iteration #140 sec 3 usec 3aef8
Iteration #141 sec 3 usec 3aefe
Iteration #142 sec 3 usec 3aee4
Iteration #143 sec 4 usec b935b  [Why is there this periodic bump? Is the scheduler at work?]
Iteration #144 sec 3 usec 3aef2
Iteration #145 sec 3 usec 3aef0
Iteration #146 sec 3 usec 3aef4
Iteration #147 sec 4 usec b934b
Iteration #148 sec 3 usec 3aeed
Iteration #149 sec 3 usec 3aef9

 Partial Result of rt kernel:
-------------------------------------------
Iteration #135 sec 3 usec 47328
Iteration #136 sec 4 usec ac4fd
Iteration #137 sec 3 usec 48b0b
Iteration #138 sec 3 usec 4738c
Iteration #139 sec 4 usec ac4d5
Iteration #140 sec 3 usec 483cb
Iteration #141 sec 3 usec 48500
Iteration #142 sec 4 usec acc49
Iteration #143 sec 3 usec 47c1f
Iteration #144 sec 3 usec 478c2
Iteration #145 sec 3 usec 47e48
Iteration #146 sec 4 usec ac9b5
Iteration #147 sec 3 usec 48de4
Iteration #148 sec 3 usec 46fbe
Iteration #149 sec 4 usec ac52e
Average of #150 sec 3 usec 660db
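
One thing that may be worth checking before blaming the scheduler is how the
test program subtracts the two timestamps. It takes the seconds and the
absolute value of the microseconds separately, so whenever the end stamp's
tv_usec is smaller than the start stamp's, the seconds come out one too high
and the microseconds are wrong. Read that way, "sec 4 usec b935b" would be
4 s minus 0xb935b usec, which is 3 s 0x3aee5 usec and in line with the
neighbouring iterations. That suggests the periodic bump could be a
measurement artifact. The sketch below shows one way to compute the elapsed
time with the borrow handled; demo_elapsed_usec() is a hypothetical helper
name, and it assumes the end stamp is not earlier than the start stamp.

```c
#include <sys/time.h>

/*
 * Elapsed time from *start to *end in microseconds.
 * Assumes *end is not earlier than *start.
 */
long long demo_elapsed_usec(const struct timeval *start,
			    const struct timeval *end)
{
	long long sec = (long long)end->tv_sec - (long long)start->tv_sec;
	long long usec = (long long)end->tv_usec - (long long)start->tv_usec;

	/* usec may be negative (the borrow case); the sum absorbs it. */
	return sec * 1000000LL + usec;
}
```

Dividing the result by 1000000 gives whole seconds and the remainder gives
microseconds, so the per-iteration lines and the average can be printed in
the same "sec / usec" form as before.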

Thanks,
Mani


-- 
Thanks,
Manik

Think twice about a tree before you take a printout

[-- Attachment #2: Type: text/html, Size: 6395 bytes --]

^ permalink raw reply

