* LEON SMP
@ 2010-10-26 17:50 Daniel Hellstrom
2010-10-26 17:54 ` David Miller
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Daniel Hellstrom @ 2010-10-26 17:50 UTC (permalink / raw)
To: sparclinux
Thanks for applying the patch.
I don't know if I have introduced myself before. I work at
Aeroflex Gaisler as a software engineer on the LEON{2,3,4} architectures.
I saw the earlier discussion about LEON on the list, and I feel that
I should say something short about LEON and SMP. We are of course very
thankful for being in the Linux kernel tree.
There are multiple multi-core designs of LEON in FPGAs and ASICs, for
example the dual-core LEON4 200MHz eAsic that Sam mentioned, and a new
chip, the GR712, is being tested at the fab as we speak. More chips are in the
design phase. The LEON4 architecture was released earlier this year; it
is aimed more at SMP than the LEON3, even though the LEON3 can do SMP
as well. The LEON4 has, for example, an L2 cache and wider buses. Of course
all LEONs are SPARCv8 compatible.
I have been working with a new quad-core LEON4 design since
mid-September, and we have made some changes to the LEON port of Linux
which we would like to submit. Some of them are not in a clean state
yet, so there is some work left before I try to post them here.
Now to the question...
The LEON does not have internal timers as some CPUs do; it has
one or more General Purpose TIMERs on the Processor Local Bus. On
both single-CPU and SMP systems the first Timer is used for the System
Clock, and on SMP systems timer two is also used to generate a
simultaneous IRQ on all CPUs for profiling etc.
(leon_percpu_timer_interrupt()). On the quad-core SMP system I noticed
that the per-cpu timer fires at the same frequency (and almost
simultaneously) as the System Clock Timer. I have made a patch that uses
only one Timer for SMP systems. The Timer generates a per-cpu tick as
before, but on CPU0 handler_irq() is also called after profiling has
been done, in order to handle the System Clock Tick. It seems to work,
and it saves me HZ interrupts per second and a Timer instance. What is
your opinion about that? Is it possible to use the same timer for the
System Clock and for per-cpu profiling etc.?
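The single-timer scheme described above can be sketched as a small user-space C model. This is not the LEON kernel code: `percpu_profile_tick()` and `system_clock_tick()` are hypothetical stand-ins for the profiling work and for the `handler_irq()` call, and `NCPUS` is assumed to be 4 to match the quad-core system.

```c
#define NCPUS 4 /* assumption: quad-core system as in the text */

static unsigned long profile_ticks[NCPUS]; /* per-cpu tick counters */
static unsigned long system_ticks;         /* system clock ticks */

/* Hypothetical stand-in for the per-cpu profiling work. */
static void percpu_profile_tick(int cpu)
{
	profile_ticks[cpu]++;
}

/* Hypothetical stand-in for the handler_irq() system clock path. */
static void system_clock_tick(void)
{
	system_ticks++;
}

/* Model of the shared timer interrupt as seen by one CPU:
 * every CPU profiles, only CPU0 also advances the system clock. */
static void shared_timer_interrupt(int cpu)
{
	percpu_profile_tick(cpu);
	if (cpu == 0)
		system_clock_tick();
}

/* One hardware timer expiry is delivered to every CPU. */
static void timer_expiry(void)
{
	for (int cpu = 0; cpu < NCPUS; cpu++)
		shared_timer_interrupt(cpu);
}
```

In this model each expiry produces NCPUS interrupts in total, and the system clock still advances once per expiry.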
Best Regards,
Daniel Hellstrom
* Re: LEON SMP
2010-10-26 17:50 LEON SMP Daniel Hellstrom
@ 2010-10-26 17:54 ` David Miller
2010-10-26 18:11 ` Daniel Hellstrom
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2010-10-26 17:54 UTC (permalink / raw)
To: sparclinux
From: Daniel Hellstrom <daniel@gaisler.com>
Date: Tue, 26 Oct 2010 19:50:36 +0200
> The LEON do not have internal timers as some CPUs does, it has
> one/multiple General Purpose TIMERs on the Processor Local Bus. On
> single-CPU/SMP systems the first Timer is used for System Clock, and
> on SMP systems timer two is also used to generate a simultaneous IRQ
> on all CPUs for profiling etc. (leon_percpu_timer_interrupt()). On the
> quad-core SMP system I discovered that since the per-cpu timer is
> generated at the same frequency (and almost simultaneously) as the
> System Clock Timer. I have made a patch that uses only one Timer for
> SMP systems, the Timer generates a per-cpu tick as before, however on
> CPU0 the handler_irq() is also called after profiling has been done,
> this is to handle the System Clock Tick. I seems to work successfully,
> and it saves me HZ interrupts per second and a Timer instance. What is
> you opinion about that? Is it possible to use the same timer for
> System Clock and for per-cpu profiling etc.?
You only need to generate one timer interrupt per cpu, and the kernel
generically decides to run the global timer actions (jiffies update,
etc.) on a chosen cpu, transparently, in the per-cpu periodic timer
code.
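A minimal user-space model of that idea, loosely based on how the generic periodic tick code behaves. All names here (`periodic_tick()`, `do_timer_cpu`, `jiffies_model`) are illustrative assumptions, and the rule that the first CPU to tick becomes the owner is a simplification made for this sketch.

```c
#define NCPUS 4 /* assumption: quad-core system */

static unsigned long jiffies_model;      /* models the global jiffies counter */
static unsigned long local_ticks[NCPUS]; /* per-cpu accounting/profiling ticks */
static int do_timer_cpu = -1;            /* CPU that owns the global update */

/* Called from every CPU's own periodic timer interrupt. */
static void periodic_tick(int cpu)
{
	/* Simplification: the first CPU that ticks takes ownership. */
	if (do_timer_cpu < 0)
		do_timer_cpu = cpu;

	/* Global timer actions run on exactly one CPU. */
	if (cpu == do_timer_cpu)
		jiffies_model++;

	/* Per-cpu work runs everywhere. */
	local_ticks[cpu]++;
}
```

The architecture interrupt handler never names a CPU itself; the choice lives entirely in the shared tick code.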
* Re: LEON SMP
2010-10-26 17:50 LEON SMP Daniel Hellstrom
2010-10-26 17:54 ` David Miller
@ 2010-10-26 18:11 ` Daniel Hellstrom
2011-01-05 10:20 ` Daniel Hellstrom
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Daniel Hellstrom @ 2010-10-26 18:11 UTC (permalink / raw)
To: sparclinux
David Miller wrote:
>From: Daniel Hellstrom <daniel@gaisler.com>
>Date: Tue, 26 Oct 2010 19:50:36 +0200
>
>
>
>>The LEON do not have internal timers as some CPUs does, it has
>>one/multiple General Purpose TIMERs on the Processor Local Bus. On
>>single-CPU/SMP systems the first Timer is used for System Clock, and
>>on SMP systems timer two is also used to generate a simultaneous IRQ
>>on all CPUs for profiling etc. (leon_percpu_timer_interrupt()). On the
>>quad-core SMP system I discovered that since the per-cpu timer is
>>generated at the same frequency (and almost simultaneously) as the
>>System Clock Timer. I have made a patch that uses only one Timer for
>>SMP systems, the Timer generates a per-cpu tick as before, however on
>>CPU0 the handler_irq() is also called after profiling has been done,
>>this is to handle the System Clock Tick. I seems to work successfully,
>>and it saves me HZ interrupts per second and a Timer instance. What is
>>you opinion about that? Is it possible to use the same timer for
>>System Clock and for per-cpu profiling etc.?
>>
>>
>
>You only need to generate one timer interrupt per-cpu, and the kernel
>generically decides to run the global timer actions (jiffies update,
>etc.) on a choosen cpu, transparently, in the per-cpu periodic timer
>code.
>
>
>
That is interesting, I didn't even think about that. So then I can even
remove the extra call to handler_irq() from within the per-cpu timer IRQ
handler, and I will probably have to fix some code in the System Clock
Timer setup as well. I will have to investigate this further then.
So it is actually bad to take 5 timer IRQs per tick on a quad-core
system, with one of them calling handler_irq(). But I can see that the
system clock is progressing at the correct pace, unless my watch is bad
:) I will get back to this issue later on.
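The interrupt counts mentioned above work out as follows. This is plain arithmetic in C; the function names are made up for this illustration, and any HZ value passed in is an assumption since the thread does not state one.

```c
/* Interrupts per second across all CPUs when a separate system clock
 * timer is used: one per-cpu tick on each CPU plus one system tick. */
static unsigned long irqs_two_timers(unsigned long hz, unsigned int ncpus)
{
	return hz * (ncpus + 1UL);
}

/* Interrupts per second when a single timer drives every CPU. */
static unsigned long irqs_one_timer(unsigned long hz, unsigned int ncpus)
{
	return hz * ncpus;
}
```

For four CPUs that is 5 interrupts per tick against 4, so the single-timer scheme saves exactly HZ interrupts per second whatever HZ is.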
Thank you,
Daniel
* Re: LEON SMP
2010-10-26 17:50 LEON SMP Daniel Hellstrom
2010-10-26 17:54 ` David Miller
2010-10-26 18:11 ` Daniel Hellstrom
@ 2011-01-05 10:20 ` Daniel Hellstrom
2011-01-05 10:48 ` Sam Ravnborg
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Daniel Hellstrom @ 2011-01-05 10:20 UTC (permalink / raw)
To: sparclinux
Daniel Hellstrom wrote:
> David Miller wrote:
>
>> From: Daniel Hellstrom <daniel@gaisler.com>
>> Date: Tue, 26 Oct 2010 19:50:36 +0200
>>
>>
>>
>>> The LEON do not have internal timers as some CPUs does, it has
>>> one/multiple General Purpose TIMERs on the Processor Local Bus. On
>>> single-CPU/SMP systems the first Timer is used for System Clock, and
>>> on SMP systems timer two is also used to generate a simultaneous IRQ
>>> on all CPUs for profiling etc. (leon_percpu_timer_interrupt()). On the
>>> quad-core SMP system I discovered that since the per-cpu timer is
>>> generated at the same frequency (and almost simultaneously) as the
>>> System Clock Timer. I have made a patch that uses only one Timer for
>>> SMP systems, the Timer generates a per-cpu tick as before, however on
>>> CPU0 the handler_irq() is also called after profiling has been done,
>>> this is to handle the System Clock Tick. I seems to work successfully,
>>> and it saves me HZ interrupts per second and a Timer instance. What is
>>> you opinion about that? Is it possible to use the same timer for
>>> System Clock and for per-cpu profiling etc.?
>>>
>>
>>
>> You only need to generate one timer interrupt per-cpu, and the kernel
>> generically decides to run the global timer actions (jiffies update,
>> etc.) on a choosen cpu, transparently, in the per-cpu periodic timer
>> code.
>>
>>
>>
> That is interesting, I didn't even think about that. So then I can
> even remove the extra call to handler_irq() from within the per-cpu
> timer IRQ handler, and I will probably have to fix some code in the
> System Clock Timer setup as well. I will have to investigate this
> further then.
>
> So actually it is bad to make 5 timer IRQs per tick on a quad core
> system, and one of them is calling handler_irq. But I can see that the
> system clock is progressing in the correct pace, unless my watch is
> bad :) I will get back to this issue later on.
I have started working on this timer patch again...
I tried looking at sun4d and sun4m to get an example of how to implement
this in a better way, however they seem to implement the per-cpu ticker
using hardcoded IRQ number 14 and a custom trap handler for the per-cpu
timer ticker (see bottom of kernel/sun4m_irq.c: sun4m_init_timers()).
The system clock is implemented using the time-tick at IRQ10. I'm not
sure why the time-tick timer is used at all, unless the hardware
requires it (perhaps the per-cpu timer cannot provide the time, or it
limits HZ).
The LEON port mimics this behaviour, however the LEON CPU does not have
a "per-cpu" timer. The new patch uses only one timer to generate one IRQ
on each CPU simultaneously, and only CPU0 will call handler_irq() to
run the standard system clock tick interrupt handler. Please see the
patch I will submit to the list soon for reference.
This patch is bad if one would want the system clock tick to run at a
different rate than the profiling per-cpu ticks, or if the system clock
tick is turned off/on as it will affect the profiling, however I have
not seen code indicating such a behaviour.
Thanks for applying the other patches,
Daniel
* Re: LEON SMP
2010-10-26 17:50 LEON SMP Daniel Hellstrom
` (2 preceding siblings ...)
2011-01-05 10:20 ` Daniel Hellstrom
@ 2011-01-05 10:48 ` Sam Ravnborg
2011-01-05 11:19 ` Daniel Hellstrom
2011-01-05 21:27 ` David Miller
5 siblings, 0 replies; 7+ messages in thread
From: Sam Ravnborg @ 2011-01-05 10:48 UTC (permalink / raw)
To: sparclinux
>
> I have started working on this timer patch again...
>
> I tried looking a sun4d and sun4m to get an example of how to implement
> this in a better way, however they seem to implement the per-cpu ticker
> using hardcoded IRQ number 14 and a custom trap handler for the per-cpu
> timer ticker (see bottom of kernel/sun4m_irq.c: sun4m_init_timers()).
I am slowly looking into introducing generic IRQ support for SPARC.
If I succeed then we will shift to a more dynamic numbering
of interrupts - like sparc64 does.
Right now I am still analysing SPARC, the existing
codebase and genirq in the kernel, so it will take
a while before I get anywhere with this.
Sam
* Re: LEON SMP
2010-10-26 17:50 LEON SMP Daniel Hellstrom
` (3 preceding siblings ...)
2011-01-05 10:48 ` Sam Ravnborg
@ 2011-01-05 11:19 ` Daniel Hellstrom
2011-01-05 21:27 ` David Miller
5 siblings, 0 replies; 7+ messages in thread
From: Daniel Hellstrom @ 2011-01-05 11:19 UTC (permalink / raw)
To: sparclinux
Sam Ravnborg wrote:
>>I have started working on this timer patch again...
>>
>>I tried looking a sun4d and sun4m to get an example of how to implement
>>this in a better way, however they seem to implement the per-cpu ticker
>>using hardcoded IRQ number 14 and a custom trap handler for the per-cpu
>>timer ticker (see bottom of kernel/sun4m_irq.c: sun4m_init_timers()).
>>
>>
>
>I am slowly looking into introducing generic IRQ support for SPARC.
>If I succeed then we will shift to a more dynamic numbering
>of interrupts - like sparc64 does.
>
>
That would be great. I have not looked much into the other SPARC32
ports or SPARC64, but the LEON port always handles an IRQ on the
CPU that called request_irq(). Since CPU0 initializes everything during
startup, CPU0 ends up doing a lot of the IRQ work. I wish there were a
way of routing IRQs to different CPUs, ideally at
runtime, though a static configuration would be good enough.
>Right now I am in a situation where I try to analyse SPARC, existing
>codebase and genirq in the kernel. So it will take
>a while before I get anywhere with this.
>
>
I understand, and I appreciate your efforts.
Daniel
* Re: LEON SMP
2010-10-26 17:50 LEON SMP Daniel Hellstrom
` (4 preceding siblings ...)
2011-01-05 11:19 ` Daniel Hellstrom
@ 2011-01-05 21:27 ` David Miller
5 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2011-01-05 21:27 UTC (permalink / raw)
To: sparclinux
From: Daniel Hellstrom <daniel@gaisler.com>
Date: Wed, 05 Jan 2011 11:20:19 +0100
> I tried looking a sun4d and sun4m to get an example of how to
> implement this in a better way, however they seem to implement the
> per-cpu ticker using hardcoded IRQ number 14 and a custom trap handler
> for the per-cpu timer ticker (see bottom of kernel/sun4m_irq.c:
> sun4m_init_timers()). The system clock is implemented using the
> time-tick at IRQ10. I'm not sure why the time-tick timer is used at
> all, unless the hardware requires it (the per-cpu timer perhaps can
> not get the time or limits HZ).
sun4c, sun4d, and sun4m have two timer sources that deliver interrupts
on level 10 and level 14, and these mappings are not changeable.
The level 10 one is a global interrupt which can target one cpu,
whereas the level 14 one can be used in a per-cpu manner and is
usually referred to as the profiling timer.
Therefore there is one count/limit register pair for the level 10
timer, and NR_CPUS pairs of count/limit registers for level 14.
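The register arrangement described above can be pictured with a C structure. This is only an illustration of "one pair plus NR_CPUS pairs": the type and field names are hypothetical, the field order and widths are assumptions, and it does not reproduce the real sun4m register map or offsets.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4 /* assumption for this sketch */

/* One count/limit register pair (hypothetical layout). */
struct timer_pair {
	uint32_t limit; /* interrupt is raised when count reaches limit */
	uint32_t count; /* free-running counter */
};

/* Hypothetical grouping, not the actual hardware register map. */
struct sun4m_style_timers {
	struct timer_pair level10;          /* global system timer */
	struct timer_pair level14[NR_CPUS]; /* per-cpu profiling timers */
};

/* Return the level 14 pair for a CPU, or NULL if out of range. */
static struct timer_pair *level14_timer_for(struct sun4m_style_timers *t,
					    unsigned int cpu)
{
	return cpu < NR_CPUS ? &t->level14[cpu] : NULL;
}
```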
The best thing to do is to use only the level 14 timer, and register
it as a clockevent with the generic code.
Then you won't have to deal with any details like which cpu to invoke
the scheduler tick on, etc. It'll all be taken care of for you.
Something like:
static struct clock_event_device sparc32_clockevent = {
	.features = CLOCK_EVT_FEAT_ONESHOT,
	.set_mode = sparc32_timer_setup,
	.set_next_event = sparc32_next_event,
	.rating = 100,
	.shift = 30,
	.irq = -1,
};

static DEFINE_PER_CPU(struct clock_event_device, sparc32_events);
You implement sparc32_timer_setup, which will look something like:
static void sparc32_timer_setup(enum clock_event_mode mode,
				struct clock_event_device *evt)
{
	switch (mode) {
	case CLOCK_EVT_MODE_ONESHOT:
	case CLOCK_EVT_MODE_RESUME:
		break;

	case CLOCK_EVT_MODE_SHUTDOWN:
		disable_level14_timer();
		break;

	case CLOCK_EVT_MODE_PERIODIC:
	case CLOCK_EVT_MODE_UNUSED:
		WARN_ON(1);
		break;
	}
}
and sparc32_next_event, which advances the limit register to the
next interrupt count.
static int sparc32_next_event(unsigned long delta,
			      struct clock_event_device *evt)
{
	unsigned long orig_count, new_count, new_limit;

	orig_count = sparc32_read_level14_count();
	new_limit = orig_count + delta;
	sparc32_write_level14_limit(new_limit);
	new_count = sparc32_read_level14_count();

	/* If the new limit has been passed already, let the caller
	 * know.
	 */
	if (((long)(new_count - (orig_count + delta))) > 0L)
		return -ETIME;

	return 0;
}
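The signed-difference test at the end of that function is what keeps the check correct when the counter wraps. A minimal standalone sketch of the same comparison follows; it assumes a 32-bit counter, a two's complement target, and a distance between the two values of less than 2^31, and `counter_after()` is a name invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `now` is strictly past `deadline`, modulo 2^32.
 * The unsigned subtraction wraps, and reinterpreting the result as a
 * signed value gives the direction of the difference. */
static bool counter_after(uint32_t now, uint32_t deadline)
{
	return (int32_t)(now - deadline) > 0;
}
```

With a limit of 0xFFFFFFF0 and a counter that has wrapped around to 5, a plain `now > deadline` reports that the limit has not been reached, while the signed difference correctly reports that it has been passed.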
The level 14 timer interrupt should go:
	int cpu = smp_processor_id();
	struct clock_event_device *evt = &per_cpu(sparc32_events, cpu);
	...
	if (unlikely(!evt->event_handler)) {
		printk(KERN_WARNING
		       "Spurious SPARC32 timer interrupt on cpu %d\n", cpu);
	} else {
		evt->event_handler(evt);
	}
	...
Finally, on bootup and on each cpu, initialize (but do not start) the
level 14 timer and then go:
	struct clock_event_device *sevt;

	sevt = &__get_cpu_var(sparc32_events);
	memcpy(sevt, &sparc32_clockevent, sizeof(*sevt));
	sevt->cpumask = cpumask_of(smp_processor_id());

	clockevents_register_device(sevt);
The clockevents system will make ->next_event() calls on this
clockevents device to start the timers firing.
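The one-shot flow described in this message (program a limit, take the interrupt, call the event handler, which programs the next limit) can be modelled in user space. Everything in this sketch is hypothetical: `struct model_clockevent`, `model_set_next_event()`, `model_tick_handler()`, `model_hw_step()` and `TICK_DELTA` are invented for illustration and are not kernel interfaces.

```c
#include <stdint.h>

#define TICK_DELTA 1000u /* assumption: counter steps between ticks */

struct model_clockevent {
	uint32_t count;       /* models the level 14 count register */
	uint32_t limit;       /* models the level 14 limit register */
	int armed;            /* one-shot: set while an event is pending */
	unsigned long events; /* number of events delivered so far */
	void (*event_handler)(struct model_clockevent *evt);
};

/* Models ->set_next_event(): fire `delta` counts from now. */
static int model_set_next_event(uint32_t delta, struct model_clockevent *evt)
{
	evt->limit = evt->count + delta;
	evt->armed = 1;
	return 0;
}

/* Models the handler installed by the generic code: account for the
 * event, then ask for the next one. */
static void model_tick_handler(struct model_clockevent *evt)
{
	evt->events++;
	model_set_next_event(TICK_DELTA, evt);
}

/* Advance the modelled hardware by one count and raise the interrupt
 * when the counter reaches the programmed limit. */
static void model_hw_step(struct model_clockevent *evt)
{
	evt->count++;
	if (evt->armed && evt->count == evt->limit) {
		evt->armed = 0; /* nothing fires again until reprogrammed */
		if (evt->event_handler)
			evt->event_handler(evt);
	}
}
```

The interrupt path only forwards to `event_handler`; deciding when the next event fires is left to the handler, which is the division of labour the message describes.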