* [RFC] Energy/power monitoring within the kernel
From: Guenter Roeck @ 2012-10-23 22:02 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1351013449.9070.5.camel@hornet>
On Tue, Oct 23, 2012 at 06:30:49PM +0100, Pawel Moll wrote:
> Greetings All,
>
> More and more people are getting interested in the subject of power
> (energy) consumption monitoring. We have some external tools like
> "battery simulators", energy probes etc., but some targets can measure
> their power usage on their own.
>
> Traditionally such data should be exposed to the user via hwmon sysfs
> interface, and that's exactly what I did for "my" platform - I have
> a /sys/class/hwmon/hwmon*/device/energy*_input and this was good
> enough to draw pretty graphs in userspace. Everyone was happy...
>
The only driver supporting "energy" output so far is ibmaem, and the reported
energy is supposed to be cumulative, as in energy = power * time. Do you mean
power, possibly?
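To make the distinction concrete, here is a minimal sketch of how an average
power figure could be derived from two samples of a cumulative energy counter.
It assumes hwmon-style units (microjoules for energy, microwatts for power);
the function name and the microsecond timestamps are made up for illustration
and are not an existing kernel interface.

```c
#include <stdint.h>

/*
 * Average power in microwatts between two samples of a cumulative
 * energy counter. Energy is in microjoules, timestamps in microseconds.
 * Returns 0 if no time has elapsed between the samples.
 * (Hypothetical helper, not part of any driver.)
 */
uint64_t avg_power_uw(uint64_t e0_uj, uint64_t e1_uj,
		      uint64_t t0_us, uint64_t t1_us)
{
	uint64_t dt_us = t1_us - t0_us;

	if (dt_us == 0)
		return 0;

	/* uJ / us is watts, so scale by 1e6 to express the result in uW */
	return (e1_uj - e0_uj) * 1000000ULL / dt_us;
}
```

An energy attribute only becomes a power reading once it is differentiated over
time like this, which is why the two should not be mixed up.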
> Now I am getting new requests to do more with this data. In particular
> I'm asked how to add such information to ftrace/perf output. The second
> most frequent request is about providing it to a "energy aware"
> cpufreq governor.
>
Anything energy related would have to be along the line of "do something after a
certain amount of work has been performed", which at least at the surface does
not make much sense to me, unless you mean something along the line of a
process scheduler which schedules a process not based on time slices but based
on energy consumed, i.e. if you want to define a time slice not in milliseconds
but in joules.
If so, I would argue that a similar behavior could be achieved by varying the
duration of time slices with the current CPU speed, or simply by using cycle
count instead of time as time slice parameter. Not that I am sure if such an
approach would really be of interest for anyone.
Or do you really mean power, not energy, such as in "reduce CPU speed if its
power consumption is above X Watt" ?
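As a rough sketch of the "cycle count instead of time" idea above: if a slice
is defined as a fixed budget of CPU cycles, its wall-clock duration follows
from the current frequency. The function name and the kHz unit are assumptions
chosen for the example; nothing here is scheduler code.

```c
#include <stdint.h>

/*
 * Wall-clock duration, in microseconds, of a time slice defined as a
 * fixed budget of CPU cycles at a given clock frequency in kHz.
 * Returns 0 for a zero frequency. (Hypothetical helper.)
 */
uint64_t slice_us(uint64_t cycle_budget, uint32_t freq_khz)
{
	if (freq_khz == 0)
		return 0;

	/* cycles / (kHz * 1000) seconds == cycles * 1000 / kHz microseconds */
	return cycle_budget * 1000ULL / freq_khz;
}
```

Halving the CPU frequency doubles the slice duration for the same budget, which
is the behaviour described above without involving any energy measurement.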
> I've come up with three (non-mutually exclusive) options. I will
> appreciate any other ideas and comments (including "it makes no sense
> whatsoever" ones, with justification). Of course I am more than willing
> to spend time on prototyping anything that seems reasonable and propose
> patches.
>
>
>
> === Option 1: Trace event ===
>
> This seems to be the "cheapest" option. Simply defining a trace event
> that can be generated by a hwmon (or any other) driver makes the
> interesting data immediately available to any ftrace/perf user. Of
> course it doesn't really help with the cpufreq case, but seems to be
> a good place to start with.
>
> The question is how to define it... I've come up with two prototypes:
>
> = Generic hwmon trace event =
>
> This one allows any driver to generate a trace event whenever any
> "hwmon attribute" (measured value) gets updated. The rate at which the
> updates happen can be controlled by already existing "update_interval"
> attribute.
>
> 8<-------------------------------------------
> TRACE_EVENT(hwmon_attr_update,
> TP_PROTO(struct device *dev, struct attribute *attr, long long input),
> TP_ARGS(dev, attr, input),
>
> TP_STRUCT__entry(
> __string( dev, dev_name(dev))
> __string( attr, attr->name)
> __field( long long, input)
> ),
>
> TP_fast_assign(
> __assign_str(dev, dev_name(dev));
> __assign_str(attr, attr->name);
> __entry->input = input;
> ),
>
> TP_printk("%s %s %lld", __get_str(dev), __get_str(attr), __entry->input)
> );
> 8<-------------------------------------------
>
> It generates an ftrace message like this:
>
> <...>212.673126: hwmon_attr_update: hwmon4 temp1_input 34361
>
> One issue with this is that some external knowledge is required to
> relate a number to a processor core. Or maybe it's not an issue at all
> because it should be left for the user(space)?
>
> = CPU power/energy/temperature trace event =
>
> This one is designed to emphasize the relation between the measured
> value (whether it is energy, temperature or any other physical
> phenomenon, really) and CPUs, so it is quite specific (too specific?)
>
> 8<-------------------------------------------
> TRACE_EVENT(cpus_environment,
> TP_PROTO(const struct cpumask *cpus, long long value, char unit),
> TP_ARGS(cpus, value, unit),
>
> TP_STRUCT__entry(
> __array( unsigned char, cpus, sizeof(struct cpumask))
> __field( long long, value)
> __field( char, unit)
> ),
>
> TP_fast_assign(
> memcpy(__entry->cpus, cpus, sizeof(struct cpumask));
> __entry->value = value;
> __entry->unit = unit;
> ),
>
> TP_printk("cpus %s %lld[%c]",
> __print_cpumask((struct cpumask *)__entry->cpus),
> __entry->value, __entry->unit)
> );
> 8<-------------------------------------------
>
> And the equivalent ftrace message is:
>
> <...>127.063107: cpus_environment: cpus 0,1,2,3 34361[C]
>
> It's a cpumask, not just single cpu id, because the sensor may measure
> the value per set of CPUs, eg. a temperature of the whole silicon die
> (so all the cores) or an energy consumed by a subset of cores (this
> is my particular use case - two meters monitor a cluster of two
> processors and a cluster of three processors, all working as a SMP
> system).
>
> Of course the cpus __array could be actually a special __cpumask field
> type (I've just hacked the __print_cpumask so far). And I've just
> realised that the unit field should actually be a string to allow unit
> prefixes to be specified (the above should obviously be "34361[mC]"
> not "[C]"). Also - excuse the "cpus_environment" name - this was the
> best I was able to come up with at the time and I'm eager to accept
> any alternative suggestions :-)
>
I am not sure how this would be expected to work. hwmon is, by its very nature,
a passive subsystem: It doesn't do anything unless data is explicitly requested
from it. It does not update an attribute unless that attribute is read.
That does not seem to fit well with the idea of tracing - which assumes
that some activity is happening, ultimately, all by itself, presumably
periodically. The idea to have a user space application read hwmon data only
for it to trigger trace events does not seem to be very compelling to me.
An exception is if a monitoring device supports interrupts, and if its driver
actually implements those interrupts. This is, however, not the case for most of
the current drivers (if any), mostly because interrupt support for hardware
monitoring devices is very platform dependent and thus difficult to implement.
>
> === Option 2: hwmon perf PMU ===
>
> Although the trace event makes it possible to obtain interesting
> information using perf, the user wouldn't be able to treat the
> energy meter as a normal data source. In particular there would
> be no way of creating a group of events consisting eg. of a
> "normal" leader (eg. cache miss event) triggering energy meter
> read. The only way to get this done is to implement a perf PMU
> backend providing "environmental data" to the user.
>
> = High-level hwmon API and PMU =
>
> Current hwmon subsystem does not provide any abstraction for the
> measured values and requires particular drivers to create specified
> sysfs attributes then used by userspace libsensors. This makes
> the framework ultimately flexible and ultimately hard to access
> from within the kernel...
>
> What could be done here is some (simple) API to register the
> measured values with the hwmon core which would result in creating
> equivalent sysfs attributes automagically, but also allow an
> in-kernel API for values enumeration and access. That way the core
> could also register a "hwmon PMU" with the perf framework providing
> data from all "compliant" drivers.
>
> = A driver-specific PMU =
>
> Of course a particular driver could register its own perf PMU on its
> own. It's certainly an option, just very suboptimal in my opinion.
> Or maybe not? Maybe the task is so specialized that it makes sense?
>
We had a couple of attempts to provide an in-kernel API. Unfortunately,
the result was, at least so far, more complexity on the driver side.
So the difficulty is really to define an API which is really simple, and does
not just complicate driver development for a (presumably) rare use case.
Guenter
>
>
> === Option 3: CPU power(energy) monitoring framework ===
>
> And last but not least, maybe the problem deserves some dedicated
> API? Something that would take providers and feed their data into
> interested parties, in particular a perf PMU implementation and
> cpufreq governors?
>
> Maybe it could be an extension to the thermal framework? It already
> gives some meaning to physical phenomena. Adding other, related ones
> like energy, and relating it to cpu cores could make some sense.
>
>
>
> I've tried to gather all potentially interested audience in the To:
> list, but if I missed anyone - please, do let them (and/or me) know.
>
> Best regards and thanks for participation in the discussion!
>
> Pawel
>
>
>
>
^ permalink raw reply
* [GIT PULL] ARM: OMAP: PM fixes for v3.7-rc3
From: Kevin Hilman @ 2012-10-23 22:14 UTC (permalink / raw)
To: linux-arm-kernel
Tony,
Here are a few more PM-related fixes for v3.7-rc
Kevin
The following changes since commit 6f0c0580b70c89094b3422ba81118c7b959c7556:
Linux 3.7-rc2 (2012-10-20 12:11:32 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm.git tags/for_3.7-rc3-fixes-pm
for you to fetch changes up to 65bf7ca0005d7d827596d5df28583c83c9158da6:
ARM: OMAP3: Beagle: fix OPP customization and initcall ordering (2012-10-22 16:01:42 -0700)
----------------------------------------------------------------
Misc. OMAP PM-related fixes for v3.7-rc
----------------------------------------------------------------
Kevin Hilman (2):
ARM: OMAP2: UART: fix console UART mismatched runtime PM status
ARM: OMAP3: Beagle: fix OPP customization and initcall ordering
Paul Walmsley (1):
ARM: OMAP3: PM: apply part of the erratum i582 workaround
arch/arm/mach-omap2/board-omap3beagle.c | 22 +++++++++++++---------
arch/arm/mach-omap2/pm.h | 1 +
arch/arm/mach-omap2/pm34xx.c | 30 ++++++++++++++++++++++++++++--
arch/arm/mach-omap2/serial.c | 5 +++++
4 files changed, 47 insertions(+), 11 deletions(-)
^ permalink raw reply
* [PATCH v3] pwm: vt8500: Update vt8500 PWM driver support
From: Thierry Reding @ 2012-10-23 22:14 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1350929425-17516-1-git-send-email-linux@prisktech.co.nz>
On Tue, Oct 23, 2012 at 07:10:24AM +1300, Tony Prisk wrote:
[...]
> @@ -87,6 +98,11 @@ static int vt8500_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm)
> {
> struct vt8500_chip *vt8500 = to_vt8500_chip(chip);
>
> + if (!clk_enable(vt8500->clk)) {
> + dev_err(chip->dev, "failed to enable clock\n");
> + return -EBUSY;
> + };
> +
I don't think that works. The clock API returns 0 on success and a
negative error code on failure. So this should rather be something like:
err = clk_enable(vt8500->clk);
if (err < 0) {
dev_err(chip->dev, "failed to enable clock: %d\n", err);
return err;
}
> @@ -123,6 +153,12 @@ static int __devinit pwm_probe(struct platform_device *pdev)
> chip->chip.ops = &vt8500_pwm_ops;
> chip->chip.base = -1;
> chip->chip.npwm = VT8500_NR_PWMS;
> + chip->clk = devm_clk_get(&pdev->dev, NULL);
> +
The blank line should go above the call to devm_clk_get().
> + if (IS_ERR_OR_NULL(chip->clk)) {
> + dev_err(&pdev->dev, "clock source not specified\n");
> + return PTR_ERR(chip->clk);
> + }
[...]
> + if (!clk_prepare(chip->clk)) {
> + dev_err(&pdev->dev, "failed to prepare clock\n");
> + return -EBUSY;
> + }
> +
Same comment here. I wonder how this code can work, since if the clock
is properly prepared, then it will return 0, and the above will return
-EBUSY.
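To illustrate the point, here is a small user-space sketch with a stand-in for
clk_enable() that follows the same convention (0 on success, negative error
code on failure). fake_clk_enable() and the two wrappers are hypothetical and
exist only for this example.

```c
#include <errno.h>

/* Hypothetical stand-in for clk_enable(): 0 on success, -errno on failure. */
int fake_clk_enable(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

/* The check as written in the patch: a return of 0 (success) is treated as failure. */
int enable_inverted(int should_fail)
{
	if (!fake_clk_enable(should_fail))
		return -EBUSY;
	return 0;
}

/* The check as suggested above: only negative values are errors, and they are propagated. */
int enable_fixed(int should_fail)
{
	int err = fake_clk_enable(should_fail);

	if (err < 0)
		return err;
	return 0;
}
```

With the inverted check, a clock that enables fine makes the caller bail out
with -EBUSY, while a real failure is silently treated as success.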
> ret = pwmchip_add(&chip->chip);
> - if (ret < 0)
> + if (ret < 0) {
> + dev_err(&pdev->dev, "failed to add pwmchip\n");
Error messages can be considered prose, so this should be: "failed to
add PWM chip".
Thierry
^ permalink raw reply
* [PATCH] genirq: provide means to retrigger parent
From: Kevin Hilman @ 2012-10-23 22:23 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20121016221502.GY28061@n2100.arm.linux.org.uk>
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> On Tue, Oct 16, 2012 at 03:07:49PM -0700, Kevin Hilman wrote:
>> From: Thomas Gleixner <tglx@linutronix.de>
>>
>> Attempts to retrigger nested threaded IRQs currently fail because they
>> have no primary handler. In order to support retrigger of nested
>> IRQs, the parent IRQ needs to be retriggered.
>>
>> To fix, when an IRQ needs to be resent, if the interrupt has a parent
>> IRQ and runs in the context of the parent IRQ, then resend the parent.
>>
>> Also, handle_nested_irq() needs to clear the replay flag like the
>> other handlers, otherwise check_irq_resend() will set it and it will
>> never be cleared. Without clearing, it results in the first resend
>> working fine, but check_irq_resend() returning early on subsequent
>> resends because the replay flag is still set.
>>
>> Problem discovered on ARM/OMAP platforms where a nested IRQ that's
>> also a wakeup IRQ happens late in suspend and needed to be retriggered
>> during the resume process.
>>
>> Reported-by: Kevin Hilman <khilman@ti.com>
>> Tested-by: Kevin Hilman <khilman@ti.com>
>> [khilman@ti.com: changelog edits, clear IRQS_REPLAY in handle_nested_irq()]
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> Umm, we also have the converse situation. We have platforms where the
> resend has to be done from the child IRQ, and the parent must not be
> touched. I hope that doesn't break those.
I'm assuming the child IRQs you're concerned with are not threaded,
right? This patch only addresses nested, threaded IRQs, and these don't
have a primary handler to run at all, so cannot do any triggering.
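For reference, the decision described in the changelog can be modelled in a few
lines of user-space C. This is only a sketch of the logic as stated above
(resend the parent when the interrupt is nested-threaded and has a parent set);
struct irq_model and resend_target() are invented for the example and are not
the genirq data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of an interrupt descriptor. */
struct irq_model {
	bool nested_thread;       /* runs in the parent's thread, no primary handler */
	struct irq_model *parent; /* NULL unless a parent was explicitly set */
	unsigned int resend_count;
};

/*
 * Choose which interrupt to retrigger and record the resend on it.
 * Only a nested-threaded interrupt with a parent is redirected;
 * everything else is resent on the interrupt itself.
 */
struct irq_model *resend_target(struct irq_model *irq)
{
	struct irq_model *target = irq;

	if (irq->nested_thread && irq->parent)
		target = irq->parent;

	target->resend_count++;
	return target;
}
```

A non-threaded child is never redirected in this model, even if it has a
parent, which is the case Russell is asking about.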
Kevin
^ permalink raw reply
* [PATCH v4 2/5] zynq: use pl310 device tree bindings
From: Josh Cartwright @ 2012-10-23 22:34 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <cover.1351466765.git.josh.cartwright@ni.com>
The Zynq has a PL310 L2 cache controller. Convert in-tree uses to using
the device tree.
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
Cc: John Linn <john.linn@xilinx.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Simek <michal.simek@xilinx.com>
---
arch/arm/boot/dts/zynq-ep107.dts | 9 +++++++++
arch/arm/mach-zynq/common.c | 9 +--------
arch/arm/mach-zynq/include/mach/zynq_soc.h | 4 ----
3 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/arch/arm/boot/dts/zynq-ep107.dts b/arch/arm/boot/dts/zynq-ep107.dts
index f914090..574bc04 100644
--- a/arch/arm/boot/dts/zynq-ep107.dts
+++ b/arch/arm/boot/dts/zynq-ep107.dts
@@ -44,6 +44,15 @@
<0xF8F00100 0x100>;
};
+ L2: cache-controller {
+ compatible = "arm,pl310-cache";
+ reg = <0xF8F02000 0x1000>;
+ arm,data-latency = <2 3 2>;
+ arm,tag-latency = <2 3 2>;
+ cache-unified;
+ cache-level = <2>;
+ };
+
uart0: uart@e0000000 {
compatible = "xlnx,xuartps";
reg = <0xE0000000 0x1000>;
diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
index d73963b..056091a 100644
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@ -45,12 +45,10 @@ static struct of_device_id zynq_of_bus_ids[] __initdata = {
*/
static void __init xilinx_init_machine(void)
{
-#ifdef CONFIG_CACHE_L2X0
/*
* 64KB way size, 8-way associativity, parity disabled
*/
- l2x0_init(PL310_L2CC_BASE, 0x02060000, 0xF0F0FFFF);
-#endif
+ l2x0_of_init(0x02060000, 0xF0F0FFFF);
of_platform_bus_probe(NULL, zynq_of_bus_ids, NULL);
}
@@ -83,11 +81,6 @@ static struct map_desc io_desc[] __initdata = {
.pfn = __phys_to_pfn(SCU_PERIPH_PHYS),
.length = SZ_8K,
.type = MT_DEVICE,
- }, {
- .virtual = PL310_L2CC_VIRT,
- .pfn = __phys_to_pfn(PL310_L2CC_PHYS),
- .length = SZ_4K,
- .type = MT_DEVICE,
},
#ifdef CONFIG_DEBUG_LL
diff --git a/arch/arm/mach-zynq/include/mach/zynq_soc.h b/arch/arm/mach-zynq/include/mach/zynq_soc.h
index 3d1c6a6..218283a 100644
--- a/arch/arm/mach-zynq/include/mach/zynq_soc.h
+++ b/arch/arm/mach-zynq/include/mach/zynq_soc.h
@@ -25,9 +25,6 @@
#define TTC0_PHYS 0xF8001000
#define TTC0_VIRT TTC0_PHYS
-#define PL310_L2CC_PHYS 0xF8F02000
-#define PL310_L2CC_VIRT PL310_L2CC_PHYS
-
#define SCU_PERIPH_PHYS 0xF8F00000
#define SCU_PERIPH_VIRT SCU_PERIPH_PHYS
@@ -35,7 +32,6 @@
#define TTC0_BASE IOMEM(TTC0_VIRT)
#define SCU_PERIPH_BASE IOMEM(SCU_PERIPH_VIRT)
-#define PL310_L2CC_BASE IOMEM(PL310_L2CC_VIRT)
/*
* Mandatory for CONFIG_LL_DEBUG, UART is mapped virtual = physical
--
1.8.0
^ permalink raw reply related
* [PATCH] genirq: provide means to retrigger parent
From: Thomas Gleixner @ 2012-10-23 22:36 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <871ugo7rqv.fsf@deeprootsystems.com>
On Tue, 23 Oct 2012, Kevin Hilman wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>
> > On Tue, Oct 16, 2012 at 03:07:49PM -0700, Kevin Hilman wrote:
> >> From: Thomas Gleixner <tglx@linutronix.de>
> >>
> >> Attempts to retrigger nested threaded IRQs currently fail because they
> >> have no primary handler. In order to support retrigger of nested
> >> IRQs, the parent IRQ needs to be retriggered.
> >>
> >> To fix, when an IRQ needs to be resent, if the interrupt has a parent
> >> IRQ and runs in the context of the parent IRQ, then resend the parent.
> >>
> >> Also, handle_nested_irq() needs to clear the replay flag like the
> >> other handlers, otherwise check_irq_resend() will set it and it will
> >> never be cleared. Without clearing, it results in the first resend
> >> working fine, but check_irq_resend() returning early on subsequent
> >> resends because the replay flag is still set.
> >>
> >> Problem discovered on ARM/OMAP platforms where a nested IRQ that's
> >> also a wakeup IRQ happens late in suspend and needed to be retriggered
> >> during the resume process.
> >>
> >> Reported-by: Kevin Hilman <khilman@ti.com>
> >> Tested-by: Kevin Hilman <khilman@ti.com>
> >> [khilman@ti.com: changelog edits, clear IRQS_REPLAY in handle_nested_irq()]
> >> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> >
> > Umm, we also have the converse situation. We have platforms where the
> > resend has to be done from the child IRQ, and the parent must not be
> > touched. I hope that doesn't break those.
>
> I'm assuming the child IRQs you're concerned with are not threaded,
> right? This patch only addresses nested, threaded IRQs, and these don't
> have a primary handler to run at all, so cannot do any triggering.
And it requires that you actively set the parent irq via the new
interface: irq_set_parent()
If you don't have that yet, or you don't use it in your future changes,
then you're good. :)
Thanks,
tglx
^ permalink raw reply
* [PATCH v2 1/4] ARM: dts: omap5: Update GPIO with address space and interrupts
From: Jon Hunter @ 2012-10-23 23:15 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <5086CC02.6070801@firmworks.com>
Hi Mitch,
On 10/23/2012 11:55 AM, Mitch Bradley wrote:
> On 10/23/2012 4:49 AM, Jon Hunter wrote:
>
>> Therefore, I believe it will improve search time and hence, boot time if
>> we have interrupt-parent defined in each node.
>
> I strongly suspect (based on many years of performance tuning, with
> special focus on boot time) that the time difference will be completely
> insignificant. The total extra time for walking up the interrupt tree
> for every interrupt in a large system is comparable to the time it takes
> to send a few characters out a UART. So you can get more improvement
> from eliminating a single printk() than from globally adding per-node
> interrupt-parent.
>
> Furthermore, the cost of processing all of the interrupt-parent
> properties is probably similar to the cost of the avoided tree walks.
>
> CPU cycles are very fast compared to I/O register accesses, say a factor
> of 100. Now consider that many modern devices contain embedded
> microcontrollers (SD cards, network interface modules, USB hubs and
> devices, ...), and those devices usually require various delays measured
> in milliseconds, to ensure that the microcontroller is ready for the
> next initialization step. Those delays are extremely long compared to
> CPU cycles. Obviously, some of that can be overlapped by careful
> multithreading, but that isn't free either.
>
> The bottom line is that I'm pretty sure that adding per-node
> interrupt-parent would not be worthwhile from the standpoint of speeding
> up boot time.
Absolutely, I don't expect this to miraculously improve the boot time or
suggest that this is a major contributor to boot time, but what is the
best approach in general in terms of efficiency (memory and time). In
other words, is there a best practice? And from your feedback, I
understand that adding a global interrupt-parent is a good practice.
For a bit of fun, I took an omap4430 board and benchmarked the time
taken by the of_irq_find_parent() when interrupt-parent was defined for
each node using interrupts and without.
There were a total of 47 device nodes using interrupts. Adding the
interrupt-parent to all 47 nodes increased the dtb from 13211 bytes to
13963 bytes.
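The size increase works out to exactly 16 bytes per node, which matches what
one extra single-cell property should cost in the structure block. The sketch
below assumes the standard flattened device tree layout (a property token, a
length word, a name offset, then the value padded to 4 bytes) and that the
"interrupt-parent" name string is already present in the strings block;
fdt_prop_bytes() is a made-up helper.

```c
#include <stdint.h>

/*
 * Bytes that one property adds to the structure block of a flattened
 * device tree: FDT_PROP token (4) + value length (4) + name offset (4)
 * + the value itself, padded up to a multiple of 4 bytes.
 * (Hypothetical helper; the name string is assumed to be shared.)
 */
uint32_t fdt_prop_bytes(uint32_t value_len)
{
	uint32_t padded = (value_len + 3u) & ~3u;

	return 4u + 4u + 4u + padded;
}
```

With a 4-byte phandle as the value, 47 nodes times 16 bytes gives the 752 byte
difference between the two dtb sizes.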
On boot-up I saw 117 calls to of_irq_find_parent() for this platform
(there appear to be multiple calls for a given device). Without
interrupt-parent defined for each node total time spent in
of_irq_find_parent() was 1.028 ms whereas with interrupt-parent defined
for each node the total time was 0.4032 ms. This was done using a
38.4MHz timer and the overhead of reading the timer 117 times was about
36 us.
I understand that this does not provide the full picture, but I wanted
to get a better handle on the times here. So yes the overall overhead
here is not significant for us to worry about.
Cheers
Jon
^ permalink raw reply
* [PATCH v2 2/4] zynq: move static peripheral mappings
From: Josh Cartwright @ 2012-10-23 23:42 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20121023202703.GA16912@elliptictech.com>
On Tue, Oct 23, 2012 at 04:27:03PM -0400, Nick Bowler wrote:
>
> Just FYI, I sent a patch to fix the same bug a while back
>
> https://patchwork.kernel.org/patch/1156361/
>
> together with other patches to fix early printk on the ZC702 serial
> console. Admittedly, I dropped the ball on these as other issues
> came up so I was away from the Zynq for a while.
>
> However, I'm now getting back on the Zynq and have a bunch of patches to
> make it all work on the ZC702 board. I've respun the ZC702 early boot
> fixes against newer git but they're obviously going to conflict with
> this series. Should I resend them anyway?
If you have other fixes for the zc702, that'd be great. Most of my
testing has been in a qemu model; I haven't had a chance to try getting
the zc702 booting yet.
The first stumbling block is that it looks like the secondary uart is
the primary uart on the zc702.
> I also have a DT binding for the TTC driver, I can send that.
That'd be great!
Thanks,
Josh
^ permalink raw reply
* [PATCH v2 1/4] ARM: dts: omap5: Update GPIO with address space and interrupts
From: Mitch Bradley @ 2012-10-24 0:18 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <5087252A.50203@ti.com>
On 10/23/2012 1:15 PM, Jon Hunter wrote:
> Hi Mitch,
>
> On 10/23/2012 11:55 AM, Mitch Bradley wrote:
>> On 10/23/2012 4:49 AM, Jon Hunter wrote:
>>
>>> Therefore, I believe it will improve search time and hence, boot time if
>>> we have interrupt-parent defined in each node.
>>
>> I strongly suspect (based on many years of performance tuning, with
>> special focus on boot time) that the time difference will be completely
>> insignificant. The total extra time for walking up the interrupt tree
>> for every interrupt in a large system is comparable to the time it takes
>> to send a few characters out a UART. So you can get more improvement
>> from eliminating a single printk() than from globally adding per-node
>> interrupt-parent.
>>
>> Furthermore, the cost of processing all of the interrupt-parent
>> properties is probably similar to the cost of the avoided tree walks.
>>
>> CPU cycles are very fast compared to I/O register accesses, say a factor
>> of 100. Now consider that many modern devices contain embedded
>> microcontrollers (SD cards, network interface modules, USB hubs and
>> devices, ...), and those devices usually require various delays measured
>> in milliseconds, to ensure that the microcontroller is ready for the
>> next initialization step. Those delays are extremely long compared to
>> CPU cycles. Obviously, some of that can be overlapped by careful
>> multithreading, but that isn't free either.
>>
>> The bottom line is that I'm pretty sure that adding per-node
>> interrupt-parent would not be worthwhile from the standpoint of speeding
>> up boot time.
>
> Absolutely, I don't expect this to miraculously improve the boot time or
> suggest that this is a major contributor to boot time, but what is the
> best approach in general in terms of efficiency (memory and time). In
> other words, is there a best practice? And from your feedback, I
> understand that adding a global interrupt-parent is a good practice.
From a maintenance standpoint, "saying it once" is best practice. Time
that you don't spend doing unnecessary maintenance can be spent looking
for other, higher value, improvements. And when you do need to optimize
something, it's much easier if the function is centralized.
Pushing the interrupt parent up the tree to the appropriate point can
make the next platform easier, opening the possibility of changing just
one thing instead of several dozen.
There have been several cases when I have violated good factoring in
order to save a little time, only to have to undo it later when the next
system was enough different that the de-factored version didn't work.
So, while there are certainly cases where you are forced to do
otherwise, I generally like the "don't repeat yourself" mantra.
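As a sketch of why a single interrupt-parent near the root is enough: the
lookup simply walks up the tree until some node carries the property. The code
below is a simplified user-space model, not of_irq_find_parent() itself (the
real function also resolves phandles and checks for #interrupt-cells);
struct dt_node and find_interrupt_parent() are hypothetical.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for a device tree node. */
struct dt_node {
	const char *name;
	struct dt_node *parent;           /* NULL for the root node */
	struct dt_node *interrupt_parent; /* NULL if the property is absent */
};

/*
 * Walk up from node until the node itself or one of its ancestors
 * carries an interrupt-parent. The number of upward steps taken is
 * stored in *steps. Returns NULL if nothing up to the root defines one.
 */
struct dt_node *find_interrupt_parent(struct dt_node *node, unsigned int *steps)
{
	unsigned int n = 0;

	while (node) {
		if (node->interrupt_parent) {
			*steps = n;
			return node->interrupt_parent;
		}
		node = node->parent;
		n++;
	}

	*steps = n;
	return NULL;
}
```

Per-node properties reduce the step count to zero, at the cost of repeating the
same value in every node that uses interrupts.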
>
> For a bit of fun, I took an omap4430 board and benchmarked the time
> taken by the of_irq_find_parent() when interrupt-parent was defined for
> each node using interrupts and without.
>
> There were a total of 47 device nodes using interrupts. Adding the
> interrupt-parent to all 47 nodes increased the dtb from 13211 bytes to
> 13963 bytes.
>
> On boot-up I saw 117 calls to of_irq_find_parent() for this platform
> (there appear to be multiple calls for a given device). Without
> interrupt-parent defined for each node total time spent in
> of_irq_find_parent() was 1.028 ms whereas with interrupt-parent defined
> for each node the total time was 0.4032 ms. This was done using a
> 38.4MHz timer and the overhead of reading the timer 117 times was about
> 36 us.
That sounds about right. The savings of 600 us is 6 characters at
115200 baud.
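The arithmetic behind that comparison, as a small sketch. It assumes 8N1
framing, so 10 bits on the wire per character; both function names are made up
for the example.

```c
#include <stdint.h>

/*
 * Time in nanoseconds to shift out one character on a UART.
 * bits_per_char includes start and stop bits (10 for 8N1).
 * Returns 0 for a zero baud rate. (Hypothetical helper.)
 */
uint64_t uart_char_ns(uint32_t baud, uint32_t bits_per_char)
{
	if (baud == 0)
		return 0;

	return (uint64_t)bits_per_char * 1000000000ULL / baud;
}

/* Number of whole characters that fit into duration_ns. */
uint64_t chars_in(uint64_t duration_ns, uint32_t baud, uint32_t bits_per_char)
{
	uint64_t per_char = uart_char_ns(baud, bits_per_char);

	return per_char ? duration_ns / per_char : 0;
}
```

At 115200 baud a character takes a little under 87 us under this assumption, so
a saving of roughly 600 us is in the range of six to seven characters.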
>
> I understand that this does not provide the full picture, but I wanted
> to get a better handle on the times here. So yes the overall overhead
> here is not significant for us to worry about.
Big ticket items for boot time improvement are time spent waiting for
peripheral devices to become ready and time spent spewing diagnostic
messages. But in the final analysis, you just have to measure what is
happening and see what you can do to improve it. In my experience, CPU
cycles are rarely problematic, unless they are artificially slowed down
due to caches being off or due to direct execution from slow memory like
ROMs.
I once shaved an hour off the startup time for a PowerPC system by
moving some critical code into cache. This was on a prototype "chip"
that was being emulated by arrays of FPGAs.
On the first generation OLPC XO-1 machine we were really interested in
super-fast wakeup from suspend. I tuned that firmware code path to the
nth degree, finally getting stuck at 2 ms because you had to wait that
long before accessing the PCI bus interface, otherwise the SD controller
chip would lock up. Then I transferred control to the kernel, which had
to wait something like 40 ms (two display frame times) to re-sync the
video subsystem, then it had to re-enable the USB subsystem, which ended
up taking a good fraction of a second.
Things haven't gotten much better (in fact they are probably worse),
because, even though the CPUs have gotten faster, there are more
peripherals with hard-to-avoid delays. So, in the end, a few
sub-millisecond delays just don't matter.
>
> Cheers
> Jon
>
>
^ permalink raw reply
* [PATCH v3 0/5] zynq subarch cleanups
From: Josh Cartwright @ 2012-10-24 0:32 UTC (permalink / raw)
To: linux-arm-kernel
Hey all-
Things have been relatively quiet on the Zynq front lately. This patchset does
a bit of cleanup of the Zynq subarchitecture. It was the necessary set of
things I had to do to get a zynq target booting with the upstream qemu model.
Patches 1 and 2 move zynq to use the GIC and pl310 L2 cache controller device
tree mappings respectively.
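For anyone checking the pl310 patch: the auxiliary control value it passes
(0x02060000) is commented as "64KB way size, 8-way associativity, parity
disabled". A minimal sketch of decoding those fields follows. The bit positions
(way size in bits [19:17], associativity in bit 16, parity enable in bit 21)
are written from memory of the PL310 register layout and should be treated as
an assumption to verify against the TRM; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Way size in KB, decoded from bits [19:17] of the auxiliary control
 * value (assumed layout). Encodings 0 and 7 are treated as reserved
 * and yield 0.
 */
uint32_t pl310_way_size_kb(uint32_t aux)
{
	uint32_t field = (aux >> 17) & 0x7;

	if (field == 0 || field == 7)
		return 0;

	return 8u << field;
}

/* Associativity from bit 16 (assumed): clear means 8-way, set means 16-way. */
unsigned int pl310_ways(uint32_t aux)
{
	return (aux & (1u << 16)) ? 16 : 8;
}

/* Parity enable from bit 21 (assumed). */
bool pl310_parity_enabled(uint32_t aux)
{
	return (aux & (1u << 21)) != 0;
}
```

Under those assumptions, 0x02060000 decodes to the configuration named in the
code comment.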
Patch 3 removes unused clock infrastructure. The plan is to rework the
out-of-tree Xilinx generic clk support into something suitable for merging.
What's in tree now just isn't used at all, and can be removed.
Patches 4 and 5 move the static peripheral mappings into the vmalloc area.
Arnd-
I intentionally did not Cc stable on patch 5, even though you had
suggested otherwise. I do not think it will apply cleanly to the stable
trees independent of the other patches. Additionally, with the current
state of zynq upstream, I'm not convinced there would be enough users to
make it worth the effort.
Additionally, I've left the SCU static mapping around, even though it's
currently unused. We'll eventually need it around (maybe in a different form)
when SMP support is added.
---
Changes since v2:
- Reordered patchset to prevent remapping peripherals that were subsequently
removed from the static map
- Use DT bindings for the L2 cache controller
Changes since v1:
- Make sure arm@kernel.org was included
- Rebased on arm-soc/for-next
- Added a cover letter
- Elaborated a bit on why I removed CLKDEV_LOOKUP
---
Josh Cartwright (5):
zynq: use GIC device tree bindings
zynq: use pl310 device tree bindings
zynq: remove use of CLKDEV_LOOKUP
ARM: annotate VMALLOC_END definition with _AC
zynq: move static peripheral mappings
arch/arm/Kconfig | 1 -
arch/arm/boot/dts/zynq-ep107.dts | 17 +++++++++++++---
arch/arm/include/asm/pgtable.h | 2 +-
arch/arm/mach-zynq/common.c | 23 ++++++++++-----------
arch/arm/mach-zynq/include/mach/clkdev.h | 32 ------------------------------
arch/arm/mach-zynq/include/mach/zynq_soc.h | 29 ++++++++++++---------------
6 files changed, 38 insertions(+), 66 deletions(-)
delete mode 100644 arch/arm/mach-zynq/include/mach/clkdev.h
--
1.8.0
^ permalink raw reply
* [PATCH v3 1/5] zynq: use GIC device tree bindings
From: Josh Cartwright @ 2012-10-24 0:33 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20121024003218.GA31625@beefymiracle.amer.corp.natinst.com>
The Zynq uses the cortex-a9-gic. This eliminates the need to hardcode
register addresses.
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
Cc: John Linn <john.linn@xilinx.com>
---
arch/arm/boot/dts/zynq-ep107.dts | 8 +++++---
arch/arm/mach-zynq/common.c | 7 ++++++-
arch/arm/mach-zynq/include/mach/zynq_soc.h | 2 --
3 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/arm/boot/dts/zynq-ep107.dts b/arch/arm/boot/dts/zynq-ep107.dts
index 37ca192..7bfff4a 100644
--- a/arch/arm/boot/dts/zynq-ep107.dts
+++ b/arch/arm/boot/dts/zynq-ep107.dts
@@ -36,10 +36,12 @@
ranges;
intc: interrupt-controller@f8f01000 {
+ compatible = "arm,cortex-a9-gic";
+ #interrupt-cells = <3>;
+ #address-cells = <1>;
interrupt-controller;
- compatible = "arm,gic";
- reg = <0xF8F01000 0x1000>;
- #interrupt-cells = <2>;
+ reg = <0xF8F01000 0x1000>,
+ <0xF8F00100 0x100>;
};
uart0: uart@e0000000 {
diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
index ab5cfdd..d73963b 100644
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@ -55,12 +55,17 @@ static void __init xilinx_init_machine(void)
of_platform_bus_probe(NULL, zynq_of_bus_ids, NULL);
}
+static struct of_device_id irq_match[] __initdata = {
+ { .compatible = "arm,cortex-a9-gic", .data = gic_of_init, },
+ { }
+};
+
/**
* xilinx_irq_init() - Interrupt controller initialization for the GIC.
*/
static void __init xilinx_irq_init(void)
{
- gic_init(0, 29, SCU_GIC_DIST_BASE, SCU_GIC_CPU_BASE);
+ of_irq_init(irq_match);
}
/* The minimum devices needed to be mapped before the VM system is up and
diff --git a/arch/arm/mach-zynq/include/mach/zynq_soc.h b/arch/arm/mach-zynq/include/mach/zynq_soc.h
index d0d3f8f..3d1c6a6 100644
--- a/arch/arm/mach-zynq/include/mach/zynq_soc.h
+++ b/arch/arm/mach-zynq/include/mach/zynq_soc.h
@@ -35,8 +35,6 @@
#define TTC0_BASE IOMEM(TTC0_VIRT)
#define SCU_PERIPH_BASE IOMEM(SCU_PERIPH_VIRT)
-#define SCU_GIC_CPU_BASE (SCU_PERIPH_BASE + 0x100)
-#define SCU_GIC_DIST_BASE (SCU_PERIPH_BASE + 0x1000)
#define PL310_L2CC_BASE IOMEM(PL310_L2CC_VIRT)
/*
--
1.8.0
^ permalink raw reply related
* [PATCH v3 2/5] zynq: use pl310 device tree bindings
From: Josh Cartwright @ 2012-10-24 0:34 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20121024003218.GA31625@beefymiracle.amer.corp.natinst.com>
The Zynq has a PL310 L2 cache controller. Convert the in-tree uses to
the device tree.
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
Cc: John Linn <john.linn@xilinx.com>
---
arch/arm/boot/dts/zynq-ep107.dts | 9 +++++++++
arch/arm/mach-zynq/common.c | 9 +--------
arch/arm/mach-zynq/include/mach/zynq_soc.h | 4 ----
3 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/arch/arm/boot/dts/zynq-ep107.dts b/arch/arm/boot/dts/zynq-ep107.dts
index 7bfff4a..87204d7 100644
--- a/arch/arm/boot/dts/zynq-ep107.dts
+++ b/arch/arm/boot/dts/zynq-ep107.dts
@@ -44,6 +44,15 @@
<0xF8F00100 0x100>;
};
+ L2: cache-controller {
+ compatible = "arm,pl310-cache";
+ reg = <0xF8F02000 0x1000>;
+ arm,data-latency = <2 3 2>;
+ arm,tag-latency = <2 3 2>;
+ cache-unified;
+ cache-level = <2>;
+ };
+
uart0: uart@e0000000 {
compatible = "xlnx,xuartps";
reg = <0xE0000000 0x1000>;
diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
index d73963b..056091a 100644
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@ -45,12 +45,10 @@ static struct of_device_id zynq_of_bus_ids[] __initdata = {
*/
static void __init xilinx_init_machine(void)
{
-#ifdef CONFIG_CACHE_L2X0
/*
* 64KB way size, 8-way associativity, parity disabled
*/
- l2x0_init(PL310_L2CC_BASE, 0x02060000, 0xF0F0FFFF);
-#endif
+ l2x0_of_init(0x02060000, 0xF0F0FFFF);
of_platform_bus_probe(NULL, zynq_of_bus_ids, NULL);
}
@@ -83,11 +81,6 @@ static struct map_desc io_desc[] __initdata = {
.pfn = __phys_to_pfn(SCU_PERIPH_PHYS),
.length = SZ_8K,
.type = MT_DEVICE,
- }, {
- .virtual = PL310_L2CC_VIRT,
- .pfn = __phys_to_pfn(PL310_L2CC_PHYS),
- .length = SZ_4K,
- .type = MT_DEVICE,
},
#ifdef CONFIG_DEBUG_LL
diff --git a/arch/arm/mach-zynq/include/mach/zynq_soc.h b/arch/arm/mach-zynq/include/mach/zynq_soc.h
index 3d1c6a6..218283a 100644
--- a/arch/arm/mach-zynq/include/mach/zynq_soc.h
+++ b/arch/arm/mach-zynq/include/mach/zynq_soc.h
@@ -25,9 +25,6 @@
#define TTC0_PHYS 0xF8001000
#define TTC0_VIRT TTC0_PHYS
-#define PL310_L2CC_PHYS 0xF8F02000
-#define PL310_L2CC_VIRT PL310_L2CC_PHYS
-
#define SCU_PERIPH_PHYS 0xF8F00000
#define SCU_PERIPH_VIRT SCU_PERIPH_PHYS
@@ -35,7 +32,6 @@
#define TTC0_BASE IOMEM(TTC0_VIRT)
#define SCU_PERIPH_BASE IOMEM(SCU_PERIPH_VIRT)
-#define PL310_L2CC_BASE IOMEM(PL310_L2CC_VIRT)
/*
* Mandatory for CONFIG_LL_DEBUG, UART is mapped virtual = physical
--
1.8.0
^ permalink raw reply related
* [PATCH v3 3/5] zynq: remove use of CLKDEV_LOOKUP
From: Josh Cartwright @ 2012-10-24 0:34 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20121024003218.GA31625@beefymiracle.amer.corp.natinst.com>
The Zynq support in mainline does not (yet) make use of any of the
generic clk or clk lookup functionality. Remove what is upstream for
now, until the out-of-tree implementation is in suitable form for
merging.
An important side effect of this patch is that it allows the building of
a Zynq kernel without running into unresolved symbol problems:
drivers/built-in.o: In function `amba_get_enable_pclk':
clkdev.c:(.text+0x444): undefined reference to `clk_enable'
drivers/built-in.o: In function `amba_remove':
clkdev.c:(.text+0x488): undefined reference to `clk_disable'
drivers/built-in.o: In function `amba_probe':
clkdev.c:(.text+0x540): undefined reference to `clk_disable'
drivers/built-in.o: In function `amba_device_add':
clkdev.c:(.text+0x77c): undefined reference to `clk_disable'
drivers/built-in.o: In function `enable_clock':
clkdev.c:(.text+0x29738): undefined reference to `clk_enable'
drivers/built-in.o: In function `disable_clock':
clkdev.c:(.text+0x29778): undefined reference to `clk_disable'
drivers/built-in.o: In function `__pm_clk_remove':
clkdev.c:(.text+0x297f8): undefined reference to `clk_disable'
drivers/built-in.o: In function `pm_clk_suspend':
clkdev.c:(.text+0x29bc8): undefined reference to `clk_disable'
drivers/built-in.o: In function `pm_clk_resume':
clkdev.c:(.text+0x29c28): undefined reference to `clk_enable'
make[2]: *** [vmlinux] Error 1
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
Cc: John Linn <john.linn@xilinx.com>
---
arch/arm/Kconfig | 1 -
arch/arm/mach-zynq/common.c | 1 -
arch/arm/mach-zynq/include/mach/clkdev.h | 32 --------------------------------
3 files changed, 34 deletions(-)
delete mode 100644 arch/arm/mach-zynq/include/mach/clkdev.h
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index cce4f8d..de70d99 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -946,7 +946,6 @@ config ARCH_ZYNQ
bool "Xilinx Zynq ARM Cortex A9 Platform"
select ARM_AMBA
select ARM_GIC
- select CLKDEV_LOOKUP
select CPU_V7
select GENERIC_CLOCKEVENTS
select ICST
diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
index 056091a..ba48f06 100644
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@ -31,7 +31,6 @@
#include <asm/hardware/cache-l2x0.h>
#include <mach/zynq_soc.h>
-#include <mach/clkdev.h>
#include "common.h"
static struct of_device_id zynq_of_bus_ids[] __initdata = {
diff --git a/arch/arm/mach-zynq/include/mach/clkdev.h b/arch/arm/mach-zynq/include/mach/clkdev.h
deleted file mode 100644
index c6e73d8..0000000
--- a/arch/arm/mach-zynq/include/mach/clkdev.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * arch/arm/mach-zynq/include/mach/clkdev.h
- *
- * Copyright (C) 2011 Xilinx, Inc.
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- */
-
-#ifndef __MACH_CLKDEV_H__
-#define __MACH_CLKDEV_H__
-
-#include <plat/clock.h>
-
-struct clk {
- unsigned long rate;
- const struct clk_ops *ops;
- const struct icst_params *params;
- void __iomem *vcoreg;
-};
-
-#define __clk_get(clk) ({ 1; })
-#define __clk_put(clk) do { } while (0)
-
-#endif
--
1.8.0
^ permalink raw reply related
* [PATCH v3 4/5] ARM: annotate VMALLOC_END definition with _AC
From: Josh Cartwright @ 2012-10-24 0:35 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20121024003218.GA31625@beefymiracle.amer.corp.natinst.com>
This makes the definition of VMALLOC_END suitable for use within
assembly code. This is necessary to allow the use of VMALLOC_END in
defining where the early uart is mapped for use with DEBUG_LL.
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
---
arch/arm/include/asm/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 08c1231..72904a2 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -40,7 +40,7 @@
*/
#define VMALLOC_OFFSET (8*1024*1024)
#define VMALLOC_START (((unsigned long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1))
-#define VMALLOC_END 0xff000000UL
+#define VMALLOC_END _AC(0xff000000,UL)
#define LIBRARY_TEXT_START 0x0c000000
--
1.8.0
^ permalink raw reply related
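For readers unfamiliar with the _AC annotation used in patch 4/5, the
following is a minimal userspace sketch of the idea, not the kernel's
actual header. The SKETCH_ prefixed names and the SKETCH_ASSEMBLY guard
are hypothetical stand-ins chosen for illustration; the kernel keys the
same choice off its own assembly define. In C the suffix is pasted onto
the constant, while in an assembly build it is dropped so the assembler
never sees "UL".

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for the kernel's _AC()/__AC() helpers.
 * The SKETCH_ names and the SKETCH_ASSEMBLY guard are illustrative only.
 */
#ifdef SKETCH_ASSEMBLY
#define SKETCH_AC(x, y)		x		/* assembler: suffix dropped */
#else
#define SKETCH__AC(x, y)	(x##y)		/* C: suffix pasted on */
#define SKETCH_AC(x, y)		SKETCH__AC(x, y)
#endif

#define SKETCH_VMALLOC_END	SKETCH_AC(0xff000000, UL)

/* Returns the constant as C code sees it, checking that it is typed. */
static unsigned long sketch_vmalloc_end(void)
{
	/* With the UL suffix pasted on, the constant is an unsigned long. */
	assert(sizeof(SKETCH_VMALLOC_END) == sizeof(unsigned long));
	return SKETCH_VMALLOC_END;
}
```

The two-level macro is needed so that the arguments are expanded before
the token paste happens.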
* [PATCH v3 5/5] zynq: move static peripheral mappings
From: Josh Cartwright @ 2012-10-24 0:35 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20121024003218.GA31625@beefymiracle.amer.corp.natinst.com>
Shifting them up into the vmalloc region prevents the following warning,
when booting a zynq qemu target with more than 512mb of RAM:
BUG: mapping for 0xe0000000 at 0xe0000000 out of vmalloc space
In addition, it allows for reuse of these mappings when the proper
drivers issue requests via ioremap().
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
Cc: John Linn <john.linn@xilinx.com>
---
arch/arm/mach-zynq/common.c | 6 +++---
arch/arm/mach-zynq/include/mach/zynq_soc.h | 23 +++++++++++++----------
2 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
index ba48f06..ba8d14f 100644
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@ -73,12 +73,12 @@ static struct map_desc io_desc[] __initdata = {
{
.virtual = TTC0_VIRT,
.pfn = __phys_to_pfn(TTC0_PHYS),
- .length = SZ_4K,
+ .length = TTC0_SIZE,
.type = MT_DEVICE,
}, {
.virtual = SCU_PERIPH_VIRT,
.pfn = __phys_to_pfn(SCU_PERIPH_PHYS),
- .length = SZ_8K,
+ .length = SCU_PERIPH_SIZE,
.type = MT_DEVICE,
},
@@ -86,7 +86,7 @@ static struct map_desc io_desc[] __initdata = {
{
.virtual = UART0_VIRT,
.pfn = __phys_to_pfn(UART0_PHYS),
- .length = SZ_4K,
+ .length = UART0_SIZE,
.type = MT_DEVICE,
},
#endif
diff --git a/arch/arm/mach-zynq/include/mach/zynq_soc.h b/arch/arm/mach-zynq/include/mach/zynq_soc.h
index 218283a..c6b9b67 100644
--- a/arch/arm/mach-zynq/include/mach/zynq_soc.h
+++ b/arch/arm/mach-zynq/include/mach/zynq_soc.h
@@ -15,27 +15,30 @@
#ifndef __MACH_XILINX_SOC_H__
#define __MACH_XILINX_SOC_H__
+#include <asm/pgtable.h>
+
#define PERIPHERAL_CLOCK_RATE 2500000
-/* For now, all mappings are flat (physical = virtual)
+/* Static peripheral mappings are mapped at the top of the
+ * vmalloc region
*/
-#define UART0_PHYS 0xE0000000
-#define UART0_VIRT UART0_PHYS
+#define UART0_PHYS 0xE0000000
+#define UART0_SIZE SZ_4K
+#define UART0_VIRT (VMALLOC_END - UART0_SIZE)
-#define TTC0_PHYS 0xF8001000
-#define TTC0_VIRT TTC0_PHYS
+#define TTC0_PHYS 0xF8001000
+#define TTC0_SIZE SZ_4K
+#define TTC0_VIRT (UART0_VIRT - TTC0_SIZE)
-#define SCU_PERIPH_PHYS 0xF8F00000
-#define SCU_PERIPH_VIRT SCU_PERIPH_PHYS
+#define SCU_PERIPH_PHYS 0xF8F00000
+#define SCU_PERIPH_SIZE SZ_8K
+#define SCU_PERIPH_VIRT (TTC0_VIRT - SCU_PERIPH_SIZE)
/* The following are intended for the devices that are mapped early */
#define TTC0_BASE IOMEM(TTC0_VIRT)
#define SCU_PERIPH_BASE IOMEM(SCU_PERIPH_VIRT)
-/*
- * Mandatory for CONFIG_LL_DEBUG, UART is mapped virtual = physical
- */
#define LL_UART_PADDR UART0_PHYS
#define LL_UART_VADDR UART0_VIRT
--
1.8.0
^ permalink raw reply related
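The address arithmetic introduced by patch 5/5 can be checked with a
small userspace sketch. It assumes VMALLOC_END is 0xff000000 (the value
from patch 4/5) and that SZ_4K and SZ_8K are 0x1000 and 0x2000; all
SKETCH_ names are hypothetical copies made so the example builds outside
the kernel. It only shows that the three windows are stacked downwards
from VMALLOC_END without gaps or overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, copied here so the sketch builds outside the kernel. */
#define SKETCH_VMALLOC_END	0xff000000UL
#define SKETCH_SZ_4K		0x1000UL
#define SKETCH_SZ_8K		0x2000UL

/* Same stacking scheme as the patched zynq_soc.h. */
#define SKETCH_UART0_SIZE	SKETCH_SZ_4K
#define SKETCH_UART0_VIRT	(SKETCH_VMALLOC_END - SKETCH_UART0_SIZE)
#define SKETCH_TTC0_SIZE	SKETCH_SZ_4K
#define SKETCH_TTC0_VIRT	(SKETCH_UART0_VIRT - SKETCH_TTC0_SIZE)
#define SKETCH_SCU_PERIPH_SIZE	SKETCH_SZ_8K
#define SKETCH_SCU_PERIPH_VIRT	(SKETCH_TTC0_VIRT - SKETCH_SCU_PERIPH_SIZE)

struct sketch_mapping {
	const char *name;
	unsigned long virt;
	unsigned long size;
};

/* Listed from the highest virtual address downwards. */
static const struct sketch_mapping sketch_map[] = {
	{ "uart0",      SKETCH_UART0_VIRT,      SKETCH_UART0_SIZE },
	{ "ttc0",       SKETCH_TTC0_VIRT,       SKETCH_TTC0_SIZE },
	{ "scu_periph", SKETCH_SCU_PERIPH_VIRT, SKETCH_SCU_PERIPH_SIZE },
};

/* Returns 1 if each window ends exactly where the previous one starts. */
static int sketch_layout_is_packed(void)
{
	unsigned long top = SKETCH_VMALLOC_END;
	size_t i;

	for (i = 0; i < sizeof(sketch_map) / sizeof(sketch_map[0]); i++) {
		assert(sketch_map[i].size != 0);
		if (sketch_map[i].virt + sketch_map[i].size != top)
			return 0;
		top = sketch_map[i].virt;
	}
	return 1;
}
```

Under these assumptions the windows land at 0xfefff000 (UART0),
0xfeffe000 (TTC0) and 0xfeffc000 (SCU peripherals).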
* [RFC] Energy/power monitoring within the kernel
From: Thomas Renninger @ 2012-10-24 0:40 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1351013449.9070.5.camel@hornet>
Hi,
On Tuesday, October 23, 2012 06:30:49 PM Pawel Moll wrote:
> Greetings All,
>
> More and more of people are getting interested in the subject of power
> (energy) consumption monitoring. We have some external tools like
> "battery simulators", energy probes etc., but some targets can measure
> their power usage on their own.
>
> Traditionally such data should be exposed to the user via hwmon sysfs
> interface, and that's exactly what I did for "my" platform - I have
> a /sys/class/hwmon/hwmon*/device/energy*_input and this was good
> enough to draw pretty graphs in userspace. Everyone was happy...
>
> Now I am getting new requests to do more with this data. In particular
> I'm asked how to add such information to ftrace/perf output.
Why? What is the gain?
Perf events can be triggered at any point in the kernel.
A cpufreq event is triggered when the frequency gets changed.
CPU idle events are triggered when the kernel requests to enter an idle state
or exits one.
When would you trigger a thermal or a power event?
There is the possibility of (critical) thermal limits.
But if I understand this correctly you want this for debugging and
I guess you have everything interesting one can do with temperature
values:
- read the temperature
- draw some nice graphs from the results
Hm, I guess I know what you want to do:
In your temperature/energy graph, you want to have some dots
when relevant HW states (frequency, sleep states, DDR power,...)
changed. Then you are able to see the effects over a timeline.
So you have to bring the existing frequency/idle perf events together
with temperature readings.
Cleanest solution could be to enhance the existing userspace apps
(pytimechart/perf timechart) and let them add another line
(temperature/energy), but the data would not come from perf, but
from sysfs/hwmon.
Not sure whether this works out with the timechart tools.
Anyway, this sounds like a userspace only problem.
Thomas
^ permalink raw reply
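As a rough illustration of the userspace-only approach suggested above,
a tool that samples an energy*_input attribute could derive the extra
graph line itself. The sketch below assumes the hwmon convention that
energy*_input is a cumulative counter in microjoules and that the tool
timestamps each read in nanoseconds; the struct and function names are
hypothetical and no existing timechart tool is implied to work this way.

```c
#include <assert.h>
#include <stdint.h>

/* One userspace sample: cumulative energy plus the time it was read. */
struct sketch_sample {
	uint64_t energy_uj;	/* assumed unit: microjoules, cumulative */
	uint64_t time_ns;	/* timestamp taken by the sampling tool */
};

/*
 * Average power in microwatts between two samples, or -1.0 if the
 * samples are unusable (time did not advance, or the counter went
 * backwards, e.g. after a wrap or reset).
 */
static double sketch_avg_power_uw(const struct sketch_sample *a,
				  const struct sketch_sample *b)
{
	double delta_uj, delta_s;

	assert(a && b);
	if (b->time_ns <= a->time_ns || b->energy_uj < a->energy_uj)
		return -1.0;

	delta_uj = (double)(b->energy_uj - a->energy_uj);
	delta_s = (double)(b->time_ns - a->time_ns) / 1e9;
	return delta_uj / delta_s;	/* uJ per second is uW */
}
```

Each such value could then be plotted against the timestamps of the
existing frequency/idle perf events.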
* [GIT PULL] ARM: OMAP: PM fixes for v3.7-rc3
From: Tony Lindgren @ 2012-10-24 1:37 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <87d3087s5k.fsf@deeprootsystems.com>
* Kevin Hilman <khilman@deeprootsystems.com> [121023 15:15]:
> Tony,
>
> Here are a few more PM-related fixes for v3.7-rc
>
> Kevin
>
>
> The following changes since commit 6f0c0580b70c89094b3422ba81118c7b959c7556:
>
> Linux 3.7-rc2 (2012-10-20 12:11:32 -0700)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm.git tags/for_3.7-rc3-fixes-pm
>
> for you to fetch changes up to 65bf7ca0005d7d827596d5df28583c83c9158da6:
>
> ARM: OMAP3: Beagle: fix OPP customization and initcall ordering (2012-10-22 16:01:42 -0700)
>
> ----------------------------------------------------------------
> Misc. OMAP PM-related fixes for v3.7-rc
Thanks, pulling into omap-for-v3.7-rc1/fixes.
Regards,
Tony
> ----------------------------------------------------------------
> Kevin Hilman (2):
> ARM: OMAP2: UART: fix console UART mismatched runtime PM status
> ARM: OMAP3: Beagle: fix OPP customization and initcall ordering
>
> Paul Walmsley (1):
> ARM: OMAP3: PM: apply part of the erratum i582 workaround
>
> arch/arm/mach-omap2/board-omap3beagle.c | 22 +++++++++++++---------
> arch/arm/mach-omap2/pm.h | 1 +
> arch/arm/mach-omap2/pm34xx.c | 30 ++++++++++++++++++++++++++++--
> arch/arm/mach-omap2/serial.c | 5 +++++
> 4 files changed, 47 insertions(+), 11 deletions(-)
^ permalink raw reply
* [PATCH] ARM: AM33XX: Fix configuration of dmtimer parent clock by dmtimer driver
From: Tony Lindgren @ 2012-10-24 2:01 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1350500155-22525-1-git-send-email-jon-hunter@ti.com>
* Jon Hunter <jon-hunter@ti.com> [121017 11:57]:
> From: Vaibhav Hiremath <hvaibhav@ti.com>
>
> Add dmtimer clock aliases for AM33XX devices so that the parent clock for
> the dmtimer can be set correctly by the dmtimer driver. Without these clock
> aliases the dmtimer driver will fail to find the parent clocks for the dmtimer.
>
> Verified that DMTIMERs can be successfully requested on AM335x beagle bone.
>
> Original patch was provided by Vaibhav Hiremath [1]. Changelog and
> additional verification performed by Jon Hunter.
>
> [1] http://marc.info/?l=linux-omap&m=134693631608018&w=2
>
> Signed-off-by: Vaibhav Hiremath <hvaibhav@ti.com>
> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
> Tested-by: Jon Hunter <jon-hunter@ti.com>
Thanks, applying into omap-for-v3.7-rc2/fixes.
Regards,
Tony
> ---
> arch/arm/mach-omap2/clock33xx_data.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm/mach-omap2/clock33xx_data.c b/arch/arm/mach-omap2/clock33xx_data.c
> index 114ab4b..1a45d6b 100644
> --- a/arch/arm/mach-omap2/clock33xx_data.c
> +++ b/arch/arm/mach-omap2/clock33xx_data.c
> @@ -1073,6 +1073,8 @@ static struct omap_clk am33xx_clks[] = {
> CLK(NULL, "gfx_fck_div_ck", &gfx_fck_div_ck, CK_AM33XX),
> CLK(NULL, "sysclkout_pre_ck", &sysclkout_pre_ck, CK_AM33XX),
> CLK(NULL, "clkout2_ck", &clkout2_ck, CK_AM33XX),
> + CLK(NULL, "timer_32k_ck", &clkdiv32k_ick, CK_AM33XX),
> + CLK(NULL, "timer_sys_ck", &sys_clkin_ck, CK_AM33XX),
> };
>
> int __init am33xx_clk_init(void)
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH v4 2/5] ARM: EXYNOS: Correct combined IRQs for exynos4
From: Chanho Park @ 2012-10-24 2:20 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <020c01cdb126$85c9c940$915d5bc0$%kim@samsung.com>
> -----Original Message-----
> From: linux-arm-kernel-bounces at lists.infradead.org [mailto:linux-arm-
> kernel-bounces at lists.infradead.org] On Behalf Of Kukjin Kim
> Sent: Tuesday, October 23, 2012 10:59 PM
> To: 'Chanho Park'; ben-linux at fluff.org; linux-arm-kernel at lists.infradead.org;
> linux-samsung-soc at vger.kernel.org
> Cc: sachin.kamat at linaro.org; will.deacon at arm.com;
> kyungmin.park at samsung.com; linux at arm.linux.org.uk;
> thomas.abraham at linaro.org
> Subject: RE: [PATCH v4 2/5] ARM: EXYNOS: Correct combined IRQs for
> exynos4
>
> Chanho Park wrote:
> >
> > This patch corrects combined IRQs for exynos4 series platform. The
> > exynos4412
> > has four extra combined irq groups and the exynos4212 has two more
> > combined irqs than exynos4210. Each irq is mapped to IRQ_SPI(xx).
> > Unfortunately, the extra 4 combined IRQs aren't sequential. So, we need to
> > map the irqs manually.
> >
> > Signed-off-by: Chanho Park <chanho61.park@samsung.com>
> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > ---
> > arch/arm/mach-exynos/common.c | 42
> +++++++++++++++++++++++++----
> > -
> > arch/arm/mach-exynos/include/mach/irqs.h | 4 ++-
> > 2 files changed, 39 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/arm/mach-exynos/common.c
> > b/arch/arm/mach-exynos/common.c index 709245e..fdd582a 100644
> > --- a/arch/arm/mach-exynos/common.c
> > +++ b/arch/arm/mach-exynos/common.c
> > @@ -560,23 +560,50 @@ static struct irq_domain_ops
> > combiner_irq_domain_ops = {
> > .map = combiner_irq_domain_map,
> > };
> >
> > +static unsigned int combiner_extra_irq(int group)
>
> This is only for exynos4212 and exynos4412 so how about to use
> exynos4x12_combiner_extra_irq()?
I agree with you. I'll change it in next patchset.
>
> > +{
> > + switch (group) {
> > + case 16:
> > + return IRQ_SPI(107);
> > + case 17:
> > + return IRQ_SPI(108);
> > + case 18:
> > + return IRQ_SPI(48);
> > + case 19:
> > + return IRQ_SPI(42);
> > + default:
> > + return 0;
> > + }
> > +}
> > +
> > +static unsigned int max_combiner_nr(void) {
> > + if (soc_is_exynos5250())
> > + return EXYNOS5_MAX_COMBINER_NR;
> > + else if (soc_is_exynos4412())
> > + return EXYNOS4_MAX_COMBINER_NR;
>
> EXYNOS4412_MAX_COMBINER_NR is more clear?
EXYNOS4_MAX_COMBINER_NR is defined for MAX_COMBINER_NR, which determines the maximum number of combined IRQs.
In this situation, EXYNOS4_MAX_COMBINER_NR is clearer than EXYNOS4412_xx.
How about this? I think it's clearer in all cases.
-#define EXYNOS4_MAX_COMBINER_NR 16
+#define EXYNOS4210_MAX_COMBINER_NR 16
+#define EXYNOS4212_MAX_COMBINER_NR 18
+#define EXYNOS4412_MAX_COMBINER_NR 20
+#define EXYNOS4_MAX_COMBINER_NR EXYNOS4412_MAX_COMBINER_NR
>
> > + else if (soc_is_exynos4212())
> > + return EXYNOS4212_MAX_COMBINER_NR;
> > + else
> > + return EXYNOS4210_MAX_COMBINER_NR;
> > +}
> > +
> > static void __init combiner_init(void __iomem *combiner_base,
> > struct device_node *np)
> > {
> > int i, irq, irq_base;
> > unsigned int max_nr, nr_irq;
> >
> > + max_nr = max_combiner_nr();
> > +
> > if (np) {
> > if (of_property_read_u32(np, "samsung,combiner-nr",
> &max_nr))
> > {
> > pr_warning("%s: number of combiners not specified,
> "
>
> Hmm...the message should be changed, because it is just defined by
> checking SoC with this changes not property of device tree...So how about
> just using
> pr_info() with proper message?
I agree with you. I'll fix it.
>
> > "setting default as %d.\n",
> > - __func__, EXYNOS4_MAX_COMBINER_NR);
> > - max_nr = EXYNOS4_MAX_COMBINER_NR;
> > + __func__, max_nr);
> > }
> > - } else {
> > - max_nr = soc_is_exynos5250() ?
> EXYNOS5_MAX_COMBINER_NR :
> > -
> EXYNOS4_MAX_COMBINER_NR;
> > }
> > +
> > nr_irq = max_nr * MAX_IRQ_IN_COMBINER;
> >
> > irq_base = irq_alloc_descs(COMBINER_IRQ(0, 0), 1, nr_irq, 0); @@
> > -593,7 +620,10 @@ static void __init combiner_init(void __iomem
> > *combiner_base,
> > }
> >
> > for (i = 0; i < max_nr; i++) {
> > - irq = IRQ_SPI(i);
> > + if (i < EXYNOS4210_MAX_COMBINER_NR ||
> soc_is_exynos5250())
> > + irq = IRQ_SPI(i);
> > + else
> > + irq = combiner_extra_irq(i);
> > #ifdef CONFIG_OF
> > if (np)
> > irq = irq_of_parse_and_map(np, i); diff --git
> > a/arch/arm/mach-exynos/include/mach/irqs.h b/arch/arm/mach-
> > exynos/include/mach/irqs.h index 35bced6..3a83546 100644
> > --- a/arch/arm/mach-exynos/include/mach/irqs.h
> > +++ b/arch/arm/mach-exynos/include/mach/irqs.h
> > @@ -165,7 +165,9 @@
> > #define EXYNOS4_IRQ_FIMD0_VSYNC COMBINER_IRQ(11, 1)
> > #define EXYNOS4_IRQ_FIMD0_SYSTEM COMBINER_IRQ(11, 2)
> >
> > -#define EXYNOS4_MAX_COMBINER_NR 16
> > +#define EXYNOS4210_MAX_COMBINER_NR 16
> > +#define EXYNOS4212_MAX_COMBINER_NR 18
> > +#define EXYNOS4_MAX_COMBINER_NR 20
>
> EXYNOS4412_MAX_COMBINER_NR ?
>
> >
> > #define EXYNOS4_IRQ_GPIO1_NR_GROUPS 16
> > #define EXYNOS4_IRQ_GPIO2_NR_GROUPS 9
> > --
> > 1.7.9.5
>
>
>
> Thanks.
>
> Best regards,
> Kgene.
> --
> Kukjin Kim <kgene.kim@samsung.com>, Senior Engineer, SW Solution
> Development Team, Samsung Electronics Co., Ltd.
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
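To make the naming proposal in the reply above concrete, here is a
standalone sketch of how the per-SoC limits and the non-sequential extra
groups fit together. It is not the patch itself: enum sketch_soc and the
sketch_ functions are hypothetical, exynos5250 is left out, and
SKETCH_IRQ_SPI() assumes shared peripheral interrupts start at 32. The
group-to-SPI numbers are the ones quoted from the patch.

```c
#include <assert.h>

/* Hypothetical SoC selector standing in for the soc_is_exynos*() checks. */
enum sketch_soc { SKETCH_EXYNOS4210, SKETCH_EXYNOS4212, SKETCH_EXYNOS4412 };

/* Assumption: shared peripheral interrupts start at IRQ number 32. */
#define SKETCH_IRQ_SPI(x)		((x) + 32)

/* Names as proposed in the reply. */
#define EXYNOS4210_MAX_COMBINER_NR	16
#define EXYNOS4212_MAX_COMBINER_NR	18
#define EXYNOS4412_MAX_COMBINER_NR	20
#define EXYNOS4_MAX_COMBINER_NR		EXYNOS4412_MAX_COMBINER_NR

static unsigned int sketch_max_combiner_nr(enum sketch_soc soc)
{
	switch (soc) {
	case SKETCH_EXYNOS4412:
		return EXYNOS4412_MAX_COMBINER_NR;
	case SKETCH_EXYNOS4212:
		return EXYNOS4212_MAX_COMBINER_NR;
	default:
		return EXYNOS4210_MAX_COMBINER_NR;
	}
}

/* Parent IRQ for a combiner group, or 0 if the SoC has no such group. */
static unsigned int sketch_combiner_parent_irq(enum sketch_soc soc,
					       unsigned int group)
{
	assert(soc <= SKETCH_EXYNOS4412);

	if (group >= sketch_max_combiner_nr(soc))
		return 0;
	if (group < EXYNOS4210_MAX_COMBINER_NR)
		return SKETCH_IRQ_SPI(group);	/* groups 0..15 are sequential */

	switch (group) {			/* extra exynos4x12 groups */
	case 16:
		return SKETCH_IRQ_SPI(107);
	case 17:
		return SKETCH_IRQ_SPI(108);
	case 18:
		return SKETCH_IRQ_SPI(48);
	case 19:
		return SKETCH_IRQ_SPI(42);
	default:
		return 0;
	}
}
```

Keeping EXYNOS4_MAX_COMBINER_NR as an alias of the largest value means
code that sizes arrays for any exynos4 part does not need to change.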
* [PATCH v2 0/7] crypto: omap-sham updates
From: Mark A. Greer @ 2012-10-24 2:36 UTC (permalink / raw)
To: linux-arm-kernel
From: "Mark A. Greer" <mgreer@animalcreek.com>
Changes since v1:
- Removed the check of CM_IDLEST to see if the module exists
and instead add the hwmod data for all omap2's and omap3 GP's.
- Placed new sha_ick clk entries after the 'omap-sham' entry
in the clockxxx_data.c files
- Removed cpu_is_xxx() checks in
arch/arm/mach-omap2/devices.c:omap_init_sham()
- Rebased on the latest k.o. kernel
This series updates the crypto omap-sham driver and supporting
infrastructure.
Notes:
a) Based on current k.o. 2d1f4c8 (Merge branch 'drm-fixes' of
git://people.freedesktop.org/~airlied/linux)
b) Since these patches will likely go through the OMAP tree (and not
through the crypto tree), it would be nice if the crypto guy(s)
would ACK or NACK patches 5-7 which modify the
drivers/crypto/omap-sham.c driver.
c) These have only been tested on an omap2420 h4 and an am37x evm. If you
have different hardware available and a few minutes, please test them.
A quick and easy test is to enable tcrypt as a module
(CONFIG_CRYPTO_TEST=m), boot, then run 'modprobe tcrypt sec=2 mode=403'.
'CONFIG_CRYPTO_SHA1' and 'CONFIG_CRYPTO_DEV_OMAP_SHAM' also have to be
enabled. A quick 'grep omap-sham /proc/interrupts' will tell you if
the omap-sham driver was really used.
d) To test these patches, you will likely need...
i) The patch included here:
http://marc.info/?l=kernel-janitors&m=134910841909057&w=2
ii) This patch from linux-omap/master:
27615a9 (ARM: OMAP: Trivial driver changes to remove include
plat/cpu.h)
iii) This patch from Paul Walmsley:
http://www.spinics.net/lists/linux-omap/msg79436.html
e) If you prefer, a version you can test is available at
git@github.com:mgreeraz/linux-mag.git mag/wip/crypto/test
f) There is a reduction in DMA performance after switching to dmaengine
(see http://www.spinics.net/lists/linux-omap/msg79855.html)
g) Many thanks to Jon Hunter for testing on his omap2420 h4.
Mark A. Greer (7):
ARM: OMAP2xxx: hwmod: Convert SHAM crypto device data to hwmod
ARM: OMAP2xxx: hwmod: Add DMA information for SHAM module
ARM: OMAP3xxx: hwmod: Convert SHAM crypto device data to hwmod
ARM: OMAP2+: Remove unnecessary message when no SHA IP is present
crypto: omap-sham: Convert to use pm_runtime API
crypto: omap-sham: Add code to use dmaengine API
crypto: omap_sham: Remove usage of private DMA API
arch/arm/mach-omap2/clock2420_data.c | 1 +
arch/arm/mach-omap2/clock2430_data.c | 1 +
arch/arm/mach-omap2/clock3xxx_data.c | 1 +
arch/arm/mach-omap2/devices.c | 81 +++------
arch/arm/mach-omap2/omap_hwmod_2420_data.c | 1 +
arch/arm/mach-omap2/omap_hwmod_2430_data.c | 1 +
.../mach-omap2/omap_hwmod_2xxx_interconnect_data.c | 18 ++
arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c | 43 +++++
arch/arm/mach-omap2/omap_hwmod_3xxx_data.c | 60 +++++++
arch/arm/mach-omap2/omap_hwmod_common_data.h | 2 +
drivers/crypto/omap-sham.c | 192 +++++++++++----------
11 files changed, 250 insertions(+), 151 deletions(-)
--
1.7.12
^ permalink raw reply
* [PATCH v2 1/7] ARM: OMAP2xxx: hwmod: Convert SHAM crypto device data to hwmod
From: Mark A. Greer @ 2012-10-24 2:36 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1351046167-4882-1-git-send-email-mgreer@animalcreek.com>
From: "Mark A. Greer" <mgreer@animalcreek.com>
Convert the device data for the OMAP2 SHAM crypto IP from
explicit platform_data to hwmod.
CC: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
---
arch/arm/mach-omap2/clock2430_data.c | 1 +
arch/arm/mach-omap2/devices.c | 34 ++++++++------------
arch/arm/mach-omap2/omap_hwmod_2420_data.c | 1 +
arch/arm/mach-omap2/omap_hwmod_2430_data.c | 1 +
.../mach-omap2/omap_hwmod_2xxx_interconnect_data.c | 18 +++++++++++
arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c | 37 ++++++++++++++++++++++
arch/arm/mach-omap2/omap_hwmod_common_data.h | 2 ++
7 files changed, 73 insertions(+), 21 deletions(-)
diff --git a/arch/arm/mach-omap2/clock2430_data.c b/arch/arm/mach-omap2/clock2430_data.c
index 22404fe..4d52ec6 100644
--- a/arch/arm/mach-omap2/clock2430_data.c
+++ b/arch/arm/mach-omap2/clock2430_data.c
@@ -1993,6 +1993,7 @@ static struct omap_clk omap2430_clks[] = {
CLK(NULL, "sdrc_ick", &sdrc_ick, CK_243X),
CLK(NULL, "des_ick", &des_ick, CK_243X),
CLK("omap-sham", "ick", &sha_ick, CK_243X),
+ CLK(NULL, "sha_ick", &sha_ick, CK_243X),
CLK("omap_rng", "ick", &rng_ick, CK_243X),
CLK(NULL, "rng_ick", &rng_ick, CK_243X),
CLK("omap-aes", "ick", &aes_ick, CK_243X),
diff --git a/arch/arm/mach-omap2/devices.c b/arch/arm/mach-omap2/devices.c
index cba60e0..f18fa50 100644
--- a/arch/arm/mach-omap2/devices.c
+++ b/arch/arm/mach-omap2/devices.c
@@ -34,6 +34,8 @@
#include "mux.h"
#include "control.h"
#include "devices.h"
+#include "cm2xxx_3xxx.h"
+#include "cm-regbits-24xx.h"
#define L3_MODULES_MAX_LEN 12
#define L3_MODULES 3
@@ -453,24 +455,6 @@ static void omap_init_rng(void)
#if defined(CONFIG_CRYPTO_DEV_OMAP_SHAM) || defined(CONFIG_CRYPTO_DEV_OMAP_SHAM_MODULE)
-#ifdef CONFIG_ARCH_OMAP2
-static struct resource omap2_sham_resources[] = {
- {
- .start = OMAP24XX_SEC_SHA1MD5_BASE,
- .end = OMAP24XX_SEC_SHA1MD5_BASE + 0x64,
- .flags = IORESOURCE_MEM,
- },
- {
- .start = 51 + OMAP_INTC_START,
- .flags = IORESOURCE_IRQ,
- }
-};
-static int omap2_sham_resources_sz = ARRAY_SIZE(omap2_sham_resources);
-#else
-#define omap2_sham_resources NULL
-#define omap2_sham_resources_sz 0
-#endif
-
#ifdef CONFIG_ARCH_OMAP3
static struct resource omap3_sham_resources[] = {
{
@@ -501,16 +485,24 @@ static struct platform_device sham_device = {
static void omap_init_sham(void)
{
if (cpu_is_omap24xx()) {
- sham_device.resource = omap2_sham_resources;
- sham_device.num_resources = omap2_sham_resources_sz;
+ struct omap_hwmod *oh;
+ struct platform_device *pdev;
+
+ oh = omap_hwmod_lookup("sham");
+ if (!oh)
+ return;
+
+ pdev = omap_device_build("omap-sham", -1, oh, NULL, 0, NULL,
+ 0, 0);
+ WARN(IS_ERR(pdev), "Can't build omap_device for omap-sham\n");
} else if (cpu_is_omap34xx()) {
sham_device.resource = omap3_sham_resources;
sham_device.num_resources = omap3_sham_resources_sz;
+ platform_device_register(&sham_device);
} else {
pr_err("%s: platform not supported\n", __func__);
return;
}
- platform_device_register(&sham_device);
}
#else
static inline void omap_init_sham(void) { }
diff --git a/arch/arm/mach-omap2/omap_hwmod_2420_data.c b/arch/arm/mach-omap2/omap_hwmod_2420_data.c
index b5db600..b102a53 100644
--- a/arch/arm/mach-omap2/omap_hwmod_2420_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_2420_data.c
@@ -603,6 +603,7 @@ static struct omap_hwmod_ocp_if *omap2420_hwmod_ocp_ifs[] __initdata = {
&omap2420_l4_core__mcbsp2,
&omap2420_l4_core__msdi1,
&omap2xxx_l4_core__rng,
+ &omap2xxx_l4_core__sham,
&omap2420_l4_core__hdq1w,
&omap2420_l4_wkup__counter_32k,
&omap2420_l3__gpmc,
diff --git a/arch/arm/mach-omap2/omap_hwmod_2430_data.c b/arch/arm/mach-omap2/omap_hwmod_2430_data.c
index c455e41..b1ce7b0 100644
--- a/arch/arm/mach-omap2/omap_hwmod_2430_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_2430_data.c
@@ -963,6 +963,7 @@ static struct omap_hwmod_ocp_if *omap2430_hwmod_ocp_ifs[] __initdata = {
&omap2430_l4_core__mcbsp5,
&omap2430_l4_core__hdq1w,
&omap2xxx_l4_core__rng,
+ &omap2xxx_l4_core__sham,
&omap2430_l4_wkup__counter_32k,
&omap2430_l3__gpmc,
NULL,
diff --git a/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c b/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c
index 1a1287d..bb314c5 100644
--- a/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c
@@ -138,6 +138,15 @@ static struct omap_hwmod_addr_space omap2_rng_addr_space[] = {
{ }
};
+struct omap_hwmod_addr_space omap2xxx_sham_addrs[] = {
+ {
+ .pa_start = 0x480a4000,
+ .pa_end = 0x480a4000 + 0x64 - 1,
+ .flags = ADDR_TYPE_RT
+ },
+ { }
+};
+
/*
* Common interconnect data
*/
@@ -389,3 +398,12 @@ struct omap_hwmod_ocp_if omap2xxx_l4_core__rng = {
.addr = omap2_rng_addr_space,
.user = OCP_USER_MPU | OCP_USER_SDMA,
};
+
+/* l4 core -> sham interface */
+struct omap_hwmod_ocp_if omap2xxx_l4_core__sham = {
+ .master = &omap2xxx_l4_core_hwmod,
+ .slave = &omap2xxx_sham_hwmod,
+ .clk = "sha_ick",
+ .addr = omap2xxx_sham_addrs,
+ .user = OCP_USER_MPU,
+};
diff --git a/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c b/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
index bd9220e..a041670 100644
--- a/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
@@ -851,3 +851,40 @@ struct omap_hwmod omap2xxx_rng_hwmod = {
.flags = HWMOD_INIT_NO_RESET,
.class = &omap2_rng_hwmod_class,
};
+
+/* SHAM */
+
+static struct omap_hwmod_class_sysconfig omap2_sham_sysc = {
+ .rev_offs = 0x5c,
+ .sysc_offs = 0x60,
+ .syss_offs = 0x64,
+ .sysc_flags = (SYSC_HAS_SOFTRESET | SYSC_HAS_AUTOIDLE |
+ SYSS_HAS_RESET_STATUS),
+ .sysc_fields = &omap_hwmod_sysc_type1,
+};
+
+static struct omap_hwmod_class omap2xxx_sham_class = {
+ .name = "sham",
+ .sysc = &omap2_sham_sysc,
+};
+
+struct omap_hwmod_irq_info omap2_sham_mpu_irqs[] = {
+ { .irq = 51 + OMAP_INTC_START, },
+ { .irq = -1 }
+};
+
+struct omap_hwmod omap2xxx_sham_hwmod = {
+ .name = "sham",
+ .mpu_irqs = omap2_sham_mpu_irqs,
+ .main_clk = "l4_ck",
+ .prcm = {
+ .omap2 = {
+ .module_offs = CORE_MOD,
+ .prcm_reg_id = 4,
+ .module_bit = OMAP24XX_EN_SHA_SHIFT,
+ .idlest_reg_id = 4,
+ .idlest_idle_bit = OMAP24XX_ST_SHA_SHIFT,
+ },
+ },
+ .class = &omap2xxx_sham_class,
+};
diff --git a/arch/arm/mach-omap2/omap_hwmod_common_data.h b/arch/arm/mach-omap2/omap_hwmod_common_data.h
index 2bc8f17..74a7b7a 100644
--- a/arch/arm/mach-omap2/omap_hwmod_common_data.h
+++ b/arch/arm/mach-omap2/omap_hwmod_common_data.h
@@ -78,6 +78,7 @@ extern struct omap_hwmod omap2xxx_mcspi2_hwmod;
extern struct omap_hwmod omap2xxx_counter_32k_hwmod;
extern struct omap_hwmod omap2xxx_gpmc_hwmod;
extern struct omap_hwmod omap2xxx_rng_hwmod;
+extern struct omap_hwmod omap2xxx_sham_hwmod;
/* Common interface data across OMAP2xxx */
extern struct omap_hwmod_ocp_if omap2xxx_l3_main__l4_core;
@@ -105,6 +106,7 @@ extern struct omap_hwmod_ocp_if omap2xxx_l4_core__dss_dispc;
extern struct omap_hwmod_ocp_if omap2xxx_l4_core__dss_rfbi;
extern struct omap_hwmod_ocp_if omap2xxx_l4_core__dss_venc;
extern struct omap_hwmod_ocp_if omap2xxx_l4_core__rng;
+extern struct omap_hwmod_ocp_if omap2xxx_l4_core__sham;
/* Common IP block data */
extern struct omap_hwmod_dma_info omap2_uart1_sdma_reqs[];
--
1.7.12
* [PATCH v2 2/7] ARM: OMAP2xxx: hwmod: Add DMA support for SHAM module
From: Mark A. Greer @ 2012-10-24 2:36 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1351046167-4882-1-git-send-email-mgreer@animalcreek.com>
From: "Mark A. Greer" <mgreer@animalcreek.com>
The current OMAP2 SHAM support doesn't enable DMA.
Add that support so OMAP2 can use DMA just like OMAP3.
CC: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
---
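Editorial note (not part of the commit): the DMA request list added by
this patch ends with a { .dma_req = -1 } entry, so a consumer can walk
it without a separate length field. The following is a minimal,
self-contained sketch of that convention. The struct and function names
are hypothetical stand-ins, not the kernel's definitions, and the
request number 13 is a placeholder rather than the real value of
OMAP24XX_DMA_SHA1MD5_RX.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct omap_hwmod_dma_info. */
struct dma_info_sketch {
	const char *name;
	int dma_req;
};

/* Count the entries that precede the -1 terminator. */
static size_t count_dma_reqs(const struct dma_info_sketch *reqs)
{
	size_t n = 0;

	if (!reqs)
		return 0;
	while (reqs[n].dma_req != -1)
		n++;
	return n;
}

/* Same shape as the array in the patch; 13 is a placeholder value. */
static const struct dma_info_sketch sham_reqs_sketch[] = {
	{ .name = "rx", .dma_req = 13 },
	{ .dma_req = -1 }
};
```

With the sketch above, count_dma_reqs(sham_reqs_sketch) yields one entry,
the single "rx" request line.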
arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c | 2 +-
arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c b/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c
index bb314c5..4b4fd5f 100644
--- a/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_2xxx_interconnect_data.c
@@ -405,5 +405,5 @@ struct omap_hwmod_ocp_if omap2xxx_l4_core__sham = {
.slave = &omap2xxx_sham_hwmod,
.clk = "sha_ick",
.addr = omap2xxx_sham_addrs,
- .user = OCP_USER_MPU,
+ .user = OCP_USER_MPU | OCP_USER_SDMA,
};
diff --git a/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c b/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
index a041670..703b269 100644
--- a/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
@@ -873,9 +873,15 @@ struct omap_hwmod_irq_info omap2_sham_mpu_irqs[] = {
{ .irq = -1 }
};
+struct omap_hwmod_dma_info omap2_sham_sdma_chs[] = {
+ { .name = "rx", .dma_req = OMAP24XX_DMA_SHA1MD5_RX },
+ { .dma_req = -1 }
+};
+
struct omap_hwmod omap2xxx_sham_hwmod = {
.name = "sham",
.mpu_irqs = omap2_sham_mpu_irqs,
+ .sdma_reqs = omap2_sham_sdma_chs,
.main_clk = "l4_ck",
.prcm = {
.omap2 = {
--
1.7.12
* [PATCH v2 3/7] ARM: OMAP3xxx: hwmod: Convert SHAM crypto device data to hwmod
From: Mark A. Greer @ 2012-10-24 2:36 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1351046167-4882-1-git-send-email-mgreer@animalcreek.com>
From: "Mark A. Greer" <mgreer@animalcreek.com>
Convert the device data for the OMAP3 SHAM2 (SHA1/MD5) crypto IP
from explicit platform_data to hwmod.
CC: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
---
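Editorial note (not part of the commit): the removed platform resource
ends at BASE + 0x64, while the new hwmod address space ends at
pa_start + 0x64 - 1. Assuming both end fields are inclusive, as they are
for struct resource, the two forms describe ranges that differ by one
byte. The helper below is a hypothetical illustration of that
arithmetic, not kernel code.

```c
#include <stdint.h>

/* Size in bytes of a range whose end address is inclusive. */
static uint32_t inclusive_range_size(uint32_t start, uint32_t end)
{
	return end - start + 1;
}
```

Under that assumption, the "+ 0x64 - 1" form covers 0x64 bytes and the
"+ 0x64" form covers 0x65 bytes.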
arch/arm/mach-omap2/clock3xxx_data.c | 1 +
arch/arm/mach-omap2/devices.c | 42 ++-------------------
arch/arm/mach-omap2/omap_hwmod_3xxx_data.c | 60 ++++++++++++++++++++++++++++++
3 files changed, 64 insertions(+), 39 deletions(-)
diff --git a/arch/arm/mach-omap2/clock3xxx_data.c b/arch/arm/mach-omap2/clock3xxx_data.c
index 1f42c9d..6f14d9b 100644
--- a/arch/arm/mach-omap2/clock3xxx_data.c
+++ b/arch/arm/mach-omap2/clock3xxx_data.c
@@ -3342,6 +3342,7 @@ static struct omap_clk omap3xxx_clks[] = {
CLK(NULL, "icr_ick", &icr_ick, CK_34XX | CK_36XX),
CLK("omap-aes", "ick", &aes2_ick, CK_34XX | CK_36XX),
CLK("omap-sham", "ick", &sha12_ick, CK_34XX | CK_36XX),
+ CLK(NULL, "sha12_ick", &sha12_ick, CK_34XX | CK_36XX),
CLK(NULL, "des2_ick", &des2_ick, CK_34XX | CK_36XX),
CLK("omap_hsmmc.1", "ick", &mmchs2_ick, CK_3XXX),
CLK("omap_hsmmc.0", "ick", &mmchs1_ick, CK_3XXX),
diff --git a/arch/arm/mach-omap2/devices.c b/arch/arm/mach-omap2/devices.c
index f18fa50..f38ac9d 100644
--- a/arch/arm/mach-omap2/devices.c
+++ b/arch/arm/mach-omap2/devices.c
@@ -36,6 +36,7 @@
#include "devices.h"
#include "cm2xxx_3xxx.h"
#include "cm-regbits-24xx.h"
+#include "cm-regbits-34xx.h"
#define L3_MODULES_MAX_LEN 12
#define L3_MODULES 3
@@ -453,38 +454,9 @@ static void omap_init_rng(void)
WARN(IS_ERR(pdev), "Can't build omap_device for omap_rng\n");
}
-#if defined(CONFIG_CRYPTO_DEV_OMAP_SHAM) || defined(CONFIG_CRYPTO_DEV_OMAP_SHAM_MODULE)
-
-#ifdef CONFIG_ARCH_OMAP3
-static struct resource omap3_sham_resources[] = {
- {
- .start = OMAP34XX_SEC_SHA1MD5_BASE,
- .end = OMAP34XX_SEC_SHA1MD5_BASE + 0x64,
- .flags = IORESOURCE_MEM,
- },
- {
- .start = 49 + OMAP_INTC_START,
- .flags = IORESOURCE_IRQ,
- },
- {
- .start = OMAP34XX_DMA_SHA1MD5_RX,
- .flags = IORESOURCE_DMA,
- }
-};
-static int omap3_sham_resources_sz = ARRAY_SIZE(omap3_sham_resources);
-#else
-#define omap3_sham_resources NULL
-#define omap3_sham_resources_sz 0
-#endif
-
-static struct platform_device sham_device = {
- .name = "omap-sham",
- .id = -1,
-};
-
-static void omap_init_sham(void)
+static void __init omap_init_sham(void)
{
- if (cpu_is_omap24xx()) {
+ if (cpu_is_omap24xx() || cpu_is_omap34xx()) {
struct omap_hwmod *oh;
struct platform_device *pdev;
@@ -495,18 +467,10 @@ static void omap_init_sham(void)
pdev = omap_device_build("omap-sham", -1, oh, NULL, 0, NULL,
0, 0);
WARN(IS_ERR(pdev), "Can't build omap_device for omap-sham\n");
- } else if (cpu_is_omap34xx()) {
- sham_device.resource = omap3_sham_resources;
- sham_device.num_resources = omap3_sham_resources_sz;
- platform_device_register(&sham_device);
} else {
pr_err("%s: platform not supported\n", __func__);
- return;
}
}
-#else
-static inline void omap_init_sham(void) { }
-#endif
#if defined(CONFIG_CRYPTO_DEV_OMAP_AES) || defined(CONFIG_CRYPTO_DEV_OMAP_AES_MODULE)
diff --git a/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c b/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c
index f67b7ee..785a0c5 100644
--- a/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c
@@ -3543,6 +3543,65 @@ static struct omap_hwmod_ocp_if omap3xxx_l3_main__gpmc = {
.user = OCP_USER_MPU | OCP_USER_SDMA,
};
+/* l4_core -> SHAM2 (SHA1/MD5) (similar to omap24xx) */
+static struct omap_hwmod_class_sysconfig omap3_sham_sysc = {
+ .rev_offs = 0x5c,
+ .sysc_offs = 0x60,
+ .syss_offs = 0x64,
+ .sysc_flags = (SYSC_HAS_SOFTRESET | SYSC_HAS_AUTOIDLE |
+ SYSS_HAS_RESET_STATUS),
+ .sysc_fields = &omap_hwmod_sysc_type1,
+};
+
+static struct omap_hwmod_class omap3xxx_sham_class = {
+ .name = "sham",
+ .sysc = &omap3_sham_sysc,
+};
+
+struct omap_hwmod_irq_info omap3_sham_mpu_irqs[] = {
+ { .irq = 49 + OMAP_INTC_START, },
+ { .irq = -1 }
+};
+
+struct omap_hwmod_dma_info omap3_sham_sdma_reqs[] = {
+ { .name = "rx", .dma_req = OMAP34XX_DMA_SHA1MD5_RX, },
+ { .dma_req = -1 }
+};
+
+struct omap_hwmod omap3xxx_sham_hwmod = {
+ .name = "sham",
+ .mpu_irqs = omap3_sham_mpu_irqs,
+ .sdma_reqs = omap3_sham_sdma_reqs,
+ .main_clk = "sha12_ick",
+ .prcm = {
+ .omap2 = {
+ .module_offs = CORE_MOD,
+ .prcm_reg_id = 1,
+ .module_bit = OMAP3430_EN_SHA12_SHIFT,
+ .idlest_reg_id = 1,
+ .idlest_idle_bit = OMAP3430_ST_SHA12_SHIFT,
+ },
+ },
+ .class = &omap3xxx_sham_class,
+};
+
+static struct omap_hwmod_addr_space omap3xxx_sham_addrs[] = {
+ {
+ .pa_start = 0x480c3000,
+ .pa_end = 0x480c3000 + 0x64 - 1,
+ .flags = ADDR_TYPE_RT
+ },
+ { }
+};
+
+static struct omap_hwmod_ocp_if omap3xxx_l4_core__sham = {
+ .master = &omap3xxx_l4_core_hwmod,
+ .slave = &omap3xxx_sham_hwmod,
+ .clk = "sha12_ick",
+ .addr = omap3xxx_sham_addrs,
+ .user = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
static struct omap_hwmod_ocp_if *omap3xxx_hwmod_ocp_ifs[] __initdata = {
&omap3xxx_l3_main__l4_core,
&omap3xxx_l3_main__l4_per,
@@ -3596,6 +3655,7 @@ static struct omap_hwmod_ocp_if *omap3xxx_hwmod_ocp_ifs[] __initdata = {
/* GP-only hwmod links */
static struct omap_hwmod_ocp_if *omap3xxx_gp_hwmod_ocp_ifs[] __initdata = {
&omap3xxx_l4_sec__timer12,
+ &omap3xxx_l4_core__sham,
NULL
};
--
1.7.12
* [PATCH v2 4/7] ARM: OMAP2+: Remove unnecessary message when no SHA IP is present
From: Mark A. Greer @ 2012-10-24 2:36 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1351046167-4882-1-git-send-email-mgreer@animalcreek.com>
From: "Mark A. Greer" <mgreer@animalcreek.com>
Remove the error message that is printed when no SHA IP is
present, to make this consistent with all the other IPs.
CC: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
---
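Editorial note (not part of the commit): after this change the init
function no longer checks the CPU type. It looks the IP up by name and
returns quietly when the lookup finds nothing. The sketch below models
that control flow with hypothetical types and tables; it does not use
or reproduce the real omap_hwmod_lookup() or omap_device_build().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered hwmod entry. */
struct hwmod_sketch {
	const char *name;
};

/* Return the entry with the given name, or NULL if it is absent. */
static const struct hwmod_sketch *
lookup_sketch(const struct hwmod_sketch *tbl, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return &tbl[i];
	return NULL;
}

/* Return 1 if a device would be built, 0 if the IP is absent. */
static int init_sham_sketch(const struct hwmod_sketch *tbl, size_t n)
{
	const struct hwmod_sketch *oh = lookup_sketch(tbl, n, "sham");

	if (!oh)
		return 0;	/* no SHA IP on this SoC: skip without a message */
	return 1;		/* the real code would build the device here */
}

/* Example tables for an SoC with and without the SHA IP. */
static const struct hwmod_sketch with_sham[] = { { "rng" }, { "sham" } };
static const struct hwmod_sketch without_sham[] = { { "rng" } };
```

The design choice here is that the presence of the hwmod entry, rather
than a cpu_is_*() test, decides whether the device is created.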
arch/arm/mach-omap2/devices.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/arch/arm/mach-omap2/devices.c b/arch/arm/mach-omap2/devices.c
index f38ac9d..f41c793 100644
--- a/arch/arm/mach-omap2/devices.c
+++ b/arch/arm/mach-omap2/devices.c
@@ -456,20 +456,15 @@ static void omap_init_rng(void)
static void __init omap_init_sham(void)
{
- if (cpu_is_omap24xx() || cpu_is_omap34xx()) {
- struct omap_hwmod *oh;
- struct platform_device *pdev;
+ struct omap_hwmod *oh;
+ struct platform_device *pdev;
- oh = omap_hwmod_lookup("sham");
- if (!oh)
- return;
+ oh = omap_hwmod_lookup("sham");
+ if (!oh)
+ return;
- pdev = omap_device_build("omap-sham", -1, oh, NULL, 0, NULL,
- 0, 0);
- WARN(IS_ERR(pdev), "Can't build omap_device for omap-sham\n");
- } else {
- pr_err("%s: platform not supported\n", __func__);
- }
+ pdev = omap_device_build("omap-sham", -1, oh, NULL, 0, NULL, 0, 0);
+ WARN(IS_ERR(pdev), "Can't build omap_device for omap-sham\n");
}
#if defined(CONFIG_CRYPTO_DEV_OMAP_AES) || defined(CONFIG_CRYPTO_DEV_OMAP_AES_MODULE)
--
1.7.12