* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Turquette, Mike @ 2011-08-01 22:01 UTC (permalink / raw)
To: myungjoo.ham
Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, Thomas Gleixner,
linux-pm
In-Reply-To: <CAJ0PZbT0+ovjPAnVKaaJNVChG+kU3cgjC9sEU4ppnMGqGL+UoA@mail.gmail.com>
On Sun, Jul 31, 2011 at 11:22 PM, MyungJoo Ham <myungjoo.ham@samsung.com> wrote:
> Hello.
>
> On Sat, Jul 30, 2011 at 10:02 AM, Turquette, Mike <mturquette@ti.com> wrote:
>> On Fri, Jul 29, 2011 at 2:10 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>> On Friday, July 29, 2011, Turquette, Mike wrote:
>>>> On Thu, Jul 28, 2011 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>> > On Friday, July 15, 2011, MyungJoo Ham wrote:
>>>> >> For a usage example, please look at
>>>> >> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>>> >>
>>>> >> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>>>> >> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>>>> >> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>>>> >> and other related clocks simply follow the determined DDR RAM clock.
>>>> >>
>>>> >> The DEVFREQ driver for Exynos4210 memory bus is at
>>>> >> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>>>> >>
>>>> >> MyungJoo Ham (3):
>>>> >> PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>>> >> OPPs
>>>> >> PM / DEVFREQ: add example governors
>>>> >> PM / DEVFREQ: add sysfs interface (including user tickling)
>>>> >
>>>> > OK, I'm going to take the patches for 3.2.
>>>>
>>>> Have any other platforms signed up to use this mechanism to manage
>>>> their peripheral DVFS?
>>>
>>> Not that I know of, but one initial user is sufficient for me.
>>> So if you have anything _against_ the patches, please speak up.
>>
>> I do have some concerns. Let me start by saying that I'm defining a
>> "governor" as some active piece of executing code, probably a looping
>> workqueue that inspects activity/idleness of a device and then makes a
>> determination regarding clock frequency.
>>
>> devfreq seems to be good framework for creating DVFS governors.
>> However I think that most scalable devices on an SoC do *not* need a
>> governor, and many scalable devices won't have performance counters or
>> any other way to implement such introspection.
>
> Yes, governors except for some static or userspace-driven ones (such
> as "performance", "powersave", and "userspace" although "userspace" is
> not implemented for devfreq yet), they loop workqueue that inspects
> activity/idleness of a device and determines frequency. However, the
> inspection is done with a callback provided by each device, not done
> directly by the devfreq itself. Therefore, if there is any way to
> measure the activities (not just performance counters, number of
> requests/function calls should be fine for may cases), normal
> governors like "simple-ondemand" will work.
Maybe I'm not understanding how the devfreq requests would be made
from drivers. Can you explain an example where a single target device
named X has constraints placed on it's clock rate from two different
drivers Y & Z? Imagine in this case that there are no performance
counters or any way in hardware to monitor device saturation.
>> Some examples include a MMC controller, which might change its clock
>> rate depending on the class of card that the user has inserted. Or
>> even a "smartish" device like a GPU lacking performance counters; it's
>> driver will ramp up frequency when there is work to be done and kick
>> off a timeout. If no new work comes in before the timeout then the
>> driver will drop the frequency.
>
> In the "simple MMC controller w/o performance counter" case, there are
> following ways to use devfreq even if using the number of requests or
> functions calls is not possible.
>
> Method 1) use "userspace" governor and let user process choose
> frequency based on the class
I'm less interested in userspace control of MMC controller operating
frequency and much more interested in how devfreq might arbitrate QoS
requests from multiple "client" devices.
> Method 2) use any "reasonable" governor and let the device driver set
> only "valid" frequencies enabled.
Can you elaborate on this? I'm not sure I understand how this will
look in driver code. Maybe the example I requested above will shed
some light.
> For a rough example, we may do if class < 6, disable freq > 40MHz,
> class < 10, disable freq > 80MHz, and so on. If we do not have
> performance counters or any other mechanisms to monitor the
> activities, "performance" governor along with clock-gated MMC driver
> will save enough power.
>
> For GPUs without anything to monitor the activities, we may do the
> same as the MMC case.
>
> However, with the H/W I've got now, (Exynos4210), we have performance
> counters (PPMU) for many blocks: 3D(MALI GPU), ACP, CAMIF, CPU, DMC0,
> DMC1 (memory controllers), FSYS, IMAGE, LCD0, LCD1, MFC_L, MFC_R, TV,
> LEFT_BUS, and RIGHT_BUS. I don't think Exynos4 is an exceptionally
> fancy SoC (already millions are sold for phones) and other mobile SoCs
> (at least for flagship models) will have them very soon (or already
> have them). Along with this patch, in the example with git branch
> link, we control DMC0/DMC1 blocks. And,
I agree devfreq is well-suited for such hardware.
>> A governor is not required in these cases (as they are event driven)
>> and devfreq is quite heavyweight for such applications. What is
>> needed is a QoS-style software layer that allows throughput requests
>> to be made from an initiator device towards a target device. This
>> layer should aggregate requests since many initator devices may make
>> requests to the same target device. This layer I'm describing, which
>> does not exist today, should be where the actual DVFS transition takes
>> place. That could take the form of a clk_set_rate call in the clock
>> framework (as described by Colin in V1 of this series), or some other
>> not-yet-realized dvfs_set_opp ,or something like Jean Pihet's
>> per-device PM QoS patches or whatever. For the purposes of this email
>> I don't really care which framework implements the QoS request
>> aggregation.
>
> Such aggregation could be also done with governors. If the
> governor-device pair does not want to poll devfreq wouldn't loop
> unless there is any governor-device pair that wants to do so. If it is
> event-driven, users may just "allow/disallow" frequencies with OPP
> framework and devfreq will choose proper frequency with the given
> governor for the device. If every device uses "static" or
> "event-driven" governors such as powersave/performance/userspace,
> there will be no polling/looping.
So drivers must disable OPPs, and then the non-polling devfreq
governor will have to be notified by the OPP code and then run it's
->target code again? This sounds backwards to me.
devfreq seems like an ideal bit of code to understand the constraints
needed by a device (via the workqueue/monitor loop) and then request
those needs via the proper API. It seems entirely wrong to me to have
other device drivers send their QoS needs to devfreq.
I'm starting to sound like a broken record though, and I've rescinded
my NAK in my reply to Rafael. If you could explain how multiple
drivers can request their performance needs to a devfreq governor
(same question I asked above) then that would be really helpful.
Thanks,
Mike
> When it is going to be directly controlled by userspace, we'll need a
> "userspace" governor (same with userspace governor of cpufreq).
>
> If there is a QoS request for a devfreq-ed device, the request could
> be done with OPP's frequency enable/disable. If a device is to be
> executed at 400MHz or faster, all frequencies under 400MHz could be
> simply disabled w/ OPP. Devfreq governors cannot override such
> frequency enable/disable configurations.
>
> However, if such QoS requests need delays (timers) like tickle, a
> generalized tickle supplied with frequency or percent of max-frequency
> might work. (i.e., tickle(dev, freuqency, duration); ) Then, this
> generalized tickle will hold at the request frequency or higher by
> disabling lower frequencies temporarily.
>
>>
>> The point of describing this non-existant API is that devfreq should
>> really be just another input into it. A governor that can measure bus
>> saturation is really cool, but it may not yield optimal results
>> compared to several drivers which make QoS-style requests and insure
>> that performance is guaranteed for their particular needs during their
>> transactions. The good news is that we don't have to choose between
>> performance counter introspection and software QoS requests: both the
>> driver requests and the governor should all feed as inputs into the
>> QoS-style DVFS mechanism.
>>
>> Taking that logic to its inevitable conclusion, tickle doesn't belong
>> inside the governor at all. If some device X wants to ramp up the
>> frequency of device Y, it should just make a QoS-style throughput
>> request towards device Y, possibly with a timeout (keeping the
>> original idea of tickle intact). This is entirely a separate idea
>> from a governor's introspective workqueue loop.
>
> Although tickle is sharing the same loop with governors, tickle does
> not belong inside governors. Tickle overrides the decisions of
> governors; governor's decision function is not called if the device is
> being tickled. However, generalizing the tickle function so that it
> may take "at least at xx % of max frequency" or "operate at least xx
> khz" as an option seems reasonable for QoS requests. And such options
> might be implemented for next version of devfreq later. This requires
> modification in tickle function interface or adding another interface
> for tickle function. However, if such QoS requests do not need
> duration set, we can just go with OPP's frequency enable/disable and
> disable lower-than-QoS-requirement frequencies.
>
> Thus, I guess this QoS issue is somewhat not very significant for
> devfreq. And it can be easily mitigated by adding another interface or
> modifying the interface of tickle function.
>
>>
>> For userspace, a sysfs entry for tickle would also not feed into the
>> governor, but some dummy struct device *user would probably be the
>> initiator device and it would simply call the QoS-style throughput
>> API.
>>
>> In summary my objections to this series are:
>> 1) devfreq should not be the *final* software layer to invoke a DVFS
>> transition as it has not taken all constraints into account.
>> 2) a devfreq governor represents just one constraint out of many to be
>> considered for any given scalable device.
>
> If the concern is about the QoS requests, I guess generalizing tickle
> would be sufficient as above. For devices without performance counters
> and any other mechanisms to infer the usage statistics, "performance"
> governor with event-driven OPP freq-enable/disable should be fine.
>
>>
>> My objection to these patches getting merged is that I think they are
>> a bit ahead of their time. We need to know what the real DVFS API
>> looks like underneath devfreq first, since devfreq should really be
>> built on top of it.
>>
>> Regards,
>> Mike
>>
>>> Thanks,
>>> Rafael
>>>
>> _______________________________________________
>> linux-pm mailing list
>> linux-pm@lists.linux-foundation.org
>> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>>
>
> Cheers!
> MyungJoo.
>
>
> --
> MyungJoo Ham (함명주), Ph.D.
> Mobile Software Platform Lab,
> Digital Media and Communications (DMC) Business
> Samsung Electronics
> cell: 82-10-6714-2858
>
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm
^ permalink raw reply
* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Turquette, Mike @ 2011-08-01 21:47 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, MyungJoo Ham,
Thomas Gleixner, linux-pm
In-Reply-To: <201107302323.06066.rjw@sisk.pl>
On Sat, Jul 30, 2011 at 2:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Saturday, July 30, 2011, Turquette, Mike wrote:
>> On Fri, Jul 29, 2011 at 2:10 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > On Friday, July 29, 2011, Turquette, Mike wrote:
>> >> On Thu, Jul 28, 2011 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >> > On Friday, July 15, 2011, MyungJoo Ham wrote:
>> >> >> For a usage example, please look at
>> >> >> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>> >> >>
>> >> >> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>> >> >> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>> >> >> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>> >> >> and other related clocks simply follow the determined DDR RAM clock.
>> >> >>
>> >> >> The DEVFREQ driver for Exynos4210 memory bus is at
>> >> >> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>> >> >>
>> >> >> MyungJoo Ham (3):
>> >> >> PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>> >> >> OPPs
>> >> >> PM / DEVFREQ: add example governors
>> >> >> PM / DEVFREQ: add sysfs interface (including user tickling)
>> >> >
>> >> > OK, I'm going to take the patches for 3.2.
>> >>
>> >> Have any other platforms signed up to use this mechanism to manage
>> >> their peripheral DVFS?
>> >
>> > Not that I know of, but one initial user is sufficient for me.
>> > So if you have anything _against_ the patches, please speak up.
>>
>> I do have some concerns. Let me start by saying that I'm defining a
>> "governor" as some active piece of executing code, probably a looping
>> workqueue that inspects activity/idleness of a device and then makes a
>> determination regarding clock frequency.
>>
>> devfreq seems to be good framework for creating DVFS governors.
>> However I think that most scalable devices on an SoC do *not* need a
>> governor, and many scalable devices won't have performance counters or
>> any other way to implement such introspection.
>
> OK, so I'd like to see what the author of the patch series has to say
> in the face of your comments below.
>
>> Some examples include a MMC controller, which might change its clock
>> rate depending on the class of card that the user has inserted. Or
>> even a "smartish" device like a GPU lacking performance counters; it's
>> driver will ramp up frequency when there is work to be done and kick
>> off a timeout. If no new work comes in before the timeout then the
>> driver will drop the frequency.
>>
>> A governor is not required in these cases (as they are event driven)
>> and devfreq is quite heavyweight for such applications. What is
>> needed is a QoS-style software layer that allows throughput requests
>> to be made from an initiator device towards a target device. This
>> layer should aggregate requests since many initator devices may make
>> requests to the same target device. This layer I'm describing, which
>> does not exist today, should be where the actual DVFS transition takes
>> place. That could take the form of a clk_set_rate call in the clock
>> framework (as described by Colin in V1 of this series), or some other
>> not-yet-realized dvfs_set_opp ,or something like Jean Pihet's
>> per-device PM QoS patches or whatever. For the purposes of this email
>> I don't really care which framework implements the QoS request
>> aggregation.
>>
>> The point of describing this non-existant API is that devfreq should
>> really be just another input into it. A governor that can measure bus
>> saturation is really cool, but it may not yield optimal results
>> compared to several drivers which make QoS-style requests and insure
>> that performance is guaranteed for their particular needs during their
>> transactions. The good news is that we don't have to choose between
>> performance counter introspection and software QoS requests: both the
>> driver requests and the governor should all feed as inputs into the
>> QoS-style DVFS mechanism.
>>
>> Taking that logic to its inevitable conclusion, tickle doesn't belong
>> inside the governor at all. If some device X wants to ramp up the
>> frequency of device Y, it should just make a QoS-style throughput
>> request towards device Y, possibly with a timeout (keeping the
>> original idea of tickle intact). This is entirely a separate idea
>> from a governor's introspective workqueue loop.
>>
>> For userspace, a sysfs entry for tickle would also not feed into the
>> governor, but some dummy struct device *user would probably be the
>> initiator device and it would simply call the QoS-style throughput
>> API.
>>
>> In summary my objections to this series are:
>> 1) devfreq should not be the *final* software layer to invoke a DVFS
>> transition as it has not taken all constraints into account.
>> 2) a devfreq governor represents just one constraint out of many to be
>> considered for any given scalable device.
>>
>> My objection to these patches getting merged is that I think they are
>> a bit ahead of their time.
>
> Still, we're merging quite a number of patches being ahead of their time.
> The resulting code is then modified as we learn what's wrong with it or
> how to improve it.
>
> Why exactly do you think this approach will not work in this particular case?
I shouldn't have objected to the patches being merged, because I think
they are OK for a subset of the bigger DVFS problem. The "governor"
method seems fine for devices with performance counters and whose
drivers do not express performance constraints.
>> We need to know what the real DVFS API looks like underneath devfreq first,
>> since devfreq should really be built on top of it.
>
> Well, quite frankly, if we generally adopted that point of view, much of the
> useful functionality we have in the kernel right now wouldn't be merged at all.
Yes yes, I got ahead of myself; my real goal is to restart the
discussion that popped up in the V1 patchset regarding some API for
handling clock/voltage scaling for DVFS transitions. However it
doesn't mean that these patches need to be blocked.
I do have an overall concern about the approach mentioned by MyungJoo
where drivers start enabling/disabling OPPs and then the governors
compensate by raising frequency/voltage the next time their workqueues
loop around. That seems entirely backwards for devices that express
performance needs on an event-driven basis, so my concerns are more
for how these patches *might* be used in the future, and less about
how they look right now.
We still don't have a good way to manage DVFS transitions and
aggregate QoS requests for an SoC. My hope is when that problem gets
solved devfreq will use those new APIs in its ->target function. I
also hope that too many drivers won't start using "tickle" as a
substitute for a real DVFS API.
Regards,
Mike
> Think of the USB subsystem, for one example, that has been rewritten from
> scratch no fewer that three times.
>
> The code appears to be reasonably isolated and simple enough to merge.
> There is a subsystem wanting to use it and I don't see anyone forcing
> anybody else to adopt it. If it's not suitable to you, you won't be using it,
> plain and simple. And if you come up with some better code to replace it,
> I won't have any problems with taking that too.
>
> Thanks,
> Rafael
>
^ permalink raw reply
* RFC: Implementing power management in Linux-based consumer products
From: spdev31 @ 2011-08-01 20:23 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1.1: Type: text/plain, Size: 1007 bytes --]
Hello all,
I am searching for information about implementing power management in
Linux-based consumer products. I've found the Documentation/power folder
and lesswatts.org, and want to ask what additional resources are available
to understand the issues facing a new team beginning PM on Linux. What
forums are best for discussing w/developers involved with Linux PM
integration for products? Thanks.
I'd like to understand what consumer products are known to use Linux PM and
what certifications have been granted to products using Linux PM (e.g.
EnergyStar, BAM, ErP). Also, it is not clear what level of support can be
expected from SoC vendors. Is it common for SoC vendors to provide code
which integrates SoC PM features with the Linux PM, or does the bulk of this
work fall on the product team using raw SoC documentation? Any tips to
minimize our effort in this area would be greatly appreciated.
I hope this is an appropriate forum for these questions. Thanks for your
help.
Best,
Shaun
[-- Attachment #1.2: Type: text/html, Size: 1088 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply
* Re: [PATCH] mrst_pmu: driver for Intel Moorestown Power Management Unit
From: Alan Cox @ 2011-08-01 15:39 UTC (permalink / raw)
To: Len Brown; +Cc: linux-pm, linux-kernel
In-Reply-To: <alpine.LFD.2.02.1108011116160.23581@x980>
On Mon, 01 Aug 2011 11:24:45 -0400 (EDT)
Len Brown <lenb@kernel.org> wrote:
>
> > > + BUG_ON(pci_dev_num != mrst_devs[index].pci_dev_num);
> >
> > That strikes me as needlessly unfriendly, you could warn/return NULL and
> > propogate a WARN_ONCE back to the user.
>
> If a fit of friendlyness, I wrote the patch below.
> Let me know if it is not what you had in mind.
It's not so much the bug on as the fact it spews these if you boot it on
a non mrst box built with a generic kernel - which was the second part of
my comment and doesn't appear addressed.
^ permalink raw reply
* Re: [PATCH -next] x86/platform: mrst/pmu.c needs module.h
From: Len Brown @ 2011-08-01 15:31 UTC (permalink / raw)
To: Randy Dunlap
Cc: Stephen Rothwell, Len Brown, LKML, platform-driver-x86,
linux-next, linux-pm, akpm, Matthew Garrett
In-Reply-To: <20110729211459.148eb8cc.rdunlap@xenotime.net>
applied
thanks,
Len Brown, Intel Open Source Technology Center
^ permalink raw reply
* Re: [PATCH] mrst_pmu: driver for Intel Moorestown Power Management Unit
From: Len Brown @ 2011-08-01 15:24 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-pm, linux-kernel
In-Reply-To: <20110713105105.63e2fbcb@lxorguk.ukuu.org.uk>
> > + BUG_ON(pci_dev_num != mrst_devs[index].pci_dev_num);
>
> That strikes me as needlessly unfriendly, you could warn/return NULL and
> propogate a WARN_ONCE back to the user.
If a fit of friendlyness, I wrote the patch below.
Let me know if it is not what you had in mind.
thanks
-Len
>From 0f72ff0114695d48ecdd7270998148d61236341d Mon Sep 17 00:00:00 2001
From: Len Brown <len.brown@intel.com>
Date: Mon, 1 Aug 2011 11:13:37 -0400
Subject: [PATCH] msrt_pmu: add friendly firmware debugging code
MRST firmware has already been debugged, but this patch
makes the kernel fail more gently should somebody break
it in the future by inventing a device that is not in the SOC.
Signed-off-by: Len Brown <len.brown@intel.com>
---
arch/x86/platform/mrst/pmu.c | 30 +++++++++++++++++++++++++-----
1 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/arch/x86/platform/mrst/pmu.c b/arch/x86/platform/mrst/pmu.c
index f8c6d8f..ee3afb1 100644
--- a/arch/x86/platform/mrst/pmu.c
+++ b/arch/x86/platform/mrst/pmu.c
@@ -96,7 +96,7 @@ static u32 pmu_set_power_state_send_cmd;
static struct mrst_device *pci_id_2_mrst_dev(u16 pci_dev_num)
{
- int index;
+ int index = 0;
if ((pci_dev_num >= 0x0800) && (pci_dev_num <= 0x815))
index = pci_dev_num - 0x800;
@@ -106,10 +106,11 @@ static struct mrst_device *pci_id_2_mrst_dev(u16 pci_dev_num)
index = 23;
else if (pci_dev_num == 0x4110)
index = 24;
- else
- BUG();
- BUG_ON(pci_dev_num != mrst_devs[index].pci_dev_num);
+ if (pci_dev_num != mrst_devs[index].pci_dev_num) {
+ WARN_ONCE(1, FW_BUG "Unknown PCI device 0x%04X\n", pci_dev_num);
+ return 0;
+ }
return &mrst_devs[index];
}
@@ -356,7 +357,13 @@ static u16 pmu_min_lss_pci_req(u16 *ids, u16 pci_state)
int i;
for (i = 0; ids[i]; ++i) {
- existing_request = pci_id_2_mrst_dev(ids[i])->latest_request;
+ struct mrst_device *mrst_dev;
+
+ mrst_dev = pci_id_2_mrst_dev(ids[i]);
+ if (unlikely(!mrst_dev))
+ continue;
+
+ existing_request = mrst_dev->latest_request;
if (existing_request < pci_state)
pci_state = existing_request;
}
@@ -380,6 +387,9 @@ int pmu_pci_set_power_state(struct pci_dev *pdev, pci_power_t pci_state)
BUG_ON(pci_state < PCI_D0 || pci_state > PCI_D3cold);
mrst_dev = pci_id_2_mrst_dev(pdev->device);
+ if (unlikely(!mrst_dev))
+ return -ENODEV;
+
mrst_dev->pci_state_counts[pci_state]++; /* count invocations */
/* PMU driver calls self as part of PCI initialization, ignore */
@@ -499,6 +509,10 @@ static int debug_mrst_pmu_show(struct seq_file *s, void *unused)
pdev->vendor, pdev->device,
dev_driver_string(&pdev->dev));
+ if (unlikely (!mrst_dev)) {
+ seq_printf(s, " UNKNOWN\n");
+ continue;
+ }
if (mrst_dev->lss)
seq_printf(s, "LSS %2d %-4s ", mrst_dev->lss,
@@ -598,6 +612,12 @@ static void pmu_scu_firmware_debug(void)
int pos;
mrst_dev = pci_id_2_mrst_dev(pdev->device);
+ if (unlikely(!mrst_dev)) {
+ printk(KERN_ERR FW_BUG "pmu: Unknown "
+ "PCI device 0x%04X\n", pdev->device);
+ continue;
+ }
+
if (mrst_dev->lss == 0)
continue; /* no LSS in our table */
--
1.7.4.4
^ permalink raw reply related
* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Govindraj @ 2011-08-01 12:46 UTC (permalink / raw)
To: balbi
Cc: Partha Basak, Tero Kristo, Raja, Govindraj, linux-serial,
linux-pm, linux-omap
In-Reply-To: <20110801100203.GN31013@legolas.emea.dhcp.ti.com>
On Mon, Aug 1, 2011 at 3:32 PM, Felipe Balbi <balbi@ti.com> wrote:
> Hi,
>
> (fix your mailer dude)
Sorry some settings got screwed up with my mailer
>
> On Mon, Aug 01, 2011 at 03:26:52PM +0530, Raja, Govindraj wrote:
[..]
>> this hunk is unnecessary. Clocks are always on when they are called.
>>
>> The problem is:
>> [1]:
>> runtime_put -> *power.lock* - > rpm->suspend -> above pr_debug ->
>> console_write -> get_sync
>> -> *power.lock* -> rpm resume
>> power.lock contention.
>
> Are you sure ? If the device is still on, won't that get_sync() only
> increase the pm counter ? Instead of going through everything ?? Oh
> well, this is becoming quite racy :-(
Yes true, but before it increments the counter it will take
power.lock
if we look into internals.
pm_runtime_get_sync -> *__pm_runtime_resume* --> rpm_resume
int __pm_runtime_resume(struct device *dev, int rpmflags)
{
[..]
spin_lock_irqsave(&dev->power.lock, flags);
retval = rpm_resume(dev, rpmflags);
spin_unlock_irqrestore(&dev->power.lock, flags);
[..]
}
Same applicable for runtime_put
[..]
>> I'll leave this to Kevin to decide what to do, but clocks are off
>> here...
>>
>> Yes fine.�
>> Since most of these prints will be printed if DEBUG macro
>> is defined in respective files and *debug* is used in command line.
>> Can't leave uart clocks active always on debug cases.
>> [If *debug* �used as command line]
>> and gate uart clocks only for non debug cases.
>> With this approach�at least�we can have a clean solution
>> in uart driver also without adding clock gating from idle path.
>> Not sure if this�agreeable.
>> As of now gating from idle path seems to be only clean approach.
>
> I see.. that could be one way... let's see how Kevin feels about it
> though.
>
OK.
--
Thanks,
Govindraj.R
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm
^ permalink raw reply
* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Felipe Balbi @ 2011-08-01 10:02 UTC (permalink / raw)
To: Raja, Govindraj
Cc: Partha Basak, Tero Kristo, balbi, linux-serial, Govindraj,
linux-pm, linux-omap
In-Reply-To: <CAMrsUdL_9mzKsjrKf=spihxcRTQc6Tsr+MAP9GFAz_7sE27B6Q@mail.gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 12891 bytes --]
Hi,
(fix your mailer dude)
On Mon, Aug 01, 2011 at 03:26:52PM +0530, Raja, Govindraj wrote:
> > @@ -254,14 +255,14 @@ void omap2_clk_disable(struct clk *clk)
> > � � � � � � � return;
> > � � � }
> >
> > - � � pr_debug("clock: %s: decrementing usecount\n", clk->name);
> > +// � pr_debug("clock: %s: decrementing usecount\n", clk->name);
> >
> > � � � clk->usecount--;
> >
> > � � � if (clk->usecount > 0)
> > � � � � � � � return;
> >
> > - � � pr_debug("clock: %s: disabling in hardware\n", clk->name);
> > +// � pr_debug("clock: %s: disabling in hardware\n", clk->name);
> >
> > � � � if (clk->ops && clk->ops->disable) {
> > � � � � � � � trace_clock_disable(clk->name, 0, smp_processor_id());
>
> this hunk is unnecessary. Clocks are always on when they are called.
>
> The problem is:
> [1]:
> runtime_put -> *power.lock* - > rpm->suspend -> above pr_debug ->
> console_write -> get_sync
> -> *power.lock* -> rpm resume
> power.lock contention.
Are you sure ? If the device is still on, won't that get_sync() only
increase the pm counter ? Instead of going through everything ?? Oh
well, this is becoming quite racy :-(
> > @@ -290,14 +291,14 @@ int omap2_clk_enable(struct clk *clk)
> > �{
> > � � � int ret;
> >
> > - � � pr_debug("clock: %s: incrementing usecount\n", clk->name);
> > +// � pr_debug("clock: %s: incrementing usecount\n", clk->name);
> >
> > � � � clk->usecount++;
> >
> > � � � if (clk->usecount > 1)
> > � � � � � � � return 0;
> >
> > - � � pr_debug("clock: %s: enabling in hardware\n", clk->name);
> > +// � pr_debug("clock: %s: enabling in hardware\n", clk->name);
>
> these two is ok.
> > diff --git a/arch/arm/mach-omap2/omap_hwmod.c
> b/arch/arm/mach-omap2/omap_hwmod.c
> > index 7ed0479..8ca7d40 100644
> > --- a/arch/arm/mach-omap2/omap_hwmod.c
> > +++ b/arch/arm/mach-omap2/omap_hwmod.c
> > @@ -124,7 +124,8 @@
> > � * XXX error return values should be checked to ensure that they are
> > � * appropriate
> > � */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing.
> > @@ -597,7 +598,8 @@ static int _enable_clocks(struct omap_hwmod *oh)
> > �{
> > � � � int i;
> >
> > - � � pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
> > + � � if (strcmp(oh->class->name, "uart"))
> > + � � � � � � pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
>
> instead of doing checks, you could move the print to the end of the
> function, when clocks are already enabled. When doind that, of course,
> update the comment to say "%s: clocks enabled\n".
>
> the problem is the prints causing power.lock contention same as�
> the�scenario�in [1] above.
> �
>
> > @@ -627,7 +629,8 @@ static int _disable_clocks(struct omap_hwmod *oh)
> > �{
> > � � � int i;
> >
> > - � � pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
> > + � � if (strcmp(oh->class->name, "uart"))
> > + � � � � � � pr_debug("omap_hwmod: %s: disabling clocks\n",
> oh->name);
>
> check not needed, clocks are still on.
>
> scenario�[1]
> �
>
> >
> > � � � if (oh->_clk)
> > � � � � � � � clk_disable(oh->_clk);
> > @@ -1232,7 +1235,8 @@ static int _enable(struct omap_hwmod *oh)
> > � � � � � � � return -EINVAL;
> > � � � }
> >
> > - � � pr_debug("omap_hwmod: %s: enabling\n", oh->name);
> > + � � if (strcmp(oh->class->name, "uart"))
> > + � � � � � � pr_debug("omap_hwmod: %s: enabling\n", oh->name);
>
> move it further down.
> > @@ -1264,8 +1268,9 @@ static int _enable(struct omap_hwmod *oh)
> > � � � � � � � }
> > � � � } else {
> > � � � � � � � _disable_clocks(oh);
> > - � � � � � � pr_debug("omap_hwmod: %s: _wait_target_ready: %d\n",
> > - � � � � � � � � � � �oh->name, r);
> > + � � � � � � if (strcmp(oh->class->name, "uart"))
> > + � � � � � � � � � � pr_debug("omap_hwmod: %s: _wait_target_ready:
> %d\n",
> > + � � � � � � � � � � � � � � �oh->name, r);
>
> instead of adding check, move the print before _disable_clocks(oh).
> > @@ -1287,7 +1292,8 @@ static int _idle(struct omap_hwmod *oh)
> > � � � � � � � return -EINVAL;
> > � � � }
> >
> > - � � pr_debug("omap_hwmod: %s: idling\n", oh->name);
> > + � � if (strcmp(oh->class->name, "uart"))
> > + � � � � � � pr_debug("omap_hwmod: %s: idling\n", oh->name);
>
> I believe clocks are still on here too, no checks needed.
> > diff --git a/arch/arm/plat-omap/omap_device.c
> b/arch/arm/plat-omap/omap_device.c
> > index 49fc0df..7b27704 100644
> > --- a/arch/arm/plat-omap/omap_device.c
> > +++ b/arch/arm/plat-omap/omap_device.c
> > @@ -75,7 +75,8 @@
> > � * (device must be reinitialized at this point to use it again)
> > � *
> > � */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing.
> > @@ -114,7 +115,8 @@ static int _omap_device_activate(struct
> > omap_device *od, u8 ignore_lat)
> > �{
> > � � � struct timespec a, b, c;
> >
> > - � � pr_debug("omap_device: %s: activating\n", od->[2]pdev.name);
> > + � � if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + � � � � � � pr_debug("omap_device: %s: activating\n",
> od->[3]pdev.name);
>
> move it to the end of the function.
> > @@ -138,25 +140,29 @@ static int _omap_device_activate(struct
> > omap_device *od, u8 ignore_lat)
> > � � � � � � � c = timespec_sub(b, a);
> > � � � � � � � act_lat = timespec_to_ns(&c);
> >
> > - � � � � � � pr_debug("omap_device: %s: pm_lat %d: activate: elapsed
> time "
> > - � � � � � � � � � � �"%llu nsec\n", od->[4]pdev.name,
> od->pm_lat_level,
> > - � � � � � � � � � � �act_lat);
> > + � � � � � � if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + � � � � � � � � � � pr_debug("omap_device: %s: pm_lat %d: activate:
> elapsed time "
> > + � � � � � � � � � � � � � � �"%llu nsec\n", od->[5]pdev.name,
> od->pm_lat_level,
> > + � � � � � � � � � � � � � � �act_lat);
>
> move it further down.
> >
> > � � � � � � � if (act_lat > odpl->activate_lat) {
> > � � � � � � � � � � � odpl->activate_lat_worst = act_lat;
> > � � � � � � � � � � � if (odpl->flags &
> OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
> > � � � � � � � � � � � � � � � odpl->activate_lat = act_lat;
> > - � � � � � � � � � � � � � � pr_warning("omap_device: %s.%d: new
> worst case "
> > - � � � � � � � � � � � � � � � � � � � �"activate latency %d:
> %llu\n",
> > - � � � � � � � � � � � � � � � � � � � �od->[6]pdev.name,
> od->[7]pdev.id,
> > - � � � � � � � � � � � � � � � � � � � �od->pm_lat_level, act_lat);
> > - � � � � � � � � � � } else
> > - � � � � � � � � � � � � � � pr_warning("omap_device: %s.%d: activate
> "
> > - � � � � � � � � � � � � � � � � � � � �"latency %d higher than
> exptected. "
> > - � � � � � � � � � � � � � � � � � � � �"(%llu > %d)\n",
> > - � � � � � � � � � � � � � � � � � � � �od->[8]pdev.name,
> od->[9]pdev.id,
> > - � � � � � � � � � � � � � � � � � � � �od->pm_lat_level, act_lat,
> > - � � � � � � � � � � � � � � � � � � � �odpl->activate_lat);
> > + � � � � � � � � � � � � � � if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + � � � � � � � � � � � � � � � � � � pr_warning("omap_device: %s.%d:
> new worst case "
> > + � � � � � � � � � � � � � � � � � � � � � � "activate latency %d:
> %llu\n",
> > + � � � � � � � � � � � � � � � � � � � � � � od->[10]pdev.name,
> od->[11]pdev.id,
> > + � � � � � � � � � � � � � � � � � � � � � � od->pm_lat_level,
> act_lat);
> > + � � � � � � � � � � } else {
> > + � � � � � � � � � � � � � � if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + � � � � � � � � � � � � � � � � � � pr_warning("omap_device: %s.%d:
> activate "
> > + � � � � � � � � � � � � � � � � � � � � � � "latency %d higher than
> exptected. "
> > + � � � � � � � � � � � � � � � � � � � � � � "(%llu > %d)\n",
> > + � � � � � � � � � � � � � � � � � � � � � � od->[12]pdev.name,
> od->[13]pdev.id,
> > + � � � � � � � � � � � � � � � � � � � � � � od->pm_lat_level,
> act_lat,
> > + � � � � � � � � � � � � � � � � � � � � � � odpl->activate_lat);
>
> ->activate_func() has already been called here, clocks are already on.
> > @@ -183,7 +189,8 @@ static int _omap_device_deactivate(struct
> > omap_device *od, u8 ignore_lat)
> > �{
> > � � � struct timespec a, b, c;
> >
> > - � � pr_debug("omap_device: %s: deactivating\n", od->[14]pdev.name);
> > + � � if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + � � � � � � pr_debug("omap_device: %s: deactivating\n",
> od->[15]pdev.name);
>
> clocks are still on here.
> > @@ -206,25 +213,29 @@ static int _omap_device_deactivate(struct
> > omap_device *od, u8 ignore_lat)
> > � � � � � � � c = timespec_sub(b, a);
> > � � � � � � � deact_lat = timespec_to_ns(&c);
> >
> > - � � � � � � pr_debug("omap_device: %s: pm_lat %d: deactivate:
> elapsed time "
> > - � � � � � � � � � � �"%llu nsec\n", od->[16]pdev.name,
> od->pm_lat_level,
> > - � � � � � � � � � � �deact_lat);
> > + � � � � � � if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + � � � � � � � � � � pr_debug("omap_device: %s: pm_lat %d:
> deactivate: elapsed time "
> > + � � � � � � � � � � � � � � �"%llu nsec\n", od->[17]pdev.name,
> od->pm_lat_level,
> > + � � � � � � � � � � � � � � �deact_lat);
>
> I'll leave this to Kevin to decide what to do, but clocks are off
> here...
>
> Yes fine.�
> Since most of these prints will be printed if DEBUG macro
> is defined in respective files and *debug* is used in command line.
> Can't leave uart clocks active always on debug cases.
> [If *debug* �used as command line]
> and gate uart clocks only for non debug cases.
> With this approach�at least�we can have a clean solution
> in uart driver also without adding clock gating from idle path.
> Not sure if this�agreeable.
> As of now gating from idle path seems to be only clean approach.
I see.. that could be one way... let's see how Kevin feels about it
though.
--
balbi
[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply
* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Govindraj @ 2011-08-01 10:00 UTC (permalink / raw)
To: balbi
Cc: Partha Basak, Tero Kristo, Govindraj.R, linux-serial, linux-pm,
linux-omap
In-Reply-To: <20110801090328.GG31013@legolas.emea.dhcp.ti.com>
> Hi,
>
> On Fri, Jul 29, 2011 at 08:43:49PM +0530, Govindraj wrote:
>
> [giant snip]
>
> > Actually there is much more than this:
> >
> > <<SNIP>>
> >
> > diff --git a/arch/arm/mach-omap2/clock.c b/arch/arm/mach-omap2/clock.c
> > index 180299e..221ffb9 100644
> > --- a/arch/arm/mach-omap2/clock.c
> > +++ b/arch/arm/mach-omap2/clock.c
> > @@ -12,7 +12,8 @@
> > * it under the terms of the GNU General Public License version 2 as
> > * published by the Free Software Foundation.
> > */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing... but you know that :-p
>
yes true.
> > @@ -254,14 +255,14 @@ void omap2_clk_disable(struct clk *clk)
> > return;
> > }
> >
> > - pr_debug("clock: %s: decrementing usecount\n", clk->name);
> > +// pr_debug("clock: %s: decrementing usecount\n", clk->name);
> >
> > clk->usecount--;
> >
> > if (clk->usecount > 0)
> > return;
> >
> > - pr_debug("clock: %s: disabling in hardware\n", clk->name);
> > +// pr_debug("clock: %s: disabling in hardware\n", clk->name);
> >
> > if (clk->ops && clk->ops->disable) {
> > trace_clock_disable(clk->name, 0, smp_processor_id());
>
> this hunk is unnecessary. Clocks are always on when they are called.
>
The problem is:
[1]:
runtime_put -> *power.lock* - > rpm->suspend -> above pr_debug ->
console_write -> get_sync
-> *power.lock* -> rpm resume
power.lock contention.
>
> > @@ -290,14 +291,14 @@ int omap2_clk_enable(struct clk *clk)
> > {
> > int ret;
> >
> > - pr_debug("clock: %s: incrementing usecount\n", clk->name);
> > +// pr_debug("clock: %s: incrementing usecount\n", clk->name);
> >
> > clk->usecount++;
> >
> > if (clk->usecount > 1)
> > return 0;
> >
> > - pr_debug("clock: %s: enabling in hardware\n", clk->name);
> > +// pr_debug("clock: %s: enabling in hardware\n", clk->name);
>
> these two is ok.
>
> > diff --git a/arch/arm/mach-omap2/omap_hwmod.c
> b/arch/arm/mach-omap2/omap_hwmod.c
> > index 7ed0479..8ca7d40 100644
> > --- a/arch/arm/mach-omap2/omap_hwmod.c
> > +++ b/arch/arm/mach-omap2/omap_hwmod.c
> > @@ -124,7 +124,8 @@
> > * XXX error return values should be checked to ensure that they are
> > * appropriate
> > */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing.
>
> > @@ -597,7 +598,8 @@ static int _enable_clocks(struct omap_hwmod *oh)
> > {
> > int i;
> >
> > - pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
>
> instead of doing checks, you could move the print to the end of the
> function, when clocks are already enabled. When doind that, of course,
> update the comment to say "%s: clocks enabled\n".
>
the problem is the prints causing power.lock contention same as
the scenario in [1] above.
>
> > @@ -627,7 +629,8 @@ static int _disable_clocks(struct omap_hwmod *oh)
> > {
> > int i;
> >
> > - pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
>
> check not needed, clocks are still on.
>
scenario [1]
>
> >
> > if (oh->_clk)
> > clk_disable(oh->_clk);
> > @@ -1232,7 +1235,8 @@ static int _enable(struct omap_hwmod *oh)
> > return -EINVAL;
> > }
> >
> > - pr_debug("omap_hwmod: %s: enabling\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: enabling\n", oh->name);
>
> move it further down.
>
> > @@ -1264,8 +1268,9 @@ static int _enable(struct omap_hwmod *oh)
> > }
> > } else {
> > _disable_clocks(oh);
> > - pr_debug("omap_hwmod: %s: _wait_target_ready: %d\n",
> > - oh->name, r);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: _wait_target_ready:
> %d\n",
> > + oh->name, r);
>
> instead of adding check, move the print before _disable_clocks(oh).
>
> > @@ -1287,7 +1292,8 @@ static int _idle(struct omap_hwmod *oh)
> > return -EINVAL;
> > }
> >
> > - pr_debug("omap_hwmod: %s: idling\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: idling\n", oh->name);
>
> I believe clocks are still on here too, no checks needed.
>
> > diff --git a/arch/arm/plat-omap/omap_device.c
> b/arch/arm/plat-omap/omap_device.c
> > index 49fc0df..7b27704 100644
> > --- a/arch/arm/plat-omap/omap_device.c
> > +++ b/arch/arm/plat-omap/omap_device.c
> > @@ -75,7 +75,8 @@
> > * (device must be reinitialized at this point to use it again)
> > *
> > */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing.
>
> > @@ -114,7 +115,8 @@ static int _omap_device_activate(struct
> > omap_device *od, u8 ignore_lat)
> > {
> > struct timespec a, b, c;
> >
> > - pr_debug("omap_device: %s: activating\n", od->pdev.name);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: activating\n", od->pdev.name);
>
> move it to the end of the function.
>
> > @@ -138,25 +140,29 @@ static int _omap_device_activate(struct
> > omap_device *od, u8 ignore_lat)
> > c = timespec_sub(b, a);
> > act_lat = timespec_to_ns(&c);
> >
> > - pr_debug("omap_device: %s: pm_lat %d: activate: elapsed
> time "
> > - "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> > - act_lat);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: pm_lat %d: activate:
> elapsed time "
> > + "%llu nsec\n", od->pdev.name,
> od->pm_lat_level,
> > + act_lat);
>
> move it further down.
>
> >
> > if (act_lat > odpl->activate_lat) {
> > odpl->activate_lat_worst = act_lat;
> > if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST)
> {
> > odpl->activate_lat = act_lat;
> > - pr_warning("omap_device: %s.%d: new worst
> case "
> > - "activate latency %d: %llu\n",
> > - od->pdev.name, od->pdev.id,
> > - od->pm_lat_level, act_lat);
> > - } else
> > - pr_warning("omap_device: %s.%d: activate "
> > - "latency %d higher than
> exptected. "
> > - "(%llu > %d)\n",
> > - od->pdev.name, od->pdev.id,
> > - od->pm_lat_level, act_lat,
> > - odpl->activate_lat);
> > + if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + pr_warning("omap_device: %s.%d: new
> worst case "
> > + "activate latency %d:
> %llu\n",
> > + od->pdev.name, od->pdev.id
> ,
> > + od->pm_lat_level, act_lat);
> > + } else {
> > + if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + pr_warning("omap_device: %s.%d:
> activate "
> > + "latency %d higher than
> exptected. "
> > + "(%llu > %d)\n",
> > + od->pdev.name, od->pdev.id
> ,
> > + od->pm_lat_level, act_lat,
> > + odpl->activate_lat);
>
> ->activate_func() has already been called here, clocks are already on.
>
> > @@ -183,7 +189,8 @@ static int _omap_device_deactivate(struct
> > omap_device *od, u8 ignore_lat)
> > {
> > struct timespec a, b, c;
> >
> > - pr_debug("omap_device: %s: deactivating\n", od->pdev.name);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: deactivating\n", od->pdev.name
> );
>
> clocks are still on here.
>
> > @@ -206,25 +213,29 @@ static int _omap_device_deactivate(struct
> > omap_device *od, u8 ignore_lat)
> > c = timespec_sub(b, a);
> > deact_lat = timespec_to_ns(&c);
> >
> > - pr_debug("omap_device: %s: pm_lat %d: deactivate: elapsed
> time "
> > - "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> > - deact_lat);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: pm_lat %d: deactivate:
> elapsed time "
> > + "%llu nsec\n", od->pdev.name,
> od->pm_lat_level,
> > + deact_lat);
>
> I'll leave this to Kevin to decide what to do, but clocks are off
> here...
>
Yes fine.
Since most of these prints will be printed if DEBUG macro
is defined in respective files and *debug* is used in command line.
Can't leave uart clocks active always on debug cases.
[If *debug* used as command line]
and gate uart clocks only for non debug cases.
With this approach at least we can have a clean solution
in uart driver also without adding clock gating from idle path.
Not sure if this agreeable.
As of now gating from idle path seems to be only clean approach.
--
Thanks,
Govindraj.R
^ permalink raw reply
* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Raja, Govindraj @ 2011-08-01 9:56 UTC (permalink / raw)
To: balbi
Cc: Partha Basak, Tero Kristo, linux-serial, Govindraj, linux-pm,
linux-omap
In-Reply-To: <20110801090328.GG31013@legolas.emea.dhcp.ti.com>
[-- Attachment #1.1: Type: text/plain, Size: 11681 bytes --]
On Mon, Aug 1, 2011 at 2:33 PM, Felipe Balbi <balbi@ti.com> wrote:
> Hi,
>
> On Fri, Jul 29, 2011 at 08:43:49PM +0530, Govindraj wrote:
>
> [giant snip]
>
> > Actually there is much more than this:
> >
> > <<SNIP>>
> >
> > diff --git a/arch/arm/mach-omap2/clock.c b/arch/arm/mach-omap2/clock.c
> > index 180299e..221ffb9 100644
> > --- a/arch/arm/mach-omap2/clock.c
> > +++ b/arch/arm/mach-omap2/clock.c
> > @@ -12,7 +12,8 @@
> > * it under the terms of the GNU General Public License version 2 as
> > * published by the Free Software Foundation.
> > */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing... but you know that :-p
>
yes true.
> > @@ -254,14 +255,14 @@ void omap2_clk_disable(struct clk *clk)
> > return;
> > }
> >
> > - pr_debug("clock: %s: decrementing usecount\n", clk->name);
> > +// pr_debug("clock: %s: decrementing usecount\n", clk->name);
> >
> > clk->usecount--;
> >
> > if (clk->usecount > 0)
> > return;
> >
> > - pr_debug("clock: %s: disabling in hardware\n", clk->name);
> > +// pr_debug("clock: %s: disabling in hardware\n", clk->name);
> >
> > if (clk->ops && clk->ops->disable) {
> > trace_clock_disable(clk->name, 0, smp_processor_id());
>
> this hunk is unnecessary. Clocks are always on when they are called.
>
The problem is:
[1]:
runtime_put -> *power.lock* - > rpm->suspend -> above pr_debug ->
console_write -> get_sync
-> *power.lock* -> rpm resume
power.lock contention.
>
> > @@ -290,14 +291,14 @@ int omap2_clk_enable(struct clk *clk)
> > {
> > int ret;
> >
> > - pr_debug("clock: %s: incrementing usecount\n", clk->name);
> > +// pr_debug("clock: %s: incrementing usecount\n", clk->name);
> >
> > clk->usecount++;
> >
> > if (clk->usecount > 1)
> > return 0;
> >
> > - pr_debug("clock: %s: enabling in hardware\n", clk->name);
> > +// pr_debug("clock: %s: enabling in hardware\n", clk->name);
>
> these two is ok.
>
> > diff --git a/arch/arm/mach-omap2/omap_hwmod.c
> b/arch/arm/mach-omap2/omap_hwmod.c
> > index 7ed0479..8ca7d40 100644
> > --- a/arch/arm/mach-omap2/omap_hwmod.c
> > +++ b/arch/arm/mach-omap2/omap_hwmod.c
> > @@ -124,7 +124,8 @@
> > * XXX error return values should be checked to ensure that they are
> > * appropriate
> > */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing.
>
> > @@ -597,7 +598,8 @@ static int _enable_clocks(struct omap_hwmod *oh)
> > {
> > int i;
> >
> > - pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
>
> instead of doing checks, you could move the print to the end of the
> function, when clocks are already enabled. When doind that, of course,
> update the comment to say "%s: clocks enabled\n".
>
the problem is the prints causing power.lock contention same as
the scenario in [1] above.
>
> > @@ -627,7 +629,8 @@ static int _disable_clocks(struct omap_hwmod *oh)
> > {
> > int i;
> >
> > - pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
>
> check not needed, clocks are still on.
>
scenario [1]
>
> >
> > if (oh->_clk)
> > clk_disable(oh->_clk);
> > @@ -1232,7 +1235,8 @@ static int _enable(struct omap_hwmod *oh)
> > return -EINVAL;
> > }
> >
> > - pr_debug("omap_hwmod: %s: enabling\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: enabling\n", oh->name);
>
> move it further down.
>
> > @@ -1264,8 +1268,9 @@ static int _enable(struct omap_hwmod *oh)
> > }
> > } else {
> > _disable_clocks(oh);
> > - pr_debug("omap_hwmod: %s: _wait_target_ready: %d\n",
> > - oh->name, r);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: _wait_target_ready:
> %d\n",
> > + oh->name, r);
>
> instead of adding check, move the print before _disable_clocks(oh).
>
> > @@ -1287,7 +1292,8 @@ static int _idle(struct omap_hwmod *oh)
> > return -EINVAL;
> > }
> >
> > - pr_debug("omap_hwmod: %s: idling\n", oh->name);
> > + if (strcmp(oh->class->name, "uart"))
> > + pr_debug("omap_hwmod: %s: idling\n", oh->name);
>
> I believe clocks are still on here too, no checks needed.
>
> > diff --git a/arch/arm/plat-omap/omap_device.c
> b/arch/arm/plat-omap/omap_device.c
> > index 49fc0df..7b27704 100644
> > --- a/arch/arm/plat-omap/omap_device.c
> > +++ b/arch/arm/plat-omap/omap_device.c
> > @@ -75,7 +75,8 @@
> > * (device must be reinitialized at this point to use it again)
> > *
> > */
> > -#undef DEBUG
> > +//#undef DEBUG
> > +#define DEBUG
>
> trailing.
>
> > @@ -114,7 +115,8 @@ static int _omap_device_activate(struct
> > omap_device *od, u8 ignore_lat)
> > {
> > struct timespec a, b, c;
> >
> > - pr_debug("omap_device: %s: activating\n", od->pdev.name);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: activating\n", od->pdev.name);
>
> move it to the end of the function.
>
> > @@ -138,25 +140,29 @@ static int _omap_device_activate(struct
> > omap_device *od, u8 ignore_lat)
> > c = timespec_sub(b, a);
> > act_lat = timespec_to_ns(&c);
> >
> > - pr_debug("omap_device: %s: pm_lat %d: activate: elapsed
> time "
> > - "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> > - act_lat);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: pm_lat %d: activate:
> elapsed time "
> > + "%llu nsec\n", od->pdev.name,
> od->pm_lat_level,
> > + act_lat);
>
> move it further down.
>
> >
> > if (act_lat > odpl->activate_lat) {
> > odpl->activate_lat_worst = act_lat;
> > if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST)
> {
> > odpl->activate_lat = act_lat;
> > - pr_warning("omap_device: %s.%d: new worst
> case "
> > - "activate latency %d: %llu\n",
> > - od->pdev.name, od->pdev.id,
> > - od->pm_lat_level, act_lat);
> > - } else
> > - pr_warning("omap_device: %s.%d: activate "
> > - "latency %d higher than
> exptected. "
> > - "(%llu > %d)\n",
> > - od->pdev.name, od->pdev.id,
> > - od->pm_lat_level, act_lat,
> > - odpl->activate_lat);
> > + if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + pr_warning("omap_device: %s.%d: new
> worst case "
> > + "activate latency %d:
> %llu\n",
> > + od->pdev.name, od->pdev.id
> ,
> > + od->pm_lat_level, act_lat);
> > + } else {
> > + if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + pr_warning("omap_device: %s.%d:
> activate "
> > + "latency %d higher than
> exptected. "
> > + "(%llu > %d)\n",
> > + od->pdev.name, od->pdev.id
> ,
> > + od->pm_lat_level, act_lat,
> > + odpl->activate_lat);
>
> ->activate_func() has already been called here, clocks are already on.
>
> > @@ -183,7 +189,8 @@ static int _omap_device_deactivate(struct
> > omap_device *od, u8 ignore_lat)
> > {
> > struct timespec a, b, c;
> >
> > - pr_debug("omap_device: %s: deactivating\n", od->pdev.name);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: deactivating\n", od->pdev.name
> );
>
> clocks are still on here.
>
> > @@ -206,25 +213,29 @@ static int _omap_device_deactivate(struct
> > omap_device *od, u8 ignore_lat)
> > c = timespec_sub(b, a);
> > deact_lat = timespec_to_ns(&c);
> >
> > - pr_debug("omap_device: %s: pm_lat %d: deactivate: elapsed
> time "
> > - "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> > - deact_lat);
> > + if (strcmp(od->hwmods[0]->class->name, "uart"))
> > + pr_debug("omap_device: %s: pm_lat %d: deactivate:
> elapsed time "
> > + "%llu nsec\n", od->pdev.name,
> od->pm_lat_level,
> > + deact_lat);
>
> I'll leave this to Kevin to decide what to do, but clocks are off
> here...
>
Yes fine.
Since most of these prints will be printed if DEBUG macro
is defined in respective files and *debug* is used in command line.
Can't leave uart clocks active always on debug cases.
[If *debug* used as command line]
and gate uart clocks only for non debug cases.
With this approach at least we can have a clean solution
in uart driver also without adding clock gating from idle path.
Not sure if this agreeable.
As of now gating from idle path seems to be only clean approach.
--
Thanks,
Govindraj.R
> > if (deact_lat > odpl->deactivate_lat) {
> > odpl->deactivate_lat_worst = deact_lat;
> > if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST)
> {
> > odpl->deactivate_lat = deact_lat;
> > - pr_warning("omap_device: %s.%d: new worst
> case "
> > - "deactivate latency %d: %llu\n",
> > - od->pdev.name, od->pdev.id,
> > - od->pm_lat_level, deact_lat);
> > - } else
> > - pr_warning("omap_device: %s.%d: deactivate
> "
> > + if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + pr_warning("omap_device: %s.%d: new
> worst case "
> > + "deactivate latency %d:
> %llu\n",
> > + od->pdev.name, od->
> pdev.id,
> > + od->pm_lat_level,
> deact_lat);
> > + } else {
> > + if (strcmp(od->hwmods[0]->class->name,
> "uart"))
> > + pr_warning("omap_device: %s.%d:
> deactivate "
> > "latency %d higher than
> exptected. "
> > "(%llu > %d)\n",
> > od->pdev.name, od->pdev.id,
> > od->pm_lat_level, deact_lat,
> > odpl->deactivate_lat);
> > + }
>
> and here...
>
> --
> balbi
>
[-- Attachment #1.2: Type: text/html, Size: 16868 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply
* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Felipe Balbi @ 2011-08-01 9:03 UTC (permalink / raw)
To: Govindraj
Cc: Partha Basak, Tero Kristo, balbi, Govindraj.R, linux-serial,
linux-pm, linux-omap
In-Reply-To: <CAAL8m4wdKu_XGFaGxuAJ9uypRyyQMkR8HL7Y6E=pBkwX=L43nA@mail.gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 8335 bytes --]
Hi,
On Fri, Jul 29, 2011 at 08:43:49PM +0530, Govindraj wrote:
[giant snip]
> Actually there is much more than this:
>
> <<SNIP>>
>
> diff --git a/arch/arm/mach-omap2/clock.c b/arch/arm/mach-omap2/clock.c
> index 180299e..221ffb9 100644
> --- a/arch/arm/mach-omap2/clock.c
> +++ b/arch/arm/mach-omap2/clock.c
> @@ -12,7 +12,8 @@
> * it under the terms of the GNU General Public License version 2 as
> * published by the Free Software Foundation.
> */
> -#undef DEBUG
> +//#undef DEBUG
> +#define DEBUG
trailing... but you know that :-p
> @@ -254,14 +255,14 @@ void omap2_clk_disable(struct clk *clk)
> return;
> }
>
> - pr_debug("clock: %s: decrementing usecount\n", clk->name);
> +// pr_debug("clock: %s: decrementing usecount\n", clk->name);
>
> clk->usecount--;
>
> if (clk->usecount > 0)
> return;
>
> - pr_debug("clock: %s: disabling in hardware\n", clk->name);
> +// pr_debug("clock: %s: disabling in hardware\n", clk->name);
>
> if (clk->ops && clk->ops->disable) {
> trace_clock_disable(clk->name, 0, smp_processor_id());
this hunk is unnecessary. Clocks are always on when they are called.
> @@ -290,14 +291,14 @@ int omap2_clk_enable(struct clk *clk)
> {
> int ret;
>
> - pr_debug("clock: %s: incrementing usecount\n", clk->name);
> +// pr_debug("clock: %s: incrementing usecount\n", clk->name);
>
> clk->usecount++;
>
> if (clk->usecount > 1)
> return 0;
>
> - pr_debug("clock: %s: enabling in hardware\n", clk->name);
> +// pr_debug("clock: %s: enabling in hardware\n", clk->name);
these two is ok.
> diff --git a/arch/arm/mach-omap2/omap_hwmod.c b/arch/arm/mach-omap2/omap_hwmod.c
> index 7ed0479..8ca7d40 100644
> --- a/arch/arm/mach-omap2/omap_hwmod.c
> +++ b/arch/arm/mach-omap2/omap_hwmod.c
> @@ -124,7 +124,8 @@
> * XXX error return values should be checked to ensure that they are
> * appropriate
> */
> -#undef DEBUG
> +//#undef DEBUG
> +#define DEBUG
trailing.
> @@ -597,7 +598,8 @@ static int _enable_clocks(struct omap_hwmod *oh)
> {
> int i;
>
> - pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
> + if (strcmp(oh->class->name, "uart"))
> + pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
instead of doing checks, you could move the print to the end of the
function, when clocks are already enabled. When doind that, of course,
update the comment to say "%s: clocks enabled\n".
> @@ -627,7 +629,8 @@ static int _disable_clocks(struct omap_hwmod *oh)
> {
> int i;
>
> - pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
> + if (strcmp(oh->class->name, "uart"))
> + pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
check not needed, clocks are still on.
>
> if (oh->_clk)
> clk_disable(oh->_clk);
> @@ -1232,7 +1235,8 @@ static int _enable(struct omap_hwmod *oh)
> return -EINVAL;
> }
>
> - pr_debug("omap_hwmod: %s: enabling\n", oh->name);
> + if (strcmp(oh->class->name, "uart"))
> + pr_debug("omap_hwmod: %s: enabling\n", oh->name);
move it further down.
> @@ -1264,8 +1268,9 @@ static int _enable(struct omap_hwmod *oh)
> }
> } else {
> _disable_clocks(oh);
> - pr_debug("omap_hwmod: %s: _wait_target_ready: %d\n",
> - oh->name, r);
> + if (strcmp(oh->class->name, "uart"))
> + pr_debug("omap_hwmod: %s: _wait_target_ready: %d\n",
> + oh->name, r);
instead of adding check, move the print before _disable_clocks(oh).
> @@ -1287,7 +1292,8 @@ static int _idle(struct omap_hwmod *oh)
> return -EINVAL;
> }
>
> - pr_debug("omap_hwmod: %s: idling\n", oh->name);
> + if (strcmp(oh->class->name, "uart"))
> + pr_debug("omap_hwmod: %s: idling\n", oh->name);
I believe clocks are still on here too, no checks needed.
> diff --git a/arch/arm/plat-omap/omap_device.c b/arch/arm/plat-omap/omap_device.c
> index 49fc0df..7b27704 100644
> --- a/arch/arm/plat-omap/omap_device.c
> +++ b/arch/arm/plat-omap/omap_device.c
> @@ -75,7 +75,8 @@
> * (device must be reinitialized at this point to use it again)
> *
> */
> -#undef DEBUG
> +//#undef DEBUG
> +#define DEBUG
trailing.
> @@ -114,7 +115,8 @@ static int _omap_device_activate(struct
> omap_device *od, u8 ignore_lat)
> {
> struct timespec a, b, c;
>
> - pr_debug("omap_device: %s: activating\n", od->pdev.name);
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_debug("omap_device: %s: activating\n", od->pdev.name);
move it to the end of the function.
> @@ -138,25 +140,29 @@ static int _omap_device_activate(struct
> omap_device *od, u8 ignore_lat)
> c = timespec_sub(b, a);
> act_lat = timespec_to_ns(&c);
>
> - pr_debug("omap_device: %s: pm_lat %d: activate: elapsed time "
> - "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> - act_lat);
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_debug("omap_device: %s: pm_lat %d: activate: elapsed time "
> + "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> + act_lat);
move it further down.
>
> if (act_lat > odpl->activate_lat) {
> odpl->activate_lat_worst = act_lat;
> if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
> odpl->activate_lat = act_lat;
> - pr_warning("omap_device: %s.%d: new worst case "
> - "activate latency %d: %llu\n",
> - od->pdev.name, od->pdev.id,
> - od->pm_lat_level, act_lat);
> - } else
> - pr_warning("omap_device: %s.%d: activate "
> - "latency %d higher than exptected. "
> - "(%llu > %d)\n",
> - od->pdev.name, od->pdev.id,
> - od->pm_lat_level, act_lat,
> - odpl->activate_lat);
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_warning("omap_device: %s.%d: new worst case "
> + "activate latency %d: %llu\n",
> + od->pdev.name, od->pdev.id,
> + od->pm_lat_level, act_lat);
> + } else {
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_warning("omap_device: %s.%d: activate "
> + "latency %d higher than exptected. "
> + "(%llu > %d)\n",
> + od->pdev.name, od->pdev.id,
> + od->pm_lat_level, act_lat,
> + odpl->activate_lat);
->activate_func() has already been called here, clocks are already on.
> @@ -183,7 +189,8 @@ static int _omap_device_deactivate(struct
> omap_device *od, u8 ignore_lat)
> {
> struct timespec a, b, c;
>
> - pr_debug("omap_device: %s: deactivating\n", od->pdev.name);
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_debug("omap_device: %s: deactivating\n", od->pdev.name);
clocks are still on here.
> @@ -206,25 +213,29 @@ static int _omap_device_deactivate(struct
> omap_device *od, u8 ignore_lat)
> c = timespec_sub(b, a);
> deact_lat = timespec_to_ns(&c);
>
> - pr_debug("omap_device: %s: pm_lat %d: deactivate: elapsed time "
> - "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> - deact_lat);
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_debug("omap_device: %s: pm_lat %d: deactivate: elapsed time "
> + "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> + deact_lat);
I'll leave this to Kevin to decide what to do, but clocks are off
here...
> if (deact_lat > odpl->deactivate_lat) {
> odpl->deactivate_lat_worst = deact_lat;
> if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
> odpl->deactivate_lat = deact_lat;
> - pr_warning("omap_device: %s.%d: new worst case "
> - "deactivate latency %d: %llu\n",
> - od->pdev.name, od->pdev.id,
> - od->pm_lat_level, deact_lat);
> - } else
> - pr_warning("omap_device: %s.%d: deactivate "
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_warning("omap_device: %s.%d: new worst case "
> + "deactivate latency %d: %llu\n",
> + od->pdev.name, od->pdev.id,
> + od->pm_lat_level, deact_lat);
> + } else {
> + if (strcmp(od->hwmods[0]->class->name, "uart"))
> + pr_warning("omap_device: %s.%d: deactivate "
> "latency %d higher than exptected. "
> "(%llu > %d)\n",
> od->pdev.name, od->pdev.id,
> od->pm_lat_level, deact_lat,
> odpl->deactivate_lat);
> + }
and here...
--
balbi
[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply
* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: MyungJoo Ham @ 2011-08-01 6:22 UTC (permalink / raw)
To: Turquette, Mike
Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, Thomas Gleixner,
linux-pm
In-Reply-To: <CAJOA=zPL_DUBb2miytFH6s3AKqroGDBsOPts=SdCb3p0c=cP=A@mail.gmail.com>
Hello.
On Sat, Jul 30, 2011 at 10:02 AM, Turquette, Mike <mturquette@ti.com> wrote:
> On Fri, Jul 29, 2011 at 2:10 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Friday, July 29, 2011, Turquette, Mike wrote:
>>> On Thu, Jul 28, 2011 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>> > On Friday, July 15, 2011, MyungJoo Ham wrote:
>>> >> For a usage example, please look at
>>> >> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>> >>
>>> >> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>>> >> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>>> >> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>>> >> and other related clocks simply follow the determined DDR RAM clock.
>>> >>
>>> >> The DEVFREQ driver for Exynos4210 memory bus is at
>>> >> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>>> >>
>>> >> MyungJoo Ham (3):
>>> >> PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>> >> OPPs
>>> >> PM / DEVFREQ: add example governors
>>> >> PM / DEVFREQ: add sysfs interface (including user tickling)
>>> >
>>> > OK, I'm going to take the patches for 3.2.
>>>
>>> Have any other platforms signed up to use this mechanism to manage
>>> their peripheral DVFS?
>>
>> Not that I know of, but one initial user is sufficient for me.
>> So if you have anything _against_ the patches, please speak up.
>
> I do have some concerns. Let me start by saying that I'm defining a
> "governor" as some active piece of executing code, probably a looping
> workqueue that inspects activity/idleness of a device and then makes a
> determination regarding clock frequency.
>
> devfreq seems to be good framework for creating DVFS governors.
> However I think that most scalable devices on an SoC do *not* need a
> governor, and many scalable devices won't have performance counters or
> any other way to implement such introspection.
Yes, governors except for some static or userspace-driven ones (such
as "performance", "powersave", and "userspace" although "userspace" is
not implemented for devfreq yet), they loop workqueue that inspects
activity/idleness of a device and determines frequency. However, the
inspection is done with a callback provided by each device, not done
directly by the devfreq itself. Therefore, if there is any way to
measure the activities (not just performance counters, number of
requests/function calls should be fine for may cases), normal
governors like "simple-ondemand" will work.
> Some examples include a MMC controller, which might change its clock
> rate depending on the class of card that the user has inserted. Or
> even a "smartish" device like a GPU lacking performance counters; it's
> driver will ramp up frequency when there is work to be done and kick
> off a timeout. If no new work comes in before the timeout then the
> driver will drop the frequency.
In the "simple MMC controller w/o performance counter" case, there are
following ways to use devfreq even if using the number of requests or
functions calls is not possible.
Method 1) use "userspace" governor and let user process choose
frequency based on the class
Method 2) use any "reasonable" governor and let the device driver set
only "valid" frequencies enabled.
For a rough example, we may do if class < 6, disable freq > 40MHz,
class < 10, disable freq > 80MHz, and so on. If we do not have
performance counters or any other mechanisms to monitor the
activities, "performance" governor along with clock-gated MMC driver
will save enough power.
For GPUs without anything to monitor the activities, we may do the
same as the MMC case.
However, with the H/W I've got now, (Exynos4210), we have performance
counters (PPMU) for many blocks: 3D(MALI GPU), ACP, CAMIF, CPU, DMC0,
DMC1 (memory controllers), FSYS, IMAGE, LCD0, LCD1, MFC_L, MFC_R, TV,
LEFT_BUS, and RIGHT_BUS. I don't think Exynos4 is an exceptionally
fancy SoC (already millions are sold for phones) and other mobile SoCs
(at least for flagship models) will have them very soon (or already
have them). Along with this patch, in the example with git branch
link, we control DMC0/DMC1 blocks. And,
> A governor is not required in these cases (as they are event driven)
> and devfreq is quite heavyweight for such applications. What is
> needed is a QoS-style software layer that allows throughput requests
> to be made from an initiator device towards a target device. This
> layer should aggregate requests since many initator devices may make
> requests to the same target device. This layer I'm describing, which
> does not exist today, should be where the actual DVFS transition takes
> place. That could take the form of a clk_set_rate call in the clock
> framework (as described by Colin in V1 of this series), or some other
> not-yet-realized dvfs_set_opp ,or something like Jean Pihet's
> per-device PM QoS patches or whatever. For the purposes of this email
> I don't really care which framework implements the QoS request
> aggregation.
Such aggregation could be also done with governors. If the
governor-device pair does not want to poll devfreq wouldn't loop
unless there is any governor-device pair that wants to do so. If it is
event-driven, users may just "allow/disallow" frequencies with OPP
framework and devfreq will choose proper frequency with the given
governor for the device. If every device uses "static" or
"event-driven" governors such as powersave/performance/userspace,
there will be no polling/looping.
When it is going to be directly controlled by userspace, we'll need a
"userspace" governor (same with userspace governor of cpufreq).
If there is a QoS request for a devfreq-ed device, the request could
be done with OPP's frequency enable/disable. If a device is to be
executed at 400MHz or faster, all frequencies under 400MHz could be
simply disabled w/ OPP. Devfreq governors cannot override such
frequency enable/disable configurations.
However, if such QoS requests need delays (timers) like tickle, a
generalized tickle supplied with frequency or percent of max-frequency
might work. (i.e., tickle(dev, freuqency, duration); ) Then, this
generalized tickle will hold at the request frequency or higher by
disabling lower frequencies temporarily.
>
> The point of describing this non-existant API is that devfreq should
> really be just another input into it. A governor that can measure bus
> saturation is really cool, but it may not yield optimal results
> compared to several drivers which make QoS-style requests and insure
> that performance is guaranteed for their particular needs during their
> transactions. The good news is that we don't have to choose between
> performance counter introspection and software QoS requests: both the
> driver requests and the governor should all feed as inputs into the
> QoS-style DVFS mechanism.
>
> Taking that logic to its inevitable conclusion, tickle doesn't belong
> inside the governor at all. If some device X wants to ramp up the
> frequency of device Y, it should just make a QoS-style throughput
> request towards device Y, possibly with a timeout (keeping the
> original idea of tickle intact). This is entirely a separate idea
> from a governor's introspective workqueue loop.
Although tickle is sharing the same loop with governors, tickle does
not belong inside governors. Tickle overrides the decisions of
governors; governor's decision function is not called if the device is
being tickled. However, generalizing the tickle function so that it
may take "at least at xx % of max frequency" or "operate at least xx
khz" as an option seems reasonable for QoS requests. And such options
might be implemented for next version of devfreq later. This requires
modification in tickle function interface or adding another interface
for tickle function. However, if such QoS requests do not need
duration set, we can just go with OPP's frequency enable/disable and
disable lower-than-QoS-requirement frequencies.
Thus, I guess this QoS issue is somewhat not very significant for
devfreq. And it can be easily mitigated by adding another interface or
modifying the interface of tickle function.
>
> For userspace, a sysfs entry for tickle would also not feed into the
> governor, but some dummy struct device *user would probably be the
> initiator device and it would simply call the QoS-style throughput
> API.
>
> In summary my objections to this series are:
> 1) devfreq should not be the *final* software layer to invoke a DVFS
> transition as it has not taken all constraints into account.
> 2) a devfreq governor represents just one constraint out of many to be
> considered for any given scalable device.
If the concern is about the QoS requests, I guess generalizing tickle
would be sufficient as above. For devices without performance counters
and any other mechanisms to infer the usage statistics, "performance"
governor with event-driven OPP freq-enable/disable should be fine.
>
> My objection to these patches getting merged is that I think they are
> a bit ahead of their time. We need to know what the real DVFS API
> looks like underneath devfreq first, since devfreq should really be
> built on top of it.
>
> Regards,
> Mike
>
>> Thanks,
>> Rafael
>>
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>
Cheers!
MyungJoo.
--
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm
^ permalink raw reply
* Re: [PATCH] mrst_pmu: driver for Intel Moorestown Power Management Unit
From: Len Brown @ 2011-08-01 5:47 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-pm, linux-kernel
In-Reply-To: <20110713105105.63e2fbcb@lxorguk.ukuu.org.uk>
> > +static struct mrst_device *pci_id_2_mrst_dev(u16 pci_dev_num)
> > +{
...
> > + else
> > + BUG();
> > +
> > + BUG_ON(pci_dev_num != mrst_devs[index].pci_dev_num);
>
> That strikes me as needlessly unfriendly, you could warn/return NULL and
> propogate a WARN_ONCE back to the user.
This code asserts that the firmware is correct, and today it is.
If somebody breaks the firmware in the future, I want them to discover
it the instant they break it. If I had the option of shooting
lazer beams out of the system to instantly kill the firmware programmer,
I'd do it. But alas, I have to settle for crashing their system:-)
> > +static int __init scu_fw_check(void)
if (!mrst_pmu)
return 0;
> > +late_initcall(scu_fw_check);
>
> NAK. I pointed this problem with the driver to you way back - this code
> gets run always - even on machines that are not in fact Moorestown which
> are then going to crash.
>
> You need to check that the platform is Moorestown (not just the CPU
> either because of Oaktrail)
Right-o. I think the 1-liner to check mrst_pmu above should do it.
thanks for reading, and sorry for being a slow listener,
-Len
^ permalink raw reply
* [PATCH 9/9] ARM / shmobile: Make A3RV be a subdomain of A4LC on SH7372
From: Rafael J. Wysocki @ 2011-07-31 17:52 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
Instead of coding the undocumented dependencies between power domains
A3RV and A4LC on SH7372 directly into the low-level power up/down
routines, make A3RV be a subdomain of A4LC, which will cause the
same dependecies to hold.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
arch/arm/mach-shmobile/pm-sh7372.c | 42 +---------------------------------
arch/arm/mach-shmobile/setup-sh7372.c | 3 ++
2 files changed, 5 insertions(+), 40 deletions(-)
Index: linux-2.6/arch/arm/mach-shmobile/pm-sh7372.c
===================================================================
--- linux-2.6.orig/arch/arm/mach-shmobile/pm-sh7372.c
+++ linux-2.6/arch/arm/mach-shmobile/pm-sh7372.c
@@ -91,35 +91,6 @@ static int pd_power_up(struct generic_pm
return ret;
}
-static int pd_power_up_a3rv(struct generic_pm_domain *genpd)
-{
- int ret = pd_power_up(genpd);
-
- /* force A4LC on after A3RV has been requested on */
- pm_genpd_poweron(&sh7372_a4lc.genpd);
-
- return ret;
-}
-
-static int pd_power_down_a3rv(struct generic_pm_domain *genpd)
-{
- int ret = pd_power_down(genpd);
-
- /* try to power down A4LC after A3RV is requested off */
- genpd_queue_power_off_work(&sh7372_a4lc.genpd);
-
- return ret;
-}
-
-static int pd_power_down_a4lc(struct generic_pm_domain *genpd)
-{
- /* only power down A4LC if A3RV is off */
- if (!(__raw_readl(PSTR) & (1 << sh7372_a3rv.bit_shift)))
- return pd_power_down(genpd);
-
- return -EBUSY;
-}
-
static bool pd_active_wakeup(struct device *dev)
{
return true;
@@ -133,17 +104,8 @@ void sh7372_init_pm_domain(struct sh7372
genpd->stop_device = pm_clk_suspend;
genpd->start_device = pm_clk_resume;
genpd->active_wakeup = pd_active_wakeup;
-
- if (sh7372_pd == &sh7372_a4lc) {
- genpd->power_off = pd_power_down_a4lc;
- genpd->power_on = pd_power_up;
- } else if (sh7372_pd == &sh7372_a3rv) {
- genpd->power_off = pd_power_down_a3rv;
- genpd->power_on = pd_power_up_a3rv;
- } else {
- genpd->power_off = pd_power_down;
- genpd->power_on = pd_power_up;
- }
+ genpd->power_off = pd_power_down;
+ genpd->power_on = pd_power_up;
genpd->power_on(&sh7372_pd->genpd);
}
Index: linux-2.6/arch/arm/mach-shmobile/setup-sh7372.c
===================================================================
--- linux-2.6.orig/arch/arm/mach-shmobile/setup-sh7372.c
+++ linux-2.6/arch/arm/mach-shmobile/setup-sh7372.c
@@ -30,6 +30,7 @@
#include <linux/sh_dma.h>
#include <linux/sh_intc.h>
#include <linux/sh_timer.h>
+#include <linux/pm_domain.h>
#include <mach/hardware.h>
#include <mach/sh7372.h>
#include <asm/mach-types.h>
@@ -848,6 +849,8 @@ void __init sh7372_add_standard_devices(
sh7372_init_pm_domain(&sh7372_a3ri);
sh7372_init_pm_domain(&sh7372_a3sg);
+ pm_genpd_add_subdomain(&sh7372_a4lc.genpd, &sh7372_a3sg.genpd);
+
platform_add_devices(sh7372_early_devices,
ARRAY_SIZE(sh7372_early_devices));
^ permalink raw reply
* [PATCH 8/9] PM / Domains: Rename second argument of pm_genpd_add_subdomain()
From: Rafael J. Wysocki @ 2011-07-31 17:52 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
Change the name of the second argument of pm_genpd_add_subdomain()
so that it is (a) shorter and (b) in agreement with the name of
the second argument of pm_genpd_add_subdomain().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -1163,36 +1163,36 @@ int pm_genpd_remove_device(struct generi
/**
* pm_genpd_add_subdomain - Add a subdomain to an I/O PM domain.
* @genpd: Master PM domain to add the subdomain to.
- * @new_subdomain: Subdomain to be added.
+ * @subdomain: Subdomain to be added.
*/
int pm_genpd_add_subdomain(struct generic_pm_domain *genpd,
- struct generic_pm_domain *new_subdomain)
+ struct generic_pm_domain *subdomain)
{
struct gpd_link *link;
int ret = 0;
- if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(new_subdomain))
+ if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(subdomain))
return -EINVAL;
start:
genpd_acquire_lock(genpd);
- mutex_lock_nested(&new_subdomain->lock, SINGLE_DEPTH_NESTING);
+ mutex_lock_nested(&subdomain->lock, SINGLE_DEPTH_NESTING);
- if (new_subdomain->status != GPD_STATE_POWER_OFF
- && new_subdomain->status != GPD_STATE_ACTIVE) {
- mutex_unlock(&new_subdomain->lock);
+ if (subdomain->status != GPD_STATE_POWER_OFF
+ && subdomain->status != GPD_STATE_ACTIVE) {
+ mutex_unlock(&subdomain->lock);
genpd_release_lock(genpd);
goto start;
}
if (genpd->status == GPD_STATE_POWER_OFF
- && new_subdomain->status != GPD_STATE_POWER_OFF) {
+ && subdomain->status != GPD_STATE_POWER_OFF) {
ret = -EINVAL;
goto out;
}
list_for_each_entry(link, &genpd->slave_links, slave_node) {
- if (link->slave == new_subdomain && link->master == genpd) {
+ if (link->slave == subdomain && link->master == genpd) {
ret = -EINVAL;
goto out;
}
@@ -1205,13 +1205,13 @@ int pm_genpd_add_subdomain(struct generi
}
link->master = genpd;
list_add_tail(&link->master_node, &genpd->master_links);
- link->slave = new_subdomain;
- list_add_tail(&link->slave_node, &new_subdomain->slave_links);
- if (new_subdomain->status != GPD_STATE_POWER_OFF)
+ link->slave = subdomain;
+ list_add_tail(&link->slave_node, &subdomain->slave_links);
+ if (subdomain->status != GPD_STATE_POWER_OFF)
genpd_sd_counter_inc(genpd);
out:
- mutex_unlock(&new_subdomain->lock);
+ mutex_unlock(&subdomain->lock);
genpd_release_lock(genpd);
return ret;
^ permalink raw reply
* [PATCH 7/9] PM / Domains: Rename GPD_STATE_WAIT_PARENT to GPD_STATE_WAIT_MASTER
From: Rafael J. Wysocki @ 2011-07-31 17:51 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
Since it is now possible for a PM domain to have multiple masters
instead of one parent, rename the "wait for parent" status to reflect
the new situation.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 34 +++++++++++++++++-----------------
include/linux/pm_domain.h | 2 +-
2 files changed, 18 insertions(+), 18 deletions(-)
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -98,7 +98,7 @@ int __pm_genpd_poweron(struct generic_pm
for (;;) {
prepare_to_wait(&genpd->status_wait_queue, &wait,
TASK_UNINTERRUPTIBLE);
- if (genpd->status != GPD_STATE_WAIT_PARENT)
+ if (genpd->status != GPD_STATE_WAIT_MASTER)
break;
mutex_unlock(&genpd->lock);
@@ -124,7 +124,7 @@ int __pm_genpd_poweron(struct generic_pm
*/
list_for_each_entry(link, &genpd->slave_links, slave_node) {
genpd_sd_counter_inc(link->master);
- genpd->status = GPD_STATE_WAIT_PARENT;
+ genpd->status = GPD_STATE_WAIT_MASTER;
mutex_unlock(&genpd->lock);
@@ -258,7 +258,7 @@ static void __pm_genpd_restore_device(st
*/
static bool genpd_abort_poweroff(struct generic_pm_domain *genpd)
{
- return genpd->status == GPD_STATE_WAIT_PARENT
+ return genpd->status == GPD_STATE_WAIT_MASTER
|| genpd->status == GPD_STATE_ACTIVE || genpd->resume_count > 0;
}
@@ -300,7 +300,7 @@ static int pm_genpd_poweroff(struct gene
* (4) System suspend is in progress.
*/
if (genpd->status == GPD_STATE_POWER_OFF
- || genpd->status == GPD_STATE_WAIT_PARENT
+ || genpd->status == GPD_STATE_WAIT_MASTER
|| genpd->resume_count > 0 || genpd->prepared_count > 0)
return 0;
Index: linux-2.6/include/linux/pm_domain.h
===================================================================
--- linux-2.6.orig/include/linux/pm_domain.h
+++ linux-2.6/include/linux/pm_domain.h
@@ -13,7 +13,7 @@
enum gpd_status {
GPD_STATE_ACTIVE = 0, /* PM domain is active */
- GPD_STATE_WAIT_PARENT, /* PM domain's parent is being waited for */
+ GPD_STATE_WAIT_MASTER, /* PM domain's master is being waited for */
GPD_STATE_BUSY, /* Something is happening to the PM domain */
GPD_STATE_REPEAT, /* Power off in progress, to be repeated */
GPD_STATE_POWER_OFF, /* PM domain is off */
^ permalink raw reply
* [PATCH 6/9] PM / Domains: Allow generic PM domains to have multiple masters
From: Rafael J. Wysocki @ 2011-07-31 17:50 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
Currently, for a given generic PM domain there may be only one parent
domain (i.e. a PM domain it depends on). However, there is at least
one real-life case in which there should be two parents (masters) for
one PM domain (the A3RV domain on SH7372 turns out to depend on the
A4LC domain and it depends on the A4R domain and the same time). For
this reason, allow a PM domain to have multiple parents (masters) by
introducing objects representing links between PM domains.
The (logical) links between PM domains represent relationships in
which one domain is a master (i.e. it is depended on) and another
domain is a slave (i.e. it depends on the master) with the rule that
the slave cannot be powered on if the master is not powered on and
the master cannot be powered off if the slave is not powered off.
Each struct generic_pm_domain object representing a PM domain has
two lists of links, a list of links in which it is a master and
a list of links in which it is a slave. The first of these lists
replaces the list of subdomains and the second one is used in place
of the parent pointer.
Each link is represented by struct gpd_link object containing
pointers to the master and the slave and two struct list_head
members allowing it to hook into two lists (the master's list
of "master" links and the slave's list of "slave" links). This
allows the code to get to the link from each side (either from
the master or from the slave) and follow it in each direction.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 99 +++++++++++++++++++++++++-------------------
include/linux/pm_domain.h | 12 ++++-
2 files changed, 67 insertions(+), 44 deletions(-)
Index: linux-2.6/include/linux/pm_domain.h
===================================================================
--- linux-2.6.orig/include/linux/pm_domain.h
+++ linux-2.6/include/linux/pm_domain.h
@@ -26,9 +26,8 @@ struct dev_power_governor {
struct generic_pm_domain {
struct dev_pm_domain domain; /* PM domain operations */
struct list_head gpd_list_node; /* Node in the global PM domains list */
- struct list_head sd_node; /* Node in the parent's subdomain list */
- struct generic_pm_domain *parent; /* Parent PM domain */
- struct list_head sd_list; /* List of dubdomains */
+ struct list_head master_links; /* Links with PM domain as a master */
+ struct list_head slave_links; /* Links with PM domain as a slave */
struct list_head dev_list; /* List of devices */
struct mutex lock;
struct dev_power_governor *gov;
@@ -55,6 +54,13 @@ static inline struct generic_pm_domain *
return container_of(pd, struct generic_pm_domain, domain);
}
+struct gpd_link {
+ struct generic_pm_domain *master;
+ struct list_head master_node;
+ struct generic_pm_domain *slave;
+ struct list_head slave_node;
+};
+
struct dev_list_entry {
struct list_head node;
struct device *dev;
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -81,19 +81,20 @@ static void genpd_set_active(struct gene
}
/**
- * __pm_genpd_poweron - Restore power to a given PM domain and its parents.
+ * __pm_genpd_poweron - Restore power to a given PM domain and its masters.
* @genpd: PM domain to power up.
*
- * Restore power to @genpd and all of its parents so that it is possible to
+ * Restore power to @genpd and all of its masters so that it is possible to
* resume a device belonging to it.
*/
int __pm_genpd_poweron(struct generic_pm_domain *genpd)
__releases(&genpd->lock) __acquires(&genpd->lock)
{
+ struct gpd_link *link;
DEFINE_WAIT(wait);
int ret = 0;
- /* If the domain's parent is being waited for, we have to wait too. */
+ /* If the domain's master is being waited for, we have to wait too. */
for (;;) {
prepare_to_wait(&genpd->status_wait_queue, &wait,
TASK_UNINTERRUPTIBLE);
@@ -116,24 +117,31 @@ int __pm_genpd_poweron(struct generic_pm
return 0;
}
- if (genpd->parent) {
- genpd_sd_counter_inc(genpd->parent);
+ /*
+ * The list is guaranteed not to change while the loop below is being
+ * executed, unless one of the masters' .power_on() callbacks fiddles
+ * with it.
+ */
+ list_for_each_entry(link, &genpd->slave_links, slave_node) {
+ genpd_sd_counter_inc(link->master);
genpd->status = GPD_STATE_WAIT_PARENT;
mutex_unlock(&genpd->lock);
- ret = pm_genpd_poweron(genpd->parent);
+ ret = pm_genpd_poweron(link->master);
mutex_lock(&genpd->lock);
/*
* The "wait for parent" status is guaranteed not to change
- * while the parent is powering on.
+ * while the master is powering on.
*/
genpd->status = GPD_STATE_POWER_OFF;
wake_up_all(&genpd->status_wait_queue);
- if (ret)
+ if (ret) {
+ genpd_sd_counter_dec(link->master);
goto err;
+ }
}
if (genpd->power_on) {
@@ -147,14 +155,14 @@ int __pm_genpd_poweron(struct generic_pm
return 0;
err:
- if (genpd->parent)
- genpd_sd_counter_dec(genpd->parent);
+ list_for_each_entry_continue_reverse(link, &genpd->slave_links, slave_node)
+ genpd_sd_counter_dec(link->master);
return ret;
}
/**
- * pm_genpd_poweron - Restore power to a given PM domain and its parents.
+ * pm_genpd_poweron - Restore power to a given PM domain and its masters.
* @genpd: PM domain to power up.
*/
int pm_genpd_poweron(struct generic_pm_domain *genpd)
@@ -278,8 +286,8 @@ void genpd_queue_power_off_work(struct g
static int pm_genpd_poweroff(struct generic_pm_domain *genpd)
__releases(&genpd->lock) __acquires(&genpd->lock)
{
- struct generic_pm_domain *parent;
struct dev_list_entry *dle;
+ struct gpd_link *link;
unsigned int not_suspended;
int ret = 0;
@@ -287,7 +295,7 @@ static int pm_genpd_poweroff(struct gene
/*
* Do not try to power off the domain in the following situations:
* (1) The domain is already in the "power off" state.
- * (2) The domain is waiting for its parent to power up.
+ * (2) The domain is waiting for its master to power up.
* (3) One of the domain's devices is being resumed right now.
* (4) System suspend is in progress.
*/
@@ -349,8 +357,8 @@ static int pm_genpd_poweroff(struct gene
}
/*
- * If sd_count > 0 at this point, one of the children hasn't
- * managed to call pm_genpd_poweron() for the parent yet after
+ * If sd_count > 0 at this point, one of the subdomains hasn't
+ * managed to call pm_genpd_poweron() for the master yet after
* incrementing it. In that case pm_genpd_poweron() will wait
* for us to drop the lock, so we can call .power_off() and let
* the pm_genpd_poweron() restore power for us (this shouldn't
@@ -365,9 +373,10 @@ static int pm_genpd_poweroff(struct gene
genpd->status = GPD_STATE_POWER_OFF;
- parent = genpd->parent;
- if (parent && genpd_sd_counter_dec(parent))
- genpd_queue_power_off_work(parent);
+ list_for_each_entry(link, &genpd->slave_links, slave_node) {
+ genpd_sd_counter_dec(link->master);
+ genpd_queue_power_off_work(link->master);
+ }
out:
genpd->poweroff_task = NULL;
@@ -512,11 +521,11 @@ static inline void __pm_genpd_runtime_re
#ifdef CONFIG_PM_SLEEP
/**
- * pm_genpd_sync_poweroff - Synchronously power off a PM domain and its parents.
+ * pm_genpd_sync_poweroff - Synchronously power off a PM domain and its masters.
* @genpd: PM domain to power off, if possible.
*
* Check if the given PM domain can be powered off (during system suspend or
- * hibernation) and do that if so. Also, in that case propagate to its parent.
+ * hibernation) and do that if so. Also, in that case propagate to its masters.
*
* This function is only called in "noirq" stages of system power transitions,
* so it need not acquire locks (all of the "noirq" callbacks are executed
@@ -524,7 +533,7 @@ static inline void __pm_genpd_runtime_re
*/
static void pm_genpd_sync_poweroff(struct generic_pm_domain *genpd)
{
- struct generic_pm_domain *parent = genpd->parent;
+ struct gpd_link *link;
if (genpd->status == GPD_STATE_POWER_OFF)
return;
@@ -537,9 +546,10 @@ static void pm_genpd_sync_poweroff(struc
genpd->power_off(genpd);
genpd->status = GPD_STATE_POWER_OFF;
- if (parent) {
- genpd_sd_counter_dec(parent);
- pm_genpd_sync_poweroff(parent);
+
+ list_for_each_entry(link, &genpd->slave_links, slave_node) {
+ genpd_sd_counter_dec(link->master);
+ pm_genpd_sync_poweroff(link->master);
}
}
@@ -1158,7 +1168,7 @@ int pm_genpd_remove_device(struct generi
int pm_genpd_add_subdomain(struct generic_pm_domain *genpd,
struct generic_pm_domain *new_subdomain)
{
- struct generic_pm_domain *subdomain;
+ struct gpd_link *link;
int ret = 0;
if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(new_subdomain))
@@ -1181,16 +1191,23 @@ int pm_genpd_add_subdomain(struct generi
goto out;
}
- list_for_each_entry(subdomain, &genpd->sd_list, sd_node) {
- if (subdomain == new_subdomain) {
+ list_for_each_entry(link, &genpd->slave_links, slave_node) {
+ if (link->slave == new_subdomain && link->master == genpd) {
ret = -EINVAL;
goto out;
}
}
- list_add_tail(&new_subdomain->sd_node, &genpd->sd_list);
- new_subdomain->parent = genpd;
- if (subdomain->status != GPD_STATE_POWER_OFF)
+ link = kzalloc(sizeof(*link), GFP_KERNEL);
+ if (!link) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ link->master = genpd;
+ list_add_tail(&link->master_node, &genpd->master_links);
+ link->slave = new_subdomain;
+ list_add_tail(&link->slave_node, &new_subdomain->slave_links);
+ if (new_subdomain->status != GPD_STATE_POWER_OFF)
genpd_sd_counter_inc(genpd);
out:
@@ -1203,22 +1220,22 @@ int pm_genpd_add_subdomain(struct generi
/**
* pm_genpd_remove_subdomain - Remove a subdomain from an I/O PM domain.
* @genpd: Master PM domain to remove the subdomain from.
- * @target: Subdomain to be removed.
+ * @subdomain: Subdomain to be removed.
*/
int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
- struct generic_pm_domain *target)
+ struct generic_pm_domain *subdomain)
{
- struct generic_pm_domain *subdomain;
+ struct gpd_link *link;
int ret = -EINVAL;
- if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(target))
+ if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(subdomain))
return -EINVAL;
start:
genpd_acquire_lock(genpd);
- list_for_each_entry(subdomain, &genpd->sd_list, sd_node) {
- if (subdomain != target)
+ list_for_each_entry(link, &genpd->master_links, master_node) {
+ if (link->slave != subdomain)
continue;
mutex_lock_nested(&subdomain->lock, SINGLE_DEPTH_NESTING);
@@ -1230,8 +1247,9 @@ int pm_genpd_remove_subdomain(struct gen
goto start;
}
- list_del(&subdomain->sd_node);
- subdomain->parent = NULL;
+ list_del(&link->master_node);
+ list_del(&link->slave_node);
+ kfree(link);
if (subdomain->status != GPD_STATE_POWER_OFF)
genpd_sd_counter_dec(genpd);
@@ -1258,10 +1276,9 @@ void pm_genpd_init(struct generic_pm_dom
if (IS_ERR_OR_NULL(genpd))
return;
- INIT_LIST_HEAD(&genpd->sd_node);
- genpd->parent = NULL;
+ INIT_LIST_HEAD(&genpd->master_links);
+ INIT_LIST_HEAD(&genpd->slave_links);
INIT_LIST_HEAD(&genpd->dev_list);
- INIT_LIST_HEAD(&genpd->sd_list);
mutex_init(&genpd->lock);
genpd->gov = gov;
INIT_WORK(&genpd->power_off_work, genpd_power_off_work_fn);
^ permalink raw reply
* [PATCH 5/9] PM / Domains: Add "wait for parent" status for generic PM domains
From: Rafael J. Wysocki @ 2011-07-31 17:49 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
The next patch will make it possible for a generic PM domain to have
multiple parents (i.e. multiple PM domains it depends on). To
prepare for that change it is necessary to change pm_genpd_poweron()
so that it doesn't jump to the start label after running itself
recursively for the parent domain. For this purpose, introduce a new
PM domain status value GPD_STATE_WAIT_PARENT that will be set by
pm_genpd_poweron() before calling itself recursively for the parent
domain and modify the code in drivers/base/power/domain.c so that
the GPD_STATE_WAIT_PARENT status is guaranteed to be preserved during
the execution of pm_genpd_poweron() for the parent.
This change also causes pm_genpd_add_subdomain() and
pm_genpd_remove_subdomain() to wait for started pm_genpd_poweron() to
complete and allows pm_genpd_runtime_resume() to avoid dropping the
lock after powering on the PM domain.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 90 +++++++++++++++++++++++++++++---------------
include/linux/pm_domain.h | 1
2 files changed, 61 insertions(+), 30 deletions(-)
Index: linux-2.6/include/linux/pm_domain.h
===================================================================
--- linux-2.6.orig/include/linux/pm_domain.h
+++ linux-2.6/include/linux/pm_domain.h
@@ -13,6 +13,7 @@
enum gpd_status {
GPD_STATE_ACTIVE = 0, /* PM domain is active */
+ GPD_STATE_WAIT_PARENT, /* PM domain's parent is being waited for */
GPD_STATE_BUSY, /* Something is happening to the PM domain */
GPD_STATE_REPEAT, /* Power off in progress, to be repeated */
GPD_STATE_POWER_OFF, /* PM domain is off */
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -81,45 +81,59 @@ static void genpd_set_active(struct gene
}
/**
- * pm_genpd_poweron - Restore power to a given PM domain and its parents.
+ * __pm_genpd_poweron - Restore power to a given PM domain and its parents.
* @genpd: PM domain to power up.
*
* Restore power to @genpd and all of its parents so that it is possible to
* resume a device belonging to it.
*/
-int pm_genpd_poweron(struct generic_pm_domain *genpd)
+int __pm_genpd_poweron(struct generic_pm_domain *genpd)
+ __releases(&genpd->lock) __acquires(&genpd->lock)
{
- struct generic_pm_domain *parent;
+ DEFINE_WAIT(wait);
int ret = 0;
- mutex_lock(&genpd->lock);
+ /* If the domain's parent is being waited for, we have to wait too. */
+ for (;;) {
+ prepare_to_wait(&genpd->status_wait_queue, &wait,
+ TASK_UNINTERRUPTIBLE);
+ if (genpd->status != GPD_STATE_WAIT_PARENT)
+ break;
+ mutex_unlock(&genpd->lock);
- parent = genpd->parent;
+ schedule();
+
+ mutex_lock(&genpd->lock);
+ }
+ finish_wait(&genpd->status_wait_queue, &wait);
- start:
if (genpd->status == GPD_STATE_ACTIVE
|| (genpd->prepared_count > 0 && genpd->suspend_power_off))
- goto out;
+ return 0;
if (genpd->status != GPD_STATE_POWER_OFF) {
genpd_set_active(genpd);
- goto out;
+ return 0;
}
- if (parent) {
- genpd_sd_counter_inc(parent);
+ if (genpd->parent) {
+ genpd_sd_counter_inc(genpd->parent);
+ genpd->status = GPD_STATE_WAIT_PARENT;
mutex_unlock(&genpd->lock);
- ret = pm_genpd_poweron(parent);
+ ret = pm_genpd_poweron(genpd->parent);
mutex_lock(&genpd->lock);
+ /*
+ * The "wait for parent" status is guaranteed not to change
+ * while the parent is powering on.
+ */
+ genpd->status = GPD_STATE_POWER_OFF;
+ wake_up_all(&genpd->status_wait_queue);
if (ret)
goto err;
-
- parent = NULL;
- goto start;
}
if (genpd->power_on) {
@@ -130,16 +144,27 @@ int pm_genpd_poweron(struct generic_pm_d
genpd_set_active(genpd);
- out:
- mutex_unlock(&genpd->lock);
-
- return ret;
+ return 0;
err:
if (genpd->parent)
genpd_sd_counter_dec(genpd->parent);
- goto out;
+ return ret;
+}
+
+/**
+ * pm_genpd_poweron - Restore power to a given PM domain and its parents.
+ * @genpd: PM domain to power up.
+ */
+int pm_genpd_poweron(struct generic_pm_domain *genpd)
+{
+ int ret;
+
+ mutex_lock(&genpd->lock);
+ ret = __pm_genpd_poweron(genpd);
+ mutex_unlock(&genpd->lock);
+ return ret;
}
#endif /* CONFIG_PM */
@@ -225,7 +250,8 @@ static void __pm_genpd_restore_device(st
*/
static bool genpd_abort_poweroff(struct generic_pm_domain *genpd)
{
- return genpd->status == GPD_STATE_ACTIVE || genpd->resume_count > 0;
+ return genpd->status == GPD_STATE_WAIT_PARENT
+ || genpd->status == GPD_STATE_ACTIVE || genpd->resume_count > 0;
}
/**
@@ -261,11 +287,13 @@ static int pm_genpd_poweroff(struct gene
/*
* Do not try to power off the domain in the following situations:
* (1) The domain is already in the "power off" state.
- * (2) System suspend is in progress.
+ * (2) The domain is waiting for its parent to power up.
* (3) One of the domain's devices is being resumed right now.
+ * (4) System suspend is in progress.
*/
- if (genpd->status == GPD_STATE_POWER_OFF || genpd->prepared_count > 0
- || genpd->resume_count > 0)
+ if (genpd->status == GPD_STATE_POWER_OFF
+ || genpd->status == GPD_STATE_WAIT_PARENT
+ || genpd->resume_count > 0 || genpd->prepared_count > 0)
return 0;
if (atomic_read(&genpd->sd_count) > 0)
@@ -299,14 +327,15 @@ static int pm_genpd_poweroff(struct gene
list_for_each_entry_reverse(dle, &genpd->dev_list, node) {
ret = atomic_read(&genpd->sd_count) == 0 ?
__pm_genpd_save_device(dle, genpd) : -EBUSY;
+
+ if (genpd_abort_poweroff(genpd))
+ goto out;
+
if (ret) {
genpd_set_active(genpd);
goto out;
}
- if (genpd_abort_poweroff(genpd))
- goto out;
-
if (genpd->status == GPD_STATE_REPEAT) {
genpd->poweroff_task = NULL;
goto start;
@@ -432,11 +461,12 @@ static int pm_genpd_runtime_resume(struc
if (IS_ERR(genpd))
return -EINVAL;
- ret = pm_genpd_poweron(genpd);
- if (ret)
- return ret;
-
mutex_lock(&genpd->lock);
+ ret = __pm_genpd_poweron(genpd);
+ if (ret) {
+ mutex_unlock(&genpd->lock);
+ return ret;
+ }
genpd->status = GPD_STATE_BUSY;
genpd->resume_count++;
for (;;) {
^ permalink raw reply
* [PATCH 4/9] PM / Domains: Make pm_genpd_poweron() always survive parent removal
From: Rafael J. Wysocki @ 2011-07-31 17:49 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
If pm_genpd_remove_subdomain() is called to remove a PM domain's
subdomain and pm_genpd_poweron() is called for that subdomain at
the same time, and the pm_genpd_poweron() called by it recursively
for the parent returns an error, the first pm_genpd_poweron()'s
error code path will attempt to decrement the subdomain counter of
a PM domain that it's not a subdomain of any more.
Rearrange the code in pm_genpd_poweron() to prevent this from
happening.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -89,12 +89,14 @@ static void genpd_set_active(struct gene
*/
int pm_genpd_poweron(struct generic_pm_domain *genpd)
{
- struct generic_pm_domain *parent = genpd->parent;
+ struct generic_pm_domain *parent;
int ret = 0;
- start:
mutex_lock(&genpd->lock);
+ parent = genpd->parent;
+
+ start:
if (genpd->status == GPD_STATE_ACTIVE
|| (genpd->prepared_count > 0 && genpd->suspend_power_off))
goto out;
@@ -110,29 +112,34 @@ int pm_genpd_poweron(struct generic_pm_d
mutex_unlock(&genpd->lock);
ret = pm_genpd_poweron(parent);
- if (ret) {
- genpd_sd_counter_dec(parent);
- return ret;
- }
+
+ mutex_lock(&genpd->lock);
+
+ if (ret)
+ goto err;
parent = NULL;
goto start;
}
- if (genpd->power_on)
+ if (genpd->power_on) {
ret = genpd->power_on(genpd);
-
- if (ret) {
- if (genpd->parent)
- genpd_sd_counter_dec(genpd->parent);
- } else {
- genpd_set_active(genpd);
+ if (ret)
+ goto err;
}
+ genpd_set_active(genpd);
+
out:
mutex_unlock(&genpd->lock);
return ret;
+
+ err:
+ if (genpd->parent)
+ genpd_sd_counter_dec(genpd->parent);
+
+ goto out;
}
#endif /* CONFIG_PM */
^ permalink raw reply
* [PATCH 3/9] PM / Domains: Do not take parent locks to modify subdomain counters
From: Rafael J. Wysocki @ 2011-07-31 17:48 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
After the subdomain counter in struct generic_pm_domain has been
changed into an atomic_t field, it is possible to modify
pm_genpd_poweron() and pm_genpd_poweroff() so that they don't take
the parents locks. This requires pm_genpd_poweron() to increment
the parent's subdomain counter before calling itself recursively
for the parent and to decrement it if an error is to be returned.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 70 +++++++++++++++++++-------------------------
1 file changed, 31 insertions(+), 39 deletions(-)
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -93,12 +93,7 @@ int pm_genpd_poweron(struct generic_pm_d
int ret = 0;
start:
- if (parent) {
- genpd_acquire_lock(parent);
- mutex_lock_nested(&genpd->lock, SINGLE_DEPTH_NESTING);
- } else {
- mutex_lock(&genpd->lock);
- }
+ mutex_lock(&genpd->lock);
if (genpd->status == GPD_STATE_ACTIVE
|| (genpd->prepared_count > 0 && genpd->suspend_power_off))
@@ -109,31 +104,33 @@ int pm_genpd_poweron(struct generic_pm_d
goto out;
}
- if (parent && parent->status != GPD_STATE_ACTIVE) {
+ if (parent) {
+ genpd_sd_counter_inc(parent);
+
mutex_unlock(&genpd->lock);
- genpd_release_lock(parent);
ret = pm_genpd_poweron(parent);
- if (ret)
+ if (ret) {
+ genpd_sd_counter_dec(parent);
return ret;
+ }
+ parent = NULL;
goto start;
}
- if (genpd->power_on) {
+ if (genpd->power_on)
ret = genpd->power_on(genpd);
- if (ret)
- goto out;
- }
- genpd_set_active(genpd);
- if (parent)
- genpd_sd_counter_inc(parent);
+ if (ret) {
+ if (genpd->parent)
+ genpd_sd_counter_dec(genpd->parent);
+ } else {
+ genpd_set_active(genpd);
+ }
out:
mutex_unlock(&genpd->lock);
- if (parent)
- genpd_release_lock(parent);
return ret;
}
@@ -293,7 +290,8 @@ static int pm_genpd_poweroff(struct gene
genpd->poweroff_task = current;
list_for_each_entry_reverse(dle, &genpd->dev_list, node) {
- ret = __pm_genpd_save_device(dle, genpd);
+ ret = atomic_read(&genpd->sd_count) == 0 ?
+ __pm_genpd_save_device(dle, genpd) : -EBUSY;
if (ret) {
genpd_set_active(genpd);
goto out;
@@ -308,38 +306,32 @@ static int pm_genpd_poweroff(struct gene
}
}
- parent = genpd->parent;
- if (parent) {
- mutex_unlock(&genpd->lock);
-
- genpd_acquire_lock(parent);
- mutex_lock_nested(&genpd->lock, SINGLE_DEPTH_NESTING);
-
- if (genpd_abort_poweroff(genpd)) {
- genpd_release_lock(parent);
+ if (genpd->power_off) {
+ if (atomic_read(&genpd->sd_count) > 0) {
+ ret = -EBUSY;
goto out;
}
- }
- if (genpd->power_off) {
+ /*
+ * If sd_count > 0 at this point, one of the children hasn't
+ * managed to call pm_genpd_poweron() for the parent yet after
+ * incrementing it. In that case pm_genpd_poweron() will wait
+ * for us to drop the lock, so we can call .power_off() and let
+ * the pm_genpd_poweron() restore power for us (this shouldn't
+ * happen very often).
+ */
ret = genpd->power_off(genpd);
if (ret == -EBUSY) {
genpd_set_active(genpd);
- if (parent)
- genpd_release_lock(parent);
-
goto out;
}
}
genpd->status = GPD_STATE_POWER_OFF;
- if (parent) {
- if (genpd_sd_counter_dec(parent))
- genpd_queue_power_off_work(parent);
-
- genpd_release_lock(parent);
- }
+ parent = genpd->parent;
+ if (parent && genpd_sd_counter_dec(parent))
+ genpd_queue_power_off_work(parent);
out:
genpd->poweroff_task = NULL;
^ permalink raw reply
* [PATCH 2/9] PM / Domains: Implement subdomain counters as atomic fields
From: Rafael J. Wysocki @ 2011-07-31 17:47 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
Currently, pm_genpd_poweron() and pm_genpd_poweroff() need to take
the parent PM domain's lock in order to modify the parent's counter
of active subdomains in a nonracy way. This causes the locking to be
considerably complex and in fact is not necessary, because the
subdomain counters may be implemented as atomic fields and they
won't have to be modified under a lock.
Replace the unsigned in sd_count field in struct generic_pm_domain
by an atomic_t one and modify the code in drivers/base/power/domain.c
to take this change into account.
This patch doesn't change the locking yet, that is going to be done
in a separate subsequent patch.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 30 ++++++++++++++++++++----------
include/linux/pm_domain.h | 2 +-
2 files changed, 21 insertions(+), 11 deletions(-)
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -29,10 +29,20 @@ static struct generic_pm_domain *dev_to_
return pd_to_genpd(dev->pm_domain);
}
-static void genpd_sd_counter_dec(struct generic_pm_domain *genpd)
+static bool genpd_sd_counter_dec(struct generic_pm_domain *genpd)
{
- if (!WARN_ON(genpd->sd_count == 0))
- genpd->sd_count--;
+ bool ret = false;
+
+ if (!WARN_ON(atomic_read(&genpd->sd_count) == 0))
+ ret = !!atomic_dec_and_test(&genpd->sd_count);
+
+ return ret;
+}
+
+static void genpd_sd_counter_inc(struct generic_pm_domain *genpd)
+{
+ atomic_inc(&genpd->sd_count);
+ smp_mb__after_atomic_inc();
}
static void genpd_acquire_lock(struct generic_pm_domain *genpd)
@@ -118,7 +128,7 @@ int pm_genpd_poweron(struct generic_pm_d
genpd_set_active(genpd);
if (parent)
- parent->sd_count++;
+ genpd_sd_counter_inc(parent);
out:
mutex_unlock(&genpd->lock);
@@ -254,7 +264,7 @@ static int pm_genpd_poweroff(struct gene
|| genpd->resume_count > 0)
return 0;
- if (genpd->sd_count > 0)
+ if (atomic_read(&genpd->sd_count) > 0)
return -EBUSY;
not_suspended = 0;
@@ -325,8 +335,7 @@ static int pm_genpd_poweroff(struct gene
genpd->status = GPD_STATE_POWER_OFF;
if (parent) {
- genpd_sd_counter_dec(parent);
- if (parent->sd_count == 0)
+ if (genpd_sd_counter_dec(parent))
genpd_queue_power_off_work(parent);
genpd_release_lock(parent);
@@ -491,7 +500,8 @@ static void pm_genpd_sync_poweroff(struc
if (genpd->status == GPD_STATE_POWER_OFF)
return;
- if (genpd->suspended_count != genpd->device_count || genpd->sd_count > 0)
+ if (genpd->suspended_count != genpd->device_count
+ || atomic_read(&genpd->sd_count) > 0)
return;
if (genpd->power_off)
@@ -1152,7 +1162,7 @@ int pm_genpd_add_subdomain(struct generi
list_add_tail(&new_subdomain->sd_node, &genpd->sd_list);
new_subdomain->parent = genpd;
if (subdomain->status != GPD_STATE_POWER_OFF)
- genpd->sd_count++;
+ genpd_sd_counter_inc(genpd);
out:
mutex_unlock(&new_subdomain->lock);
@@ -1227,7 +1237,7 @@ void pm_genpd_init(struct generic_pm_dom
genpd->gov = gov;
INIT_WORK(&genpd->power_off_work, genpd_power_off_work_fn);
genpd->in_progress = 0;
- genpd->sd_count = 0;
+ atomic_set(&genpd->sd_count, 0);
genpd->status = is_off ? GPD_STATE_POWER_OFF : GPD_STATE_ACTIVE;
init_waitqueue_head(&genpd->status_wait_queue);
genpd->poweroff_task = NULL;
Index: linux-2.6/include/linux/pm_domain.h
===================================================================
--- linux-2.6.orig/include/linux/pm_domain.h
+++ linux-2.6/include/linux/pm_domain.h
@@ -33,7 +33,7 @@ struct generic_pm_domain {
struct dev_power_governor *gov;
struct work_struct power_off_work;
unsigned int in_progress; /* Number of devices being suspended now */
- unsigned int sd_count; /* Number of subdomains with power "on" */
+ atomic_t sd_count; /* Number of subdomains with power "on" */
enum gpd_status status; /* Current state of the domain */
wait_queue_head_t status_wait_queue;
struct task_struct *poweroff_task; /* Powering off task */
^ permalink raw reply
* [PATCH 1/9] PM / Domains: Fix pm_genpd_poweron()
From: Rafael J. Wysocki @ 2011-07-31 17:47 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
In-Reply-To: <201107311946.06654.rjw@sisk.pl>
From: Rafael J. Wysocki <rjw@sisk.pl>
The local variable ret is defined twice in pm_genpd_poweron(), which
causes this function to always return 0, even if the PM domain's
.power_on() callback fails, in which case an error code should be
returned.
Remove the wrong second definition of ret and additionally remove an
unnecessary definition of wait from pm_genpd_poweron().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/power/domain.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
Index: linux-2.6/drivers/base/power/domain.c
===================================================================
--- linux-2.6.orig/drivers/base/power/domain.c
+++ linux-2.6/drivers/base/power/domain.c
@@ -80,7 +80,6 @@ static void genpd_set_active(struct gene
int pm_genpd_poweron(struct generic_pm_domain *genpd)
{
struct generic_pm_domain *parent = genpd->parent;
- DEFINE_WAIT(wait);
int ret = 0;
start:
@@ -112,7 +111,7 @@ int pm_genpd_poweron(struct generic_pm_d
}
if (genpd->power_on) {
- int ret = genpd->power_on(genpd);
+ ret = genpd->power_on(genpd);
if (ret)
goto out;
}
^ permalink raw reply
* [PATCH 0/9] PM / Domains: Allow generic PM domains to have multiple parents
From: Rafael J. Wysocki @ 2011-07-31 17:46 UTC (permalink / raw)
To: Linux PM mailing list; +Cc: LKML, linux-sh
Hi,
This patchset modifies the generic PM domains code so that it's possible
for a PM domain to have multiple parent domains (i.e. domains it depends on).
This makes the framework applicable to more use cases and allows us to
address one issue one a system already using it.
Patch [1/9] is for 3.1, patches [2-9/9] are 3.2 material, if there are no
objections.
[1/9] - Fix genpd_power_on().
[2/9] - Make subdomain counts be atomic.
[3/9] - Don't acquire parent locks in genpd_power_on() and genpd_power_off().
[4/9] - Make genpd_power_on() always survive parent removal.
[5/9] - Add PM domain status value GPD_STATE_WAIT_PARENT.
[6/9] - Allow generic PM domains to have multiple parents.
[7/9] - Rename GPD_STATE_WAIT_PARENT to GPD_STATE_WAIT_MASTER.
[8/9] - Rename the second argument of pm_genpd_add_subdomain().
[9/9] - Make A3RV be a subdomain of A4LC on SH7372.
Thanks,
Rafael
^ permalink raw reply
* Re: [RFC/PATCH v3 00/13] PM QoS: add a per-device latency constraints class
From: mark gross @ 2011-07-31 17:38 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linux PM mailing list, broonie, Jean Pihet, linux-omap, markgross
In-Reply-To: <201107292346.22211.rjw@sisk.pl>
On Fri, Jul 29, 2011 at 11:46:21PM +0200, Rafael J. Wysocki wrote:
> On Friday, July 29, 2011, mark gross wrote:
> > On Fri, Jul 29, 2011 at 10:37:52AM +0200, Jean Pihet wrote:
> > > Mark,
> > >
> > > On Thu, Jul 28, 2011 at 3:14 PM, mark gross <markgross@thegnar.org> wrote:
> > > > On Thu, Jul 28, 2011 at 10:30:07AM +0200, jean.pihet@newoldbits.com wrote:
> > > >> From: Jean Pihet <j-pihet@ti.com>
> > > >>
> > > >> This patch set is in an RFC state, for review and comments.
> > > >>
> > > >> High level implementation:
> > > >>
> > > ...
> > >
> > > >> 7. Misc fixes to improve code readability:
> > > >> . rename of the PM QoS implementation file from pm_qos_params.[ch] to pm_qos.[ch]
> > > > I picked the name for the file as pm_qos_params over pm_qos because I
> > > > wanted to make it implicitly clear that this was not an full QOS
> > > > implementation. True QOS carries expectations similar to real time and
> > > > as the infrastructure is closer to "good intentioned" than even "best
> > > > effort" and offers no notification when the QOS request is not able to
> > > > be met and really doesn't implement a true QOS at all, (it just provides
> > > > the parameter interface for part of one its missing the notification
> > > > interface when the service level is not met and I think a few other
> > > > things.) So I wanted to have it named a bit different from just pm_qos.
> > > >
> > > > This said I'm not supper attached to the naming of the modules. If
> > > > folks want to change it I wouldn't complain (too much anyway;).
> > > Ok got the idea. I do not know what name to chose though. As suggested
> > > previously the name pm_qos_params does not reflect the implementation,
> > > that is why I renamed it.
> >
> > I must have missed the part where the name doesn't reflect the
> > implementation was talked about. I look at the interface and I see
> > parameters all over the place and a small bit of notification.
>
> The interface is for specifying PM QoS requirements or constraints that
> may or may not be regarded as "parameters", depending on ones definition
> of the last term. Moreover, the code not only implements the interface,
> but also defines data structures and API to be used throughout the kernel
> which is quite a bit more than just "parameters".
>
> In my not so humble opinion the name isn't good and the .c file should be
> in kernel/power in the first place. I would call it simply qos.c (the fact
> that _right_ _now_ it's not a full QoS implementation doesn't matter, because
> it may become one in the future).
Sounds ok to me.
--mark
^ permalink raw reply
* [git pull] cpupowerutils
From: Dominik Brodowski @ 2011-07-31 8:51 UTC (permalink / raw)
To: torvalds, akpm; +Cc: Ingo Molnar, linux-kernel, cpufreq, linux-acpi, linux-pm
Linus,
the git tree
git://git.kernel.org/pub/scm/linux/kernel/git/brodo/cpupowerutils.git master
contains a new utility called "cpupowerutils" which is laregely based on
the well known "cpufrequtils", but extended to provide much more information
about other power-related features of modern CPUs, such as idle states.
Users and Developers want to have *one* tool to get an overview what
their system supports and to monitor and debug CPU power management
in detail. The tool should compile and work on as many architectures
as possible.
To reach these goals, Thomas Renninger suggested -- and implemented most
of the parts -- to convert the external tool cpufrequtils to a userspace
tool residing in tools/power/cpupower/ .
Once this stabilizes, it is intended to replace cpufrequtils and the
Intel-specific tools in tools/power/x86 .
A list of patches and the diffstat are appended to this messages.
Best,
Dominik
Dominik Brodowski (10):
cpupowerutils - cpufrequtils extended with quite some features
cpupowerutils: use COPYING, CREDITS from top-level directory
cpupowerutils: remove ccdv, use kernel quiet/verbose mechanism
cpupowerutils: do not update po files on each and every compile
cpupowerutils: bench - ConfigStyle bugfixes
cpupowerutils: lib - ConfigStyle bugfixes
cpupowerutils: idle_monitor - ConfigStyle bugfixes
cpupowerutils: helpers - ConfigStyle bugfixes
cpupowerutils: utils - ConfigStyle bugfixes
cpupowerutils: use kernel version-derived version string
Roman Vasiyarov (1):
cpupowerutils: increase MAX_LINE_LEN
Thomas Renninger (4):
cpupowerutils: Rename: libcpufreq->libcpupower
cpupower: Rename package from cpupowerutils to cpupower
cpupower: Show Intel turbo ratio support via ./cpupower frequency-info
cpupower: Do detect IDA (opportunistic processor performance) via cpuid
CREDITS | 17 +-
MAINTAINERS | 6 +
tools/power/cpupower/.gitignore | 22 +
tools/power/cpupower/Makefile | 279 ++++++
tools/power/cpupower/README | 49 +
tools/power/cpupower/ToDo | 11 +
tools/power/cpupower/bench/Makefile | 29 +
tools/power/cpupower/bench/README-BENCH | 124 +++
tools/power/cpupower/bench/benchmark.c | 194 ++++
tools/power/cpupower/bench/benchmark.h | 29 +
tools/power/cpupower/bench/config.h | 36 +
tools/power/cpupower/bench/cpufreq-bench_plot.sh | 104 +++
tools/power/cpupower/bench/cpufreq-bench_script.sh | 101 ++
tools/power/cpupower/bench/example.cfg | 11 +
tools/power/cpupower/bench/main.c | 202 ++++
tools/power/cpupower/bench/parse.c | 225 +++++
tools/power/cpupower/bench/parse.h | 53 ++
tools/power/cpupower/bench/system.c | 191 ++++
tools/power/cpupower/bench/system.h | 29 +
tools/power/cpupower/debug/i386/Makefile | 20 +
tools/power/cpupower/debug/i386/centrino-decode.c | 113 +++
tools/power/cpupower/debug/i386/dump_psb.c | 196 ++++
tools/power/cpupower/debug/i386/intel_gsic.c | 78 ++
.../power/cpupower/debug/i386/powernow-k8-decode.c | 96 ++
tools/power/cpupower/debug/kernel/Makefile | 23 +
.../power/cpupower/debug/kernel/cpufreq-test_tsc.c | 113 +++
tools/power/cpupower/debug/x86_64/Makefile | 14 +
.../power/cpupower/debug/x86_64/centrino-decode.c | 1 +
.../cpupower/debug/x86_64/powernow-k8-decode.c | 1 +
tools/power/cpupower/lib/cpufreq.c | 208 +++++
tools/power/cpupower/lib/cpufreq.h | 223 +++++
tools/power/cpupower/lib/sysfs.c | 672 ++++++++++++++
tools/power/cpupower/lib/sysfs.h | 31 +
tools/power/cpupower/man/cpupower-frequency-info.1 | 76 ++
tools/power/cpupower/man/cpupower-frequency-set.1 | 54 ++
tools/power/cpupower/man/cpupower-info.1 | 19 +
tools/power/cpupower/man/cpupower-monitor.1 | 179 ++++
tools/power/cpupower/man/cpupower-set.1 | 103 +++
tools/power/cpupower/man/cpupower.1 | 72 ++
tools/power/cpupower/po/cs.po | 944 +++++++++++++++++++
tools/power/cpupower/po/de.po | 961 ++++++++++++++++++++
tools/power/cpupower/po/fr.po | 947 +++++++++++++++++++
tools/power/cpupower/po/it.po | 961 ++++++++++++++++++++
tools/power/cpupower/po/pt.po | 957 +++++++++++++++++++
tools/power/cpupower/utils/builtin.h | 18 +
tools/power/cpupower/utils/cpufreq-info.c | 708 ++++++++++++++
tools/power/cpupower/utils/cpufreq-set.c | 358 ++++++++
tools/power/cpupower/utils/cpuidle-info.c | 244 +++++
tools/power/cpupower/utils/cpupower-info.c | 153 ++++
tools/power/cpupower/utils/cpupower-set.c | 153 ++++
tools/power/cpupower/utils/cpupower.c | 203 ++++
tools/power/cpupower/utils/helpers/amd.c | 137 +++
tools/power/cpupower/utils/helpers/bitmask.c | 292 ++++++
tools/power/cpupower/utils/helpers/bitmask.h | 33 +
tools/power/cpupower/utils/helpers/cpuid.c | 176 ++++
tools/power/cpupower/utils/helpers/helpers.h | 178 ++++
tools/power/cpupower/utils/helpers/misc.c | 27 +
tools/power/cpupower/utils/helpers/msr.c | 115 +++
tools/power/cpupower/utils/helpers/pci.c | 44 +
tools/power/cpupower/utils/helpers/sysfs.c | 358 ++++++++
tools/power/cpupower/utils/helpers/sysfs.h | 28 +
tools/power/cpupower/utils/helpers/topology.c | 108 +++
.../cpupower/utils/idle_monitor/amd_fam14h_idle.c | 338 +++++++
.../cpupower/utils/idle_monitor/cpuidle_sysfs.c | 196 ++++
.../cpupower/utils/idle_monitor/cpupower-monitor.c | 448 +++++++++
.../cpupower/utils/idle_monitor/cpupower-monitor.h | 68 ++
.../cpupower/utils/idle_monitor/idle_monitors.def | 7 +
.../cpupower/utils/idle_monitor/idle_monitors.h | 18 +
.../cpupower/utils/idle_monitor/mperf_monitor.c | 255 ++++++
tools/power/cpupower/utils/idle_monitor/nhm_idle.c | 216 +++++
tools/power/cpupower/utils/idle_monitor/snb_idle.c | 190 ++++
tools/power/cpupower/utils/version-gen.sh | 35 +
72 files changed, 13877 insertions(+), 1 deletions(-)
create mode 100644 tools/power/cpupower/.gitignore
create mode 100644 tools/power/cpupower/Makefile
create mode 100644 tools/power/cpupower/README
create mode 100644 tools/power/cpupower/ToDo
create mode 100644 tools/power/cpupower/bench/Makefile
create mode 100644 tools/power/cpupower/bench/README-BENCH
create mode 100644 tools/power/cpupower/bench/benchmark.c
create mode 100644 tools/power/cpupower/bench/benchmark.h
create mode 100644 tools/power/cpupower/bench/config.h
create mode 100644 tools/power/cpupower/bench/cpufreq-bench_plot.sh
create mode 100644 tools/power/cpupower/bench/cpufreq-bench_script.sh
create mode 100644 tools/power/cpupower/bench/example.cfg
create mode 100644 tools/power/cpupower/bench/main.c
create mode 100644 tools/power/cpupower/bench/parse.c
create mode 100644 tools/power/cpupower/bench/parse.h
create mode 100644 tools/power/cpupower/bench/system.c
create mode 100644 tools/power/cpupower/bench/system.h
create mode 100644 tools/power/cpupower/debug/i386/Makefile
create mode 100644 tools/power/cpupower/debug/i386/centrino-decode.c
create mode 100644 tools/power/cpupower/debug/i386/dump_psb.c
create mode 100644 tools/power/cpupower/debug/i386/intel_gsic.c
create mode 100644 tools/power/cpupower/debug/i386/powernow-k8-decode.c
create mode 100644 tools/power/cpupower/debug/kernel/Makefile
create mode 100644 tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
create mode 100644 tools/power/cpupower/debug/x86_64/Makefile
create mode 120000 tools/power/cpupower/debug/x86_64/centrino-decode.c
create mode 120000 tools/power/cpupower/debug/x86_64/powernow-k8-decode.c
create mode 100644 tools/power/cpupower/lib/cpufreq.c
create mode 100644 tools/power/cpupower/lib/cpufreq.h
create mode 100644 tools/power/cpupower/lib/sysfs.c
create mode 100644 tools/power/cpupower/lib/sysfs.h
create mode 100644 tools/power/cpupower/man/cpupower-frequency-info.1
create mode 100644 tools/power/cpupower/man/cpupower-frequency-set.1
create mode 100644 tools/power/cpupower/man/cpupower-info.1
create mode 100644 tools/power/cpupower/man/cpupower-monitor.1
create mode 100644 tools/power/cpupower/man/cpupower-set.1
create mode 100644 tools/power/cpupower/man/cpupower.1
create mode 100644 tools/power/cpupower/po/cs.po
create mode 100644 tools/power/cpupower/po/de.po
create mode 100644 tools/power/cpupower/po/fr.po
create mode 100644 tools/power/cpupower/po/it.po
create mode 100644 tools/power/cpupower/po/pt.po
create mode 100644 tools/power/cpupower/utils/builtin.h
create mode 100644 tools/power/cpupower/utils/cpufreq-info.c
create mode 100644 tools/power/cpupower/utils/cpufreq-set.c
create mode 100644 tools/power/cpupower/utils/cpuidle-info.c
create mode 100644 tools/power/cpupower/utils/cpupower-info.c
create mode 100644 tools/power/cpupower/utils/cpupower-set.c
create mode 100644 tools/power/cpupower/utils/cpupower.c
create mode 100644 tools/power/cpupower/utils/helpers/amd.c
create mode 100644 tools/power/cpupower/utils/helpers/bitmask.c
create mode 100644 tools/power/cpupower/utils/helpers/bitmask.h
create mode 100644 tools/power/cpupower/utils/helpers/cpuid.c
create mode 100644 tools/power/cpupower/utils/helpers/helpers.h
create mode 100644 tools/power/cpupower/utils/helpers/misc.c
create mode 100644 tools/power/cpupower/utils/helpers/msr.c
create mode 100644 tools/power/cpupower/utils/helpers/pci.c
create mode 100644 tools/power/cpupower/utils/helpers/sysfs.c
create mode 100644 tools/power/cpupower/utils/helpers/sysfs.h
create mode 100644 tools/power/cpupower/utils/helpers/topology.c
create mode 100644 tools/power/cpupower/utils/idle_monitor/amd_fam14h_idle.c
create mode 100644 tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c
create mode 100644 tools/power/cpupower/utils/idle_monitor/cpupower-monitor.c
create mode 100644 tools/power/cpupower/utils/idle_monitor/cpupower-monitor.h
create mode 100644 tools/power/cpupower/utils/idle_monitor/idle_monitors.def
create mode 100644 tools/power/cpupower/utils/idle_monitor/idle_monitors.h
create mode 100644 tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
create mode 100644 tools/power/cpupower/utils/idle_monitor/nhm_idle.c
create mode 100644 tools/power/cpupower/utils/idle_monitor/snb_idle.c
create mode 100755 tools/power/cpupower/utils/version-gen.sh
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox