Linux Power Management development
 help / color / mirror / Atom feed
* Re: better oopsing when frozen
From: Rafael J. Wysocki @ 2011-07-29 20:29 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel
In-Reply-To: <201107292227.36424.oliver@neukum.org>

On Friday, July 29, 2011, Oliver Neukum wrote:
> Am Freitag, 29. Juli 2011, 21:54:30 schrieb Pavel Machek:
> > On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> > > Hi Rafael,
> > > 
> > > I had a problem with the kernel stopping the machine forever because I got an
> > > oops while tasks were frozen. It seems to me that we should thaw when this
> > > happens. How about this approach?
> > 
> > Danger, Will Robinson. You must not write to the filesystems after
> > hibernation started. By thawing, you may do just that.
> > 
> > It should be safe to thaw as long as final suspend signature is not
> > on disk and will not be written there.
> 
> Yes. But this applies only if I use kernel-space hibernation, doesn't it?
> So I need a second flag for the signature being written. Any other problem?

Yes, please see my reply to Pavel.

Thanks,
Rafael

^ permalink raw reply

* Re: better oopsing when frozen
From: Rafael J. Wysocki @ 2011-07-29 20:27 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Oliver Neukum, linux-pm, Andrew Morton, linux-kernel
In-Reply-To: <20110729195430.GB1720@ucw.cz>

On Friday, July 29, 2011, Pavel Machek wrote:
> On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> > Hi Rafael,
> > 
> > I had a problem with the kernel stopping the machine forever because I got an
> > oops while tasks were frozen. It seems to me that we should thaw when this
> > happens. How about this approach?
> 
> Danger, Will Robinson. You must not write to the filesystems after
> hibernation started. By thawing, you may do just that.
> 
> It should be safe to thaw as long as final suspend signature is not
> on disk and will not be written there.

This only applies to hibernation and only when we're going to restore
from the image the presumably hasn't been saved yet, right?

However, there's another problem I didn't think of before.  Namely,
we cannot thaw tasks before resuming devices in case we've already
suspended them, because that will defeat the very purpose of the
freezing in the first place.

Thanks,
Rafael

^ permalink raw reply

* Re: better oopsing when frozen
From: Oliver Neukum @ 2011-07-29 20:27 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-pm, linux-kernel
In-Reply-To: <20110729195430.GB1720@ucw.cz>

Am Freitag, 29. Juli 2011, 21:54:30 schrieb Pavel Machek:
> On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> > Hi Rafael,
> > 
> > I had a problem with the kernel stopping the machine forever because I got an
> > oops while tasks were frozen. It seems to me that we should thaw when this
> > happens. How about this approach?
> 
> Danger, Will Robinson. You must not write to the filesystems after
> hibernation started. By thawing, you may do just that.
> 
> It should be safe to thaw as long as final suspend signature is not
> on disk and will not be written there.

Yes. But this applies only if I use kernel-space hibernation, doesn't it?
So I need a second flag for the signature being written. Any other problem?

	Regards
		Oliver

^ permalink raw reply

* Re: better oopsing when frozen
From: Pavel Machek @ 2011-07-29 19:54 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel
In-Reply-To: <201107251043.19932.oneukum@suse.de>

On Mon 2011-07-25 10:43:19, Oliver Neukum wrote:
> Hi Rafael,
> 
> I had a problem with the kernel stopping the machine forever because I got an
> oops while tasks were frozen. It seems to me that we should thaw when this
> happens. How about this approach?

Danger, Will Robinson. You must not write to the filesystems after
hibernation started. By thawing, you may do just that.

It should be safe to thaw as long as final suspend signature is not
on disk and will not be written there.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply

* Re: Runtime PM discussion notes
From: Rafael J. Wysocki @ 2011-07-29 19:52 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Mark Brown, linux-pm, linux-omap
In-Reply-To: <20110729134015.GA1561@ucw.cz>

On Friday, July 29, 2011, Pavel Machek wrote:
> Hi!
> 
> > > > Actually, it just occurred to me that if we're waiting for a system
> > > > timer and can hand that off to a suitable timer in the PMIC then we can
> > > > do a suspend to RAM for the deep idle state from the hardware point of
> > > > view.
> > > 
> > > Yep.  At LinuxCon Cambridge two years ago, we had a discussion about 
> > > whether it would be possible to enter ACPI S-states from CPUIdle (or some 
> > > idle governor) on Intel chips.  If I remember correctly, the conclusion 
> > > was that ACPI always disables the screen/backlight, so it would only be 
> > > useful for situations where that was acceptable.
> 
> Well, auto suspending when screensaver is active would still be
> useful.
> 
> (And IIRC some machines kept screen on when in S-state unless driver
> powered it down... but that might be S1.
> 
> > The reason why you can't enter ACPI S-states from CPUidle is because you
> > need to go out of the idle loop to execute some ACPI-specific stuff.  Which
> > is not even specific to Intel chips, but to ACPI in general.
> 
> The code was little tricky/unclean, but it "worked" for me at one
> point... I called it  "sleepy linux".

Yes, you can find a system where it might kind of work (just because
_PTS is empty or something like this).  Is it going to work in general?
No way.

So please let's turn into something at least _theoretically_ viable.

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH 08/13] OMAP2+: powerdomain: control power domains next state
From: Todd Poynor @ 2011-07-29 18:00 UTC (permalink / raw)
  To: Jean Pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <CAORVsuV68JhpV5XzXNV3Z9xxRUBbgDen7NDRG+h=in7-ZL=JhQ@mail.gmail.com>

On Fri, Jul 29, 2011 at 10:47:43AM +0200, Jean Pihet wrote:
> On Fri, Jul 29, 2011 at 9:59 AM, Todd Poynor <toddpoynor@google.com> wrote:
...
> > All min_latency != PM_QOS_DEV_LAT_DEFAULT_VALUE paths need
> > free_new_user = 1.
> free_new_user = 1 is only needed if no existing constraint has been
> found, i.e. user stays at NULL. This is implemented in the check for
> an existing constraint (plist_for_each_entry(...)).

Oops, I meant to say it applies in all cases where min_latency ==
PM_QOS_DEV_LAT_DEFAULT_VALUE, since the new node is only used for
adding a constraint (and no existing constraint found).  I'd suggest
something like:

        if (min_latency != PM_QOS_DEV_LAT_DEFAULT_VALUE) {
                new_user = kzalloc(sizeof(struct pwrdm_wkup_constraints_entry),
                        GFP_KERNEL);
                <check NULL return>
                free_new_user = 1;
        }

and then set free_new_user = 0 only if no existing constraint is found
for the add case.  Because it's easy to miss cases where the
allocated memory needs to be freed when that's not the default, and you
might as well skip the allocate on a constraint removal.  Pretty minor
point, though.


Todd

^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Govindraj @ 2011-07-29 15:13 UTC (permalink / raw)
  To: balbi
  Cc: Partha Basak, Tero Kristo, Govindraj.R, linux-serial, linux-pm,
	linux-omap
In-Reply-To: <20110729140210.GT31013@legolas.emea.dhcp.ti.com>

On Fri, Jul 29, 2011 at 7:32 PM, Felipe Balbi <balbi@ti.com> wrote:
> Hi,
>
> On Fri, Jul 29, 2011 at 06:28:32PM +0530, Govindraj wrote:
>> On Fri, Jul 29, 2011 at 5:49 PM, Felipe Balbi <balbi@ti.com> wrote:
>> > On Fri, Jul 29, 2011 at 05:29:12PM +0530, Govindraj wrote:
>> >> Yes fine, But there are scenarios even before first runtime_suspend happens,
>> >>
>> >> ex: uart_port_configure -> get_sync -> pm_generic_runtime_resume
>> >> (omap_device_enable in this case) debug printk -> console_write -> get_sync.
>> >>
>> >> there are numerous such scenarios where we end from runtime context
>> >> to runtime api context again, or jumping from one uart port operation
>> >> to uart print operation.
>> >
>> > calling pm_runtime_get_sync() should not be a problem. It should only
>> > increase the usage counters... This sounds like a race condition on the
>> > driver, no ?
>>
>> Actually when we call a API to enable clocks we except the internals of API
>> to just enable clocks and return.
>>
>> *Clock enable API should not cause or trigger to do a _device_driver_operation_
>> even before enabling clocks of the device-driver which called it*
>>
>> for uart context get_sync can land me to uart driver back
>> even before enabling the uart clocks due to printks.
>
> only if _you_ have prints or _your_ runtime_*() calls, no ?
>
> Let's say omap_hwmod.c wants to do a print:
>
> -> printk()
>  -> pm_runtime_get_sync
>    -> console_write
>  -> pm_runtim_put
>
> now, if you have a printk() on your runtime_resume() before you enable
> the clocks, then I can see why you would deadlock:
>
> -> pm_runtime_get_sync
>  -> omap_serial_runtime_resume
>    -> printk
>      -> pm_runtime_get_sync
>        -> omap_serial_runtime_resume
>          -> printk
>           -> pm_runtime_get_sync
>            .....
>
> maybe I'm missing something, but can you add a stack dump on your
> ->runtime_resume and ->runtime_suspend methods, just so we try to figure
> out who's to blame here ?
>
>> > What you're experiencing, if I understood correctly, is a deadlock ? In
>> > that case, can you try to track the locking mechanism on the omap-serial
>> > driver to try to find if there isn't anything obviously wrong ?
>> >
>>
>> Yes deadlocks. due to entering from runtime context to runtime context
>> or entering from uart_port_operation to uart_console_write ops.
>>
>> There are already port locks used extensively within the uart driver
>> to secure a port operation.
>>
>> But cannot secure a port operation while using clock_enable API.
>> since clock enable API can land the control back to uart_console_write
>> operation..
>
> but in that case, if clock isn't enabled, why don't you just ignore the
> print and enable the clock ?? Just return 0 and continue with
> clk_enable() ??
>
>> >> So either we should not have those prints from pm_generic layers or suppress
>> >> them(seems pretty much a problem for a clean design within the driver
>> >> using console_lock/unlock for every get_sync, and for
>> >> runtime_put we cannot suppress the prints as it gets scheduled later)
>> >>
>> >> or if other folks who really need those prints form pm_generic* layers
>> >> to debug and analysis then we have no other choice rather control
>> >> the clk_enable/disable from outside driver code in idle path.
>> >
>> > yeah, none of these would be nice :-(
>> >
>> > I think this needs more debugging to be sure what's exactly going on.
>> > What's exactly causing the deadlock ? Which lock is held and never
>> > released ?
>> >
>>
>> I had done some investigations, from scenarios it simply boils down to fact
>> to handle clock within uart driver, uart driver expects clock enable API* used
>> to just enable uart clocks but rather not trigger a _uart_ops_ within which
>> kind of unacceptable from the uart_driver context.
>
> ok, now I see what you mean:
>
> 113 static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
> 114 {
> 115         struct timespec a, b, c;
> 116
> 117         pr_debug("omap_device: %s: activating\n", od->pdev.name);
> 118
> 119         while (od->pm_lat_level > 0) {
> 120                 struct omap_device_pm_latency *odpl;
> 121                 unsigned long long act_lat = 0;
> 122
> 123                 od->pm_lat_level--;
> 124
> 125                 odpl = od->pm_lats + od->pm_lat_level;
> 126
> 127                 if (!ignore_lat &&
> 128                     (od->dev_wakeup_lat <= od->_dev_wakeup_lat_limit))
> 129                         break;
> 130
> 131                 read_persistent_clock(&a);
> 132
> 133                 /* XXX check return code */
> 134                 odpl->activate_func(od);
> 135
> 136                 read_persistent_clock(&b);
> 137
> 138                 c = timespec_sub(b, a);
> 139                 act_lat = timespec_to_ns(&c);
> 140
> 141                 pr_debug("omap_device: %s: pm_lat %d: activate: elapsed time "
> 142                          "%llu nsec\n", od->pdev.name, od->pm_lat_level,
> 143                          act_lat);
> 144
> 145                 if (act_lat > odpl->activate_lat) {
> 146                         odpl->activate_lat_worst = act_lat;
> 147                         if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
> 148                                 odpl->activate_lat = act_lat;
> 149                                 pr_warning("omap_device: %s.%d: new worst case "
> 150                                            "activate latency %d: %llu\n",
> 151                                            od->pdev.name, od->pdev.id,
> 152                                            od->pm_lat_level, act_lat);
> 153                         } else
> 154                                 pr_warning("omap_device: %s.%d: activate "
> 155                                            "latency %d higher than exptected. "
> 156                                            "(%llu > %d)\n",
> 157                                            od->pdev.name, od->pdev.id,
> 158                                            od->pm_lat_level, act_lat,
> 159                                            odpl->activate_lat);
> 160                 }
> 161
> 162                 od->dev_wakeup_lat -= odpl->activate_lat;
> 163         }
> 164
> 165         return 0;
> 166 }
>
> When that first pr_debug() triggers, UART's hwmod could be disabled, and
> that would trigger the state I described above where you would keep on
> calling pm_runtime_get_sync() forever ;-)
>
> isn't it enough to patch it like below:
>
> diff --git a/arch/arm/plat-omap/omap_device.c b/arch/arm/plat-omap/omap_device.c
> index b6b4097..560f622 100644
> --- a/arch/arm/plat-omap/omap_device.c
> +++ b/arch/arm/plat-omap/omap_device.c
> @@ -114,8 +114,6 @@ static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
>  {
>        struct timespec a, b, c;
>
> -       pr_debug("omap_device: %s: activating\n", od->pdev.name);
> -
>        while (od->pm_lat_level > 0) {
>                struct omap_device_pm_latency *odpl;
>                unsigned long long act_lat = 0;
> @@ -162,6 +160,8 @@ static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
>                od->dev_wakeup_lat -= odpl->activate_lat;
>        }
>
> +       pr_debug("omap_device: %s: activated\n", od->pdev.name);
> +
>        return 0;
>  }
>

Actually there is much more than this:

<<SNIP>>

diff --git a/arch/arm/mach-omap2/clock.c b/arch/arm/mach-omap2/clock.c
index 180299e..221ffb9 100644
--- a/arch/arm/mach-omap2/clock.c
+++ b/arch/arm/mach-omap2/clock.c
@@ -12,7 +12,8 @@
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  */
-#undef DEBUG
+//#undef DEBUG
+#define DEBUG

 #include <linux/kernel.h>
 #include <linux/list.h>
@@ -254,14 +255,14 @@ void omap2_clk_disable(struct clk *clk)
 		return;
 	}

-	pr_debug("clock: %s: decrementing usecount\n", clk->name);
+//	pr_debug("clock: %s: decrementing usecount\n", clk->name);

 	clk->usecount--;

 	if (clk->usecount > 0)
 		return;

-	pr_debug("clock: %s: disabling in hardware\n", clk->name);
+//	pr_debug("clock: %s: disabling in hardware\n", clk->name);

 	if (clk->ops && clk->ops->disable) {
 		trace_clock_disable(clk->name, 0, smp_processor_id());
@@ -290,14 +291,14 @@ int omap2_clk_enable(struct clk *clk)
 {
 	int ret;

-	pr_debug("clock: %s: incrementing usecount\n", clk->name);
+//	pr_debug("clock: %s: incrementing usecount\n", clk->name);

 	clk->usecount++;

 	if (clk->usecount > 1)
 		return 0;

-	pr_debug("clock: %s: enabling in hardware\n", clk->name);
+//	pr_debug("clock: %s: enabling in hardware\n", clk->name);

 	if (clk->parent) {
 		ret = omap2_clk_enable(clk->parent);
diff --git a/arch/arm/mach-omap2/omap_hwmod.c b/arch/arm/mach-omap2/omap_hwmod.c
index 7ed0479..8ca7d40 100644
--- a/arch/arm/mach-omap2/omap_hwmod.c
+++ b/arch/arm/mach-omap2/omap_hwmod.c
@@ -124,7 +124,8 @@
  * XXX error return values should be checked to ensure that they are
  * appropriate
  */
-#undef DEBUG
+//#undef DEBUG
+#define DEBUG

 #include <linux/kernel.h>
 #include <linux/errno.h>
@@ -597,7 +598,8 @@ static int _enable_clocks(struct omap_hwmod *oh)
 {
 	int i;

-	pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);
+	if (strcmp(oh->class->name, "uart"))
+		pr_debug("omap_hwmod: %s: enabling clocks\n", oh->name);

 	if (oh->_clk)
 		clk_enable(oh->_clk);
@@ -627,7 +629,8 @@ static int _disable_clocks(struct omap_hwmod *oh)
 {
 	int i;

-	pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
+	if (strcmp(oh->class->name, "uart"))
+		pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);

 	if (oh->_clk)
 		clk_disable(oh->_clk);
@@ -1232,7 +1235,8 @@ static int _enable(struct omap_hwmod *oh)
 		return -EINVAL;
 	}

-	pr_debug("omap_hwmod: %s: enabling\n", oh->name);
+	if (strcmp(oh->class->name, "uart"))
+		pr_debug("omap_hwmod: %s: enabling\n", oh->name);

 	/*
 	 * If an IP contains only one HW reset line, then de-assert it in order
@@ -1264,8 +1268,9 @@ static int _enable(struct omap_hwmod *oh)
 		}
 	} else {
 		_disable_clocks(oh);
-		pr_debug("omap_hwmod: %s: _wait_target_ready: %d\n",
-			 oh->name, r);
+		if (strcmp(oh->class->name, "uart"))
+			pr_debug("omap_hwmod: %s: _wait_target_ready: %d\n",
+				 oh->name, r);
 	}

 	return r;
@@ -1287,7 +1292,8 @@ static int _idle(struct omap_hwmod *oh)
 		return -EINVAL;
 	}

-	pr_debug("omap_hwmod: %s: idling\n", oh->name);
+	if (strcmp(oh->class->name, "uart"))
+		pr_debug("omap_hwmod: %s: idling\n", oh->name);

 	if (oh->class->sysc)
 		_idle_sysc(oh);
diff --git a/arch/arm/plat-omap/omap_device.c b/arch/arm/plat-omap/omap_device.c
index 49fc0df..7b27704 100644
--- a/arch/arm/plat-omap/omap_device.c
+++ b/arch/arm/plat-omap/omap_device.c
@@ -75,7 +75,8 @@
  * (device must be reinitialized at this point to use it again)
  *
  */
-#undef DEBUG
+//#undef DEBUG
+#define DEBUG

 #include <linux/kernel.h>
 #include <linux/platform_device.h>
@@ -114,7 +115,8 @@ static int _omap_device_activate(struct
omap_device *od, u8 ignore_lat)
 {
 	struct timespec a, b, c;

-	pr_debug("omap_device: %s: activating\n", od->pdev.name);
+	if (strcmp(od->hwmods[0]->class->name, "uart"))
+		pr_debug("omap_device: %s: activating\n", od->pdev.name);

 	while (od->pm_lat_level > 0) {
 		struct omap_device_pm_latency *odpl;
@@ -138,25 +140,29 @@ static int _omap_device_activate(struct
omap_device *od, u8 ignore_lat)
 		c = timespec_sub(b, a);
 		act_lat = timespec_to_ns(&c);

-		pr_debug("omap_device: %s: pm_lat %d: activate: elapsed time "
-			 "%llu nsec\n", od->pdev.name, od->pm_lat_level,
-			 act_lat);
+		if (strcmp(od->hwmods[0]->class->name, "uart"))
+			pr_debug("omap_device: %s: pm_lat %d: activate: elapsed time "
+				 "%llu nsec\n", od->pdev.name, od->pm_lat_level,
+				 act_lat);

 		if (act_lat > odpl->activate_lat) {
 			odpl->activate_lat_worst = act_lat;
 			if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
 				odpl->activate_lat = act_lat;
-				pr_warning("omap_device: %s.%d: new worst case "
-					   "activate latency %d: %llu\n",
-					   od->pdev.name, od->pdev.id,
-					   od->pm_lat_level, act_lat);
-			} else
-				pr_warning("omap_device: %s.%d: activate "
-					   "latency %d higher than exptected. "
-					   "(%llu > %d)\n",
-					   od->pdev.name, od->pdev.id,
-					   od->pm_lat_level, act_lat,
-					   odpl->activate_lat);
+				if (strcmp(od->hwmods[0]->class->name, "uart"))
+					pr_warning("omap_device: %s.%d: new worst case "
+						"activate latency %d: %llu\n",
+						od->pdev.name, od->pdev.id,
+						od->pm_lat_level, act_lat);
+			} else {
+				if (strcmp(od->hwmods[0]->class->name, "uart"))
+					pr_warning("omap_device: %s.%d: activate "
+						"latency %d higher than exptected. "
+						"(%llu > %d)\n",
+						od->pdev.name, od->pdev.id,
+						od->pm_lat_level, act_lat,
+						odpl->activate_lat);
+			}
 		}

 		od->dev_wakeup_lat -= odpl->activate_lat;
@@ -183,7 +189,8 @@ static int _omap_device_deactivate(struct
omap_device *od, u8 ignore_lat)
 {
 	struct timespec a, b, c;

-	pr_debug("omap_device: %s: deactivating\n", od->pdev.name);
+	if (strcmp(od->hwmods[0]->class->name, "uart"))
+		pr_debug("omap_device: %s: deactivating\n", od->pdev.name);

 	while (od->pm_lat_level < od->pm_lats_cnt) {
 		struct omap_device_pm_latency *odpl;
@@ -206,25 +213,29 @@ static int _omap_device_deactivate(struct
omap_device *od, u8 ignore_lat)
 		c = timespec_sub(b, a);
 		deact_lat = timespec_to_ns(&c);

-		pr_debug("omap_device: %s: pm_lat %d: deactivate: elapsed time "
-			 "%llu nsec\n", od->pdev.name, od->pm_lat_level,
-			 deact_lat);
+		if (strcmp(od->hwmods[0]->class->name, "uart"))
+			pr_debug("omap_device: %s: pm_lat %d: deactivate: elapsed time "
+				 "%llu nsec\n", od->pdev.name, od->pm_lat_level,
+				 deact_lat);

 		if (deact_lat > odpl->deactivate_lat) {
 			odpl->deactivate_lat_worst = deact_lat;
 			if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
 				odpl->deactivate_lat = deact_lat;
-				pr_warning("omap_device: %s.%d: new worst case "
-					   "deactivate latency %d: %llu\n",
-					   od->pdev.name, od->pdev.id,
-					   od->pm_lat_level, deact_lat);
-			} else
-				pr_warning("omap_device: %s.%d: deactivate "
+				if (strcmp(od->hwmods[0]->class->name, "uart"))
+					pr_warning("omap_device: %s.%d: new worst case "
+						   "deactivate latency %d: %llu\n",
+						   od->pdev.name, od->pdev.id,
+						   od->pm_lat_level, deact_lat);
+			} else {
+				if (strcmp(od->hwmods[0]->class->name, "uart"))
+					pr_warning("omap_device: %s.%d: deactivate "
 					   "latency %d higher than exptected. "
 					   "(%llu > %d)\n",
 					   od->pdev.name, od->pdev.id,
 					   od->pm_lat_level, deact_lat,
 					   odpl->deactivate_lat);
+			}
 		}


<<SNIP>>


>
> either the above or something like:
>
> if (pm_runtime_suspended(dev))
>        return 0;
>
> on console_write() ??

Consider the scenario where after bootup uart is idled
with runtime auto suspend so now state of console
uart is RPM_SUSPENDED.

Now you connect a pendrive to ehci-port.

Missing critical usb host <-> device_enumeration prints?

uart_console rpm_suspended state ->

usb_device mass storage device found print ->

console_write -> return without printing.

--
Thanks,
Govindraj.R

^ permalink raw reply related

* Re: [RFC/PATCH v3 00/13] PM QoS: add a per-device latency constraints class
From: mark gross @ 2011-07-29 14:24 UTC (permalink / raw)
  To: Jean Pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <CAORVsuVoJ5-1hx2+6+gCuuWDd9A9h_RTS=U1228BM7EYiG=gxA@mail.gmail.com>

On Fri, Jul 29, 2011 at 10:37:52AM +0200, Jean Pihet wrote:
> Mark,
> 
> On Thu, Jul 28, 2011 at 3:14 PM, mark gross <markgross@thegnar.org> wrote:
> > On Thu, Jul 28, 2011 at 10:30:07AM +0200, jean.pihet@newoldbits.com wrote:
> >> From: Jean Pihet <j-pihet@ti.com>
> >>
> >> This patch set is in an RFC state, for review and comments.
> >>
> >> High level implementation:
> >>
> ...
> 
> >> 7. Misc fixes to improve code readability:
> >> . rename of the PM QoS implementation file from pm_qos_params.[ch] to pm_qos.[ch]
> > I picked the name for the file as pm_qos_params over pm_qos because I
> > wanted to make it implicitly clear that this was not an full QOS
> > implementation.  True QOS carries expectations similar to real time and
> > as the infrastructure is closer to "good intentioned" than even "best
> > effort" and offers no notification when the QOS request is not able to
> > be met and really doesn't implement a true QOS at all, (it just provides
> > the parameter interface for part of one its missing the notification
> > interface when the service level is not met and I think a few other
> > things.) So I wanted to have it named a bit different from just pm_qos.
> >
> > This said I'm not supper attached to the naming of the modules.  If
> > folks want to change it I wouldn't complain (too much anyway;).
> Ok got the idea. I do not know what name to chose though. As suggested
> previously the name pm_qos_params does not reflect the implementation,
> that is why I renamed it.

I must have missed the part where the name doesn't reflect the
implementation was talked about.  I look at the interface and I see
parameters all over the place and a small bit of notification.

--mark.

> 
> >
> > --mark
> > PS I'll look at the rest of the patches tomorrow, this time for real as
> > I'm about to have more free time to focus on non-work stuff :)
> Thanks you for reviewing!
> 
> > FWIW this write up sounds interesting.
> Hope it is readable ;p
> 
> Regards,
> Jean

^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Felipe Balbi @ 2011-07-29 14:02 UTC (permalink / raw)
  To: Govindraj
  Cc: Partha Basak, Tero Kristo, balbi, Govindraj.R, linux-serial,
	linux-pm, linux-omap
In-Reply-To: <CAAL8m4wQqS64xG0Vi9YCOdy3f2=1V7UkJCErpikHOra4X=g7+w@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 7283 bytes --]

Hi,

On Fri, Jul 29, 2011 at 06:28:32PM +0530, Govindraj wrote:
> On Fri, Jul 29, 2011 at 5:49 PM, Felipe Balbi <balbi@ti.com> wrote:
> > On Fri, Jul 29, 2011 at 05:29:12PM +0530, Govindraj wrote:
> >> Yes fine, But there are scenarios even before first runtime_suspend happens,
> >>
> >> ex: uart_port_configure -> get_sync -> pm_generic_runtime_resume
> >> (omap_device_enable in this case) debug printk -> console_write -> get_sync.
> >>
> >> there are numerous such scenarios where we end from runtime context
> >> to runtime api context again, or jumping from one uart port operation
> >> to uart print operation.
> >
> > calling pm_runtime_get_sync() should not be a problem. It should only
> > increase the usage counters... This sounds like a race condition on the
> > driver, no ?
> 
> Actually when we call a API to enable clocks we except the internals of API
> to just enable clocks and return.
> 
> *Clock enable API should not cause or trigger to do a _device_driver_operation_
> even before enabling clocks of the device-driver which called it*
> 
> for uart context get_sync can land me to uart driver back
> even before enabling the uart clocks due to printks.

only if _you_ have prints or _your_ runtime_*() calls, no ?

Let's say omap_hwmod.c wants to do a print:

-> printk()
  -> pm_runtime_get_sync
    -> console_write
  -> pm_runtim_put

now, if you have a printk() on your runtime_resume() before you enable
the clocks, then I can see why you would deadlock:

-> pm_runtime_get_sync
  -> omap_serial_runtime_resume
    -> printk
      -> pm_runtime_get_sync
        -> omap_serial_runtime_resume
	  -> printk
	   -> pm_runtime_get_sync
	    .....

maybe I'm missing something, but can you add a stack dump on your
->runtime_resume and ->runtime_suspend methods, just so we try to figure
out who's to blame here ?

> > What you're experiencing, if I understood correctly, is a deadlock ? In
> > that case, can you try to track the locking mechanism on the omap-serial
> > driver to try to find if there isn't anything obviously wrong ?
> >
> 
> Yes deadlocks. due to entering from runtime context to runtime context
> or entering from uart_port_operation to uart_console_write ops.
> 
> There are already port locks used extensively within the uart driver
> to secure a port operation.
> 
> But cannot secure a port operation while using clock_enable API.
> since clock enable API can land the control back to uart_console_write
> operation..

but in that case, if clock isn't enabled, why don't you just ignore the
print and enable the clock ?? Just return 0 and continue with
clk_enable() ??

> >> So either we should not have those prints from pm_generic layers or suppress
> >> them(seems pretty much a problem for a clean design within the driver
> >> using console_lock/unlock for every get_sync, and for
> >> runtime_put we cannot suppress the prints as it gets scheduled later)
> >>
> >> or if other folks who really need those prints form pm_generic* layers
> >> to debug and analysis then we have no other choice rather control
> >> the clk_enable/disable from outside driver code in idle path.
> >
> > yeah, none of these would be nice :-(
> >
> > I think this needs more debugging to be sure what's exactly going on.
> > What's exactly causing the deadlock ? Which lock is held and never
> > released ?
> >
> 
> I had done some investigations, from scenarios it simply boils down to fact
> to handle clock within uart driver, uart driver expects clock enable API* used
> to just enable uart clocks but rather not trigger a _uart_ops_ within which
> kind of unacceptable from the uart_driver context.

ok, now I see what you mean:

113 static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
114 {
115         struct timespec a, b, c;
116
117         pr_debug("omap_device: %s: activating\n", od->pdev.name);
118
119         while (od->pm_lat_level > 0) {
120                 struct omap_device_pm_latency *odpl;
121                 unsigned long long act_lat = 0;
122
123                 od->pm_lat_level--;
124
125                 odpl = od->pm_lats + od->pm_lat_level;
126
127                 if (!ignore_lat &&
128                     (od->dev_wakeup_lat <= od->_dev_wakeup_lat_limit))
129                         break;
130
131                 read_persistent_clock(&a);
132
133                 /* XXX check return code */
134                 odpl->activate_func(od);
135
136                 read_persistent_clock(&b);
137
138                 c = timespec_sub(b, a);
139                 act_lat = timespec_to_ns(&c);
140
141                 pr_debug("omap_device: %s: pm_lat %d: activate: elapsed time "
142                          "%llu nsec\n", od->pdev.name, od->pm_lat_level,
143                          act_lat);
144
145                 if (act_lat > odpl->activate_lat) {
146                         odpl->activate_lat_worst = act_lat;
147                         if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
148                                 odpl->activate_lat = act_lat;
149                                 pr_warning("omap_device: %s.%d: new worst case "
150                                            "activate latency %d: %llu\n",
151                                            od->pdev.name, od->pdev.id,
152                                            od->pm_lat_level, act_lat);
153                         } else
154                                 pr_warning("omap_device: %s.%d: activate "
155                                            "latency %d higher than exptected. "
156                                            "(%llu > %d)\n",
157                                            od->pdev.name, od->pdev.id,
158                                            od->pm_lat_level, act_lat,
159                                            odpl->activate_lat);
160                 }
161
162                 od->dev_wakeup_lat -= odpl->activate_lat;
163         }
164
165         return 0;
166 }

When that first pr_debug() triggers, UART's hwmod could be disabled, and
that would trigger the state I described above where you would keep on
calling pm_runtime_get_sync() forever ;-)

isn't it enough to patch it like below:

diff --git a/arch/arm/plat-omap/omap_device.c b/arch/arm/plat-omap/omap_device.c
index b6b4097..560f622 100644
--- a/arch/arm/plat-omap/omap_device.c
+++ b/arch/arm/plat-omap/omap_device.c
@@ -114,8 +114,6 @@ static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
 {
        struct timespec a, b, c;
 
-       pr_debug("omap_device: %s: activating\n", od->pdev.name);
-
        while (od->pm_lat_level > 0) {
                struct omap_device_pm_latency *odpl;
                unsigned long long act_lat = 0;
@@ -162,6 +160,8 @@ static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
                od->dev_wakeup_lat -= odpl->activate_lat;
        }
 
+       pr_debug("omap_device: %s: activated\n", od->pdev.name);
+
        return 0;
 }
 

either the above or something like:

if (pm_runtime_suspended(dev))
	return 0;

on console_write() ??

-- 
balbi

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply related

* Re: Runtime PM discussion notes
From: Pavel Machek @ 2011-07-29 13:41 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Mark Brown, linux-pm, linux-omap
In-Reply-To: <201107131104.06995.rjw@sisk.pl>

Hi!

> > > Actually, it just occurred to me that if we're waiting for a system
> > > timer and can hand that off to a suitable timer in the PMIC then we can
> > > do a suspend to RAM for the deep idle state from the hardware point of
> > > view.
> > 
> > Yep.  At LinuxCon Cambridge two years ago, we had a discussion about 
> > whether it would be possible to enter ACPI S-states from CPUIdle (or some 
> > idle governor) on Intel chips.  If I remember correctly, the conclusion 
> > was that ACPI always disables the screen/backlight, so it would only be 
> > useful for situations where that was acceptable.

Well, auto suspending when screensaver is active would still be
useful.

(And IIRC some machines kept screen on when in S-state unless driver
powered it down... but that might be S1.

> The reason why you can't enter ACPI S-states from CPUidle is because you
> need to go out of the idle loop to execute some ACPI-specific stuff.  Which
> is not even specific to Intel chips, but to ACPI in general.

The code was little tricky/unclean, but it "worked" for me at one
point... I called it  "sleepy linux".
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Govindraj @ 2011-07-29 12:58 UTC (permalink / raw)
  To: balbi
  Cc: Partha Basak, Tero Kristo, Govindraj.R, linux-serial, linux-pm,
	linux-omap
In-Reply-To: <20110729121907.GM31013@legolas.emea.dhcp.ti.com>

On Fri, Jul 29, 2011 at 5:49 PM, Felipe Balbi <balbi@ti.com> wrote:
> Hi,
>
> On Fri, Jul 29, 2011 at 05:29:12PM +0530, Govindraj wrote:
>> Yes fine, But there are scenarios even before first runtime_suspend happens,
>>
>> ex: uart_port_configure -> get_sync -> pm_generic_runtime_resume
>> (omap_device_enable in this case) debug printk -> console_write -> get_sync.
>>
>> there are numerous such scenarios where we end from runtime context
>> to runtime api context again, or jumping from one uart port operation
>> to uart print operation.
>
> calling pm_runtime_get_sync() should not be a problem. It should only
> increase the usage counters... This sounds like a race condition on the
> driver, no ?

Actually when we call a API to enable clocks we except the internals of API
to just enable clocks and return.

*Clock enable API should not cause or trigger to do a _device_driver_operation_
even before enabling clocks of the device-driver which called it*

for uart context get_sync can land me to uart driver back
even before enabling the uart clocks due to printks.

> What you're experiencing, if I understood correctly, is a deadlock ? In
> that case, can you try to track the locking mechanism on the omap-serial
> driver to try to find if there isn't anything obviously wrong ?
>

Yes deadlocks. due to entering from runtime context to runtime context
or entering from uart_port_operation to uart_console_write ops.

There are already port locks used extensively within the uart driver
to secure a port operation.

But cannot secure a port operation while using clock_enable API.
since clock enable API can land the control back to uart_console_write
operation..

>> So either we should not have those prints from pm_generic layers or suppress
>> them(seems pretty much a problem for a clean design within the driver
>> using console_lock/unlock for every get_sync, and for
>> runtime_put we cannot suppress the prints as it gets scheduled later)
>>
>> or if other folks who really need those prints form pm_generic* layers
>> to debug and analysis then we have no other choice rather control
>> the clk_enable/disable from outside driver code in idle path.
>
> yeah, none of these would be nice :-(
>
> I think this needs more debugging to be sure what's exactly going on.
> What's exactly causing the deadlock ? Which lock is held and never
> released ?
>

I had done some investigations, from scenarios it simply boils down to fact
to handle clock within uart driver, uart driver expects clock enable API* used
to just enable uart clocks but rather not trigger a _uart_ops_ within which
kind of unacceptable from the uart_driver context.

--
Thanks,
Govindraj.R


*clock enable API can be any API used to enable clocks like
get_sync/ omap_device_enable/clock_enable etc.

^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Felipe Balbi @ 2011-07-29 12:19 UTC (permalink / raw)
  To: Govindraj
  Cc: Partha Basak, Tero Kristo, balbi, Govindraj.R, linux-serial,
	linux-pm, linux-omap
In-Reply-To: <CAAL8m4wq7RQ_Rp-+LMWYY=dtTnhtR5UY+8HpO5C_54goKezaTg@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 1554 bytes --]

Hi,

On Fri, Jul 29, 2011 at 05:29:12PM +0530, Govindraj wrote:
> Yes fine, But there are scenarios even before first runtime_suspend happens,
> 
> ex: uart_port_configure -> get_sync -> pm_generic_runtime_resume
> (omap_device_enable in this case) debug printk -> console_write -> get_sync.
> 
> there are numerous such scenarios where we end from runtime context
> to runtime api context again, or jumping from one uart port operation
> to uart print operation.

calling pm_runtime_get_sync() should not be a problem. It should only
increase the usage counters... This sounds like a race condition on the
driver, no ?

What you're experiencing, if I understood correctly, is a deadlock ? In
that case, can you try to track the locking mechanism on the omap-serial
driver to try to find if there isn't anything obviously wrong ?

> So either we should not have those prints from pm_generic layers or suppress
> them(seems pretty much a problem for a clean design within the driver
> using console_lock/unlock for every get_sync, and for
> runtime_put we cannot suppress the prints as it gets scheduled later)
> 
> or if other folks who really need those prints form pm_generic* layers
> to debug and analysis then we have no other choice rather control
> the clk_enable/disable from outside driver code in idle path.

yeah, none of these would be nice :-(

I think this needs more debugging to be sure what's exactly going on.
What's exactly causing the deadlock ? Which lock is held and never
released ?

-- 
balbi

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Govindraj @ 2011-07-29 11:59 UTC (permalink / raw)
  To: balbi
  Cc: Partha Basak, Tero Kristo, Govindraj.R, linux-serial, linux-pm,
	linux-omap
In-Reply-To: <20110729113713.GK31013@legolas.emea.dhcp.ti.com>

On Fri, Jul 29, 2011 at 5:07 PM, Felipe Balbi <balbi@ti.com> wrote:
> Hi,
>
> On Fri, Jul 29, 2011 at 04:54:44PM +0530, Govindraj wrote:
>> On Fri, Jul 29, 2011 at 3:25 PM, Felipe Balbi <balbi@ti.com> wrote:
>> > Hi,
>> >
>>
>> Thanks for replying.
>>
>> > On Thu, Jul 28, 2011 at 02:59:55PM +0530, Govindraj.R wrote:
>> >> Proposal:
>> >> --------
>> >>       1. For the UART, follow the current approach of locking the console in
>> >>          Idle/Suspend path before cutting the clock but using pm_runtime_putsync.
>> >>          That is, continue using the prepare/resume Idle calls in idle path.
>> >
>> > I believe you should be using ->prepare() to prevent that any other
>> > work on UART won't trigger a console_write() right now. Maybe only
>> > queueing the work for after ->complete() or, maybe, just ignoring the
>> > work and loosing some prints, dunno.
>> >
>>
>> Yes true, for suspend path but for Idle path we don't have any such
>> mechanism.
>
> aha... good point ;-)
>
>> I have done clock gating in idle path integrating irq chaining patches.
>> hosted in gitorious here[1].
>>
>> Was consolidating whether this approach is OK,
>> or
>> Are there any other approaches that I should consider?
>
> why don't you also start queueing writes after the first call to
> runtime_suspend() and flush them after the first call runtime_resume()??
>

Yes fine, But there are scenarios even before first runtime_suspend happens,

ex: uart_port_configure -> get_sync -> pm_generic_runtime_resume
(omap_device_enable in this case) debug printk -> console_write -> get_sync.

there are numerous such scenarios where we end from runtime context
to runtime api context again, or jumping from one uart port operation
to uart print operation.

So either we should not have those prints from pm_generic layers or suppress
them(seems pretty much a problem for a clean design within the driver
using console_lock/unlock for every get_sync, and for
runtime_put we cannot suppress the prints as it gets scheduled later)

or if other folks who really need those prints form pm_generic* layers
to debug and analysis then we have no other choice rather control
the clk_enable/disable from outside driver code in idle path.

--
Thanks
Govindraj.R

*pm_generic layers for omap, referring to omap_device/omap_hwmod layers
which does low level clock handling.

^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Felipe Balbi @ 2011-07-29 11:37 UTC (permalink / raw)
  To: Govindraj
  Cc: Partha Basak, Tero Kristo, balbi, Govindraj.R, linux-serial,
	linux-pm, linux-omap
In-Reply-To: <CAAL8m4xjN15RfwbDd_c1HdoKY327CWOn5qwuoUT=fop4J0K5cw@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 1304 bytes --]

Hi,

On Fri, Jul 29, 2011 at 04:54:44PM +0530, Govindraj wrote:
> On Fri, Jul 29, 2011 at 3:25 PM, Felipe Balbi <balbi@ti.com> wrote:
> > Hi,
> >
> 
> Thanks for replying.
> 
> > On Thu, Jul 28, 2011 at 02:59:55PM +0530, Govindraj.R wrote:
> >> Proposal:
> >> --------
> >>       1. For the UART, follow the current approach of locking the console in
> >>          Idle/Suspend path before cutting the clock but using pm_runtime_putsync.
> >>          That is, continue using the prepare/resume Idle calls in idle path.
> >
> > I believe you should be using ->prepare() to prevent that any other
> > work on UART won't trigger a console_write() right now. Maybe only
> > queueing the work for after ->complete() or, maybe, just ignoring the
> > work and loosing some prints, dunno.
> >
> 
> Yes true, for suspend path but for Idle path we don't have any such
> mechanism.

aha... good point ;-)

> I have done clock gating in idle path integrating irq chaining patches.
> hosted in gitorious here[1].
> 
> Was consolidating whether this approach is OK,
> or
> Are there any other approaches that I should consider?

why don't you also start queueing writes after the first call to
runtime_suspend() and flush them after the first call runtime_resume()??

-- 
balbi

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Govindraj @ 2011-07-29 11:24 UTC (permalink / raw)
  To: balbi
  Cc: Partha Basak, Tero Kristo, Govindraj.R, linux-serial, linux-pm,
	linux-omap
In-Reply-To: <20110729095512.GE31013@legolas.emea.dhcp.ti.com>

On Fri, Jul 29, 2011 at 3:25 PM, Felipe Balbi <balbi@ti.com> wrote:
> Hi,
>

Thanks for replying.

> On Thu, Jul 28, 2011 at 02:59:55PM +0530, Govindraj.R wrote:
>> Proposal:
>> --------
>>       1. For the UART, follow the current approach of locking the console in
>>          Idle/Suspend path before cutting the clock but using pm_runtime_putsync.
>>          That is, continue using the prepare/resume Idle calls in idle path.
>
> I believe you should be using ->prepare() to prevent that any other
> work on UART won't trigger a console_write() right now. Maybe only
> queueing the work for after ->complete() or, maybe, just ignoring the
> work and loosing some prints, dunno.
>

Yes true, for suspend path but for Idle path we don't have any such
mechanism.

>>       2. Other Approach would be adding conditions to debug prints from
>>          omap_device/ omap_hwmod/clock_framework avoid calling these debug
>>          prints for uart.
>>          Or Even a debug macro that would not debug prints if the context is
>>          from uart.
>
> yeah, that might work but then again, if we were using 8250.c, would we
> have this limitation ??
>

Its not necessary which driver we use, its about clock handling mechanism
where it has to be done.

if we add clock handling mechanism into the driver, from driver context
handling clocks + managing the printks for the same uart port is a
issue, due to reasons said earlier.

> I think that, maybe, your best call would be to capture console_write()
> calls into an internal temporary buffer on ->prepare() and flush it once
> ->complete() is called ?
>

IIUC ->prepare and -> complete are only for suspend path
when we do system wide suspend

for normal idle path get_sync and put_sycn we dont have any ->prepare
or -> complete where we can buffer the contents and flush later.

> BTW, where are the patches ? :-)

I have done clock gating in idle path integrating irq chaining patches.
hosted in gitorious here[1].

Was consolidating whether this approach is OK,
or
Are there any other approaches that I should consider?

--
Thanks,
Govindraj.R

[1]: https://gitorious.org/runtime_3-0/runtime_3-0/commits/wip_irqchn


>
> --
> balbi
>

^ permalink raw reply

* Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
From: Felipe Balbi @ 2011-07-29  9:55 UTC (permalink / raw)
  To: Govindraj.R
  Cc: Partha Basak, Tero Kristo, Felipe Balbi, linux-serial, linux-pm,
	linux-omap
In-Reply-To: <1311845395-31917-1-git-send-email-govindraj.raja@ti.com>


[-- Attachment #1.1: Type: text/plain, Size: 1159 bytes --]

Hi,

On Thu, Jul 28, 2011 at 02:59:55PM +0530, Govindraj.R wrote:
> Proposal:
> --------
> 	1. For the UART, follow the current approach of locking the console in
> 	   Idle/Suspend path before cutting the clock but using pm_runtime_putsync.
> 	   That is, continue using the prepare/resume Idle calls in idle path.

I believe you should be using ->prepare() to prevent that any other
work on UART won't trigger a console_write() right now. Maybe only
queueing the work for after ->complete() or, maybe, just ignoring the
work and loosing some prints, dunno.

> 	2. Other Approach would be adding conditions to debug prints from
> 	   omap_device/ omap_hwmod/clock_framework avoid calling these debug
> 	   prints for uart.
> 	   Or Even a debug macro that would not debug prints if the context is
> 	   from uart.

yeah, that might work but then again, if we were using 8250.c, would we
have this limitation ??

I think that, maybe, your best call would be to capture console_write()
calls into an internal temporary buffer on ->prepare() and flush it once
->complete() is called ?

BTW, where are the patches ? :-)

-- 
balbi

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Rafael J. Wysocki @ 2011-07-29  9:10 UTC (permalink / raw)
  To: Turquette, Mike
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, MyungJoo Ham,
	Thomas Gleixner, linux-pm
In-Reply-To: <CAJOA=zNpmQpCyh0ow2q6gUTxkLMcWDsoMOaQEGzV4Sw__BszHg@mail.gmail.com>

On Friday, July 29, 2011, Turquette, Mike wrote:
> On Thu, Jul 28, 2011 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Friday, July 15, 2011, MyungJoo Ham wrote:
> >> For a usage example, please look at
> >> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
> >>
> >> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
> >> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
> >> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
> >> and other related clocks simply follow the determined DDR RAM clock.
> >>
> >> The DEVFREQ driver for Exynos4210 memory bus is at
> >> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
> >>
> >> MyungJoo Ham (3):
> >>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
> >>     OPPs
> >>   PM / DEVFREQ: add example governors
> >>   PM / DEVFREQ: add sysfs interface (including user tickling)
> >
> > OK, I'm going to take the patches for 3.2.
> 
> Have any other platforms signed up to use this mechanism to manage
> their peripheral DVFS?

Not that I know of, but one initial user is sufficient for me.
So if you have anything _against_ the patches, please speak up.

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH 07/13] OMAP PM: early init of the pwrdms states
From: Jean Pihet @ 2011-07-29  8:50 UTC (permalink / raw)
  To: Todd Poynor
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <20110729080820.GB26959@google.com>

On Fri, Jul 29, 2011 at 10:08 AM, Todd Poynor <toddpoynor@google.com> wrote:
> On Thu, Jul 28, 2011 at 10:30:14AM +0200, jean.pihet@newoldbits.com wrote:
>> From: Jean Pihet <j-pihet@ti.com>
>>
>> The powerdomains next states are initialized in pwrdms_setup as a
>> late_initcall. Because the PM QoS devices constraint can be requested
>> early in the boot sequence, the power domains next states can be
>> overwritten by pwrdms_setup.
>>
>> This patch fixes it by initializing the power domains next states
>> early at boot, so that the constraints can be applied.
>> Later in the pwrdms_setup function the currently programmed
>> next states are re-used as next state values.
>>
>> Applies to OMAP3 and OMAP4.
>>
>> Tested on OMAP3 Beagleboard and OMAP4 Pandaboard in RET/OFF using
>> wake-up latency constraints on MPU, CORE and PER.
>>
>> Signed-off-by: Jean Pihet <j-pihet@ti.com>
>> ---
> ...
>> diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
>> index 9af0847..63c3e7a 100644
>> --- a/arch/arm/mach-omap2/powerdomain.c
>> +++ b/arch/arm/mach-omap2/powerdomain.c
>> @@ -108,6 +108,9 @@ static int _pwrdm_register(struct powerdomain *pwrdm)
>>       pwrdm->state = pwrdm_read_pwrst(pwrdm);
>>       pwrdm->state_counter[pwrdm->state] = 1;
>>
>> +     /* Early init of the next power state */
>> +     pwrdm_set_next_pwrst(pwrdm, PWRDM_POWER_RET);
>> +
>
> Wanted to check that it's OK to initialize the next state of a power
> domain to RETENTION early in the boot sequence.  I believe patches
> have been previously discussed that set the state to ON to ensure the
> domain doesn't go to a lower state, and possibly lose context, before
> the PM subsystem is setup to handle it?  Not sure, thought maybe worth
> a doublecheck.
Indeed I need to check the behavior for OMAP3 & 4 which seem to
initialize the pwrdm states differently.
BTW the patch that inits all pwrdms to ON is not yet in l-o master
that is why I (lazily) submitted this one for now.

>
>
> Todd
>
>

Jean

^ permalink raw reply

* Re: [PATCH 08/13] OMAP2+: powerdomain: control power domains next state
From: Jean Pihet @ 2011-07-29  8:47 UTC (permalink / raw)
  To: Todd Poynor
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <20110729075942.GA26959@google.com>

Todd,

On Fri, Jul 29, 2011 at 9:59 AM, Todd Poynor <toddpoynor@google.com> wrote:
> On Thu, Jul 28, 2011 at 10:30:15AM +0200, jean.pihet@newoldbits.com wrote:
> ...
>> +int pwrdm_set_wkup_lat_constraint(struct powerdomain *pwrdm, void *cookie,
>> +                               long min_latency)
>> +{
>> +     struct pwrdm_wkup_constraints_entry *user = NULL;
>> +     struct pwrdm_wkup_constraints_entry *tmp_user, *new_user;
>> +     int ret = 0, free_new_user = 0, free_node = 0;
>> +     long value = 0;
>> +     unsigned long flags;
>> +
>> +     pr_debug("powerdomain: %s: pwrdm %s, cookie=0x%p, min_latency=%ld\n",
>> +              __func__, pwrdm->name, cookie, min_latency);
>> +
>> +     new_user = kzalloc(sizeof(struct pwrdm_wkup_constraints_entry),
>> +                        GFP_KERNEL);
>> +     if (!new_user) {
>> +             pr_err("%s: FATAL ERROR: kzalloc failed\n", __func__);
>> +             return -ENOMEM;
>> +     }
>> +
>> +     spin_lock_irqsave(&pwrdm->wkup_lat_plist_lock, flags);
>> +
>> +     /* Check if there already is a constraint for cookie */
>> +     plist_for_each_entry(tmp_user, &pwrdm->wkup_lat_plist_head, node) {
>> +             if (tmp_user->cookie == cookie) {
>> +                     user = tmp_user;
>> +                     free_new_user = 1;
>> +                     break;
>> +             }
>> +     }
>> +
>> +     if (min_latency != PM_QOS_DEV_LAT_DEFAULT_VALUE) {
>> +             /* If nothing to update, job done */
>> +             if (user && (user->node.prio == min_latency))
>> +                     goto exit_ok;
>> +
>> +             if (!user) {
>> +                     /* Add new entry to the list */
>> +                     user = new_user;
>> +                     user->cookie = cookie;
>> +             } else {
>> +                     /* Update existing entry */
>> +                     plist_del(&user->node, &pwrdm->wkup_lat_plist_head);
>> +             }
>> +
>> +             plist_node_init(&user->node, min_latency);
>> +             plist_add(&user->node, &pwrdm->wkup_lat_plist_head);
>> +     } else {
>> +             /* Remove the constraint from the list */
>> +             if (!user) {
>> +                     pr_err("%s: Error: no prior constraint to release\n",
>> +                            __func__);
>> +                     ret = -EINVAL;
>> +                     goto exit_error;
>> +             }
>> +
>> +             plist_del(&user->node, &pwrdm->wkup_lat_plist_head);
>> +             free_node = 1;
>
> All min_latency != PM_QOS_DEV_LAT_DEFAULT_VALUE paths need
> free_new_user = 1.
free_new_user = 1 is only needed if no existing constraint has been
found, i.e. user stays at NULL. This is implemented in the check for
an existing constraint (plist_for_each_entry(...)).

>(Or maybe change the logic to check user !=
> new_user and free new_user if so.)
That could be done as well.

>
>> +     }
>> +
>> +exit_ok:
>> +     /* Find the strongest constraint from the list */
>> +     if (!plist_head_empty(&pwrdm->wkup_lat_plist_head))
>> +             value = plist_first(&pwrdm->wkup_lat_plist_head)->prio;
>> +
>> +     spin_unlock_irqrestore(&pwrdm->wkup_lat_plist_lock, flags);
>> +
>> +     if (free_node)
>> +             kfree(user);
>> +
>> +     if (free_new_user)
>> +             kfree(new_user);
>> +
>> +     /* Apply the constraint to the pwrdm */
>> +     pr_debug("powerdomain: %s: pwrdm %s, value=%ld\n",
>> +              __func__, pwrdm->name, value);
>> +     pwrdm_wakeuplat_update_pwrst(pwrdm, value);
>> +
>> +     return 0;
>> +
>> +exit_error:
>> +     spin_unlock_irqrestore(&pwrdm->wkup_lat_plist_lock, flags);
>
> Need:
>        kfree(new_user);
Correct!
BTW I have a new version (patch here below) that improves the cases
with latency = PM_QOS_DEV_LAT_DEFAULT_VALUE. This will come in v4
after testing.

>
>> +     return ret;
>> +}
>
>
> Todd
>

Thanks for reviewing!

Regards,
Jean

---
Patch for cases with latency = PM_QOS_DEV_LAT_DEFAULT_VALUE:

diff --git a/arch/arm/mach-omap2/powerdomain.c
b/arch/arm/mach-omap2/powerdomain.c
index c0f48ab..2e9379b 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -206,7 +206,7 @@ static int _pwrdm_post_transition_cb(struct
powerdomain *pwrdm, void *unused)
  * pwrdm_wakeuplat_update_pwrst - Update power domain power state if needed
  * @pwrdm: struct powerdomain * to which requesting device belongs to.
  * @min_latency: the allowed wake-up latency for the given power domain. A
- *  value of 0 means 'no constraint' on the pwrdm.
+ *  value of PM_QOS_DEV_LAT_DEFAULT_VALUE means 'no constraint' on the pwrdm.
  *
  * Finds the power domain next power state that fulfills the constraint.
  * Programs a new target state if it is different from current power state.
@@ -232,7 +232,7 @@ static int pwrdm_wakeuplat_update_pwrst(struct
powerdomain *pwrdm,

 	/* Find power state with wakeup latency < minimum constraint */
 	for (new_state = 0x0; new_state < PWRDM_MAX_PWRSTS; new_state++) {
-		if (min_latency == 0 ||
+		if (min_latency == PM_QOS_DEV_LAT_DEFAULT_VALUE ||
 		    pwrdm->wakeup_lat[new_state] <= min_latency)
 			break;
 	}
@@ -1018,8 +1018,8 @@ int pwrdm_post_transition(void)
  * @pwrdm: struct powerdomain * which the constraint applies to
  * @cookie: constraint identifier, used for tracking.
  * @min_latency: minimum wakeup latency constraint (in microseconds) for
- *  the given pwrdm. The value of PM_QOS_DEV_WAKEUP_LAT_DEFAULT_VALUE
- *  removes the constraint.
+ *  the given pwrdm. The value of PM_QOS_DEV_LAT_DEFAULT_VALUE removes
+ *  the constraint.
  *
  * Tracks the constraints by @cookie.
  * Constraint set/update: Adds a new entry to powerdomain's wake-up latency
@@ -1042,7 +1042,7 @@ int pwrdm_set_wkup_lat_constraint(struct
powerdomain *pwrdm, void *cookie,
 	struct pwrdm_wkup_constraints_entry *user = NULL;
 	struct pwrdm_wkup_constraints_entry *tmp_user, *new_user;
 	int ret = 0, free_new_user = 0, free_node = 0;
-	long value = 0;
+	long value = PM_QOS_DEV_LAT_DEFAULT_VALUE;
 	unsigned long flags;

 	pr_debug("powerdomain: %s: pwrdm %s, cookie=0x%p, min_latency=%ld\n",
@@ -1083,16 +1083,17 @@ int pwrdm_set_wkup_lat_constraint(struct
powerdomain *pwrdm, void *cookie,
 		plist_node_init(&user->node, min_latency);
 		plist_add(&user->node, &pwrdm->wkup_lat_plist_head);
 	} else {
-		/* Remove the constraint from the list */
-		if (!user) {
-			pr_err("%s: Error: no prior constraint to release\n",
-			       __func__);
+		if (user) {
+			/* Remove the constraint from the list */
+			plist_del(&user->node, &pwrdm->wkup_lat_plist_head);
+			free_node = 1;
+		} else {
+			/* Constraint not existing or list empty, do nothing */
+			free_new_user = 1;
 			ret = -EINVAL;
 			goto exit_error;
 		}

-		plist_del(&user->node, &pwrdm->wkup_lat_plist_head);
-		free_node = 1;
 	}

 exit_ok:
@@ -1117,6 +1118,10 @@ exit_ok:

 exit_error:
 	spin_unlock_irqrestore(&pwrdm->wkup_lat_plist_lock, flags);
+
+	if (free_new_user)
+		kfree(new_user);
+
 	return ret;
 }

^ permalink raw reply related

* Re: [RFC/PATCH v3 00/13] PM QoS: add a per-device latency constraints class
From: Jean Pihet @ 2011-07-29  8:37 UTC (permalink / raw)
  To: markgross; +Cc: broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <20110728131454.GA2579@gvim.org>

Mark,

On Thu, Jul 28, 2011 at 3:14 PM, mark gross <markgross@thegnar.org> wrote:
> On Thu, Jul 28, 2011 at 10:30:07AM +0200, jean.pihet@newoldbits.com wrote:
>> From: Jean Pihet <j-pihet@ti.com>
>>
>> This patch set is in an RFC state, for review and comments.
>>
>> High level implementation:
>>
...

>> 7. Misc fixes to improve code readability:
>> . rename of the PM QoS implementation file from pm_qos_params.[ch] to pm_qos.[ch]
> I picked the name for the file as pm_qos_params over pm_qos because I
> wanted to make it implicitly clear that this was not an full QOS
> implementation.  True QOS carries expectations similar to real time and
> as the infrastructure is closer to "good intentioned" than even "best
> effort" and offers no notification when the QOS request is not able to
> be met and really doesn't implement a true QOS at all, (it just provides
> the parameter interface for part of one its missing the notification
> interface when the service level is not met and I think a few other
> things.) So I wanted to have it named a bit different from just pm_qos.
>
> This said I'm not supper attached to the naming of the modules.  If
> folks want to change it I wouldn't complain (too much anyway;).
Ok got the idea. I do not know what name to chose though. As suggested
previously the name pm_qos_params does not reflect the implementation,
that is why I renamed it.

>
> --mark
> PS I'll look at the rest of the patches tomorrow, this time for real as
> I'm about to have more free time to focus on non-work stuff :)
Thanks you for reviewing!

> FWIW this write up sounds interesting.
Hope it is readable ;p

Regards,
Jean

^ permalink raw reply

* Re: [PATCH 07/13] OMAP PM: early init of the pwrdms states
From: Todd Poynor @ 2011-07-29  8:08 UTC (permalink / raw)
  To: jean.pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <1311841821-10252-8-git-send-email-j-pihet@ti.com>

On Thu, Jul 28, 2011 at 10:30:14AM +0200, jean.pihet@newoldbits.com wrote:
> From: Jean Pihet <j-pihet@ti.com>
> 
> The powerdomains next states are initialized in pwrdms_setup as a
> late_initcall. Because the PM QoS devices constraint can be requested
> early in the boot sequence, the power domains next states can be
> overwritten by pwrdms_setup.
> 
> This patch fixes it by initializing the power domains next states
> early at boot, so that the constraints can be applied.
> Later in the pwrdms_setup function the currently programmed
> next states are re-used as next state values.
> 
> Applies to OMAP3 and OMAP4.
> 
> Tested on OMAP3 Beagleboard and OMAP4 Pandaboard in RET/OFF using
> wake-up latency constraints on MPU, CORE and PER.
> 
> Signed-off-by: Jean Pihet <j-pihet@ti.com>
> ---
...
> diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
> index 9af0847..63c3e7a 100644
> --- a/arch/arm/mach-omap2/powerdomain.c
> +++ b/arch/arm/mach-omap2/powerdomain.c
> @@ -108,6 +108,9 @@ static int _pwrdm_register(struct powerdomain *pwrdm)
>  	pwrdm->state = pwrdm_read_pwrst(pwrdm);
>  	pwrdm->state_counter[pwrdm->state] = 1;
>  
> +	/* Early init of the next power state */
> +	pwrdm_set_next_pwrst(pwrdm, PWRDM_POWER_RET);
> +

Wanted to check that it's OK to initialize the next state of a power
domain to RETENTION early in the boot sequence.  I believe patches
have been previously discussed that set the state to ON to ensure the
domain doesn't go to a lower state, and possibly lose context, before
the PM subsystem is setup to handle it?  Not sure, thought maybe worth
a doublecheck.


Todd

^ permalink raw reply

* Re: [PATCH 08/13] OMAP2+: powerdomain: control power domains next state
From: Todd Poynor @ 2011-07-29  7:59 UTC (permalink / raw)
  To: jean.pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <1311841821-10252-9-git-send-email-j-pihet@ti.com>

On Thu, Jul 28, 2011 at 10:30:15AM +0200, jean.pihet@newoldbits.com wrote:
...
> +int pwrdm_set_wkup_lat_constraint(struct powerdomain *pwrdm, void *cookie,
> +				  long min_latency)
> +{
> +	struct pwrdm_wkup_constraints_entry *user = NULL;
> +	struct pwrdm_wkup_constraints_entry *tmp_user, *new_user;
> +	int ret = 0, free_new_user = 0, free_node = 0;
> +	long value = 0;
> +	unsigned long flags;
> +
> +	pr_debug("powerdomain: %s: pwrdm %s, cookie=0x%p, min_latency=%ld\n",
> +		 __func__, pwrdm->name, cookie, min_latency);
> +
> +	new_user = kzalloc(sizeof(struct pwrdm_wkup_constraints_entry),
> +			   GFP_KERNEL);
> +	if (!new_user) {
> +		pr_err("%s: FATAL ERROR: kzalloc failed\n", __func__);
> +		return -ENOMEM;
> +	}
> +
> +	spin_lock_irqsave(&pwrdm->wkup_lat_plist_lock, flags);
> +
> +	/* Check if there already is a constraint for cookie */
> +	plist_for_each_entry(tmp_user, &pwrdm->wkup_lat_plist_head, node) {
> +		if (tmp_user->cookie == cookie) {
> +			user = tmp_user;
> +			free_new_user = 1;
> +			break;
> +		}
> +	}
> +
> +	if (min_latency != PM_QOS_DEV_LAT_DEFAULT_VALUE) {
> +		/* If nothing to update, job done */
> +		if (user && (user->node.prio == min_latency))
> +			goto exit_ok;
> +
> +		if (!user) {
> +			/* Add new entry to the list */
> +			user = new_user;
> +			user->cookie = cookie;
> +		} else {
> +			/* Update existing entry */
> +			plist_del(&user->node, &pwrdm->wkup_lat_plist_head);
> +		}
> +
> +		plist_node_init(&user->node, min_latency);
> +		plist_add(&user->node, &pwrdm->wkup_lat_plist_head);
> +	} else {
> +		/* Remove the constraint from the list */
> +		if (!user) {
> +			pr_err("%s: Error: no prior constraint to release\n",
> +			       __func__);
> +			ret = -EINVAL;
> +			goto exit_error;
> +		}
> +
> +		plist_del(&user->node, &pwrdm->wkup_lat_plist_head);
> +		free_node = 1;

All min_latency != PM_QOS_DEV_LAT_DEFAULT_VALUE paths need
free_new_user = 1.  (Or maybe change the logic to check user !=
new_user and free new_user if so.)

> +	}
> +
> +exit_ok:
> +	/* Find the strongest constraint from the list */
> +	if (!plist_head_empty(&pwrdm->wkup_lat_plist_head))
> +		value = plist_first(&pwrdm->wkup_lat_plist_head)->prio;
> +
> +	spin_unlock_irqrestore(&pwrdm->wkup_lat_plist_lock, flags);
> +
> +	if (free_node)
> +		kfree(user);
> +
> +	if (free_new_user)
> +		kfree(new_user);
> +
> +	/* Apply the constraint to the pwrdm */
> +	pr_debug("powerdomain: %s: pwrdm %s, value=%ld\n",
> +		 __func__, pwrdm->name, value);
> +	pwrdm_wakeuplat_update_pwrst(pwrdm, value);
> +
> +	return 0;
> +
> +exit_error:
> +	spin_unlock_irqrestore(&pwrdm->wkup_lat_plist_lock, flags);

Need:
        kfree(new_user);

> +	return ret;
> +}


Todd

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Turquette, Mike @ 2011-07-29  4:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, MyungJoo Ham,
	Thomas Gleixner, linux-pm
In-Reply-To: <201107290010.53788.rjw@sisk.pl>

On Thu, Jul 28, 2011 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, July 15, 2011, MyungJoo Ham wrote:
>> For a usage example, please look at
>> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>
>> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>> and other related clocks simply follow the determined DDR RAM clock.
>>
>> The DEVFREQ driver for Exynos4210 memory bus is at
>> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>>
>> MyungJoo Ham (3):
>>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>     OPPs
>>   PM / DEVFREQ: add example governors
>>   PM / DEVFREQ: add sysfs interface (including user tickling)
>
> OK, I'm going to take the patches for 3.2.

Have any other platforms signed up to use this mechanism to manage
their peripheral DVFS?

Thanks,
Mike

> Thanks,
> Rafael
>
>
>>  Documentation/ABI/testing/sysfs-devices-power |   50 ++
>>  Documentation/ABI/testing/sysfs-power         |   43 ++
>>  drivers/base/power/Makefile                   |    1 +
>>  drivers/base/power/devfreq.c                  |  714 +++++++++++++++++++++++++
>>  drivers/base/power/opp.c                      |    9 +
>>  include/linux/devfreq.h                       |  119 ++++
>>  kernel/power/Kconfig                          |   34 ++
>>  7 files changed, 970 insertions(+), 0 deletions(-)
>>  create mode 100644 drivers/base/power/devfreq.c
>>  create mode 100644 include/linux/devfreq.h
>>
>>
>
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Rafael J. Wysocki @ 2011-07-28 22:10 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, linux-pm,
	Thomas Gleixner
In-Reply-To: <1310717510-19002-1-git-send-email-myungjoo.ham@samsung.com>

On Friday, July 15, 2011, MyungJoo Ham wrote:
> For a usage example, please look at
> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
> 
> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
> and other related clocks simply follow the determined DDR RAM clock.
> 
> The DEVFREQ driver for Exynos4210 memory bus is at
> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
> 
> MyungJoo Ham (3):
>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>     OPPs
>   PM / DEVFREQ: add example governors
>   PM / DEVFREQ: add sysfs interface (including user tickling)

OK, I'm going to take the patches for 3.2.

Thanks,
Rafael


>  Documentation/ABI/testing/sysfs-devices-power |   50 ++
>  Documentation/ABI/testing/sysfs-power         |   43 ++
>  drivers/base/power/Makefile                   |    1 +
>  drivers/base/power/devfreq.c                  |  714 +++++++++++++++++++++++++
>  drivers/base/power/opp.c                      |    9 +
>  include/linux/devfreq.h                       |  119 ++++
>  kernel/power/Kconfig                          |   34 ++
>  7 files changed, 970 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/base/power/devfreq.c
>  create mode 100644 include/linux/devfreq.h
> 
> 

^ permalink raw reply

* Re: [patch 1/1] [PATCH] include storage keys in hibernation image.
From: Rafael J. Wysocki @ 2011-07-28 22:01 UTC (permalink / raw)
  To: Martin Schwidefsky; +Cc: linux-s390, linux-pm, Jiri Slaby, linux-kernel
In-Reply-To: <20110608074648.287491002@de.ibm.com>

Hi,

Sorry for the extreme delay.

On Wednesday, June 08, 2011, Martin Schwidefsky wrote:
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> For s390 there is one additional byte associated with each page,
> the storage key. This byte contains the referenced and changed
> bits and needs to be included into the hibernation image.
> If the storage keys are not restored to their previous state all
> original pages would appear to be dirty. This can cause
> inconsistencies e.g. with read-only filesystems.
> 
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Rafael J. Wysocki <rjw@sisk.pl>
> Cc: Jiri Slaby <jslaby@suse.cz>
> Cc: linux-pm@lists.linux-foundation.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-s390@vger.kernel.org
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
> 
>  arch/s390/kernel/swsusp_asm64.S |    3 
>  kernel/power/snapshot.c         |  148 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 151 insertions(+)
> 
> diff -urpN linux-2.6/arch/s390/kernel/swsusp_asm64.S linux-2.6-patched/arch/s390/kernel/swsusp_asm64.S
> --- linux-2.6/arch/s390/kernel/swsusp_asm64.S	2011-05-19 06:06:34.000000000 +0200
> +++ linux-2.6-patched/arch/s390/kernel/swsusp_asm64.S	2011-06-06 11:14:43.000000000 +0200
> @@ -138,11 +138,14 @@ swsusp_arch_resume:
>  0:
>  	lg	%r2,8(%r1)
>  	lg	%r4,0(%r1)
> +	iske	%r0,%r4
>  	lghi	%r3,PAGE_SIZE
>  	lghi	%r5,PAGE_SIZE
>  1:
>  	mvcle	%r2,%r4,0
>  	jo	1b
> +	lg	%r2,8(%r1)
> +	sske	%r0,%r2
>  	lg	%r1,16(%r1)
>  	ltgr	%r1,%r1
>  	jnz	0b
> diff -urpN linux-2.6/kernel/power/snapshot.c linux-2.6-patched/kernel/power/snapshot.c
> --- linux-2.6/kernel/power/snapshot.c	2011-06-06 11:14:39.000000000 +0200
> +++ linux-2.6-patched/kernel/power/snapshot.c	2011-06-06 11:14:43.000000000 +0200
> @@ -1022,6 +1022,137 @@ static inline void copy_data_page(unsign
>  }
>  #endif /* CONFIG_HIGHMEM */
>  
> +#ifdef CONFIG_S390
> +/*
> + * For s390 there is one additional byte associated with each page,
> + * the storage key. The key consists of the access-control bits
> + * (alias the key), the fetch-protection bit, the referenced bit
> + * and the change bit (alias dirty bit). Linux uses only the
> + * referenced bit and the change bit for pages in the page cache.
> + * Any modification of a page will set the change bit, any access
> + * sets the referenced bit. Overindication of the referenced bits
> + * after a hibernation cycle does not cause any harm but the
> + * overindication of the change bits would cause trouble.
> + * Therefore it is necessary to include the storage keys in the
> + * hibernation image. The storage keys are saved to the most
> + * significant byte of each page frame number in the list of pfns
> + * in the hibernation image.
> + */
> +
> +/*
> + * Key storage is allocated as a linked list of pages.
> + * The size of the keys array is (PAGE_SIZE - sizeof(long))
> + */
> +struct page_key_data {
> +	struct page_key_data *next;
> +	unsigned char data[];
> +};
> +
> +#define PAGE_KEY_DATA_SIZE	(PAGE_SIZE - sizeof(struct page_key_data *))
> +
> +static struct page_key_data *page_key_data;
> +static struct page_key_data *page_key_rp, *page_key_wp;
> +static unsigned long page_key_rx, page_key_wx;
> +
> +/*
> + * For each page in the hibernation image one additional byte is
> + * stored in the most significant byte of the page frame number.
> + * On suspend no additional memory is required but on resume the
> + * keys need to be memorized until the page data has been restored.
> + * Only then can the storage keys be set to their old state.
> + */
> +static inline unsigned long page_key_additional_pages(unsigned long pages)
> +{
> +	return DIV_ROUND_UP(pages, PAGE_KEY_DATA_SIZE);
> +}
> +
> +/*
> + * Free page_key_data list of arrays.
> + */
> +static void page_key_free(void)
> +{
> +	struct page_key_data *pkd;
> +
> +	while (page_key_data) {
> +		pkd = page_key_data;
> +		page_key_data = pkd->next;
> +		free_page((unsigned long) pkd);
> +	}
> +}
> +
> +/*
> + * Allocate page_key_data list of arrays with enough room to store
> + * one byte for each page in the hibernation image.
> + */
> +static int page_key_alloc(unsigned long pages)
> +{
> +	struct page_key_data *pk;
> +	unsigned long size;
> +
> +	size = DIV_ROUND_UP(pages, PAGE_KEY_DATA_SIZE);
> +	while (size--) {
> +		pk = (struct page_key_data *) get_zeroed_page(GFP_KERNEL);
> +		if (!pk) {
> +			page_key_free();
> +			return -ENOMEM;
> +		}
> +		pk->next = page_key_data;
> +		page_key_data = pk;
> +	}
> +	page_key_rp = page_key_wp = page_key_data;
> +	page_key_rx = page_key_wx = 0;
> +	return 0;
> +}
> +
> +/*
> + * Save the storage key into the upper 8 bits of the page frame number.
> + */
> +static inline void page_key_read(unsigned long *pfn)
> +{
> +	unsigned long addr;
> +
> +	addr = (unsigned long) page_address(pfn_to_page(*pfn));
> +	*(unsigned char *) pfn = (unsigned char) page_get_storage_key(addr);
> +}
> +
> +/*
> + * Extract the storage key from the upper 8 bits of the page frame number
> + * and store it in the page_key_data list of arrays.
> + */
> +static inline void page_key_memorize(unsigned long *pfn)
> +{
> +	page_key_wp->data[page_key_wx] = *(unsigned char *) pfn;
> +	*(unsigned char *) pfn = 0;
> +	if (++page_key_wx < PAGE_KEY_DATA_SIZE)
> +		return;
> +	page_key_wp = page_key_wp->next;
> +	page_key_wx = 0;
> +}
> +
> +/*
> + * Get the next key from the page_key_data list of arrays and set the
> + * storage key of the page referred by @address. If @address refers to
> + * a "safe" page the swsusp_arch_resume code will transfer the storage
> + * key from the buffer page to the original page.
> + */
> +static void page_key_write(void *address)
> +{
> +	page_set_storage_key((unsigned long) address,
> +			     page_key_rp->data[page_key_rx], 0);
> +	if (++page_key_rx >= PAGE_KEY_DATA_SIZE)
> +		return;
> +	page_key_rp = page_key_rp->next;
> +	page_key_rx = 0;
> +}
> +#else
> +static unsigned long page_key_additional_pages(unsigned long pages) { return 0; }
> +static inline int  page_key_alloc(unsigned long pages) { return 0; }
> +static inline void page_key_free(void) { }
> +static inline void page_key_read(unsigned long *pfn) { }
> +static inline void page_key_memorize(unsigned long *pfn) { }
> +static inline void page_key_write(void *address) { }
> +#endif

Having reconsidered things I think the code under the #ifdef above
should really go to arch/s390.

Now, for the purpose of exporting the headers I'd introduce
CONFIG_ARCH_SAVE_PAGE_KEYS and make S390 do

select ARCH_SAVE_PAGE_KEYS if HIBERNATION

and I'd put a #ifdef depending on that into include/linux/suspend.h.

Apart from this, I have only one complaint, which is that the kerneldoc
comments should follow the standard (the other comments in snapshot.c don't,
but that's a matter for a separate patch).

Thanks,
Rafael

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox