From: Felipe Balbi <balbi@ti.com>
To: Govindraj <govindraj.ti@gmail.com>
Cc: balbi@ti.com, "Govindraj.R" <govindraj.raja@ti.com>,
linux-omap@vger.kernel.org, linux-serial@vger.kernel.org,
linux-pm@lists.linux-foundation.org,
"Rafael J. Wysocki" <rjw@sisk.pl>, Paul Walmsley <paul@pwsan.com>,
Kevin Hilman <khilman@ti.com>,
Vishwanath Sripathy <vishwanath.bs@ti.com>,
Partha Basak <p-basak2@ti.com>,
Santosh Shilimkar <santosh.shilimkar@ti.com>,
Rajendra Nayak <rnayak@ti.com>, Benoit Cousson <b-cousson@ti.com>,
Tero Kristo <t-kristo@ti.com>
Subject: Re: [RFC v2]: Issues implementing clock handling mechanism within UART driver
Date: Fri, 29 Jul 2011 17:02:11 +0300 [thread overview]
Message-ID: <20110729140210.GT31013@legolas.emea.dhcp.ti.com> (raw)
In-Reply-To: <CAAL8m4wQqS64xG0Vi9YCOdy3f2=1V7UkJCErpikHOra4X=g7+w@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 7283 bytes --]
Hi,
On Fri, Jul 29, 2011 at 06:28:32PM +0530, Govindraj wrote:
> On Fri, Jul 29, 2011 at 5:49 PM, Felipe Balbi <balbi@ti.com> wrote:
> > On Fri, Jul 29, 2011 at 05:29:12PM +0530, Govindraj wrote:
> >> Yes fine, But there are scenarios even before first runtime_suspend happens,
> >>
> >> ex: uart_port_configure -> get_sync -> pm_generic_runtime_resume
> >> (omap_device_enable in this case) debug printk -> console_write -> get_sync.
> >>
> >> there are numerous such scenarios where we end from runtime context
> >> to runtime api context again, or jumping from one uart port operation
> >> to uart print operation.
> >
> > calling pm_runtime_get_sync() should not be a problem. It should only
> > increase the usage counters... This sounds like a race condition on the
> > driver, no ?
>
> Actually when we call a API to enable clocks we except the internals of API
> to just enable clocks and return.
>
> *Clock enable API should not cause or trigger to do a _device_driver_operation_
> even before enabling clocks of the device-driver which called it*
>
> for uart context get_sync can land me to uart driver back
> even before enabling the uart clocks due to printks.
only if _you_ have prints or _your_ runtime_*() calls, no ?
Let's say omap_hwmod.c wants to do a print:
-> printk()
-> pm_runtime_get_sync
-> console_write
-> pm_runtim_put
now, if you have a printk() on your runtime_resume() before you enable
the clocks, then I can see why you would deadlock:
-> pm_runtime_get_sync
-> omap_serial_runtime_resume
-> printk
-> pm_runtime_get_sync
-> omap_serial_runtime_resume
-> printk
-> pm_runtime_get_sync
.....
maybe I'm missing something, but can you add a stack dump on your
->runtime_resume and ->runtime_suspend methods, just so we try to figure
out who's to blame here ?
> > What you're experiencing, if I understood correctly, is a deadlock ? In
> > that case, can you try to track the locking mechanism on the omap-serial
> > driver to try to find if there isn't anything obviously wrong ?
> >
>
> Yes deadlocks. due to entering from runtime context to runtime context
> or entering from uart_port_operation to uart_console_write ops.
>
> There are already port locks used extensively within the uart driver
> to secure a port operation.
>
> But cannot secure a port operation while using clock_enable API.
> since clock enable API can land the control back to uart_console_write
> operation..
but in that case, if clock isn't enabled, why don't you just ignore the
print and enable the clock ?? Just return 0 and continue with
clk_enable() ??
> >> So either we should not have those prints from pm_generic layers or suppress
> >> them(seems pretty much a problem for a clean design within the driver
> >> using console_lock/unlock for every get_sync, and for
> >> runtime_put we cannot suppress the prints as it gets scheduled later)
> >>
> >> or if other folks who really need those prints form pm_generic* layers
> >> to debug and analysis then we have no other choice rather control
> >> the clk_enable/disable from outside driver code in idle path.
> >
> > yeah, none of these would be nice :-(
> >
> > I think this needs more debugging to be sure what's exactly going on.
> > What's exactly causing the deadlock ? Which lock is held and never
> > released ?
> >
>
> I had done some investigations, from scenarios it simply boils down to fact
> to handle clock within uart driver, uart driver expects clock enable API* used
> to just enable uart clocks but rather not trigger a _uart_ops_ within which
> kind of unacceptable from the uart_driver context.
ok, now I see what you mean:
113 static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
114 {
115 struct timespec a, b, c;
116
117 pr_debug("omap_device: %s: activating\n", od->pdev.name);
118
119 while (od->pm_lat_level > 0) {
120 struct omap_device_pm_latency *odpl;
121 unsigned long long act_lat = 0;
122
123 od->pm_lat_level--;
124
125 odpl = od->pm_lats + od->pm_lat_level;
126
127 if (!ignore_lat &&
128 (od->dev_wakeup_lat <= od->_dev_wakeup_lat_limit))
129 break;
130
131 read_persistent_clock(&a);
132
133 /* XXX check return code */
134 odpl->activate_func(od);
135
136 read_persistent_clock(&b);
137
138 c = timespec_sub(b, a);
139 act_lat = timespec_to_ns(&c);
140
141 pr_debug("omap_device: %s: pm_lat %d: activate: elapsed time "
142 "%llu nsec\n", od->pdev.name, od->pm_lat_level,
143 act_lat);
144
145 if (act_lat > odpl->activate_lat) {
146 odpl->activate_lat_worst = act_lat;
147 if (odpl->flags & OMAP_DEVICE_LATENCY_AUTO_ADJUST) {
148 odpl->activate_lat = act_lat;
149 pr_warning("omap_device: %s.%d: new worst case "
150 "activate latency %d: %llu\n",
151 od->pdev.name, od->pdev.id,
152 od->pm_lat_level, act_lat);
153 } else
154 pr_warning("omap_device: %s.%d: activate "
155 "latency %d higher than exptected. "
156 "(%llu > %d)\n",
157 od->pdev.name, od->pdev.id,
158 od->pm_lat_level, act_lat,
159 odpl->activate_lat);
160 }
161
162 od->dev_wakeup_lat -= odpl->activate_lat;
163 }
164
165 return 0;
166 }
When that first pr_debug() triggers, UART's hwmod could be disabled, and
that would trigger the state I described above where you would keep on
calling pm_runtime_get_sync() forever ;-)
isn't it enough to patch it like below:
diff --git a/arch/arm/plat-omap/omap_device.c b/arch/arm/plat-omap/omap_device.c
index b6b4097..560f622 100644
--- a/arch/arm/plat-omap/omap_device.c
+++ b/arch/arm/plat-omap/omap_device.c
@@ -114,8 +114,6 @@ static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
{
struct timespec a, b, c;
- pr_debug("omap_device: %s: activating\n", od->pdev.name);
-
while (od->pm_lat_level > 0) {
struct omap_device_pm_latency *odpl;
unsigned long long act_lat = 0;
@@ -162,6 +160,8 @@ static int _omap_device_activate(struct omap_device *od, u8 ignore_lat)
od->dev_wakeup_lat -= odpl->activate_lat;
}
+ pr_debug("omap_device: %s: activated\n", od->pdev.name);
+
return 0;
}
either the above or something like:
if (pm_runtime_suspended(dev))
return 0;
on console_write() ??
--
balbi
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]
next prev parent reply other threads:[~2011-07-29 14:02 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-07-28 9:29 [RFC v2]: Issues implementing clock handling mechanism within UART driver Govindraj.R
2011-07-29 9:55 ` Felipe Balbi
2011-07-29 11:24 ` Govindraj
2011-07-29 11:37 ` Felipe Balbi
2011-07-29 11:59 ` Govindraj
2011-07-29 12:19 ` Felipe Balbi
2011-07-29 12:58 ` Govindraj
2011-07-29 14:02 ` Felipe Balbi [this message]
2011-07-29 15:13 ` Govindraj
2011-08-01 9:03 ` Felipe Balbi
2011-08-01 9:56 ` Raja, Govindraj
2011-08-01 10:02 ` Felipe Balbi
2011-08-01 12:46 ` Govindraj
2011-08-01 10:00 ` Govindraj
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110729140210.GT31013@legolas.emea.dhcp.ti.com \
--to=balbi@ti.com \
--cc=b-cousson@ti.com \
--cc=govindraj.raja@ti.com \
--cc=govindraj.ti@gmail.com \
--cc=khilman@ti.com \
--cc=linux-omap@vger.kernel.org \
--cc=linux-pm@lists.linux-foundation.org \
--cc=linux-serial@vger.kernel.org \
--cc=p-basak2@ti.com \
--cc=paul@pwsan.com \
--cc=rjw@sisk.pl \
--cc=rnayak@ti.com \
--cc=santosh.shilimkar@ti.com \
--cc=t-kristo@ti.com \
--cc=vishwanath.bs@ti.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).