* [PATCH 1/1] mmc: core: Use delayed work in clock gating framework
@ 2011-10-19 14:48 Sujit Reddy Thumma
2011-10-24 7:23 ` Linus Walleij
0 siblings, 1 reply; 4+ messages in thread
From: Sujit Reddy Thumma @ 2011-10-19 14:48 UTC (permalink / raw)
To: linux-mmc; +Cc: Sujit Reddy Thumma, linux-arm-msm, cjb
Current clock gating framework disables the MCI clock as soon as the
request is completed and enables it when a request arrives. This aggressive
clock gating framework when enabled cause following issues:
When there are back-to-back requests from the Queue layer, we unnecessarily
end up disabling and enabling the clocks between these requests since 8 MCLK
clock cycles is a very short duration compared to the time delay between
back to back requests reaching the MMC layer. This overhead can effect the
overall performance depending on how long the clock enable and disable
calls take which is platform dependent. For example on some platforms we
can have clock control not on the local processor, but on a different
subsystem and the time taken to perform the clock enable/disable can add
significant overhead.
Also if the host controller driver decides to disable the host clock too
when mmc_set_ios function is called with ios.clock=0, it adds additional
delay and it is highly possible that the next request had already arrived
and unnecessarily blocked in enabling the clocks. This is seen frequently
when the processor is executing at high speeds and in multi-core platforms
thus reduces the overall throughput compared to if clock gating is disabled.
Fix this by delaying turning off the clocks by posting request on
delayed workqueue. Also cancel the unscheduled pending work, if any,
when there is access to card.
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
---
drivers/mmc/core/host.c | 11 +++++++----
include/linux/mmc/host.h | 3 ++-
2 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index ca2e4f5..b7b51da 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -113,7 +113,7 @@ static void mmc_host_clk_gate_delayed(struct mmc_host *host)
static void mmc_host_clk_gate_work(struct work_struct *work)
{
struct mmc_host *host = container_of(work, struct mmc_host,
- clk_gate_work);
+ clk_gate_work.work);
mmc_host_clk_gate_delayed(host);
}
@@ -130,6 +130,8 @@ void mmc_host_clk_hold(struct mmc_host *host)
{
unsigned long flags;
+ /* cancel any clock gating work scheduled by mmc_host_clk_release() */
+ cancel_delayed_work_sync(&host->clk_gate_work);
mutex_lock(&host->clk_gate_mutex);
spin_lock_irqsave(&host->clk_lock, flags);
if (host->clk_gated) {
@@ -179,7 +181,8 @@ void mmc_host_clk_release(struct mmc_host *host)
host->clk_requests--;
if (mmc_host_may_gate_card(host->card) &&
!host->clk_requests)
- queue_work(system_nrt_wq, &host->clk_gate_work);
+ queue_delayed_work(system_nrt_wq, &host->clk_gate_work,
+ msecs_to_jiffies(MMC_CLK_GATE_DELAY));
spin_unlock_irqrestore(&host->clk_lock, flags);
}
@@ -213,7 +216,7 @@ static inline void mmc_host_clk_init(struct mmc_host *host)
/* Hold MCI clock for 8 cycles by default */
host->clk_delay = 8;
host->clk_gated = false;
- INIT_WORK(&host->clk_gate_work, mmc_host_clk_gate_work);
+ INIT_DELAYED_WORK(&host->clk_gate_work, mmc_host_clk_gate_work);
spin_lock_init(&host->clk_lock);
mutex_init(&host->clk_gate_mutex);
}
@@ -228,7 +231,7 @@ static inline void mmc_host_clk_exit(struct mmc_host *host)
* Wait for any outstanding gate and then make sure we're
* ungated before exiting.
*/
- if (cancel_work_sync(&host->clk_gate_work))
+ if (cancel_delayed_work_sync(&host->clk_gate_work))
mmc_host_clk_gate_delayed(host);
if (host->clk_gated)
mmc_host_clk_hold(host);
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 16e9338..8d6ba4c 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -252,10 +252,11 @@ struct mmc_host {
int clk_requests; /* internal reference counter */
unsigned int clk_delay; /* number of MCI clk hold cycles */
bool clk_gated; /* clock gated */
- struct work_struct clk_gate_work; /* delayed clock gate */
+ struct delayed_work clk_gate_work; /* delayed clock gate */
unsigned int clk_old; /* old clock value cache */
spinlock_t clk_lock; /* lock for clk fields */
struct mutex clk_gate_mutex; /* mutex for clock gating */
+#define MMC_CLK_GATE_DELAY 50 /* in milliseconds*/
#endif
/* host specific block data */
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 1/1] mmc: core: Use delayed work in clock gating framework
@ 2011-10-19 14:48 Sujit Reddy Thumma
0 siblings, 0 replies; 4+ messages in thread
From: Sujit Reddy Thumma @ 2011-10-19 14:48 UTC (permalink / raw)
To: linux-mmc; +Cc: Sujit Reddy Thumma, linux-arm-msm, cjb
Current clock gating framework disables the MCI clock as soon as the
request is completed and enables it when a request arrives. This aggressive
clock gating framework when enabled cause following issues:
When there are back-to-back requests from the Queue layer, we unnecessarily
end up disabling and enabling the clocks between these requests since 8 MCLK
clock cycles is a very short duration compared to the time delay between
back to back requests reaching the MMC layer. This overhead can effect the
overall performance depending on how long the clock enable and disable
calls take which is platform dependent. For example on some platforms we
can have clock control not on the local processor, but on a different
subsystem and the time taken to perform the clock enable/disable can add
significant overhead.
Also if the host controller driver decides to disable the host clock too
when mmc_set_ios function is called with ios.clock=0, it adds additional
delay and it is highly possible that the next request had already arrived
and unnecessarily blocked in enabling the clocks. This is seen frequently
when the processor is executing at high speeds and in multi-core platforms
thus reduces the overall throughput compared to if clock gating is disabled.
Fix this by delaying turning off the clocks by posting request on
delayed workqueue. Also cancel the unscheduled pending work, if any,
when there is access to card.
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
---
drivers/mmc/core/host.c | 11 +++++++----
include/linux/mmc/host.h | 3 ++-
2 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index ca2e4f5..b7b51da 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -113,7 +113,7 @@ static void mmc_host_clk_gate_delayed(struct mmc_host *host)
static void mmc_host_clk_gate_work(struct work_struct *work)
{
struct mmc_host *host = container_of(work, struct mmc_host,
- clk_gate_work);
+ clk_gate_work.work);
mmc_host_clk_gate_delayed(host);
}
@@ -130,6 +130,8 @@ void mmc_host_clk_hold(struct mmc_host *host)
{
unsigned long flags;
+ /* cancel any clock gating work scheduled by mmc_host_clk_release() */
+ cancel_delayed_work_sync(&host->clk_gate_work);
mutex_lock(&host->clk_gate_mutex);
spin_lock_irqsave(&host->clk_lock, flags);
if (host->clk_gated) {
@@ -179,7 +181,8 @@ void mmc_host_clk_release(struct mmc_host *host)
host->clk_requests--;
if (mmc_host_may_gate_card(host->card) &&
!host->clk_requests)
- queue_work(system_nrt_wq, &host->clk_gate_work);
+ queue_delayed_work(system_nrt_wq, &host->clk_gate_work,
+ msecs_to_jiffies(MMC_CLK_GATE_DELAY));
spin_unlock_irqrestore(&host->clk_lock, flags);
}
@@ -213,7 +216,7 @@ static inline void mmc_host_clk_init(struct mmc_host *host)
/* Hold MCI clock for 8 cycles by default */
host->clk_delay = 8;
host->clk_gated = false;
- INIT_WORK(&host->clk_gate_work, mmc_host_clk_gate_work);
+ INIT_DELAYED_WORK(&host->clk_gate_work, mmc_host_clk_gate_work);
spin_lock_init(&host->clk_lock);
mutex_init(&host->clk_gate_mutex);
}
@@ -228,7 +231,7 @@ static inline void mmc_host_clk_exit(struct mmc_host *host)
* Wait for any outstanding gate and then make sure we're
* ungated before exiting.
*/
- if (cancel_work_sync(&host->clk_gate_work))
+ if (cancel_delayed_work_sync(&host->clk_gate_work))
mmc_host_clk_gate_delayed(host);
if (host->clk_gated)
mmc_host_clk_hold(host);
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 16e9338..8d6ba4c 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -252,10 +252,11 @@ struct mmc_host {
int clk_requests; /* internal reference counter */
unsigned int clk_delay; /* number of MCI clk hold cycles */
bool clk_gated; /* clock gated */
- struct work_struct clk_gate_work; /* delayed clock gate */
+ struct delayed_work clk_gate_work; /* delayed clock gate */
unsigned int clk_old; /* old clock value cache */
spinlock_t clk_lock; /* lock for clk fields */
struct mutex clk_gate_mutex; /* mutex for clock gating */
+#define MMC_CLK_GATE_DELAY 50 /* in milliseconds*/
#endif
/* host specific block data */
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] mmc: core: Use delayed work in clock gating framework
2011-10-19 14:48 [PATCH 1/1] mmc: core: Use delayed work in clock gating framework Sujit Reddy Thumma
@ 2011-10-24 7:23 ` Linus Walleij
2011-10-24 11:16 ` Sujit Reddy Thumma
0 siblings, 1 reply; 4+ messages in thread
From: Linus Walleij @ 2011-10-24 7:23 UTC (permalink / raw)
To: Sujit Reddy Thumma; +Cc: linux-mmc, linux-arm-msm, cjb
On Wed, Oct 19, 2011 at 4:48 PM, Sujit Reddy Thumma
<sthumma@codeaurora.org> wrote:
> Current clock gating framework disables the MCI clock as soon as the
> request is completed and enables it when a request arrives. This aggressive
> clock gating framework when enabled cause following issues:
>
> When there are back-to-back requests from the Queue layer, we unnecessarily
> end up disabling and enabling the clocks between these requests since 8 MCLK
> clock cycles is a very short duration compared to the time delay between
> back to back requests reaching the MMC layer. This overhead can effect the
> overall performance depending on how long the clock enable and disable
> calls take which is platform dependent. For example on some platforms we
> can have clock control not on the local processor, but on a different
> subsystem and the time taken to perform the clock enable/disable can add
> significant overhead.
OK I see the problem. At one point it was suggested to use the delayed
disable facilities in runtime PM for avoiding this, but it never materialized.
> Fix this by delaying turning off the clocks by posting request on
> delayed workqueue. Also cancel the unscheduled pending work, if any,
> when there is access to card.
(...)
> @@ -252,10 +252,11 @@ struct mmc_host {
> int clk_requests; /* internal reference counter */
> unsigned int clk_delay; /* number of MCI clk hold cycles */
> bool clk_gated; /* clock gated */
> - struct work_struct clk_gate_work; /* delayed clock gate */
> + struct delayed_work clk_gate_work; /* delayed clock gate */
> unsigned int clk_old; /* old clock value cache */
> spinlock_t clk_lock; /* lock for clk fields */
> struct mutex clk_gate_mutex; /* mutex for clock gating */
> +#define MMC_CLK_GATE_DELAY 50 /* in milliseconds*/
> #endif
What's the rationale of having this in the middle of the struct
in the .h-file?
Move it inside the #ifdef in host.c instead, and also provide
some rationale about the choice of 50 ms.
Maybe we should even provide a sysfs or debugfs entry for
tweaking that value, as has been suggested in the past.
However that's no big deal for me...
Do you have a patch to your msm_sdcc.c that enables the
gating to take effect as well? Does it solve the problems
pointed out by Russell when I tried the same type of patch
to mmci.c?
http://marc.info/?l=linux-mmc&m=129146545916794&w=2
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] mmc: core: Use delayed work in clock gating framework
2011-10-24 7:23 ` Linus Walleij
@ 2011-10-24 11:16 ` Sujit Reddy Thumma
0 siblings, 0 replies; 4+ messages in thread
From: Sujit Reddy Thumma @ 2011-10-24 11:16 UTC (permalink / raw)
To: Linus Walleij; +Cc: linux-mmc, linux-arm-msm, cjb
On 10/24/2011 12:53 PM, Linus Walleij wrote:
> On Wed, Oct 19, 2011 at 4:48 PM, Sujit Reddy Thumma
> <sthumma@codeaurora.org> wrote:
>
>> Current clock gating framework disables the MCI clock as soon as the
>> request is completed and enables it when a request arrives. This aggressive
>> clock gating framework when enabled cause following issues:
>>
>> When there are back-to-back requests from the Queue layer, we unnecessarily
>> end up disabling and enabling the clocks between these requests since 8 MCLK
>> clock cycles is a very short duration compared to the time delay between
>> back to back requests reaching the MMC layer. This overhead can effect the
>> overall performance depending on how long the clock enable and disable
>> calls take which is platform dependent. For example on some platforms we
>> can have clock control not on the local processor, but on a different
>> subsystem and the time taken to perform the clock enable/disable can add
>> significant overhead.
>
> OK I see the problem. At one point it was suggested to use the delayed
> disable facilities in runtime PM for avoiding this, but it never materialized.
>
>> Fix this by delaying turning off the clocks by posting request on
>> delayed workqueue. Also cancel the unscheduled pending work, if any,
>> when there is access to card.
>
> (...)
>> @@ -252,10 +252,11 @@ struct mmc_host {
>> int clk_requests; /* internal reference counter */
>> unsigned int clk_delay; /* number of MCI clk hold cycles */
>> bool clk_gated; /* clock gated */
>> - struct work_struct clk_gate_work; /* delayed clock gate */
>> + struct delayed_work clk_gate_work; /* delayed clock gate */
>> unsigned int clk_old; /* old clock value cache */
>> spinlock_t clk_lock; /* lock for clk fields */
>> struct mutex clk_gate_mutex; /* mutex for clock gating */
>> +#define MMC_CLK_GATE_DELAY 50 /* in milliseconds*/
>> #endif
>
> What's the rationale of having this in the middle of the struct
> in the .h-file?
>
> Move it inside the #ifdef in host.c instead, and also provide
> some rationale about the choice of 50 ms.
With my testing on a MSM platform, round-trip delay of 10ms to turn off
and on the clocks is seen. So I thought 50ms is a reasonable tradeoff.
There is no specific calculation behind this delay. We just relied on
the empirical data which again may vary with different platforms.
If you have any suggestions I would be happy to consider that. I am
planning to increase this to 200ms, any comments on that?
>
> Maybe we should even provide a sysfs or debugfs entry for
> tweaking that value, as has been suggested in the past.
> However that's no big deal for me...
Adding an entry in sysfs seems to be a good solution to tune the
required delay. I will add this.
>
> Do you have a patch to your msm_sdcc.c that enables the
> gating to take effect as well? Does it solve the problems
> pointed out by Russell when I tried the same type of patch
> to mmci.c?
> http://marc.info/?l=linux-mmc&m=129146545916794&w=2
Looking at the patch that you pointed out, I don't think it is required
for msm_sdcc.c to enable clock gating. The current code inherently takes
care of logic that is needed to ensure that MCLK is not turned off until
any pending register write in the host driver is completed. You might
want to have a look at set_ios function in
https://www.codeaurora.org/gitweb/quic/la/?p=kernel/msm.git;a=blob;f=drivers/mmc/host/msm_sdcc.c;h=b5a08d2b58e0cd6d18a8f1041139b6866c0cdc09;hb=refs/heads/msm-3.0
Please let me know if I am still missing anything.
--
Thanks & Regards,
Sujit Reddy Thumma
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2011-10-24 11:16 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-19 14:48 [PATCH 1/1] mmc: core: Use delayed work in clock gating framework Sujit Reddy Thumma
2011-10-24 7:23 ` Linus Walleij
2011-10-24 11:16 ` Sujit Reddy Thumma
-- strict thread matches above, loose matches on Subject: below --
2011-10-19 14:48 Sujit Reddy Thumma
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).