public inbox for linux-mmc@vger.kernel.org
 help / color / mirror / Atom feed
* BUG/PATCH?  mmcblk devices don't suspend properly.
@ 2012-03-30  2:47 NeilBrown
  2012-03-30  5:25 ` Sujit Reddy Thumma
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2012-03-30  2:47 UTC (permalink / raw)
  To: linux-mmc, linux-pm

[-- Attachment #1: Type: text/plain, Size: 6021 bytes --]


I've been experimenting with removing the call to sys_sync() in
enter_state().  For me (with verbose debugging and syslog running)
this causes a noticeable delay when entering suspend.

Removing it should not affect correctness as there is no locking the ensure
that no other process writes out data immediately after the 'sync'
completed.  But it could make existing bugs more obvious - in fact it does :-)

The device I am doing this on is using a micro-SD card for storage.
The card cannot be removed without removing the battery, so
MMC_CAP_NONREMOVABLE is set for the mmc slot (which seems necessary for
safely having '/' there).

Since removing the sys_sync() call I've notices a number of problems with
suspend/resume, the most obvious being that resume blocks in mmc_claim_host:

[  263.585754] susman          D c046455c  5604  1086   1084 0x00000000
[  263.592498] [<c046455c>] (__schedule+0x584/0x614) from [<c0302630>] (__mmc_claim_host+0xb8/0x154)
[  263.601806] [<c0302630>] (__mmc_claim_host+0xb8/0x154) from [<c0308640>] (mmc_sd_resume+0x34/0x5c)
[  263.611206] [<c0308640>] (mmc_sd_resume+0x34/0x5c) from [<c0301ce4>] (mmc_resume_host+0xc8/0x15c)
[  263.620513] [<c0301ce4>] (mmc_resume_host+0xc8/0x15c) from [<c0317bdc>] (omap_hsmmc_resume+0xbc/0x104)
[  263.630279] [<c0317bdc>] (omap_hsmmc_resume+0xbc/0x104) from [<c025b41c>] (platform_pm_resume+0x44/0x54)
[  263.640197] [<c025b41c>] (platform_pm_resume+0x44/0x54) from [<c025f88c>] (dpm_run_callback+0x48/0x8c)
[  263.649963] [<c025f88c>] (dpm_run_callback+0x48/0x8c) from [<c0260348>] (device_resume+0x204/0x268)
[  263.659454] [<c0260348>] (device_resume+0x204/0x268) from [<c02604b8>] (dpm_resume+0x10c/0x244)
[  263.668579] [<c02604b8>] (dpm_resume+0x10c/0x244) from [<c02605fc>] (dpm_resume_end+0xc/0x18)
[  263.677490] [<c02605fc>] (dpm_resume_end+0xc/0x18) from [<c0061784>] (suspend_devices_and_enter+0x1d0/0x22c)
[  263.687805] [<c0061784>] (suspend_devices_and_enter+0x1d0/0x22c) from [<c00618f8>] (enter_state+0x118/0x170)
[  263.698089] [<c00618f8>] (enter_state+0x118/0x170) from [<c00605b4>] (state_store+0x94/0x118)
[  263.707061] [<c00605b4>] (state_store+0x94/0x118) from [<c01df898>] (kobj_attr_store+0x1c/0x24)
[  263.716186] [<c01df898>] (kobj_attr_store+0x1c/0x24) from [<c011dcf4>] (sysfs_write_file+0x108/0x13c)
[  263.725860] [<c011dcf4>] (sysfs_write_file+0x108/0x13c) from [<c00c5748>] (vfs_write+0xac/0x180)
[  263.735076] [<c00c5748>] (vfs_write+0xac/0x180) from [<c00c58d4>] (sys_write+0x40/0x6c)
[  263.743469] [<c00c58d4>] (sys_write+0x40/0x6c) from [<c000ea00>] (ret_fast_syscall+0x0/0x3c)


while mmcdq/0 is owning the device and waiting for something that will
apparently never happen:

[  262.566680] mmcqd/0         D c046455c  5828    53      2 0x00000000
[  262.573425] [<c046455c>] (__schedule+0x584/0x614) from [<c04620f0>] (schedule_timeout+0x1c/0x1d0)
[  262.582733] [<c04620f0>] (schedule_timeout+0x1c/0x1d0) from [<c0463eb4>] (wait_for_common+0xd8/0x150)
[  262.592407] [<c0463eb4>] (wait_for_common+0xd8/0x150) from [<c0303488>] (mmc_wait_for_req_done+0x24/0xbc)
[  262.602447] [<c0303488>] (mmc_wait_for_req_done+0x24/0xbc) from [<c0303f20>] (mmc_start_req+0x50/0x11c)
[  262.612304] [<c0303f20>] (mmc_start_req+0x50/0x11c) from [<c030df2c>] (mmc_blk_issue_rw_rq+0x78/0x500)
[  262.622070] [<c030df2c>] (mmc_blk_issue_rw_rq+0x78/0x500) from [<c030e7a4>] (mmc_blk_issue_rq+0x3f0/0x420)
[  262.632171] [<c030e7a4>] (mmc_blk_issue_rq+0x3f0/0x420) from [<c030f884>] (mmc_queue_thread+0x98/0x100)
[  262.642028] [<c030f884>] (mmc_queue_thread+0x98/0x100) from [<c004fcc8>] (kthread+0x84/0x90)
[  262.650878] [<c004fcc8>] (kthread+0x84/0x90) from [<c000f398>] (kernel_thread_exit+0x0/0x8)

I traced this to the fact that mmc_bus_suspend / mmc_bus_resume are *never*
being called.  So mmc_blk_suspend isn't called and the queue isn't suspended.

The problem appears to be that mmc_bus_suspend is defined as a 'legacy'
suspend function. i.e. it is assigned to mmc_bus_type.suspend rather than
mmc_bus_type.pm.suspend.
In __device_suspend() I see that the legacy suspend function is only called if
the bus has *no* dev_pm_ops at all.
However when CONFIG_PM_RUNTIME is defined, mmc_bustype does have a dev_pm_ops
which contains runtime_{suspend,resume,idle},but no suspend or resume.

The net effect is that - as I observed - mmc_bus_suspend is never called.

I added lines:

	.suspend		= mmc_bus_suspend,
	.resume			= mmc_bus_resume,

to mmc_bus_pm_ops and can no longer reproduce the problem.  So maybe this
patch is appropriate:

diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index 5d011a3..80c1e46 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -169,6 +169,8 @@ static const struct dev_pm_ops mmc_bus_pm_ops = {
 	.runtime_suspend	= mmc_runtime_suspend,
 	.runtime_resume		= mmc_runtime_resume,
 	.runtime_idle		= mmc_runtime_idle,
+	.suspend		= mmc_bus_suspend,
+	.resume			= mmc_bus_resume,
 };
 
 #define MMC_PM_OPS_PTR	(&mmc_bus_pm_ops)

however I suspect we should remove the 'legacy' pointers at the same time.(?).

Also, while exploring this problem I could not find anything that would cause
mmc_bus_suspend() to wait for an async request to complete.  Maybe  it is
there somewhere that I don't understand yet, and I cannot be sure that any of
my symptoms could be explained by an async request continuing while the
hardware was powered off, but I wonder if something like this might be needed
too:

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 2517547..cd36c30 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -354,6 +354,8 @@ void mmc_queue_suspend(struct mmc_queue *mq)
 		spin_unlock_irqrestore(q->queue_lock, flags);
 
 		down(&mq->thread_sem);
+		/* wait for current request to complete */
+		mmc_start_req(mq->card->host, NULL, NULL);
 	}
 }
 

Thanks for any help you can provide,

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: BUG/PATCH?  mmcblk devices don't suspend properly.
  2012-03-30  2:47 BUG/PATCH? mmcblk devices don't suspend properly NeilBrown
@ 2012-03-30  5:25 ` Sujit Reddy Thumma
  2012-04-02 20:43   ` Per Forlin
  2012-04-05  8:57   ` Hebbar, Gururaja
  0 siblings, 2 replies; 5+ messages in thread
From: Sujit Reddy Thumma @ 2012-03-30  5:25 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-mmc, linux-pm, per.forlin

On 3/30/2012 8:17 AM, NeilBrown wrote:
>
> I've been experimenting with removing the call to sys_sync() in
> enter_state().  For me (with verbose debugging and syslog running)
> this causes a noticeable delay when entering suspend.
>
> Removing it should not affect correctness as there is no locking the ensure
> that no other process writes out data immediately after the 'sync'
> completed.  But it could make existing bugs more obvious - in fact it does :-)
>
> The device I am doing this on is using a micro-SD card for storage.
> The card cannot be removed without removing the battery, so
> MMC_CAP_NONREMOVABLE is set for the mmc slot (which seems necessary for
> safely having '/' there).
>
> Since removing the sys_sync() call I've notices a number of problems with
> suspend/resume, the most obvious being that resume blocks in mmc_claim_host:
>
> [  263.585754] susman          D c046455c  5604  1086   1084 0x00000000
> [  263.592498] [<c046455c>] (__schedule+0x584/0x614) from [<c0302630>] (__mmc_claim_host+0xb8/0x154)
> [  263.601806] [<c0302630>] (__mmc_claim_host+0xb8/0x154) from [<c0308640>] (mmc_sd_resume+0x34/0x5c)
> [  263.611206] [<c0308640>] (mmc_sd_resume+0x34/0x5c) from [<c0301ce4>] (mmc_resume_host+0xc8/0x15c)
> [  263.620513] [<c0301ce4>] (mmc_resume_host+0xc8/0x15c) from [<c0317bdc>] (omap_hsmmc_resume+0xbc/0x104)
> [  263.630279] [<c0317bdc>] (omap_hsmmc_resume+0xbc/0x104) from [<c025b41c>] (platform_pm_resume+0x44/0x54)
> [  263.640197] [<c025b41c>] (platform_pm_resume+0x44/0x54) from [<c025f88c>] (dpm_run_callback+0x48/0x8c)
> [  263.649963] [<c025f88c>] (dpm_run_callback+0x48/0x8c) from [<c0260348>] (device_resume+0x204/0x268)
> [  263.659454] [<c0260348>] (device_resume+0x204/0x268) from [<c02604b8>] (dpm_resume+0x10c/0x244)
> [  263.668579] [<c02604b8>] (dpm_resume+0x10c/0x244) from [<c02605fc>] (dpm_resume_end+0xc/0x18)
> [  263.677490] [<c02605fc>] (dpm_resume_end+0xc/0x18) from [<c0061784>] (suspend_devices_and_enter+0x1d0/0x22c)
> [  263.687805] [<c0061784>] (suspend_devices_and_enter+0x1d0/0x22c) from [<c00618f8>] (enter_state+0x118/0x170)
> [  263.698089] [<c00618f8>] (enter_state+0x118/0x170) from [<c00605b4>] (state_store+0x94/0x118)
> [  263.707061] [<c00605b4>] (state_store+0x94/0x118) from [<c01df898>] (kobj_attr_store+0x1c/0x24)
> [  263.716186] [<c01df898>] (kobj_attr_store+0x1c/0x24) from [<c011dcf4>] (sysfs_write_file+0x108/0x13c)
> [  263.725860] [<c011dcf4>] (sysfs_write_file+0x108/0x13c) from [<c00c5748>] (vfs_write+0xac/0x180)
> [  263.735076] [<c00c5748>] (vfs_write+0xac/0x180) from [<c00c58d4>] (sys_write+0x40/0x6c)
> [  263.743469] [<c00c58d4>] (sys_write+0x40/0x6c) from [<c000ea00>] (ret_fast_syscall+0x0/0x3c)
>
>
> while mmcdq/0 is owning the device and waiting for something that will
> apparently never happen:
>
> [  262.566680] mmcqd/0         D c046455c  5828    53      2 0x00000000
> [  262.573425] [<c046455c>] (__schedule+0x584/0x614) from [<c04620f0>] (schedule_timeout+0x1c/0x1d0)
> [  262.582733] [<c04620f0>] (schedule_timeout+0x1c/0x1d0) from [<c0463eb4>] (wait_for_common+0xd8/0x150)
> [  262.592407] [<c0463eb4>] (wait_for_common+0xd8/0x150) from [<c0303488>] (mmc_wait_for_req_done+0x24/0xbc)
> [  262.602447] [<c0303488>] (mmc_wait_for_req_done+0x24/0xbc) from [<c0303f20>] (mmc_start_req+0x50/0x11c)
> [  262.612304] [<c0303f20>] (mmc_start_req+0x50/0x11c) from [<c030df2c>] (mmc_blk_issue_rw_rq+0x78/0x500)
> [  262.622070] [<c030df2c>] (mmc_blk_issue_rw_rq+0x78/0x500) from [<c030e7a4>] (mmc_blk_issue_rq+0x3f0/0x420)
> [  262.632171] [<c030e7a4>] (mmc_blk_issue_rq+0x3f0/0x420) from [<c030f884>] (mmc_queue_thread+0x98/0x100)
> [  262.642028] [<c030f884>] (mmc_queue_thread+0x98/0x100) from [<c004fcc8>] (kthread+0x84/0x90)
> [  262.650878] [<c004fcc8>] (kthread+0x84/0x90) from [<c000f398>] (kernel_thread_exit+0x0/0x8)
>
> I traced this to the fact that mmc_bus_suspend / mmc_bus_resume are *never*
> being called.  So mmc_blk_suspend isn't called and the queue isn't suspended.
>
> The problem appears to be that mmc_bus_suspend is defined as a 'legacy'
> suspend function. i.e. it is assigned to mmc_bus_type.suspend rather than
> mmc_bus_type.pm.suspend.
> In __device_suspend() I see that the legacy suspend function is only called if
> the bus has *no* dev_pm_ops at all.
> However when CONFIG_PM_RUNTIME is defined, mmc_bustype does have a dev_pm_ops
> which contains runtime_{suspend,resume,idle},but no suspend or resume.
>
> The net effect is that - as I observed - mmc_bus_suspend is never called.
>
> I added lines:
>
> 	.suspend		= mmc_bus_suspend,
> 	.resume			= mmc_bus_resume,
>
> to mmc_bus_pm_ops and can no longer reproduce the problem.  So maybe this
> patch is appropriate:
>
> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
> index 5d011a3..80c1e46 100644
> --- a/drivers/mmc/core/bus.c
> +++ b/drivers/mmc/core/bus.c
> @@ -169,6 +169,8 @@ static const struct dev_pm_ops mmc_bus_pm_ops = {
>   	.runtime_suspend	= mmc_runtime_suspend,
>   	.runtime_resume		= mmc_runtime_resume,
>   	.runtime_idle		= mmc_runtime_idle,
> +	.suspend		= mmc_bus_suspend,
> +	.resume			= mmc_bus_resume,
>   };
>
>   #define MMC_PM_OPS_PTR	(&mmc_bus_pm_ops)
>
> however I suspect we should remove the 'legacy' pointers at the same time.(?).
This was pointed out earlier and a patch is posted but looks like it 
never went into mmc tree -- 
http://comments.gmane.org/gmane.linux.kernel.mmc/9168

>
> Also, while exploring this problem I could not find anything that would cause
> mmc_bus_suspend() to wait for an async request to complete.  Maybe  it is
> there somewhere that I don't understand yet, and I cannot be sure that any of
> my symptoms could be explained by an async request continuing while the
> hardware was powered off, but I wonder if something like this might be needed
> too:
>
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index 2517547..cd36c30 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -354,6 +354,8 @@ void mmc_queue_suspend(struct mmc_queue *mq)
>   		spin_unlock_irqrestore(q->queue_lock, flags);
>
>   		down(&mq->thread_sem);
> +		/* wait for current request to complete */
> +		mmc_start_req(mq->card->host, NULL, NULL);
>   	}
This looks good to me but I would prefer Per Forlin to ack on it.

>   }
>
>
> Thanks for any help you can provide,
>
> NeilBrown
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG/PATCH? mmcblk devices don't suspend properly.
  2012-03-30  5:25 ` Sujit Reddy Thumma
@ 2012-04-02 20:43   ` Per Forlin
  2012-04-04  8:43     ` Hebbar, Gururaja
  2012-04-05  8:57   ` Hebbar, Gururaja
  1 sibling, 1 reply; 5+ messages in thread
From: Per Forlin @ 2012-04-02 20:43 UTC (permalink / raw)
  To: Sujit Reddy Thumma; +Cc: NeilBrown, linux-mmc, linux-pm

On Fri, Mar 30, 2012 at 7:25 AM, Sujit Reddy Thumma
<sthumma@codeaurora.org> wrote:
> On 3/30/2012 8:17 AM, NeilBrown wrote:
>>
>>
>> I've been experimenting with removing the call to sys_sync() in
>> enter_state().  For me (with verbose debugging and syslog running)
>> this causes a noticeable delay when entering suspend.
>>
>> Removing it should not affect correctness as there is no locking the
>> ensure
>> that no other process writes out data immediately after the 'sync'
>> completed.  But it could make existing bugs more obvious - in fact it does
>> :-)
>>
>> The device I am doing this on is using a micro-SD card for storage.
>> The card cannot be removed without removing the battery, so
>> MMC_CAP_NONREMOVABLE is set for the mmc slot (which seems necessary for
>> safely having '/' there).
>>
>> Since removing the sys_sync() call I've notices a number of problems with
>> suspend/resume, the most obvious being that resume blocks in
>> mmc_claim_host:
>>
>> [  263.585754] susman          D c046455c  5604  1086   1084 0x00000000
>> [  263.592498] [<c046455c>] (__schedule+0x584/0x614) from [<c0302630>]
>> (__mmc_claim_host+0xb8/0x154)
>> [  263.601806] [<c0302630>] (__mmc_claim_host+0xb8/0x154) from
>> [<c0308640>] (mmc_sd_resume+0x34/0x5c)
>> [  263.611206] [<c0308640>] (mmc_sd_resume+0x34/0x5c) from [<c0301ce4>]
>> (mmc_resume_host+0xc8/0x15c)
>> [  263.620513] [<c0301ce4>] (mmc_resume_host+0xc8/0x15c) from [<c0317bdc>]
>> (omap_hsmmc_resume+0xbc/0x104)
>> [  263.630279] [<c0317bdc>] (omap_hsmmc_resume+0xbc/0x104) from
>> [<c025b41c>] (platform_pm_resume+0x44/0x54)
>> [  263.640197] [<c025b41c>] (platform_pm_resume+0x44/0x54) from
>> [<c025f88c>] (dpm_run_callback+0x48/0x8c)
>> [  263.649963] [<c025f88c>] (dpm_run_callback+0x48/0x8c) from [<c0260348>]
>> (device_resume+0x204/0x268)
>> [  263.659454] [<c0260348>] (device_resume+0x204/0x268) from [<c02604b8>]
>> (dpm_resume+0x10c/0x244)
>> [  263.668579] [<c02604b8>] (dpm_resume+0x10c/0x244) from [<c02605fc>]
>> (dpm_resume_end+0xc/0x18)
>> [  263.677490] [<c02605fc>] (dpm_resume_end+0xc/0x18) from [<c0061784>]
>> (suspend_devices_and_enter+0x1d0/0x22c)
>> [  263.687805] [<c0061784>] (suspend_devices_and_enter+0x1d0/0x22c) from
>> [<c00618f8>] (enter_state+0x118/0x170)
>> [  263.698089] [<c00618f8>] (enter_state+0x118/0x170) from [<c00605b4>]
>> (state_store+0x94/0x118)
>> [  263.707061] [<c00605b4>] (state_store+0x94/0x118) from [<c01df898>]
>> (kobj_attr_store+0x1c/0x24)
>> [  263.716186] [<c01df898>] (kobj_attr_store+0x1c/0x24) from [<c011dcf4>]
>> (sysfs_write_file+0x108/0x13c)
>> [  263.725860] [<c011dcf4>] (sysfs_write_file+0x108/0x13c) from
>> [<c00c5748>] (vfs_write+0xac/0x180)
>> [  263.735076] [<c00c5748>] (vfs_write+0xac/0x180) from [<c00c58d4>]
>> (sys_write+0x40/0x6c)
>> [  263.743469] [<c00c58d4>] (sys_write+0x40/0x6c) from [<c000ea00>]
>> (ret_fast_syscall+0x0/0x3c)
>>
>>
>> while mmcdq/0 is owning the device and waiting for something that will
>> apparently never happen:
>>
>> [  262.566680] mmcqd/0         D c046455c  5828    53      2 0x00000000
>> [  262.573425] [<c046455c>] (__schedule+0x584/0x614) from [<c04620f0>]
>> (schedule_timeout+0x1c/0x1d0)
>> [  262.582733] [<c04620f0>] (schedule_timeout+0x1c/0x1d0) from
>> [<c0463eb4>] (wait_for_common+0xd8/0x150)
>> [  262.592407] [<c0463eb4>] (wait_for_common+0xd8/0x150) from [<c0303488>]
>> (mmc_wait_for_req_done+0x24/0xbc)
>> [  262.602447] [<c0303488>] (mmc_wait_for_req_done+0x24/0xbc) from
>> [<c0303f20>] (mmc_start_req+0x50/0x11c)
>> [  262.612304] [<c0303f20>] (mmc_start_req+0x50/0x11c) from [<c030df2c>]
>> (mmc_blk_issue_rw_rq+0x78/0x500)
>> [  262.622070] [<c030df2c>] (mmc_blk_issue_rw_rq+0x78/0x500) from
>> [<c030e7a4>] (mmc_blk_issue_rq+0x3f0/0x420)
>> [  262.632171] [<c030e7a4>] (mmc_blk_issue_rq+0x3f0/0x420) from
>> [<c030f884>] (mmc_queue_thread+0x98/0x100)
>> [  262.642028] [<c030f884>] (mmc_queue_thread+0x98/0x100) from
>> [<c004fcc8>] (kthread+0x84/0x90)
>> [  262.650878] [<c004fcc8>] (kthread+0x84/0x90) from [<c000f398>]
>> (kernel_thread_exit+0x0/0x8)
>>
>> I traced this to the fact that mmc_bus_suspend / mmc_bus_resume are
>> *never*
>> being called.  So mmc_blk_suspend isn't called and the queue isn't
>> suspended.
>>
>> The problem appears to be that mmc_bus_suspend is defined as a 'legacy'
>> suspend function. i.e. it is assigned to mmc_bus_type.suspend rather than
>> mmc_bus_type.pm.suspend.
>> In __device_suspend() I see that the legacy suspend function is only
>> called if
>> the bus has *no* dev_pm_ops at all.
>> However when CONFIG_PM_RUNTIME is defined, mmc_bustype does have a
>> dev_pm_ops
>> which contains runtime_{suspend,resume,idle},but no suspend or resume.
>>
>> The net effect is that - as I observed - mmc_bus_suspend is never called.
>>
>> I added lines:
>>
>>        .suspend                = mmc_bus_suspend,
>>        .resume                 = mmc_bus_resume,
>>
>> to mmc_bus_pm_ops and can no longer reproduce the problem.  So maybe this
>> patch is appropriate:
>>
>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
>> index 5d011a3..80c1e46 100644
>> --- a/drivers/mmc/core/bus.c
>> +++ b/drivers/mmc/core/bus.c
>> @@ -169,6 +169,8 @@ static const struct dev_pm_ops mmc_bus_pm_ops = {
>>        .runtime_suspend        = mmc_runtime_suspend,
>>        .runtime_resume         = mmc_runtime_resume,
>>        .runtime_idle           = mmc_runtime_idle,
>> +       .suspend                = mmc_bus_suspend,
>> +       .resume                 = mmc_bus_resume,
>>  };
>>
>>  #define MMC_PM_OPS_PTR        (&mmc_bus_pm_ops)
>>
>> however I suspect we should remove the 'legacy' pointers at the same
>> time.(?).
>
> This was pointed out earlier and a patch is posted but looks like it never
> went into mmc tree -- http://comments.gmane.org/gmane.linux.kernel.mmc/9168
>
>
>>
>> Also, while exploring this problem I could not find anything that would
>> cause
>> mmc_bus_suspend() to wait for an async request to complete.  Maybe  it is
>> there somewhere that I don't understand yet, and I cannot be sure that any
>> of
>> my symptoms could be explained by an async request continuing while the
>> hardware was powered off, but I wonder if something like this might be
>> needed
>> too:
>>
>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>> index 2517547..cd36c30 100644
>> --- a/drivers/mmc/card/queue.c
>> +++ b/drivers/mmc/card/queue.c
>> @@ -354,6 +354,8 @@ void mmc_queue_suspend(struct mmc_queue *mq)
>>                spin_unlock_irqrestore(q->queue_lock, flags);
>>
>>                down(&mq->thread_sem);
>> +               /* wait for current request to complete */
>> +               mmc_start_req(mq->card->host, NULL, NULL);
This shouldn't be necessary.

>>        }
>
> This looks good to me but I would prefer Per Forlin to ack on it.
My interpretation of the code is that the "thread_sem" blocks until
the mmc_qeueu_thread() has finished all request in the queue.
I refer to this code:
----------------
  		if (req || mq->mqrq_prev->req) {
			set_current_state(TASK_RUNNING);
			mq->issue_fn(mq, req);
		} else {
			if (kthread_should_stop()) {
				set_current_state(TASK_RUNNING);
				break;
			}
			up(&mq->thread_sem);
			schedule();
			down(&mq->thread_sem);
		}
--------------------------
mmc_queue_thread() will always finish off a sequence of requests by
issuing a NULL request to complete the ongoing async request,
If both the new req fetched from the queue and the ongoing (previous)
req are NULL, it will goto sleep.
At this point mmc_queue_thread will increase the sem, and wake up the
mmc_queue_suspend().

Regards,
Per

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: BUG/PATCH? mmcblk devices don't suspend properly.
  2012-04-02 20:43   ` Per Forlin
@ 2012-04-04  8:43     ` Hebbar, Gururaja
  0 siblings, 0 replies; 5+ messages in thread
From: Hebbar, Gururaja @ 2012-04-04  8:43 UTC (permalink / raw)
  To: Per Forlin, Sujit Reddy Thumma
  Cc: NeilBrown, linux-mmc@vger.kernel.org, linux-pm@vger.kernel.org

On Tue, Apr 03, 2012 at 02:13:54, Per Forlin wrote:
> On Fri, Mar 30, 2012 at 7:25 AM, Sujit Reddy Thumma
> <sthumma@codeaurora.org> wrote:
> > On 3/30/2012 8:17 AM, NeilBrown wrote:
> >>
> >>

...snip...
...snip...

> >>
> >> The net effect is that - as I observed - mmc_bus_suspend is never called.
> >>
> >> I added lines:
> >>
> >>        .suspend                = mmc_bus_suspend,
> >>        .resume                 = mmc_bus_resume,
> >>
> >> to mmc_bus_pm_ops and can no longer reproduce the problem.  So maybe this
> >> patch is appropriate:
> >>
> >> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
> >> index 5d011a3..80c1e46 100644
> >> --- a/drivers/mmc/core/bus.c
> >> +++ b/drivers/mmc/core/bus.c
> >> @@ -169,6 +169,8 @@ static const struct dev_pm_ops mmc_bus_pm_ops = {
> >>        .runtime_suspend        = mmc_runtime_suspend,
> >>        .runtime_resume         = mmc_runtime_resume,
> >>        .runtime_idle           = mmc_runtime_idle,
> >> +       .suspend                = mmc_bus_suspend,
> >> +       .resume                 = mmc_bus_resume,
> >>  };
> >>
> >>  #define MMC_PM_OPS_PTR        (&mmc_bus_pm_ops)
> >>
> >> however I suspect we should remove the 'legacy' pointers at the same
> >> time.(?).
> >
> > This was pointed out earlier and a patch is posted but looks like it never
> > went into mmc tree -- http://comments.gmane.org/gmane.linux.kernel.mmc/9168
> >

We were also facing a similar issue related to mmc copy operation & suspend/resume
from console.
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg65425.html

We applied the above patch it started working properly.

Acked by me for this patch

...snip...
...snip...

> mmc_queue_suspend().
> 
> Regards,
> Per
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Regards, 
Gururaja

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: BUG/PATCH?  mmcblk devices don't suspend properly.
  2012-03-30  5:25 ` Sujit Reddy Thumma
  2012-04-02 20:43   ` Per Forlin
@ 2012-04-05  8:57   ` Hebbar, Gururaja
  1 sibling, 0 replies; 5+ messages in thread
From: Hebbar, Gururaja @ 2012-04-05  8:57 UTC (permalink / raw)
  To: Sujit Reddy Thumma, NeilBrown
  Cc: linux-mmc@vger.kernel.org, linux-pm@vger.kernel.org,
	per.forlin@linaro.org

I contacted the original author of the patch (http://comments.gmane.org/gmane.linux.kernel.mmc/9168)
And he has posted the 2nd version of the commit. 

[PATCH V2]mmc: remove MMC bus legacy suspend/resume method

MMC bus is using legacy suspend/resume method, which is not compatible if
runtime pm callbacks are used. In this scenario, MMC bus suspend/resume
callbacks cannot be called when system entering S3. So change to use the new
defined dev_pm_ops for system sleeping mode

Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
---
Changes in v2:
	use SET_SYSTEM_SLEEP_PM_OPS to define sleep callbacks as Rafael
	suggested


May be you can test your setup with this patch.


On Fri, Mar 30, 2012 at 10:55:45, Sujit Reddy Thumma wrote:
> On 3/30/2012 8:17 AM, NeilBrown wrote:
> >
> > I've been experimenting with removing the call to sys_sync() in
> > enter_state().  For me (with verbose debugging and syslog running)
> > this causes a noticeable delay when entering suspend.
> >
[...snip...]
> >
> > Since removing the sys_sync() call I've notices a number of problems with
> > suspend/resume, the most obvious being that resume blocks in mmc_claim_host:
> >
> > [  263.585754] susman          D c046455c  5604  1086   1084 0x00000000
> > [  263.592498] [<c046455c>] (__schedule+0x584/0x614) from [<c0302630>] (__mmc_claim_host+0xb8/0x154)
> > [  263.601806] [<c0302630>] (__mmc_claim_host+0xb8/0x154) from [<c0308640>] (mmc_sd_resume+0x34/0x5c)
> > [  263.611206] [<c0308640>] (mmc_sd_resume+0x34/0x5c) from [<c0301ce4>] (mmc_resume_host+0xc8/0x15c)

[...snip...]

> This was pointed out earlier and a patch is posted but looks like it 
> never went into mmc tree -- 
> http://comments.gmane.org/gmane.linux.kernel.mmc/9168
> 
> >
> >
> > Thanks for any help you can provide,
> >
> > NeilBrown
> >
> 
> --

Regards, 
Gururaja

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-04-05  8:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-30  2:47 BUG/PATCH? mmcblk devices don't suspend properly NeilBrown
2012-03-30  5:25 ` Sujit Reddy Thumma
2012-04-02 20:43   ` Per Forlin
2012-04-04  8:43     ` Hebbar, Gururaja
2012-04-05  8:57   ` Hebbar, Gururaja

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox