Linux Power Management development
From: Nitin Rawat <quic_nitirawa@quicinc.com>
To: Peter Wang <peter.wang@mediatek.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v1] PM-runtime: Check supplier_preactivated before release supplier
Date: Wed, 12 Oct 2022 16:01:01 +0530
Message-ID: <d7111bd2-d431-e5e1-1a36-6d0d4d4ec19b@quicinc.com>
In-Reply-To: <80a67ef6-ea29-5b96-9596-6fbbb34c4961@mediatek.com>

Hi Peter/Rafael,
We have also observed a similar issue on our platform. It looks like there
is a race condition (explained below) which causes the consumer to resume
without bumping up the supplier's PM-runtime usage counter.

Process 1 (ufshcd_async_scan context)
ufshcd_async_scan()
     scsi_probe_and_add_lun
         scsi_add_lun
             slave_configure    -> enable rpm
                 scsi_sysfs_add_sdev
                     scsi_autopm_get_device
                         device_add     <- invoked sd_probe in process 2
                             scsi_autopm_put_device

Process 2 (sd_probe context)
driver_probe_device
__device_attach_async_helper
     __device_attach_driver
         driver_probe_device
             __driver_probe_device
                 sd_probe
                     scsi_autopm_get_device



A race on dev->power.runtime_status for consumer dev 0:0:0:0 can happen
in the rpm framework as shown below:

ufshcd_async_scan context (process 1)
scsi_autopm_put_device() //0:0:0:0
	pm_runtime_put_sync()
	__pm_runtime_idle()
	rpm_idle()
	__rpm_callback()
		scsi_runtime_idle()
			pm_runtime_mark_last_busy()
			pm_runtime_autosuspend()
				__pm_runtime_suspend(RPM_AUTO)
				rpm_suspend(RPM_AUTO)
					status = RPM_SUSPENDING
					scsi_runtime_suspend()
						__rpm_callback()
					status = RPM_SUSPENDED------>1
					rpm_suspend_suppliers()
			return -EBUSY

		(use_links) && (dev->power.runtime_status == RPM_RESUMING && retval)------->3
		__rpm_put_suppliers()





sd_probe context (Process 2)
scsi_autopm_get_device() //0:0:0:0
     __pm_runtime_resume(RPM_GET_PUT)
     rpm_resume
      	status = RPM_RESUMING----->2



In process 1, power.runtime_status of consumer 0:0:0:0 was changed to
RPM_SUSPENDED (step 1). Before scsi_runtime_idle() returned -16 (-EBUSY)
to __rpm_callback(), process 2 changed power.runtime_status of the same
consumer to RPM_RESUMING (step 2). Condition 3 therefore became true in
process 1 and __rpm_put_suppliers() was called, so the consumer resumed
with a decremented usage_count due to this race condition.
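
To make the ordering easier to follow, here is a minimal user-space sketch
that replays steps 1, 2 and 3 on a toy state machine. It is not kernel code:
every toy_* name is hypothetical, the single supplier_refs counter is a
simplification, and the assumptions that a suspend drops one supplier
reference and that a resume takes one are mine, not something shown in the
traces above.

```c
#include <stdbool.h>

/* Toy model only. Every toy_* name is hypothetical; this is not kernel code. */
enum toy_status {
	TOY_ACTIVE,
	TOY_RESUMING,
	TOY_SUSPENDING,
	TOY_SUSPENDED,
};

struct toy_consumer {
	enum toy_status status;	/* stands in for dev->power.runtime_status */
	int supplier_refs;	/* references this consumer holds on its supplier */
};

/* Exit check from the trace (condition 3), reduced to the RPM_RESUMING half. */
static void toy_callback_exit(struct toy_consumer *c, bool use_links, int retval)
{
	if (use_links && c->status == TOY_RESUMING && retval)
		c->supplier_refs--;	/* models __rpm_put_suppliers() */
}

/* Process 2: start of resume. Assumption: resume takes one supplier reference. */
static void toy_resume_begin(struct toy_consumer *c)
{
	c->status = TOY_RESUMING;	/* step 2 */
	c->supplier_refs++;
}

/*
 * Replay the idle path of process 1. If racy is true, process 2 begins its
 * resume before process 1 reaches the exit check; otherwise it begins after.
 * Returns the supplier references held once the consumer is active again.
 */
static int toy_replay(bool racy)
{
	struct toy_consumer c = { .status = TOY_ACTIVE, .supplier_refs = 1 };
	const int retval = -16;	/* -EBUSY returned by the idle callback */

	/* Process 1: nested autosuspend inside the idle callback. */
	c.status = TOY_SUSPENDING;
	c.status = TOY_SUSPENDED;	/* step 1 */
	c.supplier_refs--;		/* assumption: suspend drops the reference */

	if (racy)
		toy_resume_begin(&c);

	toy_callback_exit(&c, true, retval);	/* step 3 */

	if (!racy)
		toy_resume_begin(&c);

	c.status = TOY_ACTIVE;		/* process 2 finishes resuming */
	return c.supplier_refs;
}
```

In this model toy_replay(false) leaves the active consumer holding one
supplier reference, while toy_replay(true) leaves it holding none, which
matches the symptom of the supplier suspending under an active consumer.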

Please let me know your thoughts on this.

Regards,
Nitin

On 8/2/2022 7:03 PM, Peter Wang wrote:
> 
> On 8/2/22 7:01 PM, Rafael J. Wysocki wrote:
>> On Tue, Aug 2, 2022 at 5:19 AM Peter Wang <peter.wang@mediatek.com> wrote:
>>>
>>>> Hi Rafael,
>>>>
>>>> Yes, it is very clear!
>>>> I miss this important key point that usage_count is always >
>>>> rpm_active 1.
>>>> I think this patch could work.
>>>>
>>>> Thanks.
>>>> Peter
>>>>
>>>>
>>>>
>>>>
>>> Hi Rafael,
>>>
>>> After test with commit ("887371066039011144b4a94af97d9328df6869a2 PM:
>>> runtime: Fix supplier device management during consumer probe") past weeks,
>>> The supplier still suspend when consumer is active "after"
>>> pm_runtime_put_suppliers.
>>> Do you have any idea about that?
>> Well, this means that the consumer probe doesn't bump up the
>> supplier's PM-runtime usage counter as appropriate.
>>
>> You need to tell me more about what happens during the consumer probe.
>> Which driver is this?
> 
> Hi Rafael,
> 
> I have the same idea with you. But I still don't know how it could happen.
> 
> It is upstream ufs driver in scsi system. Here is call flow
> do_scan_async (process 1)
>      do_scsi_scan_host
>          scsi_scan_host_selected
>              scsi_scan_channel
>                  __scsi_scan_target
>                      scsi_probe_and_add_lun
>                          scsi_alloc_sdev
>                              slave_alloc     -> setup link
>                          scsi_add_lun
>                              slave_configure    -> enable rpm
>                              scsi_sysfs_add_sdev
>                                  scsi_autopm_get_device    <- get runtime pm
>                                  device_add                <- invoke sd_probe in process 2
>                                  scsi_autopm_put_device    <- put runtime pm, point 1
> 
> driver_probe_device (process 2)
>      __driver_probe_device
>          pm_runtime_get_suppliers
>              really_probe
>                  sd_probe
>                      scsi_autopm_get_device                <- get runtime pm, point 2
>                      pm_runtime_set_autosuspend_delay    <- set rpm delay to 2s
>                      scsi_autopm_put_device                <- put runtime pm
>          pm_runtime_put_suppliers                        <- (link->rpm_active = 1)
> 
> After process 1 call scsi_autopm_put_device(point 1) let consumer enter suspend,
> process 2 call scsi_autopm_get_device(point 2) may have chance resume consumer but not
> bump up the supplier's PM-runtime usage counter as appropriate.
> 
> Thanks.
> Peter
> 
> 
> 
> 
> 
> 
> 

Thread overview: 19+ messages
2022-06-13 12:07 [PATCH v1] PM-runtime: Check supplier_preactivated before release supplier peter.wang
2022-06-22  6:09 ` Peter Wang
2022-06-22  6:48   ` Greg KH
2022-06-27 14:14 ` Greg KH
2022-06-27 14:27   ` Rafael J. Wysocki
2022-06-28  1:49   ` Peter Wang
2022-06-27 19:00 ` Rafael J. Wysocki
2022-06-28  1:53   ` Peter Wang
2022-06-28 15:54     ` Rafael J. Wysocki
     [not found] ` <b55d5691-0b2d-56bb-26ff-dcac56770611@mediatek.com>
     [not found]   ` <CAJZ5v0gTpv2gt_Gm9rUd+8Jmp4=ij2=J20o7qO0sC-hm=w3=_A@mail.gmail.com>
2022-06-29 16:01     ` Rafael J. Wysocki
2022-06-30 14:26       ` Peter Wang
2022-06-30 14:47         ` Rafael J. Wysocki
2022-06-30 15:19           ` Peter Wang
2022-06-30 16:28             ` Rafael J. Wysocki
2022-07-01 10:21               ` Peter Wang
2022-08-02  3:19                 ` Peter Wang
2022-08-02 11:01                   ` Rafael J. Wysocki
2022-08-02 13:33                     ` Peter Wang
2022-10-12 10:31                       ` Nitin Rawat [this message]
