* Re: [PATCH v1 1/1] scsi: Synchronize request queue PM status only on successful resume
[not found] ` <1546410308-13486-3-git-send-email-stanley.chu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
@ 2019-01-02 16:15 ` Bart Van Assche
[not found] ` <1546445745.163063.4.camel-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 3+ messages in thread
From: Bart Van Assche @ 2019-01-02 16:15 UTC (permalink / raw)
To: stanley.chu-NuS5LvNUpcJWk0Htik3J/w,
linux-scsi-u79uwXL29TY76Z2rM5mHXA
Cc: srv_wsdupstream-NuS5LvNUpcJWk0Htik3J/w,
matthias.bgg-Re5JQEeQqe8AvxtiuMwx3w,
linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On Wed, 2019-01-02 at 14:25 +0800, stanley.chu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org wrote:
> From: Stanley Chu <stanley.chu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
>
> The commit 356fd2663cff ("scsi: Set request queue runtime PM status
> back to active on resume") fixed up the inconsistent RPM status between
> request queue and device. However changing request queue RPM status
> shall be done only on successful resume, otherwise status may be still
> inconsistent as below,
>
> Request queue: RPM_ACTIVE
> Device: RPM_SUSPENDED
>
> This ends up soft lockup because requests can be submitted to
> underlying devices but those devices and their required resource
> are not resumed.
>
> Signed-off-by: Stanley Chu <stanley.chu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
Please add "Fixes:" and "Cc: stable" tags and also Cc the author of commit
356fd2663cff.
> ---
> drivers/scsi/scsi_pm.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
> index a2b4179..eff3e59 100644
> --- a/drivers/scsi/scsi_pm.c
> +++ b/drivers/scsi/scsi_pm.c
> @@ -82,6 +82,20 @@ static int scsi_dev_type_resume(struct device *dev,
> pm_runtime_disable(dev);
> pm_runtime_set_active(dev);
> pm_runtime_enable(dev);
> +
> + /*
> + * Forcibly set runtime PM status of request queue to "active"
> + * to make sure we can again get requests from the queue
> + * (see also blk_pm_peek_request()).
> + *
> + * The resume hook will correct runtime PM status of the disk.
> + */
> + if (!err && scsi_is_sdev_device(dev)) {
> + struct scsi_device *sdev = to_scsi_device(dev);
> +
> + if (sdev->request_queue->dev)
> + blk_set_runtime_active(sdev->request_queue);
> + }
What makes you think that the sdev->request_queue->dev test is necessary? The
scsi_dev_type_resume() function is only called after blk_pm_runtime_init() has
finished so I don't think that test is necessary.
Additionally, since the above code occurs inside a block controlled by an
"if (err == 0)" statement, I think the !err test is redundant and should be
left out.
Thanks,
Bart.
* Re: [PATCH v1 1/1] scsi: Synchronize request queue PM status only on successful resume
[not found] ` <1546445745.163063.4.camel-HInyCGIudOg@public.gmane.org>
@ 2019-01-03 6:38 ` Stanley Chu
2019-01-03 23:15 ` Bart Van Assche
0 siblings, 1 reply; 3+ messages in thread
From: Stanley Chu @ 2019-01-03 6:38 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA,
kuohong.wang-NuS5LvNUpcJWk0Htik3J/w,
wsdupstream-NuS5LvNUpcJWk0Htik3J/w,
linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
peter.wang-NuS5LvNUpcJWk0Htik3J/w,
matthias.bgg-Re5JQEeQqe8AvxtiuMwx3w,
mika.westerberg-VuQAYsv1563Yd54FQh9/CA
Hi Bart,
On Wed, 2019-01-02 at 08:15 -0800, Bart Van Assche wrote:
> On Wed, 2019-01-02 at 14:25 +0800, stanley.chu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org wrote:
> > From: Stanley Chu <stanley.chu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
> >
> > The commit 356fd2663cff ("scsi: Set request queue runtime PM status
> > back to active on resume") fixed up the inconsistent RPM status between
> > request queue and device. However changing request queue RPM status
> > shall be done only on successful resume, otherwise status may be still
> > inconsistent as below,
> >
> > Request queue: RPM_ACTIVE
> > Device: RPM_SUSPENDED
> >
> > This ends up soft lockup because requests can be submitted to
> > underlying devices but those devices and their required resource
> > are not resumed.
> >
> > Signed-off-by: Stanley Chu <stanley.chu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
>
> Please add "Fixes:" and "Cc: stable" tags and also Cc the author of commit
> 356fd2663cff.
Sure. Thanks for the reminder.
>
>
> > ---
> > drivers/scsi/scsi_pm.c | 24 ++++++++++++++----------
> > 1 file changed, 14 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
> > index a2b4179..eff3e59 100644
> > --- a/drivers/scsi/scsi_pm.c
> > +++ b/drivers/scsi/scsi_pm.c
> > @@ -82,6 +82,20 @@ static int scsi_dev_type_resume(struct device *dev,
> > pm_runtime_disable(dev);
> > pm_runtime_set_active(dev);
> > pm_runtime_enable(dev);
> > +
> > + /*
> > + * Forcibly set runtime PM status of request queue to "active"
> > + * to make sure we can again get requests from the queue
> > + * (see also blk_pm_peek_request()).
> > + *
> > + * The resume hook will correct runtime PM status of the disk.
> > + */
> > + if (!err && scsi_is_sdev_device(dev)) {
> > + struct scsi_device *sdev = to_scsi_device(dev);
> > +
> > + if (sdev->request_queue->dev)
> > + blk_set_runtime_active(sdev->request_queue);
> > + }
>
> What makes you think that the sdev->request_queue->dev test is necessary? The
> scsi_dev_type_resume() function is only called after blk_pm_runtime_init() has
> finished so I don't think that test is necessary.
We found that a NULL sdev->request_queue->dev may be dereferenced during
the system resume flow below:
scsi_bus_resume_common()
=> async_schedule_domain(async_sdev_resume)
async_sdev_resume() is then invoked asynchronously:
async_sdev_resume()
=> scsi_dev_type_resume(dev, do_scsi_resume)
=> blk_set_runtime_active(sdev->request_queue)
If a SCSI device has no upper-layer driver (such as the SCSI disk
driver), blk_pm_runtime_init(), which is invoked by sd_probe(), may
never be called for it when the device is added.
For example, some SCSI devices (such as the UFS Boot W-LUN) are first
added explicitly through __scsi_add_device() by ufshcd_scsi_add_wlus(),
and sd_probe() is then skipped for them because they are already visible.
For those SCSI devices, a NULL sdev->request_queue->dev would be
dereferenced in blk_set_runtime_active() during the system resume flow
above, so we added a NULL pointer check for this case.
The same issue also exists for those SCSI devices without this patch, in
the system resume flow below, when the devices are already
runtime-suspended:
scsi_bus_resume_common()
=> blk_set_runtime_active(to_scsi_device(dev)->request_queue)
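The NULL check discussed here can be illustrated with a small userspace
model. This is only a sketch: the model_* types and functions are
hypothetical stand-ins invented for illustration, not the kernel's
struct request_queue, blk_pm_runtime_init() or blk_set_runtime_active().
It assumes that a queue's dev pointer stays NULL until runtime PM has
been initialized for it, as described for devices that skip sd_probe().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: hypothetical stand-ins, not kernel definitions. */
enum model_rpm { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

struct model_queue {
    const void *dev;            /* stays NULL until runtime PM is initialized */
    enum model_rpm rpm_status;
};

/* Models the init step done on behalf of an upper-layer driver:
 * binds the queue to its device and marks it active. */
static void model_pm_runtime_init(struct model_queue *q, const void *dev)
{
    assert(q != NULL);
    q->dev = dev;
    q->rpm_status = MODEL_RPM_ACTIVE;
}

/* Guarded status update: skips queues that never had runtime PM set up.
 * Returns 1 if the status was updated, 0 if the queue was skipped. */
static int model_set_runtime_active_guarded(struct model_queue *q)
{
    assert(q != NULL);
    if (q->dev == NULL)
        return 0;               /* e.g. a W-LUN added without sd_probe() */
    q->rpm_status = MODEL_RPM_ACTIVE;
    return 1;
}
```

In this model, a queue that went through model_pm_runtime_init() is
updated on resume, while a queue whose dev is still NULL is left
untouched instead of being dereferenced.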
>
> Additionally, since the above code occurs inside a block controlled by an
> "if (err == 0)" statement, I think the !err test is redundant and should be
> left out.
Sorry, this is a code merge mistake on my side.
"err" here should be the value returned by pm_runtime_set_active().
I will fix it in v2.
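A minimal userspace sketch of that intent, assuming the queue status is
updated only when both the resume callback and the "set active" step
succeed. All sketch_* names are hypothetical stand-ins for illustration;
this is not the v2 patch and not the kernel's scsi_dev_type_resume().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: hypothetical stand-ins, not kernel definitions. */
enum sketch_rpm { SKETCH_RPM_ACTIVE, SKETCH_RPM_SUSPENDED };

struct sketch_sdev {
    enum sketch_rpm dev_status;     /* runtime PM status of the device */
    enum sketch_rpm queue_status;   /* runtime PM status of its request queue */
    const void *queue_dev;          /* NULL if runtime PM was never set up */
};

/* Stand-in for the "set active" step: 0 on success, negative on failure. */
static int sketch_set_active(struct sketch_sdev *sdev, int fail)
{
    if (fail)
        return -1;
    sdev->dev_status = SKETCH_RPM_ACTIVE;
    return 0;
}

/* cb_err models the resume callback result; set_active_fail forces the
 * "set active" step to fail. Returns 0 on success, negative on failure. */
static int sketch_dev_type_resume(struct sketch_sdev *sdev, int cb_err,
                                  int set_active_fail)
{
    int err = cb_err;

    assert(sdev != NULL);
    if (err == 0) {
        /* err now carries the result of the "set active" step */
        err = sketch_set_active(sdev, set_active_fail);
        /* Touch the queue only if the device really became active. */
        if (!err && sdev->queue_dev != NULL)
            sdev->queue_status = SKETCH_RPM_ACTIVE;
    }
    return err;
}
```

With this ordering, a failed resume leaves both the device and the queue
suspended in the model, so the "queue active, device suspended" mismatch
from the commit message cannot occur.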
>
> Thanks,
>
> Bart.
>
* Re: [PATCH v1 1/1] scsi: Synchronize request queue PM status only on successful resume
2019-01-03 6:38 ` Stanley Chu
@ 2019-01-03 23:15 ` Bart Van Assche
0 siblings, 0 replies; 3+ messages in thread
From: Bart Van Assche @ 2019-01-03 23:15 UTC (permalink / raw)
To: Stanley Chu
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA,
kuohong.wang-NuS5LvNUpcJWk0Htik3J/w,
wsdupstream-NuS5LvNUpcJWk0Htik3J/w,
linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
peter.wang-NuS5LvNUpcJWk0Htik3J/w,
matthias.bgg-Re5JQEeQqe8AvxtiuMwx3w,
mika.westerberg-VuQAYsv1563Yd54FQh9/CA
On Thu, 2019-01-03 at 14:38 +0800, Stanley Chu wrote:
> We found NULL sdev->request_queue->dev may be dereferenced during below
> system resume flow,
>
> scsi_bus_resume_common()
> => async_schedule_domain(async_sdev_resume)
>
> And then async_sdev_resume() is invoked asynchronously,
>
> async_sdev_resume()
> => scsi_dev_type_resume(dev, do_scsi_resume)
> => blk_set_runtime_active(sdev->request_queue)
>
> If a SCSI device does not have upper layer driver (like SCSI disk), it
> may not be applied blk_pm_runtime_init() invoked by sd_probe() while
> this SCSI device is added.
>
> For example, some SCSI devices (like UFS Boot W-LUN) are added
> explicitly in __scsi_add_device() by ufshcd_scsi_add_wlus() first and
> thus sd_probe() for them is skipped because they are already visible.
>
> For those SCSI devices, null sdev->request_queue->dev will be
> dereferenced in blk_set_runtime_active() during above system resume
> flow, therefore we add a null pointer checking for this case.
>
> The same issue also happens on those SCSI devices before this patch as
> below system resume flow while devices are already runtime-suspended.
>
> scsi_bus_resume_common()
> => blk_set_runtime_active(to_scsi_device(dev)->request_queue)
Hi Stanley,
Thanks, this is helpful information. If you have to repost your
patch, please add a comment that refers to the __scsi_add_device() calls
in the UFS driver.
Bart.