All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>,
	Josh Triplett <josh@joshtriplett.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"jejb@linux.ibm.com" <jejb@linux.ibm.com>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hch@lst.de" <hch@lst.de>,
	"bvanassche@acm.org" <bvanassche@acm.org>,
	"hare@suse.de" <hare@suse.de>,
	Michael Kelley <mikelley@microsoft.com>,
	Long Li <longli@microsoft.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	KY Srinivasan <kys@microsoft.com>,
	linux-pm@vger.kernel.org
Subject: Re: [PATCH] scsi: storvsc: Fix a panic in the hibernation procedure
Date: Wed, 22 Apr 2020 12:16:29 +0800	[thread overview]
Message-ID: <20200422041629.GE299948@T590> (raw)
In-Reply-To: <20200422030807.GK17661@paulmck-ThinkPad-P72>

On Tue, Apr 21, 2020 at 08:08:07PM -0700, Paul E. McKenney wrote:
> On Wed, Apr 22, 2020 at 10:01:34AM +0800, Ming Lei wrote:
> > On Wed, Apr 22, 2020 at 01:48:25AM +0000, Dexuan Cui wrote:
> > > > From: Ming Lei <ming.lei@redhat.com>
> > > > Sent: Tuesday, April 21, 2020 6:28 PM
> > > > To: Dexuan Cui <decui@microsoft.com>
> > > > 
> > > > On Tue, Apr 21, 2020 at 05:17:24PM -0700, Dexuan Cui wrote:
> > > > > During hibernation, the sdevs are suspended automatically in
> > > > > drivers/scsi/scsi_pm.c before storvsc_suspend(), so after
> > > > > storvsc_suspend(), there is no disk I/O from the file systems, but there
> > > > > can still be disk I/O from the kernel space, e.g. disk_check_events() ->
> > > > > sr_block_check_events() -> cdrom_check_events() can still submit I/O
> > > > > to the storvsc driver, which causes a paic of NULL pointer dereference,
> > > > > since storvsc has closed the vmbus channel in storvsc_suspend(): refer
> > > > > to the below links for more info:
> > > > >
> > > > > Fix the panic by blocking/unblocking all the I/O queues properly.
> > > > >
> > > > > Note: this patch depends on another patch "scsi: core: Allow the state
> > > > > change from SDEV_QUIESCE to SDEV_BLOCK" (refer to the second link
> > > > above).
> > > > >
> > > > > Fixes: 56fb10585934 ("scsi: storvsc: Add the support of hibernation")
> > > > > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > > > > ---
> > > > >  drivers/scsi/storvsc_drv.c | 10 ++++++++++
> > > > >  1 file changed, 10 insertions(+)
> > > > >
> > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> > > > > index fb41636519ee..fd51d2f03778 100644
> > > > > --- a/drivers/scsi/storvsc_drv.c
> > > > > +++ b/drivers/scsi/storvsc_drv.c
> > > > > @@ -1948,6 +1948,11 @@ static int storvsc_suspend(struct hv_device
> > > > *hv_dev)
> > > > >  	struct storvsc_device *stor_device = hv_get_drvdata(hv_dev);
> > > > >  	struct Scsi_Host *host = stor_device->host;
> > > > >  	struct hv_host_device *host_dev = shost_priv(host);
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = scsi_host_block(host);
> > > > > +	if (ret)
> > > > > +		return ret;
> > > > >
> > > > >  	storvsc_wait_to_drain(stor_device);
> > > > >
> > > > > @@ -1968,10 +1973,15 @@ static int storvsc_suspend(struct hv_device
> > > > *hv_dev)
> > > > >
> > > > >  static int storvsc_resume(struct hv_device *hv_dev)
> > > > >  {
> > > > > +	struct storvsc_device *stor_device = hv_get_drvdata(hv_dev);
> > > > > +	struct Scsi_Host *host = stor_device->host;
> > > > >  	int ret;
> > > > >
> > > > >  	ret = storvsc_connect_to_vsp(hv_dev, storvsc_ringbuffer_size,
> > > > >  				     hv_dev_is_fc(hv_dev));
> > > > > +	if (!ret)
> > > > > +		ret = scsi_host_unblock(host, SDEV_RUNNING);
> > > > > +
> > > > >  	return ret;
> > > > >  }
> > > > 
> > > > scsi_host_block() is actually too heavy for just avoiding
> > > > scsi internal command, which can be done simply by one atomic
> > > > variable.
> > > > 
> > > > Not mention scsi_host_block() is implemented too clumsy because
> > > > nr_luns * synchronize_rcu() are required in scsi_host_block(),
> > > > which should have been optimized to just one.
> > > > 
> > > > Also scsi_device_quiesce() is heavy too, still takes 2
> > > > synchronize_rcu() for one LUN.
> > > > 
> > > > That is said SCSI suspend may take (3 * nr_luns) sysnchronize_rcu() in
> > > > case that the HBA's suspend handler needs scsi_host_block().
> > > > 
> > > > Thanks,
> > > > Ming
> > > 
> > > When we're in storvsc_suspend(), all the userspace processes have been
> > > frozen and all the file systems have been flushed, and there should not
> > > be too much I/O from the kernel space, so IMO scsi_host_block() should be
> > > pretty fast here. 
> > 
> > I guess it depends on RCU's implementation, so CC RCU guys.
> > 
> > Hello Paul & Josh,
> > 
> > Could you clarify that if sysnchronize_rcu becomes quickly during
> > system suspend?
> 
> Once you have all but one CPU offlined, it becomes extremely fast, as
> in roughly a no-op (which is an idea of Josh's from back in the day).
> But if there is more than one CPU online, then synchronize_rcu() still
> takes on the order of several to several tens of jiffies.
> 
> So, yes, in some portions of system suspend, synchronize_rcu() becomes
> very fast indeed.

Hi Paul,

Thanks for your clarification.

In system suspend path, device is suspended before suspend_disable_secondary_cpus(), 
so I guess synchronize_rcu() is not quick enough even though user space
processes and some kernel threads are frozen.

Thanks,
Ming


  reply	other threads:[~2020-04-22  4:16 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-22  0:17 [PATCH] scsi: storvsc: Fix a panic in the hibernation procedure Dexuan Cui
2020-04-22  1:28 ` Ming Lei
2020-04-22  1:48   ` Dexuan Cui
2020-04-22  2:01     ` Ming Lei
2020-04-22  3:08       ` Paul E. McKenney
2020-04-22  4:16         ` Ming Lei [this message]
2020-04-22  4:58           ` Dexuan Cui
2020-04-22  9:23             ` Ming Lei
2020-04-22 16:19               ` Paul E. McKenney
2020-04-22 18:01               ` Dexuan Cui
2020-04-22  2:06     ` Ming Lei
2020-04-22  5:02 ` Bart Van Assche
2020-04-22  6:24   ` Dexuan Cui
2020-04-22 19:56     ` Bart Van Assche
2020-04-23  7:04       ` Dexuan Cui
2020-04-23 16:37         ` Bart Van Assche
2020-04-23 18:29           ` Dexuan Cui
2020-04-23 23:25             ` Bart Van Assche
2020-04-24  2:40               ` Dexuan Cui
2020-04-24  3:45               ` Dexuan Cui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200422041629.GE299948@T590 \
    --to=ming.lei@redhat.com \
    --cc=bvanassche@acm.org \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=jejb@linux.ibm.com \
    --cc=josh@joshtriplett.org \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=martin.petersen@oracle.com \
    --cc=mikelley@microsoft.com \
    --cc=paulmck@kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=sthemmin@microsoft.com \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.