public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: Can Guo <cang@codeaurora.org>
To: Bean Huo <huobean@gmail.com>
Cc: asutoshd@codeaurora.org, nguyenb@codeaurora.org,
	hongwus@codeaurora.org, ziqichen@codeaurora.org,
	rnayak@codeaurora.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, saravanak@google.com,
	salyzyn@google.com, rjw@rjwysocki.net,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Avri Altman <avri.altman@wdc.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Stanley Chu <stanley.chu@mediatek.com>,
	Bean Huo <beanhuo@micron.com>,
	Nitin Rawat <nitirawa@codeaurora.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Satya Tangirala <satyat@google.com>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] scsi: ufs: Protect PM ops and err_handler from user access through sysfs
Date: Sat, 09 Jan 2021 12:45:08 +0800	[thread overview]
Message-ID: <e69bd5a6b73d5c652130bf4fa077aac0@codeaurora.org> (raw)
In-Reply-To: <cb388d8ea15b2c80a072dec74d9ededecb183a08.camel@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5539 bytes --]

On 2021-01-08 19:29, Bean Huo wrote:
> On Wed, 2021-01-06 at 09:20 +0800, Can Guo wrote:
>> Hi Bean,
>> 
>> On 2021-01-06 02:38, Bean Huo wrote:
>> > On Tue, 2021-01-05 at 09:07 +0800, Can Guo wrote:
>> > > On 2021-01-05 04:05, Bean Huo wrote:
>> > > > On Sat, 2021-01-02 at 05:59 -0800, Can Guo wrote:
>> > > > > + * @shutting_down: flag to check if shutdown has been
>> > > > > invoked
>> > > >
>> > > > I am not much sure if this flag is need, since once PM going in
>> > > > shutdown path, what will be returnded by pm_runtime_get_sync()?
>> > > >
>> > > > If pm_runtime_get_sync() will fail, just check its return.
>> > > >
>> > >
>> > > That depends. During/after shutdown, for UFS's case only,
>> > > pm_runtime_get_sync(hba->dev) will most likely return 0,
>> > > because it is already RUNTIME_ACTIVE, pm_runtime_get_sync()
>> > > will directly return 0... meaning you cannot count on it.
>> > >
>> > > Check Stanley's change -
>> > > https://lore.kernel.org/patchwork/patch/1341389/
>> > >
>> > > Can Guo.
>> >
>> > Can,
>> >
>> > Thanks for pointing out that.
>> >
>> > Based on my understanding, that patch is redundent. maybe I
>> > misundestood Linux shutdown sequence.
>> 
>> Sorry, do you mean Stanley's change is redundant?
> 
> yes.
> 

No, it is definitely needed. As Stanley replied you in another
thread, it is not protecting I/Os from user layer, but from
other subsystems during shutdown.

>> 
>> >
>> > I checked the shutdown flow:
>> >
>> > 1. Set the "system_state" variable
>> > 2. Disable usermod to ensure that no user from userspace can start
>> > a
>> > request
>> 
>> I hope it is like what you interpreted, but step #2 only stops
>> UMH(#265)
>> but not all user space activities. Whereas, UMH is for kernel space
>> calling
>> user space.
> 
> 
> Can,
> 
> I did further study and homework on the Linux shutdown in the last few
> days. Yes, you are right, usermodehelper_disable() is to prevent
> executing the process from the kernel space.
> 
> But I didn't reproduce this "maybe" race issue while shutdown. no
> matter how I torment my system, once Linux shutdown/halt/reboot starts,
> nobody can access the sysfs node. I create 10 processes in the user
> space and constantly access UFS sysfs node, also, fio is running in the
> background for the normal data read/write. there is a shutdown thread
> that will randomly trigger shutdown/halt/reboot. but no race issue
> appears.
> 
> I don't know if this is a hypothetical issue(the race between shutdown
> flow and sysfs node access), it may not really exist in the Linux
> envriroment. everytime, the shutdonw flow will be:
> 
> e10_sync_handler()->e10_svc()->do_e10_svc()->__do_sys_reboot()-
>> kernel_poweroff/kernel_halt()->device_shutdown()->platform_shutdown()-
>> ufshcd_platform_shutdown()->ufshcd_shutdown().
> 
> I think before going into the kernel shutdown, the userspace cannot
> issue new requests anymore. otherwise, this would be a big issue.
> 
> pm_runtime_get_sync() will return 0 or failure while shutdown? the
> answer is not important now, maybe as you said, it is always 0. But in
> my testing, it didn't get there the system has been shutdown. Which
> means once shutdonw starts, sysfs node access path cannot reach
> pm_runtime_get_sync(). (note, I don't know if sysfs node access thread
> has been disabled or not)
> 
> 
> Responsibly say, I didn't reproduce this issue on my system (ubuntu),
> maybe you are using Android. I am not an expert on this topic, if you
> have the best idea on how to reproduce this issue. please please let me
> try. appreciate it!!!!!
> 

When you do a reboot/shutdown/poweroff, how your system behaves highly
depends on how the reboot cmd is implemented in C code under /sbin/.

On Ubuntu, reboot looks like:
$ reboot --help
reboot [OPTIONS...] [ARG]

Reboot the system.

      --help      Show this help
      --halt      Halt the machine
   -p --poweroff  Switch off the machine
      --reboot    Reboot the machine
   -f --force     Force immediate halt/power-off/reboot
   -w --wtmp-only Don't halt/power-off/reboot, just write wtmp record
   -d --no-wtmp   Don't write wtmp record
      --no-wall   Don't send wall message before halt/power-off/reboot


On a pure Linux with a initrd RAM FS built from busybox, reboot looks 
like:
# reboot --help
BusyBox v1.30.1 (2019-05-24 12:53:36 IST) multi-call binary.

Usage: reboot [-d DELAY] [-n] [-f]

Reboot the system

         -d SEC  Delay interval
         -n      Do not sync
         -f      Force (don't go through init)


For example, when you work on a pure Linux with a filesystem built from
busybox, when you hit reboot cmd, halt_main() will be called. And based
on the reboot options passed to reboot cmd, halt_main() behaves 
differently.

A plain reboot cmd does things like sync filesystem, send SIGKILL to all
processes (except for init), remount all filesytem as read-only and so 
on
before invoking linux kernel reboot syscall. In this case, we are safe.

However, if you do a "reboot -f", halt_main() directly invokes reboot().
And with "reboot -f", I can easily reproduce the race condition we are
talking about here - it is not based on imagination.

Find the patch I used for replication in the attachment, fix conflicts
if any. After boot up, the cmd lines I used are

# while true; do cat /sys/devices/platform/soc@0/*ufshc*/rpm_lvl; done &
# reboot -f

Can Guo.

> 
> Thanks,
> Bean
> 
> 
>> 
>> 264     system_state = state;
>> 265     usermodehelper_disable();
>> 266     device_shutdown();
>> 
>> Thanks,
>> Can Guo.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-scsi-ufs-Reproduce-race-condition-btw-sysfs-access-a.patch --]
[-- Type: text/x-diff; name=0001-scsi-ufs-Reproduce-race-condition-btw-sysfs-access-a.patch, Size: 1858 bytes --]

From 639358dd6e2f7e5e9f7e66b828bf043fa3c7bbf2 Mon Sep 17 00:00:00 2001
From: Can Guo <cang@codeaurora.org>
Date: Sat, 9 Jan 2021 12:37:40 +0800
Subject: [PATCH] scsi: ufs: Reproduce race condition btw sysfs access and
 shutdown

Reproduce race condition btw sysfs access and shutdown

Usage:

Signed-off-by: Can Guo <cang@codeaurora.org>

diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index 92a63ee..a29a490 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -59,6 +59,8 @@ static ssize_t rpm_lvl_show(struct device *dev,
 {
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 
+	BUG_ON(hba->shutting_down);
+
 	return sprintf(buf, "%d\n", hba->rpm_lvl);
 }
 
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 698e8d2..210fa5c 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8277,6 +8277,11 @@ int ufshcd_shutdown(struct ufs_hba *hba)
 {
 	int ret = 0;
 
+	hba->shutting_down = true;
+	pr_err("[DEBUG]%s: UFS SHUTDOWN START\n", __func__);
+
+	usleep_range(10000000, 11000000);
+
 	if (!hba->is_powered)
 		goto out;
 
@@ -8294,6 +8299,7 @@ int ufshcd_shutdown(struct ufs_hba *hba)
 	if (ret)
 		dev_err(hba->dev, "%s failed, err %d\n", __func__, ret);
 	/* allow force shutdown even in case of errors */
+	pr_err("[DEBUG]%s: UFS SHUTDOWN END\n", __func__);
 	return 0;
 }
 EXPORT_SYMBOL(ufshcd_shutdown);
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 6ffc08a..3a40d94 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -677,6 +677,7 @@ struct ufs_hba {
 	u16 ee_ctrl_mask;
 	u16 hba_enable_delay_us;
 	bool is_powered;
+	bool shutting_down;
 
 	/* Work Queues */
 	struct work_struct eh_work;
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.


  parent reply	other threads:[~2021-01-09  4:46 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-02 13:59 [PATCH v3 0/2] Synchronize user layer access with system PM ops and error handling Can Guo
2021-01-02 13:59 ` [PATCH 1/2] scsi: ufs: Fix a possible NULL pointer issue Can Guo
2021-01-12  6:35   ` Stanley Chu
2021-01-12  6:52     ` Can Guo
2021-01-12  9:17       ` Stanley Chu
2021-01-02 13:59 ` [PATCH 2/2] scsi: ufs: Protect PM ops and err_handler from user access through sysfs Can Guo
2021-01-04 20:05   ` Bean Huo
2021-01-05  1:07     ` Can Guo
2021-01-05 18:38       ` Bean Huo
2021-01-06  1:20         ` Can Guo
2021-01-08 11:29           ` Bean Huo
2021-01-08 13:11             ` Stanley Chu
2021-01-09  4:45             ` Can Guo [this message]
2021-01-09  4:51               ` Can Guo
2021-01-10 16:13                 ` Bean Huo
2021-01-11  1:27                   ` Can Guo
2021-01-11  8:23                     ` Bean Huo
2021-01-11  9:22                       ` Can Guo
2021-01-11 10:04                         ` Bean Huo
2021-01-12  0:45                           ` Can Guo
2021-01-12 11:32                             ` Bean Huo
2021-01-11  1:52                   ` Can Guo
2021-01-10 16:18   ` Bean Huo
2021-01-11  1:30     ` Can Guo
2021-01-11  8:25       ` Bean Huo
2021-01-12  8:20     ` Avri Altman
2021-01-12  9:36   ` Stanley Chu
2021-01-13  4:18 ` [PATCH v3 0/2] Synchronize user layer access with system PM ops and error handling Martin K. Petersen
2021-01-13  4:23   ` Can Guo
  -- strict thread matches above, loose matches on Subject: below --
2020-12-31  4:25 [PATCH v1 " Can Guo
2020-12-31  4:25 ` [PATCH 2/2] scsi: ufs: Protect PM ops and err_handler from user access through sysfs Can Guo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e69bd5a6b73d5c652130bf4fa077aac0@codeaurora.org \
    --to=cang@codeaurora.org \
    --cc=adrian.hunter@intel.com \
    --cc=alim.akhtar@samsung.com \
    --cc=asutoshd@codeaurora.org \
    --cc=avri.altman@wdc.com \
    --cc=beanhuo@micron.com \
    --cc=bvanassche@acm.org \
    --cc=hongwus@codeaurora.org \
    --cc=huobean@gmail.com \
    --cc=jejb@linux.ibm.com \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=nguyenb@codeaurora.org \
    --cc=nitirawa@codeaurora.org \
    --cc=rjw@rjwysocki.net \
    --cc=rnayak@codeaurora.org \
    --cc=salyzyn@google.com \
    --cc=saravanak@google.com \
    --cc=satyat@google.com \
    --cc=stanley.chu@mediatek.com \
    --cc=ziqichen@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox