From: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
To: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-ide@vger.kernel.org
Cc: Hans de Goede <hdegoede@redhat.com>, Tejun Heo <tj@kernel.org>
Subject: sd: wait for slow devices on shutdown path
Date: Mon, 10 Apr 2017 20:49:33 -0300
Message-ID: <20170410234933.GA10185@khazad-dum.debian.net>
In-Reply-To: <20170410232118.GA4816@khazad-dum.debian.net>

Author: Henrique de Moraes Holschuh <hmh@debian.org>
Date:   Wed Feb 1 20:42:02 2017 -0200

    sd: wait for slow devices on shutdown path
    
    Wait 1s during suspend/shutdown for the device to settle after
    we issue the STOP command.
    
    Otherwise we race ATA SSDs to powerdown, possibly causing damage to
    FLASH/data and even bricking the device.
    
    This is an experimental patch; there are likely better ways of doing
    this that don't punish non-SSDs.
    
    Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 4e08d1cd..3c6d5d3 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3230,6 +3230,38 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
 			res = 0;
 	}
 
+	/*
+	 * Wait for slow devices that signal they have fully entered
+	 * the stopped state before they actually did it.
+	 *
+	 * This behavior is apparently allowed per-spec for ATA
+	 * devices, and our SAT layer does not account for it.
+	 * Thus, on return, the device might still be in the process
+	 * of entering STANDBY state.
+	 *
+	 * Worse, apparently the ATA spec also says the unit should
+	 * return that it is already in STANDBY state *while still
+	 * entering that state*.
+	 *
+	 * SSDs absolutely depend on receiving a STANDBY IMMEDIATE
+	 * command prior to power off for a clean shutdown (and
+	 * likely we don't want to send them *anything else* in-
+	 * between either, to be on the safe side).
+	 *
+	 * As things stand, we are racing the SSD's firmware.  If it
+	 * finishes first, nothing bad happens.  If it doesn't, we
+	 * cut power while it is still saving metadata.  Not only
+	 * will this cause extra FLASH wear (and maybe even damage
+	 * some cells), it also has a non-zero chance of bricking the
+	 * SSD.
+	 *
+	 * Issue reported on Intel, Crucial and Micron SSDs.
+	 * Issue can be detected by S.M.A.R.T. signaling unexpected
+	 * power cuts.
+	 */
+	if (!res && !start)
+		msleep(1000);
+
 	/* SCSI error codes must not go to the generic layer */
 	if (res)
 		return -EIO;
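
As the commit message says, the unconditional 1s wait also punishes
non-SSDs.  One possible refinement, sketched below as standalone
userspace C rather than kernel code, is to compute the settle delay from
the command result, the start/stop direction and a "non-rotational" hint.
The names sd_stop_settle_ms, SD_STOP_SETTLE_MS and the nonrot parameter
are hypothetical and not part of this patch; how the hint would be
obtained inside sd.c is left open, and treating "non-rotational" as
"SSD" is itself an assumption.

```c
#include <stdbool.h>

/* Hypothetical default, mirroring the 1s wait used in the patch. */
#define SD_STOP_SETTLE_MS 1000u

/*
 * Return how long to wait, in milliseconds, after START STOP UNIT.
 *
 * res:    0 if the command succeeded, non-zero on error
 * start:  non-zero when starting the unit, zero when stopping it
 * nonrot: true if the device is non-rotational (assumed to mean SSD)
 */
unsigned int sd_stop_settle_ms(int res, int start, bool nonrot)
{
	/* Same condition as the patch: only after a successful stop. */
	if (res || start)
		return 0;

	/* Assumption: only SSDs need the extra time to settle. */
	return nonrot ? SD_STOP_SETTLE_MS : 0;
}
```

With such a helper, the patch's "if (!res && !start) msleep(1000);"
could become a single msleep() on the returned value whenever it is
non-zero, so rotational disks would keep today's shutdown latency.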

-- 
  Henrique Holschuh
