public inbox for linux-ide@vger.kernel.org
From: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-ide@vger.kernel.org, Hans de Goede <hdegoede@redhat.com>
Subject: Re: Race to power off harming SATA SSDs
Date: Mon, 10 Apr 2017 22:26:12 -0300	[thread overview]
Message-ID: <20170411012612.GC10185@khazad-dum.debian.net> (raw)
In-Reply-To: <20170410235206.GA28603@wtj.duckdns.org>

On Tue, 11 Apr 2017, Tejun Heo wrote:
> > The kernel then continues the shutdown path while the SSD is still
> > preparing itself to be powered off, and it becomes a race.  When the
> > kernel + firmware wins, platform power is cut before the SSD has
> > finished (i.e. the SSD is subject to an unclean power-off).
> 
> At that point, the device is fully flushed and in terms of data
> integrity should be fine with losing power at any point anyway.

All bets are off at this point, really.

We issued a command that explicitly orders the SSD to checkpoint and
stop all background tasks, and flush *everything* including invisible
state (device data, stats, logs, translation tables, flash metadata,
etc)...  and then cut its power before it finished.

> > NOTE: unclean SSD power-offs are dangerous and may brick the device in
> > the worst case, or otherwise harm it (reduce longevity, damage flash
> > blocks).  It is also not impossible to get data corruption.
> 
> I get that the incrementing counters might not be pretty but I'm a bit
> skeptical about this being an actual issue.  Because if that were

As an *example* I know of because I tracked it personally, Crucial SSD
models from a few years ago were known to eventually brick on any
platform where they were subjected to repeated unclean shutdowns,
*Windows included*.  There are some threads on their forums about it.
Firmware revisions made it harder to happen, but still...

> true, the device would be bricking itself from any sort of power
> losses be that an actual power loss, battery rundown or hard power off
> after crash.

Bricking is the worst case, really.  I guess they learned to keep the
device in a will-not-brick state at all times, using append-only logs
for critical state or something, so it now takes very nasty flash damage
in exactly the wrong place to render the device unusable.

> > Fixing the issue properly:
> > 
> > The proof of concept patch works fine, but it "punishes" the system with
> > too much delay.  Also, if sd device shutdown is serialized, it will
> > punish systems with many /dev/sd devices severely.
> > 
> > 1. The delay needs to happen only once right before powering down for
> >    hibernation/suspend/power-off.  There is no need to delay per-device
> >    for platform power off/suspend/hibernate.
> > 
> > 2. A per-device delay needs to happen before signaling that a device
> >    can be safely removed when doing controlled hotswap (e.g. when
> >    deleting the SD device due to a sysfs command).
> > 
> > I am unsure how much *total* delay would be enough.  Two seconds seems
> > like a safe bet.
> > 
> > Any comments?  Any clues on how to make the delay "smarter" to trigger
> > only once during platform shutdown, but still trigger per-device when
> > doing per-device hotswapping ?
> 
> So, if this is actually an issue, sure, we can try to work around;
> however, can we first confirm that this has any other consequences
> than a SMART counter being bumped up?  I'm not sure how meaningful
> that is in itself.

I have no idea how to confirm whether an SSD is damaged less, or more,
by "STANDBY IMMEDIATE and cut power too early" when compared with a
sudden power cut.  At least not without actually damaging SSDs across
three test groups (normal power cuts, STANDBY IMMEDIATE + early power
cut, and a control group).

An "SSD power cut test" search on DuckDuckGo shows several papers and
testing reports on the first results page.  I don't think there is any
doubt whatsoever that a typical consumer SSD *can* be damaged by a
sudden power cut badly enough for the user to actually notice.

That the flash itself gets damaged, or can have stored data corrupted,
by power cuts at bad times is quite clear:

http://cseweb.ucsd.edu/users/swanson/papers/DAC2011PowerCut.pdf

SSDs do a lot of work to recover from that without data loss.  You won't
notice it easily unless that recovery work *fails*.

-- 
  Henrique Holschuh

