public inbox for linux-mmc@vger.kernel.org
* SD-card endurance, wear and crappiness
@ 2014-09-03 14:24 Johan Rudholm
  2014-09-09  8:54 ` Johan Rudholm
  0 siblings, 1 reply; 4+ messages in thread
From: Johan Rudholm @ 2014-09-03 14:24 UTC (permalink / raw)
  To: linux-mmc@vger.kernel.org

Hi all,

as you know, NAND flash can be programmed only a limited number of
times before it reaches end of life; the number varies with the
NAND technology used, among other things.

As far as I can tell from the simplified SD-spec, there is no way of
asking the card about how many program/erase cycles it can handle, or
how many p/e cycles are left before reaching EOL. Right?

So, if one should want to give the user some kind of early warning
that it's time to change SD-cards, is there a way? Also, when a card
has reached EOL, is there a way of telling this condition apart from
all other error conditions that may arise? As you know, depending on
the quality of the card and controller, read timeouts, write timeouts,
lockups etc may occur but can usually be fixed with a power cycle.

I'm thinking of collecting simple statistics from for instance
card/block.c and exposing it via an ioctl or sysfs. The statistics can
be gathered and processed by some user space process which can
determine if the user needs to be alerted. The statistics can be, for
instance:

* Writes/reads that time out, but succeed after a retry
* Writes/reads that time out and never succeed
* Different kinds of errors in the card status
* Anything else?
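
A user-space consumer of such statistics could look roughly like the
sketch below. Everything in it is an assumption for illustration: the
counter names, the "name value" text layout and the threshold are
hypothetical, and no such sysfs attribute or ioctl exists in the
kernel today.

```python
from dataclasses import dataclass, fields


@dataclass
class CardStats:
    """Hypothetical per-card counters; the names are made up for this sketch."""
    requests: int = 0            # total read/write requests issued
    timeouts_recovered: int = 0  # timed out, then succeeded after a retry
    timeouts_failed: int = 0     # timed out and never succeeded
    status_errors: int = 0       # requests with error bits in the card status


def parse_stats(text: str) -> CardStats:
    """Parse 'name value' lines, as a sysfs-style attribute file might hold."""
    known = {f.name for f in fields(CardStats)}
    stats = CardStats()
    for line in text.splitlines():
        parts = line.split()
        # Silently skip lines that are not a known counter.
        if len(parts) == 2 and parts[0] in known:
            setattr(stats, parts[0], int(parts[1]))
    return stats


def assess(stats: CardStats, suspect_ratio: float = 0.01) -> str:
    """Classify a card as 'ok', 'suspect' or 'failing'.

    The 1% threshold is an arbitrary placeholder, not a measured value.
    """
    if stats.timeouts_failed > 0:
        return "failing"
    soft_errors = stats.timeouts_recovered + stats.status_errors
    if stats.requests and soft_errors / stats.requests > suspect_ratio:
        return "suspect"
    return "ok"
```

For example, assess(parse_stats("requests 1000\ntimeouts_recovered 20"))
classifies the card as "suspect", since 2% of the requests needed a
retry. Whether any such ratio actually correlates with wear is exactly
the open question here.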

Perhaps it's not possible to detect worn out cards this way, but at
least it could point out and warn about crappy cards?

Any thoughts about this?

Kind regards, Johan


* Re: SD-card endurance, wear and crappiness
  2014-09-03 14:24 SD-card endurance, wear and crappiness Johan Rudholm
@ 2014-09-09  8:54 ` Johan Rudholm
  2014-09-09  9:11   ` Arnd Bergmann
  0 siblings, 1 reply; 4+ messages in thread
From: Johan Rudholm @ 2014-09-09  8:54 UTC (permalink / raw)
  To: Ulf Hansson, arnd; +Cc: linux-mmc@vger.kernel.org

Hi Ulf, Arnd,

do you guys have any thoughts on this?

Kind regards, Johan



* Re: SD-card endurance, wear and crappiness
  2014-09-09  8:54 ` Johan Rudholm
@ 2014-09-09  9:11   ` Arnd Bergmann
  2014-09-10  7:18     ` Johan Rudholm
  0 siblings, 1 reply; 4+ messages in thread
From: Arnd Bergmann @ 2014-09-09  9:11 UTC (permalink / raw)
  To: Johan Rudholm; +Cc: Ulf Hansson, linux-mmc@vger.kernel.org

On Tuesday 09 September 2014 10:54:51 Johan Rudholm wrote:
> 2014-09-03 16:24 GMT+02:00 Johan Rudholm <jrudholm@gmail.com>:
> > Hi all,
> >
> > as you know, NAND flash can be programmed only a limited number of
> > times before it reaches end of life; the number varies with the
> > NAND technology used, among other things.
> >
> > As far as I can tell from the simplified SD-spec, there is no way of
> > asking the card about how many program/erase cycles it can handle, or
> > how many p/e cycles are left before reaching EOL. Right?

I think that is correct.

> > So, if one should want to give the user some kind of early warning
> > that it's time to change SD-cards, is there a way? Also, when a card
> > has reached EOL, is there a way of telling this condition apart from
> > all other error conditions that may arise? As you know, depending on
> > the quality of the card and controller, read timeouts, write timeouts,
> > lockups etc may occur but can usually be fixed with a power cycle.
> >
> > I'm thinking of collecting simple statistics from for instance
> > card/block.c and exposing it via an ioctl or sysfs. The statistics can
> > be gathered and processed by some user space process which can
> > determine if the user needs to be alerted. The statistics can be, for
> > instance:
> >
> > * Writes/reads that time out, but succeed after a retry
> > * Writes/reads that time out and never succeed
> > * Different kinds of errors in the card status
> > * Anything else?
> >
> > Perhaps it's not possible to detect worn out cards this way, but at
> > least it could point out and warn about crappy cards?
> >
> > Any thoughts about this?

Have you tested whether this works? In my experience, the worn-out
cards I have either just fail completely or return incorrect data,
but I have not looked at this side of the problem much.

Do you have cards that sometimes time out but always still return
correct data on retry?

	Arnd


* Re: SD-card endurance, wear and crappiness
  2014-09-09  9:11   ` Arnd Bergmann
@ 2014-09-10  7:18     ` Johan Rudholm
  0 siblings, 0 replies; 4+ messages in thread
From: Johan Rudholm @ 2014-09-10  7:18 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Ulf Hansson, linux-mmc@vger.kernel.org

2014-09-09 11:11 GMT+02:00 Arnd Bergmann <arnd@arndb.de>:
> On Tuesday 09 September 2014 10:54:51 Johan Rudholm wrote:
>> 2014-09-03 16:24 GMT+02:00 Johan Rudholm <jrudholm@gmail.com>:
>> > Hi all,
>> >
>> > as you know, NAND flash can be programmed only a limited number of
>> > times before it reaches end of life; the number varies with the
>> > NAND technology used, among other things.
>> >
>> > As far as I can tell from the simplified SD-spec, there is no way of
>> > asking the card about how many program/erase cycles it can handle, or
>> > how many p/e cycles are left before reaching EOL. Right?
>
> I think that is correct.
>
>> > So, if one should want to give the user some kind of early warning
>> > that it's time to change SD-cards, is there a way? Also, when a card
>> > has reached EOL, is there a way of telling this condition apart from
>> > all other error conditions that may arise? As you know, depending on
>> > the quality of the card and controller, read timeouts, write timeouts,
>> > lockups etc may occur but can usually be fixed with a power cycle.
>> >
>> > I'm thinking of collecting simple statistics from for instance
>> > card/block.c and exposing it via an ioctl or sysfs. The statistics can
>> > be gathered and processed by some user space process which can
>> > determine if the user needs to be alerted. The statistics can be, for
>> > instance:
>> >
>> > * Writes/reads that time out, but succeed after a retry
>> > * Writes/reads that time out and never succeed
>> > * Different kinds of errors in the card status
>> > * Anything else?
>> >
>> > Perhaps it's not possible to detect worn out cards this way, but at
>> > least it could point out and warn about crappy cards?
>> >
>> > Any thoughts about this?
>
> Have you tested whether this works? In my experience, the worn-out
> cards I have either just fail completely or return incorrect data,
> but I have not looked at this side of the problem much.
>
> Do you have cards that sometimes time out but always still return
> correct data on retry?

I have noticed that some cards time out on a multi-block read, but
then succeed when single-block reads are attempted. I've also
experimented with retrying the multi-block read instead of falling
back to single-block reads, and some cards succeed after a number
(more than 10) of retries. However, these cards have not been close
to being worn out.
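
The two recovery strategies described above (retry the multi-block
read, or fall back to single-block reads) can be modelled in a few
lines. This is a simulation sketch only, not the logic in
card/block.c: the read callback, the counter names and the retry
limit are all hypothetical.

```python
from collections import Counter
from typing import Callable, Optional

# Hypothetical read primitive: (start_block, block_count) -> data,
# or None if the request timed out.
ReadFn = Callable[[int, int], Optional[bytes]]


def read_with_fallback(read: ReadFn, start: int, count: int,
                       max_retries: int, stats: Counter) -> Optional[bytes]:
    """Try a multi-block read, retry it, then fall back to single blocks."""
    # One initial attempt plus up to max_retries retries.
    for attempt in range(1 + max_retries):
        data = read(start, count)
        if data is not None:
            if attempt > 0:
                stats["recovered_by_retry"] += 1
            return data
        stats["multi_block_timeouts"] += 1

    # Every multi-block attempt timed out: read one block at a time.
    chunks = []
    for block in range(start, start + count):
        data = read(block, 1)
        if data is None:
            stats["failed"] += 1
            return None
        chunks.append(data)
    stats["recovered_by_single_block"] += 1
    return b"".join(chunks)
```

Counters such as recovered_by_retry and recovered_by_single_block are
the kind of per-card statistic that could be exported, although, as
noted, a high count here has so far pointed at poor cards rather than
worn-out ones.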

I have almost never seen an SD-card that's died because of wear, at
least not under controlled circumstances. So, thanks for sharing your
experiences! Maybe the bottom line will be that there is no guaranteed
way of detecting that a card is nearing EOL.

//Johan
