public inbox for linux-kernel@vger.kernel.org
* SD/MMC cards: how crappy they are?
@ 2008-12-02 14:48 Pavel Machek
  2008-12-02 16:30 ` H. Peter Anvin
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Machek @ 2008-12-02 14:48 UTC (permalink / raw)
  To: kernel list; +Cc: tytso


I have 32GB card here...

root@amd:/home/pavel/WWW/wear/tinylight# time cat /dev/mmc1 > /dev/null
cat: /dev/mmc1: Input/output error
1.32user 49.03system 4184.78 (69m44.789s) elapsed 1.20%CPU

...maybe it was because of powerfail? I'll try to run badblocks to
recover it...

...I did. Badblocks did not help, but cat /dev/zero > /dev/mmc1
did.. And yes, those 'temporarily bad blocks' seem very much
powerfail related.

It's bad, because ext2/3 does not seem to handle this very well... not
even fsck does the selective rewrite... :-(.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: SD/MMC cards: how crappy they are?
  2008-12-02 14:48 SD/MMC cards: how crappy they are? Pavel Machek
@ 2008-12-02 16:30 ` H. Peter Anvin
  2008-12-02 16:56   ` Theodore Tso
  0 siblings, 1 reply; 10+ messages in thread
From: H. Peter Anvin @ 2008-12-02 16:30 UTC (permalink / raw)
  To: Pavel Machek; +Cc: kernel list, tytso

Pavel Machek wrote:
> 
> ...maybe it was because of powerfail? I'll try to run badblocks to
> recover it...
> 
> ...I did. Badblocks did not help, but cat /dev/zero > /dev/mmc1
> did.. And yes, those 'temporarily bad blocks' seem very much
> powerfail related.
> 

Power failures can, indeed, do nasty things to SD/MMC cards, especially
power rail sag in the middle of writes.

	-hpa


* Re: SD/MMC cards: how crappy they are?
  2008-12-02 16:30 ` H. Peter Anvin
@ 2008-12-02 16:56   ` Theodore Tso
  2008-12-02 17:59     ` H. Peter Anvin
  2008-12-26 22:39     ` Pavel Machek
  0 siblings, 2 replies; 10+ messages in thread
From: Theodore Tso @ 2008-12-02 16:56 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Pavel Machek, kernel list

On Tue, Dec 02, 2008 at 08:30:29AM -0800, H. Peter Anvin wrote:
> > ...maybe it was because of powerfail? I'll try to run badblocks to
> > recover it...
> > 
> > ...I did. Badblocks did not help, but cat /dev/zero > /dev/mmc1
> > did.. And yes, those 'temporarily bad blocks' seem very much
> > powerfail related.
> > 
> 
> Power failures can, indeed, do nasty things to SD/MMC cards, especially
> power rail sag in the middle of writes.

If this is your random eject out from your HP laptop problem, note
that random ejects while the card is writing can cause corruption of
the flash translation layer (FTL), which for some really crappy cards,
can permanently damage them; hopefully most of those are gone from the
market, but I wouldn't be positive about that.  The better ones will
have some kind of journalling scheme for their FTL...

Fsck does have a force rewrite option, although it's not the default.
You have to answer "n" to ignore error, and then yes to "force
rewrite".  I should perhaps change that; my worry at the time was a
transient read error tricking e2fsck into blowing away the contents of
what was actually a good sector.  Of course, that will only help
blocks which fsck actually tried reading; it won't help data blocks.

Badblocks -n will fix the problem, since it will do a non-destructive
read/write test over the entire disk.  Patches to add a
forced-rewrite mode to the standard r/o badblocks sweep (so we only
write to a sector that has a read error) would be gratefully accepted.
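
A rough illustration of what such a forced-rewrite sweep could look like,
written in Python rather than as a badblocks patch. Everything here is
hypothetical: the names sweep_and_rewrite and sweep_device, the 4096-byte
block size, the zero fill, and the retries parameter are assumptions made
for the sketch, not anything badblocks provides. The retry loop is there
because of the transient-read-error worry above; a block is only
overwritten after every read attempt has failed with EIO.

```python
import errno
import os


def sweep_and_rewrite(read_block, write_block, num_blocks,
                      block_size=4096, retries=1):
    """Read every block; overwrite only blocks whose reads keep failing.

    read_block(i) returns bytes or raises OSError.
    write_block(i, data) stores data at block i.
    Returns the list of block numbers that were rewritten.
    """
    zeros = bytes(block_size)
    rewritten = []
    for i in range(num_blocks):
        for _attempt in range(retries + 1):
            try:
                read_block(i)
                break  # readable: leave the block untouched
            except OSError as exc:
                if exc.errno != errno.EIO:
                    raise  # not a media read error; do not guess
        else:
            # Every attempt failed, so the old contents are lost anyway.
            write_block(i, zeros)
            rewritten.append(i)
    return rewritten


def sweep_device(path, block_size=4096, retries=1):
    """Run the sweep over a device node or image file (hypothetical helper).

    A trailing partial block is ignored. Reads go through the page cache,
    so a real tool would probably want O_DIRECT and aligned buffers.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        num_blocks = os.lseek(fd, 0, os.SEEK_END) // block_size

        def read_block(i):
            return os.pread(fd, block_size, i * block_size)

        def write_block(i, data):
            os.pwrite(fd, data, i * block_size)

        result = sweep_and_rewrite(read_block, write_block, num_blocks,
                                   block_size, retries)
        os.fsync(fd)
        return result
    finally:
        os.close(fd)
```

Unlike the -n test, this writes nothing to blocks that read back fine, so
the cost is close to a single read pass over the device.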

						- Ted




* Re: SD/MMC cards: how crappy they are?
  2008-12-02 16:56   ` Theodore Tso
@ 2008-12-02 17:59     ` H. Peter Anvin
  2008-12-04 10:32       ` Pavel Machek
  2008-12-26 22:39     ` Pavel Machek
  1 sibling, 1 reply; 10+ messages in thread
From: H. Peter Anvin @ 2008-12-02 17:59 UTC (permalink / raw)
  To: Theodore Tso, H. Peter Anvin, Pavel Machek, kernel list

Theodore Tso wrote:
> 
> If this is your random eject out from your HP laptop problem, note
> that random ejects while the card is writing can cause corruption of
> the flash translation layer (FTL), which for some really crappy cards,
> can permanently damage them; hopefully most of those are gone from the
> market, but I wouldn't be positive about that.  The better ones will
> have some kind of journalling scheme for their FTL...
> 

I have seen flash cards die permanently from having a partition table it
didn't like written to it.  Yes, the microcontroller on the flash card
tried to interpret the partition table, assumed to be MS-DOS style, and
would crash.

	-hpa


* Re: SD/MMC cards: how crappy they are?
  2008-12-02 17:59     ` H. Peter Anvin
@ 2008-12-04 10:32       ` Pavel Machek
  2008-12-04 19:03         ` H. Peter Anvin
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Machek @ 2008-12-04 10:32 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Theodore Tso, kernel list

On Tue 2008-12-02 09:59:42, H. Peter Anvin wrote:
> Theodore Tso wrote:
> > 
> > If this is your random eject out from your HP laptop problem, note
> > that random ejects while the card is writing can cause corruption of
> > the flash translation layer (FTL), which for some really crappy cards,
> > can permanently damage them; hopefully most of those are gone from the
> > market, but I wouldn't be positive about that.  The better ones will
> > have some kind of journalling scheme for their FTL...
> > 
> 
> I have seen flash cards die permanently from having a partition table it
> didn't like written to it.  Yes, the microcontroller on the flash card
> tried to interpret the partition table, assumed to be MS-DOS style, and
> would crash.

Aha... that explains why I killed a few flash cards by tar czvf /dev/sdX files
... hopefully that's fixed in the better/bigger cards now.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: SD/MMC cards: how crappy they are?
  2008-12-04 10:32       ` Pavel Machek
@ 2008-12-04 19:03         ` H. Peter Anvin
  2008-12-26 21:46           ` Pavel Machek
  0 siblings, 1 reply; 10+ messages in thread
From: H. Peter Anvin @ 2008-12-04 19:03 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Theodore Tso, kernel list

Pavel Machek wrote:
>>>
>> I have seen flash cards die permanently from having a partition table it
>> didn't like written to it.  Yes, the microcontroller on the flash card
>> tried to interpret the partition table, assumed to be MS-DOS style, and
>> would crash.
> 
> Aha... that explains why I killed a few flash cards by tar czvf /dev/sdX files
> ... hopefully that's fixed in the better/bigger cards now.
> 

Also had a batch of cards which would silently "correct" the partition
table for you to align the partitions to its flash erase blocks.

	-hpa


* Re: SD/MMC cards: how crappy they are?
  2008-12-04 19:03         ` H. Peter Anvin
@ 2008-12-26 21:46           ` Pavel Machek
  2008-12-26 21:49             ` H. Peter Anvin
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Machek @ 2008-12-26 21:46 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Theodore Tso, kernel list

> Pavel Machek wrote:
> >>>
> >> I have seen flash cards die permanently from having a partition table it
> >> didn't like written to it.  Yes, the microcontroller on the flash card
> >> tried to interpret the partition table, assumed to be MS-DOS style, and
> >> would crash.
> > 
> > Aha... that explains why I killed a few flash cards by tar czvf /dev/sdX files
> > ... hopefully that's fixed in the better/bigger cards now.
> > 
> 
> Also had a batch of cards which would silently "correct" the partition
> table for you to align the partitions to its flash erase blocks.

Can you mention the manufacturer/model? Silent data corruption is a
nasty thing....
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: SD/MMC cards: how crappy they are?
  2008-12-26 21:46           ` Pavel Machek
@ 2008-12-26 21:49             ` H. Peter Anvin
  0 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2008-12-26 21:49 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Theodore Tso, kernel list

Pavel Machek wrote:
>> Also had a batch of cards which would silently "correct" the partition
>> table for you to align the partitions to its flash erase blocks.
> 
> Can you mention the manufacturer/model? Silent data corruption is a
> nasty thing....

I would, if I remembered.  It was a few years ago.  All I can remember
now is that it wasn't one of the well-known brands like SanDisk or PQI.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: SD/MMC cards: how crappy they are?
  2008-12-02 16:56   ` Theodore Tso
  2008-12-02 17:59     ` H. Peter Anvin
@ 2008-12-26 22:39     ` Pavel Machek
  2008-12-27  1:03       ` Ben Pfaff
  1 sibling, 1 reply; 10+ messages in thread
From: Pavel Machek @ 2008-12-26 22:39 UTC (permalink / raw)
  To: Theodore Tso, H. Peter Anvin, kernel list

Hi!

> If this is your random eject out from your HP laptop problem, note
> that random ejects while the card is writing can cause corruption of
> the flash translation layer (FTL), which for some really crappy cards,
> can permanently damage them; hopefully most of those are gone from the
> market, but I wouldn't be positive about that.  The better ones will
> have some kind of journalling scheme for their FTL...
> 
> Fsck does have a force rewrite option, although it's not the default.
> You have to answer "n" to ignore error, and then yes to "force
> rewrite".  I should perhaps change that; my worry at the time was a
> transient read error tricking e2fsck into blowing away the contents of
> what was actually a good sector.  Of course, that will only help

Yes, I think that should be changed. Transient read errors are not
common, while bad blocks fixed by rewrite are quite common.
> 
> Badblocks -n will fix the problem, since it will do a non-destructive
> read/write test over the entire disk.  Patches to add a
> forced-rewrite mode to the standard r/o badblocks sweep (so we only
> write to a sector that has a read error) would be gratefully accepted.

badblocks -n took more than 8 hours on 32GB flash, so no, that's not usable. I
started digging into badblocks (please take a look/apply following
documentation updates, I only understood some stuff when reading the
source)... And I wish I'd known about SIGALARM before.
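
For anyone unfamiliar with the pattern, here is a small Python sketch of a
long-running scan that prints its position when it receives SIGALRM. It
illustrates the general technique only and is not how badblocks is written;
the Progress class, install_progress_handler and scan are hypothetical
names, and the sketch assumes a Unix system with the handler installed from
the main thread.

```python
import signal


class Progress:
    """Shared counter that a signal handler can report on demand."""

    def __init__(self, total):
        self.total = total
        self.done = 0
        self.reports = []

    def report(self, signum=None, frame=None):
        # Python runs signal handlers in the main thread between
        # bytecodes, so printing here is safe (unlike in a C handler).
        line = f"{self.done}/{self.total} blocks checked"
        self.reports.append(line)
        print(line, flush=True)


def install_progress_handler(progress):
    """Report progress whenever the process receives SIGALRM."""
    signal.signal(signal.SIGALRM, progress.report)


def scan(total, progress, on_block=None):
    """Stand-in for the real scan loop; on_block is a hook for testing."""
    for i in range(total):
        if on_block is not None:
            on_block(i)
        progress.done = i + 1
```

From another terminal, something like "kill -ALRM <pid>" would then make
the running process print one progress line without interrupting the scan.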

Question: does badblocks expect the media to hold a valid ext2/3/4
filesystem? It seems so...

								Pavel

Binary files e2fsprogs-1.41.3-clean/misc/badblocks and e2fsprogs-1.41.3/misc/badblocks differ
diff -ur e2fsprogs-1.41.3-clean/misc/badblocks.8 e2fsprogs-1.41.3/misc/badblocks.8
--- e2fsprogs-1.41.3-clean/misc/badblocks.8	2008-12-26 23:08:55.000000000 +0100
+++ e2fsprogs-1.41.3/misc/badblocks.8	2008-12-26 23:18:56.000000000 +0100
@@ -173,6 +173,10 @@
 read-only test is done.  This option must not be combined with the 
 .B \-w
 option, as they are mutually exclusive.
+
+This will read the block to be tested, then overwrite it with few different
+patterns, then write old data back. If something goes very wrong during the
+test (powerfail?) it may still damage the data.
 .TP
 .B \-s
 Show the progress of the scan by writing out the block numbers as they
@@ -211,6 +215,10 @@
 bad blocks. Therefore it is recommended to use it only when one wants
 to know if there are any bad blocks at all on the device, and not when
 the list of bad blocks is wanted.
+
+You can send SIGALARM to make badblocks report its progress. You can
+send SIGTERM to make badblocks terminate; it will catch the signal, clean
+up and exit.
 .SH AUTHOR
 .B badblocks
 was written by Remy Card <Remy.Card@linux.org>.  Current maintainer is

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: SD/MMC cards: how crappy they are?
  2008-12-26 22:39     ` Pavel Machek
@ 2008-12-27  1:03       ` Ben Pfaff
  0 siblings, 0 replies; 10+ messages in thread
From: Ben Pfaff @ 2008-12-27  1:03 UTC (permalink / raw)
  To: linux-kernel

Pavel Machek <pavel@suse.cz> writes:

> @@ -211,6 +215,10 @@
>  bad blocks. Therefore it is recommended to use it only when one wants
>  to know if there are any bad blocks at all on the device, and not when
>  the list of bad blocks is wanted.
> +
> +You can send SIGALARM to make badblocks report its progress. You can
> +send SIGTERM to make badblocks terminate; it will catch the signal, clean
> +up and exit.

s/SIGALARM/SIGALRM/
-- 
Ben Pfaff 
http://benpfaff.org


