linux-mtd.lists.infradead.org archive mirror
* UBIFS Corruption
@ 2011-07-08 13:12 Reginald Perrin
From: Reginald Perrin @ 2011-07-08 13:12 UTC
  To: MTD Mailing List

Hi folks,

We're using UBIFS in an embedded uClinux system (based on ADI's Blackfin BF524).
It has been working great for us for a while.  The kernel is 2.6.34.7 (uClinux).

However, we just saw 2 corruptions within the past 48h that we can't explain.
We've been doing the same basic operations (in terms of flashing/reading/writing
images) for quite some time, and have reflashed our units many times (across
thousands of different hardware units).

Device #1 failure:
* Device was running out of a partition mounted at /home (a 93MB partition on
a 128MB NAND device)
* Our app was running normally and then locked up (not sure why).  Our code may
have been updating a sqlite database located in that partition
* When we power cycled, the partition showed the corruption in the boot log below.

Device #1 boot log:
UBI device number 1, total 750 LEBs (96768000 bytes, 92.3 MiB), available 0 LEBs (0 bytes), LEB size 129024 bytes (126.0 KiB)
[ 5.228000] UBIFS: recovery needed
[ 5.320000] UBIFS error (pid 363): ubifs_scanned_corruption: corruption at LEB 172:45056
[ 5.348000] UBIFS error (pid 363): ubifs_recover_leb: LEB 172 scanning failed
mount: mounting ubi1:home on /home failed: Structure needs cleaning
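
As a quick sanity check on the numbers in this log, here is a minimal Python
sketch. The LEB count, LEB size and corruption offset are copied from the log;
the 2048-byte page size is an assumption (the post does not state the NAND page
size), and the helper names are made up for illustration.

```python
# Values copied from the Device #1 boot log.
TOTAL_LEBS = 750
LEB_SIZE = 129024        # bytes per logical eraseblock
CORRUPT_OFFSET = 45056   # byte offset reported inside LEB 172

# Assumption, not stated in the post: 2048-byte NAND pages.
ASSUMED_PAGE_SIZE = 2048


def volume_bytes(lebs, leb_size):
    """Total capacity of a UBI device made of `lebs` eraseblocks."""
    return lebs * leb_size


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / (1024 * 1024)


def page_of(offset, page_size):
    """Return (page index, byte offset within that page)."""
    return divmod(offset, page_size)


total = volume_bytes(TOTAL_LEBS, LEB_SIZE)
print(total, round(to_mib(total), 1))               # 96768000 92.3
print(page_of(CORRUPT_OFFSET, ASSUMED_PAGE_SIZE))   # (22, 0)
```

The totals match what UBI printed. Under the assumed page size, offset 45056 is
exactly the start of page 22 of the LEB, so the reported corruption would begin
on a page boundary. That is only an observation about the arithmetic, not a
diagnosis.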

Device #2 failure:
* Device was running normally.
* We upgraded our application (which involved updating executables on that 
partition)
* After the successful upgrade, we powered the unit down and stored it
* Days later, we powered up the device and got the invalid CRC error shown in
the boot log below

Device #2 boot log:
UBI device number 1, total 750 LEBs (96768000 bytes, 92.3 MiB), available 0 LEBs (0 bytes), LEB size 129024 bytes (126.0 KiB)
[ 5.488000] UBIFS: recovery needed
[ 5.492000] UBIFS error (pid 365): check_lpt_crc: invalid crc in LPT node: crc a0 calc 9013
mount: mounting ubi1:home on /home failed: Invalid argument
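
Since the two units fail in different UBIFS functions, it may help to pull the
failing function and message out of each boot log when comparing devices. A
minimal Python sketch follows; the regular expression is written against the
message format seen in the two logs here and assumes each kernel message sits
on a single line. The function name `ubifs_errors` is hypothetical.

```python
import re

# Matches kernel lines of the form seen in the logs in this post, e.g.
#   [ 5.492000] UBIFS error (pid 365): check_lpt_crc: invalid crc in LPT node: ...
UBIFS_ERROR = re.compile(
    r"UBIFS error \(pid (?P<pid>\d+)\): (?P<func>\w+): (?P<msg>.*)"
)


def ubifs_errors(log_text):
    """Return (pid, function, message) for each UBIFS error line in log_text."""
    return [
        (int(m.group("pid")), m.group("func"), m.group("msg").strip())
        for m in UBIFS_ERROR.finditer(log_text)
    ]


# Device #2 boot log, with each message on one line.
sample = """\
[ 5.488000] UBIFS: recovery needed
[ 5.492000] UBIFS error (pid 365): check_lpt_crc: invalid crc in LPT node: crc a0 calc 9013
mount: mounting ubi1:home on /home failed: Invalid argument
"""

for pid, func, msg in ubifs_errors(sample):
    print(pid, func, msg)
```

Run over a saved console capture from each unit, this gives one tuple per
UBIFS error, which makes it easier to see whether new failures resemble
Device #1 (LEB scanning) or Device #2 (LPT CRC).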


So, what is concerning is the sheer randomness of these failures.  In neither
case were we doing anything new (vs. the standard operations we have been
performing for over a year on many devices per day).  There is also no further
logging available, because this *never* happens: once we got UBIFS working, we
never needed to enable the debug output in the driver.  To make matters worse,
if you ask me to reproduce this, I don't know any way of doing it.  We have
automated tests that run continually, and they never see these issues.

One corruption could be written off as a fluke, but 2 happening within 48h is 
very unusual.  

Can anybody give me any insight into this?  

TIA
RP


