* Does modern UBI/UBIFS still suffer from the 'unstable bits issue'?
From: Tim Harvey @ 2018-03-01 16:15 UTC
To: Richard Weinberger, Artem Bityutskiy, Adrian Hunter
Cc: linux-mtd, Koen Vandeputte, Scott Bowman

Greetings,

I have a user with an IMX6 and raw NAND using UBI/UBIFS who has been
able to reproduce a NAND corruption:

[ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 631
[ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
[ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
[ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
[ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes

The kernel they are using is a bit out of date but does have the
'gpmi-nand: Handle ECC Errors in erased pages' [1] patch.

I'm wondering if the 'unstable bits issue' [2] is still an issue, or if
the UBI/UBIFS documentation is out of date and this has been resolved.
If it has been resolved, can anyone point me to the patches?

Regards,

Tim

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bd2e778c9ee361c23ccb2b10591712e129d97893
[2] http://www.linux-mtd.infradead.org/doc/ubifs.html#L_unstable_bits
* Re: Does modern UBI/UBIFS still suffer from the 'unstable bits issue'?
From: Richard Weinberger @ 2018-03-01 16:32 UTC
To: Tim Harvey
Cc: Artem Bityutskiy, Adrian Hunter, linux-mtd, Koen Vandeputte, Scott Bowman

Tim,

On Thursday, 1 March 2018, 17:15:44 CET, Tim Harvey wrote:
> Greetings,
>
> I have a user with an IMX6 and raw NAND using UBI/UBIFS who has been
> able to reproduce a NAND corruption:

What does your user do to reproduce this?

> [ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 631
> [ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
> [ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
> [ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
> [ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes
>
> The kernel they are using is a bit out of date but does have the
> 'gpmi-nand: Handle ECC Errors in erased pages' [1] patch.
>
> I'm wondering if the 'unstable bits issue' [2] is still an issue, or if
> the UBI/UBIFS documentation is out of date and this has been resolved.
> If it has been resolved, can anyone point me to the patches?

This issue is highly theoretical and I never actually saw it in the wild.
Every single time someone claimed to suffer from it, it turned out to be
something else. For that reason, UBI/UBIFS currently has no countermeasure.
This reminds me that we have to update the website...

So did you verify (with your NAND vendor) that this really is the named issue?

Thanks,
//richard
* Re: Does modern UBI/UBIFS still suffer from the 'unstable bits issue'?
From: Tim Harvey @ 2018-03-02 1:19 UTC
To: Richard Weinberger
Cc: Artem Bityutskiy, Adrian Hunter, linux-mtd, Koen Vandeputte, Scott Bowman

On Thu, Mar 1, 2018 at 8:32 AM, Richard Weinberger <richard@nod.at> wrote:
> Tim,
>
> On Thursday, 1 March 2018, 17:15:44 CET, Tim Harvey wrote:
>> Greetings,
>>
>> I have a user with an IMX6 and raw NAND using UBI/UBIFS who has been
>> able to reproduce a NAND corruption:
>
> What does your user do to reproduce this?

Richard,

It's unclear at the moment. It's one of those 'this happened twice on
two different boards' reports without a lot of detail. However, I do
know they write to the filesystem on every boot and encounter
random power cuts.

>> [ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 631
>> [ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
>> [ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
>> [ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
>> [ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes
>>
>> The kernel they are using is a bit out of date but does have the
>> 'gpmi-nand: Handle ECC Errors in erased pages' [1] patch.
>>
>> I'm wondering if the 'unstable bits issue' [2] is still an issue, or if
>> the UBI/UBIFS documentation is out of date and this has been resolved.
>> If it has been resolved, can anyone point me to the patches?
>
> This issue is highly theoretical and I never actually saw it in the wild.
> Every single time someone claimed to suffer from it, it turned out to be
> something else. For that reason, UBI/UBIFS currently has no countermeasure.
> This reminds me that we have to update the website...
>
> So did you verify (with your NAND vendor) that this really is the named issue?

I have no idea if what the user reported is the unstable bits issue,
but the fact that you've never seen it occur in the wild tells me
probably not.

They are using a rather old kernel (4.4, but with a patch to gpmi-nand
backported from 4.7). I will set up a controlled test with random
power cuts in a test fixture I have, to see if I can get it to recur
on a) the old kernel and then b) the current kernel.

Thanks for the feedback!

Tim
* Re: Does modern UBI/UBIFS still suffer from the 'unstable bits issue'?
From: Richard Weinberger @ 2018-03-02 10:07 UTC
To: Tim Harvey
Cc: Artem Bityutskiy, Adrian Hunter, linux-mtd, Koen Vandeputte, Scott Bowman

Tim,

On Friday, 2 March 2018, 02:19:54 CET, Tim Harvey wrote:
> On Thu, Mar 1, 2018 at 8:32 AM, Richard Weinberger <richard@nod.at> wrote:
> > Tim,
> >
> > On Thursday, 1 March 2018, 17:15:44 CET, Tim Harvey wrote:
> >> Greetings,
> >>
> >> I have a user with an IMX6 and raw NAND using UBI/UBIFS who has been
> >> able to reproduce a NAND corruption:
> >
> > What does your user do to reproduce this?
>
> Richard,
>
> It's unclear at the moment. It's one of those 'this happened twice on
> two different boards' reports without a lot of detail. However, I do
> know they write to the filesystem on every boot and encounter
> random power cuts.
>
> >> [ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 631
> >> [ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
> >> [ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
> >> [ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
> >> [ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes

BTW: I'm missing a backtrace here. How did you obtain those messages?

> >> The kernel they are using is a bit out of date but does have the
> >> 'gpmi-nand: Handle ECC Errors in erased pages' [1] patch.
> >>
> >> I'm wondering if the 'unstable bits issue' [2] is still an issue, or if
> >> the UBI/UBIFS documentation is out of date and this has been resolved.
> >> If it has been resolved, can anyone point me to the patches?
> >
> > This issue is highly theoretical and I never actually saw it in the wild.
> > Every single time someone claimed to suffer from it, it turned out to be
> > something else. For that reason, UBI/UBIFS currently has no countermeasure.
> > This reminds me that we have to update the website...
> >
> > So did you verify (with your NAND vendor) that this really is the named
> > issue?
>
> I have no idea if what the user reported is the unstable bits issue,
> but the fact that you've never seen it occur in the wild tells me
> probably not.

I'd be surprised, but you never know. :-)

Just to be sure, this is SLC NAND, right?

> They are using a rather old kernel (4.4, but with a patch to gpmi-nand
> backported from 4.7). I will set up a controlled test with random
> power cuts in a test fixture I have, to see if I can get it to recur
> on a) the old kernel and then b) the current kernel.

Thanks,
//richard
* Re: Does modern UBI/UBIFS still suffer from the 'unstable bits issue'?
From: Tim Harvey @ 2018-03-02 16:20 UTC
To: Richard Weinberger
Cc: Artem Bityutskiy, Adrian Hunter, linux-mtd, Koen Vandeputte, Scott Bowman

On Fri, Mar 2, 2018 at 2:07 AM, Richard Weinberger <richard@nod.at> wrote:
> Tim,
>
> On Friday, 2 March 2018, 02:19:54 CET, Tim Harvey wrote:
>> On Thu, Mar 1, 2018 at 8:32 AM, Richard Weinberger <richard@nod.at> wrote:
>> > Tim,
>> >
>> > On Thursday, 1 March 2018, 17:15:44 CET, Tim Harvey wrote:
>> >> Greetings,
>> >>
>> >> I have a user with an IMX6 and raw NAND using UBI/UBIFS who has been
>> >> able to reproduce a NAND corruption:
>> >
>> > What does your user do to reproduce this?
>>
>> Richard,
>>
>> It's unclear at the moment. It's one of those 'this happened twice on
>> two different boards' reports without a lot of detail. However, I do
>> know they write to the filesystem on every boot and encounter
>> random power cuts.
>>
>> >> [ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 631
>> >> [ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
>> >> [ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
>> >> [ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
>> >> [ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes
>
> BTW: I'm missing a backtrace here. How did you obtain those messages?
> [ 10.528272] Buffer I/O error on dev mtdblock0, logical block 0, async page read [ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 631 [ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes [ 10.715425] CPU: 2 PID: 629 Comm: block Not tainted 4.4.0 #6 [ 10.721087] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 10.727619] Backtrace: [ 10.730108] [<8001e674>] (dump_backtrace) from [<8001e86c>] (show_stack+0x18/0x1c) [ 10.737679] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 [ 10.743406] [<8001e854>] (show_stack) from [<80232028>] (dump_stack+0x84/0xa4) [ 10.750649] [<80231fa4>] (dump_stack) from [<8030a9c4>] (ubi_io_read+0x1dc/0x2b0) [ 10.758132] r5:bf206000 r4:ffffffb6 [ 10.761744] [<8030a7e8>] (ubi_io_read) from [<80308974>] (ubi_eba_read_leb+0x27c/0x388) [ 10.769748] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 r6:bf206000 r5:bf206000 [ 10.777646] r4:0003e000 [ 10.780204] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] (ubi_leb_read+0x74/0xc4) [ 10.788120] r10:c0d81000 r9:00000002 r8:00000002 r7:00000000 r6:bf206000 r5:be913c00 [ 10.796020] r4:0003e000 [ 10.798579] [<803077f0>] (ubi_leb_read) from [<801da34c>] (ubifs_leb_read+0x34/0x98) [ 10.806322] r10:be55eec0 r9:00000002 r8:00000000 r7:00000002 r6:0003e000 r5:bf1cb000 [ 10.814221] r4:be560180 [ 10.816779] [<801da318>] (ubifs_leb_read) from [<801e19c8>] (ubifs_start_scan+0x7c/0xf8) [ 10.824869] r8:00000002 r7:c0d81000 r6:00000000 r5:bf1cb000 r4:be560180 [ 10.831648] [<801e194c>] 
(ubifs_start_scan) from [<801e1ccc>] (ubifs_scan+0x2c/0x330) [ 10.839477] r8:00000003 r7:0003e000 r6:c0d81000 r5:00000000 r4:bf1cb000 [ 10.846252] [<801e1ca0>] (ubifs_scan) from [<801e0e38>] (ubifs_read_master+0xb4/0x924) [ 10.854169] r10:be55eec0 r9:000000a0 r8:00000003 r7:00002000 r6:be560300 r5:be560180 [ 10.862069] r4:bf1cb000 [ 10.864622] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] (ubifs_mount+0xa7c/0x156c) [ 10.872798] r10:be55eec0 r9:000000a0 r8:bf1cb87c r7:be538000 r6:00000000 r5:bf1cb000 [ 10.880697] r4:bf2da140 [ 10.883255] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) [ 10.890390] r10:be55e000 r9:806f2078 r8:00000000 r7:806f2078 r6:806f2078 r5:00000000 [ 10.898291] r4:801d7848 [ 10.900849] [<80101190>] (mount_fs) from [<80119054>] (vfs_kern_mount+0x50/0x108) [ 10.908332] r6:be55e180 r5:00000000 r4:bf1a6cc0 [ 10.913002] [<80119004>] (vfs_kern_mount) from [<8011c354>] (do_mount+0x9d8/0xb70) [ 10.920572] r9:806f2078 r8:be55e180 r7:7eeabe14 r6:00000400 r5:806dbc6c r4:00000008 [ 10.928391] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) [ 10.935354] r10:00000000 r9:be4f6000 r8:00000400 r7:7eeabe14 r6:be55e180 r5:be55e000 [ 10.943253] r4:00000000 [ 10.945811] [<8011c6b0>] (SyS_mount) from [<80009bc0>] (ret_fast_syscall+0x0/0x3c) [ 10.953380] r8:80009d84 r7:00000015 r6:00027014 r5:7eeabe14 r4:00000000 [ 10.984081] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.007847] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.031492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.055202] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes [ 11.066358] CPU: 2 PID: 629 Comm: block Not tainted 4.4.0 #6 [ 11.072020] Hardware 
name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 11.078549] Backtrace: [ 11.081034] [<8001e674>] (dump_backtrace) from [<8001e86c>] (show_stack+0x18/0x1c) [ 11.088606] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 [ 11.094334] [<8001e854>] (show_stack) from [<80232028>] (dump_stack+0x84/0xa4) [ 11.101575] [<80231fa4>] (dump_stack) from [<8030a9c4>] (ubi_io_read+0x1dc/0x2b0) [ 11.109058] r5:bf206000 r4:ffffffb6 [ 11.112669] [<8030a7e8>] (ubi_io_read) from [<80308974>] (ubi_eba_read_leb+0x27c/0x388) [ 11.120673] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 r6:bf206000 r5:bf206000 [ 11.128571] r4:0003e000 [ 11.131126] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] (ubi_leb_read+0x74/0xc4) [ 11.139042] r10:c0e3e000 r9:c0e3e000 r8:00000002 r7:00000000 r6:bf206000 r5:be913c00 [ 11.146941] r4:0003e000 [ 11.149498] [<803077f0>] (ubi_leb_read) from [<801da34c>] (ubifs_leb_read+0x34/0x98) [ 11.157240] r10:00000002 r9:c0e3e000 r8:00000000 r7:00000002 r6:0003e000 r5:bf1cb000 [ 11.165140] r4:bf1cb000 [ 11.167706] [<801da318>] (ubifs_leb_read) from [<801f1668>] (get_master_node+0x58/0x1f0) [ 11.175796] r8:bf1cb000 r7:00001000 r6:be560300 r5:00000000 r4:bf1cb000 [ 11.182575] [<801f1610>] (get_master_node) from [<801f1b0c>] (ubifs_recover_master_node+0x70/0x2f4) [ 11.191620] r10:be55eec0 r9:000000a0 r8:00000003 r7:00001000 r6:be560300 r5:00000000 [ 11.199520] r4:bf1cb000 [ 11.202077] [<801f1a9c>] (ubifs_recover_master_node) from [<801e0f28>] (ubifs_read_master+0x1a4/0x924) [ 11.211383] r7:00002000 r6:be560300 r5:ffffff8b r4:bf1cb000 [ 11.217106] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] (ubifs_mount+0xa7c/0x156c) [ 11.225282] r10:be55eec0 r9:000000a0 r8:bf1cb87c r7:be538000 r6:00000000 r5:bf1cb000 [ 11.233180] r4:bf2da140 [ 11.235735] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) [ 11.242870] r10:be55e000 r9:806f2078 r8:00000000 r7:806f2078 r6:806f2078 r5:00000000 [ 11.250769] r4:801d7848 [ 11.253326] [<80101190>] (mount_fs) from 
[<80119054>] (vfs_kern_mount+0x50/0x108) [ 11.260808] r6:be55e180 r5:00000000 r4:bf1a6cc0 [ 11.265478] [<80119004>] (vfs_kern_mount) from [<8011c354>] (do_mount+0x9d8/0xb70) [ 11.273047] r9:806f2078 r8:be55e180 r7:7eeabe14 r6:00000400 r5:806dbc6c r4:00000008 [ 11.280865] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) [ 11.287827] r10:00000000 r9:be4f6000 r8:00000400 r7:7eeabe14 r6:be55e180 r5:be55e000 [ 11.295727] r4:00000000 [ 11.298284] [<8011c6b0>] (SyS_mount) from [<80009bc0>] (ret_fast_syscall+0x0/0x3c) [ 11.305853] r8:80009d84 r7:00000015 r6:00027014 r5:7eeabe14 r4:00000000 [ 11.313088] UBIFS error (ubi0:2 pid 629): ubifs_recover_master_node: failed to recover master node [ 11.322071] UBIFS error (ubi0:2 pid 629): ubifs_recover_master_node: dumping first master node [ 11.330686] magic 0x6101831 [ 11.334361] crc 0xd0feaa12 [ 11.338113] node_type 7 (master node) [ 11.342310] group_type 0 (no node group) [ 11.346668] sqnum 272796 [ 11.350071] len 512 [ 11.353226] highest_inum 3500 [ 11.356456] commit number 8967 [ 11.359686] flags 0x3 [ 11.362840] log_lnum 3 [ 11.365809] root_lnum 461 [ 11.368950] root_offs 74096 [ 11.372276] root_len 128 [ 11.375418] gc_lnum 460 [ 11.378559] ihead_lnum 461 [ 11.381701] ihead_offs 77824 [ 11.385026] index_size 210120 [ 11.388429] lpt_lnum 10 [ 11.391483] lpt_offs 94430 [ 11.394809] nhead_lnum 10 [ 11.397865] nhead_offs 98304 [ 11.401180] ltab_lnum 10 [ 11.404246] ltab_offs 94208 [ 11.407561] lsave_lnum 0 [ 11.410529] lsave_offs 0 [ 11.413508] lscan_lnum 460 [ 11.416650] leb_cnt 7820 [ 11.419878] empty_lebs 7705 [ 11.423118] idx_lebs 10 [ 11.426174] total_free 1957130240 [ 11.429925] total_dirty 6846968 [ 11.433425] total_used 18161984 [ 11.437001] total_dead 88160 [ 11.440317] total_dark 63299584 [ 11.443952] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" stops [ 11.451984] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 634 [ 11.474373] ubi0 warning: ubi_io_read: error -74 (ECC error) while 
reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.497488] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.520558] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.543655] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes [ 11.554809] CPU: 1 PID: 626 Comm: mount_root Not tainted 4.4.0 #6 [ 11.560905] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 11.567435] Backtrace: [ 11.569917] [<8001e674>] (dump_backtrace) from [<8001e86c>] (show_stack+0x18/0x1c) [ 11.577489] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 [ 11.583216] [<8001e854>] (show_stack) from [<80232028>] (dump_stack+0x84/0xa4) [ 11.590455] [<80231fa4>] (dump_stack) from [<8030a9c4>] (ubi_io_read+0x1dc/0x2b0) [ 11.597938] r5:bf206000 r4:ffffffb6 [ 11.601549] [<8030a7e8>] (ubi_io_read) from [<80308974>] (ubi_eba_read_leb+0x27c/0x388) [ 11.609552] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 r6:bf206000 r5:bf206000 [ 11.617453] r4:0003e000 [ 11.620008] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] (ubi_leb_read+0x74/0xc4) [ 11.627924] r10:c0e7d000 r9:00000002 r8:00000002 r7:00000000 r6:bf206000 r5:be913c00 [ 11.635823] r4:0003e000 [ 11.638377] [<803077f0>] (ubi_leb_read) from [<801da34c>] (ubifs_leb_read+0x34/0x98) [ 11.646120] r10:bf17d680 r9:00000002 r8:00000000 r7:00000002 r6:0003e000 r5:bf210000 [ 11.654019] r4:bf17d240 [ 11.656576] [<801da318>] (ubifs_leb_read) from [<801e19c8>] (ubifs_start_scan+0x7c/0xf8) [ 11.664666] r8:00000002 r7:c0e7d000 r6:00000000 r5:bf210000 r4:bf17d240 [ 11.671442] [<801e194c>] (ubifs_start_scan) from [<801e1ccc>] (ubifs_scan+0x2c/0x330) [ 11.679271] r8:00000003 r7:0003e000 r6:c0e7d000 r5:00000000 r4:bf210000 [ 11.686046] [<801e1ca0>] (ubifs_scan) from [<801e0e38>] (ubifs_read_master+0xb4/0x924) [ 
11.693963] r10:bf17d680 r9:000000a0 r8:00000003 r7:00002000 r6:bf17d440 r5:bf17d240 [ 11.701862] r4:bf210000 [ 11.704415] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] (ubifs_mount+0xa7c/0x156c) [ 11.712592] r10:bf17d680 r9:000000a0 r8:bf21087c r7:be548400 r6:00000000 r5:bf210000 [ 11.720489] r4:bf2d9300 [ 11.723045] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) [ 11.730180] r10:bf17d180 r9:806f2078 r8:00000000 r7:806f2078 r6:806f2078 r5:00000000 [ 11.738081] r4:801d7848 [ 11.740636] [<80101190>] (mount_fs) from [<80119054>] (vfs_kern_mount+0x50/0x108) [ 11.748119] r6:bf17d480 r5:00000000 r4:bf183cc0 [ 11.752788] [<80119004>] (vfs_kern_mount) from [<8011c354>] (do_mount+0x9d8/0xb70) [ 11.760356] r9:806f2078 r8:bf17d480 r7:76edc4c5 r6:00000400 r5:806dbc6c r4:00000008 [ 11.768176] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) [ 11.775137] r10:00000000 r9:be588000 r8:00000400 r7:76edc4c5 r6:bf17d480 r5:bf17d180 [ 11.783036] r4:00000000 [ 11.785591] [<8011c6b0>] (SyS_mount) from [<80009bc0>] (ret_fast_syscall+0x0/0x3c) [ 11.793160] r8:80009d84 r7:00000015 r6:76eece70 r5:76ebd0e0 r4:00000000 [ 11.823668] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.847405] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.871113] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ 11.894761] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes [ 11.905915] CPU: 1 PID: 626 Comm: mount_root Not tainted 4.4.0 #6 [ 11.912010] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 11.918540] Backtrace: [ 11.921022] [<8001e674>] (dump_backtrace) from [<8001e86c>] (show_stack+0x18/0x1c) [ 11.928593] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 [ 
11.934321] [<8001e854>] (show_stack) from [<80232028>] (dump_stack+0x84/0xa4) [ 11.941562] [<80231fa4>] (dump_stack) from [<8030a9c4>] (ubi_io_read+0x1dc/0x2b0) [ 11.949044] r5:bf206000 r4:ffffffb6 [ 11.952657] [<8030a7e8>] (ubi_io_read) from [<80308974>] (ubi_eba_read_leb+0x27c/0x388) [ 11.960660] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 r6:bf206000 r5:bf206000 [ 11.968559] r4:0003e000 [ 11.971114] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] (ubi_leb_read+0x74/0xc4) [ 11.979030] r10:c0f3a000 r9:c0f3a000 r8:00000002 r7:00000000 r6:bf206000 r5:be913c00 [ 11.986929] r4:0003e000 [ 11.989484] [<803077f0>] (ubi_leb_read) from [<801da34c>] (ubifs_leb_read+0x34/0x98) [ 11.997227] r10:00000002 r9:c0f3a000 r8:00000000 r7:00000002 r6:0003e000 r5:bf210000 [ 12.005127] r4:bf210000 [ 12.007691] [<801da318>] (ubifs_leb_read) from [<801f1668>] (get_master_node+0x58/0x1f0) [ 12.015782] r8:bf210000 r7:00001000 r6:bf17d440 r5:00000000 r4:bf210000 [ 12.022560] [<801f1610>] (get_master_node) from [<801f1b0c>] (ubifs_recover_master_node+0x70/0x2f4) [ 12.031606] r10:bf17d680 r9:000000a0 r8:00000003 r7:00001000 r6:bf17d440 r5:00000000 [ 12.039505] r4:bf210000 [ 12.042060] [<801f1a9c>] (ubifs_recover_master_node) from [<801e0f28>] (ubifs_read_master+0x1a4/0x924) [ 12.051366] r7:00002000 r6:bf17d440 r5:ffffff8b r4:bf210000 [ 12.057088] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] (ubifs_mount+0xa7c/0x156c) [ 12.065264] r10:bf17d680 r9:000000a0 r8:bf21087c r7:be548400 r6:00000000 r5:bf210000 [ 12.073164] r4:bf2d9300 [ 12.075721] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) [ 12.082856] r10:bf17d180 r9:806f2078 r8:00000000 r7:806f2078 r6:806f2078 r5:00000000 [ 12.090756] r4:801d7848 [ 12.093311] [<80101190>] (mount_fs) from [<80119054>] (vfs_kern_mount+0x50/0x108) [ 12.100794] r6:bf17d480 r5:00000000 r4:bf183cc0 [ 12.105462] [<80119004>] (vfs_kern_mount) from [<8011c354>] (do_mount+0x9d8/0xb70) [ 12.113031] r9:806f2078 r8:bf17d480 r7:76edc4c5 
r6:00000400 r5:806dbc6c r4:00000008 [ 12.120852] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) [ 12.127814] r10:00000000 r9:be588000 r8:00000400 r7:76edc4c5 r6:bf17d480 r5:bf17d180 [ 12.135713] r4:00000000 [ 12.138269] [<8011c6b0>] (SyS_mount) from [<80009bc0>] (ret_fast_syscall+0x0/0x3c) [ 12.145840] r8:80009d84 r7:00000015 r6:76eece70 r5:76ebd0e0 r4:00000000 [ 12.153108] UBIFS error (ubi0:2 pid 626): ubifs_recover_master_node: failed to recover master node [ 12.162093] UBIFS error (ubi0:2 pid 626): ubifs_recover_master_node: dumping first master node [ 12.170708] magic 0x6101831 [ 12.174389] crc 0xd0feaa12 [ 12.178140] node_type 7 (master node) [ 12.182340] group_type 0 (no node group) [ 12.186699] sqnum 272796 [ 12.190101] len 512 [ 12.193258] highest_inum 3500 [ 12.196492] commit number 8967 [ 12.199722] flags 0x3 [ 12.202877] log_lnum 3 [ 12.205846] root_lnum 461 [ 12.208990] root_offs 74096 [ 12.212319] root_len 128 [ 12.215461] gc_lnum 460 [ 12.218602] ihead_lnum 461 [ 12.221758] ihead_offs 77824 [ 12.225073] index_size 210120 [ 12.228476] lpt_lnum 10 [ 12.231530] lpt_offs 94430 [ 12.234859] nhead_lnum 10 [ 12.237914] nhead_offs 98304 [ 12.241229] ltab_lnum 10 [ 12.244298] ltab_offs 94208 [ 12.247613] lsave_lnum 0 [ 12.250581] lsave_offs 0 [ 12.253562] lscan_lnum 460 [ 12.256705] leb_cnt 7820 [ 12.259933] empty_lebs 7705 [ 12.263174] idx_lebs 10 [ 12.266231] total_free 1957130240 [ 12.269980] total_dirty 6846968 [ 12.273482] total_used 18161984 [ 12.277058] total_dead 88160 [ 12.280374] total_dark 63299584 [ 12.284022] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" stops [ 12.290714] mount_root: failed to mount -t ubifs /dev/ubi0_2 /tmp/overlay: Invalid argument [ 12.303451] blk_update_request: I/O error, dev mtdblock0, sector 0 [ 12.313183] blk_update_request: I/O error, dev mtdblock0, sector 0 [ 12.319374] Buffer I/O error on dev mtdblock0, logical block 0, async page read [ 12.389844] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" 
started, PID 638
[ 12.412129] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
[ 12.435259] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
[ 12.458336] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry
[ 12.482116] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 253952 bytes from PEB 2807:8192, read 253952 bytes
...

>> >> The kernel they are using is a bit out of date but does have the
>> >> 'gpmi-nand: Handle ECC Errors in erased pages' [1] patch.
>> >>
>> >> I'm wondering if the 'unstable bits issue' [2] is still an issue, or if
>> >> the UBI/UBIFS documentation is out of date and this has been resolved.
>> >> If it has been resolved, can anyone point me to the patches?
>> >
>> > This issue is highly theoretical and I never actually saw it in the wild.
>> > Every single time someone claimed to suffer from it, it turned out to be
>> > something else. For that reason, UBI/UBIFS currently has no countermeasure.
>> > This reminds me that we have to update the website...
>> >
>> > So did you verify (with your NAND vendor) that this really is the named
>> > issue?
>>
>> I have no idea if what the user reported is the unstable bits issue,
>> but the fact that you've never seen it occur in the wild tells me
>> probably not.
>
> I'd be surprised, but you never know. :-)
>
> Just to be sure, this is SLC NAND, right?

No, it's an MT29F16G08 16Gbit MLC.

Tim
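
A rough way to triage a dump like the one above is to tally the ubi_io_read failures by PEB and offset. The sketch below is not part of the original thread. Its regular expression is written against the message text as it appears in this log (kernel 4.4) and may need adjusting for other kernels; the function name and the field labels are made up for illustration. It scans the text as a whole rather than line by line, so it also works on a log whose line breaks were lost, but it will miss entries that a mail client re-wrapped with quote markers in the middle.

```python
import re
from collections import Counter

# Matches the ubi_io_read messages quoted in this thread. The group names
# (level, err, length, peb, offset) are labels chosen for this sketch only.
UBI_READ_RE = re.compile(
    r"ubi\d+ (?P<level>warning|error): ubi_io_read: "
    r"error (?P<err>-?\d+) \([^)]*\) while reading "
    r"(?P<length>\d+) bytes from PEB (?P<peb>\d+):(?P<offset>\d+)"
)


def summarize_ubi_read_errors(log_text):
    """Count ubi_io_read failures per (PEB, offset, length, level)."""
    counts = Counter()
    for match in UBI_READ_RE.finditer(log_text):
        key = (
            int(match["peb"]),
            int(match["offset"]),
            int(match["length"]),
            match["level"],
        )
        counts[key] += 1
    return counts


# First retry group from the log in this thread.
SAMPLE = (
    "[ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading "
    "253952 bytes from PEB 2807:8192, read only 253952 bytes, retry\n"
    "[ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading "
    "253952 bytes from PEB 2807:8192, read only 253952 bytes, retry\n"
    "[ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading "
    "253952 bytes from PEB 2807:8192, read only 253952 bytes, retry\n"
    "[ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading "
    "253952 bytes from PEB 2807:8192, read 253952 bytes\n"
)

if __name__ == "__main__":
    summary = summarize_ubi_read_errors(SAMPLE)
    for (peb, offset, length, level), n in sorted(summary.items()):
        # offset + length is the byte position in the PEB where the read ends.
        print(f"PEB {peb} offset {offset} length {length} "
              f"(ends at {offset + length}): {n} x {level}")
```

In the log above, every failure is the same read: 253952 bytes starting at offset 8192 of PEB 2807, so the read ends at byte 8192 + 253952 = 262144 of that PEB. Error -74 is -EBADMSG, which the MTD layer reports for an uncorrectable ECC error. A single PEB failing repeatedly at mount time points at one damaged block rather than a device-wide problem, though that alone does not say what damaged it.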
* Re: Does modern UBI/UBIFS still suffer from the 'unstable bits issue'?
From: Han Xu @ 2018-03-02 17:33 UTC
To: Tim Harvey
Cc: Richard Weinberger, Scott Bowman, linux-mtd, Adrian Hunter, Koen Vandeputte, Artem Bityutskiy

Hi Tim,

I know of one potential issue that may cause a rare UBIFS mount failure.
It only occurs if both a dma_mapping_error and a bitflip happen: the
alternative buffer then fails to swap back to the correct data buffer.
The detailed workflow is as follows:

1. read_page_prepare: direct_dma_map_ok is 0, so the alternative buffer
   path is enabled.
2. gpmi_read_page: page data goes to the allocated DMA buffer (not direct
   mapped).
3. read_page_end: nothing happens, since direct_dma_map_ok is 0.
4. Loop over ECC chunks: STATUS_UNCORRECTABLE is hit and gpmi_erased_check
   starts.
5. gpmi_erased_check: gpmi_read_buf is called, which leads to
   prepare_data_dma, and direct_dma_map_ok goes to 1. This is the
   important part, as direct_dma_map_ok changes.
6. gpmi_erased_check: payload_virt/payload_phys (the allocated DMA buffer)
   is set to 0xFF since the page is erased.
7. read_page_swap_end: direct_dma_map_ok is now 1, so the data from
   payload_virt/payload_phys (the allocated DMA buffer) never makes it
   back to the data buffer; the page data from the previous operation is
   there instead.

This issue was fixed by Markus's patch [1]. For kernel 4.4, you can follow
the same approach: move the read_page_swap_end() call before the ECC
status checking for-loop, and have gpmi_erased_check check buf rather
than payload_virt.

Please let me know if it helps.

[1] http://patchwork.ozlabs.org/patch/614433/

On Fri, Mar 2, 2018 at 10:20 AM, Tim Harvey <tharvey@gateworks.com> wrote:
> On Fri, Mar 2, 2018 at 2:07 AM, Richard Weinberger <richard@nod.at> wrote:
>> Tim,
>>
>> Am Freitag, 2.
März 2018, 02:19:54 CET schrieb Tim Harvey: >>> On Thu, Mar 1, 2018 at 8:32 AM, Richard Weinberger <richard@nod.at> wrote: >>> > Tim, >>> > >>> > Am Donnerstag, 1. März 2018, 17:15:44 CET schrieb Tim Harvey: >>> >> Greetings, >>> >> >>> >> I have a user with an IMX6 and raw NAND using UBI/UBIFS who has been >>> > >>> >> able to reproduce a NAND corruption: >>> > What does your user to reproduce this? >>> >>> Richard, >>> >>> It's unclear at the moment. It's one of those 'this happened twice on >>> two different boards' reports without a lot of detail. However I do >>> know they do write to the filesystem on every boot and do encounter >>> random power-cuts. >>> >>> >> [ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, >>> >> PID 631 [ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) >>> >> while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, >>> >> retry [ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) >>> >> while reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, >>> >> retry [ >>> >> 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading >>> >> 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry [ >>> >> 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while reading >>> >> 253952 bytes from PEB 2807:8192, read 253952 bytes >> >> BTW: I miss a back trace here. How did you obtain that messages? 
>> > > [ 10.528272] Buffer I/O error on dev mtdblock0, logical block 0, > async page read > [ 10.611972] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 631 > [ 10.634365] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 10.657492] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 10.681137] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 10.704267] ubi0 error: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read 253952 bytes > [ 10.715425] CPU: 2 PID: 629 Comm: block Not tainted 4.4.0 #6 > [ 10.721087] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [ 10.727619] Backtrace: > [ 10.730108] [<8001e674>] (dump_backtrace) from [<8001e86c>] > (show_stack+0x18/0x1c) > [ 10.737679] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 > [ 10.743406] [<8001e854>] (show_stack) from [<80232028>] > (dump_stack+0x84/0xa4) > [ 10.750649] [<80231fa4>] (dump_stack) from [<8030a9c4>] > (ubi_io_read+0x1dc/0x2b0) > [ 10.758132] r5:bf206000 r4:ffffffb6 > [ 10.761744] [<8030a7e8>] (ubi_io_read) from [<80308974>] > (ubi_eba_read_leb+0x27c/0x388) > [ 10.769748] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 > r6:bf206000 r5:bf206000 > [ 10.777646] r4:0003e000 > [ 10.780204] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] > (ubi_leb_read+0x74/0xc4) > [ 10.788120] r10:c0d81000 r9:00000002 r8:00000002 r7:00000000 > r6:bf206000 r5:be913c00 > [ 10.796020] r4:0003e000 > [ 10.798579] [<803077f0>] (ubi_leb_read) from [<801da34c>] > (ubifs_leb_read+0x34/0x98) > [ 10.806322] r10:be55eec0 r9:00000002 r8:00000000 r7:00000002 > r6:0003e000 r5:bf1cb000 > [ 10.814221] r4:be560180 > [ 10.816779] [<801da318>] (ubifs_leb_read) from [<801e19c8>] > (ubifs_start_scan+0x7c/0xf8) > [ 10.824869] 
r8:00000002 r7:c0d81000 r6:00000000 r5:bf1cb000 r4:be560180 > [ 10.831648] [<801e194c>] (ubifs_start_scan) from [<801e1ccc>] > (ubifs_scan+0x2c/0x330) > [ 10.839477] r8:00000003 r7:0003e000 r6:c0d81000 r5:00000000 r4:bf1cb000 > [ 10.846252] [<801e1ca0>] (ubifs_scan) from [<801e0e38>] > (ubifs_read_master+0xb4/0x924) > [ 10.854169] r10:be55eec0 r9:000000a0 r8:00000003 r7:00002000 > r6:be560300 r5:be560180 > [ 10.862069] r4:bf1cb000 > [ 10.864622] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] > (ubifs_mount+0xa7c/0x156c) > [ 10.872798] r10:be55eec0 r9:000000a0 r8:bf1cb87c r7:be538000 > r6:00000000 r5:bf1cb000 > [ 10.880697] r4:bf2da140 > [ 10.883255] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) > [ 10.890390] r10:be55e000 r9:806f2078 r8:00000000 r7:806f2078 > r6:806f2078 r5:00000000 > [ 10.898291] r4:801d7848 > [ 10.900849] [<80101190>] (mount_fs) from [<80119054>] > (vfs_kern_mount+0x50/0x108) > [ 10.908332] r6:be55e180 r5:00000000 r4:bf1a6cc0 > [ 10.913002] [<80119004>] (vfs_kern_mount) from [<8011c354>] > (do_mount+0x9d8/0xb70) > [ 10.920572] r9:806f2078 r8:be55e180 r7:7eeabe14 r6:00000400 > r5:806dbc6c r4:00000008 > [ 10.928391] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) > [ 10.935354] r10:00000000 r9:be4f6000 r8:00000400 r7:7eeabe14 > r6:be55e180 r5:be55e000 > [ 10.943253] r4:00000000 > [ 10.945811] [<8011c6b0>] (SyS_mount) from [<80009bc0>] > (ret_fast_syscall+0x0/0x3c) > [ 10.953380] r8:80009d84 r7:00000015 r6:00027014 r5:7eeabe14 r4:00000000 > [ 10.984081] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.007847] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.031492] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.055202] ubi0 error: ubi_io_read: error -74 
(ECC error) while > reading 253952 bytes from PEB 2807:8192, read 253952 bytes > [ 11.066358] CPU: 2 PID: 629 Comm: block Not tainted 4.4.0 #6 > [ 11.072020] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [ 11.078549] Backtrace: > [ 11.081034] [<8001e674>] (dump_backtrace) from [<8001e86c>] > (show_stack+0x18/0x1c) > [ 11.088606] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 > [ 11.094334] [<8001e854>] (show_stack) from [<80232028>] > (dump_stack+0x84/0xa4) > [ 11.101575] [<80231fa4>] (dump_stack) from [<8030a9c4>] > (ubi_io_read+0x1dc/0x2b0) > [ 11.109058] r5:bf206000 r4:ffffffb6 > [ 11.112669] [<8030a7e8>] (ubi_io_read) from [<80308974>] > (ubi_eba_read_leb+0x27c/0x388) > [ 11.120673] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 > r6:bf206000 r5:bf206000 > [ 11.128571] r4:0003e000 > [ 11.131126] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] > (ubi_leb_read+0x74/0xc4) > [ 11.139042] r10:c0e3e000 r9:c0e3e000 r8:00000002 r7:00000000 > r6:bf206000 r5:be913c00 > [ 11.146941] r4:0003e000 > [ 11.149498] [<803077f0>] (ubi_leb_read) from [<801da34c>] > (ubifs_leb_read+0x34/0x98) > [ 11.157240] r10:00000002 r9:c0e3e000 r8:00000000 r7:00000002 > r6:0003e000 r5:bf1cb000 > [ 11.165140] r4:bf1cb000 > [ 11.167706] [<801da318>] (ubifs_leb_read) from [<801f1668>] > (get_master_node+0x58/0x1f0) > [ 11.175796] r8:bf1cb000 r7:00001000 r6:be560300 r5:00000000 r4:bf1cb000 > [ 11.182575] [<801f1610>] (get_master_node) from [<801f1b0c>] > (ubifs_recover_master_node+0x70/0x2f4) > [ 11.191620] r10:be55eec0 r9:000000a0 r8:00000003 r7:00001000 > r6:be560300 r5:00000000 > [ 11.199520] r4:bf1cb000 > [ 11.202077] [<801f1a9c>] (ubifs_recover_master_node) from > [<801e0f28>] (ubifs_read_master+0x1a4/0x924) > [ 11.211383] r7:00002000 r6:be560300 r5:ffffff8b r4:bf1cb000 > [ 11.217106] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] > (ubifs_mount+0xa7c/0x156c) > [ 11.225282] r10:be55eec0 r9:000000a0 r8:bf1cb87c r7:be538000 > r6:00000000 r5:bf1cb000 > [ 11.233180] 
r4:bf2da140 > [ 11.235735] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) > [ 11.242870] r10:be55e000 r9:806f2078 r8:00000000 r7:806f2078 > r6:806f2078 r5:00000000 > [ 11.250769] r4:801d7848 > [ 11.253326] [<80101190>] (mount_fs) from [<80119054>] > (vfs_kern_mount+0x50/0x108) > [ 11.260808] r6:be55e180 r5:00000000 r4:bf1a6cc0 > [ 11.265478] [<80119004>] (vfs_kern_mount) from [<8011c354>] > (do_mount+0x9d8/0xb70) > [ 11.273047] r9:806f2078 r8:be55e180 r7:7eeabe14 r6:00000400 > r5:806dbc6c r4:00000008 > [ 11.280865] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) > [ 11.287827] r10:00000000 r9:be4f6000 r8:00000400 r7:7eeabe14 > r6:be55e180 r5:be55e000 > [ 11.295727] r4:00000000 > [ 11.298284] [<8011c6b0>] (SyS_mount) from [<80009bc0>] > (ret_fast_syscall+0x0/0x3c) > [ 11.305853] r8:80009d84 r7:00000015 r6:00027014 r5:7eeabe14 r4:00000000 > [ 11.313088] UBIFS error (ubi0:2 pid 629): > ubifs_recover_master_node: failed to recover master node > [ 11.322071] UBIFS error (ubi0:2 pid 629): > ubifs_recover_master_node: dumping first master node > [ 11.330686] magic 0x6101831 > [ 11.334361] crc 0xd0feaa12 > [ 11.338113] node_type 7 (master node) > [ 11.342310] group_type 0 (no node group) > [ 11.346668] sqnum 272796 > [ 11.350071] len 512 > [ 11.353226] highest_inum 3500 > [ 11.356456] commit number 8967 > [ 11.359686] flags 0x3 > [ 11.362840] log_lnum 3 > [ 11.365809] root_lnum 461 > [ 11.368950] root_offs 74096 > [ 11.372276] root_len 128 > [ 11.375418] gc_lnum 460 > [ 11.378559] ihead_lnum 461 > [ 11.381701] ihead_offs 77824 > [ 11.385026] index_size 210120 > [ 11.388429] lpt_lnum 10 > [ 11.391483] lpt_offs 94430 > [ 11.394809] nhead_lnum 10 > [ 11.397865] nhead_offs 98304 > [ 11.401180] ltab_lnum 10 > [ 11.404246] ltab_offs 94208 > [ 11.407561] lsave_lnum 0 > [ 11.410529] lsave_offs 0 > [ 11.413508] lscan_lnum 460 > [ 11.416650] leb_cnt 7820 > [ 11.419878] empty_lebs 7705 > [ 11.423118] idx_lebs 10 > [ 11.426174] total_free 1957130240 
> [ 11.429925] total_dirty 6846968 > [ 11.433425] total_used 18161984 > [ 11.437001] total_dead 88160 > [ 11.440317] total_dark 63299584 > [ 11.443952] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" stops > [ 11.451984] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 634 > [ 11.474373] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.497488] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.520558] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.543655] ubi0 error: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read 253952 bytes > [ 11.554809] CPU: 1 PID: 626 Comm: mount_root Not tainted 4.4.0 #6 > [ 11.560905] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [ 11.567435] Backtrace: > [ 11.569917] [<8001e674>] (dump_backtrace) from [<8001e86c>] > (show_stack+0x18/0x1c) > [ 11.577489] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 > [ 11.583216] [<8001e854>] (show_stack) from [<80232028>] > (dump_stack+0x84/0xa4) > [ 11.590455] [<80231fa4>] (dump_stack) from [<8030a9c4>] > (ubi_io_read+0x1dc/0x2b0) > [ 11.597938] r5:bf206000 r4:ffffffb6 > [ 11.601549] [<8030a7e8>] (ubi_io_read) from [<80308974>] > (ubi_eba_read_leb+0x27c/0x388) > [ 11.609552] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 > r6:bf206000 r5:bf206000 > [ 11.617453] r4:0003e000 > [ 11.620008] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] > (ubi_leb_read+0x74/0xc4) > [ 11.627924] r10:c0e7d000 r9:00000002 r8:00000002 r7:00000000 > r6:bf206000 r5:be913c00 > [ 11.635823] r4:0003e000 > [ 11.638377] [<803077f0>] (ubi_leb_read) from [<801da34c>] > (ubifs_leb_read+0x34/0x98) > [ 11.646120] r10:bf17d680 r9:00000002 r8:00000000 r7:00000002 > r6:0003e000 r5:bf210000 > [ 11.654019] 
r4:bf17d240 > [ 11.656576] [<801da318>] (ubifs_leb_read) from [<801e19c8>] > (ubifs_start_scan+0x7c/0xf8) > [ 11.664666] r8:00000002 r7:c0e7d000 r6:00000000 r5:bf210000 r4:bf17d240 > [ 11.671442] [<801e194c>] (ubifs_start_scan) from [<801e1ccc>] > (ubifs_scan+0x2c/0x330) > [ 11.679271] r8:00000003 r7:0003e000 r6:c0e7d000 r5:00000000 r4:bf210000 > [ 11.686046] [<801e1ca0>] (ubifs_scan) from [<801e0e38>] > (ubifs_read_master+0xb4/0x924) > [ 11.693963] r10:bf17d680 r9:000000a0 r8:00000003 r7:00002000 > r6:bf17d440 r5:bf17d240 > [ 11.701862] r4:bf210000 > [ 11.704415] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] > (ubifs_mount+0xa7c/0x156c) > [ 11.712592] r10:bf17d680 r9:000000a0 r8:bf21087c r7:be548400 > r6:00000000 r5:bf210000 > [ 11.720489] r4:bf2d9300 > [ 11.723045] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) > [ 11.730180] r10:bf17d180 r9:806f2078 r8:00000000 r7:806f2078 > r6:806f2078 r5:00000000 > [ 11.738081] r4:801d7848 > [ 11.740636] [<80101190>] (mount_fs) from [<80119054>] > (vfs_kern_mount+0x50/0x108) > [ 11.748119] r6:bf17d480 r5:00000000 r4:bf183cc0 > [ 11.752788] [<80119004>] (vfs_kern_mount) from [<8011c354>] > (do_mount+0x9d8/0xb70) > [ 11.760356] r9:806f2078 r8:bf17d480 r7:76edc4c5 r6:00000400 > r5:806dbc6c r4:00000008 > [ 11.768176] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) > [ 11.775137] r10:00000000 r9:be588000 r8:00000400 r7:76edc4c5 > r6:bf17d480 r5:bf17d180 > [ 11.783036] r4:00000000 > [ 11.785591] [<8011c6b0>] (SyS_mount) from [<80009bc0>] > (ret_fast_syscall+0x0/0x3c) > [ 11.793160] r8:80009d84 r7:00000015 r6:76eece70 r5:76ebd0e0 r4:00000000 > [ 11.823668] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.847405] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.871113] ubi0 warning: ubi_io_read: error -74 (ECC error) while > 
reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 11.894761] ubi0 error: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read 253952 bytes > [ 11.905915] CPU: 1 PID: 626 Comm: mount_root Not tainted 4.4.0 #6 > [ 11.912010] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [ 11.918540] Backtrace: > [ 11.921022] [<8001e674>] (dump_backtrace) from [<8001e86c>] > (show_stack+0x18/0x1c) > [ 11.928593] r7:00000af7 r6:0003e000 r5:60000013 r4:00000000 > [ 11.934321] [<8001e854>] (show_stack) from [<80232028>] > (dump_stack+0x84/0xa4) > [ 11.941562] [<80231fa4>] (dump_stack) from [<8030a9c4>] > (ubi_io_read+0x1dc/0x2b0) > [ 11.949044] r5:bf206000 r4:ffffffb6 > [ 11.952657] [<8030a7e8>] (ubi_io_read) from [<80308974>] > (ubi_eba_read_leb+0x27c/0x388) > [ 11.960660] r10:be913c00 r9:00000000 r8:00000000 r7:00000002 > r6:bf206000 r5:bf206000 > [ 11.968559] r4:0003e000 > [ 11.971114] [<803086f8>] (ubi_eba_read_leb) from [<80307864>] > (ubi_leb_read+0x74/0xc4) > [ 11.979030] r10:c0f3a000 r9:c0f3a000 r8:00000002 r7:00000000 > r6:bf206000 r5:be913c00 > [ 11.986929] r4:0003e000 > [ 11.989484] [<803077f0>] (ubi_leb_read) from [<801da34c>] > (ubifs_leb_read+0x34/0x98) > [ 11.997227] r10:00000002 r9:c0f3a000 r8:00000000 r7:00000002 > r6:0003e000 r5:bf210000 > [ 12.005127] r4:bf210000 > [ 12.007691] [<801da318>] (ubifs_leb_read) from [<801f1668>] > (get_master_node+0x58/0x1f0) > [ 12.015782] r8:bf210000 r7:00001000 r6:bf17d440 r5:00000000 r4:bf210000 > [ 12.022560] [<801f1610>] (get_master_node) from [<801f1b0c>] > (ubifs_recover_master_node+0x70/0x2f4) > [ 12.031606] r10:bf17d680 r9:000000a0 r8:00000003 r7:00001000 > r6:bf17d440 r5:00000000 > [ 12.039505] r4:bf210000 > [ 12.042060] [<801f1a9c>] (ubifs_recover_master_node) from > [<801e0f28>] (ubifs_read_master+0x1a4/0x924) > [ 12.051366] r7:00002000 r6:bf17d440 r5:ffffff8b r4:bf210000 > [ 12.057088] [<801e0d84>] (ubifs_read_master) from [<801d82c4>] > 
(ubifs_mount+0xa7c/0x156c) > [ 12.065264] r10:bf17d680 r9:000000a0 r8:bf21087c r7:be548400 > r6:00000000 r5:bf210000 > [ 12.073164] r4:bf2d9300 > [ 12.075721] [<801d7848>] (ubifs_mount) from [<801011ac>] (mount_fs+0x1c/0xa0) > [ 12.082856] r10:bf17d180 r9:806f2078 r8:00000000 r7:806f2078 > r6:806f2078 r5:00000000 > [ 12.090756] r4:801d7848 > [ 12.093311] [<80101190>] (mount_fs) from [<80119054>] > (vfs_kern_mount+0x50/0x108) > [ 12.100794] r6:bf17d480 r5:00000000 r4:bf183cc0 > [ 12.105462] [<80119004>] (vfs_kern_mount) from [<8011c354>] > (do_mount+0x9d8/0xb70) > [ 12.113031] r9:806f2078 r8:bf17d480 r7:76edc4c5 r6:00000400 > r5:806dbc6c r4:00000008 > [ 12.120852] [<8011b97c>] (do_mount) from [<8011c72c>] (SyS_mount+0x7c/0xa8) > [ 12.127814] r10:00000000 r9:be588000 r8:00000400 r7:76edc4c5 > r6:bf17d480 r5:bf17d180 > [ 12.135713] r4:00000000 > [ 12.138269] [<8011c6b0>] (SyS_mount) from [<80009bc0>] > (ret_fast_syscall+0x0/0x3c) > [ 12.145840] r8:80009d84 r7:00000015 r6:76eece70 r5:76ebd0e0 r4:00000000 > [ 12.153108] UBIFS error (ubi0:2 pid 626): > ubifs_recover_master_node: failed to recover master node > [ 12.162093] UBIFS error (ubi0:2 pid 626): > ubifs_recover_master_node: dumping first master node > [ 12.170708] magic 0x6101831 > [ 12.174389] crc 0xd0feaa12 > [ 12.178140] node_type 7 (master node) > [ 12.182340] group_type 0 (no node group) > [ 12.186699] sqnum 272796 > [ 12.190101] len 512 > [ 12.193258] highest_inum 3500 > [ 12.196492] commit number 8967 > [ 12.199722] flags 0x3 > [ 12.202877] log_lnum 3 > [ 12.205846] root_lnum 461 > [ 12.208990] root_offs 74096 > [ 12.212319] root_len 128 > [ 12.215461] gc_lnum 460 > [ 12.218602] ihead_lnum 461 > [ 12.221758] ihead_offs 77824 > [ 12.225073] index_size 210120 > [ 12.228476] lpt_lnum 10 > [ 12.231530] lpt_offs 94430 > [ 12.234859] nhead_lnum 10 > [ 12.237914] nhead_offs 98304 > [ 12.241229] ltab_lnum 10 > [ 12.244298] ltab_offs 94208 > [ 12.247613] lsave_lnum 0 > [ 12.250581] lsave_offs 0 > [ 12.253562] 
lscan_lnum 460 > [ 12.256705] leb_cnt 7820 > [ 12.259933] empty_lebs 7705 > [ 12.263174] idx_lebs 10 > [ 12.266231] total_free 1957130240 > [ 12.269980] total_dirty 6846968 > [ 12.273482] total_used 18161984 > [ 12.277058] total_dead 88160 > [ 12.280374] total_dark 63299584 > [ 12.284022] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" stops > [ 12.290714] mount_root: failed to mount -t ubifs /dev/ubi0_2 > /tmp/overlay: Invalid argument > [ 12.303451] blk_update_request: I/O error, dev mtdblock0, sector 0 > [ 12.313183] blk_update_request: I/O error, dev mtdblock0, sector 0 > [ 12.319374] Buffer I/O error on dev mtdblock0, logical block 0, > async page read > [ 12.389844] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 638 > [ 12.412129] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 12.435259] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 12.458336] ubi0 warning: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read only 253952 bytes, retry > [ 12.482116] ubi0 error: ubi_io_read: error -74 (ECC error) while > reading 253952 bytes from PEB 2807:8192, read 253952 bytes > ... > >>> >> The kernel they are using is a bit out of date but does have >>> >> 'gpmi-nand: Handle ECC Errors in erased pages' [1] patch >>> >> >>> >> I'm wondering if the 'unstable bits issue' [2] is still an issue or if >>> >> the UBI/UBFS Documentation is out of date and this has been resolved. >>> >> If it has been resolved, can anyone point me to the patches. >>> > >>> > This issue is highly theoretical and I never actually saw it in the wild. >>> > Every single time someone claimed to suffer from that, it turned out to be >>> > something else. Currently UBI/UBIFS has no counter measurement, for the >>> > said reasons. 
>>> > This reminds me that we have to update the website... >>> > >>> > So did you verify (with your NAND vendor) that this really is the named >>> > issue? >>> I have no idea if what the user reported is the unstable bits issue >>> but the fact you've never seen it occur in the wild tells me probably >>> not. >> >> I'd be surprised, but you never know. :-) >> >> Just to be sure, this is SLC NAND, right? > > No, its a MT29F16G08 16GB MLC > > Tim > > ______________________________________________________ > Linux MTD discussion mailing list > http://lists.infradead.org/mailman/listinfo/linux-mtd/ -- Sincerely, Han XU ^ permalink raw reply [flat|nested] 8+ messages in thread
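The stale-buffer sequence described in Han Xu's message above can be modelled in a few lines. This is only a toy sketch in Python, not the gpmi-nand driver code: the names direct_dma_map_ok and payload_virt are borrowed from the message, while read_erased_page, PAGE_SIZE, ERASED_PAGE and the fixed flag are made up for illustration. It assumes exactly the scenario from the message: the DMA mapping fails, ECC reports an uncorrectable chunk, and the page turns out to be erased.

```python
PAGE_SIZE = 16                          # illustrative size, not a real NAND page
ERASED_PAGE = bytes([0xFF]) * PAGE_SIZE


def read_erased_page(caller_buf, fixed):
    """Toy model of the read path; `fixed` selects the reordered swap-back."""
    payload_virt = bytearray(PAGE_SIZE)  # driver-owned DMA bounce buffer

    # Step 1: the direct DMA mapping fails, so the bounce buffer is used.
    direct_dma_map_ok = False

    # Step 2: the raw page (erased, with one flipped bit) lands in payload_virt.
    payload_virt[:] = ERASED_PAGE
    payload_virt[3] = 0xFE

    if fixed:
        # Assumed shape of the fix: swap back BEFORE the ECC status loop,
        # while the flag still describes this read, then check the caller's
        # buffer instead of payload_virt.
        if not direct_dma_map_ok:
            caller_buf[:] = payload_virt
        check_buf = caller_buf
    else:
        check_buf = payload_virt

    # Steps 4-5: the uncorrectable status triggers the erased check, whose
    # own buffer read maps successfully and flips the shared flag.
    direct_dma_map_ok = True

    # Step 6: the page is judged erased and the checked buffer is set to 0xFF.
    check_buf[:] = ERASED_PAGE

    if not fixed:
        # Step 7: the late swap-back is skipped because the flag is now set.
        if not direct_dma_map_ok:
            caller_buf[:] = payload_virt

    return caller_buf
```

With fixed=False the caller's buffer keeps whatever the previous operation left in it, which is the symptom described; with fixed=True it receives the 0xFF-filled page.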
* Re: Does modern UBI/UBIFS still suffer from the 'unstable bits issue'? 2018-03-02 16:20 ` Tim Harvey 2018-03-02 17:33 ` Han Xu @ 2018-03-03 10:40 ` Richard Weinberger 2018-03-05 17:05 ` Tim Harvey 1 sibling, 1 reply; 8+ messages in thread From: Richard Weinberger @ 2018-03-03 10:40 UTC (permalink / raw) To: Tim Harvey Cc: Artem Bityutskiy, Adrian Hunter, linux-mtd, Koen Vandeputte, Scott Bowman, Boris Brezillon

Tim,

On Friday, 2 March 2018, 17:20:57 CET, Tim Harvey wrote:
> > Just to be sure, this is SLC NAND, right?
>
> No, its a MT29F16G08 16GB MLC

Sorry, MLC NAND is not supported by UBI and UBIFS [0].

The ECC errors you are facing are most likely caused by paired pages. On MLC NAND, pages come in pairs. If a write operation is interrupted, not only is the current page corrupted, as on SLC, but the already written paired page is lost too. Boris Brezillon and I spent a lot of time addressing this problem but came to no good solution in the end. Well, we had a solution, but it needs a lot of testing and fine-tuning, and sadly we ran out of budget. Besides that, read and write disturb are also important factors; these can be addressed with the experimental ubihealthd.

Thanks,
//richard

[0] http://linux-mtd.infradead.org/doc/ubifs.html#L_ubifs_mlc
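The paired-page effect Richard describes can be illustrated with a small toy model in Python. The pairing layout here is made up: real pairing schemes are vendor-specific and come from the NAND datasheet. PAIR_DISTANCE, lower_partner and pages_lost_on_power_cut are hypothetical names, and the model only captures the statement above that an interrupted write can also destroy the already written paired page.

```python
PAIR_DISTANCE = 4  # hypothetical: pages n and n + 4 share the same cells


def lower_partner(page, distance=PAIR_DISTANCE):
    """Return the earlier-written page paired with `page`, or None.

    Assumed layout: within each group of 2 * distance pages, the first half
    are "lower" pages and the second half are their "upper" partners.
    """
    offset_in_group = page % (2 * distance)
    if offset_in_group >= distance:
        return page - distance
    return None


def pages_lost_on_power_cut(interrupted_page, mlc):
    """Set of pages left corrupted when a write to one page is cut short."""
    lost = {interrupted_page}
    if mlc:
        partner = lower_partner(interrupted_page)
        if partner is not None:
            # The partner was written earlier and may hold committed data.
            lost.add(partner)
    return lost
```

In this model, interrupting a write to page 5 costs only page 5 on SLC, but pages 1 and 5 on MLC. Losing page 1 is the dangerous part, because the filesystem already considers that data safely written.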
* Re: Does modern UBI/UBIFS still suffer from the 'unstable bits issue'? 2018-03-03 10:40 ` Richard Weinberger @ 2018-03-05 17:05 ` Tim Harvey 0 siblings, 0 replies; 8+ messages in thread From: Tim Harvey @ 2018-03-05 17:05 UTC (permalink / raw) To: Richard Weinberger, Han Xu Cc: Artem Bityutskiy, Adrian Hunter, linux-mtd, Koen Vandeputte, Scott Bowman, Boris Brezillon

On Sat, Mar 3, 2018 at 2:40 AM, Richard Weinberger <richard@nod.at> wrote:
> Tim,
>
> On Friday, 2 March 2018, 17:20:57 CET, Tim Harvey wrote:
>> > Just to be sure, this is SLC NAND, right?
>>
>> No, its a MT29F16G08 16GB MLC
>
> Sorry, MLC NAND is not supported by UBI and UBIFS [0].
>
> The ECC errors you are facing are most likely caused by paired pages.
> On MLC NAND, pages come in pairs. If a write operation is interrupted, not
> only is the current page corrupted, as on SLC, but the already written
> paired page is lost too.
> Boris Brezillon and I spent a lot of time addressing this problem but came
> to no good solution in the end.
> Well, we had a solution, but it needs a lot of testing and fine-tuning, and
> sadly we ran out of budget.
> Besides that, read and write disturb are also important factors; these can
> be addressed with the experimental ubihealthd.
>

Richard,

My mistake: the part being used here is the MT29F2G08ABAEAH4, which is SLC, not MLC. So perhaps we are running into the issue Han Xu pointed out.

Regards,

Tim