public inbox for linux-mtd@lists.infradead.org
From: Rickard x Andersson <rickaran@axis.com>
To: Zhihao Cheng <chengzhihao1@huawei.com>,
	richard@nod.at, linux-mtd@lists.infradead.org
Cc: rickard314.andersson@gmail.com
Subject: Re: Lots of fastmap writes
Date: Mon, 17 Jun 2024 13:20:49 +0200
Message-ID: <0a53be8a-6d21-0ae3-a013-e1155140c889@axis.com>
In-Reply-To: <a8bf10e3-009f-3d4b-fa4b-43bbc6e1bebc@huawei.com>

On 6/14/24 14:28, Zhihao Cheng wrote:
> On 2024/6/14 19:42, Rickard X Andersson wrote:
>> On 6/4/24 03:52, Zhihao Cheng wrote:
> 
> [...]
>>>
>>> BTW, after applying the patches, the kernel should run on a new 
>>> flash, the improved wear-leveling algorithm cannot rescue the worn 
>>> out image.
>>>
>>
>> Thanks for the patches!
>>
>> I have backported the patches to Linux kernel 6.1. Do you think the 
>> patches are safe to apply to Linux kernel 6.1?
> 
> Yes, it's okay. I have backported the patches to our product(kernel 
> v5.10) and it works fine.

Thanks! I backported the patches to Linux 6.1 and ran my own stress
test for a few days (on another device with fresh flash memory). With
the patches applied, the fastmap physical blocks (0-63) wear a lot
less, which is good.

However, after almost 3 days of stress testing I hit the problem below
(the file system was switched to read-only mode):


[ 7885.036577][  T182] ubi2: scrubbed PEB 2904 (LEB 0:229), data moved to PEB 627
[83721.724621][  T182] ubi2: scrubbed PEB 983 (LEB 0:3240), data moved to PEB 7
[83721.832521][  T182] ubi2: scrubbed PEB 997 (LEB 0:2819), data moved to PEB 5
[83784.750714][  T182] ubi2: scrubbed PEB 1927 (LEB 0:10), data moved to PEB 2
[165812.657934][  T182] ubi2: scrubbed PEB 3691 (LEB 0:11), data moved to PEB 18
[166748.055242][  T182] ubi2: scrubbed PEB 3045 (LEB 0:2), data moved to PEB 837
[166834.742451][  T182] ubi2: scrubbed PEB 918 (LEB 0:2), data moved to PEB 43
[239986.496840][T31387] UBIFS error (ubi2:0 pid 31387): ubifs_scan: corrupt empty space at LEB 3519:101376
[239986.506809][T31387] UBIFS error (ubi2:0 pid 31387): ubifs_scanned_corruption: corruption at LEB 3519:101376
[239986.519742][T31387] UBIFS error (ubi2:0 pid 31387): ubifs_scanned_corruption: first 8192 bytes from LEB 3519:101376
[239986.532052][T31387] 00000000: fffffffe ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
[239986.532230][T31387] 00000020: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
[239986.532450][T31387] 00000040: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
[239986.532607][T31387] 00000060: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
[239986.532732][T31387] 00000080: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................

...

[239986.603283][T31387] 00001000: fffffffe ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
[239986.603667][T31387] 00001020: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................

...

[239986.707743][T31387] 00001fe0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................


[239986.707894][T31387] UBIFS error (ubi2:0 pid 31387): ubifs_scan: LEB 3519 scanning failed
[239986.724625][T31387] UBIFS error (ubi2:0 pid 31387): do_commit: commit failed, error -117
[239986.734335][T31387] UBIFS warning (ubi2:0 pid 31387): ubifs_ro_mode.part.0: switched to read-only mode, error -117
[239986.748276][T31387] CPU: 0 PID: 31387 Comm: sync Kdump: loaded Not tainted 6.1.55-axis9-devel #1
[239986.757327][T31387] Hardware name: Freescale i.MX6 SoloX (Device Tree)
[239986.764095][T31387]  unwind_backtrace from show_stack+0x18/0x1c
[239986.770208][T31387]  show_stack from dump_stack_lvl+0x24/0x2c
[239986.776215][T31387]  dump_stack_lvl from do_commit+0xc0/0x528
[239986.782167][T31387]  do_commit from ubifs_sync_fs+0x84/0x98
[239986.787991][T31387]  ubifs_sync_fs from iterate_supers+0x9c/0x118
[239986.794268][T31387]  iterate_supers from ksys_sync+0x54/0x8c
[239986.800175][T31387]  ksys_sync from sys_sync+0x10/0x18
[239986.805492][T31387]  sys_sync from ret_fast_syscall+0x0/0x64
[239986.811394][T31387] Exception stack(0xc81b5fa8 to 0xc81b5ff0)
[239986.817314][T31387] 5fa0:                   00000072 be8b5d44 00000001 be8b5d44 00000000 004e5299
[239986.826423][T31387] 5fc0: 00000072 be8b5d44 00000000 00000024 004a12cd b6f74ce8 00000000 004f806c
[239986.835530][T31387] 5fe0: 004f8f14 be8b5bac 004e529f b6ef4e58

Is the above error something you have seen before?
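
To make the "corrupt empty space" report easier to read, here is a small
sketch that scans the hex-dump rows shown above for 32-bit words that are
not fully erased (0xffffffff). It only covers the rows that appear in this
mail; the rows elided with "..." are not included, so it says nothing about
them. The helper name non_erased_words is made up for this illustration and
is not part of any UBI/UBIFS tool.

```python
import re

# Hex-dump rows copied from the log above, without the timestamp prefix and
# the ASCII column. Rows elided with "..." in the mail are NOT reproduced.
ERASED = "ffffffff " * 8
DUMP = "\n".join([
    "00000000: fffffffe " + "ffffffff " * 7,
    "00000020: " + ERASED,
    "00000040: " + ERASED,
    "00000060: " + ERASED,
    "00000080: " + ERASED,
    "00001000: fffffffe " + "ffffffff " * 7,
    "00001020: " + ERASED,
    "00001fe0: " + ERASED,
])


def non_erased_words(dump):
    """Return (offset, word, cleared_bits) for every word != 0xffffffff.

    offset is the byte offset inside the dumped region, taken from the
    row address plus 4 bytes per word position in the row.
    """
    hits = []
    for line in dump.splitlines():
        m = re.match(r"([0-9a-f]{8}):\s+(.*)", line)
        if not m:
            continue
        base = int(m.group(1), 16)
        for i, token in enumerate(m.group(2).split()):
            word = int(token, 16)
            if word != 0xFFFFFFFF:
                # Bits that read as 0 where erased flash should read as 1.
                cleared = bin(word ^ 0xFFFFFFFF).count("1")
                hits.append((base + 4 * i, word, cleared))
    return hits


hits = non_erased_words(DUMP)
for offset, word, cleared in hits:
    print(f"offset 0x{offset:05x}: 0x{word:08x}, {cleared} bit(s) cleared")
```

For the rows shown, this reports two words, at dump offsets 0x0 and 0x1000
(4096 bytes apart), each with a single bit cleared.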

>>
>> Another thing, would it not be possible to rescue that particular worn 
>> out device by simply turning fastmap off on that device?
>>
> 
> Can I regard the rescuing as making erase counters become normal 
> again(max - min <= UBI_WL_THRESHOLD)? If so, I'm afraid that not all 
> PEBs can be rescued, according to get_peb_for_wl().
> For example: PB, PC cannot be rescued, unless PA is taken for writing 
> and then wl is just right scheduled.
> 
> ubi->free tree:
>       29600(PB)
> 1(PA)        29600(PC)
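
The "normal again" condition quoted above (max - min <= UBI_WL_THRESHOLD)
can be checked for the example free tree with a few lines. This is only a
sketch of that inequality, not of get_peb_for_wl() or the real
wear-leveling code. The threshold value 4096 is an assumption (the default
of CONFIG_MTD_UBI_WL_THRESHOLD); the device in question may be configured
differently. The function names are made up for this illustration.

```python
# Assumption: default CONFIG_MTD_UBI_WL_THRESHOLD; check the kernel config
# of the actual device.
UBI_WL_THRESHOLD = 4096


def ec_spread(erase_counters):
    """Difference between the highest and lowest erase counter."""
    values = erase_counters.values()
    return max(values) - min(values)


def is_balanced(erase_counters, threshold=UBI_WL_THRESHOLD):
    """True if the spread satisfies max - min <= threshold."""
    return ec_spread(erase_counters) <= threshold


# Erase counters of the free tree from the example above.
free_tree = {"PA": 1, "PB": 29600, "PC": 29600}

print("spread:", ec_spread(free_tree))
print("balanced:", is_balanced(free_tree))
```

With PA at 1 and PB/PC at 29600 the spread is 29599, far above the assumed
threshold, so the condition does not hold for this tree.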

What I mean is that the badly worn device could perhaps be made usable
again by turning off fastmap. Would it not work properly then? I do
understand that the first 64 physical erase blocks would in practice
not be used, since their erase counts are very high. But would the
filesystem not work OK? Or am I missing something?

Thanks for all the help!
Rickard Andersson

