From: Barak Adam <BAdam@adva.com>
To: "linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: UBIFS corruption in empty space during mount
Date: Thu, 29 Oct 2020 04:48:39 +0000
Message-ID: <a60546366eb049698c7a0b5426361833@adva.com>
Hi all,
We are facing a kernel panic in our legacy switches, similar to the one described in the following post:
https://patchwork.ozlabs.org/project/linux-mtd/patch/loom.20120319T102527-948@post.gmane.org/
This corruption happens upon root FS mount and thus triggers a kernel panic upon system init.
System description:
=================
Our system is a legacy one, using a Marvell Cetus SoC with a raw 1 Gbit Micron NAND; the NAND ECC is 8-bit.
We are using UBIFS on Linux 3.10.70.
The NAND driver is "armada-nand" by Marvell (mtd/nand/mvebu_nfc/nand_nfc.c), based on the PXA driver drivers/mtd/nand/pxa3xx_nand.c.
Using a script that power-cycles the system in an endless loop, we get this panic:
========================================================
UBIFS error (pid 1): ubifs_scan: corrupt empty space at LEB 3:7571
UBIFS error (pid 1): ubifs_scanned_corruption: corruption at LEB 3:7571
UBIFS error (pid 1): ubifs_scanned_corruption: first 8192 bytes from LEB 3:7571
UBIFS error (pid 1): ubifs_scan: LEB 3 scanning failed
VFS: Cannot open root device "ubi0:root" or unknown-block(0,0): error -117
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
============================================================
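For triage, the messages above can be reduced to the failing LEB, the offset and the error number. The following is only a minimal sketch for offline log analysis, not something from our system: the helper name parse_ubifs_errors is hypothetical, and the small errno table is hard-coded from the Linux errno definitions (EBADMSG = 74, EUCLEAN = 117) so that the script gives the same answer on any host.

```python
import re

# Linux errno values relevant here (hard-coded, so the host OS does not matter).
LINUX_ERRNO = {74: "EBADMSG", 117: "EUCLEAN"}

# Matches e.g. "UBIFS error (pid 1): ubifs_scan: corrupt empty space at LEB 3:7571"
LEB_RE = re.compile(r"UBIFS error \(pid (\d+)\): (\w+): .*?LEB (\d+)(?::(\d+))?")
# Matches e.g. "error -117" in the VFS message
ERR_RE = re.compile(r"error (-\d+)")


def parse_ubifs_errors(log):
    """Return (events, errnos): UBIFS error locations and decoded error names."""
    events = []
    errnos = []
    for line in log.splitlines():
        m = LEB_RE.search(line)
        if m:
            # The offset is optional ("LEB 3 scanning failed" has none).
            offs = int(m.group(4)) if m.group(4) is not None else None
            events.append({"func": m.group(2), "leb": int(m.group(3)), "offs": offs})
        e = ERR_RE.search(line)
        if e:
            code = -int(e.group(1))
            errnos.append(LINUX_ERRNO.get(code, "errno %d" % code))
    return events, errnos
```

Run on the log above, this should report LEB 3, offset 7571 for the ubifs_scan and ubifs_scanned_corruption messages, and decode the VFS "error -117" as EUCLEAN.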
I did read some of the posts about corruption of empty space for UBIFS.
Most of them recommend applying a fix on the lower layers, mtd or nand drivers.
In the past we had similar issues that happened during recovery of the master node, and I applied the following commits:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=730a43fbc135e593cc3de3b1b895e49c05c8e2dc
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40cbe6eee97b706f27bcc4c6aa1018bbe4f1e577
But now I think it is happening during mount, while UBIFS is replaying the journal, which is a different scenario.
As far as I understand, this is the call stack that now leads to the panic:
[<c0015050>] (unwind_backtrace+0x0/0xf8) from [<c00115f4>] (show_stack+0x10/0x18)
[<c00115f4>] (show_stack+0x10/0x18) from [<c0196634>] (ubifs_scan+0x29c/0x378)
[<c0196634>] (ubifs_scan+0x29c/0x378) from [<c0196aa4>] (ubifs_replay_journal+0x104/0x1380)
[<c0196aa4>] (ubifs_replay_journal+0x104/0x1380) from [<c018caf8>] (ubifs_mount+0xe88/0x15c8)
[<c018caf8>] (ubifs_mount+0xe88/0x15c8) from [<c00a0830>] (mount_fs+0x14/0xc8)
[<c00a0830>] (mount_fs+0x14/0xc8) from [<c00b7620>] (vfs_kern_mount+0x4c/0xc4)
[<c00b7620>] (vfs_kern_mount+0x4c/0xc4) from [<c00b992c>] (do_mount+0x1ac/0x8e8)
[<c00b992c>] (do_mount+0x1ac/0x8e8) from [<c00ba0ec>] (SyS_mount+0x84/0xbc)
[<c00ba0ec>] (SyS_mount+0x84/0xbc) from [<c0674ee0>] (mount_block_root+0x104/0x22c)
[<c0674ee0>] (mount_block_root+0x104/0x22c) from [<c06751a4>] (prepare_namespace+0x90/0x194)
[<c06751a4>] (prepare_namespace+0x90/0x194) from [<c0674bf0>] (kernel_init_freeable+0x180/0x1c8)
[<c0674bf0>] (kernel_init_freeable+0x180/0x1c8) from [<c04de5e8>] (kernel_init+0x8/0x154)
[<c04de5e8>] (kernel_init+0x8/0x154) from [<c000dfd8>] (ret_from_fork+0x14/0x3c)
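Each backtrace line above has the form "callee from caller", so the call path can be recovered by reading the lines bottom-up. The snippet below is a small sketch for doing that mechanically. The function name call_path and the choice to skip the two stack-dump helpers are my own assumptions, not kernel conventions.

```python
import re

# Matches the "(symbol+0xoff/0xsize)" part of an ARM backtrace line.
SYM_RE = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)")


def call_path(backtrace, skip=("unwind_backtrace", "show_stack")):
    """Return function names, outermost caller first, without the dump helpers."""
    frames = []
    for line in backtrace.splitlines():
        names = SYM_RE.findall(line)
        if len(names) != 2:
            continue  # not a "callee from caller" line
        callee, caller = names
        if not frames:
            frames.append(callee)  # innermost frame appears only as a callee
        frames.append(caller)
    return [f for f in reversed(frames) if f not in skip]
```

For the trace above, the returned list should start at ret_from_fork and end with ubifs_mount, ubifs_replay_journal, ubifs_scan, which matches the journal-replay scenario.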
ubifs_scan (fs/ubifs) is called to scan the LEBs.
It detects the corrupted empty space, dumps the corruption messages shown above, and returns the -EUCLEAN error code (the "error -117" in the log), which makes the root mount fail and the kernel panic.
ubifs_scan:
--> calls ubifs_start_scan (fs/ubifs)
--> which calls ubifs_leb_read (fs/ubifs)
--> which calls ubi_read (mtd/ubi.h)
--> which calls ubi_leb_read (mtd/ubi)
ubi_leb_read calls the lower-layer NAND driver functions but finally returns -EBADMSG, which indicates that the MTD driver has detected a data integrity problem (for NAND, an unrecoverable ECC checksum mismatch).
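To judge whether the region flagged as corrupt empty space contains only a few flipped bits or real data, the bytes dumped by ubifs_scanned_corruption can be checked offline. The sketch below is an illustration only and is not the kernel code: erased NAND reads back as 0xFF, so it reports the first byte that differs from 0xFF and counts the bits that read as 0. The function name check_empty_space is hypothetical, and it assumes the hex dump has already been converted to raw bytes.

```python
def check_empty_space(buf):
    """Return (first_bad_offset, bad_bytes, zero_bits) for a region expected to be erased.

    first_bad_offset is None when every byte in buf is 0xFF.
    """
    first_bad = None
    bad_bytes = 0
    zero_bits = 0
    for i, b in enumerate(buf):
        if b != 0xFF:
            if first_bad is None:
                first_bad = i
            bad_bytes += 1
            # Bits set in (b XOR 0xFF) are exactly the bits that read as 0.
            zero_bits += bin(b ^ 0xFF).count("1")
    return first_bad, bad_bytes, zero_bits
```

A handful of zero bits might suggest bit flips in erased pages, while many non-0xFF bytes might suggest a write that was interrupted by the power cut. Either reading is a hypothesis to verify against the NAND driver, not a conclusion.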
I am still debugging, looking for any solution / workaround.
Thanks !
Barak