linux-mtd.lists.infradead.org archive mirror
* UBIFS node corruption issue
@ 2018-06-01  4:25 Sourabh Jain (R&D - Telecom)
  2018-06-01  8:28 ` Richard Weinberger
  0 siblings, 1 reply; 9+ messages in thread
From: Sourabh Jain (R&D - Telecom) @ 2018-06-01  4:25 UTC (permalink / raw)
  To: linux-mtd
  Cc: Shailesh Panchal (Software Development - Telecom),
	Rajdeep Vaghasia (Software Development - Telecom),
	Vamsi S (R&D - Telecom)

Hi all,


We are using the Linux 3.2.54 kernel and are facing an issue related to
UBIFS node corruption. Below is the backtrace captured in the dmesg
output. Due to this node corruption, the application with PID 1374 is
not able to run.


A reboot does not resolve this. The error appears every time the
system starts.


What can be the problem?

How can we resolve this issue?

Please suggest the necessary action to take, and let us know if you
need any further input from us.



[   75.379644] UBIFS error (pid 1374): ubifs_read_node: bad node type (255 but expected 0)
[   75.387718] UBIFS error (pid 1374): ubifs_read_node: bad node at LEB 328:75776, LEB mapping status 0
[   75.396890] Not a node, first 24 bytes:
[   75.396903] 00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ........................
[   75.396917] Backtrace:
[   75.396955] [<84011d0c>] (dump_backtrace+0x0/0x10c) from [<844b51f8>] (dump_stack+0x18/0x1c)
[   75.396966]  r6:00012800 r5:e66ee000 r4:e6c01000 r3:00000018
[   75.397004] [<844b51e0>] (dump_stack+0x0/0x1c) from [<8421a9a8>] (ubifs_read_node+0x268/0x28c)
[   75.397033] [<8421a740>] (ubifs_read_node+0x0/0x28c) from [<84236ac0>] (ubifs_tnc_read_node+0x6c/0x118)
[   75.397055] [<84236a54>] (ubifs_tnc_read_node+0x0/0x118) from [<8421daf4>] (ubifs_tnc_locate+0xc8/0x1a0)
[   75.397066]  r7:e66ee000 r6:e6771d28 r5:00000001 r4:e6c01000
[   75.397091] [<8421da2c>] (ubifs_tnc_locate+0x0/0x1a0) from [<84214e18>] (ubifs_iget+0x70/0x76c)
[   75.397110] [<84214da8>] (ubifs_iget+0x0/0x76c) from [<84212a04>] (ubifs_lookup+0x13c/0x1dc)
[   75.397131] [<842128c8>] (ubifs_lookup+0x0/0x1dc) from [<840df220>] (d_alloc_and_lookup+0x4c/0x68)
[   75.397149] [<840df1d4>] (d_alloc_and_lookup+0x0/0x68) from [<840e0ea4>] (do_lookup+0x1fc/0x320)
[   75.397160]  r6:00000001 r5:e6771e70 r4:00000000 r3:0000031c
[   75.397183] [<840e0ca8>] (do_lookup+0x0/0x320) from [<840e1948>] (path_lookupat+0x100/0x6bc)
[   75.397201] [<840e1848>] (path_lookupat+0x0/0x6bc) from [<840e1f28>] (do_path_lookup+0x24/0x60)
[   75.397219] [<840e1f04>] (do_path_lookup+0x0/0x60) from [<840e2f74>] (user_path_at_empty+0x60/0x90)
[   75.397229]  r7:ffffff9c r6:e6771e70 r5:00000001 r4:e6d05000
[   75.397252] [<840e2f14>] (user_path_at_empty+0x0/0x90) from [<840e2fc0>] (user_path_at+0x1c/0x24)
[   75.397263]  r8:8400df28 r7:000000c3 r6:00000001 r5:e6771f40 r4:7e8e5660
[   75.397295] [<840e2fa4>] (user_path_at+0x0/0x24) from [<840d9680>] (vfs_fstatat+0x3c/0x6c)
[   75.397313] [<840d9644>] (vfs_fstatat+0x0/0x6c) from [<840d96fc>] (vfs_stat+0x24/0x28)
[   75.397339]  r5:000ae25c r4:7e8e5660
[   75.397360] [<840d96d8>] (vfs_stat+0x0/0x28) from [<840d98f0>] (sys_stat64+0x1c/0x38)
[   75.397381] [<840d98d4>] (sys_stat64+0x0/0x38) from [<8400dd80>] (ret_fast_syscall+0x0/0x30)
[   75.397391]  r4:7e8e5710
[   75.397406] UBIFS error (pid 1374): ubifs_iget: failed to read inode 8372, error -22
[   75.405189] UBIFS error (pid 1374): ubifs_lookup: dead directory entry 'dnsmasq', error -22
[   75.413584] UBIFS warning (pid 1374): ubifs_ro_mode: switched to read-only mode, error -22
[   75.421888] Backtrace:
[   75.421912] [<84011d0c>] (dump_backtrace+0x0/0x10c) from [<844b51f8>] (dump_stack+0x18/0x1c)
[   75.421923]  r6:e6698600 r5:e757c3d8 r4:ffffffea r3:846974ec
[   75.421952] [<844b51e0>] (dump_stack+0x0/0x1c) from [<84218b98>] (ubifs_ro_mode+0x6c/0x78)
[   75.421970] [<84218b2c>] (ubifs_ro_mode+0x0/0x78) from [<84212a48>] (ubifs_lookup+0x180/0x1dc)
[   75.421988] [<842128c8>] (ubifs_lookup+0x0/0x1dc) from [<840df220>] (d_alloc_and_lookup+0x4c/0x68)
[   75.422006] [<840df1d4>] (d_alloc_and_lookup+0x0/0x68) from [<840e0ea4>] (do_lookup+0x1fc/0x320)
[   75.422016]  r6:00000001 r5:e6771e70 r4:00000000 r3:0000031c
[   75.422039] [<840e0ca8>] (do_lookup+0x0/0x320) from [<840e1948>] (path_lookupat+0x100/0x6bc)
[   75.422056] [<840e1848>] (path_lookupat+0x0/0x6bc) from [<840e1f28>] (do_path_lookup+0x24/0x60)
[   75.422074] [<840e1f04>] (do_path_lookup+0x0/0x60) from [<840e2f74>] (user_path_at_empty+0x60/0x90)
[   75.422084]  r7:ffffff9c r6:e6771e70 r5:00000001 r4:e6d05000
[   75.422107] [<840e2f14>] (user_path_at_empty+0x0/0x90) from [<840e2fc0>] (user_path_at+0x1c/0x24)
[   75.422118]  r8:8400df28 r7:000000c3 r6:00000001 r5:e6771f40 r4:7e8e5660
[   75.422146] [<840e2fa4>] (user_path_at+0x0/0x24) from [<840d9680>] (vfs_fstatat+0x3c/0x6c)
[   75.422164] [<840d9644>] (vfs_fstatat+0x0/0x6c) from [<840d96fc>] (vfs_stat+0x24/0x28)
[   75.422174]  r5:000ae25c r4:7e8e5660
[   75.422192] [<840d96d8>] (vfs_stat+0x0/0x28) from [<840d98f0>] (sys_stat64+0x1c/0x38)
[   75.422209] [<840d98d4>] (sys_stat64+0x0/0x38) from [<8400dd80>] (ret_fast_syscall+0x0/0x30)
[   75.422219]  r4:7e8e5710


Best Regards,
Sourabh Jain


* Re: UBIFS node corruption issue
  2018-06-01  4:25 Sourabh Jain (R&D - Telecom)
@ 2018-06-01  8:28 ` Richard Weinberger
  2018-06-01 12:23   ` Sourabh Jain (R&D - Telecom)
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Weinberger @ 2018-06-01  8:28 UTC (permalink / raw)
  To: Sourabh Jain (R&D - Telecom)
  Cc: linux-mtd @ lists . infradead . org, Vamsi S (R&D - Telecom),
	Rajdeep Vaghasia (Software Development - Telecom),
	Shailesh Panchal (Software Development - Telecom)

Sourabh Jain,

On Fri, Jun 1, 2018 at 6:25 AM, Sourabh Jain (R&D - Telecom)
<sourabh.jain@matrixcomsec.com> wrote:
> Hi all,
>
>
> We are using 3.2.54 Linux kernel. In this, we are facing one issue

Please note that this kernel is very old.
At least run a recent 3.2 stable kernel.

> related to ubifs node  corruption. Below is the core dump captured in
> dmesg output. Due to this node corruption, the application having pid
> 1374 is not able to run.
>
>
> This is not resolved even after reboot. It is generated every time the
> system starts.
>
>
> What can be the problem?

Well, UBIFS' index tree seems to reference a LEB (or part of a LEB)
which is empty.
This can have tons of reasons, ranging from MTD driver bugs to UBIFS bugs.
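
The "empty LEB" reading can be checked from the log itself: the 24 dumped
bytes are all 0xff, which is what erased NAND reads back as, and the
offset 75776 is 0x12800 (the value also visible in r6 of the first
backtrace). A minimal Python sketch of that check follows. The helper
names are hypothetical, and the regular expressions only assume the
message formats quoted in this thread, with each log record on one line.

```python
import re

# Hypothetical helpers; the patterns match only the dmesg formats quoted in this thread.
BAD_NODE_RE = re.compile(r"bad node at LEB (\d+):(\d+)")
HEX_LINE_RE = re.compile(r"\b[0-9a-f]{8}:((?: [0-9a-f]{2})+)")


def parse_bad_node(log):
    """Return (leb, offset) from a 'bad node at LEB x:y' message, or None."""
    match = BAD_NODE_RE.search(log)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def dumped_bytes(log):
    """Collect the bytes of every hex-dump line found in the log."""
    out = bytearray()
    for match in HEX_LINE_RE.finditer(log):
        out.extend(bytes.fromhex(match.group(1).replace(" ", "")))
    return bytes(out)


def looks_erased(data):
    """True if the dump is non-empty and every byte is 0xff (erased flash)."""
    return len(data) > 0 and all(byte == 0xFF for byte in data)


SAMPLE = (
    "[   75.387718] UBIFS error (pid 1374): ubifs_read_node: bad node at "
    "LEB 328:75776, LEB mapping status 0\n"
    "[   75.396903] 00000000:" + " ff" * 24 + "  " + "." * 24 + "\n"
)

if __name__ == "__main__":
    leb, offset = parse_bad_node(SAMPLE)
    print(f"LEB {leb}, offset {offset} ({offset:#x})")
    print("erased-looking:", looks_erased(dumped_bytes(SAMPLE)))
```

An all-0xff dump only says the index points at unwritten or erased
flash; it does not say which layer lost the data.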

> How we can resolve this issue?
>
> Please suggest us necessary action to be taken. Also, please inform us
> if any further input is required from us.

Are you facing this problem on more than one target, and is it reproducible?

-- 
Thanks,
//richard


* Re: UBIFS node corruption issue
  2018-06-01  8:28 ` Richard Weinberger
@ 2018-06-01 12:23   ` Sourabh Jain (R&D - Telecom)
  2018-06-01 12:49     ` Richard Weinberger
  0 siblings, 1 reply; 9+ messages in thread
From: Sourabh Jain (R&D - Telecom) @ 2018-06-01 12:23 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: linux-mtd @ lists . infradead . org, Vamsi S (R&D - Telecom),
	Rajdeep Vaghasia (Software Development - Telecom),
	Shailesh Panchal (Software Development - Telecom)

Hi Richard,

On Fri, Jun 1, 2018 at 1:58 PM, Richard Weinberger
<richard.weinberger@gmail.com> wrote:
> Sourabh Jain,
>
> On Fri, Jun 1, 2018 at 6:25 AM, Sourabh Jain (R&D - Telecom)
> <sourabh.jain@matrixcomsec.com> wrote:
>> Hi all,
>>
>>
>> We are using 3.2.54 Linux kernel. In this, we are facing one issue
>
> Please note that this kernel is very old.
> At least run a recent 3.2 stable kernel.
>
>> related to ubifs node  corruption. Below is the core dump captured in
>> dmesg output. Due to this node corruption, the application having pid
>> 1374 is not able to run.
>>
>>
>> This is not resolved even after reboot. It is generated every time the
>> system starts.
>>
>>
>> What can be the problem?
>
> Well, UBIFS' index tree seems to reference a LEB (or part of a LEB)
> which is empty.
> This can have tons of reasons. Starting from MTD drivers bugs to UBIFS bugs.

One observation: I have read all the VID headers from the PEBs, and
none of them carries LEB number 328, the LEB that is causing the problem.
Why is there a mapping for LEB number 328 in the index tree?
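
For anyone who wants to repeat this check on a raw dump, here is a
minimal Python sketch that walks every PEB, reads the EC header to find
the VID header offset, and lists the PEBs that claim a given LEB. It
assumes the standard 64-byte big-endian UBI EC and VID header layouts,
and a dump file that contains page data only (no OOB bytes). The function
names and the PEB size argument are hypothetical, and header CRCs are
not verified.

```python
import struct

UBI_EC_MAGIC = b"UBI#"
UBI_VID_MAGIC = b"UBI!"

# EC header: magic, version, pad[3], ec, vid_hdr_offset, data_offset,
# image_seq, pad[32], hdr_crc (64 bytes, big-endian).
EC_HDR = struct.Struct(">4sB3xQIII32xI")
# VID header: magic, version, vol_type, copy_flag, compat, vol_id, lnum,
# pad[4], data_size, used_ebs, data_pad, data_crc, pad[4], sqnum,
# pad[12], hdr_crc (64 bytes, big-endian).
VID_HDR = struct.Struct(">4sBBBBII4xIIII4xQ12xI")


def scan_vid_headers(image, peb_size):
    """Yield (peb, vol_id, lnum, sqnum) for each PEB with a VID header."""
    for peb in range(len(image) // peb_size):
        base = peb * peb_size
        ec_raw = image[base:base + EC_HDR.size]
        if len(ec_raw) < EC_HDR.size or ec_raw[:4] != UBI_EC_MAGIC:
            continue  # no EC header: erased, bad, or non-UBI block
        vid_off = EC_HDR.unpack(ec_raw)[3]
        vid_raw = image[base + vid_off:base + vid_off + VID_HDR.size]
        if len(vid_raw) < VID_HDR.size or vid_raw[:4] != UBI_VID_MAGIC:
            continue  # free PEB: EC header present, no LEB mapped
        fields = VID_HDR.unpack(vid_raw)
        yield peb, fields[5], fields[6], fields[11]


def pebs_for_leb(image, peb_size, vol_id, lnum):
    """Return [(peb, sqnum), ...] for all PEBs claiming (vol_id, lnum)."""
    return [
        (peb, sqnum)
        for peb, vid, leb, sqnum in scan_vid_headers(image, peb_size)
        if vid == vol_id and leb == lnum
    ]
```

An empty result for LEB 328 would match the observation above: no PEB
currently backs the LEB that the index references.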


>> How we can resolve this issue?
>>
>> Please suggest us necessary action to be taken. Also, please inform us
>> if any further input is required from us.
>
> Are you facing this problem on more than one target, and is it reproducible?

We are facing this problem on some boards.
There is also another problem in which the root file system becomes
read-only although it is actually r/w.

Regards,
Sourabh Jain


* Re: UBIFS node corruption issue
  2018-06-01 12:23   ` Sourabh Jain (R&D - Telecom)
@ 2018-06-01 12:49     ` Richard Weinberger
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Weinberger @ 2018-06-01 12:49 UTC (permalink / raw)
  To: Sourabh Jain (R&D - Telecom),
	linux-mtd @ lists . infradead . org
  Cc: Vamsi S (R&D - Telecom),
	Rajdeep Vaghasia (Software Development - Telecom),
	Shailesh Panchal (Software Development - Telecom)

On Friday, 1 June 2018, 14:23:05 CEST, Sourabh Jain (R&D - Telecom) wrote:
> >> What can be the problem?
> >
> > Well, UBIFS' index tree seems to reference a LEB (or part of a LEB)
> > which is empty.
> > This can have tons of reasons. Starting from MTD drivers bugs to UBIFS bugs.
> 
> There is one observation, i have read all the VID header from the PEB
> and there is
> no header having LEB number 328, which causing problem.
> Why there is mapping for LEB Number 328 in index tree?

Depends. Maybe the PEB behind the LEB was erased by mistake.
I have seen such driver bugs.
 
> 
> >> How we can resolve this issue?
> >>
> >> Please suggest us necessary action to be taken. Also, please inform us
> >> if any further input is required from us.
> >
> > Are you facing this problem on more than one target, and is it reproducible?
> 
> We are facing this problem in some boards.
> there is another problem in which the root file system become read
> only but actually it is r/w.

Do you face these issues also on a recent kernel?

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


* Re: Re: UBIFS node corruption issue
       [not found] <d03e8f457821494f951fabac2933bbfe@SIWEX5A.sing.micron.com>
@ 2018-06-11  5:19 ` Sourabh Jain (R&D - Telecom)
  2018-06-11  7:05   ` Richard Weinberger
  2018-06-15  6:27   ` Richard Weinberger
  0 siblings, 2 replies; 9+ messages in thread
From: Sourabh Jain (R&D - Telecom) @ 2018-06-11  5:19 UTC (permalink / raw)
  To: Bean Huo (beanhuo); +Cc: linux-mtd@lists.infradead.org

On Mon, Jun 4, 2018 at 4:39 PM, Bean Huo (beanhuo) <beanhuo@micron.com> wrote:
> Resend seems delivery failed
>
>
>
Hi Bean,
> Eg, what kind of NAND, SLC or MLC, what happened to this NAND before this
> issue?
>
We are using SLC NAND and our system was in the field when the problem occurred.

Hi Richard,
We are a product-based company and have purchased an SDK from a
semiconductor vendor. The vendor has applied many patches for that
particular SoC, so we cannot directly use the latest kernel version.
We are using the Linux 3.2.54 kernel provided to us along with the SDK.
Could you please suggest the most stable kernel version after Linux
3.2.54 that we can use with the least modifications?

Regards,
Sourabh Jain


* Re: Re: UBIFS node corruption issue
  2018-06-11  5:19 ` Re: UBIFS node corruption issue Sourabh Jain (R&D - Telecom)
@ 2018-06-11  7:05   ` Richard Weinberger
  2018-06-15  6:27   ` Richard Weinberger
  1 sibling, 0 replies; 9+ messages in thread
From: Richard Weinberger @ 2018-06-11  7:05 UTC (permalink / raw)
  To: Sourabh Jain (R&D - Telecom)
  Cc: Bean Huo (beanhuo), linux-mtd@lists.infradead.org

On Mon, Jun 11, 2018 at 7:19 AM, Sourabh Jain (R&D - Telecom)
<sourabh.jain@matrixcomsec.com> wrote:
> On Mon, Jun 4, 2018 at 4:39 PM, Bean Huo (beanhuo) <beanhuo@micron.com> wrote:
>> Resend seems delivery failed
>>
>>
>>
> Hi Bean,
>> Eg, what kind of NAND, SLC or MLC, what happened to this NAND before this
>> issue?
>>
> We are using SLC NAND and our system was in the field when the problem occurred.
>
> Hi Richard,
> We are a product based company. We have purchased SDK from a
> semiconductor vendor. There are so many patches from vendor for that
> particular SoC.
> So, we can not directly use latest kernel version.
> We are using Linux 3.2.54 kernel provided to us along with SDK, Could
> you please suggest the most stable kernel version after Linux 3.2.54
> kernel version which we can use with the least modifications?

I strongly suggest using the latest mainline kernel, 4.17, without vendor stuff.
If we are sure that the problem is mainline specific we can start bug hunting.

-- 
Thanks,
//richard


* Re: Re: UBIFS node corruption issue
  2018-06-11  5:19 ` Re: UBIFS node corruption issue Sourabh Jain (R&D - Telecom)
  2018-06-11  7:05   ` Richard Weinberger
@ 2018-06-15  6:27   ` Richard Weinberger
  2018-06-15  7:19     ` Sourabh Jain (R&D - Telecom)
  1 sibling, 1 reply; 9+ messages in thread
From: Richard Weinberger @ 2018-06-15  6:27 UTC (permalink / raw)
  To: Sourabh Jain (R&D - Telecom)
  Cc: Bean Huo (beanhuo), linux-mtd@lists.infradead.org

On Mon, Jun 11, 2018 at 7:19 AM, Sourabh Jain (R&D - Telecom)
<sourabh.jain@matrixcomsec.com> wrote:
> On Mon, Jun 4, 2018 at 4:39 PM, Bean Huo (beanhuo) <beanhuo@micron.com> wrote:
>> Resend seems delivery failed
>>
>>
>>
> Hi Bean,
>> Eg, what kind of NAND, SLC or MLC, what happened to this NAND before this
>> issue?
>>
> We are using SLC NAND and our system was in the field when the problem occurred.
>
> Hi Richard,
> We are a product based company. We have purchased SDK from a
> semiconductor vendor. There are so many patches from vendor for that
> particular SoC.
> So, we can not directly use latest kernel version.
> We are using Linux 3.2.54 kernel provided to us along with SDK, Could
> you please suggest the most stable kernel version after Linux 3.2.54
> kernel version which we can use with the least modifications?

BTW: is your filesystem using xattrs?
(Maybe due to selinux, systemd-journald, ....)

-- 
Thanks,
//richard


* Re: Re: UBIFS node corruption issue
  2018-06-15  6:27   ` Richard Weinberger
@ 2018-06-15  7:19     ` Sourabh Jain (R&D - Telecom)
  2018-06-15  7:25       ` Richard Weinberger
  0 siblings, 1 reply; 9+ messages in thread
From: Sourabh Jain (R&D - Telecom) @ 2018-06-15  7:19 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Bean Huo (beanhuo), linux-mtd@lists.infradead.org

On Fri, Jun 15, 2018 at 11:57 AM, Richard Weinberger
<richard.weinberger@gmail.com> wrote:
> On Mon, Jun 11, 2018 at 7:19 AM, Sourabh Jain (R&D - Telecom)
> <sourabh.jain@matrixcomsec.com> wrote:
>> On Mon, Jun 4, 2018 at 4:39 PM, Bean Huo (beanhuo) <beanhuo@micron.com> wrote:
>>> Resend seems delivery failed
>>>
>>>
>>>
>> Hi Bean,
>>> Eg, what kind of NAND, SLC or MLC, what happened to this NAND before this
>>> issue?
>>>
>> We are using SLC NAND and our system was in the field when the problem occurred.
>>
>> Hi Richard,
>> We are a product based company. We have purchased SDK from a
>> semiconductor vendor. There are so many patches from vendor for that
>> particular SoC.
>> So, we can not directly use latest kernel version.
>> We are using Linux 3.2.54 kernel provided to us along with SDK, Could
>> you please suggest the most stable kernel version after Linux 3.2.54
>> kernel version which we can use with the least modifications?
>
> BTW: is your filesystem using xattrs?
> (Maybe due to selinux, systemd-journald, ....)
Hi Richard,

We have not enabled xattr support for UBIFS in the kernel configuration.

Also, we set up a stress test of UBIFS using the integck utility from
mtd-utils. In that test we found a separate error, "corrupt empty
space" at an index LEB. Because of it, the MTD partition could not be
mounted.
So, in which layer (UBIFS, the UBI driver, the MTD drivers, or the NAND
driver) should we work to resolve this problem?

Regards,
Sourabh

> --
> Thanks,
> //richard


* Re: UBIFS node corruption issue
  2018-06-15  7:19     ` Sourabh Jain (R&D - Telecom)
@ 2018-06-15  7:25       ` Richard Weinberger
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Weinberger @ 2018-06-15  7:25 UTC (permalink / raw)
  To: Sourabh Jain (R&D - Telecom), Bean Huo (beanhuo)
  Cc: linux-mtd@lists.infradead.org

On Friday, 15 June 2018, 09:19:46 CEST, Sourabh Jain (R&D - Telecom) wrote:
> We did not enabled the xattr in the ubifs filesystem in kernel configuration.

Ok.
 
> Also, we had kept a setup with integck utility of mtd-utils to stress
> test ubifs. In this case, we found a separate error of "corrupt empty
> space at some index LEB". Due to this, the mtd partition was not
> mount.
> So, in which portion (UBIFS, UBI Driver, MTD drivers or NAND Driver?)
> we should work to resolve this problem?

At the MTD layer. In recent kernels we made sure that MTD drivers don't
report ECC errors to UBI when an empty page has bitflips.

See gpmi-nand.
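
To illustrate the idea, here is a simplified Python model of an
"erased page with bitflips" check. It is a sketch of the technique, not
the gpmi-nand code: when ECC fails on a chunk, count the bits that read
as 0; if there are no more than a threshold, treat the chunk as erased
and report the flips as corrected instead of returning an ECC error.
The function name and the threshold argument are assumptions made for
illustration. As far as I know, mainline provides a C helper for this
purpose, nand_check_erased_ecc_chunk().

```python
import errno


def check_erased_chunk(data, bitflips_threshold):
    """Model of an erased-chunk check on a chunk whose ECC decode failed.

    Returns the number of bitflips (bits reading 0) when the chunk can be
    treated as erased, or -EBADMSG when there are too many zero bits for
    it to be an erased chunk.
    """
    flips = 0
    for byte in data:
        flips += 8 - bin(byte).count("1")  # zero bits in this byte
        if flips > bitflips_threshold:
            return -errno.EBADMSG  # genuinely corrupted data
    return flips


def fix_erased_chunk(data):
    """An erased chunk is handed upward as all 0xff once it passes the check."""
    return bytes([0xFF]) * len(data)
```

With a check of this kind in the driver, a few flipped bits in an empty
page are not reported to UBI as an ECC error.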

Thanks,
//richard


-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

