public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH] UBI: add ubi_err() to report the failure of leb read
@ 2014-12-16  7:54 hujianyang
  2014-12-16  8:02 ` hujianyang
  2014-12-16  8:58 ` Richard Weinberger
  0 siblings, 2 replies; 8+ messages in thread
From: hujianyang @ 2014-12-16  7:54 UTC (permalink / raw)
  To: Artem Bityutskiy; +Cc: Richard Weinberger, linux-mtd

If an error occur while reading from PEBs, for example, an ECC error,
ubi_io_read() will print some error messages. But it's not enough for
debugging. These messages don't show the mapping info for a read from
UBIFS layer.

Although UBIFS will soon print its error messages after catching the
return value from UBI layer,  multi-path reading will confuse the
relationship between LEBs and PEBs showed by these messages.

This patch adds an ubi_err() to report reading errors in the function
ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by
this error message.

Signed-off-by: hujianyang <hujianyang@huawei.com>
---
 drivers/mtd/ubi/eba.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
index b698534..b4e69e1 100644
--- a/drivers/mtd/ubi/eba.c
+++ b/drivers/mtd/ubi/eba.c
@@ -477,6 +477,8 @@ out_free:
 	ubi_free_vid_hdr(ubi, vid_hdr);
 out_unlock:
 	leb_read_unlock(ubi, vol_id, lnum);
+	ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d",
+		err, len, offset, vol_id, lnum, pnum);
 	return err;
 }

-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] UBI: add ubi_err() to report the failure of leb read
  2014-12-16  7:54 [PATCH] UBI: add ubi_err() to report the failure of leb read hujianyang
@ 2014-12-16  8:02 ` hujianyang
  2014-12-16  9:21   ` Richard Weinberger
  2014-12-16  8:58 ` Richard Weinberger
  1 sibling, 1 reply; 8+ messages in thread
From: hujianyang @ 2014-12-16  8:02 UTC (permalink / raw)
  To: Artem Bityutskiy; +Cc: Richard Weinberger, linux-mtd

On 2014/12/16 15:54, hujianyang wrote:
> If an error occur while reading from PEBs, for example, an ECC error,
> ubi_io_read() will print some error messages. But it's not enough for
> debugging. These messages don't show the mapping info for a read from
> UBIFS layer.
> 
> Although UBIFS will soon print its error messages after catching the
> return value from UBI layer,  multi-path reading will confuse the
> relationship between LEBs and PEBs showed by these messages.
> 
> This patch adds an ubi_err() to report reading errors in the function
> ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by
> this error message.
> 
> Signed-off-by: hujianyang <hujianyang@huawei.com>
> ---
>  drivers/mtd/ubi/eba.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
> index b698534..b4e69e1 100644
> --- a/drivers/mtd/ubi/eba.c
> +++ b/drivers/mtd/ubi/eba.c
> @@ -477,6 +477,8 @@ out_free:
>  	ubi_free_vid_hdr(ubi, vid_hdr);
>  out_unlock:
>  	leb_read_unlock(ubi, vol_id, lnum);
> +	ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d",
> +		err, len, offset, vol_id, lnum, pnum);
>  	return err;
>  }
> 

Hi,

I met a problem that I was failed to mount a UBIFS partition.

[   38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes
[   38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes
[   38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3
[   38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280
[   38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280
[   38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed
mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning

I think it is caused by an ECC error of nand flash. Do we have some methods
to mount this partition? Data losing is acceptable.


Thanks,

Hu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] UBI: add ubi_err() to report the failure of leb read
  2014-12-16  7:54 [PATCH] UBI: add ubi_err() to report the failure of leb read hujianyang
  2014-12-16  8:02 ` hujianyang
@ 2014-12-16  8:58 ` Richard Weinberger
  2014-12-16 10:12   ` hujianyang
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2014-12-16  8:58 UTC (permalink / raw)
  To: hujianyang, Artem Bityutskiy; +Cc: linux-mtd

Am 16.12.2014 um 08:54 schrieb hujianyang:
> If an error occur while reading from PEBs, for example, an ECC error,
> ubi_io_read() will print some error messages. But it's not enough for
> debugging. These messages don't show the mapping info for a read from
> UBIFS layer.
> 
> Although UBIFS will soon print its error messages after catching the
> return value from UBI layer,  multi-path reading will confuse the
> relationship between LEBs and PEBs showed by these messages.
> 
> This patch adds an ubi_err() to report reading errors in the function
> ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by
> this error message.
> 
> Signed-off-by: hujianyang <hujianyang@huawei.com>
> ---
>  drivers/mtd/ubi/eba.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
> index b698534..b4e69e1 100644
> --- a/drivers/mtd/ubi/eba.c
> +++ b/drivers/mtd/ubi/eba.c
> @@ -477,6 +477,8 @@ out_free:
>  	ubi_free_vid_hdr(ubi, vid_hdr);
>  out_unlock:
>  	leb_read_unlock(ubi, vol_id, lnum);
> +	ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d",
> +		err, len, offset, vol_id, lnum, pnum);

This label will also be reached if we run out of memory.
Please make sure that the new ubi_err() can only be reached in case of an MTD error.
Also make sure that the function prints only one message and not two.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] UBI: add ubi_err() to report the failure of leb read
  2014-12-16  8:02 ` hujianyang
@ 2014-12-16  9:21   ` Richard Weinberger
  2014-12-16  9:52     ` hujianyang
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2014-12-16  9:21 UTC (permalink / raw)
  To: hujianyang, Artem Bityutskiy; +Cc: linux-mtd

Am 16.12.2014 um 09:02 schrieb hujianyang:
> Hi,
> 
> I met a problem that I was failed to mount a UBIFS partition.
> 
> [   38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes
> [   38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes
> [   38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3
> [   38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280
> [   38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280
> [   38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed
> mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning
> 
> I think it is caused by an ECC error of nand flash. Do we have some methods
> to mount this partition? Data losing is acceptable.

I don't think that UBIFS has such a mount option.
You can dump the raw data and inspect the corrupted data.
Maybe you can fix it by hand.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] UBI: add ubi_err() to report the failure of leb read
  2014-12-16  9:21   ` Richard Weinberger
@ 2014-12-16  9:52     ` hujianyang
  2014-12-16  9:57       ` Richard Weinberger
  0 siblings, 1 reply; 8+ messages in thread
From: hujianyang @ 2014-12-16  9:52 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, Artem Bityutskiy

On 2014/12/16 17:21, Richard Weinberger wrote:
> Am 16.12.2014 um 09:02 schrieb hujianyang:
>> Hi,
>>
>> I met a problem that I was failed to mount a UBIFS partition.
>>
>> [   38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes
>> [   38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes
>> [   38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3
>> [   38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280
>> [   38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280
>> [   38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed
>> mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning
>>
>> I think it is caused by an ECC error of nand flash. Do we have some methods
>> to mount this partition? Data losing is acceptable.
> 
> I don't think that UBIFS has such a mount option.

Er, I don't know it either. How about a mount option like --force?

> You can dump the raw data and inspect the corrupted data.
> Maybe you can fix it by hand.

Yes, I want a try~! If we have to introduce a new feature or new mount
option. So would you like to help me? Do you think it's a valuable
work?


Thanks~

Hu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] UBI: add ubi_err() to report the failure of leb read
  2014-12-16  9:52     ` hujianyang
@ 2014-12-16  9:57       ` Richard Weinberger
  2014-12-16 10:34         ` hujianyang
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2014-12-16  9:57 UTC (permalink / raw)
  To: hujianyang; +Cc: linux-mtd, Artem Bityutskiy

Am 16.12.2014 um 10:52 schrieb hujianyang:
> On 2014/12/16 17:21, Richard Weinberger wrote:
>> Am 16.12.2014 um 09:02 schrieb hujianyang:
>>> Hi,
>>>
>>> I met a problem that I was failed to mount a UBIFS partition.
>>>
>>> [   38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes
>>> [   38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes
>>> [   38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3
>>> [   38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280
>>> [   38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280
>>> [   38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed
>>> mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning
>>>
>>> I think it is caused by an ECC error of nand flash. Do we have some methods
>>> to mount this partition? Data losing is acceptable.
>>
>> I don't think that UBIFS has such a mount option.
> 
> Er, I don't know it either. How about a mount option like --force?

Then every single embedded vendor will use this flag to keep the broken MTD/UBI/UBIFS setups running
as long as possible no mater of how corrupted the data is. :-)
IIRC UBIFS will either mount and work correctly as expected or fail hard.

>> You can dump the raw data and inspect the corrupted data.
>> Maybe you can fix it by hand.
> 
> Yes, I want a try~! If we have to introduce a new feature or new mount
> option. So would you like to help me? Do you think it's a valuable
> work?

I'm not a fan of such a mount option.
What we really need is a fsck.ubifs and a ubifs dump tool to fix and recover
broken UBIFS images.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] UBI: add ubi_err() to report the failure of leb read
  2014-12-16  8:58 ` Richard Weinberger
@ 2014-12-16 10:12   ` hujianyang
  0 siblings, 0 replies; 8+ messages in thread
From: hujianyang @ 2014-12-16 10:12 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, Artem Bityutskiy

On 2014/12/16 16:58, Richard Weinberger wrote:
> Am 16.12.2014 um 08:54 schrieb hujianyang:
>> If an error occur while reading from PEBs, for example, an ECC error,
>> ubi_io_read() will print some error messages. But it's not enough for
>> debugging. These messages don't show the mapping info for a read from
>> UBIFS layer.
>>
>> Although UBIFS will soon print its error messages after catching the
>> return value from UBI layer,  multi-path reading will confuse the
>> relationship between LEBs and PEBs showed by these messages.
>>
>> This patch adds an ubi_err() to report reading errors in the function
>> ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by
>> this error message.
>>
>> Signed-off-by: hujianyang <hujianyang@huawei.com>
>> ---
>>  drivers/mtd/ubi/eba.c |    2 ++
>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
>> index b698534..b4e69e1 100644
>> --- a/drivers/mtd/ubi/eba.c
>> +++ b/drivers/mtd/ubi/eba.c
>> @@ -477,6 +477,8 @@ out_free:
>>  	ubi_free_vid_hdr(ubi, vid_hdr);
>>  out_unlock:
>>  	leb_read_unlock(ubi, vol_id, lnum);
>> +	ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d",
>> +		err, len, offset, vol_id, lnum, pnum);
> 
> This label will also be reached if we run out of memory.

Yes, but I think it's OK. Because a read operation is really failed
by some errors. We can print these errors not only in case of MTD
errors but also in case of CRC errors, ENOMEM or others.

> Please make sure that the new ubi_err() can only be reached in case of an MTD error.
> Also make sure that the function prints only one message and not two.

Er, At first, I want to print just MTD error, perform ubi_err() in
mtd_is_eccerr() case, for example. But I found it's not easy to print
it only once. The *retry* label makes it difficulty to determine when
this function, ubi_eba_read_leb() returns, this message will print
two times or none.

So at last, I move this ubi_err() to the end of error handling path
for easy. ^.^

What's your opinion?

Hu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] UBI: add ubi_err() to report the failure of leb read
  2014-12-16  9:57       ` Richard Weinberger
@ 2014-12-16 10:34         ` hujianyang
  0 siblings, 0 replies; 8+ messages in thread
From: hujianyang @ 2014-12-16 10:34 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, Artem Bityutskiy

On 2014/12/16 17:57, Richard Weinberger wrote:
> Then every single embedded vendor will use this flag to keep the broken MTD/UBI/UBIFS setups running
> as long as possible no mater of how corrupted the data is. :-)
> IIRC UBIFS will either mount and work correctly as expected or fail hard.

You are right~!

Maybe we can set filesystem to RO if it is mounted with --force, and
allow users to copy their data to other place.

How about this?

> 
>>> You can dump the raw data and inspect the corrupted data.
>>> Maybe you can fix it by hand.
>>
>> Yes, I want a try~! If we have to introduce a new feature or new mount
>> option. So would you like to help me? Do you think it's a valuable
>> work?
> 
> I'm not a fan of such a mount option.
> What we really need is a fsck.ubifs and a ubifs dump tool to fix and recover
> broken UBIFS images.

I think it's better, but a bit harder. As I know, my UBIDUMP is far
from what you expect. I should spend more time on it.

After all, I was asked to fix this error. My plan is do something
after an ECC error is detected, not directly breakout to allow
this partition to be mounted. I don't think this solution will be
easily accept by mainline.

However, I'd like to show my work if I succeed.

Thanks~!

Hu

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-12-16 10:41 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-16  7:54 [PATCH] UBI: add ubi_err() to report the failure of leb read hujianyang
2014-12-16  8:02 ` hujianyang
2014-12-16  9:21   ` Richard Weinberger
2014-12-16  9:52     ` hujianyang
2014-12-16  9:57       ` Richard Weinberger
2014-12-16 10:34         ` hujianyang
2014-12-16  8:58 ` Richard Weinberger
2014-12-16 10:12   ` hujianyang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox