UBIFS seeing corrupt blank pages when image flashed via u-boot


* UBIFS seeing corrupt blank pages when image flashed via u-boot
@ 2014-01-03 11:45 Gupta, Pekon
  2014-01-03 12:59 ` Artem Bityutskiy
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Pekon @ 2014-01-03 11:45 UTC (permalink / raw)
  To: linux-mtd@lists.infradead.org, artem.bityutskiy@linux.intel.com
  Cc: u-boot@lists.denx.de

Hi All,

I have been facing a weird problem; maybe someone has a solution.

*_Case-1_ Flashing UBIFS image from u-boot using 'nand write' utility*

For a partially written erased-block..
(a) 1st page is written with 'erase-header'
(b) 2nd page is written with 'volume-header'
(c) '3rd page' is written with 'some data'
(d) '4th to last-page of block' should be left blank, but they are written with 0xFF.
As an effect of (d), the ECC calculated for (all 0xff data) is written to
OOB area of all pages from 4th-page till last-page of the PEB.

As per my understanding, after mounting UBIFS as root, kernel tries to
append some data to some files in rootfs, due to which leftover pages (d)
get written by appended data, _without_ PEB getting erased.
This causes ECC bytes to get corrupted, because OOB of 'unused-pages'
was already written with ECC of (all 0xff data) by u-boot in step(d).
And on next reboot, kernel sees 'un-corrected ECC' errors while booting.
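
The failure described above can be reproduced in a toy model. This is only a sketch: the one-byte XOR "ECC", the tiny page size and every toy_* name are assumptions made for illustration, not the real NAND driver or ECC scheme. The one property it relies on is that programming NAND can only clear bits, so a second write without an erase leaves the AND of old and new contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16  /* toy page size, far smaller than a real NAND page */

/* Toy page: data area plus a one-byte "OOB" holding the ECC. */
struct toy_page {
    uint8_t data[TOY_PAGE_SIZE];
    uint8_t oob_ecc;
};

/* Toy ECC (an assumption for illustration): XOR of all data bytes. */
static uint8_t toy_ecc(const uint8_t *data, size_t len)
{
    uint8_t ecc = 0;
    for (size_t i = 0; i < len; i++)
        ecc ^= data[i];
    return ecc;
}

/* An erased NAND page reads back as all ones, OOB included. */
static void toy_erase(struct toy_page *p)
{
    memset(p->data, 0xFF, sizeof(p->data));
    p->oob_ecc = 0xFF;
}

/* Programming can only clear bits, so the result is (old AND new). */
static void toy_program(struct toy_page *p, const uint8_t *data)
{
    assert(data != NULL);
    for (size_t i = 0; i < TOY_PAGE_SIZE; i++)
        p->data[i] &= data[i];
    p->oob_ecc &= toy_ecc(data, TOY_PAGE_SIZE);
}

/* Returns 1 when the stored ECC still matches the stored data. */
static int toy_ecc_ok(const struct toy_page *p)
{
    return p->oob_ecc == toy_ecc(p->data, TOY_PAGE_SIZE);
}
```

In this model, programming an all-0xff buffer leaves the data area unchanged but burns the ECC of that buffer into the OOB, so a later toy_program() with real data ends up with a stored ECC that no longer matches.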

*_Case-2_ Flashing UBIFS image from kernel using 'ubiformat' utility*

Whereas when the same image is flashed using the 'ubiformat' utility, after booting
the kernel from another root source, 'ubiformat' automatically skips empty pages
(4th to last-page) in the erased-block. Thus the OOB area of pages from 4th-page
till last-page is un-touched.
Hence, when the kernel 'appends' to the rootfs files there is no ECC corruption,
as the OOB area of 4th to last-page of the PEB was blank. So everything works fine.
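
The page skipping described for Case-2 can be sketched as below. This is a minimal illustration and not ubiformat's actual code: is_all_ff() and trimmed_write_len() are hypothetical helper names, and the sketch assumes the image for one eraseblock is already in memory and is a whole number of pages long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte in buf is 0xFF. */
static int is_all_ff(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0xFF)
            return 0;
    return 1;
}

/*
 * Number of bytes of one eraseblock image that need programming once the
 * trailing all-0xFF pages are dropped. The result is a multiple of
 * page_size; all-0xFF pages in the middle of the block are kept.
 */
static size_t trimmed_write_len(const uint8_t *peb, size_t peb_size,
                                size_t page_size)
{
    assert(page_size > 0 && peb_size % page_size == 0);

    size_t len = peb_size;
    while (len > 0 && is_all_ff(peb + len - page_size, page_size))
        len -= page_size;
    return len;
}
```

A flasher that writes only the first trimmed_write_len() bytes of each eraseblock never touches the data or OOB area of the trailing empty pages.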

Now my queries:
(1) while _appending_ a file
  (a) does UBIFS write appended data to the same existing PEB, if there is
  enough space in the PEB to accommodate the new data ?
 OR
  (b) does UBIFS copy the existing data to a newer PEB along with the
   appended data ?

(2) In-case (b), then can someone point me to what can possibly be the issue ?
 (Any references to UBIFS docs on infradead.org may also help).

with regards, pekon


* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-03 11:45 UBIFS seeing corrupt blank pages when image flashed via u-boot Gupta, Pekon
@ 2014-01-03 12:59 ` Artem Bityutskiy
  2014-01-03 13:05   ` Bityutskiy, Artem
                     ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Artem Bityutskiy @ 2014-01-03 12:59 UTC (permalink / raw)
  To: Gupta, Pekon; +Cc: u-boot@lists.denx.de, linux-mtd@lists.infradead.org

Hi Pekon,

On Fri, 2014-01-03 at 11:45 +0000, Gupta, Pekon wrote:
> *_Case-1_ Flashing UBIFS image from u-boot using 'nand write' utility*
> 
> For a partially written erased-block..
> (a) 1st page is written with 'erase-header'
> (b) 2nd page is written with 'volume-header'
> (c) '3rd page' is written with 'some data'
> (d) '4th to last-page of block' should be left blank, but they are written with 0xFF.
> As an effect of (d), the ECC calculated for (all 0xff data) is written to
> OOB area of all pages from 4th-page till last-page of the PEB.

Yup.

> As per my understanding, after mounting UBIFS as root, kernel tries to
> append some data to some files in rootfs, due to which leftover pages (d)
> get written by appended data, _without_ PEB getting erased.

Right.

> This causes ECC bytes to get corrupted, because OOB of 'unused-pages'
> was already written with ECC of (all 0xff data) by u-boot in step(d).
> And on next reboot, kernel sees 'un-corrected ECC' errors while booting.

Sure.

> *_Case-2_ Flashing UBIFS image from kernel using 'ubiformat' utility*
> 
> Whereas when same image is flashed using 'ubiformat' utility, after booting
> Kernel from other root source. 'ubiformat' automatically skips empty pages
> (4th to last-page) in erased-block. Thus OOB area of pages from 4th-page
> till last-page are un-touched.
> Hence, when kernel 'appends' the rootfs files there is no ECC corruption
> as the OOB area of 4th to last-page of PEB were blank. So everything works fine.

Exactly!

> Now my queries:
> (1) while _appending_ a file
>   (a) does UBIFS write appended data to the same existing PEB, if there is
>   enough space in the PEB to accommodate the new data ?

UBIFS always writes to the Journal PEB, whatever it happens to be. So
no, it is unlikely that the data will go to the same PEB.

>  OR
>   (b) does UBIFS copy the existing data to a newer PEB along with the
>    appended data ?

No, the existing data stays where it is. New data goes to the journal.
Then the journal PEB gets indexed. The data nodes end up in different
PEBs (i.e., fragmented).

If you are worried about fragmentation, we can discuss this separately.
You can find more about UBIFS journal in my very old UBIFS presentation,
which explains basic ideas behind the UBIFS wandering journal:

http://www.linux-mtd.infradead.org/doc/ubifs.html#L_documentation

There is also Adrian's white paper with some design description there.

> (2) In-case (b), then can someone point me to what can possibly be the issue ?
>  (Any references to UBIFS docs on infradead.org may also help).

I do not understand the question. There are no problems in your (b),
nor in the "*_Case-2_" you described.

If you meant "*_Case-1_", then yes, there is a piece of doc:

http://www.linux-mtd.infradead.org/doc/ubi.html#L_flasher_algo

Basically, "ubiformat" is the "correct" UBI-aware flasher, while
u-boot's "nand write" seems to be a dumb flasher. I guess you have 2
options:

1. Teach u-boot's "nand write" to skip empty pages, or maybe implement
a separate "clever" flashing command.

2. Use UBIFS's "space fixup" feature. This will cause UBIFS to fix-up
all empty pages by basically copying all partially-used PEBs to
different PEBs, skipping the empty pages. This will be done on the
first mount, only once, and may cause considerable delays.

See http://www.linux-mtd.infradead.org/faq/ubifs.html#L_free_space_fixup

P.S. Looking at the MTD web-site now, when I have not been doing any
UBI/UBIFS/MTD work for a few years, I am impressed how much stuff
I actually documented there :-)

HTH.

-- 
Best Regards,
Artem Bityutskiy


* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-03 12:59 ` Artem Bityutskiy
@ 2014-01-03 13:05   ` Bityutskiy, Artem
  2014-01-03 14:04   ` Stefano Babic
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Bityutskiy, Artem @ 2014-01-03 13:05 UTC (permalink / raw)
  To: Gupta, Pekon; +Cc: u-boot@lists.denx.de, linux-mtd@lists.infradead.org

On Fri, 2014-01-03 at 14:59 +0200, Artem Bityutskiy wrote:
> If you are worried about fragmentation, we can discuss this
> separately.
> You can find more about UBIFS journal in my very old UBIFS
> presentation,
> which explains basic ideas behind the UBIFS wandering journal:
> 
> http://www.linux-mtd.infradead.org/doc/ubifs.html#L_documentation

I guess I meant slide #39.

-- 
Best Regards,
Artem Bityutskiy

* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-03 12:59 ` Artem Bityutskiy
  2014-01-03 13:05   ` Bityutskiy, Artem
@ 2014-01-03 14:04   ` Stefano Babic
  2014-01-07 17:30   ` Gupta, Pekon
  2014-01-13 12:19   ` Calvin Johnson
  3 siblings, 0 replies; 13+ messages in thread
From: Stefano Babic @ 2014-01-03 14:04 UTC (permalink / raw)
  To: Gupta, Pekon
  Cc: artem.bityutskiy, linux-mtd@lists.infradead.org,
	u-boot@lists.denx.de

Hi Gupta,

On 03/01/2014 13:59, Artem Bityutskiy wrote:

> Basically, "ubiformat" is the "correct" UBI-aware flasher, while
> u-boot's "nand write" seems to be a dumb flasher.

It is. It is *not* recommended for UBI volumes without "ubinizing" your
image.

> I guess you have 2
> options:
> 
> 1. Teach u-boot's "nand write" to skip empty pages, or maybe implement
> a separate "clever" flashing command.

You can also add (my preferred way) UBI support to your U-Boot, if it
does not have it yet. Then you will have "ubi part" (corresponds to
ubiattach in Artem's MTD utilities), "ubi createvol" and "ubi writevol"
(respectively, ubimkvol and ubiupdatevol). If you use dumb nand
utilities, you have to erase the flash first with "nand erase", and this
will lose the erase counters.

Best regards,
Stefano Babic

-- 
=====================================================================
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sbabic@denx.de
=====================================================================


* RE: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-03 12:59 ` Artem Bityutskiy
  2014-01-03 13:05   ` Bityutskiy, Artem
  2014-01-03 14:04   ` Stefano Babic
@ 2014-01-07 17:30   ` Gupta, Pekon
  2014-01-13 12:19   ` Calvin Johnson
  3 siblings, 0 replies; 13+ messages in thread
From: Gupta, Pekon @ 2014-01-07 17:30 UTC (permalink / raw)
  To: artem.bityutskiy@linux.intel.com, Artem Bityutskiy
  Cc: u-boot@lists.denx.de, linux-mtd@lists.infradead.org


Hi Artem,

I wanted to check the 'free-space-fixup' feature and re-read your
documentation first, so I got delayed in replying.
(Also, my mail got moderated again by mailman.)


>From: Artem Bityutskiy [mailto:artem.bityutskiy@linux.intel.com]
[...]

>If you are worried about fragmentation, we can discuss this separately.
>You can find more about UBIFS journal in my very old UBIFS presentation,
>which explains basic ideas behind the UBIFS wandering journal:
>
>http://www.linux-mtd.infradead.org/doc/ubifs.html#L_documentation
>
>There is also Adrian's white paper with some design description there.
>
Thanks much for reminding me about this.
I had read your slides long back, but never dug deep into Adrian's slides,
so this was still on my 'To Read' list. But I really appreciate your work and
presentation.

[...]

>I do not understand the question. There are no problems in your (b),
>neither in "*_Case-2_" described.
>
>If you meant "*_Case-1_", then yes, there is a piece of doc:
>
>http://www.linux-mtd.infradead.org/doc/ubi.html#L_flasher_algo
>
>Basically, "ubiformat" is the "correct" UBI-aware flasher, while
>u-boot's "nand write" seems to be a dumb flasher. I guess you have 2
>options:
>
>1. Teach u-boot's "nand write" to skip empty pages, or maybe implement
>a separate "clever" flashing command.
>
Yes, I'll try Stefano Babic's suggestion of using the u-boot UBI tools.


>2. Use UBIFS's "space fixup" feature. This will cause UBIFS to fix-up
>all empty pages by basically copying all partially-used PEBs to
>different PEBs, skipping the empty pages. This will be done on the
>first mount, only once, and may cause considerable delays.
>
>See http://www.linux-mtd.infradead.org/faq/ubifs.html#L_free_space_fixup
>
Though I had read about the 'free-space-fixup' feature earlier too,
somewhere in the back of my mind I thought it was only for "free PEBs"
(erased-blocks which had a corrupted or no volume-header). But after
re-reading the FAQ page, I realized that 'free-space-fixup' is done for
all pages, whether in a 'free-PEB' or a 'used-PEB'.

So, this solved my problem. Thanks much.


>P.S. Looking at the MTD web-site now, when I am not doing any
>UBI/UBIFS/MTD work anymore for few years, I am impressed how much stuff
>I actually documented there :-)
>
Absolutely agree. That is why your file-system is so popular.
The MTD and UBI documentation especially is not limited to 'how to use it';
I think it has advanced details, explanations and reasoning
which were quite ahead of their time when written.

This is something which you and other MTD/UBI/ & UBIFS Authors
and Maintainers should be proud of.

Thanks again .. 

with regards, pekon


* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-03 12:59 ` Artem Bityutskiy
                     ` (2 preceding siblings ...)
  2014-01-07 17:30   ` Gupta, Pekon
@ 2014-01-13 12:19   ` Calvin Johnson
  2014-01-13 12:58     ` Artem Bityutskiy
  3 siblings, 1 reply; 13+ messages in thread
From: Calvin Johnson @ 2014-01-13 12:19 UTC (permalink / raw)
  To: artem.bityutskiy
  Cc: u-boot@lists.denx.de, linux-mtd@lists.infradead.org, Gupta, Pekon

Hi,

On Fri, Jan 3, 2014 at 6:29 PM, Artem Bityutskiy
<artem.bityutskiy@linux.intel.com> wrote:
>
> Hi Pekon,
>
> On Fri, 2014-01-03 at 11:45 +0000, Gupta, Pekon wrote:
> > *_Case-1_ Flashing UBIFS image from u-boot using 'nand write' utility*
> >
> > For a partially written erased-block..
> > (a) 1st page is written with 'erase-header'
> > (b) 2nd page is written with 'volume-header'
> > (c) '3rd page' is written with 'some data'
> > (d) '4th to last-page of block' should be left blank, but they are written with 0xFF.
> > As an effect of (d), the ECC calculated for (all 0xff data) is written to
> > OOB area of all pages from 4th-page till last-page of the PEB.
>
> Yup.
>

If the 4th to last-page are left blank and not covered with ECC, what
will happen in case of bit flips on the blank pages? There was an
issue reported some time back.
http://lists.infradead.org/pipermail/linux-mtd/2012-January/039256.html

Does UBI/UBIFS take care of this now?

Thanks,
Calvin


* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-13 12:19   ` Calvin Johnson
@ 2014-01-13 12:58     ` Artem Bityutskiy
  2014-01-13 13:16       ` Gupta, Pekon
  0 siblings, 1 reply; 13+ messages in thread
From: Artem Bityutskiy @ 2014-01-13 12:58 UTC (permalink / raw)
  To: Calvin Johnson
  Cc: u-boot@lists.denx.de, linux-mtd@lists.infradead.org, Gupta, Pekon

On Mon, 2014-01-13 at 17:49 +0530, Calvin Johnson wrote:
> If the 4th to last-page are left blank and not covered with ECC, what
> will happen in case of bit flips on the blank pages? There was an
> issue reported some time back.
> http://lists.infradead.org/pipermail/linux-mtd/2012-January/039256.html
> 
> Does UBI/UBIFS take care of this now?

No. UBIFS still assumes that blank pages are ECC-protected by the
driver. No one stepped in and took care of changing this yet.

-- 
Best Regards,
Artem Bityutskiy


* RE: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-13 12:58     ` Artem Bityutskiy
@ 2014-01-13 13:16       ` Gupta, Pekon
  2014-01-13 15:06         ` Artem Bityutskiy
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Pekon @ 2014-01-13 13:16 UTC (permalink / raw)
  To: artem.bityutskiy@linux.intel.com, Calvin Johnson
  Cc: u-boot@lists.denx.de,
	Stefan Roese <sr@denx.de> (sr@denx.de),
	linux-mtd@lists.infradead.org

Hi Calvin,

>From: Artem Bityutskiy [mailto:artem.bityutskiy@linux.intel.com]
>>On Mon, 2014-01-13 at 17:49 +0530, Calvin Johnson wrote:
>> If the 4th to last-page are left blank and not covered with ECC, what
>> will happen in case of bit flips on the blank pages? There was an
>> issue reported some time back.
>> http://lists.infradead.org/pipermail/linux-mtd/2012-January/039256.html
>>
>> Does UBI/UBIFS take care of this now?
>
>No. UBIFS still assumes that blank pages are ECC-protected by the
>driver. No one stepped in and took care of changing this yet.
>
Yes, it's true that in newer technologies (especially < 28nm flash), we are seeing
a lot of erased-pages with bit-flips, and UBIFS complains about them.
But as there is no ECC stored in an erased-page, bit-flips in an erased-page
cannot be corrected, unless you compare each byte of read_data.
However, there are other ways of handling bit-flips in an erased-page.
Following are a few ways in which the OMAP NAND driver handles bit-flips
in an erased-page:

*Case-1*: If bit-flips are found in data-region of an erased-page.
(1) An erased-page implicitly means that its data-region should *only* contain 0xff,
   so it's safe to fill read_buf() with 0xff.
  But the controller driver should report the correctable/un-correctable bit-flips
  to the upper-layer, so that upper-layers like UBI take corrective action by re-erasing
  this block before using it.
(Refer) http://lists.infradead.org/pipermail/linux-mtd/2014-January/051368.html
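
A minimal sketch of the Case-1 handling described above, assuming the driver has already decided that the page is erased. The function name fixup_erased_page() is hypothetical and this is not the code from the referenced patch: it only counts the bits that read back as 0, so they can be reported upwards, and then restores the buffer to all 0xff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * For a page already classified as erased: count the bits that read
 * back as 0 (bit-flips), then fill the read buffer with 0xFF.
 * The returned count is what a driver could report to upper layers.
 */
static unsigned int fixup_erased_page(uint8_t *buf, size_t len)
{
    assert(buf != NULL);

    unsigned int flips = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t zeros = (uint8_t)~buf[i];  /* set bits mark flipped cells */
        while (zeros) {
            zeros &= (uint8_t)(zeros - 1); /* clear the lowest set bit */
            flips++;
        }
    }
    memset(buf, 0xFF, len);
    return flips;
}
```

A non-zero return value is the hint that the block should be re-erased before it is used.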


*Case-2*: If bit-flips are found in oob-region of an erased-page.
This is a bit tricky, because if there are bit-flips in the ecc-layout (OOB region) of
an erased-page, it is difficult to differentiate between an
erased-page and a programmed-page.
Though you can keep a 'marker' in the ecc-layout reserved for detecting
programmed-pages, that marker byte itself can be subject to
bit-flips (assuming bit-flips are common on MLC and newer technology NAND).
So, the OMAP NAND driver takes a probabilistic approach.
(Refer) http://lists.infradead.org/pipermail/linux-mtd/2014-January/051367.html
-----------------------
This patch 'assumes' any page to be 'erased':
		(a) if        all(read_ecc)  == 0xff
		(b) else if   all(read_data) == 0xff
-----------------------
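
The two-step rule above can be written out as a small predicate. This is a sketch of the rule as stated in this mail, not the patch itself; page_assumed_erased() and all_ff() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte in buf is 0xFF. */
static int all_ff(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0xFF)
            return 0;
    return 1;
}

/*
 * Probabilistic erased-page check:
 *   (a) ECC bytes all 0xFF      -> assume erased (bit-flips may be in data)
 *   (b) else data bytes all 0xFF -> assume erased (bit-flips may be in ECC)
 * Anything else is treated as a programmed page.
 */
static int page_assumed_erased(const uint8_t *read_data, size_t data_len,
                               const uint8_t *read_ecc, size_t ecc_len)
{
    assert(read_data != NULL && read_ecc != NULL);

    if (all_ff(read_ecc, ecc_len))
        return 1;
    if (all_ff(read_data, data_len))
        return 1;
    return 0;
}
```

The rule is a heuristic: an erased page with bit-flips in both regions is classified as programmed, and a programmed page whose ECC bytes happen to be all 0xff is classified as erased.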


Currently both the UBI and UBIFS layers check for an erased-page to be all(0xff),
but I think it's overkill to put this burden on the UBI or UBIFS layer, because
low-level controller drivers can handle this easily.
So, if Artem and Brian agree to the above approaches, then I can submit a patch
for removal of:
 - "ubi_self_check_all_ff()" from UBI layer.
 - checking of 'buf == 0xff' from ubifs_scan_leb() in UBIFS layer.


with regards, pekon


* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-13 13:16       ` Gupta, Pekon
@ 2014-01-13 15:06         ` Artem Bityutskiy
  2014-01-15 21:29           ` Gupta, Pekon
  0 siblings, 1 reply; 13+ messages in thread
From: Artem Bityutskiy @ 2014-01-13 15:06 UTC (permalink / raw)
  To: Gupta, Pekon
  Cc: u-boot@lists.denx.de,
	Stefan Roese <sr@denx.de> (sr@denx.de),
	linux-mtd@lists.infradead.org, Calvin Johnson

On Mon, 2014-01-13 at 13:16 +0000, Gupta, Pekon wrote:
> Currently both UBI and UBIFS layer checks for erased-page to be
> all(0xff),
> But I think its over-kill to put this burden on UBI or UBIFS layer,
> because
> low-level controller drivers can handle this easily.
> So, if Artem and Brian agree to above approaches, then I can submit a
> patch
> for removal of:
>  - "ubi_self_check_all_ff()" from UBI layer.

Well, this is just debugging and sanity check stuff.

>  - checking of 'buf == 0xff' from ubifs_scan_leb() in UBIFS layer.

I do not think this is a good idea. Let me do some quick braindump,
thankfully I still remember the reasons behind this.
This is about the recovery, and this is the code path where we actually
do these checks.

Just like in defensive programming you try to assume the worst, we tried
to assume the worst too. And the worst is - you cannot make any
assumption about what is on the media.

Now, we wanted to make UBIFS robust in a sense that you can cut the
power off at any point, and you can be sure the UBIFS driver is still
able to mount your flash. You can lose some data because it did not make
it to the media yet by the time of power cut. But you never lose the
data which made it to the media before the power cut.

And the file-system should mount the media without any user-space tools
like 'fsck.ubifs'. The system should recover itself (detect half-written
garbage and get rid of it, preparing "clean" blank flash area for
writing new data).

When you mount a file-system, UBIFS scans the journal. Suppose it hits a
corrupted data node. At this point UBIFS needs to make a decision whether
this is a node which was corrupted because of a power cut, or this is a
piece of data which has to be correct, but got corrupted because of,
say, under-voltage problems, or NAND wear, or radiation, etc.

In the first case - you recover silently, and you do not bother the user
with warnings.

In the second case - you report loudly. You do not do anything because
you risk losing important user data (an expensive bitcoin!)

Right? So you gotta be very careful, because this is user data.

To put it differently, we specifically targeted a special type of
corruptions - power-cut related corruptions. We made related
assumptions. And we were very careful about validating these
assumptions.

So UBIFS always starts with fully erased LEBs. Then it writes there
sequentially, NAND page-by-page, from beginning to the end.

(Well, it is a bit more complex than that, but this is not important in
this discussion. The complexity is that there are several journal heads,
so UBIFS writes to more than one LEB, but it is sequential anyway. Also,
we write in so-called "max. write units", which are usually the same as
a NAND page in case of NAND anyway).

When UBIFS mounts a file-system, it scans the journal. When it meets a
corrupted node in NAND page X, it looks at NAND page X+1 and checks if
it is blank or not. If it is blank, this looks normal, and X was just
the NAND page UBIFS presumably was writing to just before the power cut.

If NAND page X+1 contains something, then page X cannot be corrupted due
to power cut, and this is something else. And we, the FS authors, do not
know how to deal with this, we did not think about this type of
corruption. So we just complain and exit. This is better than trying to
erase something and make you lose your data, right?
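
The decision described in the last two paragraphs can be sketched as follows. This is an illustration of the reasoning only, not UBIFS recovery code: the enum, classify_corruption() and page_is_blank() are hypothetical names, and treating a corrupted last page as a power-cut case is an assumption of the sketch, since nothing can follow it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum corruption_kind {
    CORRUPT_POWER_CUT,   /* page X was being written when power was cut */
    CORRUPT_UNEXPECTED   /* something else: complain loudly and stop */
};

/* Returns 1 if every byte of the page is 0xFF. */
static int page_is_blank(const uint8_t *page, size_t page_size)
{
    for (size_t i = 0; i < page_size; i++)
        if (page[i] != 0xFF)
            return 0;
    return 1;
}

/*
 * A corrupted node was found in NAND page bad_page of a LEB that is
 * written sequentially. If page bad_page + 1 is blank, the corruption
 * looks like an interrupted write; otherwise it is unexpected.
 */
static enum corruption_kind classify_corruption(const uint8_t *leb,
                                                size_t leb_size,
                                                size_t page_size,
                                                size_t bad_page)
{
    assert(page_size > 0 && leb_size % page_size == 0);
    assert(bad_page < leb_size / page_size);

    size_t next = (bad_page + 1) * page_size;
    if (next >= leb_size)
        return CORRUPT_POWER_CUT;  /* assumption: last page, nothing follows */

    return page_is_blank(leb + next, page_size) ? CORRUPT_POWER_CUT
                                                : CORRUPT_UNEXPECTED;
}
```

Only the CORRUPT_POWER_CUT result would justify silently cleaning up page X; in the other case the safe reaction is to report and do nothing.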

That's the logic. And of course people are welcome to extend it and
improve it.

Conclusion: all UBIFS needs is a way to ask the driver - is this NAND
page blank or not? UBIFS does not really have to compare to all 0xFFs.

-- 
Best Regards,
Artem Bityutskiy


* RE: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-13 15:06         ` Artem Bityutskiy
@ 2014-01-15 21:29           ` Gupta, Pekon
  2014-01-16  7:19             ` Artem Bityutskiy
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Pekon @ 2014-01-15 21:29 UTC (permalink / raw)
  To: artem.bityutskiy@linux.intel.com
  Cc: u-boot@lists.denx.de,
	Stefan Roese <sr@denx.de> (sr@denx.de),
	linux-mtd@lists.infradead.org, Calvin Johnson

Hi Artem,

>From: Artem Bityutskiy [mailto:artem.bityutskiy@linux.intel.com]
<snip>
>Conclusion: all UBIFS needs is a way to ask the driver - is this NAND
>page blank or not? UBIFS does not really have to compare to all 0xFFs.
>
Thanks for details. Yes, I understand the concept in general that you
want to recover last bit of user-data written on NAND (without corruption).

Now, as the NAND driver itself differentiates between an erased-page
and a programmed-page, can we use different error codes to pass this
information to upper layers, like:

*For MTD layer*
0: data valid, length of data is determined by 'read_len'  (currently)
-EUCLEAN: correctable bit-flips found, data is valid
-EBADMSG: un-correctable bit-flips, data *may-be* invalid.
-ENODATA: detected erased-page. *Actual* data determined by read_len.
-ENOMSG:  detected erased-page with bit-flips. *Actual* data determined by read_len.

*For UBI layer*
We can convert the MTD error-codes into UBI specific error-codes at
ubi_io_read() interface. This way many checks in other parts of UBI
layer can be simplified, and even slightly speed-up scan_peb().

--------------------------
diff --git a/drivers/mtd/ubi/io.c b/drivers/mtd/ubi/io.c
index bf79def..9b011b9 100644
--- a/drivers/mtd/ubi/io.c
+++ b/drivers/mtd/ubi/io.c
@@ -168,18 +168,24 @@ retry:
        if (err) {
                const char *errstr = mtd_is_eccerr(err) ? " (ECC error)" : "";

-               if (mtd_is_bitflip(err)) {
-                       /*
-                        * -EUCLEAN is reported if there was a bit-flip which
-                        * was corrected, so this is harmless.
-                        *
-                        * We do not report about it here unless debugging is
-                        * enabled. A corresponding message will be printed
-                        * later, when it is has been scrubbed.
-                        */
+               switch (err) {
+               case -EUCLEAN:
                        ubi_msg("fixable bit-flip detected at PEB %d", pnum);
                        ubi_assert(len == read);
                        return UBI_IO_BITFLIPS;
+               case -ENODATA:
+                       if (read == 0)
+                               return UBI_IO_FF;
+                       else
+                               return 0;
+               case -ENOMSG:
+                       if (read == 0)
+                               return UBI_IO_FF_BITFLIPS;
+                       else
+                               return UBI_IO_BITFLIPS;
+               case -EBADMSG:
+               default:
+                       break;
                }
--------------------------

Also, please consider, if this approach is even feasible for other types
of MTD devices (NOR and OneNAND) ?


with regards, pekon


* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-15 21:29           ` Gupta, Pekon
@ 2014-01-16  7:19             ` Artem Bityutskiy
  2014-01-16  7:44               ` Gupta, Pekon
  0 siblings, 1 reply; 13+ messages in thread
From: Artem Bityutskiy @ 2014-01-16  7:19 UTC (permalink / raw)
  To: Gupta, Pekon
  Cc: u-boot@lists.denx.de,
	Stefan Roese <sr@denx.de> (sr@denx.de),
	linux-mtd@lists.infradead.org, Calvin Johnson

On Wed, 2014-01-15 at 21:29 +0000, Gupta, Pekon wrote:
> Hi Artem,
> 
> >From: Artem Bityutskiy [mailto:artem.bityutskiy@linux.intel.com]
> <snip>
> >Conclusion: all UBIFS needs is a way to ask the driver - is this NAND
> >page blank or not? UBIFS does not really have to compare to all 0xFFs.
> >
> Thanks for details. Yes, I understand the concept in general that you
> want to recover last bit of user-data written on NAND (without corruption).
> 
> Now, as NAND driver itself does differentiation between an erased-page
> v/s programmed-page. Can we use different error codes to pass this
> information to upper layers like;

I thought the ECC is something which could be used to differentiate.


> *For MTD layer*
> 0: data valid, length of data is determined by 'read_len'  (currently)
> -EUCLEAN: correctable bit-flips found, data is valid
> -EBADMSG: un-correctable bit-flips, data *may-be* invalid.
> -ENODATA: detected erased-page. *Actual* data determined by read_len.
> -ENOMSG:  detected erased-page with bit-flips. *Actual* data determined by read_len.

Not sure this is a good idea. If NAND driver cannot do the
differentiation, then it should not be done by the MTD layer, I think.

Then just improve UBI and UBIFS and make the function which compares
buffers with all 0xFFs allow for bit-flips. We know the maximum possible
bit-flips per min. I/O unit, right? Just allow for that amount.
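
A blank check that tolerates a bounded number of bit-flips could look roughly like this. It is a sketch of the idea only: is_blank_with_bitflips() is a hypothetical name, not an existing UBI or UBIFS function, and max_bitflips stands for whatever limit the caller considers acceptable for the buffer it passes in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if buf is all 0xFF apart from at most max_bitflips bits
 * that read back as 0, and 0 otherwise. Stops early once the limit
 * is exceeded.
 */
static int is_blank_with_bitflips(const uint8_t *buf, size_t len,
                                  unsigned int max_bitflips)
{
    assert(buf != NULL);

    unsigned int flips = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t zeros = (uint8_t)~buf[i];  /* set bits mark flipped cells */
        while (zeros) {
            zeros &= (uint8_t)(zeros - 1); /* clear the lowest set bit */
            if (++flips > max_bitflips)
                return 0;
        }
    }
    return 1;
}
```

With max_bitflips set to 0 this degenerates to the strict all-0xFF comparison.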

-- 
Best Regards,
Artem Bityutskiy


* RE: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-16  7:19             ` Artem Bityutskiy
@ 2014-01-16  7:44               ` Gupta, Pekon
  2014-01-16  8:22                 ` Artem Bityutskiy
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Pekon @ 2014-01-16  7:44 UTC (permalink / raw)
  To: artem.bityutskiy@linux.intel.com
  Cc: u-boot@lists.denx.de,
	Stefan Roese <sr@denx.de> (sr@denx.de),
	linux-mtd@lists.infradead.org, Calvin Johnson

Hi Artem,

>From: Artem Bityutskiy [mailto:artem.bityutskiy@linux.intel.com]
>>On Wed, 2014-01-15 at 21:29 +0000, Gupta, Pekon wrote:
>> Hi Artem,
>>
>> >From: Artem Bityutskiy [mailto:artem.bityutskiy@linux.intel.com]
>> <snip>
>> >Conclusion: all UBIFS needs is a way to ask the driver - is this NAND
>> >page blank or not? UBIFS does not really have to compare to all 0xFFs.
>> >
>> Thanks for details. Yes, I understand the concept in general that you
>> want to recover last bit of user-data written on NAND (without corruption).
>>
>> Now, as NAND driver itself does differentiation between an erased-page
>> v/s programmed-page. Can we use different error codes to pass this
>> information to upper layers like;
>
>I thought the ECC is something which could be used to differentiate.
>
(I think you confused this thread with other mail thread going with Brian,
 that one is purely about how to implement detection of erased-page).

However, *assuming NAND driver can identify erased-page correctly*,
I don't want  UBI/UBIFS to re-check the read_buf for 0xff again, because
 underlying NAND driver has already identified as erased-page, and
fixed the data before passing it to above MTD layer.

So, I'm proposing that NAND driver returns following error-codes
to indicate the results of its finding to MTD and UBI layers, so they
don't have to re-check data again.

>
>> *For MTD layer*
>> 0: data valid, length of data is determined by 'read_len'  (currently)
>> -EUCLEAN: correctable bit-flips found, data is valid
>> -EBADMSG: un-correctable bit-flips, data *may-be* invalid.
>> -ENODATA: detected erased-page. *Actual* data determined by read_len.
>> -ENOMSG:  detected erased-page with bit-flips. *Actual* data determined by read_len.
>
>Not sure this is a good idea. If NAND driver cannot do the
>differentiation, then it should not be done by the MTD layer, I think.
>

[...]

>Then just improve UBI and UBIFS and make the function which compares
>buffers with all 0xFFs allow for bit-flips. We know the maximum possible
>bit-flips per min. I/O unit, right? Just allow for that amount.
>
I think UBI/UBIFS layer should be kept independent of ECC details.
So, adding checks for (bit-flip count < ecc.strength) in UBI/UBIFS is not good.

My aim is that UBI/UBIFS should be able to classify its ubi_io_read() in 
any of the following baskets with least possible CPU cycles.
- no data (with or without correctable bit-flips)
- valid data (with or without correctable bit-flips)
- invalid data (un-correctable bit-flips)
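
The three baskets can be sketched as a mapping from the proposed return codes. Note the assumptions: -ENODATA and -ENOMSG carrying these meanings is only the proposal made in this thread, not existing MTD behaviour, and the enum, struct and classify_read() names are hypothetical. The errno constants are the ones from the Linux <errno.h>.

```c
#include <assert.h>
#include <errno.h>

enum read_basket {
    READ_NO_DATA,      /* erased page */
    READ_VALID_DATA,   /* programmed page, data usable */
    READ_INVALID_DATA  /* un-correctable bit-flips */
};

struct read_result {
    enum read_basket basket;
    int bitflips;      /* 1 if correctable bit-flips were seen */
};

/* Map a (proposed) MTD read return code to one of the three baskets. */
static struct read_result classify_read(int err)
{
    assert(err <= 0);  /* zero or a negative errno value */

    struct read_result r = { READ_INVALID_DATA, 0 };
    switch (err) {
    case 0:
        r.basket = READ_VALID_DATA;
        break;
    case -EUCLEAN:     /* correctable bit-flips, data is valid */
        r.basket = READ_VALID_DATA;
        r.bitflips = 1;
        break;
    case -ENODATA:     /* proposed: erased page */
        r.basket = READ_NO_DATA;
        break;
    case -ENOMSG:      /* proposed: erased page with bit-flips */
        r.basket = READ_NO_DATA;
        r.bitflips = 1;
        break;
    case -EBADMSG:     /* un-correctable bit-flips */
    default:
        break;         /* stays READ_INVALID_DATA */
    }
    return r;
}
```

The point of the sketch is that the caller gets the classification from one switch on the return code, without scanning the read buffer again.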

with regards, pekon


* Re: UBIFS seeing corrupt blank pages when image flashed via u-boot
  2014-01-16  7:44               ` Gupta, Pekon
@ 2014-01-16  8:22                 ` Artem Bityutskiy
  0 siblings, 0 replies; 13+ messages in thread
From: Artem Bityutskiy @ 2014-01-16  8:22 UTC (permalink / raw)
  To: Gupta, Pekon
  Cc: u-boot@lists.denx.de,
	Stefan Roese <sr@denx.de> (sr@denx.de),
	linux-mtd@lists.infradead.org, Calvin Johnson

On Thu, 2014-01-16 at 07:44 +0000, Gupta, Pekon wrote:
> However, *assuming NAND driver can identify erased-page correctly*,
> I don't want  UBI/UBIFS to re-check the read_buf for 0xff again, because
>  underlying NAND driver has already identified as erased-page, and
> fixed the data before passing it to above MTD layer.

I would agree only if this identification costs nothing, or very little.
If for _some_ setups this would be about comparing buffers against 0xFF
for every single read - then I am not so sure.

Indeed, UBIFS only needs to check for blank areas at the recovery time.
Normal reads do not need this. So _if_ there are measurable costs, I'd
argue that there is no need for people to pay it for nothing during
normal file I/O.

Hmm?

Thanks!

-- 
Best Regards,
Artem Bityutskiy
