public inbox for linux-mtd@lists.infradead.org
* [RFC PATCH] UBI fixable bit-flip issue
@ 2018-08-17  0:34 Mark Spieth
  2018-08-17  8:25 ` Boris Brezillon
  0 siblings, 1 reply; 8+ messages in thread
From: Mark Spieth @ 2018-08-17  0:34 UTC
  To: linux-mtd

Hi

Richard Weinberger suggested I post this here. It has also been posted
to the U-Boot mailing list.

In the process of investigating a boot failure on one of our devices, the

UBI: fixable bit-flip detected at PEB

message was seen with the following behaviour during kernel load in u-boot.

Read [2285568] bytes
UBI: fixable bit-flip detected at PEB 415
UBI: schedule PEB 415 for scrubbing
UBI: fixable bit-flip detected at PEB 415
UBI: fixable bit-flip detected at PEB 419
UBI: schedule PEB 419 for scrubbing
UBI: fixable bit-flip detected at PEB 419
UBI: fixable bit-flip detected at PEB 420
UBI: schedule PEB 420 for scrubbing
UBI: fixable bit-flip detected at PEB 420
UBI: fixable bit-flip detected at PEB 419
UBI: fixable bit-flip detected at PEB 420
UBI: fixable bit-flip detected at PEB 419
UBI: fixable bit-flip detected at PEB 420
UBI: fixable bit-flip detected at PEB 419
UBI: fixable bit-flip detected at PEB 420
UBI: fixable bit-flip detected at PEB 419
UBI: fixable bit-flip detected at PEB 420
UBI: fixable bit-flip detected at PEB 419
UBI: fixable bit-flip detected at PEB 420
UBI: fixable bit-flip detected at PEB 419

This repeats until reset.

U-Boot is a patched version of 2010.06 supplied by the chip vendor. No
newer version is available from the vendor to try.

The patches include the init eba/wl swap.

A more detailed log with debugging available follows:

UBI: fixable bit-flip detected at PEB 419
UBI DBG: schedule_erase: schedule erasure of PEB 419, EC 19, torture 0
UBI DBG: erase_worker: erase PEB 419 EC 19
UBI DBG: sync_erase: erase PEB 419, old EC 19
UBI DBG: do_sync_erase: erase PEB 419
UBI DBG: sync_erase: erased PEB 419, new EC 20
UBI DBG: ubi_io_write_ec_hdr: write EC header to PEB 419
UBI DBG: ubi_io_write: write 2048 bytes to PEB 419:0
UBI DBG: ensure_wear_leveling: schedule scrubbing
UBI DBG: wear_leveling_worker: scrub PEB 420 to PEB 419
UBI DBG: ubi_io_read_vid_hdr: read VID header from PEB 420
UBI DBG: ubi_io_read: read 2048 bytes from PEB 420:2048
UBI DBG: ubi_eba_copy_leb: copy LEB 6:11, PEB 420 to PEB 419
UBI DBG: ubi_eba_copy_leb: read 126976 bytes of data
UBI DBG: ubi_io_read: read 126976 bytes from PEB 420:4096
UBI: fixable bit-flip detected at PEB 420
UBI DBG: ubi_io_write_vid_hdr: write VID header to PEB 419
UBI DBG: ubi_io_write: write 2048 bytes to PEB 419:2048
UBI DBG: ubi_io_read_vid_hdr: read VID header from PEB 419
UBI DBG: ubi_io_read: read 2048 bytes from PEB 419:2048
UBI DBG: ubi_io_write: write 126976 bytes to PEB 419:4096
UBI DBG: ubi_io_read: read 126976 bytes from PEB 419:4096
UBI: fixable bit-flip detected at PEB 419
UBI DBG: schedule_erase: schedule erasure of PEB 419, EC 20, torture 0
UBI DBG: erase_worker: erase PEB 419 EC 20
UBI DBG: sync_erase: erase PEB 419, old EC 20
UBI DBG: do_sync_erase: erase PEB 419
UBI DBG: sync_erase: erased PEB 419, new EC 21
UBI DBG: ubi_io_write_ec_hdr: write EC header to PEB 419
UBI DBG: ubi_io_write: write 2048 bytes to PEB 419:0
UBI DBG: ensure_wear_leveling: schedule scrubbing
UBI DBG: wear_leveling_worker: scrub PEB 420 to PEB 419
UBI DBG: ubi_io_read_vid_hdr: read VID header from PEB 420
UBI DBG: ubi_io_read: read 2048 bytes from PEB 420:2048
UBI DBG: ubi_eba_copy_leb: copy LEB 6:11, PEB 420 to PEB 419
UBI DBG: ubi_eba_copy_leb: read 126976 bytes of data
UBI DBG: ubi_io_read: read 126976 bytes from PEB 420:4096
UBI: fixable bit-flip detected at PEB 420
UBI DBG: ubi_io_write_vid_hdr: write VID header to PEB 419
UBI DBG: ubi_io_write: write 2048 bytes to PEB 419:2048
UBI DBG: ubi_io_read_vid_hdr: read VID header from PEB 419
UBI DBG: ubi_io_read: read 2048 bytes from PEB 419:2048
UBI DBG: ubi_io_write: write 126976 bytes to PEB 419:4096
UBI DBG: ubi_io_read: read 126976 bytes from PEB 419:4096
UBI: fixable bit-flip detected at PEB 419

Investigation showed that a read with correctable bit errors had
completed and returned -EUCLEAN to the UBI read function.

I then read
https://lists.denx.de/pipermail/u-boot/2013-September/161961.html, which
details a workaround: the NAND reader does not return EUCLEAN unless the
number of corrected bits exceeds 75% of the number of correctable bits
for the read. This was implemented in this version of UBI in U-Boot
2010.06, and it does hide the infinite bit-flip loop since this is new
NAND flash. The original 2010.06 implementation returns EUCLEAN for any
number of fixable bit-flips and thus causes the PEB to be moved to the
best free one (scrub mode in wear_leveling_worker).
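
As an illustration of the threshold described above, here is a minimal
sketch in C. It is not the vendor's or the kernel's code: read_status and
its parameters are hypothetical names, and the comparison follows the
"exceeds 75%" wording used in this message. A threshold of 0 reproduces
the original behaviour, where any corrected bit requests scrubbing.

```c
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value, defined only if the libc lacks it */
#endif

/*
 * Hypothetical helper: decide what a NAND read reports once ECC has
 * corrected `corrected` bits in a step that can correct at most
 * `strength` bits. `threshold_pct` is the share of `strength` that must
 * be exceeded before scrubbing is requested.
 */
static int read_status(unsigned int corrected, unsigned int strength,
		       unsigned int threshold_pct)
{
	if (corrected * 100 > strength * threshold_pct)
		return -EUCLEAN;	/* data is good, but ask for scrubbing */
	return 0;			/* clean enough, no scrubbing */
}
```

With 4-bit correction, read_status(1, 4, 0) requests scrubbing after a
single corrected bit, while read_status(3, 4, 75) does not and
read_status(4, 4, 75) does.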

This fix is not a root cause fix though. Investigating further led to 
the following root cause solution. The following is AFAICT.

When scrubbing, the PEB to move the data to is chosen from the free
balanced tree. This tree is sorted by EC (erase count) and then by PEB
number.

The find_wl_entry call uses a max parameter of WL_FREE_MAX_DIFF, which is
8192 in this config. So the find_wl_entry function will find a PEB that
is better in erase count than the current PEB EC. This can easily cause
it to find the PEB that was just moved from if it is the lowest numbered
PEB in the free tree. Waiting for EC to go above 8192 would take a long
time and cause premature aging of the flash PEBs in question.
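
The selection described above can be modelled with a sorted array
instead of the red-black tree. This is a minimal sketch based on one
reading of find_wl_entry in mainline drivers/mtd/ubi/wl.c, not the kernel
code itself; struct wl_entry and pick_free are names made up for the
illustration, and the vendor U-Boot copy may differ. Under that reading
the function returns the last entry in tree order whose EC is below the
lowest free EC plus the max parameter, and falls back to the first entry
when none qualifies.

```c
#include <stddef.h>

/* Simplified stand-in for struct ubi_wl_entry (assumption). */
struct wl_entry {
	int ec;		/* erase counter */
	int pnum;	/* physical eraseblock number */
};

/*
 * Array model of the selection. `free_list` must be sorted by ec, then
 * pnum, like the free tree, and n must be at least 1. Returns the index
 * of the last entry whose ec is below (lowest ec + diff), or 0 if no
 * entry satisfies that.
 */
static size_t pick_free(const struct wl_entry *free_list, size_t n, int diff)
{
	int max = free_list[0].ec + diff;
	size_t best = 0;

	for (size_t i = 0; i < n; i++) {
		if (free_list[i].ec < max)
			best = i;
	}
	return best;
}
```

For a free list holding PEB 100 (EC 5), PEB 101 (EC 6) and a freshly
erased PEB 419 (EC 20), a limit of 8192 selects PEB 419 again in this
model, while a limit of 0 selects PEB 100.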

The easy solution is to change the max parameter passed to this call to
0 when scrubbing, so it finds a PEB with a smaller EC than the one being
replaced. This means it won't use the previously discarded PEB as its
first choice.

This fix was implemented and fixable bit-flip errors no longer 
hang/freeze the boot process! UBI erase and reformat was used between 
re-tests to get consistent results.

Adding the above 75% correctable bitflip threshold is also a good thing 
as less movement will ensue when the FLASH is new, but as the flash 
ages, the root cause will once again be invoked causing un-recoverable 
boot failures.

Note this fault is also in the latest kernel drivers for UBI and may 
also exist in other wear leveling implementations. The kernel driver 
issue may be at fault for Android devices locking up/freezing 
sporadically during FLASH read when scrubbing with a relatively full 
flash and marginally correctable errors causing ping pong PEB moves.

The following patch is a workaround and is almost certainly not an 
optimal solution.

What is required for CONFIG_MTD_UBI_FASTMAP is uncertain.

I am in the process of writing a unit test to highlight this ping-pong
move behaviour but have not completed that yet.

I hope this description is clear enough.

Regards

Mark

==== PATCH ====
Date:   Fri Jul 13 12:10:20 2018 +1000

     wear leveling scrubbing fixable bit flip fix

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index f66b3b22f328..a1cfadd3b395 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -736,7 +736,7 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
          /* Perform scrubbing */
          scrubbing = 1;
          e1 = rb_entry(rb_first(&ubi->scrub), struct ubi_wl_entry, u.rb);
-        e2 = get_peb_for_wl(ubi);
+        e2 = get_peb_for_wl_scrubbing(ubi);
          if (!e2)
              goto out_cancel;

@@ -1878,6 +1878,19 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
      return e;
  }

+static struct ubi_wl_entry *get_peb_for_wl_scrubbing(struct ubi_device *ubi)
+{
+    struct ubi_wl_entry *e;
+
+    e = find_wl_entry(ubi, &ubi->free, 0);
+    self_check_in_wl_tree(ubi, e, &ubi->free);
+    ubi->free_count--;
+    ubi_assert(ubi->free_count >= 0);
+    rb_erase(&e->u.rb, &ubi->free);
+
+    return e;
+}
+
  /**
   * produce_free_peb - produce a free physical eraseblock.
   * @ubi: UBI device description object
diff --git a/drivers/mtd/ubi/wl.h b/drivers/mtd/ubi/wl.h
index a9e2d669acd8..579f7c729b5d 100644
--- a/drivers/mtd/ubi/wl.h
+++ b/drivers/mtd/ubi/wl.h
@@ -6,6 +6,10 @@ static int anchor_pebs_available(struct rb_root *root);
  static void update_fastmap_work_fn(struct work_struct *wrk);
  static struct ubi_wl_entry *find_anchor_wl_entry(struct rb_root *root);
  static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi);
+static inline struct ubi_wl_entry *get_peb_for_wl_scrubbing(struct ubi_device *ubi)
+{
+    return get_peb_for_wl(ubi);
+}
  static void ubi_fastmap_close(struct ubi_device *ubi);
  static inline void ubi_fastmap_init(struct ubi_device *ubi, int *count)
  {
@@ -18,6 +22,7 @@ static struct ubi_wl_entry *may_reserve_for_fm(struct ubi_device *ubi,
                             struct rb_root *root);
  #else /* !CONFIG_MTD_UBI_FASTMAP */
  static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi);
+static struct ubi_wl_entry *get_peb_for_wl_scrubbing(struct ubi_device *ubi);
  static inline void ubi_fastmap_close(struct ubi_device *ubi) { }
  static inline void ubi_fastmap_init(struct ubi_device *ubi, int *count) { }
  static struct ubi_wl_entry *may_reserve_for_fm(struct ubi_device *ubi,


* Re: [RFC PATCH] UBI fixable bit-flip issue
  2018-08-17  0:34 [RFC PATCH] UBI fixable bit-flip issue Mark Spieth
@ 2018-08-17  8:25 ` Boris Brezillon
  2018-08-17 14:33   ` Mark Spieth
  0 siblings, 1 reply; 8+ messages in thread
From: Boris Brezillon @ 2018-08-17  8:25 UTC
  To: Mark Spieth; +Cc: linux-mtd, Richard Weinberger

Hi Mark,

On Fri, 17 Aug 2018 10:34:21 +1000
Mark Spieth <mspieth@digivation.com.au> wrote:

> Having read 
> https://lists.denx.de/pipermail/u-boot/2013-September/161961.html which 
> details a workaround to not return EUCLEAN from the NAND reader unless 
> the number of fixed bits returned was 75% of the total number of 
> correctable bits was exceeded during the read. This was impleneted in 
> this version of ubi in uboot 2010.06 and it does hide the bit-flip 
> infinite issue since this is new NAND FLASH. The original 2010.06 
> implementation returns EUCLEAN for any number of fixable bit flips and 
> thus causes the PEB move to the best free one (scrub mode in 
> wear_leveling_worker).

What are your NAND ECC requirements, and how many bitflips do you
actually have in those blocks? Also, which NAND controller are we
talking about?

> 
> This fix is not a root cause fix though. Investigating further led to 
> the following root cause solution. The following is AFAICT.
> 
> When the scrubber chooses a PEB to move the from the free balanced tree. 
> This tree is sorted by EC (erase count) and then by PEB number.
> 
> The find_wl_entry call uses a max parameter of WL_FREE_MAX_DIFF which is 
> 8192 in this config. So the find_wl_entry function will find a PEB that 
> is better in erase count that the current PEB EC. This can easily cause 
> it to find the PEB that was just moved from if it is the lowest numbered 
> PEB in the free tree. Waiting for EC to go above 8192 would take a long 
> time and cause premature aging of the flash PEBs in question.
> 
> The easy solution is to change the max parameter for scrubbing to this 
> call to 0 so it finds a PEB with a smaller EC than the one being 
> replaced. This means it wont use the previously discarded PEB as its 
> first choice.

Setting it to 0 sounds a bit aggressive. I guess the idea behind this
MAX_DIFF was to avoid spending too much time searching for the smallest
EC val when most of them are close enough. On the other hand, 8192 is
big and probably only suitable for NANDs that allow 100000 PE cycles.

> 
> This fix was implemented and fixable bit-flip errors no longer 
> hang/freeze the boot process! UBI erase and reformat was used between 
> re-tests to get consistent results.
> 
> Adding the above 75% correctable bitflip threshold is also a good thing 
> as less movement will ensue when the FLASH is new, but as the flash 
> ages, the root cause will once again be invoked causing un-recoverable 
> boot failures.

It shouldn't. As long as you configure the threshold to a proper
value (if you think 75% is too high, set it to 50%) UBI should have
time to detect blocks containing too many bitflips and move them
around.

> 
> Note this fault is also in the latest kernel drivers for UBI and may 
> also exist in other wear leveling implementations. The kernel driver 
> issue may be at fault for android devices locking up/freezing 
> sporadically during FLASH read when scrubbing with a relatively full 
> flash and marginally correctable errors causing ping pong PEB moves.
> 
> The following patch is a workaround and is almost certainly not an 
> optimal solution.
> 
> What is required for CONFIG_MTD_UBI_FASTMAP is uncertain.
> 
> I am in the process of writing a unit test to highlight this ping ping 
> move behaviour but have not completed that yet.
> 
> I hope this description is clear enough.

Well, I think selecting the bitflip threshold properly is really
important, simply because some NANDs (including SLC NANDs) are showing
bitflips even on blocks that have a low EC. Check the NAND ECC
requirements, and if it's something like 8bit/512bytes, I guess that's
more or less expected (it all depends on how many bitflips you have in
the faulty block). It's less likely on NANDs requiring 1bit/512bytes
ECC, and if that happens on such NANDs, you may have a problem in the
controller driver.

Regards,

Boris


* Re: [RFC PATCH] UBI fixable bit-flip issue
  2018-08-17  8:25 ` Boris Brezillon
@ 2018-08-17 14:33   ` Mark Spieth
  2018-08-17 14:53     ` Boris Brezillon
  0 siblings, 1 reply; 8+ messages in thread
From: Mark Spieth @ 2018-08-17 14:33 UTC
  To: Boris Brezillon; +Cc: linux-mtd, Richard Weinberger

On 8/17/2018 6:25 PM, Boris Brezillon wrote:
> Hi Mark,
>
> On Fri, 17 Aug 2018 10:34:21 +1000
> Mark Spieth <mspieth@digivation.com.au> wrote:
>
>> Having read
>> https://lists.denx.de/pipermail/u-boot/2013-September/161961.html which
>> details a workaround to not return EUCLEAN from the NAND reader unless
>> the number of fixed bits returned was 75% of the total number of
>> correctable bits was exceeded during the read. This was impleneted in
>> this version of ubi in uboot 2010.06 and it does hide the bit-flip
>> infinite issue since this is new NAND FLASH. The original 2010.06
>> implementation returns EUCLEAN for any number of fixable bit flips and
>> thus causes the PEB move to the best free one (scrub mode in
>> wear_leveling_worker).
> What's your NAND ECC requirements, and how many bitflips do you
> actually have in those blocks? Also, which NAND controller are we
> talking about?
I will get you the NAND and chip info on Monday.
It is a SoC by Lantiq/Intel, so no external controller. No hardware ECC
anyway.
4 bits per 512 byte block correction (software).
75% means 4 bits are corrected before -EUCLEAN is returned, though
the data is good.
The original NAND flash was brand new in a unit straight from
production. The U-Boot driver triggered EUCLEAN after a single bit was
corrected, so only 1 bit triggered this in the original UBI driver (2010
vintage, prior to the 75% threshold being added). Otherwise we would not
have seen the issue. This affected approx 0.4% of units from production
(43k units, with approx 200 failing with recurrent bit-flip errors and
unbootable at the time, prior to the attached patch plus the 75%
threshold patch, i.e. 4 bits to trigger scrubbing).
>
>> This fix is not a root cause fix though. Investigating further led to
>> the following root cause solution. The following is AFAICT.
>>
>> When the scrubber chooses a PEB to move the from the free balanced tree.
>> This tree is sorted by EC (erase count) and then by PEB number.
>>
>> The find_wl_entry call uses a max parameter of WL_FREE_MAX_DIFF which is
>> 8192 in this config. So the find_wl_entry function will find a PEB that
>> is better in erase count that the current PEB EC. This can easily cause
>> it to find the PEB that was just moved from if it is the lowest numbered
>> PEB in the free tree. Waiting for EC to go above 8192 would take a long
>> time and cause premature aging of the flash PEBs in question.
>>
>> The easy solution is to change the max parameter for scrubbing to this
>> call to 0 so it finds a PEB with a smaller EC than the one being
>> replaced. This means it wont use the previously discarded PEB as its
>> first choice.
> Setting it to 0 sounds a bit aggressive. I guess the idea behind this
> MAX_DIFF was to avoid spending too much time searching for the smallest
> EC val when most of them are close enough. On the other hand, 8192 is
> big an probably only suitable for NANDs that allows 100000 PE cycles.
I did not know the design behind this threshold and chose 0 so it would 
pick the least erased PEB which should be a better choice than the first 
one that is less than 4000.
What would be better is a way to detect scrubbing reusing a PEB that was 
used in the same scrubbing session so that the infinite loop does not 
occur. In our case a hardware watchdog kicks in and it reboots, with the 
same error sequence and no boot as a result. This occurs forever, but we
didn't wait long enough for the PEBs in question to be destroyed.
I did say my solution was not ideal. :-)
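
The idea of detecting reuse within a scrubbing session could look
roughly like the sketch below. Everything in it is hypothetical: UBI has
no struct scrub_history, and the names, the fixed size of 8 and the
fallback policy are assumptions made only to illustrate the idea.

```c
#include <stdbool.h>
#include <stddef.h>

#define RECENT_MAX 8	/* arbitrary size chosen for the sketch */

/* Hypothetical per-session record of PEBs that were scrub sources. */
struct scrub_history {
	int pnum[RECENT_MAX];
	size_t count;	/* number of valid slots, at most RECENT_MAX */
	size_t next;	/* slot to overwrite next */
};

/* Remember a PEB, overwriting the oldest record once the ring is full. */
static void history_add(struct scrub_history *h, int pnum)
{
	h->pnum[h->next] = pnum;
	h->next = (h->next + 1) % RECENT_MAX;
	if (h->count < RECENT_MAX)
		h->count++;
}

static bool history_contains(const struct scrub_history *h, int pnum)
{
	for (size_t i = 0; i < h->count; i++) {
		if (h->pnum[i] == pnum)
			return true;
	}
	return false;
}

/*
 * Pick the first candidate, in the caller's preferred order, that was
 * not a scrub source in this session. Falls back to candidates[0] when
 * all of them were, since a nearly full device may have no other
 * choice. n must be at least 1.
 */
static int pick_scrub_target(const struct scrub_history *h,
			     const int *candidates, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!history_contains(h, candidates[i]))
			return candidates[i];
	}
	return candidates[0];
}
```

The fallback means this only breaks the immediate ping-pong; it does not
help once every free PEB has already been a scrub source.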
>> This fix was implemented and fixable bit-flip errors no longer
>> hang/freeze the boot process! UBI erase and reformat was used between
>> re-tests to get consistent results.
>>
>> Adding the above 75% correctable bitflip threshold is also a good thing
>> as less movement will ensue when the FLASH is new, but as the flash
>> ages, the root cause will once again be invoked causing un-recoverable
>> boot failures.
> It shouldn't. As long as you configure the threshold to a proper
> value (if you think 75% is too high, set it to 50%) UBI should have
> time to detect blocks containing too many bitflips and move them
> around.
This threshold is not a root cause. It hides the root cause. When the 
flash ages and hits the condition, the same infinite loop will occur on 
scrubbing, thus IO locking the disk subsystem, effectively freezing the 
OS. My old (5 years) Samsung Galaxy 4 is currently doing exactly this. 
My analysis may be wrong though. And it may affect other flash wear 
leveling filesystems too. IDK.
>> Note this fault is also in the latest kernel drivers for UBI and may
>> also exist in other wear leveling implementations. The kernel driver
>> issue may be at fault for android devices locking up/freezing
>> sporadically during FLASH read when scrubbing with a relatively full
>> flash and marginally correctable errors causing ping pong PEB moves.
>>
>> The following patch is a workaround and is almost certainly not an
>> optimal solution.
>>
>> What is required for CONFIG_MTD_UBI_FASTMAP is uncertain.
>>
>> I am in the process of writing a unit test to highlight this ping ping
>> move behaviour but have not completed that yet.
>>
>> I hope this description is clear enough.
> Well, I think selecting the bitflip threshold properly is really
> important, simply because some NANDs (including SLC NANDs) are showing
> bitflips even on blocks that have a low EC. Check the NAND ECC
> requirements, and if it's something like 8bit/512bytes, I guess that's
> more or less expected (it all depends on how many bitflips you have in
> the faulty block). It's less likely on NANDs requiring 1bit/512bytes
> ECC, and if that happens on such NANDs, you may have a problem in the
> controller driver.
4 bits ECC per 512 bytes, from memory 28 bytes in OOB, using software 
ECC in the MTD driver.
As I said, I believe the better threshold is hiding the root cause. It 
is only a band-aid.

Thanks for looking into this Boris.

Mark


* Re: [RFC PATCH] UBI fixable bit-flip issue
  2018-08-17 14:33   ` Mark Spieth
@ 2018-08-17 14:53     ` Boris Brezillon
  2018-08-17 15:22       ` Boris Brezillon
  0 siblings, 1 reply; 8+ messages in thread
From: Boris Brezillon @ 2018-08-17 14:53 UTC
  To: Mark Spieth; +Cc: linux-mtd, Richard Weinberger

On Sat, 18 Aug 2018 00:33:25 +1000
Mark Spieth <mspieth@digivation.com.au> wrote:

> >> I hope this description is clear enough.  
> > Well, I think selecting the bitflip threshold properly is really
> > important, simply because some NANDs (including SLC NANDs) are showing
> > bitflips even on blocks that have a low EC. Check the NAND ECC
> > requirements, and if it's something like 8bit/512bytes, I guess that's
> > more or less expected (it all depends on how many bitflips you have in
> > the faulty block). It's less likely on NANDs requiring 1bit/512bytes
> > ECC, and if that happens on such NANDs, you may have a problem in the
> > controller driver.  
> 4 bits ECC per 512 bytes, from memory 28 bytes in OOB, using software 
> ECC in the MTD driver.
> As I said, I believe the better threshold is hiding the root cause. It 
> is only a band-aid.

What you describe will anyway happen sooner or later: if you're using
almost all LEBs, and the remaining free ones are all impacted by the
correctable bit-flip issue you'll have to use them anyway. So, yes,
this is a band-aid, just like your solution is just improving things
but not really solving the issue. This being said, if the blocks
really show too many bitflips, they should be marked bad at some point,
because during the scrubbing process we do write a pattern and check
that we can read it back. I'll have to double check, but I think we're
also checking for EUCLEAN and mark the block bad when that happens.

Another option would be to order free blocks, not only by
descending erase counters, but also by number of times the upper layer
reported EUCLEAN on them. That would imply adding a new field to the EC
header, but I think both Richard and I are open to discussing that.
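
A possible shape for such an ordering is sketched below as a qsort
comparator. The euclean_count field is hypothetical (no such field exists
in the EC header today), and ranking it ahead of the erase counter, with
the least worn block first, is only one of several orderings that could
be chosen.

```c
#include <stdlib.h>

/* Hypothetical view of a free PEB for ordering purposes. */
struct free_peb {
	int pnum;		/* physical eraseblock number */
	int ec;			/* erase counter */
	int euclean_count;	/* assumed: times EUCLEAN was reported */
};

/*
 * Order by fewest EUCLEAN reports first, then lowest erase counter,
 * then lowest PEB number to keep the ordering total.
 */
static int cmp_free_peb(const void *a, const void *b)
{
	const struct free_peb *x = a;
	const struct free_peb *y = b;

	if (x->euclean_count != y->euclean_count)
		return x->euclean_count < y->euclean_count ? -1 : 1;
	if (x->ec != y->ec)
		return x->ec < y->ec ? -1 : 1;
	if (x->pnum != y->pnum)
		return x->pnum < y->pnum ? -1 : 1;
	return 0;
}
```

Sorting an array with qsort(v, n, sizeof(v[0]), cmp_free_peb) puts blocks
that never reported EUCLEAN ahead of blocks such as 419 and 420 in the
log above, regardless of their erase counters.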

Regards,

Boris


* Re: [RFC PATCH] UBI fixable bit-flip issue
  2018-08-17 14:53     ` Boris Brezillon
@ 2018-08-17 15:22       ` Boris Brezillon
  2018-08-20  0:40         ` Mark Spieth
  0 siblings, 1 reply; 8+ messages in thread
From: Boris Brezillon @ 2018-08-17 15:22 UTC
  To: Mark Spieth; +Cc: linux-mtd, Richard Weinberger

On Fri, 17 Aug 2018 16:53:22 +0200
Boris Brezillon <boris.brezillon@bootlin.com> wrote:

> On Sat, 18 Aug 2018 00:33:25 +1000
> Mark Spieth <mspieth@digivation.com.au> wrote:
> 
> > >> I hope this description is clear enough.    
> > > Well, I think selecting the bitflip threshold properly is really
> > > important, simply because some NANDs (including SLC NANDs) are showing
> > > bitflips even on blocks that have a low EC. Check the NAND ECC
> > > requirements, and if it's something like 8bit/512bytes, I guess that's
> > > more or less expected (it all depends on how many bitflips you have in
> > > the faulty block). It's less likely on NANDs requiring 1bit/512bytes
> > > ECC, and if that happens on such NANDs, you may have a problem in the
> > > controller driver.    
> > 4 bits ECC per 512 bytes, from memory 28 bytes in OOB, using software 
> > ECC in the MTD driver.
> > As I said, I believe the better threshold is hiding the root cause. It 
> > is only a band-aid.  
> 
> What you describe will anyway happen sooner or later: if you're using
> almost all LEBs, and the remaining free ones are all impacted by the
> correctable bit-flip issue you'll have to use them anyway. So, yes,
> this is a band-aid, just like your solution is just improving things
> but not really solving the issue. This being said, if the blocks
> really show too many bitflips, they should be marked bad at some point,
> because during the scrubbing process we do write a pattern and check
> that we can read it back. I'll have to double check, but I think we're
> also checking for EUCLEAN and mark the block bad when that happens.

Hm, actually we're not torturing the source PEB when moving a LEB
because of bitflips (probably because it's expensive and tends to wear
the block even faster) :-/. The destination PEB is tortured if we fail
to read the VID header back, which is definitely not a guarantee that
other data are readable or do not contain too many bitflips.

There's definitely something to improve there.


* Re: [RFC PATCH] UBI fixable bit-flip issue
  2018-08-17 15:22       ` Boris Brezillon
@ 2018-08-20  0:40         ` Mark Spieth
  2018-08-20  8:36           ` Boris Brezillon
  0 siblings, 1 reply; 8+ messages in thread
From: Mark Spieth @ 2018-08-20  0:40 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, Richard Weinberger


On 18/08/18 01:22, Boris Brezillon wrote:
> On Fri, 17 Aug 2018 16:53:22 +0200
> Boris Brezillon <boris.brezillon@bootlin.com> wrote:
>
>> On Sat, 18 Aug 2018 00:33:25 +1000
>> Mark Spieth <mspieth@digivation.com.au> wrote:
>>
>>>>> I hope this description is clear enough.
>>>> Well, I think selecting the bitflip threshold properly is really
>>>> important, simply because some NANDs (including SLC NANDs) are showing
>>>> bitflips even on blocks that have a low EC. Check the NAND ECC
>>>> requirements, and if it's something like 8bit/512bytes, I guess that's
>>>> more or less expected (it all depends on how many bitflips you have in
>>>> the faulty block). It's less likely on NANDs requiring 1bit/512bytes
>>>> ECC, and if that happens on such NANDs, you may have a problem in the
>>>> controller driver.
>>> 4 bits ECC per 512 bytes, from memory 28 bytes in OOB, using software
>>> ECC in the MTD driver.
>>> As I said, I believe the better threshold is hiding the root cause. It
>>> is only a band-aid.
>> What you describe will anyway happen sooner or later: if you're using
>> almost all LEBs, and the remaining free ones are all impacted by the
>> correctable bit-flip issue you'll have to use them anyway. So, yes,
>> this is a band-aid, just like your solution is just improving things
>> but not really solving the issue. This being said, if the blocks
>> really show too many bitflips, they should be marked bad at some point,
>> because during the scrubbing process we do write a pattern and check
>> that we can read it back. I'll have to double check, but I think we're
>> also checking for EUCLEAN and mark the block bad when that happens.
> Hm, actually we're not torturing the source PEB when moving a LEB
> because of bitflips (probably because it's expensive and tends to wear
> the block even faster) :-/. The destination PEB is tortured if we fail
> to read the VID header back, which is definitely not a guarantee that
> other data are readable or do not contain too many bitflips.
>
> There's definitely something to improve there.
Hi Boris,

The flash in use is a Macronix MX30LF1G18AC and uses ONFI mode.

My understanding of the problem is that when a block is read (say
kernel+initrd) and one of the PEBs reads OK but with corrected bit
errors, scrub mode is enabled.
UBI then finds a suitable PEB and copies the data to it. It verifies
this copy, detects a corrected bit error there too, and frees the PEB it
copied from, since that one read OK, although with corrected errors. The
new PEB now needs scrubbing as well, so UBI looks for a suitable PEB to
copy to and picks the original PEB it has just moved the data from! It
does the whole copy and readback verify again, again with corrected
errors. This continues forever (or until a PEB fails to verify, which
could be a while). Naturally the block read never completes.

This is the behaviour I observed in the older driver with lots of print 
debugging. This may not be the behaviour in the current master, but I 
suspect it is.
Some way of detecting this loop within a scrubbing session would be
optimal, but from my examination of the UBI scrubber it looks complex to
do. It shouldn't require a persisted header change, though.
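
A minimal sketch of what RAM-only loop detection could look like,
assuming a scrubbing session is allowed to keep a small bitmap of the
PEBs it has already used as copy destinations. Every name below
(scrub_session, scrub_session_start, scrub_session_test_and_mark,
SKETCH_MAX_PEBS) is hypothetical and not part of UBI; nothing is
written to flash, so no header change is involved.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_PEBS  1024	/* hypothetical upper bound on PEBs */
#define SKETCH_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* RAM-only record of the PEBs one scrubbing session has already tried. */
struct scrub_session {
	unsigned long seen[(SKETCH_MAX_PEBS + SKETCH_WORD_BITS - 1) /
			   SKETCH_WORD_BITS];
};

/* Forget everything; call this when a new scrubbing session begins. */
static void scrub_session_start(struct scrub_session *s)
{
	memset(s, 0, sizeof(*s));
}

/*
 * Returns true if pnum was already tried in this session (or is out of
 * range), which the caller can treat as "loop detected, pick another
 * PEB or give up". Otherwise records pnum and returns false.
 */
static bool scrub_session_test_and_mark(struct scrub_session *s,
					unsigned int pnum)
{
	unsigned long mask;

	if (pnum >= SKETCH_MAX_PEBS)
		return true;

	mask = 1UL << (pnum % SKETCH_WORD_BITS);
	if (s->seen[pnum / SKETCH_WORD_BITS] & mask)
		return true;

	s->seen[pnum / SKETCH_WORD_BITS] |= mask;
	return false;
}
```

With the PEB numbers from the log, marking 419 and then 420 succeeds,
and a second attempt to use 419 in the same session is reported as a
repeat. This only breaks the loop; it does not help once every free PEB
shows the same symptom.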

Regards
Mark


* Re: [RFC PATCH] UBI fixable bit-flip issue
  2018-08-20  0:40         ` Mark Spieth
@ 2018-08-20  8:36           ` Boris Brezillon
  2018-08-20 10:01             ` Arnaud Mouiche
  0 siblings, 1 reply; 8+ messages in thread
From: Boris Brezillon @ 2018-08-20  8:36 UTC (permalink / raw)
  To: Mark Spieth; +Cc: linux-mtd, Richard Weinberger

Hi Mark,

On Mon, 20 Aug 2018 10:40:14 +1000
Mark Spieth <mspieth@digivation.com.au> wrote:

> On 18/08/18 01:22, Boris Brezillon wrote:
> > On Fri, 17 Aug 2018 16:53:22 +0200
> > Boris Brezillon <boris.brezillon@bootlin.com> wrote:
> >  
> >> On Sat, 18 Aug 2018 00:33:25 +1000
> >> Mark Spieth <mspieth@digivation.com.au> wrote:
> >>  
> >>>>> I hope this description is clear enough.  
> >>>> Well, I think selecting the bitflip threshold properly is really
> >>>> important, simply because some NANDs (including SLC NANDs) are showing
> >>>> bitflips even on blocks that have a low EC. Check the NAND ECC
> >>>> requirements, and if it's something like 8bit/512bytes, I guess that's
> >>>> more or less expected (it all depends on how many bitflips you have in
> >>>> the faulty block). It's less likely on NANDs requiring 1bit/512bytes
> >>>> ECC, and if that happens on such NANDs, you may have a problem in the
> >>>> controller driver.  
> >>> 4 bits ECC per 512 bytes, from memory 28 bytes in OOB, using software
> >>> ECC in the MTD driver.
> >>> As I said, I believe the better threshold is hiding the root cause. It
> >>> is only a band-aid.  
> >> What you describe will anyway happen sooner or later: if you're using
> >> almost all LEBs, and the remaining free ones are all impacted by the
> >> correctable bit-flip issue you'll have to use them anyway. So, yes,
> >> this is a band-aid, just like your solution is just improving things
> >> but not really solving the issue. This being said, if the blocks
> >> really show too many bitflips, they should be marked bad at some point,
> >> because during the scrubbing process we do write a pattern and check
> >> that we can read it back. I'll have to double check, but I think we're
> >> also checking for EUCLEAN and mark the block bad when that happens.  
> > Hm, actually we're not torturing the source PEB when moving a LEB
> > because of bitflips (probably because it's expensive and tends to wear
> > the block even faster) :-/. The destination PEB is tortured if we fail
> > to read the VID header back, which is definitely not a guarantee that
> > other data are readable or do not contain too many bitflips.
> >
> > There's definitely something to improve there.  
> Hi Boris,
> 
> The flash in use is a Macronix MX30LF1G18AC and uses ONFI mode.
> 
> My understanding of the problem is that when a block is read (say 
> kernel+initrd) and one of the PEBs reads ok but with corrected bit 
> errors, scrub mode is enabled.
> It then finds a suitable PEB to copy it to which it does. It then 
> verifies this copy and also detects a corrected bit error, and frees the 
> PEB it copied it from as it read ok, but with corrected errors. It then 
> finds a suitable PEB to copy it to, and finds the original PEB that it 
> moved it from! Does the whole copy and readback verify with corrected 
> errors.

You're correct, but it seems Linux is no longer reading back the data
since commit 1e0a74f10d76 ("UBI: Don't read back all data in
ubi_eba_copy_leb()"). Not sure dropping this test was such a good
idea :-/.

> This continues forever (or until the PEB does not verify which 
> could be a while). Naturally the block read never completes.

Well, Linux and U-Boot are a bit different in this regard. When you
schedule a block for erasure, the real erase operation is done in a
separate thread in Linux. Since U-Boot has no thread support, the erase
operation is done right away and the block goes back into the free map
immediately, which leads to an infinite loop if the first 2 PEBs in the
map are prone to bitflips.
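
A toy model of that difference, not U-Boot or kernel code. It assumes
the allocator always hands out the lowest-numbered free PEB and that
PEBs 0 and 1 show correctable bitflips after every write; the names
simulate_scrub, SIM_PEBS and the peb_state values are made up for the
illustration.

```c
#include <stdbool.h>

#define SIM_PEBS 8

enum peb_state { PEB_FREE, PEB_USED, PEB_PENDING_ERASE };

/*
 * Move one LEB away from bitflip-prone PEBs and return the number of
 * moves performed. A return value equal to max_moves means the model
 * gave up without ever reaching a clean PEB.
 *
 * sync_erase = true models erase-on-the-spot: the source PEB goes back
 * into the free map right after the move.
 * sync_erase = false models a deferred erase: the source PEB stays out
 * of the free map for the whole run.
 */
static int simulate_scrub(bool sync_erase, int max_moves)
{
	/* Assumption: PEB 0 and PEB 1 always read back with bitflips. */
	const bool flaky[SIM_PEBS] = { true, true };
	/* The LEB starts on PEB 0; every other PEB starts free. */
	enum peb_state state[SIM_PEBS] = { PEB_USED };
	int cur = 0, moves = 0;

	while (flaky[cur] && moves < max_moves) {
		int dst = -1;

		/* Pick the lowest-numbered free PEB as the destination. */
		for (int i = 0; i < SIM_PEBS; i++) {
			if (state[i] == PEB_FREE) {
				dst = i;
				break;
			}
		}
		if (dst < 0)
			break;	/* nothing left to move to */

		state[dst] = PEB_USED;
		state[cur] = sync_erase ? PEB_FREE : PEB_PENDING_ERASE;
		cur = dst;
		moves++;
	}
	return moves;
}
```

Under these assumptions, simulate_scrub(false, 1000) stops after 2
moves because PEB 0 never re-enters the free map, while
simulate_scrub(true, 1000) bounces between PEB 0 and PEB 1 until the
move budget is exhausted.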

Note that I'm not saying we shouldn't make things better for Linux too,
just trying to explain why the infinite loop issue should not happen in
Linux. Still, even Linux would keep moving LEBs around, which we
definitely don't want.

> 
> This is the behaviour I observed in the older driver with lots of print 
> debugging. This may not be the behaviour in the current master, but I 
> suspect it is.

The problem seems to be present in mainline U-Boot.

> Some way of detecting this loop in a scrubbing session would be optimal, 
> but seems complex to do from my examination of the UBI scrubber. But it 
> shouldn't require a persisted header change.

Except you're only fixing the case where you still have blocks without
such inherent bitflips (probably stuck bits), but what if all the
blocks in the free pool are subject to this symptom?

Really, we have the bitflip threshold concept for a reason, and setting
it to 1 when your engine is capable of fixing 4 bitflips sounds a bit
too extreme.
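
A minimal sketch of how such a threshold separates "corrected, nothing
to do" from "corrected, please scrub", assuming the read path reports
EUCLEAN once the worst ECC step reaches the threshold. The function
sketch_read_status and its parameters are hypothetical, and the
threshold of 3 used in the note below is only an example value.

```c
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value, defined here so the sketch builds elsewhere */
#endif

/*
 * max_bitflips:      worst number of bitflips corrected in any ECC step
 * ecc_strength:      bitflips the ECC engine can correct per step
 * bitflip_threshold: corrected bitflips at which scrubbing is requested
 */
static int sketch_read_status(unsigned int max_bitflips,
			      unsigned int ecc_strength,
			      unsigned int bitflip_threshold)
{
	if (max_bitflips > ecc_strength)
		return -EBADMSG;	/* uncorrectable */
	if (max_bitflips >= bitflip_threshold)
		return -EUCLEAN;	/* corrected, but worth scrubbing */
	return 0;			/* corrected (or clean), leave it alone */
}
```

With a 4-bit engine and a threshold of 1, a single corrected bit is
enough to return -EUCLEAN and trigger a move. With a threshold of 3,
one or two bitflips return 0 and only three or four request scrubbing.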

Also, when we realize the block we're trying to use shows too many
bitflips (above the threshold) just after writing something into it,
then it's probably time to stop using it (and mark it bad). That's what
the torture_peb() function is supposed to do.

Regards,

Boris


* Re: [RFC PATCH] UBI fixable bit-flip issue
  2018-08-20  8:36           ` Boris Brezillon
@ 2018-08-20 10:01             ` Arnaud Mouiche
  0 siblings, 0 replies; 8+ messages in thread
From: Arnaud Mouiche @ 2018-08-20 10:01 UTC (permalink / raw)
  To: linux-mtd

Hi all.

This issue reminds me of a similar one I had, also with Macronix devices.
Quickly:
- Some blocks are prone to bit errors just after writing.
- Those PEBs are frequently queued for torture, but the torture test
always passes.
- In fact, the torture test passes mostly because the errors appear on
some pages while other pages of the same PEB are being written. Since
the patterns used for testing are too simple (the same pattern for every
page), the torture test doesn't catch the issue.
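
One way to picture a less uniform test pattern is a generator that
gives every page of a PEB different content. This is only a sketch
under my own assumptions, not the existing torture test: the helper
names (xorshift32, fill_torture_page) and the seed-mixing constants are
arbitrary choices for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of a 32-bit xorshift generator; the state must be non-zero. */
static uint32_t xorshift32(uint32_t x)
{
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return x;
}

/*
 * Fill one page-sized buffer with pseudo-random bytes that depend on
 * both the page index and the test pass, so neighbouring pages of the
 * same PEB never carry the same pattern.
 */
static void fill_torture_page(uint8_t *buf, size_t len,
			      uint32_t page, uint32_t pass)
{
	/* Mix page and pass into a seed (arbitrary odd multipliers). */
	uint32_t x = (page * 2654435761u) ^ (pass * 40503u) ^ 0x9E3779B9u;

	if (x == 0)
		x = 1;	/* xorshift32 would stay stuck at zero */

	for (size_t i = 0; i < len; i++) {
		x = xorshift32(x);
		buf[i] = (uint8_t)(x >> 24);
	}
}
```

Because the generator is deterministic, the verify step can regenerate
the expected content for a given page and pass instead of keeping a
copy of what was written.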

Details here:
http://lists.infradead.org/pipermail/linux-mtd/2016-April/066628.html

Since my projects have changed, I haven't had a chance to work on the
subject again... :-(

Arnaud

On 20/08/2018 10:36, Boris Brezillon wrote:
> Hi Mark,
>
> On Mon, 20 Aug 2018 10:40:14 +1000
> Mark Spieth <mspieth@digivation.com.au> wrote:
>
>> On 18/08/18 01:22, Boris Brezillon wrote:
>>> On Fri, 17 Aug 2018 16:53:22 +0200
>>> Boris Brezillon <boris.brezillon@bootlin.com> wrote:
>>>   
>>>> On Sat, 18 Aug 2018 00:33:25 +1000
>>>> Mark Spieth <mspieth@digivation.com.au> wrote:
>>>>   
>>>>>>> I hope this description is clear enough.
>>>>>> Well, I think selecting the bitflip threshold properly is really
>>>>>> important, simply because some NANDs (including SLC NANDs) are showing
>>>>>> bitflips even on blocks that have a low EC. Check the NAND ECC
>>>>>> requirements, and if it's something like 8bit/512bytes, I guess that's
>>>>>> more or less expected (it all depends on how many bitflips you have in
>>>>>> the faulty block). It's less likely on NANDs requiring 1bit/512bytes
>>>>>> ECC, and if that happens on such NANDs, you may have a problem in the
>>>>>> controller driver.
>>>>> 4 bits ECC per 512 bytes, from memory 28 bytes in OOB, using software
>>>>> ECC in the MTD driver.
>>>>> As I said, I believe the better threshold is hiding the root cause. It
>>>>> is only a band-aid.
>>>> What you describe will anyway happen sooner or later: if you're using
>>>> almost all LEBs, and the remaining free ones are all impacted by the
>>>> correctable bit-flip issue you'll have to use them anyway. So, yes,
>>>> this is a band-aid, just like your solution is just improving things
>>>> but not really solving the issue. This being said, if the blocks
>>>> really show too many bitflips, they should be marked bad at some point,
>>>> because during the scrubbing process we do write a pattern and check
>>>> that we can read it back. I'll have to double check, but I think we're
>>>> also checking for EUCLEAN and mark the block bad when that happens.
>>> Hm, actually we're not torturing the source PEB when moving a LEB
>>> because of bitflips (probably because it's expensive and tends to wear
>>> the block even faster) :-/. The destination PEB is tortured if we fail
>>> to read the VID header back, which is definitely not a guarantee that
>>> other data are readable or do not contain too many bitflips.
>>>
>>> There's definitely something to improve there.
>> Hi Boris,
>>
>> The flash in use is a Macronix MX30LF1G18AC and uses ONFI mode.
>>
>> My understanding of the problem is that when a block is read (say
>> kernel+initrd) and one of the PEBs reads ok but with corrected bit
>> errors, scrub mode is enabled.
>> It then finds a suitable PEB to copy it to which it does. It then
>> verifies this copy and also detects a corrected bit error, and frees the
>> PEB it copied it from as it read ok, but with corrected errors. It then
>> finds a suitable PEB to copy it to, and finds the original PEB that it
>> moved it from! Does the whole copy and readback verify with corrected
>> errors.
> You're correct, but it seems Linux is no longer reading back the data
> since commit 1e0a74f10d76 ("UBI: Don't read back all data in
> ubi_eba_copy_leb()"). Not sure this was such a good idea to drop this
> test :-/.
>
>> This continues forever (or until the PEB does not verify which
>> could be a while). Naturally the block read never completes.
> Well, Linux and uboot are a bit different in this regard. When you
> schedule a block for erasure, the real erase operation is done in a
> separate thread in Linux. Since uboot has no thread support, the erase
> operation is done right away, and the block goes back in the free map
> immediately, thus leading to an infinite loop if the first 2 PEBs in the
> map are prone to bitflips.
>
> Note that I'm not saying we shouldn't make things better for Linux too,
> just trying to explain why the infinite loop issue should not happen in
> Linux. Still, even Linux would keep moving LEBs around which we
> definitely don't want.
>
>> This is the behaviour I observed in the older driver with lots of print
>> debugging. This may not be the behaviour in the current master, but I
>> suspect it is.
> The problem seems to be present in mainline (uboot).
>
>> Some way of detecting this loop in a scrubbing session would be optimal,
>> but seems complex to do from my examination of the UBI scrubber. But it
>> shouldn't require a persisted header change.
> Except you're only fixing the case where you still have blocks without
> such inherent bitflips (probably stuck bits), but what if all the
> blocks in the free pool are subject to this symptom.
>
> Really, we have the bitflip threshold concept for a reason, and setting
> it to 1 when your engine is capable of fixing 4 bitflips sounds a bit
> too extreme.
>
> Also, when we realize the block we're trying to use shows too many
> bitflips (above the threshold) just after writing something into it,
> then it's probably time to stop using it (and mark it bad). That's what
> the torture_peb() function is supposed to do.
>
> Regards,
>
> Boris
>


end of thread, other threads:[~2018-08-20 10:01 UTC | newest]

Thread overview: 8+ messages
2018-08-17  0:34 [RFC PATCH] UBI fixable bit-flip issue Mark Spieth
2018-08-17  8:25 ` Boris Brezillon
2018-08-17 14:33   ` Mark Spieth
2018-08-17 14:53     ` Boris Brezillon
2018-08-17 15:22       ` Boris Brezillon
2018-08-20  0:40         ` Mark Spieth
2018-08-20  8:36           ` Boris Brezillon
2018-08-20 10:01             ` Arnaud Mouiche
