* [PATCH v0] mtd: gpmi: Use cached syndromes to speedup erased region bitflip detection.
From: Elie De Brauwer @ 2014-01-07 19:44 UTC
To: b32955, dwmw2, dedekind1, computersforpeace, shijie8
Cc: Elie De Brauwer, linux-mtd
Hello all,
This patch is incremental with respect to 'mtd: gpmi: Deal with bitflips in
erased regions', currently in its 7th version, which I consider more or less
stable.

However, the 7th version of that patch removed a fast-path option, which
resulted in a decrease in flash read throughput: every time an erased block
is read, each individual byte has its Hamming weight calculated and bitflips
are corrected in software. I'm testing this on a poor old i.MX28, which
isn't a powerful beast.
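
For reference, that per-byte check boils down to something like the
following (a sketch mirroring the num_zero_bits() helper in the patch
below, not a verbatim excerpt; hweight8() comes from linux/bitops.h):

static unsigned int num_zero_bits(const unsigned char *data, int len)
{
        unsigned int num = 0;
        int i;

        /* An erased byte reads 0xff, so every 0 bit is a bitflip. */
        for (i = 0; i < len; i++)
                num += hweight8(~data[i]);

        return num;
}

For a 2k page this loop visits all 2048 data bytes on every read of an
erased page.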
Hence I've been looking for a way to regain some of the performance that
was lost (the price paid for making UBIFS more reliable). I've been somewhat
inspired by Pekon Gupta's work on OMAP (from which I borrowed the initial
Hamming-weight approach).
Apparently the BCH block has the possibility to write the syndrome data
to auxiliary memory (where the status bytes are also stored) when
HW_BCH_CTRL:DEBUGSYNDROME is set. Hence I followed the approach where, the
first time a correctly erased block is found, these syndromes are cached;
future reads of likely-to-be-erased blocks can then be identified by
comparing these syndromes instead of checking each individual byte. For
example, on my 2k chip I would normally compute the Hamming weight of 4
chunks of 512 bytes, i.e. 2k bytes. With ECC8 I can replace this with a
memcmp() of 4x32 bytes (4 sets of syndromes). The processor is obviously
much quicker at the latter, which regains some of the lost speed.
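
(Worked out: with an ECC strength t = 8 the BCH block produces 2*t = 16
syndrome symbols of 13 bits per chunk, each stored in a 16-bit halfword,
so 16 * 2 = 32 bytes per chunk. With 4 chunks per 2k page that is a
memcmp() over 4 * 32 = 128 bytes instead of a Hamming-weight scan over
2048 bytes.)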
I did some benchmarking on the following 2k and 4k nand chips:
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron MT29F4G08ABAEAH4), 512MiB, page size: 4096, OOB size: 224
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xda (Micron MT29F2G08ABAEAH4), 256MiB, page size: 2048, OOB size: 64
Throughput (in megabytes/s) was measured by simply doing
time dd if=/dev/mtd8 of=/dev/null bs=1M. This gave the following results:
 2k  | 4k
-----+-----
 7.0 | 11.3  <- v6 of the bitflip correction patch (broken fast path)
 4.7 |  5.9  <- v7 of the bitflip correction patch (no fast path)
 5.9 |  8.4  <- with this patch applied
(Some background info: I expect these chips to be less than half used,
hence plenty of erased blocks; also, the NAND timings are already optimized
for these chips.)

I have tested this on the chips above on Linux 3.9 and rebased the patch
onto l2-mtd (note that my testing mainly exercised the legacy geometry
path).
A side note: two sets of syndromes are stored. The first set covers the
first data block AND the metadata (more 0xff's go into that computation),
while the second set covers only the remaining n-1 data blocks, so a single
cached set would not match both cases.

Any feedback regarding this approach/patch is welcome.
Thanks
E.
Elie De Brauwer (1):
mtd: gpmi: Use cached syndromes to speedup erase region bitflip
detection.
drivers/mtd/nand/gpmi-nand/bch-regs.h | 1 +
drivers/mtd/nand/gpmi-nand/gpmi-lib.c | 4 ++
drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 95 ++++++++++++++++++++++++++++++++--
drivers/mtd/nand/gpmi-nand/gpmi-nand.h | 6 +++
4 files changed, 101 insertions(+), 5 deletions(-)
--
1.8.5.2
* [PATCH v0] mtd: gpmi: Use cached syndromes to speedup erased region bitflip detection.
From: Elie De Brauwer @ 2014-01-07 19:44 UTC
To: b32955, dwmw2, dedekind1, computersforpeace, shijie8
Cc: Elie De Brauwer, linux-mtd

When reading an erased page, detection of bitflips in this page needs to
be done in software. This is currently done by calculating the Hamming
weight of each individual byte, which scales linearly with the page size
and results in a decreased read speed.

The BCH block has an option to write the computed syndromes to auxiliary
memory for each transaction (see HW_BCH_CTRL:DEBUGSYNDROME, or also
Figure 16-4 in the i.MX28 reference manual). If the syndromes of properly
erased blocks are known, it is possible to distinguish faster between a
properly erased page and an erased page with bitflips. This comparison
scales linearly with the ECC strength and the number of data blocks the
page is split into (as opposed to the page size).

Hence the first time a properly erased block is found, its syndromes are
put in a cache, and all successive reads of an erased block compare the
syndromes (rather than the individual bytes). The only catch here is that
the syndromes (SM) of the first block cover the first data block and the
metadata, while the successive syndromes (SD) cover the other data
blocks. So two sets of syndromes need to be stored.

  +--------------------------------------------+
  | Meta | Data0 | Data1 | Data2 | ... | Datan |
  +--------------------------------------------+
  <-----SM-----> <--SD-> <--SD->  ...  <--SD->

Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com>
---
 drivers/mtd/nand/gpmi-nand/bch-regs.h  |  1 +
 drivers/mtd/nand/gpmi-nand/gpmi-lib.c  |  4 ++
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 95 ++++++++++++++++++++++++++++++++--
 drivers/mtd/nand/gpmi-nand/gpmi-nand.h |  6 +++
 4 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/mtd/nand/gpmi-nand/bch-regs.h b/drivers/mtd/nand/gpmi-nand/bch-regs.h
index b2104de..e828c59 100644
--- a/drivers/mtd/nand/gpmi-nand/bch-regs.h
+++ b/drivers/mtd/nand/gpmi-nand/bch-regs.h
@@ -26,6 +26,7 @@
 #define HW_BCH_CTRL_CLR			0x00000008
 #define HW_BCH_CTRL_TOG			0x0000000c

+#define BM_BCH_CTRL_DEBUGSYNDROME	(1 << 22)
 #define BM_BCH_CTRL_COMPLETE_IRQ_EN	(1 << 8)
 #define BM_BCH_CTRL_COMPLETE_IRQ	(1 << 0)

diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
index b6a7aa8..440cd52 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
@@ -307,6 +307,10 @@ int bch_set_geometry(struct gpmi_nand_data *this)
 	writel(erase_threshold & BM_BCH_MODE_ERASE_THRESHOLD_MASK,
 		r->bch_regs + HW_BCH_MODE);

+	/* The syndromes should be written to aux mem */
+	writel(BM_BCH_CTRL_DEBUGSYNDROME,
+		r->bch_regs + HW_BCH_CTRL_SET);
+
 	/* Set *all* chip selects to use layout 0. */
 	writel(0, r->bch_regs + HW_BCH_LAYOUTSELECT);

diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index eac6714..f0a8489 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -219,8 +219,21 @@ static bool set_geometry_by_ecc_info(struct gpmi_nand_data *this)
 	geo->payload_size = mtd->writesize;

 	geo->auxiliary_status_offset = ALIGN(geo->metadata_size, 4);
+
+	/*
+	 * The syndromes consist of 2*t 13-bit symbols, each stored
+	 * as a 16-bit halfword, per chunk. Hence 4 bytes per ECC bit
+	 * per chunk.
+	 */
+	geo->auxiliary_syndrome_size = 4 * geo->ecc_strength *
+				       geo->ecc_chunk_count;
 	geo->auxiliary_size = ALIGN(geo->metadata_size, 4)
-			      + ALIGN(geo->ecc_chunk_count, 4);
+			      + ALIGN(geo->ecc_chunk_count, 4)
+			      + geo->auxiliary_syndrome_size;
+
+	geo->auxiliary_syndrome_offset = ALIGN(geo->metadata_size, 4) +
+					 ALIGN(geo->ecc_chunk_count, 4);

 	if (!this->swap_block_mark)
 		return true;
@@ -286,8 +299,18 @@ static int legacy_set_geometry(struct gpmi_nand_data *this)
 	metadata_size = ALIGN(geo->metadata_size, 4);
 	status_size   = ALIGN(geo->ecc_chunk_count, 4);

-	geo->auxiliary_size = metadata_size + status_size;
+	/*
+	 * The syndromes consist of 2*t 13-bit symbols, each stored
+	 * as a 16-bit halfword, per chunk. Hence 4 bytes per ECC bit
+	 * per chunk.
+	 */
+	geo->auxiliary_syndrome_size = 4 * geo->ecc_strength *
+				       geo->ecc_chunk_count;
+
+	geo->auxiliary_size = metadata_size + status_size +
+			      geo->auxiliary_syndrome_size;
 	geo->auxiliary_status_offset = metadata_size;
+	geo->auxiliary_syndrome_offset = metadata_size + status_size;

 	if (!this->swap_block_mark)
 		return 0;
@@ -761,6 +784,18 @@ static void send_page_end(struct gpmi_nand_data *this,
 	dma_unmap_single(dev, used_phys, length, DMA_TO_DEVICE);
 }

+static void gpmi_free_syndrome_buffer(struct gpmi_nand_data *this)
+{
+	if (this->syndrome_data)
+		kfree(this->syndrome_data);
+
+	if (this->syndrome_metadata)
+		kfree(this->syndrome_metadata);
+
+	this->syndrome_data = NULL;
+	this->syndrome_metadata = NULL;
+}
+
 static void gpmi_free_dma_buffer(struct gpmi_nand_data *this)
 {
 	struct device *dev = this->dev;
@@ -959,25 +994,74 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
 }

 /*
+ * Count number of zero bits in a region.
+ */
+static unsigned int num_zero_bits(unsigned char *data, int len)
+{
+	int i;
+	int num = 0;
+
+	for (i = 0; i < len; i++)
+		num += hweight8(~data[i]);
+	return num;
+}
+
+/*
  * Count the number of 0 bits in a supposed to be
  * erased region and correct them. Return the number
  * of bitflips or zero when the region was correct.
  */
 static unsigned int erased_sector_bitflips(unsigned char *data,
+					   unsigned char *aux,
 					   unsigned int chunk,
-					   struct bch_geometry *geo)
+					   struct gpmi_nand_data *this)
 {
+	struct bch_geometry *geo = &this->bch_geometry;
 	unsigned int flip_bits = 0;
 	int i;
 	int base = geo->ecc_chunk_size * chunk;
+	int syndrome_size = geo->auxiliary_syndrome_size / geo->ecc_chunk_count;
+	int aux_offset = geo->auxiliary_syndrome_offset + chunk * syndrome_size;
+	char **syndrome_cache = &this->syndrome_data;
+
+	if (chunk == 0)
+		syndrome_cache = &this->syndrome_metadata;
+
+	if (*syndrome_cache) {
+		/* Fast path with cached syndromes */
+		if (memcmp(*syndrome_cache, &aux[aux_offset], syndrome_size))
+			goto slow_path;
+		return 0;
+	} else {
+		/*
+		 * Cache the syndromes if the data is valid. For chunk 0
+		 * both data and metadata should be all 0xff, for the
+		 * other chunks the data should be all 0xff.
+		 */
+		if (num_zero_bits(&data[base], geo->ecc_chunk_size) ||
+		    (chunk == 0 && num_zero_bits(&aux[0], geo->metadata_size)))
+			goto slow_path;
+
+		*syndrome_cache = kmalloc(syndrome_size, GFP_KERNEL);
+		if (!*syndrome_cache)
+			goto slow_path;
+
+		memcpy(*syndrome_cache, &aux[aux_offset], syndrome_size);
+		return 0;
+	}
+
+slow_path:
 	/* Count bitflips */
 	for (i = 0; i < geo->ecc_chunk_size; i++)
 		flip_bits += hweight8(~data[base + i]);

 	/* Correct bitflips by 0xFF'ing this chunk. */
 	if (flip_bits)
 		memset(&data[base], 0xFF, geo->ecc_chunk_size);

 	return flip_bits;
 }
@@ -1042,8 +1126,8 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 		 * the BCH block.
 		 */
 		if (*status == STATUS_ERASED)
-			flips = erased_sector_bitflips(payload_virt, i,
-						       nfc_geo);
+			flips = erased_sector_bitflips(payload_virt,
+					auxiliary_virt, i, this);
 		else
 			flips = *status;
@@ -1561,6 +1645,7 @@ static int gpmi_set_geometry(struct gpmi_nand_data *this)
 static void gpmi_nand_exit(struct gpmi_nand_data *this)
 {
 	nand_release(&this->mtd);
+	gpmi_free_syndrome_buffer(this);
 	gpmi_free_dma_buffer(this);
 }

diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/gpmi-nand/gpmi-nand.h
index 4c801fa..136f8e9 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.h
@@ -62,6 +62,8 @@ struct bch_geometry {
 	unsigned int payload_size;
 	unsigned int auxiliary_size;
 	unsigned int auxiliary_status_offset;
+	unsigned int auxiliary_syndrome_offset;
+	unsigned int auxiliary_syndrome_size;
 	unsigned int block_mark_byte_offset;
 	unsigned int block_mark_bit_offset;
 };
@@ -156,6 +158,10 @@ struct gpmi_nand_data {
 	uint8_t *upper_buf;
 	int upper_len;

+	/* Syndrome caches */
+	char *syndrome_data;
+	char *syndrome_metadata;
+
 	/* for DMA operations */
 	bool direct_dma_map_ok;

--
1.8.5.2
* Re: [PATCH v0] mtd: gpmi: Use cached syndromes to speedup erased region bitflip detection.
From: Huang Shijie @ 2014-01-08 5:38 UTC
To: Elie De Brauwer; +Cc: shijie8, computersforpeace, dwmw2, linux-mtd, dedekind1

On Tue, Jan 07, 2014 at 08:44:32PM +0100, Elie De Brauwer wrote:
[...]
> Throughput (in megabytes/s) was measured by simply doing
> time dd if=/dev/mtd8 of=/dev/null bs=1M. This gave the following results:
>
>  2k  | 4k
> -----+-----
>  7.0 | 11.3  <- v6 of the bitflip correction patch (broken fast path)
>  4.7 |  5.9  <- v7 of the bitflip correction patch (no fast path)
>  5.9 |  8.4  <- with this patch applied

Thanks for the new patch.

I suddenly thought of a new solution to this issue:
 [1] when a bitflip occurs, the BCH block tells us the page is
     uncorrectable;
 [2] if we catch an uncorrectable error, we can check the whole buffer
     and count the number of bitflips. Assume we get N bitflips;
 [3] if N < gf_len, we can treat this as an erased page, memset the
     whole buffer to 0xff, and tell the upper layer that this is a good
     empty page;
 [4] since [1] is very rare, I think this method is much faster than
     the current solution.

The patch is something like this:

---
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index e2f5820..575327d 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -958,6 +958,29 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
 	p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
 }

+static bool gpmi_erased_check(struct gpmi_nand_data *this,
+			unsigned char *data, unsigned int chunk)
+{
+	struct bch_geometry *geo = &this->bch_geometry;
+	int base = geo->ecc_chunk_size * chunk;
+	unsigned int flip_bits = 0;
+	unsigned threshold = geo->gf_len / 2;
+	int i;
+
+	/* Count bitflips */
+	for (i = 0; i < geo->ecc_chunk_size; i++)
+		flip_bits += hweight8(~data[base + i]);
+
+	if (threshold > geo->ecc_strength)
+		threshold = geo->ecc_strength;
+
+	if (flip_bits < threshold) {
+		memset(&data[base], 0xFF, geo->ecc_chunk_size);
+		return true;
+	}
+	return false;
+}
+
 static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 				uint8_t *buf, int oob_required, int page)
 {
@@ -1007,6 +1030,8 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			continue;

 		if (*status == STATUS_UNCORRECTABLE) {
+			if (gpmi_erased_check(this, payload_virt, i))
+				continue;
 			mtd->ecc_stats.failed++;
 			continue;
 		}
--

thanks
Huang Shijie
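
(To make the threshold concrete, assuming the geometry from the cover
letter: the BCH here uses 13-bit symbols, so gf_len is 13 and threshold
starts out as 13/2 = 6; since that does not exceed the ECC strength of 8
it stays 6, i.e. a chunk whose raw read contains fewer than 6 zero bits
would be memset to 0xff and reported as a good erased chunk.)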
* Re: [PATCH v0] mtd: gpmi: Use cached syndromes to speedup erased region bitflip detection.
From: Elie De Brauwer @ 2014-01-08 7:52 UTC
To: Huang Shijie
Cc: Huang Shijie, Brian Norris, David Woodhouse, linux-mtd@lists.infradead.org, Artem Bityutskiy

On Wed, Jan 8, 2014 at 6:38 AM, Huang Shijie <b32955@freescale.com> wrote:
[...]
> I suddenly thought of a new solution to this issue:
>  [1] when a bitflip occurs, the BCH block tells us the page is
>      uncorrectable;
>  [2] if we catch an uncorrectable error, we can check the whole buffer
>      and count the number of bitflips. Assume we get N bitflips;
>  [3] if N < gf_len, we can treat this as an erased page, memset the
>      whole buffer to 0xff, and tell the upper layer that this is a good
>      empty page;
>  [4] since [1] is very rare, I think this method is much faster than
>      the current solution.
>
> The patch is something like this:

What you suggest will obviously reach the maximum speed, and I've been
playing with a similar idea too, but I always bumped into the issue that
you cannot know whether a page actually is an erased page. With your
solution, for example, you cannot distinguish between:
- an erased page which suffers from bitflips;
- a genuinely uncorrectable page whose original contents are close to
  all 0xff's, with only a handful of bits set to 0.

The risk of this approach is that a page which the system should mark as
uncorrectable (and deal with accordingly) is instead returned as a valid
all-0xff page, which the system will then consider valid data.

I agree this may sound a bit theoretical, and the risk (an uncorrectable
page with very few 0 bits) is small, but I'm not able to judge whether
it can/should be neglected.

What do you think?

--
Elie De Brauwer
* Re: [PATCH v0] mtd: gpmi: Use cached syndromes to speedup erased region bitflip detection.
From: Bill Pringlemeir @ 2014-01-08 17:10 UTC
To: Elie De Brauwer; +Cc: Huang Shijie, linux-mtd

eliedebrauwer at gmail.com wrote:

> (Some background info: I expect these chips to be less than half used,
> hence plenty of erased blocks; also, the NAND timings are already
> optimized for these chips.)

That may not be true if the higher-layer file system does wear
levelling, especially dynamic wear levelling, where seldom-read inodes
are moved in order to spread the wear.

[...]

Elie De Brauwer wrote:

> What you suggest will obviously reach the maximum speed, and I've been
> playing with a similar idea too, but I always bumped into the issue
> that you cannot know whether a page actually is an erased page.
[...]
> I agree this may sound a bit theoretical, and the risk (an
> uncorrectable page with very few 0 bits) is small, but I'm not able to
> judge whether it can/should be neglected.
>
> What do you think?

I think you have a good rationale. However, there will always be errors
which can bypass the bitflip checking of any ECC (just like any hash has
some theoretical collision/conflict). The higher-layer file system
should still have some integrity checks, and the '-EUCLEAN' should have
been a clue that these pages/sectors were wearing previously. I.e., the
NAND device should fail gradually. I don't think it is fair to expect
the MTD drivers to perform on a flash that is so excessively used that
it has bitflips way beyond the maximum specified by the vendors.

Probably there is no completely right answer, just as you analyze. How
do you tell the difference between a mostly-0xff page with bitflips and
a bad hardware ECC of an erased page? It may be possible to re-read the
page with the hardware ECC turned off to validate that the whole page is
0xff (I have guessed you are comparing data with ECC applied?). But this
would again affect performance.

I think that we should have some reliability at the upper layers. How do
you combat power failure, etc.? The ECC should be able to correct
errors. This foists error detection onto another layer, where it is
highly likely to be caught. Running file systems on mtdblock, etc. might
like this, but those have other issues, don't they?

My 2 cents,
Bill Pringlemeir.
* Re: [PATCH v0] mtd: gpmi: Use cached syndromes to speedup erased region bitflip detection.
From: Huang Shijie @ 2014-01-09 2:00 UTC
To: Elie De Brauwer
Cc: Huang Shijie, Brian Norris, David Woodhouse, linux-mtd@lists.infradead.org, Artem Bityutskiy

On Wed, Jan 08, 2014 at 08:52:35AM +0100, Elie De Brauwer wrote:
[...]
> What you suggest will obviously reach the maximum speed, and I've been
> playing with a similar idea too, but I always bumped into the issue
> that you cannot know whether a page actually is an erased page. With
> your solution, for example, you cannot distinguish between:
> - an erased page which suffers from bitflips;
> - a genuinely uncorrectable page whose original contents are close to
>   all 0xff's, with only a handful of bits set to 0.

Actually, we do have a method to distinguish the two cases: if [3] is
met, we can read the page out again with ECC disabled and check the
bitflips a second time, such as:

 <1> if [3] is met, read out the page again without ECC enabled, and
     count the bitflips in this new page; assume we get N2;
 <2> if it is an erased page, N2 will be less than gf_len;
 <3> if it is a page full of 0xff (with only several bits set to 0),
     N2 will be greater than gf_len.

I will send a patch later, and you can test it.

thanks
Huang Shijie
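
A sketch of what steps <1>-<3> could look like in the driver, assuming a
hypothetical helper gpmi_read_raw_chunk() that re-reads one chunk with
ECC disabled, and reusing num_zero_bits() from the original patch (this
is an illustration of the idea, not the patch Huang later posted):

static bool gpmi_erased_recheck(struct gpmi_nand_data *this,
                                unsigned char *data, unsigned int chunk)
{
        struct bch_geometry *geo = &this->bch_geometry;
        int base = geo->ecc_chunk_size * chunk;
        unsigned int n2;

        /* <1> re-read the chunk with ECC disabled (hypothetical helper) */
        if (gpmi_read_raw_chunk(this, data, chunk))
                return false;

        /* count zero bits in the raw data */
        n2 = num_zero_bits(&data[base], geo->ecc_chunk_size);

        /* <2> erased chunk: only a few stray zero bits */
        if (n2 < geo->gf_len) {
                memset(&data[base], 0xFF, geo->ecc_chunk_size);
                return true;
        }

        /* <3> genuinely uncorrectable data: many more zero bits */
        return false;
}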