From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932446Ab2CZNKg (ORCPT ); Mon, 26 Mar 2012 09:10:36 -0400 Received: from m50-133.163.com ([123.125.50.133]:59862 "EHLO m50-133.163.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932112Ab2CZNKe (ORCPT ); Mon, 26 Mar 2012 09:10:34 -0400 Date: Mon, 26 Mar 2012 21:10:03 +0800 From: Zhaoxiu Zeng To: linux-mtd@lists.infradead.org Cc: linux-kernel@vger.kernel.org Subject: Re: Re: [PATCH] mtd/nand/nand_ecc.c: replace bitsperbyte table by a simple operation Message-ID: <20120326131002.GA14318@Fedora16> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-CM-TRANSID: DdGowEDpR1SsanBPiG33Fw--.2413S2 X-Coremail-Antispam: 1Uf129KBjvJXoW3JFyfCw1DCw1fAw4xZF43KFg_yoWxJw17pr W5CFW2yr4rGw1xur4kCr4UJryDt3ykXa4UJr43XryfCwsxZF1qyr4DKrWFva47Wr95t34r Xrs5XF1fCFs5Z3JanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDUYxBIdaVFxhVjvjDU0xZFpf9x07UZFxUUUUUU= X-CM-SenderInfo: p2hqw6xkdr5xrx6rljoofrz/1tbiGQp9gE9oyKTe5AAAsf Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Dear Paul and Frans Thanks for your reply. I don't write a commit log because I have ugly eglish! :) For all words which has only one "1" bit, "!(tmp & (tmp - 1))" must be true, others except zero must be false! In this instance we just need to check whether the result has only one "1" bit, so "!(tmp & (tmp - 1))" is enough, and faster. Rest changes are adjuvant. > Dear Zhaoxiu > > Thanks for contributing this patch. > However, I think it has some issues and should not be applied. See my > remarks below > > 2012/3/24 Zhaoxiu Zeng 163.com>: > > > > Signed-off-by: Zhaoxiu Zeng 163.com> > > > > --- > > drivers/mtd/nand/nand_ecc.c | 50 +++++++++--------------------------------- > > 1 files changed, 11 insertions(+), 39 deletions(-) > > > > diff --git a/drivers/mtd/nand/nand_ecc.c b/drivers/mtd/nand/nand_ecc.c > > index b7cfe0d..e05a0fd 100644 > > --- a/drivers/mtd/nand/nand_ecc.c > > +++ b/drivers/mtd/nand/nand_ecc.c > > @@ -85,30 +85,6 @@ static const char invparity[256] = { > > }; > > > > /* > > - * bitsperbyte contains the number of bits per byte > > - * this is only used for testing and repairing parity > > - * (a precalculated value slightly improves performance) > > - */ > > -static const char bitsperbyte[256] = { > > - 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, > > - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > > - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > > - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > > - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > > - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > > - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > > - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > > - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > > - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > > - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > > - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > > - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > > - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > > - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > > - 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > > -}; > > - > > -/* > > * addressbits is a lookup table to filter out the bits from the xor-ed > > * ECC data that identify the faulty location. > > * this is only used for repairing parity > > @@ -448,12 +424,14 @@ int __nand_correct_data(unsigned char *buf, > > unsigned int byte_addr; > > /* 256 or 512 bytes/ecc */ > > const uint32_t eccsize_mult = eccsize >> 8; > > + const uint32_t check_mask = eccsize_mult == 2 ? 0x555555 : 0x545555; > > + uint32_t tmp; > > > > /* > > * b0 to b2 indicate which bit is faulty (if any) > > * we might need the xor result more than once, > > * so keep them in a local var > > - */ > > + */ > > #ifdef CONFIG_MTD_NAND_ECC_SMC > > b0 = read_ecc[0] ^ calc_ecc[0]; > > b1 = read_ecc[1] ^ calc_ecc[1]; > > @@ -467,14 +445,11 @@ int __nand_correct_data(unsigned char *buf, > > > > /* repeated if statements are slightly more efficient than switch ... */ > > /* ordered in order of likelihood */ > > - > > - if ((b0 | b1 | b2) == 0) > > + tmp = ((uint32_t)b2 << 16) | ((uint32_t)b1 << 8) | b0; > > Not sure how efficient this is; on some systems shifting by N takes N > clock cycles so this could be relatively expensive. > (and the systems where it takes more clock cycles have a higher chance > on not having hw crc correction). > > > + if (tmp == 0) > > return 0; /* no error */ > > > > - if ((((b0 ^ (b0 >> 1)) & 0x55) == 0x55) && > > - (((b1 ^ (b1 >> 1)) & 0x55) == 0x55) && > > - ((eccsize_mult == 1 && ((b2 ^ (b2 >> 1)) & 0x54) == 0x54) || > > - (eccsize_mult == 2 && ((b2 ^ (b2 >> 1)) & 0x55) == 0x55))) { > > + if (((tmp ^ (tmp >> 1)) & check_mask) == check_mask) { > > Have you benchmarked this? > > And while it might be an optimisation, it is not mentioned in the > commit message. > > > /* single bit error */ > > /* > > * rp17/rp15/13/11/9/7/5/3/1 indicate which byte is the faulty > > @@ -492,19 +467,16 @@ int __nand_correct_data(unsigned char *buf, > > * We could also do addressbits[b2] >> 1 but for the > > * performance it does not make any difference > > */ > > - if (eccsize_mult == 1) > > - byte_addr = (addressbits[b1] << 4) + addressbits[b0]; > > - else > > - byte_addr = (addressbits[b2 & 0x3] << 8) + > > - (addressbits[b1] << 4) + addressbits[b0]; > > + byte_addr = (addressbits[b1] << 4) + addressbits[b0]; > > + if (eccsize_mult == 2) > > + byte_addr += (addressbits[b2 & 0x3] << 8); > > I'm not sure if this is a win. It *looks* like an optimisation since > you factor out some common code. Then again within the if you need to > add to byte_addr. > For a naive compiler this is probably more expensive. For a good > compiler this probably makes no difference at all. > > > bit_addr = addressbits[b2 >> 2]; > > /* flip the bit */ > > buf[byte_addr] ^= (1 << bit_addr); > > return 1; > > - > > } > > - /* count nr of bits; use table lookup, faster than calculating it */ > > - if ((bitsperbyte[b0] + bitsperbyte[b1] + bitsperbyte[b2]) == 1) > > + > > + if (!(tmp & (tmp - 1))) > > This is not the same as the code you replace!!! > > take as example b0 = 1; b1 =0, b2 = 0; > The original if statement will return true: > bitsperbyte[b0] in that case is 1 > bitsperbyte[b1] = 0 > bitsperbyte[b2] = 0 > so (bitsperbyte[b0] + bitsperbyte[b1] + bitsperbyte[b2]) equals 1 and > ((bitsperbyte[b0] + bitsperbyte[b1] + bitsperbyte[b2]) == 1) yields 1 so true > > However if I take your code > tmp = ((uint32_t)b2 << 16) | ((uint32_t)b1 << 8) | b0; > taking the example values from above this results in tmp being 1; > tmp & (tmp - 1) becomes 1 & 0 which effectively results in 0. > > So in my original code b0 = 1; b1 =0, b2 = 0; results in 1 being > returned whereas your code will result in a log entry "incorrectable > error" and a return of -1. > You ignore the "!" opeator, Frans. > Given this, I'd say this patch is to be rejected. > > BTW: optimisations in this part of code are not too important. This is > only called when ecc errors are to be corrected and that is not very > likely. > So perhaps it is not worth the time to improve it. > (and yes, it might be possible to save some bytes by eliminating the > array; then again the code and logic is not really trivial so good > testing is needed) > > See also Documentation/mtd/nand_ecc.txt > near the end there is a small section on correcting. > > > return 1; /* error in ECC data; no action needed */ > > > > printk(KERN_ERR "uncorrectable error : "); > > -- > > 1.7.7.6 > > > > > > Best regards, Frans.