From mboxrd@z Thu Jan 1 00:00:00 1970
From: Albert ARIBAUD
Date: Thu, 27 Jun 2013 12:02:40 +0200
Subject: [U-Boot] [PATCH v3 1/2] Optimized nand_read_buf for kirkwood
In-Reply-To: <1372271126-2642-2-git-send-email-phil.sutter@viprinet.com>
References: <1361467316-29044-1-git-send-email-phil.sutter@viprinet.com>
 <1372271126-2642-1-git-send-email-phil.sutter@viprinet.com>
 <1372271126-2642-2-git-send-email-phil.sutter@viprinet.com>
Message-ID: <20130627120240.5875f2f5@lilith>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

Hi Phil,

On Wed, 26 Jun 2013 20:25:25 +0200, Phil Sutter wrote:

> The basic idea is taken from the linux-kernel, but further optimized.
>
> First align the buffer to 8 bytes, then use ldrd/strd to read and store
> in 8 byte quantities, then do the final bytes.
>
> Tested using: 'date ; nand read.raw 0xE00000 0x0 0x10000 ; date'.
> Without this patch, NAND read of 132MB took 49s (~2.69MB/s). With this
> patch in place, reading the same amount of data was done in 27s
> (~4.89MB/s). So read performance is increased by ~80%!
>
> Signed-off-by: Nico Erfurth
> Tested-by: Phil Sutter
> Cc: Prafulla Wadaskar
> ---

Patch history missing.

>  drivers/mtd/nand/kirkwood_nand.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/drivers/mtd/nand/kirkwood_nand.c b/drivers/mtd/nand/kirkwood_nand.c
> index 0a99a10..85ea5d2 100644
> --- a/drivers/mtd/nand/kirkwood_nand.c
> +++ b/drivers/mtd/nand/kirkwood_nand.c
> @@ -38,6 +38,37 @@ struct kwnandf_registers {
>  static struct kwnandf_registers *nf_reg =
>  	(struct kwnandf_registers *)KW_NANDF_BASE;
>
> +
> +/*
> + * The basic idea is stolen from the linux kernel, but the inner loop is
> + * optimized a bit more.
> + */
> +static void kw_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
> +{
> +	struct nand_chip *chip = mtd->priv;
> +
> +	while (len && (unsigned long)buf & 7) {
> +		*buf++ = readb(chip->IO_ADDR_R);
> +		len--;
> +	};
> +
> +	/* This loop reads and writes 64bit per round. */
> +	asm volatile (
> +		"1:\n"
> +		" subs %0, #8\n"
> +		" ldrpld r2, [%2]\n"
> +		" strpld r2, [%1], #8\n"
> +		" bhi 1b\n"
> +		" addne %0, #8\n"
> +		: "+&r" (len), "+&r" (buf)
> +		: "r" (chip->IO_ADDR_R)
> +		: "r2", "r3", "memory", "cc"
> +	);

Are assembler instructions *really* required? IOW, can you not get
enough performance simply with a cleverly written C loop?

> +	while (len--)
> +		*buf++ = readb(chip->IO_ADDR_R);
> +}
> +
>  /*
>   * hardware specific access to control-lines/bits
>   */
> @@ -80,6 +111,7 @@ int board_nand_init(struct nand_chip *nand)
>  	nand->ecc.mode = NAND_ECC_SOFT;
>  #endif
>  	nand->cmd_ctrl = kw_nand_hwcontrol;
> +	nand->read_buf = kw_nand_read_buf;
>  	nand->chip_delay = 40;
>  	nand->select_chip = kw_nand_select_chip;
>  	return 0;

Amicalement,
-- 
Albert.
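
For reference, the "cleverly written C loop" asked about above might look
roughly like the sketch below. It follows the same three steps as the
quoted patch (byte reads until buf is 8-byte aligned, 64-bit transfers,
then the trailing bytes). This is only an illustration under stated
assumptions: the name nand_read_buf_c and its io parameter are
hypothetical, io stands in for chip->IO_ADDR_R, and plain volatile
accesses replace readb() so that the snippet compiles on its own.
Whether a given compiler actually turns the 64-bit access into
ldrd/strd is not guaranteed and would have to be checked in the
generated code and measured on the board.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical C-only variant of the quoted kw_nand_read_buf().
 * Assumption: every read from io returns the next data from the NAND
 * controller, for byte-wide and for 64-bit wide accesses alike.
 */
static void nand_read_buf_c(volatile const void *io, uint8_t *buf, int len)
{
	volatile const uint8_t *io8 = io;
	volatile const uint64_t *io64 = io;

	assert(io != NULL && buf != NULL);

	/* Byte reads until buf sits on an 8-byte boundary. */
	while (len > 0 && ((uintptr_t)buf & 7)) {
		*buf++ = *io8;
		len--;
	}

	/* One 64-bit read and one 64-bit store per round. */
	while (len >= 8) {
		uint64_t v = *io64;

		/* memcpy avoids aliasing trouble; buf is aligned here. */
		memcpy(buf, &v, sizeof(v));
		buf += 8;
		len -= 8;
	}

	/* Remaining 0..7 bytes. */
	while (len-- > 0)
		*buf++ = *io8;
}
```

If the compiler output for the middle loop does not use ldrd/strd, that
would be one argument for keeping the inline assembly, which is what the
timing comparison would need to show.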