[U-Boot] [PATCH v2 3/4] mtd: nand: add Freescale NFC driver

public inbox for u-boot@lists.denx.de
 help / color / mirror / Atom feed

From: Bill Pringlemeir <bpringlemeir@nbsps.com>
To: u-boot@lists.denx.de
Subject: [U-Boot] [PATCH v2 3/4] mtd: nand: add Freescale NFC driver
Date: Mon, 18 Aug 2014 12:38:43 -0400	[thread overview]
Message-ID: <87a971ah2k.fsf@nbsps.com> (raw)
In-Reply-To: 007acb722088ae239042316042f7dba7@agner.ch

On 18 Aug 2014, stefan at agner.ch wrote:

> Am 2014-08-14 23:12, schrieb Bill Pringlemeir:
>>> On 14 Aug 2014, stefan at agner.ch wrote:
>>>
>>> This adds initial support for Freescale NFC (NAND Flash Controller)
>>> found in ARM Vybrid SoC's, Power Architecture MPC5125 and others.
>>> However, this driver is only tested on Vybrid.
>>
>> This is only to expand on the nand controller register and SRAM use.
>>
>> [snip]
>>
>>> diff --git a/drivers/mtd/nand/vf610_nfc.c
>>> b/drivers/mtd/nand/vf610_nfc.c new file mode 100644 index
>>> 0000000..3150ac1 --- /dev/null +++ b/drivers/mtd/nand/vf610_nfc.c
>>
>> [snip]
>>
>>> +static inline u32 vf610_nfc_read(struct mtd_info *mtd, uint reg) +{

>> Ok, we always use readl/writel.  This is fine, but a little slower
>> and bigger.  I may try a register cache if I resubmit to the Linux
>> MTD as per Scott's suggestion.  Especially, this version is good for
>> an incremental patch.

> I measured the difference and get 1MB/s
> Full pages, readl/writel:
> NAND read: device 0 offset 0x200000, size 0x800000
> 8388608 bytes read in 772 ms (10.4 MiB/s): OK

> Full pages, __raw_readl/__raw_writel
> NAND read: device 0 offset 0x200000, size 0x800000
> 8388608 bytes read in 696 ms (11.5 MiB/s): OK

> Ok, this is actually quite a lot. Especially since I already optimized
> the C code (by not using the helper functions like nfc_set/nfc_clear
> in vf610_nfc_send_command), one would think there is now almost no
> optimization potential. I looked into the disassembled code and could
> narrow down the issue. Due to the memory barriers, all offsets were
> calculated on each register access (nfc base to reg base, and add reg
> offset), multiple instances of:

>  20:	e59cc120 	ldr	ip, [ip, #288]	; 0x120
>  24:	e59cc134 	ldr	ip, [ip, #308]	; 0x134

> I optimized the code again and calculate the offsets manually and
> access __raw_readl/__raw_writel rather then vf610_nfc_read/write in
> the vf610_nfc_send_command(s) function, I get the full speed again:

> NAND read: device 0 offset 0x200000, size 0x800000
> 8388608 bytes read in 687 ms (11.6 MiB/s): OK

I think what you have is fine.  The 10BM/s versus 11MB/s is not
insignificant, but it is not a huge difference.  I expect that the
driver is already better than the others; especially the Imx-25 was only
7.7MB/s read and 4.6Mb/s write from Linux mtd tests.  Although the end
user might like to wait 10% less for an image to load.

Also, probably a lot of people care about code size.  This is not so
much a case for U-Boot as it is usually machine specific and doesn't
support several SOCs afaik.

However, the more specific you make the optimization to the platform,
the less likely it is to extend well.  We also wish to have this work
well with different gcc versions and CPUs (PowerPC, etc).  The
'readl/writel' handicap the compiler.  Although they are more likely to
work with a wide variety of buses.

[snip]

> On Wed, 2014-08-13 at 22:32, Scott Wood wrote:
>> raw_writel() is itself something that should only be used for
>> hand-optimized sections.  For non-performance-critical code you
>> should use normal writel() so that you don't need to worry about
>> manually adding I/O barriers.
>
> The reason I choosed readl/writel instead of the raw variants is to
> preserve align with other drivers...

>> care.  Maybe a comment is fine?  It seems the Vybrid is safe for
>> different access sizes, but it is possible that some other CPU might
>> not be able to access this memory via 32/16/8 bit accesses and
>> 'memcpy()' may not be appropriate.  It seems that 'natural' size of
>> the NFC controller itself is 32bits and the CPU interface does lane
>> masking.  Ie, boot mode documentation talks about remapping
>> 'sram_physical_addr[13:3] = {cpu_addr[11:3],cpu_addr[13:12]}' saying
>> that bits 2,1 are not used (hopefully one based numbers).  This is
>> just my guess...

> What assumptions do you make how memcpy accesses memory? This latest
> patch now uses the optimized versions from the kernel... Maybe they
> even try to access 64-bit width (the NIC interconnect supports 64-bit
> access)

The memcpy() itself could use anything. 64bits is possible on AXI/NIC.
The 'PBRIDGE' is 64bit, but I think the AIPS/IPS (apparently AIPS means
'AHB-lite to IPS) are 32bit.  At least that is the case on the Imx25
which has a different AIPS version.  I assumed the 'memcpy()' was using
32bits but this certainly isn't explicit in the code.

The majority of the register banks are non-volatile with this
controller.  Instead of running multiple NAND programming sequences, the
controller runs them all for us.  Most registers are are mainly like
SRAM.

My only point is that the SRAM buffers use the same interface as the
main Nand controller register banks.  So using 'readl/writel' for the
register, but not the SRAM buffers seems inconsistent.

So to address this inconsistency, I was thinking that we should at least
have a comment?

 /* BUS: This assumes the BUS is 32 bit accessible.  If you are porting
    to other systems, this may not be the case.
  */
 memcpy(nfc->regs + NFC_MAIN_AREA(0) + c, buf, l);

Or we could implement our own version of memcpy that did 32bit aligned
transfers with a similar comment.  In theory, we need a barrier after
the memcpy(), in case anyone modified the code to touch the
NFC_FLASH_CMD2's START_BIT directly after the memcpy() or custom
function.  But all the paranoia adds some code and has potential to slow
things down.

Doing a barrier after every single byte read as the ARM Linux's
memcpy_fromio() does will surely make significant performance
differences.  Instead of being double the Imx25, it was half.  A user
waiting 400% longer won't be too happy.

Fwiw,
Bill Pringlemeir.

next prev parent reply	other threads:[~2014-08-18 16:38 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-14 16:30 [U-Boot] [PATCH v2 0/4] arm: vf610: add NAND flash support Stefan Agner
2014-08-14 16:30 ` [U-Boot] [PATCH v2 1/4] arm: vf610: add NFC pin mux Stefan Agner
2014-08-14 16:30 ` [U-Boot] [PATCH v2 2/4] arm: vf610: add NFC clock support Stefan Agner
2014-08-14 16:30 ` [U-Boot] [PATCH v2 3/4] mtd: nand: add Freescale NFC driver Stefan Agner
2014-08-14 18:34   ` Bill Pringlemeir
2014-08-14 19:49   ` Scott Wood
2014-08-14 21:12   ` Bill Pringlemeir
2014-08-18  9:41     ` Stefan Agner
2014-08-18 16:38       ` Bill Pringlemeir [this message]
2014-08-19 17:00         ` Stefan Agner
2014-08-21 21:15           ` [U-Boot] [PATCH v3 3/4] mtd: nand: add Freescale vf610_nfc driver Bill Pringlemeir
2014-09-11  9:37             ` Stefano Babic
2014-08-14 16:30 ` [U-Boot] [PATCH v2 4/4] arm: vf610: add NAND support for vf610twr Stefan Agner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a971ah2k.fsf@nbsps.com \
    --to=bpringlemeir@nbsps.com \
    --cc=u-boot@lists.denx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox