From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefan Roese Date: Wed, 4 May 2016 10:59:42 +0200 Subject: [U-Boot] [PATCH 1/2] mtd: cqspi: Simplify indirect write code In-Reply-To: <5728DD56.5040804@denx.de> References: <1461796606-9254-1-git-send-email-marex@denx.de> <57232AF6.9060903@denx.de> <572333C6.1060801@denx.de> <57277026.9010105@denx.de> <5728D77A.90906@denx.de> <5728D92E.4090504@denx.de> <5728DD56.5040804@denx.de> Message-ID: <5729B9FE.4000404@denx.de> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de On 03.05.2016 19:18, Marek Vasut wrote: > On 05/03/2016 07:00 PM, Stefan Roese wrote: >> On 03.05.2016 18:53, Marek Vasut wrote: >>> On 05/02/2016 05:20 PM, Stefan Roese wrote: >>>> On 29.04.2016 12:13, Marek Vasut wrote: >>>>>> On 28.04.2016 00:36, Marek Vasut wrote: >>>>>>> The indirect write code is buggy pile of nastiness which fails >>>>>>> horribly >>>>>>> when the system runs fast enough to saturate the controller. The >>>>>>> failure >>>>>>> results in some pages (256B) not being written to the flash. This >>>>>>> can be >>>>>>> observed on systems which run with Dcache enabled and L2 cache >>>>>>> enabled, >>>>>>> like the Altera SoCFPGA. >>>>>>> >>>>>>> This patch replaces the whole unmaintainable indirect write >>>>>>> implementation >>>>>>> with the one from upcoming Linux CQSPI driver, which went through >>>>>>> multiple >>>>>>> rounds of thorough review and testing. While this makes the patch >>>>>>> look >>>>>>> terrifying and violates all best-practices of software >>>>>>> development, all >>>>>>> the patch does is it plucks out duplicate ad-hoc code distributed >>>>>>> across >>>>>>> the driver and replaces it with more compact code doing exactly >>>>>>> the same >>>>>>> thing. >>>>>>> >>>>>>> Signed-off-by: Marek Vasut >>>>>>> Cc: Anatolij Gustschin >>>>>>> Cc: Chin Liang See >>>>>>> Cc: Dinh Nguyen >>>>>>> Cc: Jagan Teki >>>>>>> Cc: Pavel Machek >>>>>>> Cc: Stefan Roese >>>>>>> Cc: Vignesh R >>>>>> >>>>>> I've applied both patches and tested them on SR1500 (SPI-NOR used >>>>>> for booting and redundant environment). This is what I get upon >>>>>> "saveeenv": >>>>>> >>>>>> => saveenv >>>>>> Saving Environment to SPI Flash... >>>>>> SF: Detected N25Q128 with page size 256 Bytes, erase size 64 KiB, >>>>>> total 16 MiB >>>>>> Erasing SPI flash...Writing to SPI flash...data abort >>>>>> pc : [<3ff8368a>] lr : [<3ff8301b>] >>>>>> reloc pc : [<010216ca>] lr : [<0102105b>] >>>>>> sp : 3bf54eb8 ip : 3ff82f69 fp : 00000002 >>>>>> r10: 00000000 r9 : 3bf5dee8 r8 : ffff0000 >>>>>> r7 : 00000001 r6 : 3bf54f9b r5 : 00000001 r4 : 3bf5e520 >>>>>> r3 : 00000000 r2 : 3bf54f9b r1 : 00000001 r0 : ffa00000 >>>>>> Flags: nZCv IRQs off FIQs off Mode SVC_32 >>>>>> Resetting CPU ... >>>>>> >>>>>> resetting ... >>>>>> >>>>>> U-Boot SPL 2016.05-rc3-00009-ge1bf9b8 (Apr 29 2016 - 11:25:46) >>>>>> >>>>>> >>>>>> Any idea, what might be going wrong here? >>>>> >>>>> Does it work without the patch ? >>>> >>>> Yes, of course. I wouldn't have posted as a reply to this patch >>>> if this is not the root cause. >>> >>> *grumble* >> >> ? > > I just sensed slight USB subtext here ;-) > >>>> The board is using SPI NOR for env storage from the beginning. >>> >>> It only happens if you use redundant env in SPI NOR. >>> >>>> Where does your PC point to at the time >>>>> of the crash ,which function is it ? >>>> >>>> Its in cadence_qspi_apb_indirect_write_execute(). >>>> >>>> I debugged this issue a bit and found the following problem >>>> in cadence_qspi_apb_indirect_write_execute(): >>>> >>>> saveenv issues a 1-byte SPI write transfer with a non 4-byte >>>> aligned txbuf. This causes the data-abort here. Here my small >>>> patch that fixes the problem: >>> >>> Thanks, see below. >>> >>>> diff --git a/drivers/spi/cadence_qspi_apb.c >>>> b/drivers/spi/cadence_qspi_apb.c >>>> index ac47c6f..021a3e8 100644 >>>> --- a/drivers/spi/cadence_qspi_apb.c >>>> +++ b/drivers/spi/cadence_qspi_apb.c >>>> @@ -745,7 +745,15 @@ int >>>> cadence_qspi_apb_indirect_write_execute(struct cadence_spi_platdata >>>> *plat, >>>> >>>> while (remaining > 0) { >>>> write_bytes = remaining > page_size ? page_size : >>>> remaining; >>>> - writesl(plat->ahbbase, txbuf, >>>> DIV_ROUND_UP(write_bytes, 4)); >>>> + >>>> + /* >>>> + * Handle non 4-byte aligned access differently to avoid >>>> + * data-aborts >>>> + */ >>>> + if (((u32)txbuf % 4) || (write_bytes % 4)) >>>> + writesb(plat->ahbbase, txbuf, write_bytes); >>>> + else >>>> + writesl(plat->ahbbase, txbuf, write_bytes >> 2); >>>> >>>> ret = wait_for_bit("QSPI", plat->regbase + >>>> CQSPI_REG_SDRAMLEVEL, >>>> CQSPI_REG_SDRAMLEVEL_WR_MASK << >>>> >>>> >>>> Please fell free to use this patch as-is and squash it into >>>> your patches or enhance it while doing this. The read function >>>> is also missing this unaligned handling. >>> >>> Im afraid of the performance hit that we can suffer if we use byte-level >>> access for every unaligned buffer. >> >> This is why is wrote: "or enhance it while doing this". You might want >> to change the code to first write the (optionally) unaligned bytes, >> then the aligned bytes via writesl() and last the (optionally) unaligned >> bytes. >> >>> What do you think >>> about using a bounce-buffer instead ? >> >> I would prefer the simple solution I've drafted above. > > OK, I checked how many aligned and unaligned transfers happens and I > guess it's really not worth going through the bounce buffer for 1 byte > which only happens once during saveenv. > > I'll squash it and pick via socfpga tree, so we can get this in 2016.05 > and have working QSPI NOR support there. Good, thanks! >>>> And of course the Linux driver version as well. >>> >>> Does linux use unaligned buffers internally ? >> >> Frankly, I don't know for sure. But I suspect, that you can also >> see unaligned buffers (and sizes!!!) in Linux as well. And you >> can't just write a different amount of data to the SPI NOR, which >> happens when you use DIV_ROUND_UP on an unaligned size. > > You can write different amount of data into the FIFO, the amount which > is transferred is controlled by INDIRECTRDBYTES / INDIRECTWRBYTES register. Okay. But won't those additional (non-written) bytes not be stuck in the FIFO until the next transfer? And sent first before the new data that is written to the FIFO? But that's a mood discussion as its not a problem any more with the v2 of the patches AFAICT. Thanks, Stefan