public inbox for linux-mtd@lists.infradead.org
* GPMI-NAND Status?
@ 2011-08-05 13:51 Wolfram Sang
  2011-08-08  6:21 ` Huang Shijie
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-05 13:51 UTC (permalink / raw)
  To: linux-mtd
  Cc: Huang Shijie, Koen Beel, Shawn Guo, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 3480 bytes --]

Hi,

I am a bit uncertain about the current state of the GPMI-NAND driver, so
I'll try to sum it up here. There is without doubt interest in getting the
driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
wonder if we can join forces more effectively. First of all, I want to thank
Huang Shijie for all his work so far, which was already quite some effort; this
summary is by no means meant as bashing, I am just trying to understand the
status quo. (Sidenote: I am more or less on holiday until Monday, so no time
for real debugging myself. I am writing this mail so we hopefully gain a common
understanding. When I am back to full strength, I can then start working on
what seems appropriate.)

Issues with the current driver I am aware of:

DMA timeouts [1]
================

[    2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
[    3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!

Always reproducible by me when trying to format mtd0. Sometimes (always?) seen
by Koen during boot (on read?). Never seen by Huang? It is currently unclear
whether the bug is in the GPMI driver or in the MXS-DMA driver. Still, I'd say
the issue is a show-stopper. We can't put a driver into mainline which leads to
the above failure. The fact that there is _some_ configuration which works for
someone does not help; it doesn't work for Koen and me at least. We need
reliable drivers in mainline, so the issue needs to be resolved, regardless
of where the bug resides.


problem overwriting all-0xff data in NAND [2]
=============================================

Although it occurred only when writing JFFS2 images so far, this is a generic
issue and needs to be fixed, right?
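
To illustrate the kind of check this involves (a minimal user-space sketch
only; the helper name is made up and nothing here is taken from the driver or
from JFFS2), whichever layer ends up handling this would need to recognise a
buffer that consists purely of 0xff bytes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: returns true if every byte
 * of buf[0..len) is 0xff. An empty buffer counts as "all 0xff".
 */
bool buf_is_all_ff(const uint8_t *buf, size_t len)
{
	size_t i;

	/* A NULL pointer is only acceptable for a zero-length buffer. */
	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++)
		if (buf[i] != 0xff)
			return false;

	return true;
}
```

Whether such a check belongs in the driver or in the layer above it is exactly
the open question.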


ecclayout needs to be used to show that OOB is fully in use [1]
===============================================================

Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
working with UBIFS is surely not ready for mainline.


Peculiarities
=============

There are a few issues which are odd. I don't know if some are mainly intended
for debugging, yet they shouldn't be in a mainline driver. At least:

* custom sysfs-entries
* custom kernel command line parameters
* namespacing (some functions have no prefix, some have "mil_", some have mx23)
  (I think 'mil' means 'mtd interface layer', but why is that needed?)

Complexity
==========

The driver is not easy to review. I wonder if it makes sense to use incremental
patches for it? Maybe making it a staging driver could be a solution for that?
Huang, are you interested in accepting patches or do you prefer we just point
at certain code and you then fix it? Starting with a simpler driver and then
adding stuff might be another option if we can't chase all the bugs in the
current driver.

That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
meet in IRC or something to work that out? Is there interest in that?

Ok, those were my two cents. Your mileage may vary, so please share your
thoughts. I mainly don't want the driver development to get stalled.

Regards,

   Wolfram

[1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037200.html
[2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: GPMI-NAND Status?
  2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
@ 2011-08-08  6:21 ` Huang Shijie
  2011-08-08  9:19   ` Koen Beel
  2011-08-09  9:35   ` Wolfram Sang
  2011-08-08  9:12 ` Huang Shijie
  2011-08-14  8:11 ` Ivan Djelic
  2 siblings, 2 replies; 33+ messages in thread
From: Huang Shijie @ 2011-08-08  6:21 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: Koen Beel, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

Hi Wolfram:
> Hi,
>
> I am a bit uncertain how the state of the GPMI-NAND driver currently is, so
> I'll try to sum it up here. There is without doubt interest in getting the
> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
> wonder if we can join forces more effectively. First of all, I want to thank
> Huang Shijie for all his work so far which was already quite some effort; this
> sum-up is by no means meant as bashing, just trying to understand the status
> quo (Sidenote: I am more or less on holiday until Monday, so no time for real
> debugging myself. I write this mail so we hopefully gain a common
> understanding. When I am back to full strength, I can then start working on
> what seems apropriate)
>
> Issues with the current driver I am aware of:
>
> DMA timeouts [1]
> ================
>
> [    2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
> [    3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>
> Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
> by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
After I switched to a different .config, it no longer appears on my side.

> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say the
> issue is a show-stopper. We can't put a driver into mainline which leads to the
> above failure. The fact that there is _some_ configuration which works for
> someone does not help, it doesn't work for Koen and me at least. We need
Hi Koen, did you test my uImage?
Does the timeout occur?
> reliable drivers in mainline, so the issue needs to be resolved, regardless
> where the bug resides.
OK, I will debug it too.


Please test the driver again when you are back in the office.
Pay attention to your version of /arch/arm/configs/mxs_defconfig.
Your mxs_defconfig may be missing Shawn Guo's patches.

thanks.


>
> problem overwriting all-0xff data in NAND [2]
> =============================================
>
> Although it occured only when writing JFFS2 images so far, this is a generic
> issue and needs to be fixed, right?
>
Artem said the change should not go into the driver, but into the upper
layer (JFFS2).

So I think I do not need to change the driver.
> ecclayout needs to be used to show that OOB is fully in use [1]
> ===============================================================
>
> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> working with UBIFS is surely not ready for mainline.
>
I have been programming for the mx6q in recent days, so I have had no time to
fix it. The mx6q runs well now.

So I will fix this issue in the coming days.

> Pecularities
> ============
>
> There are a few issues which are odd. I don't know if some are mainly intended
> for debugging, yet they shouldn't be in a mainline driver. At least:
>
> * custom sysfs-entries
My sysfs entries are in the GPMI-NAND directory.
Does being a mainline driver mean I should not have any sysfs entries?
If so, I can remove them.

> * custom kernel command line parameters
The kernel command line parameter 'gpmi_nand' is there to avoid conflicts with
other modules such as SD.

If it is removed, I have to use a different config to resolve the issue,
which is not better either. :(

> * namespacing (some functions have no prefix, some have "mil_", some have mx23)
>    (I think 'mil' means 'mtd interface layer', but why is that needed?)
The mil is used to keep gpmi_nand_data{} simple.
Without it, gpmi_nand_data{} would be very big.

The functions which have the mx23 prefix are only used on mx23.
The functions which have no prefix can be used on both mx28 and mx23.

> Complexity
> ==========
>
> The driver is not easy to review. I wonder if it makes sense to use incremental
> patches for it? maybe making it a staging driver could be a solution for that?
Frankly speaking, the current driver is probably already the smallest version.

I have not even added the on-chip BBT feature yet.
> Huang, are you interested in accepting patches or do you prefer we just point
> at certain code and you then fix it? Starting with a simpler driver and then
Feel free to mail me patches. They are welcome.


> adding stuff might be another option if we can't chase all the bugs in the
> current driver.
>
> That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
> meet in IRC or something to work that out? Is there interest in that?
What about gtalk?

Best Regards
Huang Shijie

> Ok, those were my two cents. Your mileages may vary, please give your thoughts,
> then. I mainly don't want the driver development to get stalled.
>
> Regards,
>
>     Wolfram
>
> [1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037200.html
> [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html


* Re: GPMI-NAND Status?
  2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
  2011-08-08  6:21 ` Huang Shijie
@ 2011-08-08  9:12 ` Huang Shijie
  2011-08-09  9:19   ` Wolfram Sang
  2011-08-14  8:11 ` Ivan Djelic
  2 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-08  9:12 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: Koen Beel, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

Hi Wolfram:
>
> ecclayout needs to be used to show that OOB is fully in use [1]
> ===============================================================
>
> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> working with UBIFS is surely not ready for mainline.
>
It seems that just modifying the ecclayout of GPMI-NAND cannot fix the problem.
The code of JFFS2 and mtd would also need to change.

Someone once posted a patch about this:
http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html

Best Regards
Huang Shijie


* Re: GPMI-NAND Status?
  2011-08-08  6:21 ` Huang Shijie
@ 2011-08-08  9:19   ` Koen Beel
  2011-08-08 10:37     ` Huang Shijie
                       ` (2 more replies)
  2011-08-09  9:35   ` Wolfram Sang
  1 sibling, 3 replies; 33+ messages in thread
From: Koen Beel @ 2011-08-08  9:19 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Shawn Guo, linux-mtd, Wolfram Sang, linux-arm-kernel,
	Lothar Waßmann

Hi Wolfram,

Thanks for taking the initiative to summarize the current status.
Also thanks to Huang Shijie for all the work done so far.


On Mon, Aug 8, 2011 at 8:21 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Wolfram:
>>
>> Hi,
>>
>> I am a bit uncertain how the state of the GPMI-NAND driver currently is,
>> so
>> I'll try to sum it up here. There is without doubt interest in getting the
>> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
>> wonder if we can join forces more effectively. First of all, I want to
>> thank
>> Huang Shijie for all his work so far which was already quite some effort;
>> this
>> sum-up is by no means meant as bashing, just trying to understand the
>> status
>> quo (Sidenote: I am more or less on holiday until Monday, so no time for
>> real
>> debugging myself. I write this mail so we hopefully gain a common
>> understanding. When I am back to full strength, I can then start working
>> on
>> what seems apropriate)
>>
>> Issues with the current driver I am aware of:
>>
>> DMA timeouts [1]
>> ================
>>
>> [    2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA
>> :1
>> [    3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>
>> Always reproducible by me when trying to format mtd0. Sometimes(always?)
>> seen
>> by Koen during boot (on read?). Never seen by Huang? It is currently
>> unclear if
>
> After I used a different .config, it never appears in my side.

flash_eraseall of mtd1 works for me.
ubi_format of mtd1 always gives the dma timeout
reading/writing of mtd0/1 always gives the dma timeout
I have seen the dma timeout during boot if I try to enable ubi rootfs (so
that's the same issue as the dma timeout during read/write).

I don't use mtd0 for testing as this contains my uboot.

I tested using Huang's .config and the Linaro git but still see
exactly the same issue.

>
>> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say
>> the
>> issue is a show-stopper. We can't put a driver into mainline which leads
>> to the
>> above failure. The fact that there is _some_ configuration which works for
>> someone does not help, it doesn't work for Koen and me at least. We need

On my target, the mxs-dma is working for sdio until the gpmi-nand
gives a timeout. After that the dma for sdio is *not fully* working
anymore.

>
> Hi Koen, do you test my uImage?
> Does the timeout occur?

I was not able to test your uImage. It ended with a "Kernel panic - not
syncing: read error". See (off list) mail from last week.


>>
>> reliable drivers in mainline, so the issue needs to be resolved,
>> regardless
>> where the bug resides.
>
> ok. I will debug it too.
>
>
> Please test the driver again when you back to office.
> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
> Your mxs_defconfig may miss Shawn Guo's patches.
>
> thanks.
>
>
>>
>> problem overwriting all-0xff data in NAND [2]
>> =============================================
>>
>> Although it occured only when writing JFFS2 images so far, this is a
>> generic
>> issue and needs to be fixed, right?
>>
> Artem said it should not change the driver, but the upper layer(jffs2).
>
> So I think i do not need to change the driver.
>>
>> ecclayout needs to be used to show that OOB is fully in use [1]
>> ===============================================================
>>
>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver
>> only
>> working with UBIFS is surely not ready for mainline.
>>
> I programmed for mx6q in the recent days. I have no time to fix it. The mx6q
> can runs well now.
>
> So I will fix the issue in the following days.
>
>> Pecularities
>> ============
>>
>> There are a few issues which are odd. I don't know if some are mainly
>> intended
>> for debugging, yet they shouldn't be in a mainline driver. At least:
>>
>> * custom sysfs-entries
>
> My sysfs-entries is in the GPMI-NAND directory.
> Does be a mainline driver means I should not have any sysfs-entries?
> If it does, i can remove it.
>
>> * custom kernel command line parameters
>
> The kernel command line 'gpmi_nand' is to avoid the conflict with other
> modules such as
> SD.
>
> If it's be removed, I have to use different config to resolve the issue
> which is not better either. :(
>
>> * namespacing (some functions have no prefix, some have "mil_", some have
>> mx23)
>>   (I think 'mil' means 'mtd interface layer', but why is that needed?)
>
> The mil is used to make the gpmi_nand_data{} simple.
> Without it, the gpmi_nand_data{} will very big.
>
> The functions which have mx23 prefix are only used in mx23.
> The functions which have no prefix can used in both mx28 and mx23.
>
>> Complexity
>> ==========
>>
>> The driver is not easy to review. I wonder if it makes sense to use
>> incremental
>> patches for it? maybe making it a staging driver could be a solution for
>> that?
>
> Frankly speaking, the current driver is maybe the smallest version now.
>
> I even do not add the on-chip BBT feature now.
>>
>> Huang, are you interested in accepting patches or do you prefer we just
>> point
>> at certain code and you then fix it? Starting with a simpler driver and
>> then
>
> Feel free to mail me the patch. it's welcome.
>
>
>> adding stuff might be another option if we can't chase all the bugs in the
>> current driver.
>>
>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we
>> can
>> meet in IRC or something to work that out? Is there interest in that?
>
> What about gtalk?

Anything is good for me.
Could also be useful to make sure we test on the same HW as much as
possible and are using the same source tree.
HW I have:
- mx23evk rev C1
- mx23evk rev B2
- own target hw using mx23 lqfp-128 chip and different type of ddr and nand.

>
> Best Regards
> Huang Shijie
>
>> Ok, those were my two cents. Your mileages may vary, please give your
>> thoughts,
>> then. I mainly don't want the driver development to get stalled.
+1

Br,
Koen

>>
>> Regards,
>>
>>    Wolfram
>>
>> [1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037200.html
>> [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>


* Re: GPMI-NAND Status?
  2011-08-08  9:19   ` Koen Beel
@ 2011-08-08 10:37     ` Huang Shijie
  2011-08-08 12:42       ` Koen Beel
  2011-08-09  5:11     ` Huang Shijie
  2011-08-09  9:45     ` Wolfram Sang
  2 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-08 10:37 UTC (permalink / raw)
  To: Koen Beel
  Cc: Wolfram Sang, linux-mtd, Shawn Guo, shijie8, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 6226 bytes --]

Hi,
> Hi Wolfram,
>
> Thanks for taking the initiative to summarize the current status.
> Also thanks to Huang Shijie for all the work done so far.
>
>
> On Mon, Aug 8, 2011 at 8:21 AM, Huang Shijie<b32955@freescale.com>  wrote:
>> Hi Wolfram:
>>> Hi,
>>>
>>> I am a bit uncertain how the state of the GPMI-NAND driver currently is,
>>> so
>>> I'll try to sum it up here. There is without doubt interest in getting the
>>> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
>>> wonder if we can join forces more effectively. First of all, I want to
>>> thank
>>> Huang Shijie for all his work so far which was already quite some effort;
>>> this
>>> sum-up is by no means meant as bashing, just trying to understand the
>>> status
>>> quo (Sidenote: I am more or less on holiday until Monday, so no time for
>>> real
>>> debugging myself. I write this mail so we hopefully gain a common
>>> understanding. When I am back to full strength, I can then start working
>>> on
>>> what seems apropriate)
>>>
>>> Issues with the current driver I am aware of:
>>>
>>> DMA timeouts [1]
>>> ================
>>>
>>> [    2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA
>>> :1
>>> [    3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>>
>>> Always reproducible by me when trying to format mtd0. Sometimes(always?)
>>> seen
>>> by Koen during boot (on read?). Never seen by Huang? It is currently
>>> unclear if
>> After I used a different .config, it never appears in my side.
> flash_eraseall of mtd1 works for me.
> ubi_format of mtd1 always gives the dma timeout
> reading/writing of mtd0/1 always gives the dma timeout
> I have seen dma timeout during boot if i try to enable ubi rootfs (so
> that's the same issue as dma time during read/write).
>
> I don't use mtd0 for testing as this contains my uboot.
>
> I tested using Huang's .config and the Linaro git but still see
> exactly the same issue.
>
strange.
>>> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say
>>> the
>>> issue is a show-stopper. We can't put a driver into mainline which leads
>>> to the
>>> above failure. The fact that there is _some_ configuration which works for
>>> someone does not help, it doesn't work for Koen and me at least. We need
> On my target, the mxs-dma is working for sdio until the gpmi-nand
> gives a timeout. After that the dma for sdio is *not fully* working
> anymore.
>
We need more logging of the following:
[1] apbh-dma registers
[2] clk registers
[3] gpmi registers

Please git-apply the attached patch.
It will print out more DMA information when a DMA timeout occurs.
>> Hi Koen, do you test my uImage?
>> Does the timeout occur?
> I was not able to test you uImage. It ended with a "Kernel panic - not
> syncing: read error". See (off list) mail from last week.
>
ok.
>>> reliable drivers in mainline, so the issue needs to be resolved,
>>> regardless
>>> where the bug resides.
>> ok. I will debug it too.
>>
>>
>> Please test the driver again when you back to office.
>> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
>> Your mxs_defconfig may miss Shawn Guo's patches.
>>
>> thanks.
>>
>>
>>> problem overwriting all-0xff data in NAND [2]
>>> =============================================
>>>
>>> Although it occured only when writing JFFS2 images so far, this is a
>>> generic
>>> issue and needs to be fixed, right?
>>>
>> Artem said it should not change the driver, but the upper layer(jffs2).
>>
>> So I think i do not need to change the driver.
>>> ecclayout needs to be used to show that OOB is fully in use [1]
>>> ===============================================================
>>>
>>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver
>>> only
>>> working with UBIFS is surely not ready for mainline.
>>>
>> I programmed for mx6q in the recent days. I have no time to fix it. The mx6q
>> can runs well now.
>>
>> So I will fix the issue in the following days.
>>
>>> Pecularities
>>> ============
>>>
>>> There are a few issues which are odd. I don't know if some are mainly
>>> intended
>>> for debugging, yet they shouldn't be in a mainline driver. At least:
>>>
>>> * custom sysfs-entries
>> My sysfs-entries is in the GPMI-NAND directory.
>> Does be a mainline driver means I should not have any sysfs-entries?
>> If it does, i can remove it.
>>
>>> * custom kernel command line parameters
>> The kernel command line 'gpmi_nand' is to avoid the conflict with other
>> modules such as
>> SD.
>>
>> If it's be removed, I have to use different config to resolve the issue
>> which is not better either. :(
>>
>>> * namespacing (some functions have no prefix, some have "mil_", some have
>>> mx23)
>>>    (I think 'mil' means 'mtd interface layer', but why is that needed?)
>> The mil is used to make the gpmi_nand_data{} simple.
>> Without it, the gpmi_nand_data{} will very big.
>>
>> The functions which have mx23 prefix are only used in mx23.
>> The functions which have no prefix can used in both mx28 and mx23.
>>
>>> Complexity
>>> ==========
>>>
>>> The driver is not easy to review. I wonder if it makes sense to use
>>> incremental
>>> patches for it? maybe making it a staging driver could be a solution for
>>> that?
>> Frankly speaking, the current driver is maybe the smallest version now.
>>
>> I even do not add the on-chip BBT feature now.
>>> Huang, are you interested in accepting patches or do you prefer we just
>>> point
>>> at certain code and you then fix it? Starting with a simpler driver and
>>> then
>> Feel free to mail me the patch. it's welcome.
>>
>>
>>> adding stuff might be another option if we can't chase all the bugs in the
>>> current driver.
>>>
>>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we
>>> can
>>> meet in IRC or something to work that out? Is there interest in that?
>> What about gtalk?
> Anything is good for me.
> Could also be useful to make sure we test on the same HW as much as
> possible and are using the same source tree.
> HW I have:
> - mx23evk rev C1
> - mx23evk rev B2
> - own target hw using mx23 lqfp-128 chip and different type of ddr and nand.
>
I have mx23evk rev C.

Best Regards
Huang Shijie

[-- Attachment #2: 0001-print_more_log.patch --]
[-- Type: text/x-patch, Size: 3507 bytes --]

From 69b5bf4d3bf73a89b521a7c592f5bea1d66c2755 Mon Sep 17 00:00:00 2001
From: Huang Shijie <b32955@freescale.com>
Date: Mon, 8 Aug 2011 18:39:11 +0800
Subject: [PATCH] print_more_log

Print out the DMA registers when a timeout occurs.

Signed-off-by: Huang Shijie <b32955@freescale.com>
---
 drivers/dma/mxs-dma.c                  |   37 +++++++++++++++++++++++++++++++-
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c |    2 +
 2 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/drivers/dma/mxs-dma.c b/drivers/dma/mxs-dma.c
index 88aad4f..755cbfc 100644
--- a/drivers/dma/mxs-dma.c
+++ b/drivers/dma/mxs-dma.c
@@ -130,6 +130,7 @@ struct mxs_dma_engine {
 	struct mxs_dma_chan		mxs_chans[MXS_DMA_CHANNELS];
 };
 
+struct mxs_dma_chan *g_mxs_chan;
 static void mxs_dma_reset_chan(struct mxs_dma_chan *mxs_chan)
 {
 	struct mxs_dma_engine *mxs_dma = mxs_chan->mxs_dma;
@@ -239,6 +240,7 @@ static dma_cookie_t mxs_dma_tx_submit(struct dma_async_tx_descriptor *tx)
 	struct mxs_dma_chan *mxs_chan = to_mxs_dma_chan(tx->chan);
 
 	mxs_dma_enable_chan(mxs_chan);
+	g_mxs_chan = mxs_chan;
 
 	return mxs_dma_assign_cookie(mxs_chan);
 }
@@ -370,6 +372,7 @@ static void mxs_dma_free_chan_resources(struct dma_chan *chan)
 	clk_disable(mxs_dma->clk);
 }
 
+static int idx;
 static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
 		struct dma_chan *chan, struct scatterlist *sgl,
 		unsigned int sg_len, enum dma_data_direction direction,
@@ -381,7 +384,6 @@ static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
 	struct scatterlist *sg;
 	int i, j;
 	u32 *pio;
-	static int idx;
 
 	if (mxs_chan->status == DMA_IN_PROGRESS && !append)
 		return NULL;
@@ -606,6 +608,39 @@ err_out:
 	return ret;
 }
 
+
+void dump_dma_reg(void)
+{
+	int i;
+	u32 stat1;
+
+	struct mxs_dma_chan *mxs_chan = g_mxs_chan;
+	struct mxs_dma_engine *g_mxs_dma = mxs_chan->mxs_dma;
+	struct mxs_dma_ccw *ccw;
+
+	printk("------------------------DMA DUMP BEGIN ----------\n");
+	for (i = 0; i < 7; i++) {
+		stat1 = readl(g_mxs_dma->base + 0x10 * i);
+		printk("APBH REG :%x : %.8X\n", 0x10 * i, stat1);
+	}
+	for (i = 0; i < 7; i++) {
+		stat1 = readl(g_mxs_dma->base + 0x10 * i + 0x100);
+		printk("APBH REG :%x : %.8X\n", 0x10 * i + 0x100, stat1);
+	}
+
+	for (i = 0; i < idx; i++) {
+		int j;
+
+		ccw = &mxs_chan->ccw[i];
+		printk("[ %d ] : ME : %.8x, next : %.8x, bits : %.8x, bytes : %.8x, buf : %.8x\n",
+			i, mxs_chan->ccw_phys + sizeof(*ccw) * i,
+			ccw->next, ccw->bits, ccw->xfer_bytes, ccw->bufaddr);
+		for (j = 0; j < 3; j++)
+			printk("[ %d ] PIO[%d] : %.8x\n", i, j, ccw->pio_words[j]); 
+	}
+	printk("------------------------DMA DUMP END ------------\n");
+}
+
 static int __init mxs_dma_probe(struct platform_device *pdev)
 {
 	const struct platform_device_id *id_entry =
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 1c2cbc5..3d6895b 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -378,6 +378,7 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
 {
 	struct completion *dma_c = &this->dma_done;
 	int err;
+	extern void dump_dma_reg(void);
 
 	init_completion(dma_c);
 
@@ -391,6 +392,7 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
 	if (err) {
 		pr_info("DMA timeout, last DMA :%d\n", this->last_dma_type);
 		if (gpmi_debug & GPMI_DEBUG_CRAZY) {
+			dump_dma_reg();
 			gpmi_show_regs(this);
 			panic("-----------DMA FAILED------------------");
 		}
-- 
1.7.0.4



* Re: GPMI-NAND Status?
  2011-08-08 10:37     ` Huang Shijie
@ 2011-08-08 12:42       ` Koen Beel
  2011-08-09  6:36         ` Huang Shijie
  0 siblings, 1 reply; 33+ messages in thread
From: Koen Beel @ 2011-08-08 12:42 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Wolfram Sang, linux-mtd, Shawn Guo, shijie8, linux-arm-kernel,
	Lothar Waßmann

Hi,

On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie <b32955@freescale.com> wrote:
> Hi,
>>
>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>> gives a timeout. After that the dma for sdio is *not fully* working
>> anymore.
>>
> We need more log in following aspects:
> [1] apbh-dma registers
> [2] clk registers
> [3] gpmi registers
>
> Please git-apply the patch in the attachment.
> It will print out more DMA information WHEN dma-timeout occur.

I don't get it. What exactly are you trying to dump?
This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
of APBH channel 0, which is reserved....
Then it prints some debug info on channel 1 (ssp1) and then all
channel 2 registers except the debug register (ssp2 = not used here).

What info do you need?
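
For reference, the offsets that the two register loops in the patch read can
be enumerated with a small standalone sketch. This is plain user-space C, not
kernel code; the function and macro names are made up for illustration, and
only the constants (7 registers per loop, stride 0x10, second loop at +0x100)
come from dump_dma_reg() in the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants taken from the two register loops in dump_dma_reg(). */
#define DUMP_REGS_PER_LOOP 7
#define DUMP_REG_STRIDE    0x10
#define DUMP_SECOND_BASE   0x100

/*
 * Hypothetical helper (not part of the patch): fill 'out' with the offsets,
 * relative to the APBH base, that dump_dma_reg() reads. Returns the number
 * of offsets written, at most 'max'.
 */
size_t dump_dma_reg_offsets(uint32_t *out, size_t max)
{
	size_t n = 0;
	int i;

	assert(out != NULL);

	/* First loop in the patch: base + 0x10 * i */
	for (i = 0; i < DUMP_REGS_PER_LOOP && n < max; i++)
		out[n++] = DUMP_REG_STRIDE * i;

	/* Second loop in the patch: base + 0x10 * i + 0x100 */
	for (i = 0; i < DUMP_REGS_PER_LOOP && n < max; i++)
		out[n++] = DUMP_REG_STRIDE * i + DUMP_SECOND_BASE;

	return n;
}
```

With a 14-entry array this gives 0x00 to 0x60 and 0x100 to 0x160, so nothing
between 0x70 and 0xf0 is printed.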

Br,
Koen

>
> Best Regards
> Huang Shijie
>


* Re: GPMI-NAND Status?
  2011-08-08  9:19   ` Koen Beel
  2011-08-08 10:37     ` Huang Shijie
@ 2011-08-09  5:11     ` Huang Shijie
  2011-08-09  6:25       ` Koen Beel
  2011-08-09  9:45     ` Wolfram Sang
  2 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09  5:11 UTC (permalink / raw)
  To: Koen Beel
  Cc: Frank LI, Wolfram Sang, linux-mtd, Shawn Guo, linux-arm-kernel,
	Lothar Waßmann

Hi Koen:
> Hi Wolfram,
>
> Thanks for taking the initiative to summarize the current status.
> Also thanks to Huang Shijie for all the work done so far.
>
>
> On Mon, Aug 8, 2011 at 8:21 AM, Huang Shijie<b32955@freescale.com>  wrote:
>> Hi Wolfram:
>>> Hi,
>>>
>>> I am a bit uncertain how the state of the GPMI-NAND driver currently is,
>>> so
>>> I'll try to sum it up here. There is without doubt interest in getting the
>>> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
>>> wonder if we can join forces more effectively. First of all, I want to
>>> thank
>>> Huang Shijie for all his work so far which was already quite some effort;
>>> this
>>> sum-up is by no means meant as bashing, just trying to understand the
>>> status
>>> quo (Sidenote: I am more or less on holiday until Monday, so no time for
>>> real
>>> debugging myself. I write this mail so we hopefully gain a common
>>> understanding. When I am back to full strength, I can then start working
>>> on
>>> what seems apropriate)
>>>
>>> Issues with the current driver I am aware of:
>>>
>>> DMA timeouts [1]
>>> ================
>>>
>>> [    2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA
>>> :1
>>> [    3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>>
>>> Always reproducible by me when trying to format mtd0. Sometimes(always?)
>>> seen
>>> by Koen during boot (on read?). Never seen by Huang? It is currently
>>> unclear if
>> After I used a different .config, it never appears in my side.
> flash_eraseall of mtd1 works for me.
> ubi_format of mtd1 always gives the dma timeout
> reading/writing of mtd0/1 always gives the dma timeout
> I have seen dma timeout during boot if i try to enable ubi rootfs (so
> that's the same issue as dma time during read/write).
>
> I don't use mtd0 for testing as this contains my uboot.
>
> I tested using Huang's .config and the Linaro git but still see
> exactly the same issue.
>
>>> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say
>>> the
>>> issue is a show-stopper. We can't put a driver into mainline which leads
>>> to the
>>> above failure. The fact that there is _some_ configuration which works for
>>> someone does not help, it doesn't work for Koen and me at least. We need
> On my target, the mxs-dma is working for sdio until the gpmi-nand
> gives a timeout. After that the dma for sdio is *not fully* working
> anymore.
>
>> Hi Koen, do you test my uImage?
>> Does the timeout occur?
> I was not able to test you uImage. It ended with a "Kernel panic - not
> syncing: read error". See (off list) mail from last week.
>
>
>>> reliable drivers in mainline, so the issue needs to be resolved,
>>> regardless
>>> where the bug resides.
>> ok. I will debug it too.
>>
>>
>> Please test the driver again when you are back in the office.
>> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
>> Your mxs_defconfig may be missing Shawn Guo's patches.
>>
>> thanks.
>>
>>
>>> problem overwriting all-0xff data in NAND [2]
>>> =============================================
>>>
>>> Although it occurred only when writing JFFS2 images so far, this is a
>>> generic
>>> issue and needs to be fixed, right?
>>>
>> Artem said the fix belongs in the upper layer (jffs2), not in the driver.
>>
>> So I think I do not need to change the driver.
>>> ecclayout needs to be used to show that OOB is fully in use [1]
>>> ===============================================================
>>>
>>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver
>>> only
>>> working with UBIFS is surely not ready for mainline.
>>>
>> I have been working on mx6q in recent days, so I have had no time to fix it.
>> The mx6q runs well now.
>>
>> So I will fix the issue in the coming days.
>>
>>> Peculiarities
>>> =============
>>>
>>> There are a few issues which are odd. I don't know if some are mainly
>>> intended
>>> for debugging, yet they shouldn't be in a mainline driver. At least:
>>>
>>> * custom sysfs-entries
>> My sysfs entries are in the GPMI-NAND directory.
>> Does being a mainline driver mean I should not have any sysfs entries?
>> If so, I can remove them.
>>
>>> * custom kernel command line parameters
>> The kernel command line parameter 'gpmi_nand' is there to avoid conflicts
>> with other modules such as SD.
>>
>> If it is removed, I have to use a different config to resolve the issue,
>> which is not better either. :(
>>
>>> * namespacing (some functions have no prefix, some have "mil_", some have
>>> mx23)
>>>    (I think 'mil' means 'mtd interface layer', but why is that needed?)
>> The mil is used to keep gpmi_nand_data{} simple.
>> Without it, gpmi_nand_data{} would be very big.
>>
>> The functions which have the mx23 prefix are only used on mx23.
>> The functions which have no prefix can be used on both mx28 and mx23.
>>
>>> Complexity
>>> ==========
>>>
>>> The driver is not easy to review. I wonder if it makes sense to use
>>> incremental
>>> patches for it? maybe making it a staging driver could be a solution for
>>> that?
>> Frankly speaking, the current driver is probably already the smallest version.
>>
>> I have not even added the on-chip BBT feature yet.
>>> Huang, are you interested in accepting patches or do you prefer we just
>>> point
>>> at certain code and you then fix it? Starting with a simpler driver and
>>> then
>> Feel free to mail me patches; they are welcome.
>>
>>
>>> adding stuff might be another option if we can't chase all the bugs in the
>>> current driver.
>>>
>>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we
>>> can
>>> meet in IRC or something to work that out? Is there interest in that?
>> What about gtalk?
> Anything is good for me.
> Could also be useful to make sure we test on the same HW as much as
> possible and are using the same source tree.
> HW I have:
> - mx23evk rev C1
> - mx23evk rev B2
> - own target hw using mx23 lqfp-128 chip and different type of ddr and nand.
>
My mx23 test board uses the 169BGA package, which is different from yours.

Could you get a board with the 169BGA package?

I think the DMA timeout is caused by the different package type.

Best Regards
Huang Shijie

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09  5:11     ` Huang Shijie
@ 2011-08-09  6:25       ` Koen Beel
  2011-08-09  6:40         ` Huang Shijie
  0 siblings, 1 reply; 33+ messages in thread
From: Koen Beel @ 2011-08-09  6:25 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Frank LI, Wolfram Sang, linux-mtd, Shawn Guo, linux-arm-kernel,
	Lothar Waßmann

Hi,

On Tue, Aug 9, 2011 at 7:11 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Koen:
>> Anything is good for me.
>> Could also be useful to make sure we test on the same HW as much as
>> possible and are using the same source tree.
>> HW I have:
>> - mx23evk rev C1
>> - mx23evk rev B2
>> - own target hw using mx23 lqfp-128 chip and different type of ddr and
>> nand.
>>
> My test mx23 board is 169BGA package which is different from yours.
>
> Could you get the 169BGA package board?
>
> I think the DMA timeout is caused by the different package type.

I suppose my mx23evk is exactly the same as yours (same revision), and
it has the bga169 package. My actual target has lqfp128. But on both
boards I get the same issues.
So I don't think the package type has anything to do with the dma
timeout issue. After all, the silicon inside is the same.

I'm trying to debug a little further.

Regards,
Koen

>
> Best Regards
> Huang Shijie
>
>
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-08 12:42       ` Koen Beel
@ 2011-08-09  6:36         ` Huang Shijie
  2011-08-09  7:58           ` Koen Beel
  0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09  6:36 UTC (permalink / raw)
  To: Koen Beel
  Cc: Wolfram Sang, linux-mtd, Shawn Guo, shijie8, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 1280 bytes --]

Hi Koen:
> Hi,
>
> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com>  wrote:
>> Hi,
>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>> gives a timeout. After that the dma for sdio is *not fully* working
>>> anymore.
>>>
>> We need more log in following aspects:
>> [1] apbh-dma registers
>> [2] clk registers
>> [3] gpmi registers
>>
>> Please git-apply the patch in the attachment.
>> It will print out more DMA information WHEN dma-timeout occur.
> Don't get it. What exactly are you trying to dump?
> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
> of APBH channel0 which is reserved....
Sorry, I intended to print out channel 4 (NAND_DEVICE0).

What I want to know is:
  when the dma timeout occurs, is it caused by the GPMI or by the
DMA itself?


Please try the new patch.

Best Regards
Huang Shijie
> Then it prints some debug info on channel 1 (ssp1) and then all
> channel 2 registers except the debug register (ssp2 = not used here).
>
> What info do you need?
>
> Br,
> Koen
>
>> Best Regards
>> Huang Shijie
>>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>


[-- Attachment #2: 0001-print_more_log.patch --]
[-- Type: text/x-patch, Size: 3507 bytes --]

>From 69b5bf4d3bf73a89b521a7c592f5bea1d66c2755 Mon Sep 17 00:00:00 2001
From: Huang Shijie <b32955@freescale.com>
Date: Mon, 8 Aug 2011 18:39:11 +0800
Subject: [PATCH] print_more_log

Print out the DMA registers when a timeout occurs.

Signed-off-by: Huang Shijie <b32955@freescale.com>
---
 drivers/dma/mxs-dma.c                  |   37 +++++++++++++++++++++++++++++++-
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c |    2 +
 2 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/drivers/dma/mxs-dma.c b/drivers/dma/mxs-dma.c
index 88aad4f..755cbfc 100644
--- a/drivers/dma/mxs-dma.c
+++ b/drivers/dma/mxs-dma.c
@@ -130,6 +130,7 @@ struct mxs_dma_engine {
 	struct mxs_dma_chan		mxs_chans[MXS_DMA_CHANNELS];
 };
 
+struct mxs_dma_chan *g_mxs_chan;
 static void mxs_dma_reset_chan(struct mxs_dma_chan *mxs_chan)
 {
 	struct mxs_dma_engine *mxs_dma = mxs_chan->mxs_dma;
@@ -239,6 +240,7 @@ static dma_cookie_t mxs_dma_tx_submit(struct dma_async_tx_descriptor *tx)
 	struct mxs_dma_chan *mxs_chan = to_mxs_dma_chan(tx->chan);
 
 	mxs_dma_enable_chan(mxs_chan);
+	g_mxs_chan = mxs_chan;
 
 	return mxs_dma_assign_cookie(mxs_chan);
 }
@@ -370,6 +372,7 @@ static void mxs_dma_free_chan_resources(struct dma_chan *chan)
 	clk_disable(mxs_dma->clk);
 }
 
+static int idx;
 static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
 		struct dma_chan *chan, struct scatterlist *sgl,
 		unsigned int sg_len, enum dma_data_direction direction,
@@ -381,7 +384,6 @@ static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
 	struct scatterlist *sg;
 	int i, j;
 	u32 *pio;
-	static int idx;
 
 	if (mxs_chan->status == DMA_IN_PROGRESS && !append)
 		return NULL;
@@ -606,6 +608,39 @@ err_out:
 	return ret;
 }
 
+
+void dump_dma_reg(void)
+{
+	int i;
+	u32 stat1;
+
+	struct mxs_dma_chan *mxs_chan = g_mxs_chan;
+	struct mxs_dma_engine *g_mxs_dma = mxs_chan->mxs_dma;
+	struct mxs_dma_ccw *ccw;
+
+	printk("------------------------DMA DUMP BEGIN ----------\n");
+	for (i = 0; i < 7; i++) {
+		stat1 = readl(g_mxs_dma->base + 0x10 * i);
+		printk("APBH REG :%x : %.8X\n", 0x10 * i, stat1);
+	}
+	for (i = 0; i < 7; i++) {
+		stat1 = readl(g_mxs_dma->base + 0x10 * i + 0x400);
+		printk("APBH REG :%x : %.8X\n", 0x10 * i + 0x400, stat1);
+	}
+
+	for (i = 0; i < idx; i++) {
+		int j;
+
+		ccw = &mxs_chan->ccw[i];
+		printk("[ %d ] : ME : %.8x, next : %.8x, bits : %.8x, bytes : %.8x, buf : %.8x\n",
+			i, mxs_chan->ccw_phys + sizeof(*ccw) * i,
+			ccw->next, ccw->bits, ccw->xfer_bytes, ccw->bufaddr);
+		for (j = 0; j < 3; j++)
+			printk("[ %d ] PIO[%d] : %.8x\n", i, j, ccw->pio_words[j]); 
+	}
+	printk("------------------------DMA DUMP END ------------\n");
+}
+
 static int __init mxs_dma_probe(struct platform_device *pdev)
 {
 	const struct platform_device_id *id_entry =
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 1c2cbc5..3d6895b 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -378,6 +378,7 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
 {
 	struct completion *dma_c = &this->dma_done;
 	int err;
+	extern void dump_dma_reg(void);
 
 	init_completion(dma_c);
 
@@ -391,6 +392,7 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
 	if (err) {
 		pr_info("DMA timeout, last DMA :%d\n", this->last_dma_type);
 		if (gpmi_debug & GPMI_DEBUG_CRAZY) {
+			dump_dma_reg();
 			gpmi_show_regs(this);
 			panic("-----------DMA FAILED------------------");
 		}
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09  6:25       ` Koen Beel
@ 2011-08-09  6:40         ` Huang Shijie
  0 siblings, 0 replies; 33+ messages in thread
From: Huang Shijie @ 2011-08-09  6:40 UTC (permalink / raw)
  To: Koen Beel
  Cc: Frank LI, Wolfram Sang, linux-mtd, Shawn Guo, linux-arm-kernel,
	Lothar Waßmann

Hi,
> Hi,
>
> On Tue, Aug 9, 2011 at 7:11 AM, Huang Shijie<b32955@freescale.com>  wrote:
>> Hi Koen:
>>> Anything is good for me.
>>> Could also be useful to make sure we test on the same HW as much as
>>> possible and are using the same source tree.
>>> HW I have:
>>> - mx23evk rev C1
>>> - mx23evk rev B2
>>> - own target hw using mx23 lqfp-128 chip and different type of ddr and
>>> nand.
>>>
>> My test mx23 board is 169BGA package which is different from yours.
>>
>> Could you get the 169BGA package board?
>>
>> I think the DMA timeout is caused by the different package type.
> I suppose my mx23evk is right the same as you have (same revision) and
> this has the bga169 package. My actual target has lqfp128. But on both
> boards I get the same issues.

Could you test my kernel on your side?

I can provide you with the sd_loader and the kernel.
You can use an SD card to store the rootfs.

Best Regards
Huang Shijie
> So I don't think the package type has something to do with the dma
> timeout issue. After all the silicon inside is the same thing.
>
> I'm trying to debug a little further.
>
> Regards,
> Koen
>
>> Best Regards
>> Huang Shijie
>>
>>
>>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09  6:36         ` Huang Shijie
@ 2011-08-09  7:58           ` Koen Beel
  2011-08-09  8:18             ` Huang Shijie
  0 siblings, 1 reply; 33+ messages in thread
From: Koen Beel @ 2011-08-09  7:58 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Wolfram Sang, linux-mtd, Shawn Guo, shijie8, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 5658 bytes --]

Hi,



On Tue, Aug 9, 2011 at 8:36 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Koen:
>>
>> Hi,
>>
>> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com>
>>  wrote:
>>>
>>> Hi,
>>>>
>>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>>> gives a timeout. After that the dma for sdio is *not fully* working
>>>> anymore.
>>>>
>>> We need more log in following aspects:
>>> [1] apbh-dma registers
>>> [2] clk registers
>>> [3] gpmi registers
>>>
>>> Please git-apply the patch in the attachment.
>>> It will print out more DMA information WHEN dma-timeout occur.
>>
>> Don't get it. What exactly are you trying to dump?
>> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
>> of APBH channel0 which is reserved....
>
> sorry, I intended to print out the channel 4(NAND_DEVICE0).
>
> I want to know that:
>  When the dma timeout occurs, whether it caused by the GPMI or by the DMA
> itself.

Ok, I was a little confused about the addresses, but it seems you
are using mx28 (and the corresponding addresses). The APBH dma for mx23
uses different addresses according to the datasheet.
So I adjusted the patch a little for mx23, see attachment.
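
To make the address difference concrete, here is a minimal sketch of how the
channel 4 offsets can be computed for mx23. It assumes the layout implied by
the dump in this mail: the global registers (CTRL0, CTRL1, CTRL2, DEVSEL) sit
at 0x00-0x30, channel 0 starts at 0x40, and each channel has seven registers
spaced 0x10 apart, giving a 0x70 stride. The macro, enum and function names
are made up for illustration and are not taken from mxs-dma.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed i.MX23 APBH layout; hypothetical names, not from mxs-dma.c. */
#define APBH_MX23_CH0_BASE 0x040u /* first per-channel register of channel 0 */
#define APBH_CH_STRIDE     0x070u /* 7 registers * 0x10 bytes per channel */
#define APBH_REG_STEP      0x010u /* distance between two registers */

/* Per-channel registers in the order they show up in the dump. */
enum apbh_ch_reg {
	CH_CURCMDAR, CH_NXTCMDAR, CH_CMD, CH_BAR, CH_SEMA, CH_DEBUG1, CH_DEBUG2
};

/* Offset of a per-channel register relative to the APBH base address. */
static uint32_t apbh_mx23_ch_reg(unsigned int channel, enum apbh_ch_reg reg)
{
	assert(reg <= CH_DEBUG2);
	return APBH_MX23_CH0_BASE + channel * APBH_CH_STRIDE +
	       (uint32_t)reg * APBH_REG_STEP;
}

/* Build with -DAPBH_OFFSET_DEMO to print the channel 4 offsets. */
#ifdef APBH_OFFSET_DEMO
int main(void)
{
	printf("CH4 CURCMDAR offset: 0x%03x\n",
	       (unsigned int)apbh_mx23_ch_reg(4, CH_CURCMDAR));
	printf("CH4 DEBUG2 offset:   0x%03x\n",
	       (unsigned int)apbh_mx23_ch_reg(4, CH_DEBUG2));
	return 0;
}
#endif
```

With these assumptions channel 4 starts at 0x200, which is the base used in
the attached patch.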

Here is the log with some comments added on the dma.

# ubiformat /dev/mtd1
ubiformat: mtd1 (nand), size 20971520 bytes (20.0 MiB), 40 eraseblocks
of 524288 bytes (512.0 KiB), min. I/O size 4096 bytes
libscan: scanning eraseblock 0 --  2 % complete  [   86.720000] [
start_dma_without_bch_irq : 393 ] DMA timeout, last DMA :1
[   86.720000] ------------------------DMA DUMP BEGIN ----------
[   86.730000] APBH REG :0 : 30000000   // -> HW_APBH_CTRL0:
AHB_BURST8_EN, APB_BURST4_EN
[   86.730000] APBH REG :10 : 00FF0000   // -> HW_APBH_CTRL1:
CHX_CMDCMPLT_IRQ_EN, no cmdcmplt_irq
[   86.740000] APBH REG :20 : 00000000   // -> HW_APBH_CTRL2: no error_irq
[   86.740000] APBH REG :30 : 00000000   // -> HW_APBH_DEVSEL: "N/A
for apbh bridge dma."
[   86.750000] APBH CH4 REG :200 : 418D7098 // executing last dma
command of command chain (see below)
[   86.750000] APBH CH4 REG :210 : 00000000 // no next command, ok
[   86.750000] APBH CH4 REG :220 : 000001C8 // HW_APBH_CH4_CMD:
COMMAND = NO DMA TRANSFER, IRQONCMPLT, WAIT4ENDCMD, SEMAPHORE,
HALTONTERMINATE
[   86.760000] APBH CH4 REG :230 : 00000000 // HW_APBH_CH4_BAR:
"Address of system memory buffer to be read or written over the AHB
bus." -> strange value ...
[   86.760000] APBH CH4 REG :240 : 00010000 // HW_APBH_CH4_SEMA:
semaphore counter is 1
[   86.770000] APBH CH4 REG :250 : 03A00015 // HW_APBH_CH4_DEBUG1:
LOCK, NEXTCMDADDRVALID, RD_FIFO_EMPTY, WR_FIFO_EMPTY,  STATEMACHINE =
"WAIT_END = 0x15 When the Wait for Command End bit is set, the state
machine enters this state until the DMA device indicates that the
command is complete."
[   86.770000] APBH CH4 REG :260 : 00000000 // -> HW_APBH_CH4_DEBUG2:
no apb or ahb bytes remaining for transfer
[   86.780000] [ 0 ] : ME : 418d7000, next : 418d704c, bits :
00002304, bytes : 00000000, buf : 00000000
[   86.790000] [ 0 ] PIO[0] : 03800000
[   86.790000] [ 0 ] PIO[1] : 00000000
[   86.800000] [ 0 ] PIO[2] : 00000000
[   86.800000] [ 1 ] : ME : 418d704c, next : 418d7098, bits :
00006304, bytes : 00000001, buf : 4181b000
[   86.810000] [ 1 ] PIO[0] : 018010da
[   86.810000] [ 1 ] PIO[1] : 00000000
[   86.820000] [ 1 ] PIO[2] : 000011ff
[   86.820000] [ 2 ] : ME : 418d7098, next : 00000000, bits :
000023c8, bytes : 00000000, buf : 00000000
[   86.830000] [ 2 ] PIO[0] : 038010da
[   86.840000] [ 2 ] PIO[1] : 00000000
[   86.840000] [ 2 ] PIO[2] : 00000000
[   86.840000] ------------------------DMA DUMP END ------------
[   86.850000] [ gpmi_show_regs : 076 ] -------------- Show GPMI
registers ----------
[   86.860000] [ gpmi_show_regs : 079 ] offset 0x000 : 0x238010da
[   86.870000] [ gpmi_show_regs : 079 ] offset 0x010 : 0x00000000
[   86.870000] [ gpmi_show_regs : 079 ] offset 0x020 : 0x000011ff
[   86.880000] [ gpmi_show_regs : 079 ] offset 0x030 : 0x000010da
[   86.890000] [ gpmi_show_regs : 079 ] offset 0x040 : 0x40f0c480
[   86.890000] [ gpmi_show_regs : 079 ] offset 0x050 : 0x40f09000
[   86.900000] [ gpmi_show_regs : 079 ] offset 0x060 : 0x0004000c
[   86.910000] [ gpmi_show_regs : 079 ] offset 0x070 : 0x00010203
[   86.910000] [ gpmi_show_regs : 079 ] offset 0x080 : 0x05000000
[   86.920000] [ gpmi_show_regs : 079 ] offset 0x090 : 0x09020101
[   86.920000] [ gpmi_show_regs : 079 ] offset 0x0a0 : 0x00000030
[   86.930000] [ gpmi_show_regs : 079 ] offset 0x0b0 : 0x80000010
[   86.940000] [ gpmi_show_regs : 079 ] offset 0x0c0 : 0x100000ba
[   86.940000] [ gpmi_show_regs : 079 ] offset 0x0d0 : 0x03000000
[   86.950000] [ gpmi_show_regs : 081 ] -------------- Show GPMI
registers end ----------
[   86.960000] Kernel panic - not syncing: -----------DMA
FAILED------------------
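
The annotations in the dump can also be checked mechanically. Below is a
minimal sketch, not driver code: it finds which descriptor of the chain the
channel was executing and tests whether the channel is stuck waiting for the
device. The bit positions are assumptions chosen to match the flag names
annotated above (0x1C8 = bits 3, 6, 7 and 8), the 0x4c descriptor stride is
read off the "ME" addresses in the dump, and all macro and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed HW_APBH_CHn_CMD bits; hypothetical names for illustration. */
#define CMD_COMMAND_MASK  0x3u      /* 0 = no DMA transfer */
#define CMD_IRQONCMPLT    (1u << 3)
#define CMD_SEMAPHORE     (1u << 6)
#define CMD_WAIT4ENDCMD   (1u << 7)
#define CMD_HALTONTERM    (1u << 8)

/* Assumed HW_APBH_CHn_DEBUG1 state machine field. */
#define DEBUG1_STATE_MASK 0x1fu
#define DEBUG1_WAIT_END   0x15u

/*
 * Index of the descriptor located at cur_addr in a contiguous chain that
 * starts at chain_base. Returns -1 if cur_addr is not a descriptor start
 * or lies outside the chain.
 */
static int ccw_index(uint32_t cur_addr, uint32_t chain_base,
		     uint32_t ccw_size, int count)
{
	uint32_t off;

	if (ccw_size == 0 || count <= 0 || cur_addr < chain_base)
		return -1;
	off = cur_addr - chain_base;
	if (off % ccw_size != 0 || off / ccw_size >= (uint32_t)count)
		return -1;
	return (int)(off / ccw_size);
}

/* 1 if the command asked to wait for the device and the channel still does. */
static int channel_waits_for_device(uint32_t cmd, uint32_t debug1)
{
	return (cmd & CMD_WAIT4ENDCMD) &&
	       (debug1 & DEBUG1_STATE_MASK) == DEBUG1_WAIT_END;
}

/* Build with -DCCW_DECODE_DEMO to run it on the values from the dump. */
#ifdef CCW_DECODE_DEMO
int main(void)
{
	int idx = ccw_index(0x418D7098u, 0x418d7000u, 0x4cu, 3);

	assert(idx >= 0);
	printf("current descriptor index: %d\n", idx);
	printf("waiting for device: %d\n",
	       channel_waits_for_device(0x000001C8u, 0x03A00015u));
	return 0;
}
#endif
```

Fed with the values from the dump, this points at the last descriptor of the
chain, with the channel still waiting for the device to signal the end of the
command.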

Br,
Koen

>
>
> Please try the new patch.
>
> Best Regards
> Huang Shijie
>>
>> Then it prints some debug info on channel 1 (ssp1) and then all
>> channel 2 registers except the debug register (ssp2 = not used here).
>>
>> What info do you need?
>>
>> Br,
>> Koen
>>
>>> Best Regards
>>> Huang Shijie
>>>

[-- Attachment #2: 0008-Added-extra-dma-log-for-ch4-nand0.patch --]
[-- Type: text/x-patch, Size: 3433 bytes --]

From 457e7328e0b11e7fc88884412952038a9cae5248 Mon Sep 17 00:00:00 2001
From: Koen Beel <koen.beel@barco.com>
Date: Tue, 9 Aug 2011 09:56:42 +0200
Subject: [PATCH 8/8] Added extra dma log for ch4 (nand0).

---
 drivers/dma/mxs-dma.c                  |   37 +++++++++++++++++++++++++++++++-
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c |    2 +
 2 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/drivers/dma/mxs-dma.c b/drivers/dma/mxs-dma.c
index 88aad4f..09c6c7b 100644
--- a/drivers/dma/mxs-dma.c
+++ b/drivers/dma/mxs-dma.c
@@ -130,6 +130,7 @@ struct mxs_dma_engine {
 	struct mxs_dma_chan		mxs_chans[MXS_DMA_CHANNELS];
 };
 
+struct mxs_dma_chan *g_mxs_chan;
 static void mxs_dma_reset_chan(struct mxs_dma_chan *mxs_chan)
 {
 	struct mxs_dma_engine *mxs_dma = mxs_chan->mxs_dma;
@@ -239,6 +240,7 @@ static dma_cookie_t mxs_dma_tx_submit(struct dma_async_tx_descriptor *tx)
 	struct mxs_dma_chan *mxs_chan = to_mxs_dma_chan(tx->chan);
 
 	mxs_dma_enable_chan(mxs_chan);
+	g_mxs_chan = mxs_chan;
 
 	return mxs_dma_assign_cookie(mxs_chan);
 }
@@ -370,6 +372,7 @@ static void mxs_dma_free_chan_resources(struct dma_chan *chan)
 	clk_disable(mxs_dma->clk);
 }
 
+static int idx;
 static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
 		struct dma_chan *chan, struct scatterlist *sgl,
 		unsigned int sg_len, enum dma_data_direction direction,
@@ -381,7 +384,6 @@ static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
 	struct scatterlist *sg;
 	int i, j;
 	u32 *pio;
-	static int idx;
 
 	if (mxs_chan->status == DMA_IN_PROGRESS && !append)
 		return NULL;
@@ -606,6 +608,39 @@ err_out:
 	return ret;
 }
 
+
+void dump_dma_reg(void)
+{
+	int i;
+	u32 stat1;
+
+	struct mxs_dma_chan *mxs_chan = g_mxs_chan;
+	struct mxs_dma_engine *g_mxs_dma = mxs_chan->mxs_dma;
+	struct mxs_dma_ccw *ccw;
+
+	printk("------------------------DMA DUMP BEGIN ----------\n");
+	for (i = 0; i < 4; i++) {
+		stat1 = readl(g_mxs_dma->base + 0x10 * i);
+		printk("APBH REG :%x : %.8X\n", 0x10 * i, stat1);
+	}
+	for (i = 0; i < 7; i++) {
+		stat1 = readl(g_mxs_dma->base + 0x10 * i + 0x200);
+		printk("APBH CH4 REG :%x : %.8X\n", 0x10 * i + 0x200, stat1);
+	}
+
+	for (i = 0; i < idx; i++) {
+		int j;
+
+		ccw = &mxs_chan->ccw[i];
+		printk("[ %d ] : ME : %.8x, next : %.8x, bits : %.8x, bytes : %.8x, buf : %.8x\n",
+			i, mxs_chan->ccw_phys + sizeof(*ccw) * i,
+			ccw->next, ccw->bits, ccw->xfer_bytes, ccw->bufaddr);
+		for (j = 0; j < 3; j++)
+			printk("[ %d ] PIO[%d] : %.8x\n", i, j, ccw->pio_words[j]); 
+	}
+	printk("------------------------DMA DUMP END ------------\n");
+}
+
 static int __init mxs_dma_probe(struct platform_device *pdev)
 {
 	const struct platform_device_id *id_entry =
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 1c2cbc5..3d6895b 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -378,6 +378,7 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
 {
 	struct completion *dma_c = &this->dma_done;
 	int err;
+	extern void dump_dma_reg(void);
 
 	init_completion(dma_c);
 
@@ -391,6 +392,7 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
 	if (err) {
 		pr_info("DMA timeout, last DMA :%d\n", this->last_dma_type);
 		if (gpmi_debug & GPMI_DEBUG_CRAZY) {
+			dump_dma_reg();
 			gpmi_show_regs(this);
 			panic("-----------DMA FAILED------------------");
 		}
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09  7:58           ` Koen Beel
@ 2011-08-09  8:18             ` Huang Shijie
  2011-08-09  8:25               ` Koen Beel
  0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09  8:18 UTC (permalink / raw)
  To: Koen Beel
  Cc: Wolfram Sang, linux-mtd, Shawn Guo, shijie8, linux-arm-kernel,
	Lothar Waßmann

Hi Koen:
thanks for your test.
> Hi,
>
>
>
> On Tue, Aug 9, 2011 at 8:36 AM, Huang Shijie<b32955@freescale.com>  wrote:
>> Hi Koen:
>>> Hi,
>>>
>>> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com>
>>>   wrote:
>>>> Hi,
>>>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>>>> gives a timeout. After that the dma for sdio is *not fully* working
>>>>> anymore.
>>>>>
>>>> We need more log in following aspects:
>>>> [1] apbh-dma registers
>>>> [2] clk registers
>>>> [3] gpmi registers
>>>>
>>>> Please git-apply the patch in the attachment.
>>>> It will print out more DMA information WHEN dma-timeout occur.
>>> Don't get it. What exactly are you trying to dump?
>>> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
>>> of APBH channel0 which is reserved....
>> sorry, I intended to print out the channel 4(NAND_DEVICE0).
>>
>> I want to know that:
>>   When the dma timeout occurs, whether it caused by the GPMI or by the DMA
>> itself.
> Ok, I was a little confused about the addresses, but it seems like you
> are using mx28 (and corresponding addresses). APBH dma for mx23 has
> different address according to the datasheet.
> So I adjusted the patch a little for mx23, see attachment.
>
you are right. My address was wrong.
> Here is the log with some comments added on the dma.
>
> # ubiformat /dev/mtd1
> ubiformat: mtd1 (nand), size 20971520 bytes (20.0 MiB), 40 eraseblocks
> of 524288 bytes (512.0 KiB), min. I/O size 4096 bytes
> libscan: scanning eraseblock 0 --  2 % complete  [   86.720000] [
> start_dma_without_bch_irq : 393 ] DMA timeout, last DMA :1
> [   86.720000] ------------------------DMA DUMP BEGIN ----------
> [   86.730000] APBH REG :0 : 30000000   // ->  HW_APBH_CTRL0:
> AHB_BURST8_EN, APB_BURST4_EN
> [   86.730000] APBH REG :10 : 00FF0000   // ->  HW_APBH_CTRL1:
> CHX_CMDCMPLT_IRQ_EN, no cmdcmplt_irq
> [   86.740000] APBH REG :20 : 00000000   // ->  HW_APBH_CTRL2: no error_irq
> [   86.740000] APBH REG :30 : 00000000   // ->  HW_APBH_DEVSEL: "N/A
> for apbh bridge dma."
> [   86.750000] APBH CH4 REG :200 : 418D7098 // executing last dma
> command of command chain (see below)
> [   86.750000] APBH CH4 REG :210 : 00000000 // no next command, ok
> [   86.750000] APBH CH4 REG :220 : 000001C8 // HW_APBH_CH4_CMD:
> COMMAND = NO DMA TRANSFER, IRQONCMPLT, WAIT4ENDCMD, SEMAPHORE,
> HALTONTERMINATE
> [   86.760000] APBH CH4 REG :230 : 00000000 // HW_APBH_CH4_BAR:
> "Address of system memory buffer to be read or written over the AHB
> bus." ->  strange value ...
> [   86.760000] APBH CH4 REG :240 : 00010000 // HW_APBH_CH4_SEMA:
> semaphore counter is 1
> [   86.770000] APBH CH4 REG :250 : 03A00015 // HW_APBH_CH4_DEBUG1:
> LOCK, NEXTCMDADDRVALID, RD_FIFO_EMPTY, WR_FIFO_EMPTY,  STATEMACHINE =
> "WAIT_END = 0x15 When the Wait for Command End bit is set, the state
> machine enters this state until the DMA device indicates that the
> command is complete."
> [   86.770000] APBH CH4 REG :260 : 00000000 // ->  HW_APBH_CH4_DEBUG2:
> no apb or ahb bytes remaining for transfer
> [   86.780000] [ 0 ] : ME : 418d7000, next : 418d704c, bits :
> 00002304, bytes : 00000000, buf : 00000000
> [   86.790000] [ 0 ] PIO[0] : 03800000
> [   86.790000] [ 0 ] PIO[1] : 00000000
> [   86.800000] [ 0 ] PIO[2] : 00000000
> [   86.800000] [ 1 ] : ME : 418d704c, next : 418d7098, bits :
> 00006304, bytes : 00000001, buf : 4181b000
> [   86.810000] [ 1 ] PIO[0] : 018010da
> [   86.810000] [ 1 ] PIO[1] : 00000000
> [   86.820000] [ 1 ] PIO[2] : 000011ff
> [   86.820000] [ 2 ] : ME : 418d7098, next : 00000000, bits :
It hangs here.

> 000023c8, bytes : 00000000, buf : 00000000
> [   86.830000] [ 2 ] PIO[0] : 038010da
> [   86.840000] [ 2 ] PIO[1] : 00000000
> [   86.840000] [ 2 ] PIO[2] : 00000000
> [   86.840000] ------------------------DMA DUMP END ------------
> [   86.850000] [ gpmi_show_regs : 076 ] -------------- Show GPMI
> registers ----------
> [   86.860000] [ gpmi_show_regs : 079 ] offset 0x000 : 0x238010da
> [   86.870000] [ gpmi_show_regs : 079 ] offset 0x010 : 0x00000000
> [   86.870000] [ gpmi_show_regs : 079 ] offset 0x020 : 0x000011ff
> [   86.880000] [ gpmi_show_regs : 079 ] offset 0x030 : 0x000010da
> [   86.890000] [ gpmi_show_regs : 079 ] offset 0x040 : 0x40f0c480
> [   86.890000] [ gpmi_show_regs : 079 ] offset 0x050 : 0x40f09000
> [   86.900000] [ gpmi_show_regs : 079 ] offset 0x060 : 0x0004000c
> [   86.910000] [ gpmi_show_regs : 079 ] offset 0x070 : 0x00010203
> [   86.910000] [ gpmi_show_regs : 079 ] offset 0x080 : 0x05000000
> [   86.920000] [ gpmi_show_regs : 079 ] offset 0x090 : 0x09020101
> [   86.920000] [ gpmi_show_regs : 079 ] offset 0x0a0 : 0x00000030
> [   86.930000] [ gpmi_show_regs : 079 ] offset 0x0b0 : 0x80000010
> [   86.940000] [ gpmi_show_regs : 079 ] offset 0x0c0 : 0x100000ba
> [   86.940000] [ gpmi_show_regs : 079 ] offset 0x0d0 : 0x03000000
> [   86.950000] [ gpmi_show_regs : 081 ] -------------- Show GPMI
> registers end ----------
> [   86.960000] Kernel panic - not syncing: -----------DMA
> FAILED------------------
Please post the call stack from the point where the panic occurs.
I also want to know where the code is running when the timeout occurs.

My gmail is shijie8@gmail.com
We can talk more on gtalk.

Best Regards
Huang Shijie

> Br,
> Koen
>
>>
>> Please try the new patch.
>>
>> Best Regards
>> Huang Shijie
>>> Then it prints some debug info on channel 1 (ssp1) and then all
>>> channel 2 registers except the debug register (ssp2 = not used here).
>>>
>>> What info do you need?
>>>
>>> Br,
>>> Koen
>>>
>>>> Best Regards
>>>> Huang Shijie
>>>>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09  8:18             ` Huang Shijie
@ 2011-08-09  8:25               ` Koen Beel
  0 siblings, 0 replies; 33+ messages in thread
From: Koen Beel @ 2011-08-09  8:25 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Wolfram Sang, linux-mtd, Shawn Guo, shijie8, linux-arm-kernel,
	Lothar Waßmann

On Tue, Aug 9, 2011 at 10:18 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Koen:
> thanks for your test.
>>
>> Hi,
>>
>>
>>
>> On Tue, Aug 9, 2011 at 8:36 AM, Huang Shijie<b32955@freescale.com>  wrote:
>>>
>>> Hi Koen:
>>>>
>>>> Hi,
>>>>
>>>> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com>
>>>>  wrote:
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>>>>> gives a timeout. After that the dma for sdio is *not fully* working
>>>>>> anymore.
>>>>>>
>>>>> We need more log in following aspects:
>>>>> [1] apbh-dma registers
>>>>> [2] clk registers
>>>>> [3] gpmi registers
>>>>>
>>>>> Please git-apply the patch in the attachment.
>>>>> It will print out more DMA information WHEN dma-timeout occur.
>>>>
>>>> Don't get it. What exactly are you trying to dump?
>>>> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
>>>> of APBH channel0 which is reserved....
>>>
>>> sorry, I intended to print out the channel 4(NAND_DEVICE0).
>>>
>>> I want to know that:
>>>  When the dma timeout occurs, whether it caused by the GPMI or by the DMA
>>> itself.
>>
>> Ok, I was a little confused about the addresses, but it seems like you
>> are using mx28 (and corresponding addresses). APBH dma for mx23 has
>> different address according to the datasheet.
>> So I adjusted the patch a little for mx23, see attachment.
>>
> you are right. My address was wrong.
>>
>> Here is the log with some comments added on the dma.
>>
>> # ubiformat /dev/mtd1
>> ubiformat: mtd1 (nand), size 20971520 bytes (20.0 MiB), 40 eraseblocks
>> of 524288 bytes (512.0 KiB), min. I/O size 4096 bytes
>> libscan: scanning eraseblock 0 --  2 % complete  [ 86.720000] [
>> start_dma_without_bch_irq : 393 ] DMA timeout, last DMA :1
>> [ 86.720000] ------------------------DMA DUMP BEGIN ----------
>> [ 86.730000] APBH REG :0 : 30000000   // ->  HW_APBH_CTRL0:
>> AHB_BURST8_EN, APB_BURST4_EN
>> [ 86.730000] APBH REG :10 : 00FF0000   // ->  HW_APBH_CTRL1:
>> CHX_CMDCMPLT_IRQ_EN, no cmdcmplt_irq
>> [ 86.740000] APBH REG :20 : 00000000   // ->  HW_APBH_CTRL2: no error_irq
>> [ 86.740000] APBH REG :30 : 00000000   // ->  HW_APBH_DEVSEL: "N/A
>> for apbh bridge dma."
>> [ 86.750000] APBH CH4 REG :200 : 418D7098 // executing last dma
>> command of command chain (see below)
>> [ 86.750000] APBH CH4 REG :210 : 00000000 // no next command, ok
>> [ 86.750000] APBH CH4 REG :220 : 000001C8 // HW_APBH_CH4_CMD:
>> COMMAND = NO DMA TRANSFER, IRQONCMPLT, WAIT4ENDCMD, SEMAPHORE,
>> HALTONTERMINATE
>> [ 86.760000] APBH CH4 REG :230 : 00000000 // HW_APBH_CH4_BAR:
>> "Address of system memory buffer to be read or written over the AHB
>> bus." ->  strange value ...
>> [ 86.760000] APBH CH4 REG :240 : 00010000 // HW_APBH_CH4_SEMA:
>> semaphore counter is 1
>> [ 86.770000] APBH CH4 REG :250 : 03A00015 // HW_APBH_CH4_DEBUG1:
>> LOCK, NEXTCMDADDRVALID, RD_FIFO_EMPTY, WR_FIFO_EMPTY,  STATEMACHINE =
>> "WAIT_END = 0x15 When the Wait for Command End bit is set, the state
>> machine enters this state until the DMA device indicates that the
>> command is complete."
>> [ 86.770000] APBH CH4 REG :260 : 00000000 // ->  HW_APBH_CH4_DEBUG2:
>> no apb or ahb bytes remaining for transfer
>> [ 86.780000] [ 0 ] : ME : 418d7000, next : 418d704c, bits :
>> 00002304, bytes : 00000000, buf : 00000000
>> [ 86.790000] [ 0 ] PIO[0] : 03800000
>> [ 86.790000] [ 0 ] PIO[1] : 00000000
>> [ 86.800000] [ 0 ] PIO[2] : 00000000
>> [ 86.800000] [ 1 ] : ME : 418d704c, next : 418d7098, bits :
>> 00006304, bytes : 00000001, buf : 4181b000
>> [ 86.810000] [ 1 ] PIO[0] : 018010da
>> [ 86.810000] [ 1 ] PIO[1] : 00000000
>> [ 86.820000] [ 1 ] PIO[2] : 000011ff
>> [ 86.820000] [ 2 ] : ME : 418d7098, next : 00000000, bits :
>
> It hangs here.
>
>> 000023c8, bytes : 00000000, buf : 00000000
>> [ 86.830000] [ 2 ] PIO[0] : 038010da
>> [ 86.840000] [ 2 ] PIO[1] : 00000000
>> [ 86.840000] [ 2 ] PIO[2] : 00000000
>> [ 86.840000] ------------------------DMA DUMP END ------------
>> [ 86.850000] [ gpmi_show_regs : 076 ] -------------- Show GPMI
>> registers ----------
>> [ 86.860000] [ gpmi_show_regs : 079 ] offset 0x000 : 0x238010da
>> [ 86.870000] [ gpmi_show_regs : 079 ] offset 0x010 : 0x00000000
>> [ 86.870000] [ gpmi_show_regs : 079 ] offset 0x020 : 0x000011ff
>> [ 86.880000] [ gpmi_show_regs : 079 ] offset 0x030 : 0x000010da
>> [ 86.890000] [ gpmi_show_regs : 079 ] offset 0x040 : 0x40f0c480
>> [ 86.890000] [ gpmi_show_regs : 079 ] offset 0x050 : 0x40f09000
>> [ 86.900000] [ gpmi_show_regs : 079 ] offset 0x060 : 0x0004000c
>> [ 86.910000] [ gpmi_show_regs : 079 ] offset 0x070 : 0x00010203
>> [ 86.910000] [ gpmi_show_regs : 079 ] offset 0x080 : 0x05000000
>> [ 86.920000] [ gpmi_show_regs : 079 ] offset 0x090 : 0x09020101
>> [ 86.920000] [ gpmi_show_regs : 079 ] offset 0x0a0 : 0x00000030
>> [ 86.930000] [ gpmi_show_regs : 079 ] offset 0x0b0 : 0x80000010
>> [ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0c0 : 0x100000ba
>> [ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0d0 : 0x03000000
>> [ 86.950000] [ gpmi_show_regs : 081 ] -------------- Show GPMI
>> registers end ----------
>> [ 86.960000] Kernel panic - not syncing: -----------DMA
>> FAILED------------------
>
> Please post the call stack from when the panic occurs.
> I also want to know where the code is running when the time-out occurs.

Oops, here it is:
[   86.960000] Kernel panic - not syncing: -----------DMA
FAILED------------------
[   86.970000] [<c03a19b0>] (unwind_backtrace+0x0/0xf0) from
[<c0635be4>] (panic+0x58/0x18c)
[   86.980000] [<c0635be4>] (panic+0x58/0x18c) from [<c057ebd0>]
(start_dma_without_bch_irq+0x9c/0xb4)
[   86.990000] [<c057ebd0>] (start_dma_without_bch_irq+0x9c/0xb4) from
[<c057ec14>] (start_dma_with_bch_irq+0x2c/0x78)
[   87.000000] [<c057ec14>] (start_dma_with_bch_irq+0x2c/0x78) from
[<c057f86c>] (gpmi_read_page+0x160/0x1cc)
[   87.010000] [<c057f86c>] (gpmi_read_page+0x160/0x1cc) from
[<c057e33c>] (mil_ecc_read_page+0x64/0x1d8)
[   87.020000] [<c057e33c>] (mil_ecc_read_page+0x64/0x1d8) from
[<c05795a0>] (nand_do_read_ops+0x1d8/0x468)
[   87.030000] [<c05795a0>] (nand_do_read_ops+0x1d8/0x468) from
[<c0579ba4>] (nand_read+0x94/0xb0)
[   87.040000] [<c0579ba4>] (nand_read+0x94/0xb0) from [<c0573768>]
(part_read+0x60/0xe4)
[   87.050000] [<c0573768>] (part_read+0x60/0xe4) from [<c05751f0>]
(mtd_read+0xd8/0x20c)
[   87.060000] [<c05751f0>] (mtd_read+0xd8/0x20c) from [<c0442890>]
(vfs_read+0xb0/0x180)
[   87.060000] [<c0442890>] (vfs_read+0xb0/0x180) from [<c04429a0>]
(sys_read+0x40/0x70)
[   87.070000] [<c04429a0>] (sys_read+0x40/0x70) from [<c039c780>]
(ret_fast_syscall+0x0/0x2c
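
For reference, the HW_APBH_CH4_CMD value from the dump quoted above
(0x000001C8) can be decoded mechanically. This is only a sketch: the bit
positions are assumed to follow the i.MX23/i.MX28 APBH channel command word
layout and should be checked against the reference manual, and the names
cmd_command() and cmd_describe() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Flag bits of an APBH channel command word (assumed layout, see above). */
static const struct {
	uint32_t mask;
	const char *name;
} cmd_flags[] = {
	{ 1u << 2, "CHAIN" },
	{ 1u << 3, "IRQONCMPLT" },
	{ 1u << 4, "NANDLOCK" },
	{ 1u << 5, "NANDWAIT4READY" },
	{ 1u << 6, "SEMAPHORE" },
	{ 1u << 7, "WAIT4ENDCMD" },
	{ 1u << 8, "HALTONTERMINATE" },
};

/* COMMAND field, bits 1:0; 0 is taken to mean "no DMA transfer". */
static unsigned int cmd_command(uint32_t cmd)
{
	return cmd & 0x3u;
}

/* Write the space-separated names of all flag bits set in cmd into out. */
static void cmd_describe(uint32_t cmd, char *out, size_t len)
{
	size_t i, used = 0;

	assert(out != NULL && len > 0);
	out[0] = '\0';
	for (i = 0; i < sizeof(cmd_flags) / sizeof(cmd_flags[0]); i++) {
		int n;

		if (!(cmd & cmd_flags[i].mask))
			continue;
		n = snprintf(out + used, len - used, "%s%s",
			     used ? " " : "", cmd_flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* buffer too small, output truncated */
		used += (size_t)n;
	}
}
```

Under that assumed layout, 0x000001C8 has COMMAND = 0 and bits 3, 6, 7 and 8
set, which agrees with the annotation in the dump.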

>
> My gmail is shijie8@gmail.com
> We may talk more on the gtalk.

I'm available at koen.beel.barco@gmail.com

>
> Best Regards
> Huang Shijie
>
>> Br,
>> Koen
>>
>>>
>>> Please try the new patch.
>>>
>>> Best Regards
>>> Huang Shijie
>>>>
>>>> Then it prints some debug info on channel 1 (ssp1) and then all
>>>> channel 2 registers except the debug register (ssp2 = not used here).
>>>>
>>>> What info do you need?
>>>>
>>>> Br,
>>>> Koen
>>>>
>>>>> Best Regards
>>>>> Huang Shijie
>>>>>
>>>> _______________________________________________
>>>> linux-arm-kernel mailing list
>>>> linux-arm-kernel@lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>
>>>
>
>
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-08  9:12 ` Huang Shijie
@ 2011-08-09  9:19   ` Wolfram Sang
  2011-08-09 10:41     ` Huang Shijie
  0 siblings, 1 reply; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09  9:19 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Koen Beel, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 823 bytes --]

Hi,

> >
> >ecclayout needs to be used to show that OOB is fully in use [1]
> >===============================================================
> >
> >Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> >working with UBIFS is surely not ready for mainline.
> >
> It seems just modifying the ecclayout of GPMI-NAND can not fix the problem.
> It should also change the code of JFFS2 and mtd.
> 
> Some one ever posted a patch about this:
> http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html

This mail is about MLC. That's another issue. For SLC, ecclayout should
work, no?

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-08  6:21 ` Huang Shijie
  2011-08-08  9:19   ` Koen Beel
@ 2011-08-09  9:35   ` Wolfram Sang
  2011-08-09 10:54     ` Huang Shijie
  1 sibling, 1 reply; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09  9:35 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Koen Beel, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 3889 bytes --]


> >DMA timeouts [1]
> >================
> >
> >[    2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
> >[    3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
> >
> >Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
> >by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
> After I used a different .config, it never appears in my side.

So, you have a config which triggers this? That should be useful for
debugging. What do you need to enable to see this?

> Please test the driver again when you back to office.
> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
> Your mxs_defconfig may miss Shawn Guo's patches.

I have all the correct patches, I triple-checked that. Regarding the
config, I am not looking for a config that works, I want my config to
work. I meanwhile have the feeling this is a bug in the DMA driver
(because Aisheng Dong has DMA problems, too, in the audio path), still
we need to be sure.

> >problem overwriting all-0xff data in NAND [2]
> >=============================================
> >
> >Although it occurred only when writing JFFS2 images so far, this is a generic
> >issue and needs to be fixed, right?
> >
> Artem said it should not change the driver, but the upper layer(jffs2).
> 
> So I think i do not need to change the driver.

OK, read it again, got it now. Agreed.


> >* custom sysfs-entries
> My sysfs-entries is in the GPMI-NAND directory.
> Does be a mainline driver means I should not have any sysfs-entries?
> If it does, i can remove it.

It is some kind of ABI, so we would have to support them forever. If
there is no really strong reason to have them, it is better to remove
them.

> 
> >* custom kernel command line parameters
> The kernel command line 'gpmi_nand' is to avoid the conflict with
> other modules such as
> SD.
> 
> If it's be removed, I have to use different config to resolve the
> issue which is not better either. :(

This is a board-specific issue, so you should handle this at
board-level, not at driver level.

> >* namespacing (some functions have no prefix, some have "mil_", some have mx23)
> >   (I think 'mil' means 'mtd interface layer', but why is that needed?)
> The mil is used to make the gpmi_nand_data{} simple.
> Without it, the gpmi_nand_data{} will very big.
> 
> The functions which have mx23 prefix are only used in mx23.
> The functions which have no prefix can used in both mx28 and mx23.

I understood this, but wonder if mx23_* specific stuff has to be in the
main driver. Will have a closer look to the driver this week, then I can
say more.

> >Complexity
> >==========
> >
> >The driver is not easy to review. I wonder if it makes sense to use incremental
> >patches for it? maybe making it a staging driver could be a solution for that?
> Frankly speaking, the current driver is maybe the smallest version now.
> 
> I even do not add the on-chip BBT feature now.

OK.

> >Huang, are you interested in accepting patches or do you prefer we just point
> >at certain code and you then fix it? Starting with a simpler driver and then
> Feel free to mail me the patch. it's welcome.

We'd need a branch somewhere for that, so we have a history.

> >adding stuff might be another option if we can't chase all the bugs in the
> >current driver.
> >
> >That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
> >meet in IRC or something to work that out? Is there interest in that?
> What about gtalk?

Definitely not my favourite, but seems like Koen and you already use it.
Might try it...

Regards,

    Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-08  9:19   ` Koen Beel
  2011-08-08 10:37     ` Huang Shijie
  2011-08-09  5:11     ` Huang Shijie
@ 2011-08-09  9:45     ` Wolfram Sang
  2 siblings, 0 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09  9:45 UTC (permalink / raw)
  To: Koen Beel
  Cc: Huang Shijie, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 598 bytes --]


> Could also be useful to make sure we test on the same HW as much as
> possible and are using the same source tree.
> HW I have:
> - mx23evk rev C1
> - mx23evk rev B2
> - own target hw using mx23 lqfp-128 chip and different type of ddr and nand.

Actively used:

FSL MX28EVK Rev. D
Karo TX28-4020 with STK5

Could get:

two custom MX23 boards (though I don't think this will make a
difference)

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09  9:19   ` Wolfram Sang
@ 2011-08-09 10:41     ` Huang Shijie
  2011-08-09 11:36       ` Lothar Waßmann
  0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 10:41 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: Koen Beel, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

Hi,
> Hi,
>
>>> ecclayout needs to be used to show that OOB is fully in use [1]
>>> ===============================================================
>>>
>>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
>>> working with UBIFS is surely not ready for mainline.
>>>
>> It seems just modifying the ecclayout of GPMI-NAND can not fix the problem.
>> It should also change the code of JFFS2 and mtd.
>>
>> Some one ever posted a patch about this:
>> http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html
> This mail is about MLC. That's another issue. For SLC, ecclayout should
> work, no?

The issue is that the GPMI/BCH itself uses the OOB. Please see page 1263,
section 16.2.2, of the mx28 datasheet, which shows the layout of a NAND page.

You can see that the OOB is used for storing data or ECC.
Some space is left over even when I enable the BCH; it seems the
remaining space could hold the JFFS2 clean marker. That needs to be
confirmed. :)

thanks
Huang Shijie

> Regards,
>
>     Wolfram
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09  9:35   ` Wolfram Sang
@ 2011-08-09 10:54     ` Huang Shijie
  2011-08-09 20:42       ` Wolfram Sang
  0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 10:54 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: Koen Beel, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

Hi,
>>> DMA timeouts [1]
>>> ================
>>>
>>> [    2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
>>> [    3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>>
>>> Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
>>> by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
>> After I used a different .config, it never appears in my side.
> So, you have a config which triggers this? That should be useful for
> debugging. What do you need to enable to see this?
>
My old config was one I made myself. I think it was a wrong config,
and it differed too much from the config produced by 'make mxs_defconfig'.

So I think it is of no use for debugging.




>> Please test the driver again when you back to office.
>> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
>> Your mxs_defconfig may miss Shawn Guo's patches.
> I have all the correct patches, I triple-checked that. Regarding the
> config, I am not looking for a config that works, I want my config to
> work. I meanwhile have the feeling this is a bug in the DMA driver
> (because Aisheng Dong has DMA problems, too, in the audio path), still
> we need to be sure.
>
It's not a DMA bug. I discussed it with Koen, and we are sure that the bug is
caused by the GPMI or BCH.

It's a different bug from Aisheng Dong's bug.

>>> problem overwriting all-0xff data in NAND [2]
>>> =============================================
>>>
>>> Although it occurred only when writing JFFS2 images so far, this is a generic
>>> issue and needs to be fixed, right?
>>>
>> Artem said it should not change the driver, but the upper layer(jffs2).
>>
>> So I think i do not need to change the driver.
> OK, read it again, got it now. Agreed.
>
>
>>> * custom sysfs-entries
>> My sysfs-entries is in the GPMI-NAND directory.
>> Does be a mainline driver means I should not have any sysfs-entries?
>> If it does, i can remove it.
> It is some kind of ABI, so we would have to support them forever. If
> there is no really strong reason to have them, it is better to remove
> them.
>
ok, thanks.
>>> * custom kernel command line parameters
>> The kernel command line 'gpmi_nand' is to avoid the conflict with
>> other modules such as
>> SD.
>>
>> If it's be removed, I have to use different config to resolve the
>> issue which is not better either. :(
> This is a board-specific issue, so you should handle this at
> board-level, not at driver level.
>
I wish to handle it at the board level.

But I have no idea how to solve the conflict between GPMI and SD.  :(

Could you give me some hint?
thanks

>>> * namespacing (some functions have no prefix, some have "mil_", some have mx23)
>>>    (I think 'mil' means 'mtd interface layer', but why is that needed?)
>> The mil is used to make the gpmi_nand_data{} simple.
>> Without it, the gpmi_nand_data{} will very big.
>>
>> The functions which have mx23 prefix are only used in mx23.
>> The functions which have no prefix can used in both mx28 and mx23.
> I understood this, but wonder if mx23_* specific stuff has to be in the
> main driver. Will have a closer look to the driver this week, then I can
> say more.
>
thanks
>>> Complexity
>>> ==========
>>>
>>> The driver is not easy to review. I wonder if it makes sense to use incremental
>>> patches for it? maybe making it a staging driver could be a solution for that?
>> Frankly speaking, the current driver is maybe the smallest version now.
>>
>> I even do not add the on-chip BBT feature now.
> OK.
>
>>> Huang, are you interested in accepting patches or do you prefer we just point
>>> at certain code and you then fix it? Starting with a simpler driver and then
>> Feel free to mail me the patch. it's welcome.
> We'd need a branch somewhere for that, so we have a history.
>
ok.
I will try to find some solution.
>>> adding stuff might be another option if we can't chase all the bugs in the
>>> current driver.
>>>
>>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
>>> meet in IRC or something to work that out? Is there interest in that?
>> What about gtalk?
> Definitely not my favourite, but seems like Koen and you already use it.
> Might try it...
I can use IRC too.

Huang Shijie

> Regards,
>
>      Wolfram
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09 10:41     ` Huang Shijie
@ 2011-08-09 11:36       ` Lothar Waßmann
  0 siblings, 0 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-09 11:36 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Koen Beel, Shawn Guo, linux-mtd, Wolfram Sang, linux-arm-kernel

Hi,

Huang Shijie writes:
> Hi,
> > Hi,
> >
> >>> ecclayout needs to be used to show that OOB is fully in use [1]
> >>> ===============================================================
> >>>
> >>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> >>> working with UBIFS is surely not ready for mainline.
> >>>
> >> It seems just modifying the ecclayout of GPMI-NAND can not fix the problem.
> >> It should also change the code of JFFS2 and mtd.
> >>
> >> Some one ever posted a patch about this:
> >> http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html
> > This mail is about MLC. That's another issue. For SLC, ecclayout should
> > work, no?
> 
> The matter is that the GPMI/BCH will use the OOB. Please see the page 1263
> in 16.2.2 of mx28's datasheet. It shows an NAND PAGE's layout.
> 
> You may see that the OOB will be used for storing the DATA or ECC.
> There will be left some space even when i enable the BCH,  it seems the 
> left space can
> contains the jffs2's clean marker.  It needs to be confirmed. :)
> 
It possibly could be written there, if the spare area had a separate
ECC independent from the first data block. With the current ECC
layout (combined ECC for spare area and first block), writing the
clean marker would change the ECC for the first data block to non-FF
so that writing the first block later on would again result in ECC
errors.


Lothar Waßmann
-- 
___________________________________________________________

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | info@karo-electronics.de
___________________________________________________________

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-09 10:54     ` Huang Shijie
@ 2011-08-09 20:42       ` Wolfram Sang
  0 siblings, 0 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09 20:42 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Koen Beel, Shawn Guo, linux-mtd, linux-arm-kernel,
	Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 1990 bytes --]


> >>>Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
> >>>by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
> >>After I used a different .config, it never appears in my side.
> >So, you have a config which triggers this? That should be useful for
> >debugging. What do you need to enable to see this?
> >
> My old config is made by myself. I think it was a wrong config,

Let's see what the bug is in the end, but I don't think a config could be
"wrong" in a way to trigger such a bug. Even misconfiguration can be handled
gracefully with code.

> it's not a DMA bug, I discuss with Koen, and make sure that the bug is
> caused by the GPMI or BCH.

Did you get any further during this day?

> it's a different bug from  Aisheng Dong's bug.

Okay, I had a look at this one today.

> >>>* custom kernel command line parameters
> >>The kernel command line 'gpmi_nand' is to avoid the conflict with
> >>other modules such as
> >>SD.
> >>
> >>If it's be removed, I have to use different config to resolve the
> >>issue which is not better either. :(
> >This is a board-specific issue, so you should handle this at
> >board-level, not at driver level.
> >
> I wish to handle it at the board level.
> 
> But I have no idea how to solve the conflict between GPMI and SD.  :(
> 
> Could you give me some hint?

For starters, you could move some kernel parameter to the board-file and create the
devices as needed depending on that:

if (gpmi_nand)
	mx28_add_gpmi_nand(&mx28evk_gpmi_nand_data);
else
	mx28_add_mxs_mmc(1, &mx28evk_mmc_pdata[1]);

or something similar, unless I am missing something. This is probably not the
best solution either, but at least it keeps the driver free from the
board-configuration.

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
  2011-08-08  6:21 ` Huang Shijie
  2011-08-08  9:12 ` Huang Shijie
@ 2011-08-14  8:11 ` Ivan Djelic
  2011-08-14 18:31   ` Wolfram Sang
                     ` (2 more replies)
  2 siblings, 3 replies; 33+ messages in thread
From: Ivan Djelic @ 2011-08-14  8:11 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: Koen Beel, Huang Shijie, linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org, Lothar Waßmann

On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
(...)
> 
> problem overwriting all-0xff data in NAND [2]
> =============================================
> 
> Although it occurred only when writing JFFS2 images so far, this is a generic
> issue and needs to be fixed, right?
> 
> 
(...)
> [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html

As explained in the thread linked above, this issue should be fixed in your
flashing tool, _not_ in your driver. The nand device you are using does not
support programming pages multiple times in a row; pretending it does in the
special all-0xff case is inefficient (you need to detect all-0xff data) and
unnecessary (just do not program blank pages !).
BR,

Ivan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-14  8:11 ` Ivan Djelic
@ 2011-08-14 18:31   ` Wolfram Sang
  2011-08-15  5:41   ` Lothar Waßmann
  2011-08-15 16:22   ` Artem Bityutskiy
  2 siblings, 0 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-14 18:31 UTC (permalink / raw)
  To: Ivan Djelic
  Cc: Koen Beel, Huang Shijie, linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org, Lothar Waßmann

[-- Attachment #1: Type: text/plain, Size: 641 bytes --]

> > problem overwriting all-0xff data in NAND [2]
> > =============================================
> > 
> > Although it occurred only when writing JFFS2 images so far, this is a generic
> > issue and needs to be fixed, right?
> > 
> > 
> (...)
> > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> 
> As explained in the thread linked above, this issue should be fixed in your

Yup, Huang already pointed me to the part I got wrong.

Thanks.

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-14  8:11 ` Ivan Djelic
  2011-08-14 18:31   ` Wolfram Sang
@ 2011-08-15  5:41   ` Lothar Waßmann
  2011-08-15  6:30     ` Lin Tony-B19295
                       ` (2 more replies)
  2011-08-15 16:22   ` Artem Bityutskiy
  2 siblings, 3 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-15  5:41 UTC (permalink / raw)
  To: Ivan Djelic
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

Hi,

Ivan Djelic writes:
> On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> (...)
> > 
> > problem overwriting all-0xff data in NAND [2]
> > =============================================
> > 
> > Although it occurred only when writing JFFS2 images so far, this is a generic
> > issue and needs to be fixed, right?
> > 
> > 
> (...)
> > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> 
> As explained in the thread linked above, this issue should be fixed in your
> flashing tool, _not_ in your driver. The nand device you are using does not
> support programming pages multiple times in a row; pretending it does in the
>
It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
problem is that the controller generates an ECC code that is non-FF
for all-FF data, which JFFS2 cannot handle properly.


Lothar Waßmann
-- 
___________________________________________________________

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | info@karo-electronics.de
___________________________________________________________

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: GPMI-NAND Status?
  2011-08-15  5:41   ` Lothar Waßmann
@ 2011-08-15  6:30     ` Lin Tony-B19295
  2011-08-15  8:41       ` Ivan Djelic
  2011-08-15  8:29     ` Ivan Djelic
  2011-08-15 16:18     ` Artem Bityutskiy
  2 siblings, 1 reply; 33+ messages in thread
From: Lin Tony-B19295 @ 2011-08-15  6:30 UTC (permalink / raw)
  To: Lothar Waßmann, Ivan Djelic
  Cc: Koen Beel, Wolfram Sang, Huang Shijie-B32955,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

> -----Original Message-----
> From: linux-arm-kernel-bounces@lists.infradead.org [mailto:linux-arm-
> kernel-bounces@lists.infradead.org] On Behalf Of Lothar Waßmann
> Sent: Monday, August 15, 2011 1:41 PM
> To: Ivan Djelic
> Cc: Koen Beel; Wolfram Sang; Huang Shijie-B32955; linux-
> mtd@lists.infradead.org; Shawn Guo; linux-arm-kernel@lists.infradead.org
> Subject: Re: GPMI-NAND Status?
> 
> Hi,
> 
> Ivan Djelic writes:
> > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > (...)
> > >
> > > problem overwriting all-0xff data in NAND [2]
> > > =============================================
> > >
> > > > Although it occurred only when writing JFFS2 images so far, this is a
> > > generic issue and needs to be fixed, right?
> > >
> > >
> > (...)
> > > [2]
> > > http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> >
> > As explained in the thread linked above, this issue should be fixed in
> > your flashing tool, _not_ in your driver. The nand device you are
> > using does not support programming pages multiple times in a row;
> > pretending it does in the
> >
> It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> problem is that the controller generates an ECC code that is non-FF for
> all-FF data, which JFFS2 cannot handle properly.
> 
As far as I know, this is a limitation of the BCH algorithm for ECC (non-FF
ECC code for all-FF data), so the BCH engine ignores ECC errors when all data
are 0xFF. That is how BCH is used for ECC.
Under these conditions, I think it is JFFS2 that should handle this case
rather than BCH. More and more SoCs use BCH for NAND ECC, so JFFS2 cannot
escape this problem unless it is changed.

> 
> Lothar Waßmann
> --
> ___________________________________________________________
> 
> Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
> Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
> Geschäftsführer: Matthias Kaussen
> Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
> 
> www.karo-electronics.de | info@karo-electronics.de
> ___________________________________________________________
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: GPMI-NAND Status?
  2011-08-15  5:41   ` Lothar Waßmann
  2011-08-15  6:30     ` Lin Tony-B19295
@ 2011-08-15  8:29     ` Ivan Djelic
  2011-08-15  9:31       ` Lothar Waßmann
  2011-08-15 16:18     ` Artem Bityutskiy
  2 siblings, 1 reply; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15  8:29 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

On Mon, Aug 15, 2011 at 06:41:23AM +0100, Lothar Waßmann wrote:
> Hi,
> 
> Ivan Djelic writes:
> > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > (...)
> > > 
> > > problem overwriting all-0xff data in NAND [2]
> > > =============================================
> > > 
> > > Although it occurred only when writing JFFS2 images so far, this is a generic
> > > issue and needs to be fixed, right?
> > > 
> > > 
> > (...)
> > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > 
> > As explained in the thread linked above, this issue should be fixed in your
> > flashing tool, _not_ in your driver. The nand device you are using does not
> > support programming pages multiple times in a row; pretending it does in the
> >
> It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> problem is that the controller generates an ECC code that is non-FF
> for all-FF data, which JFFS2 cannot handle properly.

JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
pages and then reprogram them on a NAND flash device. Your flashing method does.

If your BCH controller allows it, you could XOR the computed ECC bytes with a
specific mask to make sure all-FF data have all-FF ecc. This is useful to allow
reading erased blocks with ecc correction enabled.
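
To make the mask idea concrete, here is a minimal sketch. It assumes a toy
16-bit checksum (a plain CRC-16, toy_ecc()) as a stand-in for the real BCH
parity, which the hardware computes; all names and the 512-byte size are
illustrative only. The mask is simply the complement of the ECC of an all-FF
buffer, applied on the way to flash and removed again on the way back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 512

/* Toy stand-in for BCH parity: CRC-16, polynomial 0x1021, zero init. */
static uint16_t toy_ecc(const uint8_t *data, size_t len)
{
	uint16_t crc = 0;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= (uint16_t)(data[i] << 8);
		for (bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
					     : (uint16_t)(crc << 1);
	}
	return crc;
}

/* Mask chosen so that all-0xff data ends up with an all-0xff stored ECC. */
static uint16_t ecc_mask(size_t len)
{
	uint8_t blank[TOY_PAGE_SIZE];

	assert(len <= sizeof(blank));
	memset(blank, 0xff, len);
	return (uint16_t)~toy_ecc(blank, len);
}

/* ECC value as it would be written to flash. */
static uint16_t ecc_to_flash(const uint8_t *data, size_t len)
{
	return (uint16_t)(toy_ecc(data, len) ^ ecc_mask(len));
}

/* Undo the mask on an ECC value read back from flash. */
static uint16_t ecc_from_flash(uint16_t stored, size_t len)
{
	return (uint16_t)(stored ^ ecc_mask(len));
}
```

With this, an erased page (all-FF data, all-FF ECC bytes) reads back as
consistent, while the unmasked value is still available for correction.
Whether a given BCH controller lets software apply such a mask is hardware
specific.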

But even so, you cannot work around the fact that NAND devices are different
from NOR devices, in that they typically allow only a limited number of partial
page programming operations (4 in your K9F1G08U0B).
If you implemented the mask trick described above and used it to allow
multiple page programming, you still would not track the number of partial
program operations on a given page, and expose yourself to nasty bugs (when
exceeding the number of specified partial operations); i.e. it could work on
some devices for a few operations, but not reliably on all devices for any
number of empty page programmings.

So the only real possibility is to avoid programming (physically) a page when
its target contents are empty (all-FF); this is not implemented at the driver
level because:
- it is useless: none of the existing filesystems need this "feature"
- it would waste cpu cycles to check if target data is all-FF each time a page
is programmed

Therefore... it is simply a matter of avoiding empty page programming, which
only happens in your flasher. See also the flashing guidelines [1] as per Artem's
suggestion.
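
A flasher that follows this rule could look roughly like the sketch below.
It is an illustration under assumptions, not code from any existing tool:
page_is_blank(), flash_image() and the program_page_fn callback are
hypothetical names, and count_program() only counts calls where a real
flasher would issue the NAND page program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true if every byte of the page is 0xff, i.e. the page is blank. */
static bool page_is_blank(const uint8_t *buf, size_t len)
{
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < len; i++)
		if (buf[i] != 0xff)
			return false;
	return true;
}

/* Hypothetical callback: program one page, return 0 on success. */
typedef int (*program_page_fn)(size_t page_index, const uint8_t *data,
			       size_t page_size, void *ctx);

/*
 * Write an image page by page, leaving blank pages unprogrammed.
 * Returns the number of pages actually programmed, or -1 on error.
 */
static long flash_image(const uint8_t *image, size_t image_len,
			size_t page_size, program_page_fn program, void *ctx)
{
	long programmed = 0;
	size_t off;

	assert(page_size > 0 && image_len % page_size == 0);
	for (off = 0; off < image_len; off += page_size) {
		if (page_is_blank(image + off, page_size))
			continue;	/* keep the erased page untouched */
		if (program(off / page_size, image + off, page_size, ctx) != 0)
			return -1;
		programmed++;
	}
	return programmed;
}

/* Example callback that only counts how often it is called. */
static int count_program(size_t page_index, const uint8_t *data,
			 size_t page_size, void *ctx)
{
	(void)page_index;
	(void)data;
	(void)page_size;
	(*(size_t *)ctx)++;
	return 0;
}
```

The all-FF check costs one pass over the image in the flashing tool, so the
driver's write path does not have to pay for it on every page program.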

BR,

Ivan

[1] http://www.linux-mtd.infradead.org/doc/ubi.html#L_flasher_algo

> 
> 
> Lothar Waßmann
> -- 
> ___________________________________________________________
> 
> Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
> Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
> Geschäftsführer: Matthias Kaussen
> Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
> 
> www.karo-electronics.de | info@karo-electronics.de
> ___________________________________________________________


* Re: GPMI-NAND Status?
  2011-08-15  6:30     ` Lin Tony-B19295
@ 2011-08-15  8:41       ` Ivan Djelic
  0 siblings, 0 replies; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15  8:41 UTC (permalink / raw)
  To: Lin Tony-B19295
  Cc: Koen Beel, Wolfram Sang, Huang Shijie-B32955,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org, Lothar Waßmann

On Mon, Aug 15, 2011 at 07:30:58AM +0100, Lin Tony-B19295 wrote:
> > -----Original Message-----
> > From: linux-arm-kernel-bounces@lists.infradead.org [mailto:linux-arm-
> > kernel-bounces@lists.infradead.org] On Behalf Of Lothar Waßmann
> > Sent: Monday, August 15, 2011 1:41 PM
> > To: Ivan Djelic
> > Cc: Koen Beel; Wolfram Sang; Huang Shijie-B32955; linux-
> > mtd@lists.infradead.org; Shawn Guo; linux-arm-kernel@lists.infradead.org
> > Subject: Re: GPMI-NAND Status?
> > 
> > Hi,
> > 
> > Ivan Djelic writes:
> > > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > > (...)
> > > >
> > > > problem overwriting all-0xff data in NAND [2]
> > > > =============================================
> > > >
> > > > Although it occurred only when writing JFFS2 images so far, this is a
> > > > generic issue and needs to be fixed, right?
> > > >
> > > >
> > > (...)
> > > > [2]
> > > > http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > >
> > > As explained in the thread linked above, this issue should be fixed in
> > > your flashing tool, _not_ in your driver. The nand device you are
> > > using does not support programming pages multiple times in a row;
> > > pretending it does in the
> > >
> > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > problem is that the controller generates an ECC code that is non-FF for
> > all-FF data, which JFFS2 cannot handle properly.
> > 
> As far as I know, this is a BCH algorithm limitation for ECC (non-FF ECC code for all-FF data).

Not a BCH algorithm limitation: you can always XOR BCH bytes to get all-FF ECC
code for all-FF data. It is equivalent to simply adding a particular
polynomial. See drivers/mtd/nand/nand_bch.c:62 for an example.
Not being able to do so is a hardware limitation. For instance, the OMAP35xx
BCH engine allows this masking operation.
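
As an illustration of the masking idea, here is a minimal sketch in C. It does not use a real BCH encoder: toy_ecc is a stand-in (a CRC-16 style remainder with zero initial value, which is linear over GF(2) just as BCH parity is), and PAGE_SIZE, ECC_BYTES, compute_ecc_mask and masked_ecc are hypothetical names chosen for this example. The mask is simply the inverted ECC of an all-FF page, so XORing it into every computed ECC maps all-FF data to all-FF ECC bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 2048	/* assumption: 2KiB data area per page */
#define ECC_BYTES 2	/* toy code only; real BCH uses more bytes */

/* Toy linear code standing in for the BCH engine (NOT BCH). */
static void toy_ecc(const uint8_t *data, size_t len, uint8_t ecc[ECC_BYTES])
{
	uint16_t r = 0;

	for (size_t i = 0; i < len; i++) {
		r ^= (uint16_t)(data[i] << 8);
		for (int b = 0; b < 8; b++)
			r = (r & 0x8000) ? (uint16_t)((r << 1) ^ 0x1021)
					 : (uint16_t)(r << 1);
	}
	ecc[0] = (uint8_t)(r >> 8);
	ecc[1] = (uint8_t)(r & 0xFF);
}

/* mask = ~ecc(all-FF page); computed once at init time. */
static void compute_ecc_mask(uint8_t mask[ECC_BYTES])
{
	static uint8_t blank[PAGE_SIZE];

	memset(blank, 0xFF, sizeof(blank));
	toy_ecc(blank, sizeof(blank), mask);
	for (int i = 0; i < ECC_BYTES; i++)
		mask[i] ^= 0xFF;
}

/* ECC as stored in OOB: raw ECC XOR mask, so all-FF data gives all-FF ECC. */
static void masked_ecc(const uint8_t *page, const uint8_t mask[ECC_BYTES],
		       uint8_t ecc[ECC_BYTES])
{
	toy_ecc(page, PAGE_SIZE, ecc);
	for (int i = 0; i < ECC_BYTES; i++)
		ecc[i] ^= mask[i];
}
```

On the read side the same mask would be XORed into the stored ECC bytes before comparing or correcting. Because the code is linear, the constant offset does not change which error patterns can be detected or corrected.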

BR,

Ivan


* Re: GPMI-NAND Status?
  2011-08-15  8:29     ` Ivan Djelic
@ 2011-08-15  9:31       ` Lothar Waßmann
  2011-08-15 12:54         ` Ivan Djelic
  2011-08-15 16:34         ` Artem Bityutskiy
  0 siblings, 2 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-15  9:31 UTC (permalink / raw)
  To: Ivan Djelic
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

Hi,

Ivan Djelic writes:
> On Mon, Aug 15, 2011 at 06:41:23AM +0100, Lothar Waßmann wrote:
> > Hi,
> > 
> > Ivan Djelic writes:
> > > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > > (...)
> > > > 
> > > > problem overwriting all-0xff data in NAND [2]
> > > > =============================================
> > > > 
> > > > Although it occurred only when writing JFFS2 images so far, this is a generic
> > > > issue and needs to be fixed, right?
> > > > 
> > > > 
> > > (...)
> > > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > > 
> > > As explained in the thread linked above, this issue should be fixed in your
> > > flashing tool, _not_ in your driver. The nand device you are using does not
> > > support programming pages multiple times in a row; pretending it does in the
> > >
> > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > problem is that the controller generates an ECC code that is non-FF
> > for all-FF data, which JFFS2 cannot handle properly.
> 
> JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> pages and then reprogram them on a NAND flash device. Your flashing method does.
> 
AFAICT JFFS2 checks the flash for areas that contain only FF and
treats them like erased flash. At least it tries to overwrite such
areas in flash without erasing it beforehand.
I avoided the problem by creating JFFS2 images that are padded with
0xff to page size only instead of eraseblock size.

> If your BCH controller allows it, you could XOR the computed ECC bytes with a
> specific mask to make sure all-FF data have all-FF ecc. This is useful to allow
> reading erased blocks with ecc correction enabled.
> 
> But even so, you cannot work around the fact that NAND devices are different
> from NOR devices, in that they typically allow only a limited number of partial
> page programming operations (4 in your K9F1G08U0B).
> If you implemented the mask trick described above and used it to allow
> multiple page programming, you still would not track the number of partial
> program operations on a given page, and expose yourself to nasty bugs (when
> exceeding the number of specified partial operations); i.e. it could work on
> some devices for a few operations, but not reliably on all devices for any
> number of empty page programmings.
> 
> So the only real possibility is to avoid programming (physically) a page when
> its target contents are empty (all-FF); this is not implemented at the driver
> level because:
> - it is useless: none of the existing filesystems need this "feature"
> - it would waste cpu cycles to check if target data is all-FF each time a page
> is programmed
> 
> Therefore... it is simply a matter of avoiding empty page programming, which
> only happens in your flasher. See also the flashing guidelines [1] as per Artem's
> suggestion.
> 
"Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
file and the U-Boot commands 'nand erase/nand write' or
the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.


Lothar Waßmann
-- 
___________________________________________________________

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | info@karo-electronics.de
___________________________________________________________


* Re: GPMI-NAND Status?
  2011-08-15  9:31       ` Lothar Waßmann
@ 2011-08-15 12:54         ` Ivan Djelic
  2011-08-15 13:37           ` Lothar Waßmann
  2011-08-15 16:34         ` Artem Bityutskiy
  1 sibling, 1 reply; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15 12:54 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

On Mon, Aug 15, 2011 at 10:31:34AM +0100, Lothar Waßmann wrote:
> > > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > > problem is that the controller generates an ECC code that is non-FF
> > > for all-FF data, which JFFS2 cannot handle properly.
> > 
> > JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> > pages and then reprogram them on a NAND flash device. Your flashing method does.
> > 
> AFAICT JFFS2 checks the flash for areas that contain only FF and
> treats them like erased flash. At least it tries to overwrite such
> areas in flash without erasing it beforehand.

Hmmm, again, JFFS2 simply "writes" to an erased block. You say "it tries to
overwrite", but that's only because you programmed an empty page in the first
place! And it cannot "erase such areas beforehand" (think of a partially
programmed block with trailing empty pages).

> I avoided the problem by creating JFFS2 images that are padded with
> 0xff to page size only instead of eraseblock size.

Good.

> > Therefore... it is simply a matter of avoiding empty page programming, which
> > only happens in your flasher. See also the flashing guidelines [1] as per Artem's
> > suggestion.
> > 
> "Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
> file and the U-Boot commands 'nand erase/nand write' or
> the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.

OK, maybe you could submit a patch to fix this issue then ?
Thanks,

Ivan


* Re: GPMI-NAND Status?
  2011-08-15 12:54         ` Ivan Djelic
@ 2011-08-15 13:37           ` Lothar Waßmann
  0 siblings, 0 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-15 13:37 UTC (permalink / raw)
  To: Ivan Djelic
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

Hi,

Ivan Djelic writes:
> On Mon, Aug 15, 2011 at 10:31:34AM +0100, Lothar Waßmann wrote:
> > > > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > > > problem is that the controller generates an ECC code that is non-FF
> > > > for all-FF data, which JFFS2 cannot handle properly.
> > > 
> > > JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> > > pages and then reprogram them on a NAND flash device. Your flashing method does.
> > > 
> > AFAICT JFFS2 checks the flash for areas that contain only FF and
> > treats them like erased flash. At least it tries to overwrite such
> > areas in flash without erasing it beforehand.
> 
> Hmmm, again, JFFS2 simply "writes" to an erased block. You say "it tries to
> overwrite", but that's only because you programmed an empty page in the first
>
Which will happen most of the time (unless the image file size is near
a multiple of the eraseblock size), if you create a JFFS2 image with
mkfs.jffs2 and the '-p' option.

Unfortunately the data written by U-Boot must be page aligned and
mkfs.jffs2 cannot generate images that fulfill that constraint (unless
you know the exact image file size beforehand and specify the
rounded-up file size with the '-p' option).
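
The size arithmetic behind this can be sketched as follows, in C. The helper names round_up_to and blank_pages_from_padding are hypothetical, and the sketch assumes the eraseblock size is a multiple of the page size. It counts how many pages at the tail of the image consist purely of 0xFF padding when the image is padded to an eraseblock boundary rather than to a page boundary.

```c
#include <stddef.h>

/* Round size up to the next multiple of unit (unit must be non-zero). */
static size_t round_up_to(size_t size, size_t unit)
{
	return ((size + unit - 1) / unit) * unit;
}

/*
 * Number of whole pages that contain only padding when an image of
 * image_size bytes is padded to eraseblock_size instead of page_size.
 */
static size_t blank_pages_from_padding(size_t image_size, size_t page_size,
				       size_t eraseblock_size)
{
	size_t page_padded = round_up_to(image_size, page_size);
	size_t block_padded = round_up_to(image_size, eraseblock_size);

	return (block_padded - page_padded) / page_size;
}
```

For example, assuming 2KiB pages and 128KiB eraseblocks, a 1000000-byte image rounds up to 1001472 bytes at page granularity and to 1048576 bytes at eraseblock granularity, which leaves 23 pages of pure 0xFF padding that a naive flasher would program.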

> place! And it cannot "erase such areas beforehand" (think of a partially
> programmed block with trailing empty pages).
> 
I know that it cannot simply erase such blocks.

> > > Therefore... it is simply a matter of avoiding empty page programming, which
> > > only happens in your flasher. See also the flashing guidelines [1] as per Artem's
> > > suggestion.
> > > 
> > "Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
> > file and the U-Boot commands 'nand erase/nand write' or
> > the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.
> 
> OK, maybe you could submit a patch to fix this issue then ?
> Thanks,
> 
I have no idea how and where it should be fixed.
- at driver level, creating an all-FF ECC for all-FF data?
- in the bootloader and mtd-utils, preventing stretches of all-FF data
  from being written?
- in mkfs.jffs2, enabling it to generate images that are padded to
  page size instead of eraseblock size?


Lothar Waßmann
-- 
___________________________________________________________

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | info@karo-electronics.de
___________________________________________________________


* Re: GPMI-NAND Status?
  2011-08-15  5:41   ` Lothar Waßmann
  2011-08-15  6:30     ` Lin Tony-B19295
  2011-08-15  8:29     ` Ivan Djelic
@ 2011-08-15 16:18     ` Artem Bityutskiy
  2 siblings, 0 replies; 33+ messages in thread
From: Artem Bityutskiy @ 2011-08-15 16:18 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Ivan Djelic, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

On Mon, 2011-08-15 at 07:41 +0200, Lothar Waßmann wrote:
> > As explained in the thread linked above, this issue should be fixed in your
> > flashing tool, _not_ in your driver. The nand device you are using does not
> > support programming pages multiple times in a row; pretending it does in the
> >
> It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> problem is that the controller generates an ECC code that is non-FF
> for all-FF data, which JFFS2 cannot handle properly.

I believe that it does not matter for the kernel community that your
specific device can survive multiple writes. It certainly does matter for
you, so if you want a quick fix - just change your kernel, but I would
not recommend doing this.

We (the community) care about the _general_ case - in general, only one
write is allowed, period. Once JFFS2 and/or the flashing tool is
fixed - the problem will go away.

Let me put it this way - hacking the driver will just hide the issue
deeper - we'll have this issue pop up again a bit later, and because
of hacks like that [1] it will be more confusing. Let's avoid this.

Also, someone pointed out in that thread that if I write data to NAND - I
want my data to be ECC-protected. Please explain why my data should be
unprotected if it happens to be just 2KiB of 0xFFs covering a whole NAND
page?

Ivan provided a much better explanation, showing that even with NOP 4
flashes there may be problems (e.g., 1 write of all 0xFFs + 4 writes
from the user).
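
The NOP 4 example can be sketched in a few lines of C. This is only a model of the bookkeeping, not driver code: struct page_state, page_program and NOP_LIMIT are hypothetical names, and neither the driver nor the chip keeps such a counter, which is exactly why the limit is easy to exceed unnoticed.

```c
#include <stdbool.h>

#define NOP_LIMIT 4u	/* partial programs allowed per page (NOP 4) */

/* Model of one NAND page: how often it was programmed since the last erase. */
struct page_state {
	unsigned int programs;
};

/*
 * Account for one program operation. Returns false when the operation
 * would exceed the number of partial programs the chip specifies.
 */
static bool page_program(struct page_state *page)
{
	if (page->programs >= NOP_LIMIT)
		return false;
	page->programs++;
	return true;
}
```

If the flasher has already programmed the page once with all-0xFF data, only three of the four user writes fit within the limit and the fourth one is out of spec.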

[1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037181.html

-- 
Best Regards,
Artem Bityutskiy


* Re: GPMI-NAND Status?
  2011-08-14  8:11 ` Ivan Djelic
  2011-08-14 18:31   ` Wolfram Sang
  2011-08-15  5:41   ` Lothar Waßmann
@ 2011-08-15 16:22   ` Artem Bityutskiy
  2011-08-15 16:57     ` Ivan Djelic
  2 siblings, 1 reply; 33+ messages in thread
From: Artem Bityutskiy @ 2011-08-15 16:22 UTC (permalink / raw)
  To: Ivan Djelic
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org, Lothar Waßmann

On Sun, 2011-08-14 at 10:11 +0200, Ivan Djelic wrote:
> On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> (...)
> > 
> > problem overwriting all-0xff data in NAND [2]
> > =============================================
> > 
> > Although it occurred only when writing JFFS2 images so far, this is a generic
> > issue and needs to be fixed, right?
> > 
> > 
> (...)
> > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> 
> As explained in the thread linked above, this issue should be fixed in your
> flashing tool, _not_ in your driver. The nand device you are using does not
> support programming pages multiple times in a row; pretending it does in the
> special all-0xff case is inefficient (you need to detect all-0xff data) and
> unnecessary (just do not program blank pages !).

Hmm, isn't it also buggy because if my precious data contains 2KiB of
0xFFs (aligned to a 2KiB boundary) then I will have no ECC protection for
this page? Or am I missing something?

-- 
Best Regards,
Artem Bityutskiy


* Re: GPMI-NAND Status?
  2011-08-15  9:31       ` Lothar Waßmann
  2011-08-15 12:54         ` Ivan Djelic
@ 2011-08-15 16:34         ` Artem Bityutskiy
  1 sibling, 0 replies; 33+ messages in thread
From: Artem Bityutskiy @ 2011-08-15 16:34 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Ivan Djelic, Shawn Guo,
	linux-arm-kernel@lists.infradead.org

On Mon, 2011-08-15 at 11:31 +0200, Lothar Waßmann wrote:
> Hi,
> 
> Ivan Djelic writes:
> > On Mon, Aug 15, 2011 at 06:41:23AM +0100, Lothar Waßmann wrote:
> > > Hi,
> > > 
> > > Ivan Djelic writes:
> > > > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > > > (...)
> > > > > 
> > > > > problem overwriting all-0xff data in NAND [2]
> > > > > =============================================
> > > > > 
> > > > > Although it occurred only when writing JFFS2 images so far, this is a generic
> > > > > issue and needs to be fixed, right?
> > > > > 
> > > > > 
> > > > (...)
> > > > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > > > 
> > > > As explained in the thread linked above, this issue should be fixed in your
> > > > flashing tool, _not_ in your driver. The nand device you are using does not
> > > > support programming pages multiple times in a row; pretending it does in the
> > > >
> > > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > > problem is that the controller generates an ECC code that is non-FF
> > > for all-FF data, which JFFS2 cannot handle properly.
> > 
> > JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> > pages and then reprogram them on a NAND flash device. Your flashing method does.
> > 
> AFAICT JFFS2 checks the flash for areas that contain only FF and
> treats them like erased flash. At least it tries to overwrite such
> areas in flash without erasing it beforehand.

Right, when JFFS2 scans the flash and finds a partially-used eraseblock,
it tries to find out where the data ends and the empty space starts.
JFFS2 assumes that the empty space is usable, and it uses it. The JFFS2
author just missed the fact that in the case of a newly flashed JFFS2 image
this empty space may be unusable. And this was extremely unlikely in those
days.

You may teach JFFS2 to avoid using this space or to "clean it up", just
like we taught UBIFS to do this recently. The other option is to change
the flashing method. 

> I avoided the problem by creating JFFS2 images that are padded with
> 0xff to page size only instead of eraseblock size.

Right.

> > Therefore... it is simply a matter of avoiding empty page programming, which
> > only happens in your flasher. See also the flashing guidelines [1] as per Artem's
> > suggestion.
> > 
> "Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
> file and the U-Boot commands 'nand erase/nand write' or
> the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.

I do not use these tools, but if they have issues - just fix them and
send patches.

-- 
Best Regards,
Artem Bityutskiy


* Re: GPMI-NAND Status?
  2011-08-15 16:22   ` Artem Bityutskiy
@ 2011-08-15 16:57     ` Ivan Djelic
  0 siblings, 0 replies; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15 16:57 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Koen Beel, Wolfram Sang, Huang Shijie,
	linux-mtd@lists.infradead.org, Shawn Guo,
	linux-arm-kernel@lists.infradead.org, Lothar Waßmann

On Mon, Aug 15, 2011 at 05:22:13PM +0100, Artem Bityutskiy wrote:
> On Sun, 2011-08-14 at 10:11 +0200, Ivan Djelic wrote:
> > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > (...)
> > > 
> > > problem overwriting all-0xff data in NAND [2]
> > > =============================================
> > > 
> > > Although it occurred only when writing JFFS2 images so far, this is a generic
> > > issue and needs to be fixed, right?
> > > 
> > > 
> > (...)
> > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > 
> > As explained in the thread linked above, this issue should be fixed in your
> > flashing tool, _not_ in your driver. The nand device you are using does not
> > support programming pages multiple times in a row; pretending it does in the
> > special all-0xff case is inefficient (you need to detect all-0xff data) and
> > unnecessary (just do not program blank pages !).
> 
> Hmm, isn't it also buggy because if my precious data contains 2KiB of
> 0xFFs (aligned to a 2KiB boundary) then I will have no ECC protection for
> this page? Or am I missing something?

Ouch, yes you are correct, very good point which I missed :)

Ivan


end of thread, other threads:[~2011-08-15 16:57 UTC | newest]

Thread overview: 33+ messages
2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
2011-08-08  6:21 ` Huang Shijie
2011-08-08  9:19   ` Koen Beel
2011-08-08 10:37     ` Huang Shijie
2011-08-08 12:42       ` Koen Beel
2011-08-09  6:36         ` Huang Shijie
2011-08-09  7:58           ` Koen Beel
2011-08-09  8:18             ` Huang Shijie
2011-08-09  8:25               ` Koen Beel
2011-08-09  5:11     ` Huang Shijie
2011-08-09  6:25       ` Koen Beel
2011-08-09  6:40         ` Huang Shijie
2011-08-09  9:45     ` Wolfram Sang
2011-08-09  9:35   ` Wolfram Sang
2011-08-09 10:54     ` Huang Shijie
2011-08-09 20:42       ` Wolfram Sang
2011-08-08  9:12 ` Huang Shijie
2011-08-09  9:19   ` Wolfram Sang
2011-08-09 10:41     ` Huang Shijie
2011-08-09 11:36       ` Lothar Waßmann
2011-08-14  8:11 ` Ivan Djelic
2011-08-14 18:31   ` Wolfram Sang
2011-08-15  5:41   ` Lothar Waßmann
2011-08-15  6:30     ` Lin Tony-B19295
2011-08-15  8:41       ` Ivan Djelic
2011-08-15  8:29     ` Ivan Djelic
2011-08-15  9:31       ` Lothar Waßmann
2011-08-15 12:54         ` Ivan Djelic
2011-08-15 13:37           ` Lothar Waßmann
2011-08-15 16:34         ` Artem Bityutskiy
2011-08-15 16:18     ` Artem Bityutskiy
2011-08-15 16:22   ` Artem Bityutskiy
2011-08-15 16:57     ` Ivan Djelic
