* GPMI-NAND Status?
@ 2011-08-05 13:51 Wolfram Sang
2011-08-08 6:21 ` Huang Shijie
` (2 more replies)
0 siblings, 3 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-05 13:51 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
I am a bit uncertain about the current state of the GPMI-NAND driver, so
I'll try to sum it up here. There is without doubt interest in getting the
driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
wonder if we can join forces more effectively. First of all, I want to thank
Huang Shijie for all his work so far, which has already been quite some effort;
this sum-up is by no means meant as bashing, just an attempt to understand the
status quo. (Sidenote: I am more or less on holiday until Monday, so no time for
real debugging myself. I write this mail so we hopefully gain a common
understanding. When I am back to full strength, I can then start working on
what seems appropriate.)
Issues with the current driver I am aware of:
DMA timeouts [1]
================
[ 2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
[ 3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
Always reproducible by me when trying to format mtd0. Sometimes (always?) seen
by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
the bug is in the GPMI driver or in the MXS-DMA driver. Still, I'd say the
issue is a show-stopper. We can't put a driver into mainline which leads to the
above failure. The fact that there is _some_ configuration which works for
someone does not help; it doesn't work for Koen and me at least. We need
reliable drivers in mainline, so the issue needs to be resolved, regardless
of where the bug resides.
problem overwriting all-0xff data in NAND [2]
=============================================
Although it has occurred only when writing JFFS2 images so far, this is a generic
issue and needs to be fixed, right?
ecclayout needs to be used to show that OOB is fully in use [1]
===============================================================
Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
working with UBIFS is surely not ready for mainline.
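To make the point concrete: a layout description that declares no free OOB
bytes tells the upper layers that they must not store anything there. Below is
a minimal sketch in plain user-space C. The type and function names are
hypothetical and only loosely modeled on the idea of an ECC layout; they are
not the kernel's actual structures and not code from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FREE_REGIONS 8

/* Hypothetical description of one free (non-ECC) region in the OOB area. */
struct oob_free_region {
	unsigned offset;
	unsigned length;
};

/* Hypothetical layout: a region with length 0 terminates the free list. */
struct oob_layout_sketch {
	unsigned ecc_bytes;
	struct oob_free_region free[MAX_FREE_REGIONS];
};

/* Sum the OOB bytes an upper layer (e.g. a file system) may still use. */
static unsigned oob_bytes_available(const struct oob_layout_sketch *layout)
{
	unsigned total = 0;
	size_t i;

	assert(layout != NULL);
	for (i = 0; i < MAX_FREE_REGIONS && layout->free[i].length != 0; i++)
		total += layout->free[i].length;
	return total;
}
```

A layout for which this returns 0 is what I would expect a driver to advertise
when the ECC engine consumes the whole OOB area, so that nothing above it tries
to place its own markers there.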
Peculiarities
=============
There are a few issues which are odd. I don't know if some are mainly intended
for debugging, yet they shouldn't be in a mainline driver. At least:
* custom sysfs-entries
* custom kernel command line parameters
* namespacing (some functions have no prefix, some have "mil_", some have mx23)
(I think 'mil' means 'mtd interface layer', but why is that needed?)
Complexity
==========
The driver is not easy to review. I wonder if it makes sense to use incremental
patches for it? Maybe making it a staging driver could be a solution for that?
Huang, are you interested in accepting patches or do you prefer we just point
at certain code and you then fix it? Starting with a simpler driver and then
adding stuff might be another option if we can't chase all the bugs in the
current driver.
That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
meet in IRC or something to work that out? Is there interest in that?
Ok, those were my two cents. Your mileage may vary, so please give your thoughts.
I mainly don't want the driver development to get stalled.
Regards,
Wolfram
[1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037200.html
[2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
--
Pengutronix e.K. | Wolfram Sang |
Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
@ 2011-08-08 6:21 ` Huang Shijie
2011-08-08 9:19 ` Koen Beel
2011-08-09 9:35 ` Wolfram Sang
2011-08-08 9:12 ` Huang Shijie
2011-08-14 8:11 ` Ivan Djelic
2 siblings, 2 replies; 33+ messages in thread
From: Huang Shijie @ 2011-08-08 6:21 UTC (permalink / raw)
To: linux-arm-kernel
Hi Wolfram:
> Hi,
>
> I am a bit uncertain how the state of the GPMI-NAND driver currently is, so
> I'll try to sum it up here. There is without doubt interest in getting the
> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
> wonder if we can join forces more effectively. First of all, I want to thank
> Huang Shijie for all his work so far which was already quite some effort; this
> sum-up is by no means meant as bashing, just trying to understand the status
> quo (Sidenote: I am more or less on holiday until Monday, so no time for real
> debugging myself. I write this mail so we hopefully gain a common
> understanding. When I am back to full strength, I can then start working on
> what seems appropriate)
>
> Issues with the current driver I am aware of:
>
> DMA timeouts [1]
> ================
>
> [ 2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
> [ 3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>
> Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
> by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
After I switched to a different .config, it no longer appears on my side.
> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say the
> issue is a show-stopper. We can't put a driver into mainline which leads to the
> above failure. The fact that there is _some_ configuration which works for
> someone does not help, it doesn't work for Koen and me at least. We need
Hi Koen, did you test my uImage?
Did the timeout occur?
> reliable drivers in mainline, so the issue needs to be resolved, regardless
> where the bug resides.
OK, I will debug it too.
Please test the driver again when you are back in the office.
Pay attention to your version of /arch/arm/configs/mxs_defconfig.
Your mxs_defconfig may be missing Shawn Guo's patches.
Thanks.
>
> problem overwriting all-0xff data in NAND [2]
> =============================================
>
> Although it occurred only when writing JFFS2 images so far, this is a generic
> issue and needs to be fixed, right?
>
Artem said the fix should not go into the driver, but into the upper layer (jffs2).
So I think I do not need to change the driver.
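A minimal sketch of what such an upper-layer check could look like, assuming
the problem in [2] is that a page programmed with all-0xff data no longer
counts as erased once ECC has been written alongside it. The helper names are
hypothetical and this is plain user-space C, not code from JFFS2, mtd-utils or
the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if every byte in buf is 0xff, else 0. */
static int page_is_all_ff(const uint8_t *buf, size_t len)
{
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < len; i++)
		if (buf[i] != 0xff)
			return 0;
	return 1;
}

/*
 * Hypothetical writer loop: count the pages of an image that would really
 * be programmed when all-0xff pages are skipped and left erased.
 * A trailing partial page is ignored in this sketch.
 */
static size_t count_pages_to_program(const uint8_t *image, size_t image_len,
				     size_t page_size)
{
	size_t off, pages = 0;

	assert(page_size > 0);
	for (off = 0; off + page_size <= image_len; off += page_size)
		if (!page_is_all_ff(image + off, page_size))
			pages++;
	return pages;
}
```

With a check like this in the writer, padding pages in an image stay erased and
can still be programmed later, which would leave the driver untouched.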
> ecclayout needs to be used to show that OOB is fully in use [1]
> ===============================================================
>
> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> working with UBIFS is surely not ready for mainline.
>
I have been programming for the mx6q in recent days, so I have had no time to
fix it. The mx6q runs well now.
So I will fix this issue in the coming days.
> Peculiarities
> =============
>
> There are a few issues which are odd. I don't know if some are mainly intended
> for debugging, yet they shouldn't be in a mainline driver. At least:
>
> * custom sysfs-entries
My sysfs entries are in the GPMI-NAND directory.
Does being a mainline driver mean I should not have any sysfs entries?
If so, I can remove them.
> * custom kernel command line parameters
The kernel command line parameter 'gpmi_nand' is there to avoid conflicts with
other modules such as SD.
If it is removed, I have to use a different config to resolve the issue,
which is not better either. :(
> * namespacing (some functions have no prefix, some have "mil_", some have mx23)
> (I think 'mil' means 'mtd interface layer', but why is that needed?)
The mil is used to keep gpmi_nand_data{} simple.
Without it, gpmi_nand_data{} would be very big.
The functions with the mx23 prefix are only used on mx23.
The functions with no prefix can be used on both mx28 and mx23.
> Complexity
> ==========
>
> The driver is not easy to review. I wonder if it makes sense to use incremental
> patches for it? maybe making it a staging driver could be a solution for that?
Frankly speaking, the current driver is probably already the smallest version.
I have not even added the on-chip BBT feature yet.
> Huang, are you interested in accepting patches or do you prefer we just point
> at certain code and you then fix it? Starting with a simpler driver and then
Feel free to mail me patches; they are welcome.
> adding stuff might be another option if we can't chase all the bugs in the
> current driver.
>
> That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
> meet in IRC or something to work that out? Is there interest in that?
What about gtalk?
Best Regards
Huang Shijie
> Ok, those were my two cents. Your mileages may vary, please give your thoughts,
> then. I mainly don't want the driver development to get stalled.
>
> Regards,
>
> Wolfram
>
> [1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037200.html
> [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
2011-08-08 6:21 ` Huang Shijie
@ 2011-08-08 9:12 ` Huang Shijie
2011-08-09 9:19 ` Wolfram Sang
2011-08-14 8:11 ` Ivan Djelic
2 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-08 9:12 UTC (permalink / raw)
To: linux-arm-kernel
Hi Wolfram:
>
> ecclayout needs to be used to show that OOB is fully in use [1]
> ===============================================================
>
> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> working with UBIFS is surely not ready for mainline.
>
It seems that just modifying the ecclayout of GPMI-NAND cannot fix the problem.
The code of JFFS2 and mtd would also need to change.
Someone once posted a patch about this:
http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html
Best Regards
Huang Shijie
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 6:21 ` Huang Shijie
@ 2011-08-08 9:19 ` Koen Beel
2011-08-08 10:37 ` Huang Shijie
` (2 more replies)
2011-08-09 9:35 ` Wolfram Sang
1 sibling, 3 replies; 33+ messages in thread
From: Koen Beel @ 2011-08-08 9:19 UTC (permalink / raw)
To: linux-arm-kernel
Hi Wolfram,
Thanks for taking the initiative to summarize the current status.
Also thanks to Huang Shijie for all the work done so far.
On Mon, Aug 8, 2011 at 8:21 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Wolfram:
>>
>> Hi,
>>
>> I am a bit uncertain how the state of the GPMI-NAND driver currently is,
>> so
>> I'll try to sum it up here. There is without doubt interest in getting the
>> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
>> wonder if we can join forces more effectively. First of all, I want to
>> thank
>> Huang Shijie for all his work so far which was already quite some effort;
>> this
>> sum-up is by no means meant as bashing, just trying to understand the
>> status
>> quo (Sidenote: I am more or less on holiday until Monday, so no time for
>> real
>> debugging myself. I write this mail so we hopefully gain a common
>> understanding. When I am back to full strength, I can then start working
>> on
>> what seems appropriate)
>>
>> Issues with the current driver I am aware of:
>>
>> DMA timeouts [1]
>> ================
>>
>> [ 2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
>> [ 3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>
>> Always reproducible by me when trying to format mtd0. Sometimes(always?)
>> seen
>> by Koen during boot (on read?). Never seen by Huang? It is currently
>> unclear if
>
> After I used a different .config, it never appears in my side.
flash_eraseall of mtd1 works for me.
ubi_format of mtd1 always gives the dma timeout
reading/writing of mtd0/1 always gives the dma timeout
I have seen the dma timeout during boot if I try to enable ubi rootfs (so
that's the same issue as the dma timeout during read/write).
I don't use mtd0 for testing as this contains my uboot.
I tested using Huang's .config and the Linaro git but still see
exactly the same issue.
>
>> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say
>> the
>> issue is a show-stopper. We can't put a driver into mainline which leads
>> to the
>> above failure. The fact that there is _some_ configuration which works for
>> someone does not help, it doesn't work for Koen and me at least. We need
On my target, the mxs-dma is working for sdio until the gpmi-nand
gives a timeout. After that the dma for sdio is *not fully* working
anymore.
>
> Hi Koen, do you test my uImage?
> Does the timeout occur?
I was not able to test your uImage. It ended with a "Kernel panic - not
syncing: read error". See (off list) mail from last week.
>>
>> reliable drivers in mainline, so the issue needs to be resolved,
>> regardless
>> where the bug resides.
>
> ok. I will debug it too.
>
>
> Please test the driver again when you back to office.
> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
> Your mxs_defconfig may miss Shawn Guo's patches.
>
> thanks.
>
>
>>
>> problem overwriting all-0xff data in NAND [2]
>> =============================================
>>
>> Although it occurred only when writing JFFS2 images so far, this is a
>> generic
>> issue and needs to be fixed, right?
>>
> Artem said it should not change the driver, but the upper layer(jffs2).
>
> So I think i do not need to change the driver.
>>
>> ecclayout needs to be used to show that OOB is fully in use [1]
>> ===============================================================
>>
>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver
>> only
>> working with UBIFS is surely not ready for mainline.
>>
> I programmed for mx6q in the recent days. I have no time to fix it. The mx6q
> can runs well now.
>
> So I will fix the issue in the following days.
>
>> Peculiarities
>> =============
>>
>> There are a few issues which are odd. I don't know if some are mainly
>> intended
>> for debugging, yet they shouldn't be in a mainline driver. At least:
>>
>> * custom sysfs-entries
>
> My sysfs-entries is in the GPMI-NAND directory.
> Does be a mainline driver means I should not have any sysfs-entries?
> If it does, i can remove it.
>
>> * custom kernel command line parameters
>
> The kernel command line 'gpmi_nand' is to avoid the conflict with other
> modules such as
> SD.
>
> If it's be removed, I have to use different config to resolve the issue
> which is not better either. :(
>
>> * namespacing (some functions have no prefix, some have "mil_", some have
>> mx23)
>>   (I think 'mil' means 'mtd interface layer', but why is that needed?)
>
> The mil is used to make the gpmi_nand_data{} simple.
> Without it, the gpmi_nand_data{} will very big.
>
> The functions which have mx23 prefix are only used in mx23.
> The functions which have no prefix can used in both mx28 and mx23.
>
>> Complexity
>> ==========
>>
>> The driver is not easy to review. I wonder if it makes sense to use
>> incremental
>> patches for it? maybe making it a staging driver could be a solution for
>> that?
>
> Frankly speaking, the current driver is maybe the smallest version now.
>
> I even do not add the on-chip BBT feature now.
>>
>> Huang, are you interested in accepting patches or do you prefer we just
>> point
>> at certain code and you then fix it? Starting with a simpler driver and
>> then
>
> Feel free to mail me the patch. it's welcome.
>
>
>> adding stuff might be another option if we can't chase all the bugs in the
>> current driver.
>>
>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we
>> can
>> meet in IRC or something to work that out? Is there interest in that?
>
> What about gtalk?
Anything is good for me.
Could also be useful to make sure we test on the same HW as much as
possible and are using the same source tree.
HW I have:
- mx23evk rev C1
- mx23evk rev B2
- own target hw using mx23 lqfp-128 chip and different type of ddr and nand.
>
> Best Regards
> Huang Shijie
>
>> Ok, those were my two cents. Your mileages may vary, please give your
>> thoughts,
>> then. I mainly don't want the driver development to get stalled.
+1
Br,
Koen
>>
>> Regards,
>>
>>    Wolfram
>>
>> [1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037200.html
>> [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
>
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 9:19 ` Koen Beel
@ 2011-08-08 10:37 ` Huang Shijie
2011-08-08 12:42 ` Koen Beel
2011-08-09 5:11 ` Huang Shijie
2011-08-09 9:45 ` Wolfram Sang
2 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-08 10:37 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
> Hi Wolfram,
>
> Thanks for taking the initiative to summarize the current status.
> Also thanks to Huang Shijie for all the work done so far.
>
>
> On Mon, Aug 8, 2011 at 8:21 AM, Huang Shijie<b32955@freescale.com> wrote:
>> Hi Wolfram:
>>> Hi,
>>>
>>> I am a bit uncertain how the state of the GPMI-NAND driver currently is,
>>> so
>>> I'll try to sum it up here. There is without doubt interest in getting the
>>> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
>>> wonder if we can join forces more effectively. First of all, I want to
>>> thank
>>> Huang Shijie for all his work so far which was already quite some effort;
>>> this
>>> sum-up is by no means meant as bashing, just trying to understand the
>>> status
>>> quo (Sidenote: I am more or less on holiday until Monday, so no time for
>>> real
>>> debugging myself. I write this mail so we hopefully gain a common
>>> understanding. When I am back to full strength, I can then start working
>>> on
>>> what seems appropriate)
>>>
>>> Issues with the current driver I am aware of:
>>>
>>> DMA timeouts [1]
>>> ================
>>>
>>> [ 2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA
>>> :1
>>> [ 3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>>
>>> Always reproducible by me when trying to format mtd0. Sometimes(always?)
>>> seen
>>> by Koen during boot (on read?). Never seen by Huang? It is currently
>>> unclear if
>> After I used a different .config, it never appears in my side.
> flash_eraseall of mtd1 works for me.
> ubi_format of mtd1 always gives the dma timeout
> reading/writing of mtd0/1 always gives the dma timeout
> I have seen dma timeout during boot if i try to enable ubi rootfs (so
> that's the same issue as dma timeout during read/write).
>
> I don't use mtd0 for testing as this contains my uboot.
>
> I tested using Huang's .config and the Linaro git but still see
> exactly the same issue.
>
strange.
>>> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say
>>> the
>>> issue is a show-stopper. We can't put a driver into mainline which leads
>>> to the
>>> above failure. The fact that there is _some_ configuration which works for
>>> someone does not help, it doesn't work for Koen and me at least. We need
> On my target, the mxs-dma is working for sdio until the gpmi-nand
> gives a timeout. After that the dma for sdio is *not fully* working
> anymore.
>
We need more logging of the following:
[1] apbh-dma registers
[2] clk registers
[3] gpmi registers
Please git-apply the patch in the attachment.
It will print out more DMA information when a DMA timeout occurs.
>> Hi Koen, do you test my uImage?
>> Does the timeout occur?
> I was not able to test your uImage. It ended with a "Kernel panic - not
> syncing: read error". See (off list) mail from last week.
>
ok.
>>> reliable drivers in mainline, so the issue needs to be resolved,
>>> regardless
>>> where the bug resides.
>> ok. I will debug it too.
>>
>>
>> Please test the driver again when you back to office.
>> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
>> Your mxs_defconfig may miss Shawn Guo's patches.
>>
>> thanks.
>>
>>
>>> problem overwriting all-0xff data in NAND [2]
>>> =============================================
>>>
>>> Although it occurred only when writing JFFS2 images so far, this is a
>>> generic
>>> issue and needs to be fixed, right?
>>>
>> Artem said it should not change the driver, but the upper layer(jffs2).
>>
>> So I think i do not need to change the driver.
>>> ecclayout needs to be used to show that OOB is fully in use [1]
>>> ===============================================================
>>>
>>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver
>>> only
>>> working with UBIFS is surely not ready for mainline.
>>>
>> I programmed for mx6q in the recent days. I have no time to fix it. The mx6q
>> can runs well now.
>>
>> So I will fix the issue in the following days.
>>
>>> Peculiarities
>>> =============
>>>
>>> There are a few issues which are odd. I don't know if some are mainly
>>> intended
>>> for debugging, yet they shouldn't be in a mainline driver. At least:
>>>
>>> * custom sysfs-entries
>> My sysfs-entries is in the GPMI-NAND directory.
>> Does be a mainline driver means I should not have any sysfs-entries?
>> If it does, i can remove it.
>>
>>> * custom kernel command line parameters
>> The kernel command line 'gpmi_nand' is to avoid the conflict with other
>> modules such as
>> SD.
>>
>> If it's be removed, I have to use different config to resolve the issue
>> which is not better either. :(
>>
>>> * namespacing (some functions have no prefix, some have "mil_", some have
>>> mx23)
>>> (I think 'mil' means 'mtd interface layer', but why is that needed?)
>> The mil is used to make the gpmi_nand_data{} simple.
>> Without it, the gpmi_nand_data{} will very big.
>>
>> The functions which have mx23 prefix are only used in mx23.
>> The functions which have no prefix can used in both mx28 and mx23.
>>
>>> Complexity
>>> ==========
>>>
>>> The driver is not easy to review. I wonder if it makes sense to use
>>> incremental
>>> patches for it? maybe making it a staging driver could be a solution for
>>> that?
>> Frankly speaking, the current driver is maybe the smallest version now.
>>
>> I even do not add the on-chip BBT feature now.
>>> Huang, are you interested in accepting patches or do you prefer we just
>>> point
>>> at certain code and you then fix it? Starting with a simpler driver and
>>> then
>> Feel free to mail me the patch. it's welcome.
>>
>>
>>> adding stuff might be another option if we can't chase all the bugs in the
>>> current driver.
>>>
>>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we
>>> can
>>> meet in IRC or something to work that out? Is there interest in that?
>> What about gtalk?
> Anything is good for me.
> Could also be useful to make sure we test on the same HW as much as
> possible and are using the same source tree.
> HW I have:
> - mx23evk rev C1
> - mx23evk rev B2
> - own target hw using mx23 lqfp-128 chip and different type of ddr and nand.
>
I have mx23evk rev C.
Best Regards
Huang Shijie
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 10:37 ` Huang Shijie
@ 2011-08-08 12:42 ` Koen Beel
2011-08-09 6:36 ` Huang Shijie
0 siblings, 1 reply; 33+ messages in thread
From: Koen Beel @ 2011-08-08 12:42 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie <b32955@freescale.com> wrote:
> Hi,
>>
>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>> gives a timeout. After that the dma for sdio is *not fully* working
>> anymore.
>>
> We need more log in following aspects:
> [1] apbh-dma registers
> [2] clk registers
> [3] gpmi registers
>
> Please git-apply the patch in the attachment.
> It will print out more DMA information WHEN dma-timeout occur.
Don't get it. What exactly are you trying to dump?
This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
of APBH channel0 which is reserved....
Then it prints some debug info on channel 1 (ssp1) and then all
channel 2 registers except the debug register (ssp2 = not used here).
What info do you need?
Br,
Koen
>
> Best Regards
> Huang Shijie
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 9:19 ` Koen Beel
2011-08-08 10:37 ` Huang Shijie
@ 2011-08-09 5:11 ` Huang Shijie
2011-08-09 6:25 ` Koen Beel
2011-08-09 9:45 ` Wolfram Sang
2 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 5:11 UTC (permalink / raw)
To: linux-arm-kernel
Hi Koen:
> Hi Wolfram,
>
> Thanks for taking the initiative to summarize the current status.
> Also thanks to Huang Shijie for all the work done so far.
>
>
> On Mon, Aug 8, 2011 at 8:21 AM, Huang Shijie<b32955@freescale.com> wrote:
>> Hi Wolfram:
>>> Hi,
>>>
>>> I am a bit uncertain how the state of the GPMI-NAND driver currently is,
>>> so
>>> I'll try to sum it up here. There is without doubt interest in getting the
>>> driver into mainline from at least Huang, Shawn, Lothar, Koen and me, so I
>>> wonder if we can join forces more effectively. First of all, I want to
>>> thank
>>> Huang Shijie for all his work so far which was already quite some effort;
>>> this
>>> sum-up is by no means meant as bashing, just trying to understand the
>>> status
>>> quo (Sidenote: I am more or less on holiday until Monday, so no time for
>>> real
>>> debugging myself. I write this mail so we hopefully gain a common
>>> understanding. When I am back to full strength, I can then start working
>>> on
>>> what seems appropriate)
>>>
>>> Issues with the current driver I am aware of:
>>>
>>> DMA timeouts [1]
>>> ================
>>>
>>> [ 2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA
>>> :1
>>> [ 3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>>
>>> Always reproducible by me when trying to format mtd0. Sometimes(always?)
>>> seen
>>> by Koen during boot (on read?). Never seen by Huang? It is currently
>>> unclear if
>> After I used a different .config, it never appears in my side.
> flash_eraseall of mtd1 works for me.
> ubi_format of mtd1 always gives the dma timeout
> reading/writing of mtd0/1 always gives the dma timeout
> I have seen dma timeout during boot if i try to enable ubi rootfs (so
> that's the same issue as dma timeout during read/write).
>
> I don't use mtd0 for testing as this contains my uboot.
>
> I tested using Huang's .config and the Linaro git but still see
> exactly the same issue.
>
>>> the bug is in the GPMI driver, or in the MXS-DMA driver. Still, I'd say
>>> the
>>> issue is a show-stopper. We can't put a driver into mainline which leads
>>> to the
>>> above failure. The fact that there is _some_ configuration which works for
>>> someone does not help, it doesn't work for Koen and me at least. We need
> On my target, the mxs-dma is working for sdio until the gpmi-nand
> gives a timeout. After that the dma for sdio is *not fully* working
> anymore.
>
>> Hi Koen, do you test my uImage?
>> Does the timeout occur?
> I was not able to test your uImage. It ended with a "Kernel panic - not
> syncing: read error". See (off list) mail from last week.
>
>
>>> reliable drivers in mainline, so the issue needs to be resolved,
>>> regardless
>>> where the bug resides.
>> ok. I will debug it too.
>>
>>
>> Please test the driver again when you back to office.
>> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
>> Your mxs_defconfig may miss Shawn Guo's patches.
>>
>> thanks.
>>
>>
>>> problem overwriting all-0xff data in NAND [2]
>>> =============================================
>>>
>>> Although it occurred only when writing JFFS2 images so far, this is a
>>> generic
>>> issue and needs to be fixed, right?
>>>
>> Artem said it should not change the driver, but the upper layer(jffs2).
>>
>> So I think i do not need to change the driver.
>>> ecclayout needs to be used to show that OOB is fully in use [1]
>>> ===============================================================
>>>
>>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver
>>> only
>>> working with UBIFS is surely not ready for mainline.
>>>
>> I programmed for mx6q in the recent days. I have no time to fix it. The mx6q
>> can runs well now.
>>
>> So I will fix the issue in the following days.
>>
>>> Peculiarities
>>> =============
>>>
>>> There are a few issues which are odd. I don't know if some are mainly
>>> intended
>>> for debugging, yet they shouldn't be in a mainline driver. At least:
>>>
>>> * custom sysfs-entries
>> My sysfs-entries is in the GPMI-NAND directory.
>> Does be a mainline driver means I should not have any sysfs-entries?
>> If it does, i can remove it.
>>
>>> * custom kernel command line parameters
>> The kernel command line 'gpmi_nand' is to avoid the conflict with other
>> modules such as
>> SD.
>>
>> If it's be removed, I have to use different config to resolve the issue
>> which is not better either. :(
>>
>>> * namespacing (some functions have no prefix, some have "mil_", some have
>>> mx23)
>>> (I think 'mil' means 'mtd interface layer', but why is that needed?)
>> The mil is used to make the gpmi_nand_data{} simple.
>> Without it, the gpmi_nand_data{} will very big.
>>
>> The functions which have mx23 prefix are only used in mx23.
>> The functions which have no prefix can used in both mx28 and mx23.
>>
>>> Complexity
>>> ==========
>>>
>>> The driver is not easy to review. I wonder if it makes sense to use
>>> incremental
>>> patches for it? maybe making it a staging driver could be a solution for
>>> that?
>> Frankly speaking, the current driver is maybe the smallest version now.
>>
>> I even do not add the on-chip BBT feature now.
>>> Huang, are you interested in accepting patches or do you prefer we just
>>> point
>>> at certain code and you then fix it? Starting with a simpler driver and
>>> then
>> Feel free to mail me the patch. it's welcome.
>>
>>
>>> adding stuff might be another option if we can't chase all the bugs in the
>>> current driver.
>>>
>>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we
>>> can
>>> meet in IRC or something to work that out? Is there interest in that?
>> What about gtalk?
> Anything is good for me.
> Could also be useful to make sure we test on the same HW as much as
> possible and are using the same source tree.
> HW I have:
> - mx23evk rev C1
> - mx23evk rev B2
> - own target hw using mx23 lqfp-128 chip and different type of ddr and nand.
>
My mx23 test board uses the 169BGA package, which is different from yours.
Could you get a board with the 169BGA package?
I think the DMA timeout is caused by the different package type.
Best Regards
Huang Shijie
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 5:11 ` Huang Shijie
@ 2011-08-09 6:25 ` Koen Beel
2011-08-09 6:40 ` Huang Shijie
0 siblings, 1 reply; 33+ messages in thread
From: Koen Beel @ 2011-08-09 6:25 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Tue, Aug 9, 2011 at 7:11 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Koen:
>> Anything is good for me.
>> Could also be useful to make sure we test on the same HW as much as
>> possible and are using the same source tree.
>> HW I have:
>> - mx23evk rev C1
>> - mx23evk rev B2
>> - own target hw using mx23 lqfp-128 chip and different type of ddr and
>> nand.
>>
> My test mx23 board is 169BGA package which is different from yours.
>
> Could you get the 169BGA package board?
>
> I think the DMA timeout is caused by the different package type.
I suppose my mx23evk is exactly the same as yours (same revision), and
it has the bga169 package. My actual target has lqfp128. But on both
boards I get the same issues.
So I don't think the package type has anything to do with the dma
timeout issue. After all, the silicon inside is the same.
I'm trying to debug a little further.
Regards,
Koen
>
> Best Regards
> Huang Shijie
>
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 12:42 ` Koen Beel
@ 2011-08-09 6:36 ` Huang Shijie
2011-08-09 7:58 ` Koen Beel
0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 6:36 UTC (permalink / raw)
To: linux-arm-kernel
Hi Koen:
> Hi,
>
> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com> wrote:
>> Hi,
>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>> gives a timeout. After that the dma for sdio is *not fully* working
>>> anymore.
>>>
>> We need more log in following aspects:
>> [1] apbh-dma registers
>> [2] clk registers
>> [3] gpmi registers
>>
>> Please git-apply the patch in the attachment.
>> It will print out more DMA information WHEN dma-timeout occur.
> Don't get it. What exactly are you trying to dump?
> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
> of APBH channel0 which is reserved....
Sorry, I intended to print out channel 4 (NAND_DEVICE0).
I want to know:
when the DMA timeout occurs, whether it is caused by the GPMI or by the
DMA itself.
Please try the new patch.
Best Regards
Huang Shijie
> Then it prints some debug info on channel 1 (ssp1) and then alle
> channel 2 register except the debug register (ssp2 = not used here).
>
> What info do you need?
>
> Br,
> Koen
>
>> Best Regards
>> Huang Shijie
>>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 6:25 ` Koen Beel
@ 2011-08-09 6:40 ` Huang Shijie
0 siblings, 0 replies; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 6:40 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
> Hi,
>
> On Tue, Aug 9, 2011 at 7:11 AM, Huang Shijie<b32955@freescale.com> wrote:
>> Hi Koen:
>>> Anything is good for me.
>>> Could also be useful to make sure we test on the same HW as much as
>>> possible and are using the same source tree.
>>> HW I have:
>>> - mx23evk rev C1
>>> - mx23evk rev B2
>>> - own target hw using mx23 lqfp-128 chip and different type of ddr and
>>> nand.
>>>
>> My test mx23 board is 169BGA package which is different from yours.
>>
>> Could you get the 169BGA package board?
>>
>> I think the DMA timeout is caused by the different package type.
> I suppose my mx23evk is right the same as you have (same revision) and
> this has the bga169 package. My actual target has lqfp128. But on both
> boards I get the same issues.
Could you test my kernel on your side?
I can provide you with the sd_loader and the kernel.
You can use an SD card to store the rootfs.
Best Regards
Huang Shijie
> So I don't think the package type has something to do with the dma
> timeout issue. After all the silicon inside is the same thing.
>
> I'm trying to debug a little further.
>
> Regards,
> Koen
>
>> Best Regards
>> Huang Shijie
>>
>>
>>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 6:36 ` Huang Shijie
@ 2011-08-09 7:58 ` Koen Beel
2011-08-09 8:18 ` Huang Shijie
0 siblings, 1 reply; 33+ messages in thread
From: Koen Beel @ 2011-08-09 7:58 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Tue, Aug 9, 2011 at 8:36 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Koen:
>>
>> Hi,
>>
>> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com>
>> wrote:
>>>
>>> Hi,
>>>>
>>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>>> gives a timeout. After that the dma for sdio is *not fully* working
>>>> anymore.
>>>>
>>> We need more log in following aspects:
>>> [1] apbh-dma registers
>>> [2] clk registers
>>> [3] gpmi registers
>>>
>>> Please git-apply the patch in the attachment.
>>> It will print out more DMA information WHEN dma-timeout occur.
>>
>> Don't get it. What exactly are you trying to dump?
>> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
>> of APBH channel0 which is reserved....
>
> sorry, I intended to print out the channel 4(NAND_DEVICE0).
>
> I want to know that:
> When the dma timeout occurs, whether it caused by the GPMI or by the DMA
> itself.
OK, I was a little confused about the addresses, but it seems you are
using the mx28 (and its addresses). According to the datasheet, the APBH
DMA on the mx23 sits at a different address.
So I adjusted the patch a little for the mx23; see the attachment.
Here is the log, with some comments on the DMA registers added.
# ubiformat /dev/mtd1
ubiformat: mtd1 (nand), size 20971520 bytes (20.0 MiB), 40 eraseblocks
of 524288 bytes (512.0 KiB), min. I/O size 4096 bytes
libscan: scanning eraseblock 0 -- 2 % complete [ 86.720000] [
start_dma_without_bch_irq : 393 ] DMA timeout, last DMA :1
[ 86.720000] ------------------------DMA DUMP BEGIN ----------
[ 86.730000] APBH REG :0 : 30000000 // -> HW_APBH_CTRL0:
AHB_BURST8_EN, APB_BURST4_EN
[ 86.730000] APBH REG :10 : 00FF0000 // -> HW_APBH_CTRL1:
CHX_CMDCMPLT_IRQ_EN, no cmdcmplt_irq
[ 86.740000] APBH REG :20 : 00000000 // -> HW_APBH_CTRL2: no error_irq
[ 86.740000] APBH REG :30 : 00000000 // -> HW_APBH_DEVSEL: "N/A
for apbh bridge dma."
[ 86.750000] APBH CH4 REG :200 : 418D7098 // executing last dma
command of command chain (see below)
[ 86.750000] APBH CH4 REG :210 : 00000000 // no next command, ok
[ 86.750000] APBH CH4 REG :220 : 000001C8 // HW_APBH_CH4_CMD:
COMMAND = NO DMA TRANSFER, IRQONCMPLT, WAIT4ENDCMD, SEMAPHORE,
HALTONTERMINATE
[ 86.760000] APBH CH4 REG :230 : 00000000 // HW_APBH_CH4_BAR:
"Address of system memory buffer to be read or written over the AHB
bus." -> strange value ...
[ 86.760000] APBH CH4 REG :240 : 00010000 // HW_APBH_CH4_SEMA:
semaphore counter is 1
[ 86.770000] APBH CH4 REG :250 : 03A00015 // HW_APBH_CH4_DEBUG1:
LOCK, NEXTCMDADDRVALID, RD_FIFO_EMPTY, WR_FIFO_EMPTY, STATEMACHINE =
"WAIT_END = 0x15 When the Wait for Command End bit is set, the state
machine enters this state until the DMA device indicates that the
command is complete."
[ 86.770000] APBH CH4 REG :260 : 00000000 // -> HW_APBH_CH4_DEBUG2:
no APB or AHB bytes remaining for transfer
[ 86.780000] [ 0 ] : ME : 418d7000, next : 418d704c, bits :
00002304, bytes : 00000000, buf : 00000000
[ 86.790000] [ 0 ] PIO[0] : 03800000
[ 86.790000] [ 0 ] PIO[1] : 00000000
[ 86.800000] [ 0 ] PIO[2] : 00000000
[ 86.800000] [ 1 ] : ME : 418d704c, next : 418d7098, bits :
00006304, bytes : 00000001, buf : 4181b000
[ 86.810000] [ 1 ] PIO[0] : 018010da
[ 86.810000] [ 1 ] PIO[1] : 00000000
[ 86.820000] [ 1 ] PIO[2] : 000011ff
[ 86.820000] [ 2 ] : ME : 418d7098, next : 00000000, bits :
000023c8, bytes : 00000000, buf : 00000000
[ 86.830000] [ 2 ] PIO[0] : 038010da
[ 86.840000] [ 2 ] PIO[1] : 00000000
[ 86.840000] [ 2 ] PIO[2] : 00000000
[ 86.840000] ------------------------DMA DUMP END ------------
[ 86.850000] [ gpmi_show_regs : 076 ] -------------- Show GPMI
registers ----------
[ 86.860000] [ gpmi_show_regs : 079 ] offset 0x000 : 0x238010da
[ 86.870000] [ gpmi_show_regs : 079 ] offset 0x010 : 0x00000000
[ 86.870000] [ gpmi_show_regs : 079 ] offset 0x020 : 0x000011ff
[ 86.880000] [ gpmi_show_regs : 079 ] offset 0x030 : 0x000010da
[ 86.890000] [ gpmi_show_regs : 079 ] offset 0x040 : 0x40f0c480
[ 86.890000] [ gpmi_show_regs : 079 ] offset 0x050 : 0x40f09000
[ 86.900000] [ gpmi_show_regs : 079 ] offset 0x060 : 0x0004000c
[ 86.910000] [ gpmi_show_regs : 079 ] offset 0x070 : 0x00010203
[ 86.910000] [ gpmi_show_regs : 079 ] offset 0x080 : 0x05000000
[ 86.920000] [ gpmi_show_regs : 079 ] offset 0x090 : 0x09020101
[ 86.920000] [ gpmi_show_regs : 079 ] offset 0x0a0 : 0x00000030
[ 86.930000] [ gpmi_show_regs : 079 ] offset 0x0b0 : 0x80000010
[ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0c0 : 0x100000ba
[ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0d0 : 0x03000000
[ 86.950000] [ gpmi_show_regs : 081 ] -------------- Show GPMI
registers end ----------
[ 86.960000] Kernel panic - not syncing: -----------DMA
FAILED------------------
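For anyone who wants to double-check the annotations on HW_APBH_CH4_CMD and
HW_APBH_CH4_DEBUG1 in the dump above, here is a minimal Python sketch that
decodes the two values. The bit positions are my reading of the mx23 APBH
register layout, not something taken from the patch, so please verify them
against the datasheet. The function and table names are made up for
illustration.

```python
# Sketch only: bit positions below are assumptions about the mx23 APBH
# channel registers and should be checked against the datasheet.

CMD_COMMANDS = {0: "NO_DMA_XFER", 1: "DMA_WRITE", 2: "DMA_READ", 3: "DMA_SENSE"}

# Single-bit flags of HW_APBH_CHn_CMD (assumed positions).
CMD_FLAGS = {
    2: "CHAIN",
    3: "IRQONCMPLT",
    4: "NANDLOCK",
    5: "NANDWAIT4READY",
    6: "SEMAPHORE",
    7: "WAIT4ENDCMD",
    8: "HALTONTERMINATE",
}

# Single-bit flags of HW_APBH_CHn_DEBUG1 (assumed positions, subset).
DEBUG1_FLAGS = {
    20: "WR_FIFO_FULL",
    21: "WR_FIFO_EMPTY",
    22: "RD_FIFO_FULL",
    23: "RD_FIFO_EMPTY",
    24: "NEXTCMDADDRVALID",
    25: "LOCK",
}


def set_flags(value, table):
    """Return the names of all flags in `table` whose bit is set, lowest bit first."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


def decode_cmd(value):
    """Split a CHn_CMD value into its command field and flag names."""
    return {
        "command": CMD_COMMANDS[value & 0x3],
        "flags": set_flags(value, CMD_FLAGS),
        "cmdwords": (value >> 12) & 0xF,
        "xfer_count": (value >> 16) & 0xFFFF,
    }


def decode_debug1(value):
    """Split a CHn_DEBUG1 value into the state machine field and flag names."""
    return {
        "statemachine": value & 0x1F,
        "flags": set_flags(value, DEBUG1_FLAGS),
    }


if __name__ == "__main__":
    print(decode_cmd(0x000001C8))
    print(decode_debug1(0x03A00015))
```

With the values from the log, the CMD word decodes to "no DMA transfer" plus
IRQONCMPLT, SEMAPHORE, WAIT4ENDCMD and HALTONTERMINATE, and DEBUG1 has state
machine value 0x15, which is what the comments in the dump say.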
Br,
Koen
>
>
> Please try the new patch.
>
> Best Regards
> Huang Shijie
>>
>> Then it prints some debug info on channel 1 (ssp1) and then alle
>> channel 2 register except the debug register (ssp2 = not used here).
>>
>> What info do you need?
>>
>> Br,
>> Koen
>>
>>> Best Regards
>>> Huang Shijie
>>>
>>
>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0008-Added-extra-dma-log-for-ch4-nand0.patch
Type: text/x-patch
Size: 3432 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20110809/321eae87/attachment.bin>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 7:58 ` Koen Beel
@ 2011-08-09 8:18 ` Huang Shijie
2011-08-09 8:25 ` Koen Beel
0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 8:18 UTC (permalink / raw)
To: linux-arm-kernel
Hi Koen:
Thanks for your test.
> Hi,
>
>
>
> On Tue, Aug 9, 2011 at 8:36 AM, Huang Shijie<b32955@freescale.com> wrote:
>> Hi Koen:
>>> Hi,
>>>
>>> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com>
>>> wrote:
>>>> Hi,
>>>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>>>> gives a timeout. After that the dma for sdio is *not fully* working
>>>>> anymore.
>>>>>
>>>> We need more log in following aspects:
>>>> [1] apbh-dma registers
>>>> [2] clk registers
>>>> [3] gpmi registers
>>>>
>>>> Please git-apply the patch in the attachment.
>>>> It will print out more DMA information WHEN dma-timeout occur.
>>> Don't get it. What exactly are you trying to dump?
>>> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
>>> of APBH channel0 which is reserved....
>> sorry, I intended to print out the channel 4(NAND_DEVICE0).
>>
>> I want to know that:
>> When the dma timeout occurs, whether it caused by the GPMI or by the DMA
>> itself.
> Ok, I was a little confused about the addresses, but it seems like you
> are using mx28 (and corresponding addresses). APBH dma for mx23 has
> different address according to the datasheet.
> So I adjusted the patch a little for mx23, see attachment.
>
You are right. My address was wrong.
> Here is the log with some comments added on the dma.
>
> # ubiformat /dev/mtd1
> ubiformat: mtd1 (nand), size 20971520 bytes (20.0 MiB), 40 eraseblocks
> of 524288 bytes (512.0 KiB), min. I/O size 4096 bytes
> libscan: scanning eraseblock 0 -- 2 % complete [ 86.720000] [
> start_dma_without_bch_irq : 393 ] DMA timeout, last DMA :1
> [ 86.720000] ------------------------DMA DUMP BEGIN ----------
> [ 86.730000] APBH REG :0 : 30000000 // -> HW_APBH_CTRL0:
> AHB_BURST8_EN, APB_BURST4_EN
> [ 86.730000] APBH REG :10 : 00FF0000 // -> HW_APBH_CTRL1:
> CHX_CMDCMPLT_IRQ_EN, no cmdcmplt_irq
> [ 86.740000] APBH REG :20 : 00000000 // -> HW_APBH_CTRL2: no error_irq
> [ 86.740000] APBH REG :30 : 00000000 // -> HW_APBH_DEVSEL: "N/A
> for apbh bridge dma."
> [ 86.750000] APBH CH4 REG :200 : 418D7098 // executing last dma
> command of command chain (see below)
> [ 86.750000] APBH CH4 REG :210 : 00000000 // no next command, ok
> [ 86.750000] APBH CH4 REG :220 : 000001C8 // HW_APBH_CH4_CMD:
> COMMAND = NO DMA TRANSFER, IRQONCMPLT, WAIT4ENDCMD, SEMAPHORE,
> HALTONTERMINATE
> [ 86.760000] APBH CH4 REG :230 : 00000000 // HW_APBH_CH4_BAR:
> "Address of system memory buffer to be read or written over the AHB
> bus." -> strange value ...
> [ 86.760000] APBH CH4 REG :240 : 00010000 // HW_APBH_CH4_SEMA:
> semaphore counter is 1
> [ 86.770000] APBH CH4 REG :250 : 03A00015 // HW_APBH_CH4_DEBUG1:
> LOCK, NEXTCMDADDRVALID, RD_FIFO_EMPTY, WR_FIFO_EMPTY, STATEMACHINE =
> "WAIT_END = 0x15 When the Wait for Command End bit is set, the state
> machine enters this state until the DMA device indicates that the
> command is complete."
> [ 86.770000] APBH CH4 REG :260 : 00000000 // -> HW_APBH_CH4_DEBUG2:
> no apb of ahb bytes remaining for transfer
> [ 86.780000] [ 0 ] : ME : 418d7000, next : 418d704c, bits :
> 00002304, bytes : 00000000, buf : 00000000
> [ 86.790000] [ 0 ] PIO[0] : 03800000
> [ 86.790000] [ 0 ] PIO[1] : 00000000
> [ 86.800000] [ 0 ] PIO[2] : 00000000
> [ 86.800000] [ 1 ] : ME : 418d704c, next : 418d7098, bits :
> 00006304, bytes : 00000001, buf : 4181b000
> [ 86.810000] [ 1 ] PIO[0] : 018010da
> [ 86.810000] [ 1 ] PIO[1] : 00000000
> [ 86.820000] [ 1 ] PIO[2] : 000011ff
> [ 86.820000] [ 2 ] : ME : 418d7098, next : 00000000, bits :
It hangs here.
> 000023c8, bytes : 00000000, buf : 00000000
> [ 86.830000] [ 2 ] PIO[0] : 038010da
> [ 86.840000] [ 2 ] PIO[1] : 00000000
> [ 86.840000] [ 2 ] PIO[2] : 00000000
> [ 86.840000] ------------------------DMA DUMP END ------------
> [ 86.850000] [ gpmi_show_regs : 076 ] -------------- Show GPMI
> registers ----------
> [ 86.860000] [ gpmi_show_regs : 079 ] offset 0x000 : 0x238010da
> [ 86.870000] [ gpmi_show_regs : 079 ] offset 0x010 : 0x00000000
> [ 86.870000] [ gpmi_show_regs : 079 ] offset 0x020 : 0x000011ff
> [ 86.880000] [ gpmi_show_regs : 079 ] offset 0x030 : 0x000010da
> [ 86.890000] [ gpmi_show_regs : 079 ] offset 0x040 : 0x40f0c480
> [ 86.890000] [ gpmi_show_regs : 079 ] offset 0x050 : 0x40f09000
> [ 86.900000] [ gpmi_show_regs : 079 ] offset 0x060 : 0x0004000c
> [ 86.910000] [ gpmi_show_regs : 079 ] offset 0x070 : 0x00010203
> [ 86.910000] [ gpmi_show_regs : 079 ] offset 0x080 : 0x05000000
> [ 86.920000] [ gpmi_show_regs : 079 ] offset 0x090 : 0x09020101
> [ 86.920000] [ gpmi_show_regs : 079 ] offset 0x0a0 : 0x00000030
> [ 86.930000] [ gpmi_show_regs : 079 ] offset 0x0b0 : 0x80000010
> [ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0c0 : 0x100000ba
> [ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0d0 : 0x03000000
> [ 86.950000] [ gpmi_show_regs : 081 ] -------------- Show GPMI
> registers end ----------
> [ 86.960000] Kernel panic - not syncing: -----------DMA
> FAILED------------------
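The remark about descriptor [ 2 ] can be cross-checked against the numbers in
the dump: the channel's current-command address (CH4 REG :200) should match
the "ME" address of exactly one descriptor. A small Python sketch, with the
addresses copied from the log and a hypothetical helper name:

```python
# Descriptor chain as printed in the dump: (own address, next address).
CHAIN = [
    (0x418D7000, 0x418D704C),
    (0x418D704C, 0x418D7098),
    (0x418D7098, 0x00000000),
]

# Value of "APBH CH4 REG :200" in the dump (current command address).
CURRENT_CMD_ADDR = 0x418D7098


def stalled_descriptor(chain, current):
    """Return the index of the descriptor the channel is sitting on, or None."""
    for index, (addr, _next_addr) in enumerate(chain):
        if addr == current:
            return index
    return None


if __name__ == "__main__":
    index = stalled_descriptor(CHAIN, CURRENT_CMD_ADDR)
    is_last = CHAIN[index][1] == 0
    print("stalled on descriptor", index, "(last in chain)" if is_last else "")
```

This only confirms which descriptor the channel stopped on (index 2, the last
one in the chain). It says nothing about why the wait never finishes.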
Please post the call stack from when the panic occurs.
I also want to know where the code is running when the timeout occurs.
My gmail is shijie8 at gmail.com
We can talk more on gtalk.
Best Regards
Huang Shijie
> Br,
> Koen
>
>>
>> Please try the new patch.
>>
>> Best Regards
>> Huang Shijie
>>> Then it prints some debug info on channel 1 (ssp1) and then alle
>>> channel 2 register except the debug register (ssp2 = not used here).
>>>
>>> What info do you need?
>>>
>>> Br,
>>> Koen
>>>
>>>> Best Regards
>>>> Huang Shijie
>>>>
>>>
>>
>>
>>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 8:18 ` Huang Shijie
@ 2011-08-09 8:25 ` Koen Beel
0 siblings, 0 replies; 33+ messages in thread
From: Koen Beel @ 2011-08-09 8:25 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Aug 9, 2011 at 10:18 AM, Huang Shijie <b32955@freescale.com> wrote:
> Hi Koen:
> thanks for your test.
>>
>> Hi,
>>
>>
>>
>> On Tue, Aug 9, 2011 at 8:36 AM, Huang Shijie<b32955@freescale.com> wrote:
>>>
>>> Hi Koen:
>>>>
>>>> Hi,
>>>>
>>>> On Mon, Aug 8, 2011 at 12:37 PM, Huang Shijie<b32955@freescale.com>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> On my target, the mxs-dma is working for sdio until the gpmi-nand
>>>>>> gives a timeout. After that the dma for sdio is *not fully* working
>>>>>> anymore.
>>>>>>
>>>>> We need more log in following aspects:
>>>>> [1] apbh-dma registers
>>>>> [2] clk registers
>>>>> [3] gpmi registers
>>>>>
>>>>> Please git-apply the patch in the attachment.
>>>>> It will print out more DMA information WHEN dma-timeout occur.
>>>>
>>>> Don't get it. What exactly are you trying to dump?
>>>> This patch dumps CTRL0, CTRL1, CTRL2, DEVSEL but also some registers
>>>> of APBH channel0 which is reserved....
>>>
>>> sorry, I intended to print out the channel 4(NAND_DEVICE0).
>>>
>>> I want to know that:
>>> When the dma timeout occurs, whether it caused by the GPMI or by the DMA
>>> itself.
>>
>> Ok, I was a little confused about the addresses, but it seems like you
>> are using mx28 (and corresponding addresses). APBH dma for mx23 has
>> different address according to the datasheet.
>> So I adjusted the patch a little for mx23, see attachment.
>>
> you are right. My address was wrong.
>>
>> Here is the log with some comments added on the dma.
>>
>> # ubiformat /dev/mtd1
>> ubiformat: mtd1 (nand), size 20971520 bytes (20.0 MiB), 40 eraseblocks
>> of 524288 bytes (512.0 KiB), min. I/O size 4096 bytes
>> libscan: scanning eraseblock 0 -- 2 % complete [ 86.720000] [
>> start_dma_without_bch_irq : 393 ] DMA timeout, last DMA :1
>> [ 86.720000] ------------------------DMA DUMP BEGIN ----------
>> [ 86.730000] APBH REG :0 : 30000000 // -> HW_APBH_CTRL0:
>> AHB_BURST8_EN, APB_BURST4_EN
>> [ 86.730000] APBH REG :10 : 00FF0000 // -> HW_APBH_CTRL1:
>> CHX_CMDCMPLT_IRQ_EN, no cmdcmplt_irq
>> [ 86.740000] APBH REG :20 : 00000000 // -> HW_APBH_CTRL2: no error_irq
>> [ 86.740000] APBH REG :30 : 00000000 // -> HW_APBH_DEVSEL: "N/A
>> for apbh bridge dma."
>> [ 86.750000] APBH CH4 REG :200 : 418D7098 // executing last dma
>> command of command chain (see below)
>> [ 86.750000] APBH CH4 REG :210 : 00000000 // no next command, ok
>> [ 86.750000] APBH CH4 REG :220 : 000001C8 // HW_APBH_CH4_CMD:
>> COMMAND = NO DMA TRANSFER, IRQONCMPLT, WAIT4ENDCMD, SEMAPHORE,
>> HALTONTERMINATE
>> [ 86.760000] APBH CH4 REG :230 : 00000000 // HW_APBH_CH4_BAR:
>> "Address of system memory buffer to be read or written over the AHB
>> bus." -> strange value ...
>> [ 86.760000] APBH CH4 REG :240 : 00010000 // HW_APBH_CH4_SEMA:
>> semaphore counter is 1
>> [ 86.770000] APBH CH4 REG :250 : 03A00015 // HW_APBH_CH4_DEBUG1:
>> LOCK, NEXTCMDADDRVALID, RD_FIFO_EMPTY, WR_FIFO_EMPTY, STATEMACHINE =
>> "WAIT_END = 0x15 When the Wait for Command End bit is set, the state
>> machine enters this state until the DMA device indicates that the
>> command is complete."
>> [ 86.770000] APBH CH4 REG :260 : 00000000 // -> HW_APBH_CH4_DEBUG2:
>> no apb of ahb bytes remaining for transfer
>> [ 86.780000] [ 0 ] : ME : 418d7000, next : 418d704c, bits :
>> 00002304, bytes : 00000000, buf : 00000000
>> [ 86.790000] [ 0 ] PIO[0] : 03800000
>> [ 86.790000] [ 0 ] PIO[1] : 00000000
>> [ 86.800000] [ 0 ] PIO[2] : 00000000
>> [ 86.800000] [ 1 ] : ME : 418d704c, next : 418d7098, bits :
>> 00006304, bytes : 00000001, buf : 4181b000
>> [ 86.810000] [ 1 ] PIO[0] : 018010da
>> [ 86.810000] [ 1 ] PIO[1] : 00000000
>> [ 86.820000] [ 1 ] PIO[2] : 000011ff
>> [ 86.820000] [ 2 ] : ME : 418d7098, next : 00000000, bits :
>
> It hungs here.
>
>> 000023c8, bytes : 00000000, buf : 00000000
>> [ 86.830000] [ 2 ] PIO[0] : 038010da
>> [ 86.840000] [ 2 ] PIO[1] : 00000000
>> [ 86.840000] [ 2 ] PIO[2] : 00000000
>> [ 86.840000] ------------------------DMA DUMP END ------------
>> [ 86.850000] [ gpmi_show_regs : 076 ] -------------- Show GPMI
>> registers ----------
>> [ 86.860000] [ gpmi_show_regs : 079 ] offset 0x000 : 0x238010da
>> [ 86.870000] [ gpmi_show_regs : 079 ] offset 0x010 : 0x00000000
>> [ 86.870000] [ gpmi_show_regs : 079 ] offset 0x020 : 0x000011ff
>> [ 86.880000] [ gpmi_show_regs : 079 ] offset 0x030 : 0x000010da
>> [ 86.890000] [ gpmi_show_regs : 079 ] offset 0x040 : 0x40f0c480
>> [ 86.890000] [ gpmi_show_regs : 079 ] offset 0x050 : 0x40f09000
>> [ 86.900000] [ gpmi_show_regs : 079 ] offset 0x060 : 0x0004000c
>> [ 86.910000] [ gpmi_show_regs : 079 ] offset 0x070 : 0x00010203
>> [ 86.910000] [ gpmi_show_regs : 079 ] offset 0x080 : 0x05000000
>> [ 86.920000] [ gpmi_show_regs : 079 ] offset 0x090 : 0x09020101
>> [ 86.920000] [ gpmi_show_regs : 079 ] offset 0x0a0 : 0x00000030
>> [ 86.930000] [ gpmi_show_regs : 079 ] offset 0x0b0 : 0x80000010
>> [ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0c0 : 0x100000ba
>> [ 86.940000] [ gpmi_show_regs : 079 ] offset 0x0d0 : 0x03000000
>> [ 86.950000] [ gpmi_show_regs : 081 ] -------------- Show GPMI
>> registers end ----------
>> [ 86.960000] Kernel panic - not syncing: -----------DMA
>> FAILED------------------
>
> Please post the functions-calling stack when the panic occurs.
> I also want to know where the code is running when the time-out occur.
Oops, here it is:
[ 86.960000] Kernel panic - not syncing: -----------DMA
FAILED------------------
[ 86.970000] [<c03a19b0>] (unwind_backtrace+0x0/0xf0) from
[<c0635be4>] (panic+0x58/0x18c)
[ 86.980000] [<c0635be4>] (panic+0x58/0x18c) from [<c057ebd0>]
(start_dma_without_bch_irq+0x9c/0xb4)
[ 86.990000] [<c057ebd0>] (start_dma_without_bch_irq+0x9c/0xb4) from
[<c057ec14>] (start_dma_with_bch_irq+0x2c/0x78)
[ 87.000000] [<c057ec14>] (start_dma_with_bch_irq+0x2c/0x78) from
[<c057f86c>] (gpmi_read_page+0x160/0x1cc)
[ 87.010000] [<c057f86c>] (gpmi_read_page+0x160/0x1cc) from
[<c057e33c>] (mil_ecc_read_page+0x64/0x1d8)
[ 87.020000] [<c057e33c>] (mil_ecc_read_page+0x64/0x1d8) from
[<c05795a0>] (nand_do_read_ops+0x1d8/0x468)
[ 87.030000] [<c05795a0>] (nand_do_read_ops+0x1d8/0x468) from
[<c0579ba4>] (nand_read+0x94/0xb0)
[ 87.040000] [<c0579ba4>] (nand_read+0x94/0xb0) from [<c0573768>]
(part_read+0x60/0xe4)
[ 87.050000] [<c0573768>] (part_read+0x60/0xe4) from [<c05751f0>]
(mtd_read+0xd8/0x20c)
[ 87.060000] [<c05751f0>] (mtd_read+0xd8/0x20c) from [<c0442890>]
(vfs_read+0xb0/0x180)
[ 87.060000] [<c0442890>] (vfs_read+0xb0/0x180) from [<c04429a0>]
(sys_read+0x40/0x70)
[ 87.070000] [<c04429a0>] (sys_read+0x40/0x70) from [<c039c780>]
(ret_fast_syscall+0x0/0x2c)
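Since the mail client wrapped the backtrace, it is a bit hard to read. A
rough Python sketch that pulls the function names out of such a trace
(innermost frame first) is below. The regular expression and the helper name
are my own and only handle the "(func+0xoff/0xsize)" form seen here.

```python
import re

# Matches "(function_name+0xOFFSET/0xSIZE" as printed by the ARM unwinder.
# The closing parenthesis is deliberately not required, because a pasted
# trace may be cut off at the end of the last line.
FRAME = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_chain(trace_text):
    """Return function names in trace order, dropping consecutive repeats."""
    chain = []
    for name in FRAME.findall(trace_text):
        if not chain or chain[-1] != name:
            chain.append(name)
    return chain


if __name__ == "__main__":
    sample = """\
[ 86.970000] [<c03a19b0>] (unwind_backtrace+0x0/0xf0) from
[<c0635be4>] (panic+0x58/0x18c)
[ 86.980000] [<c0635be4>] (panic+0x58/0x18c) from [<c057ebd0>]
(start_dma_without_bch_irq+0x9c/0xb4)
"""
    print(" <- ".join(call_chain(sample)))
```

Applied to the whole trace above, the chain runs from sys_read through
mtd_read, nand_read and gpmi_read_page down to start_dma_with_bch_irq and
start_dma_without_bch_irq, so the timeout is hit while reading a page.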
>
> My gmail is shijie8 at gmail.com
> We may talk more on the gtalk.
I'm available at koen.beel.barco at gmail.com
>
> Best Regards
> Huang Shijie
>
>> Br,
>> Koen
>>
>>>
>>> Please try the new patch.
>>>
>>> Best Regards
>>> Huang Shijie
>>>>
>>>> Then it prints some debug info on channel 1 (ssp1) and then alle
>>>> channel 2 register except the debug register (ssp2 = not used here).
>>>>
>>>> What info do you need?
>>>>
>>>> Br,
>>>> Koen
>>>>
>>>>> Best Regards
>>>>> Huang Shijie
>>>>>
>>>>
>>>
>>>
>>>
>
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 9:12 ` Huang Shijie
@ 2011-08-09 9:19 ` Wolfram Sang
2011-08-09 10:41 ` Huang Shijie
0 siblings, 1 reply; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09 9:19 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
> >
> >ecclayout needs to be used to show that OOB is fully in use [1]
> >===============================================================
> >
> >Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> >working with UBIFS is surely not ready for mainline.
> >
> It seems just modifying the ecclayout of GPMI-NAND can not fix the problem.
> It should also change the code of JFFS2 and mtd.
>
> Some one ever posted a patch about this:
> http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html
This mail is about MLC. That's another issue. For SLC, ecclayout should
work, no?
Regards,
Wolfram
--
Pengutronix e.K. | Wolfram Sang |
Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 6:21 ` Huang Shijie
2011-08-08 9:19 ` Koen Beel
@ 2011-08-09 9:35 ` Wolfram Sang
2011-08-09 10:54 ` Huang Shijie
1 sibling, 1 reply; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09 9:35 UTC (permalink / raw)
To: linux-arm-kernel
> >DMA timeouts [1]
> >================
> >
> >[ 2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
> >[ 3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
> >
> >Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
> >by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
> After I used a different .config, it never appears in my side.
So, you have a config which triggers this? That should be useful for
debugging. What do you need to enable to see this?
> Please test the driver again when you back to office.
> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
> Your mxs_defconfig may miss Shawn Guo's patches.
I have all the correct patches; I triple-checked that. Regarding the
config: I am not looking for a config that works, I want my config to
work. Meanwhile, I have the feeling that this is a bug in the DMA driver
(because Aisheng Dong has DMA problems, too, in the audio path). Still,
we need to be sure.
> >problem overwriting all-0xff data in NAND [2]
> >=============================================
> >
> >Although it occured only when writing JFFS2 images so far, this is a generic
> >issue and needs to be fixed, right?
> >
> Artem said it should not change the driver, but the upper layer(jffs2).
>
> So I think i do not need to change the driver.
OK, read it again, got it now. Agreed.
> >* custom sysfs-entries
> My sysfs-entries is in the GPMI-NAND directory.
> Does be a mainline driver means I should not have any sysfs-entries?
> If it does, i can remove it.
It is some kind of ABI, so we would have to support them forever. If
there is no really strong reason to have them, it is better to remove
them.
>
> >* custom kernel command line parameters
> The kernel command line 'gpmi_nand' is to avoid the conflict with
> other modules such as
> SD.
>
> If it's be removed, I have to use different config to resolve the
> issue which is not better either. :(
This is a board-specific issue, so you should handle this at
board-level, not at driver level.
> >* namespacing (some functions have no prefix, some have "mil_", some have mx23)
> > (I think 'mil' means 'mtd interface layer', but why is that needed?)
> The mil is used to make the gpmi_nand_data{} simple.
> Without it, the gpmi_nand_data{} will very big.
>
> The functions which have mx23 prefix are only used in mx23.
> The functions which have no prefix can used in both mx28 and mx23.
I understood this, but I wonder whether the mx23_*-specific code has to be
in the main driver. I will have a closer look at the driver this week; then
I can say more.
> >Complexity
> >==========
> >
> >The driver is not easy to review. I wonder if it makes sense to use incremental
> >patches for it? maybe making it a staging driver could be a solution for that?
> Frankly speaking, the current driver is maybe the smallest version now.
>
> I even do not add the on-chip BBT feature now.
OK.
> >Huang, are you interested in accepting patches or do you prefer we just point
> >at certain code and you then fix it? Starting with a simpler driver and then
> Feel free to mail me the patch. it's welcome.
We'd need a branch somewhere for that, so we have a history.
> >adding stuff might be another option if we can't chase all the bugs in the
> >current driver.
> >
> >That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
> >meet in IRC or something to work that out? Is there interest in that?
> What about gtalk?
Definitely not my favourite, but it seems Koen and you already use it.
I might try it...
Regards,
Wolfram
--
Pengutronix e.K. | Wolfram Sang |
Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-08 9:19 ` Koen Beel
2011-08-08 10:37 ` Huang Shijie
2011-08-09 5:11 ` Huang Shijie
@ 2011-08-09 9:45 ` Wolfram Sang
2 siblings, 0 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09 9:45 UTC (permalink / raw)
To: linux-arm-kernel
> Could also be useful to make sure we test on the same HW as much as
> possible and are using the same source tree.
> HW I have:
> - mx23evk rev C1
> - mx23evk rev B2
> - own target hw using mx23 lqfp-128 chip and different type of ddr and nand.
Actively used:
FSL MX28EVK Rev. D
Karo TX28-4020 with STK5
Could get:
two custom MX23 boards (though I don't think this will make a
difference)
Regards,
Wolfram
--
Pengutronix e.K. | Wolfram Sang |
Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 9:19 ` Wolfram Sang
@ 2011-08-09 10:41 ` Huang Shijie
2011-08-09 11:36 ` Lothar Waßmann
0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 10:41 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
> Hi,
>
>>> ecclayout needs to be used to show that OOB is fully in use [1]
>>> ===============================================================
>>>
>>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
>>> working with UBIFS is surely not ready for mainline.
>>>
>> It seems just modifying the ecclayout of GPMI-NAND can not fix the problem.
>> It should also change the code of JFFS2 and mtd.
>>
>> Some one ever posted a patch about this:
>> http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html
> This mail is about MLC. That's another issue. For SLC, ecclayout should
> work, no?
The problem is that the GPMI/BCH itself uses the OOB. Please see page 1263,
section 16.2.2, of the mx28 datasheet, which shows the layout of a NAND page.
You can see that the OOB is used for storing data or ECC.
Some space is left over even when I enable the BCH, and it seems this
leftover space could hold the JFFS2 clean marker. That needs to be
confirmed. :)
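To get a feeling for how much space might be left, here is a back-of-the-
envelope Python sketch. Everything in it is an assumption for illustration
and not taken from the driver or confirmed in this thread: 512-byte ECC
chunks, 13 parity bits per bit of ECC strength, 10 metadata bytes, and a
hypothetical NAND with 4096-byte pages and 218 OOB bytes using ECC strength
8. The function name is made up.

```python
def bch_free_oob_bytes(page_size, oob_size, ecc_strength,
                       chunk_size=512, metadata_size=10, gf_len=13):
    """Estimate the bytes left in a NAND page after data, metadata and parity.

    All default parameters are assumptions; check them against the
    datasheet and the actual BCH configuration before relying on the result.
    """
    chunks = page_size // chunk_size
    parity_bits = gf_len * ecc_strength * chunks
    parity_bytes = (parity_bits + 7) // 8  # round up to whole bytes
    used = page_size + metadata_size + parity_bytes
    return page_size + oob_size - used


if __name__ == "__main__":
    # Hypothetical chip: 4096-byte page, 218-byte OOB, ECC strength 8.
    print(bch_free_oob_bytes(4096, 218, 8))
```

Whether such leftover bytes are actually reachable for the clean marker
depends on where the BCH places them and on what the ecclayout exports, so
this does not replace the confirmation mentioned above.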
thanks
Huang Shijie
> Regards,
>
> Wolfram
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 9:35 ` Wolfram Sang
@ 2011-08-09 10:54 ` Huang Shijie
2011-08-09 20:42 ` Wolfram Sang
0 siblings, 1 reply; 33+ messages in thread
From: Huang Shijie @ 2011-08-09 10:54 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
>>> DMA timeouts [1]
>>> ================
>>>
>>> [ 2.560000] [ start_dma_without_bch_irq : 392 ] DMA timeout, last DMA :1
>>> [ 3.560000] [ start_dma_with_bch_irq : 427 ] bch timeout!!!
>>>
>>> Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
>>> by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
>> After I used a different .config, it never appears in my side.
> So, you have a config which triggers this? That should be useful for
> debugging. What do you need to enable to see this?
>
I made my old config myself. I think it was a wrong config,
and it differed too much from the config generated by 'make mxs_defconfig'.
So I think it is of no use for debugging.
>> Please test the driver again when you back to office.
>> Pay attention to your version of /arch/arm/configs/mxs_defconfig.
>> Your mxs_defconfig may miss Shawn Guo's patches.
> I have all the correct patches, I triple-checked that. Regarding the
> config, I am not looking for a config that works, I want my config to
> work. I meanwhile have the feeling this is a bug in the DMA driver
> (because Aisheng Dong has DMA problems, too, in the audio path), still
> we need to be sure.
>
It's not a DMA bug. I discussed it with Koen and made sure that the bug is
caused by the GPMI or BCH.
It's a different bug from Aisheng Dong's bug.
>>> problem overwriting all-0xff data in NAND [2]
>>> =============================================
>>>
>>> Although it occured only when writing JFFS2 images so far, this is a generic
>>> issue and needs to be fixed, right?
>>>
>> Artem said it should not change the driver, but the upper layer(jffs2).
>>
>> So I think i do not need to change the driver.
> OK, read it again, got it now. Agreed.
>
>
>>> * custom sysfs-entries
>> My sysfs-entries is in the GPMI-NAND directory.
>> Does be a mainline driver means I should not have any sysfs-entries?
>> If it does, i can remove it.
> It is some kund of ABI, so we would have to support them forever. If
> there is no really strong reason to have them, it is better to remove
> them.
>
ok, thanks.
>>> * custom kernel command line parameters
>> The kernel command line 'gpmi_nand' is to avoid the conflict with
>> other modules such as
>> SD.
>>
>> If it's be removed, I have to use different config to resolve the
>> issue which is not better either. :(
> This is a board-specific issue, so you should handle this at
> board-level, not at driver level.
>
I wish to handle it at the board level.
But I have no idea how to solve the conflict between GPMI and SD. :(
Could you give me a hint?
thanks
>>> * namespacing (some functions have no prefix, some have "mil_", some have mx23)
>>> (I think 'mil' means 'mtd interface layer', but why is that needed?)
>> The mil is used to make the gpmi_nand_data{} simple.
>> Without it, the gpmi_nand_data{} will very big.
>>
>> The functions which have mx23 prefix are only used in mx23.
>> The functions which have no prefix can used in both mx28 and mx23.
> I understood this, but wonder if mx23_* specific stuff has to be in the
> main driver. Will have a closer look to the driver this week, then I can
> say more.
>
thanks
>>> Complexity
>>> ==========
>>>
>>> The driver is not easy to review. I wonder if it makes sense to use incremental
>>> patches for it? maybe making it a staging driver could be a solution for that?
>> Frankly speaking, the current driver is maybe the smallest version now.
>>
>> I even do not add the on-chip BBT feature now.
> OK.
>
>>> Huang, are you interested in accepting patches or do you prefer we just point
>>> at certain code and you then fix it? Starting with a simpler driver and then
>> Feel free to mail me the patch. it's welcome.
> We'd need a branch somewhere for that, so we have a history.
>
ok.
I will try to find some solution.
>>> adding stuff might be another option if we can't chase all the bugs in the
>>> current driver.
>>>
>>> That being said, I'd think fixing the DMA issue has prio #1 and maybe we can
>>> meet in IRC or something to work that out? Is there interest in that?
>> What about gtalk?
> Definately not my favourite, but seems like Koen and you already use it.
> Might try it...
I can use IRC too.
Huang Shijie
> Regards,
>
> Wolfram
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 10:41 ` Huang Shijie
@ 2011-08-09 11:36 ` Lothar Waßmann
0 siblings, 0 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-09 11:36 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
Huang Shijie writes:
> Hi,
> > Hi,
> >
> >>> ecclayout needs to be used to show that OOB is fully in use [1]
> >>> ===============================================================
> >>>
> >>> Needed to make it work for JFFS2 and to pass the mtd-testsuite. A driver only
> >>> working with UBIFS is surely not ready for mainline.
> >>>
> >> It seems just modifying the ecclayout of GPMI-NAND can not fix the problem.
> >> It should also change the code of JFFS2 and mtd.
> >>
> >> Some one ever posted a patch about this:
> >> http://lists.infradead.org/pipermail/linux-mtd/2007-December/020047.html
> > This mail is about MLC. That's another issue. For SLC, ecclayout should
> > work, no?
>
> The matter is that the GPMI/BCH will use the OOB. Please see the page 1263
> in 16.2.2 of mx28's datasheet. It shows an NAND PAGE's layout.
>
> You may see that the OOB will be used for storing the DATA or ECC.
> There will be left some space even when i enable the BCH, it seems the
> left space can
> contains the jffs2's clean marker. It needs to be confirmed. :)
>
It possibly could be written there, if the spare area had a separate
ECC independent from the first data block. With the current ECC
layout (combined ECC for spare area and first block), writing the
clean marker would change the ECC for the first data block to non-FF
so that writing the first block later on would again result in ECC
errors.
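The effect described above can be sketched with a small simulation. This is only an illustration under stated assumptions: `toy_ecc` is a one-byte byte-sum stand-in and not the BCH code the controller computes, the clean-marker bytes and block sizes are hypothetical, and NAND programming is modelled as a bitwise AND because a program operation can only clear bits.

```python
def program(cell: int, value: int) -> int:
    """Programming NAND can only clear bits (1 -> 0), so the result is an AND."""
    return cell & value


def toy_ecc(data: bytes) -> int:
    """Toy one-byte check code (byte sum mod 256). A stand-in, NOT real BCH."""
    return sum(data) & 0xFF


MARKER = bytes([0x85, 0x19, 0x03, 0x20])  # hypothetical clean-marker bytes
ERASED_BLOCK = b"\xff" * 12               # first data block, still erased
DATA = bytes(range(12))                   # data written to that block later


def block0_ecc_ok(combined: bool) -> bool:
    """Write the marker first and the data block later; report whether the
    ECC cell ends up holding the code that matches the final contents."""
    ecc_cell = 0xFF  # erased
    if combined:
        # One ECC covers spare area + first block, so writing only the
        # marker already programs a non-FF code into the ECC cell.
        ecc_cell = program(ecc_cell, toy_ecc(MARKER + ERASED_BLOCK))
    # With a separate spare-area ECC (not modelled here), this cell would
    # still be erased when the first block is finally written.
    wanted = toy_ecc(MARKER + DATA) if combined else toy_ecc(DATA)
    ecc_cell = program(ecc_cell, wanted)
    return ecc_cell == wanted


print("combined ECC ok:", block0_ecc_ok(combined=True))
print("separate ECC ok:", block0_ecc_ok(combined=False))
```

In the combined case the second program operation cannot set bits that the first one cleared, so the stored code no longer matches the data.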
Lothar Waßmann
--
___________________________________________________________
Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
www.karo-electronics.de | info at karo-electronics.de
___________________________________________________________
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-09 10:54 ` Huang Shijie
@ 2011-08-09 20:42 ` Wolfram Sang
0 siblings, 0 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-09 20:42 UTC (permalink / raw)
To: linux-arm-kernel
> >>>Always reproducible by me when trying to format mtd0. Sometimes(always?) seen
> >>>by Koen during boot (on read?). Never seen by Huang? It is currently unclear if
> >>After I used a different .config, it never appears in my side.
> >So, you have a config which triggers this? That should be useful for
> >debugging. What do you need to enable to see this?
> >
> My old config is made by myself. I think it was a wrong config,
Let's see what the bug is in the end, but I don't think a config could be
"wrong" in a way to trigger such a bug. Even misconfiguration can be handled
gracefully with code.
> it's not a DMA bug, I discuss with Koen, and make sure that the bug is
> caused by the GPMI or BCH.
Did you get any further during this day?
> it's a different bug from Aisheng Dong's bug.
Okay, I had a look at this one today.
> >>>* custom kernel command line parameters
> >>The kernel command line 'gpmi_nand' is to avoid the conflict with
> >>other modules such as
> >>SD.
> >>
> >>If it's be removed, I have to use different config to resolve the
> >>issue which is not better either. :(
> >This is a board-specific issue, so you should handle this at
> >board-level, not at driver level.
> >
> I wish to handle it at the board level.
>
> But I have no idea how to solve the conflict between GPMI and SD. :(
>
> Could you give me some hint?
For starters, you could move the kernel parameter to the board file and create
the devices as needed depending on it:
	if (gpmi_nand)
		mx28_add_gpmi_nand(&mx28evk_gpmi_nand_data);
	else
		mx28_add_mxs_mmc(1, &mx28evk_mmc_pdata[1]);
or something similar, unless I am missing something. This is probably not the
best solution either, but at least it keeps the driver free of board
configuration.
Regards,
Wolfram
--
Pengutronix e.K. | Wolfram Sang |
Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
2011-08-08 6:21 ` Huang Shijie
2011-08-08 9:12 ` Huang Shijie
@ 2011-08-14 8:11 ` Ivan Djelic
2011-08-14 18:31 ` Wolfram Sang
` (2 more replies)
2 siblings, 3 replies; 33+ messages in thread
From: Ivan Djelic @ 2011-08-14 8:11 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
(...)
>
> problem overwriting all-0xff data in NAND [2]
> =============================================
>
> Although it occured only when writing JFFS2 images so far, this is a generic
> issue and needs to be fixed, right?
>
>
(...)
> [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
As explained in the thread linked above, this issue should be fixed in your
flashing tool, _not_ in your driver. The nand device you are using does not
support programming pages multiple times in a row; pretending it does in the
special all-0xff case is inefficient (you need to detect all-0xff data) and
unnecessary (just do not program blank pages !).
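A flashing tool that follows this advice could look roughly like the sketch below. The function name `pages_to_program` and the page size used in the example are illustrative assumptions, not part of mtd-utils or U-Boot; the only point is that pages consisting entirely of 0xFF are never handed to the driver.

```python
from typing import Iterator, Tuple


def pages_to_program(image: bytes, page_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, page) for every page of the image that is not blank.

    Blank (all-0xFF) pages are skipped, so the corresponding flash pages
    stay in their erased state and can still be programmed once later.
    """
    blank = b"\xff" * page_size
    for offset in range(0, len(image), page_size):
        # Pad a short final page with 0xFF so that it is page aligned.
        page = image[offset:offset + page_size].ljust(page_size, b"\xff")
        if page != blank:
            yield offset, page


# Example: three 4-byte pages, the middle one is blank padding.
image = b"\x00\x01\x02\x03" + b"\xff" * 4 + b"\x12\xff\xff\xff"
for offset, page in pages_to_program(image, page_size=4):
    print(hex(offset), page.hex())
```

The all-0xFF comparison costs CPU time too, but it is paid once in the flasher rather than on every page write in the driver.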
BR,
Ivan
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-14 8:11 ` Ivan Djelic
@ 2011-08-14 18:31 ` Wolfram Sang
2011-08-15 5:41 ` Lothar Waßmann
2011-08-15 16:22 ` Artem Bityutskiy
2 siblings, 0 replies; 33+ messages in thread
From: Wolfram Sang @ 2011-08-14 18:31 UTC (permalink / raw)
To: linux-arm-kernel
> > problem overwriting all-0xff data in NAND [2]
> > =============================================
> >
> > Although it occured only when writing JFFS2 images so far, this is a generic
> > issue and needs to be fixed, right?
> >
> >
> (...)
> > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
>
> As explained in the thread linked above, this issue should be fixed in your
Yup, Huang pointed me already to the part I got wrong.
Thanks.
--
Pengutronix e.K. | Wolfram Sang |
Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-14 8:11 ` Ivan Djelic
2011-08-14 18:31 ` Wolfram Sang
@ 2011-08-15 5:41 ` Lothar Waßmann
2011-08-15 6:30 ` Lin Tony-B19295
` (2 more replies)
2011-08-15 16:22 ` Artem Bityutskiy
2 siblings, 3 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-15 5:41 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
Ivan Djelic writes:
> On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> (...)
> >
> > problem overwriting all-0xff data in NAND [2]
> > =============================================
> >
> > Although it occured only when writing JFFS2 images so far, this is a generic
> > issue and needs to be fixed, right?
> >
> >
> (...)
> > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
>
> As explained in the thread linked above, this issue should be fixed in your
> flashing tool, _not_ in your driver. The nand device you are using does not
> support programming pages multiple times in a row; pretending it does in the
>
It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
problem is that the controller generates an ECC code that is non-FF
for all-FF data, which JFFS2 cannot handle properly.
Lothar Waßmann
--
___________________________________________________________
Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
www.karo-electronics.de | info at karo-electronics.de
___________________________________________________________
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-15 5:41 ` Lothar Waßmann
@ 2011-08-15 6:30 ` Lin Tony-B19295
2011-08-15 8:41 ` Ivan Djelic
2011-08-15 8:29 ` Ivan Djelic
2011-08-15 16:18 ` Artem Bityutskiy
2 siblings, 1 reply; 33+ messages in thread
From: Lin Tony-B19295 @ 2011-08-15 6:30 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> From: linux-arm-kernel-bounces at lists.infradead.org [mailto:linux-arm-
> kernel-bounces at lists.infradead.org] On Behalf Of Lothar Waßmann
> Sent: Monday, August 15, 2011 1:41 PM
> To: Ivan Djelic
> Cc: Koen Beel; Wolfram Sang; Huang Shijie-B32955; linux-
> mtd at lists.infradead.org; Shawn Guo; linux-arm-kernel at lists.infradead.org
> Subject: Re: GPMI-NAND Status?
>
> Hi,
>
> Ivan Djelic writes:
> > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > (...)
> > >
> > > problem overwriting all-0xff data in NAND [2]
> > > =============================================
> > >
> > > Although it occured only when writing JFFS2 images so far, this is a
> > > generic issue and needs to be fixed, right?
> > >
> > >
> > (...)
> > > [2]
> > > http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> >
> > As explained in the thread linked above, this issue should be fixed in
> > your flashing tool, _not_ in your driver. The nand device you are
> > using does not support programming pages multiple times in a row;
> > pretending it does in the
> >
> It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> problem is that the controller generates an ECC code that is non-FF for
> all-FF data, which JFFS2 cannot handle properly.
>
As far as I know, this is a limitation of the BCH algorithm for ECC (non-FF
ECC code for all-FF data). So the BCH engine will ignore the ECC error when
all data are 0xFF. That is how BCH is used for ECC.
Under such conditions, I think it is JFFS2 that should handle this case, not
BCH. More and more SoCs are using BCH for NAND ECC, so JFFS2 cannot escape
this problem if it is not changed.
>
> Lothar Waßmann
> --
> ___________________________________________________________
>
> Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
> Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
> Geschäftsführer: Matthias Kaussen
> Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
>
> www.karo-electronics.de | info at karo-electronics.de
> ___________________________________________________________
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-15 5:41 ` Lothar Waßmann
2011-08-15 6:30 ` Lin Tony-B19295
@ 2011-08-15 8:29 ` Ivan Djelic
2011-08-15 9:31 ` Lothar Waßmann
2011-08-15 16:18 ` Artem Bityutskiy
2 siblings, 1 reply; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15 8:29 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Aug 15, 2011 at 06:41:23AM +0100, Lothar Waßmann wrote:
> Hi,
>
> Ivan Djelic writes:
> > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > (...)
> > >
> > > problem overwriting all-0xff data in NAND [2]
> > > =============================================
> > >
> > > Although it occured only when writing JFFS2 images so far, this is a generic
> > > issue and needs to be fixed, right?
> > >
> > >
> > (...)
> > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> >
> > As explained in the thread linked above, this issue should be fixed in your
> > flashing tool, _not_ in your driver. The nand device you are using does not
> > support programming pages multiple times in a row; pretending it does in the
> >
> It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> problem is that the controller generates an ECC code that is non-FF
> for all-FF data, which JFFS2 cannot handle properly.
JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
pages and then reprogram them on a NAND flash device. Your flashing method does.
If your BCH controller allows it, you could XOR the computed ECC bytes with a
specific mask to make sure all-FF data have all-FF ecc. This is useful to allow
reading erased blocks with ecc correction enabled.
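As a rough sketch of the mask trick: the code below uses CRC-32 as a stand-in for the controller's ECC, which is an assumption made only to keep the example self-contained, since real BCH parity is computed differently. The names `raw_ecc`, `masked_ecc` and `check` and the toy page size are likewise hypothetical. The mask is whatever has to be XORed onto the code of an all-FF page to turn it into all-FF.

```python
import zlib

PAGE_SIZE = 16  # toy page size, for illustration only
ECC_BYTES = 4


def raw_ecc(data: bytes) -> bytes:
    """Stand-in check code (CRC-32); real hardware would compute BCH parity."""
    return zlib.crc32(data).to_bytes(ECC_BYTES, "big")


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Chosen so that an erased (all-FF) page gets an all-FF code.
MASK = xor(raw_ecc(b"\xff" * PAGE_SIZE), b"\xff" * ECC_BYTES)


def masked_ecc(data: bytes) -> bytes:
    """Code as it would be stored in flash: raw code XOR constant mask."""
    return xor(raw_ecc(data), MASK)


def check(data: bytes, stored_ecc: bytes) -> bool:
    """Undo the mask before comparing with the recomputed code."""
    return xor(stored_ecc, MASK) == raw_ecc(data)


erased = b"\xff" * PAGE_SIZE
print(masked_ecc(erased).hex())                      # all-FF by construction
print(check(erased, b"\xff" * ECC_BYTES))            # erased page passes
print(check(b"hello world 1234", masked_ecc(b"hello world 1234")))
```

The mask depends on the page size and the code in use, and it only makes erased pages readable with correction enabled. It does not lift the limit on partial page programming.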
But even so, you cannot work around the fact that NAND devices are different
from NOR devices, in that they typically allow only a limited number of partial
page programming operations (4 in your K9F1G08U0B).
If you implemented the mask trick described above and used it to allow
multiple page programming, you still would not track the number of partial
program operations on a given page, and expose yourself to nasty bugs (when
exceeding the number of specified partial operations); i.e. it could work on
some devices for a few operations, but not reliably on all devices for any
number of empty page programmings.
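The bookkeeping problem can be made concrete with a toy model. Everything here is an assumption for illustration: the `Page` class, the 8-byte page, and the exception are hypothetical, and a real device does not report an error when the number of partial programs is exceeded but may silently corrupt data. The limit of 4 matches the figure quoted above for the K9F1G08U0B.

```python
class NopExceeded(Exception):
    """Raised by the model when the partial-program limit is exceeded."""


class Page:
    def __init__(self, size: int = 8, nop: int = 4) -> None:
        self.cells = bytearray(b"\xff" * size)  # erased state
        self.nop = nop                          # allowed program operations
        self.programs = 0

    def program(self, offset: int, data: bytes) -> None:
        # Every program operation counts, even one that writes only 0xFF.
        if self.programs >= self.nop:
            raise NopExceeded(f"more than {self.nop} programs on one page")
        self.programs += 1
        for i, value in enumerate(data):
            self.cells[offset + i] &= value     # bits can only be cleared


def all_user_writes_succeed(blank_write_first: bool) -> bool:
    """Four 2-byte user writes, optionally preceded by a blank-page write."""
    page = Page()
    try:
        if blank_write_first:
            page.program(0, b"\xff" * 8)        # the flasher's empty page
        for i in range(4):
            page.program(2 * i, bytes([i, i]))
    except NopExceeded:
        return False
    return True


print(all_user_writes_succeed(blank_write_first=False))
print(all_user_writes_succeed(blank_write_first=True))
```

The blank-page write leaves the cell contents untouched but still uses up one of the four allowed operations, which is why the last user write fails in the second case.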
So the only real possibility is to avoid programming (physically) a page when
its target contents are empty (all-FF); this is not implemented at the driver
level because:
- it is useless: none of the existing filesystems need this "feature"
- it would waste cpu cycles to check if target data is all-FF each time a page
is programmed
Therefore... it is simply a matter of avoiding empty page programming, which
only happens in your flasher. See also the flashing guidelines [1] as per Artem
suggestion.
BR,
Ivan
[1] http://www.linux-mtd.infradead.org/doc/ubi.html#L_flasher_algo
>
>
> Lothar Waßmann
> --
> ___________________________________________________________
>
> Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
> Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
> Geschäftsführer: Matthias Kaussen
> Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
>
> www.karo-electronics.de | info at karo-electronics.de
> ___________________________________________________________
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-15 6:30 ` Lin Tony-B19295
@ 2011-08-15 8:41 ` Ivan Djelic
0 siblings, 0 replies; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15 8:41 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Aug 15, 2011 at 07:30:58AM +0100, Lin Tony-B19295 wrote:
> > -----Original Message-----
> > From: linux-arm-kernel-bounces at lists.infradead.org [mailto:linux-arm-
> > kernel-bounces at lists.infradead.org] On Behalf Of Lothar Waßmann
> > Sent: Monday, August 15, 2011 1:41 PM
> > To: Ivan Djelic
> > Cc: Koen Beel; Wolfram Sang; Huang Shijie-B32955; linux-
> > mtd at lists.infradead.org; Shawn Guo; linux-arm-kernel at lists.infradead.org
> > Subject: Re: GPMI-NAND Status?
> >
> > Hi,
> >
> > Ivan Djelic writes:
> > > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > > (...)
> > > >
> > > > problem overwriting all-0xff data in NAND [2]
> > > > =============================================
> > > >
> > > > Although it occured only when writing JFFS2 images so far, this is a
> > > > generic issue and needs to be fixed, right?
> > > >
> > > >
> > > (...)
> > > > [2]
> > > > http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > >
> > > As explained in the thread linked above, this issue should be fixed in
> > > your flashing tool, _not_ in your driver. The nand device you are
> > > using does not support programming pages multiple times in a row;
> > > pretending it does in the
> > >
> > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > problem is that the controller generates an ECC code that is non-FF for
> > all-FF data, which JFFS2 cannot handle properly.
> >
> As I know, this is BCH algorithm limitation for ECC.(non-FF ECC code for all-FF data)
Not a BCH algorithm limitation: you can always XOR BCH bytes to get all-FF ECC
code for all-FF data. It is equivalent to simply adding a particular
polynomial. See drivers/mtd/nand/nand_bch.c:62 for an example.
Not being able to do so is a hardware limitation. For instance, the OMAP35xx
BCH engine allows this masking operation.
BR,
Ivan
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-15 8:29 ` Ivan Djelic
@ 2011-08-15 9:31 ` Lothar Waßmann
2011-08-15 12:54 ` Ivan Djelic
2011-08-15 16:34 ` Artem Bityutskiy
0 siblings, 2 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-15 9:31 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
Ivan Djelic writes:
> On Mon, Aug 15, 2011 at 06:41:23AM +0100, Lothar Waßmann wrote:
> > Hi,
> >
> > Ivan Djelic writes:
> > > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > > (...)
> > > >
> > > > problem overwriting all-0xff data in NAND [2]
> > > > =============================================
> > > >
> > > > Although it occured only when writing JFFS2 images so far, this is a generic
> > > > issue and needs to be fixed, right?
> > > >
> > > >
> > > (...)
> > > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > >
> > > As explained in the thread linked above, this issue should be fixed in your
> > > flashing tool, _not_ in your driver. The nand device you are using does not
> > > support programming pages multiple times in a row; pretending it does in the
> > >
> > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > problem is that the controller generates an ECC code that is non-FF
> > for all-FF data, which JFFS2 cannot handle properly.
>
> JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> pages and then reprogram them on a NAND flash device. Your flashing method does.
>
AFAICT JFFS2 checks the flash for areas that contain only FF and
treats them like erased flash. At least it tries to overwrite such
areas in flash without erasing them beforehand.
I avoided the problem by creating JFFS2 images that are padded with
0xff to page size only instead of eraseblock size.
> If your BCH controller allows it, you could XOR the computed ECC bytes with a
> specific mask to make sure all-FF data have all-FF ecc. This is useful to allow
> reading erased blocks with ecc correction enabled.
>
> But even so, you cannot work around the fact that NAND devices are different
> from NOR devices, in that they typically allow only a limited number of partial
> page programming operations (4 in your K9F1G08U0B).
> If you implemented the mask trick described above and used it to allow
> multiple page programming, you still would not track the number of partial
> program operations on a given page, and expose yourself to nasty bugs (when
> exceeding the number of specified partial operations); i.e. it could work on
> some devices for a few operations, but not reliably on all devices for any
> number of empty page programmings.
>
> So the only real possibility is to avoid programming (physically) a page when
> its target contents are empty (all-FF); this is not implemented at the driver
> level because:
> - it is useless: none of the existing filesystems need this "feature"
> - it would waste cpu cycles to check if target data is all-FF each time a page
> is programmed
>
> Therefore... it is simply a matter of avoiding empty page programming, which
> only happens in your flasher. See also the flashing guidelines [1] as per Artem
> suggestion.
>
"Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
file and the U-Boot commands 'nand erase/nand write' or
the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.
Lothar Waßmann
--
___________________________________________________________
Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
www.karo-electronics.de | info at karo-electronics.de
___________________________________________________________
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-15 9:31 ` Lothar Waßmann
@ 2011-08-15 12:54 ` Ivan Djelic
2011-08-15 13:37 ` Lothar Waßmann
2011-08-15 16:34 ` Artem Bityutskiy
1 sibling, 1 reply; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15 12:54 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Aug 15, 2011 at 10:31:34AM +0100, Lothar Waßmann wrote:
> > > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > > problem is that the controller generates an ECC code that is non-FF
> > > for all-FF data, which JFFS2 cannot handle properly.
> >
> > JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> > pages and then reprogram them on a NAND flash device. Your flashing method does.
> >
> AFAICT JFFS2 checks the flash for areas that contain only FF and
> treats them like erased flash. At least it tries to overwrite such
> areas in flash without erasing it beforehand.
Hmmm, again, JFFS2 simply "writes" to an erased block. You say "it tries to
overwrite", but that's only because you programmed an empty page in the first
place! And it cannot "erase such areas beforehand" (think of a partially
programmed block with trailing empty pages).
> I avoided the problem by creating JFFS2 images that are padded with
> 0xff to page size only instead of eraseblock size.
Good.
> > Therefore... it is simply a matter of avoiding empty page programming, which
> > only happens in your flasher. See also the flashing guidelines [1] as per Artem
> > suggestion.
> >
> "Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
> file and the U-Boot commands 'nand erase/nand write' or
> the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.
OK, maybe you could submit a patch to fix this issue then?
Thanks,
Ivan
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-15 12:54 ` Ivan Djelic
@ 2011-08-15 13:37 ` Lothar Waßmann
0 siblings, 0 replies; 33+ messages in thread
From: Lothar Waßmann @ 2011-08-15 13:37 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
Ivan Djelic writes:
> On Mon, Aug 15, 2011 at 10:31:34AM +0100, Lothar Waßmann wrote:
> > > > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > > > problem is that the controller generates an ECC code that is non-FF
> > > > for all-FF data, which JFFS2 cannot handle properly.
> > >
> > > JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> > > pages and then reprogram them on a NAND flash device. Your flashing method does.
> > >
> > AFAICT JFFS2 checks the flash for areas that contain only FF and
> > treats them like erased flash. At least it tries to overwrite such
> > areas in flash without erasing it beforehand.
>
> Hmmm, again, JFFS2 simply "writes" to an erased block. You say "it tries to
> overwrite", but that's only because you programmed an empty page in the first
>
This will happen most of the time (unless the image file size is near
a multiple of the eraseblock size) if you create a JFFS2 image with
mkfs.jffs2 and the '-p' option.
Unfortunately the data written by U-Boot must be page aligned and
mkfs.jffs2 cannot generate images that fulfill that constraint (unless
you know the exact image file size beforehand and specify the
rounded-up file size with the '-p' option).
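One way to get a page-aligned image without knowing its size beforehand would be to post-process the eraseblock-padded output. The helper below is a hypothetical sketch, not an existing mtd-utils feature: it strips the trailing 0xFF padding and pads again only up to the next page boundary. It assumes that trailing 0xFF bytes carry no information beyond what erased flash already reads back.

```python
def trim_to_page_padding(image: bytes, page_size: int) -> bytes:
    """Return the image padded with 0xFF to a page boundary only.

    Trailing all-0xFF pages (e.g. eraseblock padding) are dropped, so no
    blank page is programmed at the end of the image.
    """
    stripped = image.rstrip(b"\xff")
    pages = -(-len(stripped) // page_size)  # ceiling division
    return stripped.ljust(pages * page_size, b"\xff")


# Example: 3 bytes of data padded to a 16-byte "eraseblock", 4-byte pages.
padded = b"abc" + b"\xff" * 13
print(trim_to_page_padding(padded, page_size=4))
```

This only handles padding at the end of the image. Blank pages in the middle of an image would still have to be skipped by the tool that writes it.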
> place! And it cannot "erase such areas beforehand" (think of a partially
> programmed block with trailing empty pages).
>
I know that it cannot simply erase such blocks.
> > > Therefore... it is simply a matter of avoiding empty page programming, which
> > > only happens in your flasher. See also the flashing guidelines [1] as per Artem
> > > suggestion.
> > >
> > "Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
> > file and the U-Boot commands 'nand erase/nand write' or
> > the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.
>
> OK, maybe you could submit a patch to fix this issue then ?
> Thanks,
>
I have no idea how and where it should be fixed.
- at driver level, creating an all-FF ECC for all-FF data?
- in the bootloader and mtd-utils, preventing stretches of all-FF data
from being written?
- in mkfs.jffs2, enabling it to generate images that are padded to
page size instead of eraseblock size?
Lothar Waßmann
--
___________________________________________________________
Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996
www.karo-electronics.de | info at karo-electronics.de
___________________________________________________________
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-15 5:41 ` Lothar Waßmann
2011-08-15 6:30 ` Lin Tony-B19295
2011-08-15 8:29 ` Ivan Djelic
@ 2011-08-15 16:18 ` Artem Bityutskiy
2 siblings, 0 replies; 33+ messages in thread
From: Artem Bityutskiy @ 2011-08-15 16:18 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2011-08-15 at 07:41 +0200, Lothar Waßmann wrote:
> > As explained in the thread linked above, this issue should be fixed in your
> > flashing tool, _not_ in your driver. The nand device you are using does not
> > support programming pages multiple times in a row; pretending it does in the
> >
> It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> problem is that the controller generates an ECC code that is non-FF
> for all-FF data, which JFFS2 cannot handle properly.
I believe that it does not matter for the kernel community that your
specific device can survive multiple writes. It certainly does matter for
you, so if you want a quick fix, just change your kernel, but I would
not recommend doing this.
We (the community) care about the _general_ case: in general, only one
write is allowed, period. Once JFFS2 and/or the flashing tool is
fixed, the problem will go away.
Let me put it this way: hacking the driver will just hide the issue
deeper. We'll have this issue pop up again a bit later, and because
of hacks like that [1] it will be more confusing. Let's avoid this.
Also, someone pointed out in that thread that if I write data to NAND, I
want my data to be ECC-protected. Please explain why my data should be
unprotected if it happens to be just 2KiB of 0xFFs covering a whole NAND
page?
Ivan provided a much better explanation, showing that even with NOP 4
flashes there may be problems (e.g., 1 write of all 0xFFs + 4 writes
from the user).
[1] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037181.html
--
Best Regards,
Artem Bityutskiy
^ permalink raw reply [flat|nested] 33+ messages in thread
* GPMI-NAND Status?
2011-08-14 8:11 ` Ivan Djelic
2011-08-14 18:31 ` Wolfram Sang
2011-08-15 5:41 ` Lothar Waßmann
@ 2011-08-15 16:22 ` Artem Bityutskiy
2011-08-15 16:57 ` Ivan Djelic
2 siblings, 1 reply; 33+ messages in thread
From: Artem Bityutskiy @ 2011-08-15 16:22 UTC (permalink / raw)
To: linux-arm-kernel
On Sun, 2011-08-14 at 10:11 +0200, Ivan Djelic wrote:
> On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> (...)
> >
> > problem overwriting all-0xff data in NAND [2]
> > =============================================
> >
> > Although it occured only when writing JFFS2 images so far, this is a generic
> > issue and needs to be fixed, right?
> >
> >
> (...)
> > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
>
> As explained in the thread linked above, this issue should be fixed in your
> flashing tool, _not_ in your driver. The nand device you are using does not
> support programming pages multiple times in a row; pretending it does in the
> special all-0xff case is inefficient (you need to detect all-0xff data) and
> unnecessary (just do not program blank pages !).
Hmm, isn't it also buggy, because if my precious data contains 2KiB of
0xFFs (aligned to a 2KiB boundary) then I will have no ECC protection for
this page? Or am I missing something?
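A toy model of the concern, as a sketch only. toy_ecc() is a CRC32
stand-in and not the BCH code the controller uses;
program_page_with_hack() and read_page() are hypothetical names, not
functions from any driver.

```python
import zlib

PAGE_SIZE = 2048  # 2KiB page, as in the example above

def toy_ecc(data: bytes) -> bytes:
    # Stand-in checksum (assumption): real hardware would compute BCH parity.
    return zlib.crc32(data).to_bytes(4, "big")

def program_page_with_hack(data: bytes):
    # The hack under discussion: silently skip all-0xFF pages.
    if data == b"\xff" * len(data):
        return None  # nothing is programmed, so no ECC is stored
    return toy_ecc(data)

def read_page(raw: bytes, stored_ecc):
    # Report whether the read could be checked at all.
    if stored_ecc is None:
        return "unprotected"
    return "ok" if toy_ecc(raw) == stored_ecc else "error detected"

blank = b"\xff" * PAGE_SIZE
flipped = b"\xfe" + blank[1:]  # one bit flipped after the "write"

ecc = program_page_with_hack(blank)
print(read_page(flipped, ecc))             # unprotected
print(read_page(flipped, toy_ecc(blank)))  # error detected
```

With the hack, the flipped bit in the all-0xFF page cannot even be
detected, because no ECC was ever stored for it.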
--
Best Regards,
Artem Bityutskiy
* GPMI-NAND Status?
2011-08-15 9:31 ` Lothar Waßmann
2011-08-15 12:54 ` Ivan Djelic
@ 2011-08-15 16:34 ` Artem Bityutskiy
1 sibling, 0 replies; 33+ messages in thread
From: Artem Bityutskiy @ 2011-08-15 16:34 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2011-08-15 at 11:31 +0200, Lothar Waßmann wrote:
> Hi,
>
> Ivan Djelic writes:
> > On Mon, Aug 15, 2011 at 06:41:23AM +0100, Lothar Waßmann wrote:
> > > Hi,
> > >
> > > Ivan Djelic writes:
> > > > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > > > (...)
> > > > >
> > > > > problem overwriting all-0xff data in NAND [2]
> > > > > =============================================
> > > > >
> > > > > Although it occurred only when writing JFFS2 images so far, this is a generic
> > > > > issue and needs to be fixed, right?
> > > > >
> > > > >
> > > > (...)
> > > > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> > > >
> > > > As explained in the thread linked above, this issue should be fixed in your
> > > > flashing tool, _not_ in your driver. The nand device you are using does not
> > > > support programming pages multiple times in a row; pretending it does in the
> > > >
> > > It's not a problem of the device (Samsung K9F1G08U0B in my case)! The
> > > problem is that the controller generates an ECC code that is non-FF
> > > for all-FF data, which JFFS2 cannot handle properly.
> >
> > JFFS2 has nothing to do with it. JFFS2 does not assume it can program empty
> > pages and then reprogram them on a NAND flash device. Your flashing method does.
> >
> AFAICT JFFS2 checks the flash for areas that contain only FF and
> treats them like erased flash. At least it tries to overwrite such
> areas in flash without erasing it beforehand.
Right, when JFFS2 scans the flash and finds a partially-used eraseblock,
it tries to find out where the data ends and the empty space starts.
JFFS2 assumes that the empty space is usable, and it uses it. The JFFS2
author just missed the fact that, in the case of a newly flashed JFFS2
image, this empty space may be unusable. And this was extremely unlikely
in those times.
You may teach JFFS2 to avoid using this space or to "clean it up", just
like we taught UBIFS to do this recently. The other option is to change
the flashing method.
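Changing the flashing method could look roughly like the sketch below
(Python; pages_to_program() is a hypothetical helper, not part of
mtd-utils or U-Boot). It walks an image page by page and yields only the
pages that are not all 0xFF, so the flasher never programs a blank page.
The 2048-byte page size is an assumption.

```python
def pages_to_program(image: bytes, page_size: int = 2048):
    """Yield (page_index, page_data) for every page that is not all 0xFF."""
    if len(image) % page_size != 0:
        raise ValueError("image length must be a multiple of the page size")
    blank = b"\xff" * page_size
    for offset in range(0, len(image), page_size):
        page = image[offset:offset + page_size]
        if page != blank:
            # Only non-blank pages are handed to the (not shown) write step.
            yield offset // page_size, page

# Three pages: data, blank padding, data.
image = b"\x00" * 2048 + b"\xff" * 2048 + b"\x01" * 2048
print([index for index, _ in pages_to_program(image)])  # [0, 2]
```

Pages skipped this way stay erased and carry no ECC. That is fine for
free space, but it is the same situation raised elsewhere in this thread
for the case where the 0xFFs are real data.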
> I avoided the problem by creating JFFS2 images that are padded with
> 0xff to page size only instead of eraseblock size.
Right.
> > Therefore... it is simply a matter of avoiding empty page programming, which
> > only happens in your flasher. See also the flashing guidelines [1] as per Artem
> > suggestion.
> >
> "Your flasher" is the standard mtd-utils mkfs.jffs2 to create an image
> file and the U-Boot commands 'nand erase/nand write' or
> the mtd-utils 'flash_eraseall/nandwrite' to write it to flash.
I do not use these tools, but if they have issues - just fix them and
send patches.
--
Best Regards,
Artem Bityutskiy
* GPMI-NAND Status?
2011-08-15 16:22 ` Artem Bityutskiy
@ 2011-08-15 16:57 ` Ivan Djelic
0 siblings, 0 replies; 33+ messages in thread
From: Ivan Djelic @ 2011-08-15 16:57 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Aug 15, 2011 at 05:22:13PM +0100, Artem Bityutskiy wrote:
> On Sun, 2011-08-14 at 10:11 +0200, Ivan Djelic wrote:
> > On Fri, Aug 05, 2011 at 02:51:33PM +0100, Wolfram Sang wrote:
> > (...)
> > >
> > > problem overwriting all-0xff data in NAND [2]
> > > =============================================
> > >
> > > Although it occurred only when writing JFFS2 images so far, this is a generic
> > > issue and needs to be fixed, right?
> > >
> > >
> > (...)
> > > [2] http://lists.infradead.org/pipermail/linux-mtd/2011-July/037104.html
> >
> > As explained in the thread linked above, this issue should be fixed in your
> > flashing tool, _not_ in your driver. The nand device you are using does not
> > support programming pages multiple times in a row; pretending it does in the
> > special all-0xff case is inefficient (you need to detect all-0xff data) and
> > unnecessary (just do not program blank pages !).
>
> Hmm, isn't it also buggy, because if my precious data contains 2KiB of
> 0xFFs (aligned to a 2KiB boundary) then I will have no ECC protection for
> this page? Or am I missing something?
Ouch, yes you are correct, very good point which I missed :)
Ivan
end of thread, other threads:[~2011-08-15 16:57 UTC | newest]
Thread overview: 33+ messages
2011-08-05 13:51 GPMI-NAND Status? Wolfram Sang
2011-08-08 6:21 ` Huang Shijie
2011-08-08 9:19 ` Koen Beel
2011-08-08 10:37 ` Huang Shijie
2011-08-08 12:42 ` Koen Beel
2011-08-09 6:36 ` Huang Shijie
2011-08-09 7:58 ` Koen Beel
2011-08-09 8:18 ` Huang Shijie
2011-08-09 8:25 ` Koen Beel
2011-08-09 5:11 ` Huang Shijie
2011-08-09 6:25 ` Koen Beel
2011-08-09 6:40 ` Huang Shijie
2011-08-09 9:45 ` Wolfram Sang
2011-08-09 9:35 ` Wolfram Sang
2011-08-09 10:54 ` Huang Shijie
2011-08-09 20:42 ` Wolfram Sang
2011-08-08 9:12 ` Huang Shijie
2011-08-09 9:19 ` Wolfram Sang
2011-08-09 10:41 ` Huang Shijie
2011-08-09 11:36 ` Lothar Waßmann
2011-08-14 8:11 ` Ivan Djelic
2011-08-14 18:31 ` Wolfram Sang
2011-08-15 5:41 ` Lothar Waßmann
2011-08-15 6:30 ` Lin Tony-B19295
2011-08-15 8:41 ` Ivan Djelic
2011-08-15 8:29 ` Ivan Djelic
2011-08-15 9:31 ` Lothar Waßmann
2011-08-15 12:54 ` Ivan Djelic
2011-08-15 13:37 ` Lothar Waßmann
2011-08-15 16:34 ` Artem Bityutskiy
2011-08-15 16:18 ` Artem Bityutskiy
2011-08-15 16:22 ` Artem Bityutskiy
2011-08-15 16:57 ` Ivan Djelic