public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed
* NAND parameter page read fails with bitflip and no column change support
@ 2024-04-10 18:11 Steven Seeger
  0 siblings, 0 replies; 4+ messages in thread
From: Steven Seeger @ 2024-04-10 18:11 UTC (permalink / raw)
  To: linux-mtd@lists.infradead.org
  Cc: ada@thorsis.com, Miquel Raynal, richard@nod.at

Hello list.

I have an interesting board with 8GB of NAND, and the first of 8 chips has a bitflip in the first ONFI 256 bytes of the ONFI parameter page. My driver did not support column change operations, and this worked fine with linux 5.x kernels. I noticed this problem with a 6.6 kernel, and the fix was to add column change support to my driver. Without it, the NAND layer always reads the first 256 bytes of the parameter page. As my hardware is an FPGA that abstracts some of the direct NAND access away, and ECC Is done on-die, the driver is a bit of a hybrid between the newer exec operation API and providing functions like nand_read_page, nand_read_page_raw, etc. So the parameter page read upon module load uses the data in instructions in the exec command, while page and OOB reads and writes are done through the driver-supplied functions.

Adding support for column change operations fixed this issue, but it seems that this support should not be necessary. It  didn't use to work that way.

I had reached out to Miquel privately with this issue, only due to the desire to help inform maintainers of the issue. I was too busy at the time to join the list and write up something more formal. Miquel provided a small patch for me that did not work at the time, and then I lost access to the board for a while. Now I have a little time and a board, so I am here to report the issue and make it public.

I am fine continuing my work with my driver that supports column change operations, but also happy to use a previous commit of it to do any additional testing with any patch provided as long as I have a board.

From my original email to Miquel:

Your commit 9f820fc0651c32f8dde26b8e3ad93e1b9fdb62c4 I think has a small bug.

In rawnand_check_data_only_read_support(), you call nand_read_data_op() with a NULL buffer, which causes nand_read_data_op() to return -EINVAL, therefore the value for data_only_read is never set.

Apologies to the people on CC for sending this a couple of times due to errors.

Thanks,
Steven


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: NAND parameter page read fails with bitflip and no column change support
       [not found] <mailman.25709.1712772710.1280.linux-mtd@lists.infradead.org>
@ 2024-04-10 18:45 ` Steven Seeger
  2024-04-11  8:52   ` ada
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Seeger @ 2024-04-10 18:45 UTC (permalink / raw)
  To: linux-mtd@lists.infradead.org
  Cc: ada@thorsis.com, Miquel Raynal, richard@nod.at

>I had reached out to Miquel privately with this issue, only due to the desire to
>help inform maintainers of the issue. I was too busy at the time to join the list
>and write up something more formal. Miquel provided a small patch for me
>that did not work at the time, and then I lost access to the board for a while.
>Now I have a little time and a board, so I am here to report the issue and make it public.

I tried the patch from Miquel back in February and it works correctly with 6.6.12
and my older driver without column change support.

Here is h is patch for reference:

--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -2123,7 +2123,7 @@ EXPORT_SYMBOL_GPL(nand_reset_op);
 int nand_read_data_op(struct nand_chip *chip, void *buf, unsigned int len,
                      bool force_8bit, bool check_only)
 {
-       if (!len || !buf)
+       if (!len || (!check_only && !buf))
                return -EINVAL;
 
        if (nand_has_exec_op(chip)) {

I also realize that company email using outlook 365 is terrible for lists and list
etiquette. Hopefully the list handles threading from digest emails correctly.

Thanks,
Steven
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: NAND parameter page read fails with bitflip and no column change support
  2024-04-10 18:45 ` NAND parameter page read fails with bitflip and no column change support Steven Seeger
@ 2024-04-11  8:52   ` ada
  2024-05-07 16:08     ` Miquel Raynal
  0 siblings, 1 reply; 4+ messages in thread
From: ada @ 2024-04-11  8:52 UTC (permalink / raw)
  To: Steven Seeger
  Cc: linux-mtd@lists.infradead.org, ada@thorsis.com, Miquel Raynal,
	richard@nod.at

Hello Steven, Miquel,

Am Wed, Apr 10, 2024 at 06:45:58PM +0000 schrieb Steven Seeger:
> >I had reached out to Miquel privately with this issue, only due to the desire to
> >help inform maintainers of the issue. I was too busy at the time to join the list
> >and write up something more formal. Miquel provided a small patch for me
> >that did not work at the time, and then I lost access to the board for a while.
> >Now I have a little time and a board, so I am here to report the issue and make it public.
> 
> I tried the patch from Miquel back in February and it works correctly with 6.6.12
> and my older driver without column change support.
> 
> Here is h is patch for reference:
> 
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -2123,7 +2123,7 @@ EXPORT_SYMBOL_GPL(nand_reset_op);
>  int nand_read_data_op(struct nand_chip *chip, void *buf, unsigned int len,
>                       bool force_8bit, bool check_only)
>  {
> -       if (!len || !buf)
> +       if (!len || (!check_only && !buf))
>                 return -EINVAL;
>  
>         if (nand_has_exec_op(chip)) {

Looking at the context aka the whole function nand_read_data_op() this
looks fine.  I had reported the same issue in March, referencing the
same commit 9f820fc0651c ("mtd: rawnand: Check the data only read
pattern only once") you mentioned in your other mail:

Link: https://lore.kernel.org/linux-mtd/20240307-pantry-deceit-78ce20f47899@thorsis.com/

With nand_read_data_op() fixed in nand_onfi_detect() the boolean
`use_datain` is true now instead of false (because
rawnand_check_data_only_read_support() calls that function, too).  So
in nand_onfi_detect() code jumps to nand_read_data_op() in the second
loop run now instead of nand_change_read_column_op() which probably
masks that other problem with mtd->writesize not set mentioned in our
earlier discussion.

Long story short: the diff above in itself seems correct to me.  A
test with that fix applied on our sam9x60 board using
nand_soft_waitrdy instead of RB# gpio successfully gets ONFI param
page 1 with nand_read_data_op() after getting ONFI param page 0 with
nand_read_param_page_op() failed.

You said this is a patch/diff from Miquel.  Miquel do you send this
as a proper patch then, or should someone else?

Greets
Alex

P.S.: still busy with other things, no time yet for proper timing
analysis of our other NAND flash problems with logic analyzer etc.


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: NAND parameter page read fails with bitflip and no column change support
  2024-04-11  8:52   ` ada
@ 2024-05-07 16:08     ` Miquel Raynal
  0 siblings, 0 replies; 4+ messages in thread
From: Miquel Raynal @ 2024-05-07 16:08 UTC (permalink / raw)
  To: ada@thorsis.com
  Cc: Steven Seeger, linux-mtd@lists.infradead.org, richard@nod.at

Hello,

ada@thorsis.com wrote on Thu, 11 Apr 2024 10:52:48 +0200:

> Hello Steven, Miquel,
> 
> Am Wed, Apr 10, 2024 at 06:45:58PM +0000 schrieb Steven Seeger:
> > >I had reached out to Miquel privately with this issue, only due to the desire to
> > >help inform maintainers of the issue. I was too busy at the time to join the list
> > >and write up something more formal. Miquel provided a small patch for me
> > >that did not work at the time, and then I lost access to the board for a while.
> > >Now I have a little time and a board, so I am here to report the issue and make it public.  
> > 
> > I tried the patch from Miquel back in February and it works correctly with 6.6.12
> > and my older driver without column change support.
> > 
> > Here is h is patch for reference:
> > 
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -2123,7 +2123,7 @@ EXPORT_SYMBOL_GPL(nand_reset_op);
> >  int nand_read_data_op(struct nand_chip *chip, void *buf, unsigned int len,
> >                       bool force_8bit, bool check_only)
> >  {
> > -       if (!len || !buf)
> > +       if (!len || (!check_only && !buf))
> >                 return -EINVAL;
> >  
> >         if (nand_has_exec_op(chip)) {  
> 
> Looking at the context aka the whole function nand_read_data_op() this
> looks fine.  I had reported the same issue in March, referencing the
> same commit 9f820fc0651c ("mtd: rawnand: Check the data only read
> pattern only once") you mentioned in your other mail:
> 
> Link: https://lore.kernel.org/linux-mtd/20240307-pantry-deceit-78ce20f47899@thorsis.com/
> 
> With nand_read_data_op() fixed in nand_onfi_detect() the boolean
> `use_datain` is true now instead of false (because
> rawnand_check_data_only_read_support() calls that function, too).  So
> in nand_onfi_detect() code jumps to nand_read_data_op() in the second
> loop run now instead of nand_change_read_column_op() which probably
> masks that other problem with mtd->writesize not set mentioned in our
> earlier discussion.
> 
> Long story short: the diff above in itself seems correct to me.  A
> test with that fix applied on our sam9x60 board using
> nand_soft_waitrdy instead of RB# gpio successfully gets ONFI param
> page 1 with nand_read_data_op() after getting ONFI param page 0 with
> nand_read_param_page_op() failed.
> 
> You said this is a patch/diff from Miquel.  Miquel do you send this
> as a proper patch then, or should someone else?
> 
> Greets
> Alex
> 
> P.S.: still busy with other things, no time yet for proper timing
> analysis of our other NAND flash problems with logic analyzer etc.
> 

Sorry for the delay, here is the proposal for fixing this mess:
https://lore.kernel.org/linux-mtd/20240507160546.130255-1-miquel.raynal@bootlin.com/T/#t

Thanks,
Miquèl

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-05-07 16:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <mailman.25709.1712772710.1280.linux-mtd@lists.infradead.org>
2024-04-10 18:45 ` NAND parameter page read fails with bitflip and no column change support Steven Seeger
2024-04-11  8:52   ` ada
2024-05-07 16:08     ` Miquel Raynal
2024-04-10 18:11 Steven Seeger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox