From: "Tokunori Ikegami" <ikegami.t@gmail.com>
To: "'Sobon, Przemyslaw'" <psobon@amazon.com>,
"'Boris Brezillon'" <boris.brezillon@collabora.com>
Cc: keescook@chromium.org, joakim.tjernlund@infinera.com,
richard@nod.at, linux-kernel@vger.kernel.org,
marek.vasut@gmail.com, ikegami_to@yahoo.co.jp,
linux-mtd@lists.infradead.org, computersforpeace@gmail.com,
dwmw2@infradead.org, 'Liu Jian' <liujian56@huawei.com>
Subject: RE: Re: [PATCH] cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer
Date: Fri, 8 Feb 2019 23:23:59 +0900
Message-ID: <149101d4bfb9$fdc5a330$f950e990$@gmail.com>
In-Reply-To: <632ed76bd3844ceab75066d1f30a7115@EX13D07UWA001.ant.amazon.com>
Hi Przemek-san,
Thank you so much for your explanation.
> I have seen a case myself where a value was written, chip changed
> state to "ready" but when I was reading the value was incorrect.
I am also aware of similar issues with both buffer writes and word writes.
In both cases the write error behavior could be reproduced.
Note: the word write issue can still be reproduced now.
Both were resolved by using chip_good() instead to check the state.
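As a rough illustration of why the two checks can disagree, here is a minimal sketch. It assumes that "ready" means two consecutive reads of the same address agree (the toggle bits have stopped), and that "good" additionally requires the stable value to match the datum that was written. The fake_flash type and the sketch_* helpers are hypothetical stand-ins for this example only, not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one flash word: replays a scripted list of reads. */
struct fake_flash {
	const uint16_t *reads;	/* values returned by successive reads */
	int count;		/* number of scripted values */
	int pos;		/* index of the next value to return */
};

/* Return the next scripted value; the last value repeats once the script ends. */
static uint16_t fake_read(struct fake_flash *f)
{
	uint16_t v = f->reads[f->pos];

	if (f->pos < f->count - 1)
		f->pos++;
	return v;
}

/* "Ready" (assumed meaning): two consecutive reads agree. */
static bool sketch_chip_ready(struct fake_flash *f)
{
	uint16_t first = fake_read(f);
	uint16_t second = fake_read(f);

	return first == second;
}

/* "Good" (assumed meaning): reads agree and match the written datum. */
static bool sketch_chip_good(struct fake_flash *f, uint16_t expected)
{
	uint16_t first = fake_read(f);
	uint16_t second = fake_read(f);

	return first == second && second == expected;
}
```

With a script that settles on a wrong value, sketch_chip_ready() reports true while sketch_chip_good() reports false, which is the combination that kept the original loop from terminating.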
> This can happen as result of intermittent issue with flash. It is
> hard to fall into scenario when testing on limited number of devices
> but with large enough population you can see that.
If possible, I would also like to know the details of the issue and its cause.
> Another situation
> is when a flash chip reaches its maximum number of writes. So for
> example a chip is designed for 100k writes to a page. Once you
> reach that number of writes you can have invalid data written to
> flash but chip itself reports everything was good and switches to
> "ready" state.
Yes, I see.
Regards,
Ikegami
> -----Original Message-----
> From: linux-mtd [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf
> Of Sobon, Przemyslaw
> Sent: Friday, February 8, 2019 8:51 AM
> To: ikegami_to@yahoo.co.jp; Boris Brezillon
> Cc: keescook@chromium.org; marek.vasut@gmail.com;
> ikegami@allied-telesis.co.jp; richard@nod.at;
> linux-kernel@vger.kernel.org; joakim.tjernlund@infinera.com;
> linux-mtd@lists.infradead.org; computersforpeace@gmail.com;
> dwmw2@infradead.org; Liu Jian
> Subject: RE: Re: [PATCH] cfi: fix deadloop in cfi_cmdset_0002.c
> do_write_buffer
>
> Hi Ikegami,
>
> I have seen a case myself where a value was written, chip changed
> state to "ready" but when I was reading the value was incorrect.
> This can happen as result of intermittent issue with flash. It is
> hard to fall into scenario when testing on limited number of devices
> but with large enough population you can see that. Another situation
> is when a flash chip reaches its maximum number of writes. So for
> example a chip is designed for 100k writes to a page. Once you
> reach that number of writes you can have invalid data written to
> flash but chip itself reports everything was good and switches to
> "ready" state.
>
> Hope this explanation is clear. Please let me know.
>
> Regards,
> Przemek
>
> > -----Original Message-----
> > From: ikegami_to@yahoo.co.jp <ikegami_to@yahoo.co.jp>
> > Sent: Thursday, February 7, 2019 3:00 PM
> >
> > Hi Przemek-san,
> >
> > Could you please explain the case detail that the value is written
> > incorrectly?
> > I think that the value is only written correctly except a bug.
> >
> > Regards,
> > Ikegami
> >
> > --- boris.brezillon@collabora.com wrote --- :
> > > Hi Sobon,
> > >
> > > On Tue, 5 Feb 2019 22:28:44 +0000
> > > "Sobon, Przemyslaw" <psobon@amazon.com> wrote:
> > >
> > > > > From: Boris Brezillon <bbrezillon@kernel.org>
> > > > > Sent: Sunday, February 3, 2019 12:35 AM
> > > > > > +Przemyslaw
> > > > > >
> > > > > > On Fri, 1 Feb 2019 07:30:39 +0800 Liu Jian
> > > > > > <liujian56@huawei.com> wrote:
> > > > > >
> > > > > > > In function do_write_buffer(), in the for loop, there is a
> > > > > > > case
> > > > > > > chip_ready() returns 1 while chip_good() returns 0, so it
> > > > > > > never break the loop.
> > > > > > > > To fix this, chip_good() is enough and it should timeout if it
> > > > > > > > stay bad for a while.
> > > > > >
> > > > > > Looks like Przemyslaw reported and fixed the same problem.
> > > > > >
> > > > > > >
> > > > > > > Fixes: dfeae1073583(mtd: cfi_cmdset_0002: Change write buffer
> > > > > > > to check correct value)
> > > > > >
> > > > > > > Can you put the Fixes tag on a single line? The format is
> > > > > >
> > > > > > Fixes: <hash> ("message")
> > > > > >
> > > > > > > Signed-off-by: Yi Huaijie <yihuaijie@huawei.com>
> > > > > > > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > > > > >
> > > > > > [1]http://patchwork.ozlabs.org/patch/1025566/
> > > > > >
> > > > > > > ---
> > > > > > > drivers/mtd/chips/cfi_cmdset_0002.c | 6 +++---
> > > > > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c
> > > > > > > b/drivers/mtd/chips/cfi_cmdset_0002.c
> > > > > > > index 72428b6..818e94b 100644
> > > > > > > --- a/drivers/mtd/chips/cfi_cmdset_0002.c
> > > > > > > +++ b/drivers/mtd/chips/cfi_cmdset_0002.c
> > > > > > > @@ -1876,14 +1876,14 @@ static int __xipram
> do_write_buffer(struct map_info *map, struct flchip *chip,
> > > > > > > continue;
> > > > > > > }
> > > > > > >
> > > > > > > - if (time_after(jiffies, timeo) && !chip_ready(map,
> adr))
> > > > > > > - break;
> > > > > > > -
> > > > > > > if (chip_good(map, adr, datum)) {
> > > > > > > xip_enable(map, chip, adr);
> > > > > > > goto op_done;
> > > > > > > }
> > > > > > >
> > > > > > > + if (time_after(jiffies, timeo))
> > > > > > > + break;
> > > > > > > +
> > > > > > > /* Latency issues. Drop the lock, wait a while and
> retry */
> > > > > > > UDELAY(map, chip, adr, 1);
> > > > > > > }
> > > > > >
> > > > >
> > > > > > BTW, the patch itself looks good to me. Ikegami, can you confirm
> > > > > > it does the right thing?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Boris
> > > > >
> > > >
> > > > One comment to this patch. If value is written incorrectly quickly
> > > > we will be stuck in the loop even though nothing is going to change.
> > > > For example a value was written incorrectly after 1us, the loop was
> > > > set to 1ms, function will return after 1ms, this solution is not
> > > > optimized for performance. I considered same when working on this
> > > > change and decided to do it different way.
> > >
> > > Seems like you're right if we assume that checking for GOOD state does
> > > not require a delay after the READY check, but if that's not the case
> > > and an extra delay is actually required, you might end up with a BAD
> > > status while it could have turned GOOD at some point with the 'check
> > > only for GOOD state until we timeout' approach.
> > >
> > > TBH, I don't know how CFI flashes work, so I'll let you guys sort this
> > > out.
> > >
> > > Regards,
> > >
> > > Boris
> > >
> > > ______________________________________________________
> > > Linux MTD discussion mailing list
> > > http://lists.infradead.org/mailman/listinfo/linux-mtd/
> > >
> >
> >
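The trade-off discussed in the quoted thread (polling for GOOD until the timeout versus stopping as soon as the chip is READY and checking once) can be sketched with a small simulation. Everything here is hypothetical and for illustration only: sim_chip is an invented model in which time is counted in loop iterations, ready_at is the iteration at which the chip stops toggling, and good_at is the iteration at which the data reads back correctly (-1 if it never does). The second function is only a guess at what an "exit early once ready" approach looks like, not the code of either patch.

```c
#include <stdbool.h>

/* Hypothetical chip model; time is measured in loop iterations. */
struct sim_chip {
	int ready_at;	/* iteration at which the chip stops toggling */
	int good_at;	/* iteration at which data is correct, -1 = never */
};

struct poll_result {
	bool ok;	/* true if the write was seen as good */
	int polls;	/* loop iterations spent before returning */
};

static bool sim_ready(const struct sim_chip *c, int now)
{
	return now >= c->ready_at;
}

static bool sim_good(const struct sim_chip *c, int now)
{
	return c->good_at >= 0 && now >= c->good_at;
}

/* Poll for GOOD only, and give up once the timeout has expired. */
static struct poll_result poll_good_until_timeout(const struct sim_chip *c,
						  int timeout)
{
	struct poll_result r = { false, 0 };

	for (int now = 0; ; now++) {
		r.polls++;
		if (sim_good(c, now)) {
			r.ok = true;
			break;
		}
		if (now >= timeout)
			break;
	}
	return r;
}

/* Stop as soon as the chip is READY and check GOOD exactly once. */
static struct poll_result poll_ready_then_check(const struct sim_chip *c,
						int timeout)
{
	struct poll_result r = { false, 0 };

	for (int now = 0; ; now++) {
		r.polls++;
		if (sim_ready(c, now)) {
			r.ok = sim_good(c, now);
			break;
		}
		if (now >= timeout)
			break;
	}
	return r;
}
```

In this model, a chip that becomes ready at iteration 1 with permanently wrong data costs the first loop the full timeout, while the second loop returns after two iterations. If the data only becomes correct a little after the chip is ready (the case Boris raises), the second loop reports a failure that the first loop would not.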