* UBI: Question about error checking in ubi_eba_write_leb() @ 2014-09-30 11:00 steve 2014-10-03 15:14 ` Artem Bityutskiy 0 siblings, 1 reply; 3+ messages in thread From: steve @ 2014-09-30 11:00 UTC (permalink / raw) To: linux-mtd This is my first post to this mailing list, I think this is the right place for my question. I think the error checking in ubi_eba_write_leb() may be wrong and may cause incorrect behaviour when extra checking is enabled in debugfs, here's the scenario I have a problem with: ubi_eba_write_leb() calls ubi_io_write_data() -> ubi_io_write() -> ubi_dbg_check_all_ff() ubi_dbg_check_all_ff() will perform an mtd_read() which can fail with EBADMSG (if there is a failure to correct using ECC as in my case). This error code will bubble back up to ubi_eba_write_leb(), but recover_peb() will only be called if the error code was EIO. The question is should recover_peb() be called in this case as well? Thanks for any help with this, Steve ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: UBI: Question about error checking in ubi_eba_write_leb() 2014-09-30 11:00 UBI: Question about error checking in ubi_eba_write_leb() steve @ 2014-10-03 15:14 ` Artem Bityutskiy 2014-10-07 8:46 ` steve 0 siblings, 1 reply; 3+ messages in thread From: Artem Bityutskiy @ 2014-10-03 15:14 UTC (permalink / raw) To: steve; +Cc: linux-mtd On Tue, 2014-09-30 at 12:00 +0100, steve wrote: > This is my first post to this mailing list, I think this is the right place for > my question. > > I think the error checking in ubi_eba_write_leb() may be wrong and may cause > incorrect behaviour when extra checking is enabled in debugfs, here's the > scenario I have a problem with: > > ubi_eba_write_leb() calls ubi_io_write_data() -> ubi_io_write() -> > ubi_dbg_check_all_ff() > > ubi_dbg_check_all_ff() will perform an mtd_read() which can fail with EBADMSG > (if there is a failure to correct using ECC as in my case). Yep, this is possible. > This error code > will bubble back up to ubi_eba_write_leb() And the question is - is this the right thing to do? Original intention was to catch I/O problems and report about them promptly by dumping the information to the log and returning an error. The error we return is -EINVAL exactly in because we did not want the upper layers try to 'deal' with the problem. All we wanted to do is to catch the problems. > , but recover_peb() will only be > called if the error code was EIO. Yeah, and we never meant the recover_peb() to actually try doing anything about that. > The question is should recover_peb() be > called in this case as well? Good question - this was not meant to be. So see, you are now talking about the debugging code. It was written basically to make developers' live simpler. This is just a tool to catch bugs during the R&D. So if during the R&D phase we see that the driver returns an "ECC error" (-EBADMSG) when we read the data we just wrote - this is a problem, and we _do not_ want to hide it by starting the recovery mechanism - we want the write request to actually fail and the developer to actually notice this. The non-debug case is a completely different thing. Here we actually would be interested to handle the issue transparently - move the data to a different PEB, and mark the failing PEB as bad. But the assumption is that the write() path never returns -EBADMSG. If a driver internally does write-read-verify, we assume it should return -EIO if there is an ECC error. But not -EBADMSG. This is why we do not handle this error code in eba.c. Now, what I see is that you are looking at some old code. In the new code the function is named 'ubi_self_check_all_ff()', and if I read the code correctly, it will return -EINVAL if mtd_read() returns -EBADMSG: err = mtd_read(ubi->mtd, addr, len, &read, buf); if (err && !mtd_is_bitflip(err)) { ubi_err("error %d while reading %d bytes from PEB %d:%d, read %zd bytes", err, len, pnum, offset, read); goto error; ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: UBI: Question about error checking in ubi_eba_write_leb() 2014-10-03 15:14 ` Artem Bityutskiy @ 2014-10-07 8:46 ` steve 0 siblings, 0 replies; 3+ messages in thread From: steve @ 2014-10-07 8:46 UTC (permalink / raw) To: Artem Bityutskiy; +Cc: linux-mtd Thank you very much for your reply Artem, that makes sense to me now. On Fri, 3 Oct 2014, Artem Bityutskiy wrote: > On Tue, 2014-09-30 at 12:00 +0100, steve wrote: >> This is my first post to this mailing list, I think this is the right place for >> my question. >> >> I think the error checking in ubi_eba_write_leb() may be wrong and may cause >> incorrect behaviour when extra checking is enabled in debugfs, here's the >> scenario I have a problem with: >> >> ubi_eba_write_leb() calls ubi_io_write_data() -> ubi_io_write() -> >> ubi_dbg_check_all_ff() >> >> ubi_dbg_check_all_ff() will perform an mtd_read() which can fail with EBADMSG >> (if there is a failure to correct using ECC as in my case). > > Yep, this is possible. > >> This error code >> will bubble back up to ubi_eba_write_leb() > > And the question is - is this the right thing to do? Original intention > was to catch I/O problems and report about them promptly by dumping the > information to the log and returning an error. > > The error we return is -EINVAL exactly in because we did not want the > upper layers try to 'deal' with the problem. All we wanted to do is to > catch the problems. > >> , but recover_peb() will only be >> called if the error code was EIO. > > Yeah, and we never meant the recover_peb() to actually try doing > anything about that. > >> The question is should recover_peb() be >> called in this case as well? > > Good question - this was not meant to be. > > So see, you are now talking about the debugging code. It was written > basically to make developers' live simpler. This is just a tool to catch > bugs during the R&D. > > So if during the R&D phase we see that the driver returns an "ECC > error" (-EBADMSG) when we read the data we just wrote - this is a > problem, and we _do not_ want to hide it by starting the recovery > mechanism - we want the write request to actually fail and the developer > to actually notice this. > > The non-debug case is a completely different thing. Here we actually > would be interested to handle the issue transparently - move the data to > a different PEB, and mark the failing PEB as bad. > > But the assumption is that the write() path never returns -EBADMSG. If a > driver internally does write-read-verify, we assume it should return > -EIO if there is an ECC error. But not -EBADMSG. > > This is why we do not handle this error code in eba.c. > > Now, what I see is that you are looking at some old code. In the new > code the function is named 'ubi_self_check_all_ff()', and if I read the > code correctly, it will return -EINVAL if mtd_read() returns -EBADMSG: > > > err = mtd_read(ubi->mtd, addr, len, &read, buf); > if (err && !mtd_is_bitflip(err)) { > ubi_err("error %d while reading %d bytes from PEB %d:%d, read %zd bytes", > err, len, pnum, offset, read); > goto error; > > ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2014-10-07 8:51 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-09-30 11:00 UBI: Question about error checking in ubi_eba_write_leb() steve 2014-10-03 15:14 ` Artem Bityutskiy 2014-10-07 8:46 ` steve
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox