lock bit and erase errors

public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed

* lock bit and erase errors
@ 2005-03-14 19:08 Josh Boyer
  2005-03-14 19:20 ` Josh Boyer
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Boyer @ 2005-03-14 19:08 UTC (permalink / raw)
  To: linux-mtd

Caveat:  I am using some pretty old code in this case, but the current
code is also questionable so bear with me.

I can't for the life of me figure out what the following code from
do_erase_oneblock cfi_cmdset_0001.c is doing:

	/* check for lock bit */
	if (status & CMD(0x3a)) {
		unsigned char chipstatus = status;
		if (status != CMD(status & 0xff)) {
			int i = cfi->interleave;
			for (i = 1; i<cfi->interleave; i++) {
				      chipstatus |= status >> (cfi->device_type * 8);
			}
			printk(KERN_WARNING "Status is not identical for all chips: 0x%x. Merging to give 0x%02x\n", status, chipstatus);
		}

My device is a 16-bit device, the interleave is 1, and I have a few
blocks with the lock bit on.  When I do a write to /dev/mtdblock/0/0, an
erase is attempted.  When I run an older version of this code, I get:

"Status not identical for all chips: 0xa200.  Merging to give 0x00"

What I don't understand is how any of the above code is expected to
work.  A 1 byte variable (chipstatus) is assigned a 4 byte value
(status), which gets truncated.  Then status is shifted by 16 bits (in
my case), which still means chipstatus is still 0.  When all is said and
done, I get the merging printk but none of the checks below this code
match.  That doesn't seem right...

I hacked the older code for my specific instance to do:

chipstatus = status >> 8;

and I get the proper return code for my situation (-EROFS).  I still
don't get an error in the application writing to /dev/mtdblock/0/0, but
at least the chip driver is actually returning an error.

Any ideas?

josh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lock bit and erase errors
  2005-03-14 19:08 lock bit and erase errors Josh Boyer
@ 2005-03-14 19:20 ` Josh Boyer
  2005-03-14 19:29   ` Josh Boyer
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Boyer @ 2005-03-14 19:20 UTC (permalink / raw)
  To: linux-mtd

On Mon, 2005-03-14 at 13:08 -0600, Josh Boyer wrote:
> Caveat:  I am using some pretty old code in this case, but the current
> code is also questionable so bear with me.
> 
> I can't for the life of me figure out what the following code from
> do_erase_oneblock cfi_cmdset_0001.c is doing:
> 
> 	/* check for lock bit */
> 	if (status & CMD(0x3a)) {
> 		unsigned char chipstatus = status;
> 		if (status != CMD(status & 0xff)) {
> 			int i = cfi->interleave;
> 			for (i = 1; i<cfi->interleave; i++) {
> 				      chipstatus |= status >> (cfi->device_type * 8);
> 			}
> 			printk(KERN_WARNING "Status is not identical for all chips: 0x%x. Merging to give 0x%02x\n", status, chipstatus);
> 		}
> 
> 
> My device is a 16-bit device, the interleave is 1, and I have a few
> blocks with the lock bit on.  When I do a write to /dev/mtdblock/0/0, an
> erase is attempted.  When I run an older version of this code, I get:
> 
> "Status not identical for all chips: 0xa200.  Merging to give 0x00"
> 
> What I don't understand is how any of the above code is expected to
> work.  A 1 byte variable (chipstatus) is assigned a 4 byte value
> (status), which gets truncated.  Then status is shifted by 16 bits (in
> my case), which still means chipstatus is still 0.  When all is said and
> done, I get the merging printk but none of the checks below this code
> match.  That doesn't seem right...
> 
> I hacked the older code for my specific instance to do:
> 
> chipstatus = status >> 8;
> 
> and I get the proper return code for my situation (-EROFS).  I still
> don't get an error in the application writing to /dev/mtdblock/0/0, but
> at least the chip driver is actually returning an error.
> 
> Any ideas?
> 

Urm, I should also point out that I have intentionally locked the
eraseblocks on this device, so I _expect_ it to fail.  But the driver
didn't return a bad return code, which is what I'm confused about.

And I'm also confused as to why the app didn't get a bad return code
even when I "fixed" the driver to error out.  But one thing at a time.

josh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lock bit and erase errors
  2005-03-14 19:20 ` Josh Boyer
@ 2005-03-14 19:29   ` Josh Boyer
  2005-03-17 10:34     ` Jörn Engel
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Boyer @ 2005-03-14 19:29 UTC (permalink / raw)
  To: linux-mtd

On Mon, 2005-03-14 at 13:20 -0600, Josh Boyer wrote:
> On Mon, 2005-03-14 at 13:08 -0600, Josh Boyer wrote:
> > Caveat:  I am using some pretty old code in this case, but the current
> > code is also questionable so bear with me.

FYI, I just ran this with a 2.6 kernel + fairly recent MTD code and I
see the same problems.

No more updates from me until I hear back, I promise :).

josh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lock bit and erase errors
  2005-03-14 19:29   ` Josh Boyer
@ 2005-03-17 10:34     ` Jörn Engel
  2005-03-21 16:52       ` Josh Boyer
  0 siblings, 1 reply; 7+ messages in thread
From: Jörn Engel @ 2005-03-17 10:34 UTC (permalink / raw)
  To: Josh Boyer; +Cc: linux-mtd

On Mon, 14 March 2005 13:29:24 -0600, Josh Boyer wrote:
> On Mon, 2005-03-14 at 13:20 -0600, Josh Boyer wrote:
> > On Mon, 2005-03-14 at 13:08 -0600, Josh Boyer wrote:
> > > Caveat:  I am using some pretty old code in this case, but the current
> > > code is also questionable so bear with me.
> 
> FYI, I just ran this with a 2.6 kernel + fairly recent MTD code and I
> see the same problems.
> 
> No more updates from me until I hear back, I promise :).

I've had a quick look and would agree with you right now.  This
appears to be a bug.

Generally speaking, there appears to be a lack of automated tests for
the cfi code.  Getting it to work right for all combinations of bus
widths, etc. is a challenge.  But automated tests would require one
device of each kind and I have no idea how to get all those.

Maybe you should just fix your special case and hope not to break
someone else.

Jörn

-- 
A surrounded army must be given a way out.
-- Sun Tzu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lock bit and erase errors
  2005-03-17 10:34     ` Jörn Engel
@ 2005-03-21 16:52       ` Josh Boyer
  2005-03-24  0:55         ` Jared Hulbert
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Boyer @ 2005-03-21 16:52 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-mtd

On Thu, 2005-03-17 at 11:34 +0100, Jörn Engel wrote:
> > 
> > FYI, I just ran this with a 2.6 kernel + fairly recent MTD code and I
> > see the same problems.
> > 
> > No more updates from me until I hear back, I promise :).
> 
> I've had a quick look and would agree with you right now.  This
> appears to be a bug.

Thomas and I thought so too.

> 
> Generally speaking, there appears to be a lack of automated tests for
> the cfi code.  Getting it to work right for all combinations of bus
> widths, etc. is a challenge.  But automated tests would require one
> device of each kind and I have no idea how to get all those.

Me either.  Some kind of test suite would be nice though.  Artem has the
NAND simulator.  Perhaps someone could make a NOR simulator as well to
test the CFI stuff?  Any volunteers?

> 
> Maybe you should just fix your special case and hope not to break
> someone else.

Thomas actually came up with a patch for it and it's in CVS now.  It
fixed my bogus status merging problem so an error is actually returned
when an erase fails.

As for why dd never reported an error, that is because dd is stupid and
doesn't flush any of it's writes.  It just relies on the flush that is
done on the implicit close when exit(0) is called.

josh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lock bit and erase errors
  2005-03-21 16:52       ` Josh Boyer
@ 2005-03-24  0:55         ` Jared Hulbert
  2005-03-24  2:02           ` Nicolas Pitre
  0 siblings, 1 reply; 7+ messages in thread
From: Jared Hulbert @ 2005-03-24  0:55 UTC (permalink / raw)
  To: Josh Boyer; +Cc: linux-mtd

> Me either.  Some kind of test suite would be nice though.  Artem has the
> NAND simulator.  Perhaps someone could make a NOR simulator as well to
> test the CFI stuff?  Any volunteers?

We've got a unit test framework in alpha that would have caught this
assuming tests were run for it.  We'll create a test for this bug to
demonstrate.  Let me get it to a halfway presentable state and get
Legal's stamp of approval to release, then I will send it out.  I'd
like to see what ya'll think about it.  There are also some issues
with the build system we could use some help with.

Can I ftp a tarball somewhere on the hallowed infradead.org or get a
directory in the CVS tree I can commit it to?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: lock bit and erase errors
  2005-03-24  0:55         ` Jared Hulbert
@ 2005-03-24  2:02           ` Nicolas Pitre
  0 siblings, 0 replies; 7+ messages in thread
From: Nicolas Pitre @ 2005-03-24  2:02 UTC (permalink / raw)
  To: Jared Hulbert; +Cc: linux-mtd

On Wed, 23 Mar 2005, Jared Hulbert wrote:

> Can I ftp a tarball somewhere on the hallowed infradead.org or get a
> directory in the CVS tree I can commit it to?

The CVS directory is probably the best way forward IMHO.


Nicolas

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-03-24  2:02 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-03-14 19:08 lock bit and erase errors Josh Boyer
2005-03-14 19:20 ` Josh Boyer
2005-03-14 19:29   ` Josh Boyer
2005-03-17 10:34     ` Jörn Engel
2005-03-21 16:52       ` Josh Boyer
2005-03-24  0:55         ` Jared Hulbert
2005-03-24  2:02           ` Nicolas Pitre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox