Erase unit header corrupted when using FTL

public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed

* Erase unit header corrupted when using FTL
@ 2000-11-05 22:52 Brian Kuschak
  2000-11-06  0:57 ` David Woodhouse
  0 siblings, 1 reply; 29+ messages in thread
From: Brian Kuschak @ 2000-11-05 22:52 UTC (permalink / raw)
  To: 'mtd@infradead.org'

Hello,

I'm having a major problem when using the FTL.  I format and mount a
filesystem on the FTL device, and do this:

  - copy a file onto the device
  - sync
  - remove the file 
  - sync
  - copy another file onto the device
  - sync.

The file is ~84KB and the sector size is 128KB.  The second copy causes the
FTL to start erasing the first erase unit.  At this point, the erase always
fails.  Seems like the erase unit is not getting a valid header written to
it after being erased.  I can verify that the erase callback is being
called.  Any ideas?

Thanks,
Brian Kuschak


Just after format:

	sh-2.03# /tmp/ftl_check /dev/mtd0
	Memory region info:
	  Region size = 256 kb  Erase block size = 128 kb

	Partition header:
	  Formatted size = 125440 bytes, erase units = 2, transfer units = 1
	  Erase unit size = 128 kb, virtual block size = 512 bytes

	Erase unit 0:
	  Logical unit 0, erase count = 0
	  Block allocation: 3 control, 208 data, 25 free, 20 deleted

	Erase unit 1:
	  Transfer unit, erase count = 0

After the second copy and sync.  These error messages repeat for about 120
sectors.

	sh-2.03# cp /bin/ftp /mnt/
	sh-2.03# sync
	In erase callback.
	ftl_cs: erase failed: state = 192
	ftl_cs: reclaim failed: no suitable transfer units!
	ftl_write returned -5
	end_request: I/O error, dev 2c:00 (ftl), sector 78
	ftl_cs: reclaim failed: no suitable transfer units!
	ftl_write returned -5
	end_request: I/O error, dev 2c:00 (ftl), sector 80
	ftl_cs: reclaim failed: no suitable transfer units!

	sh-2.03# /tmp/ftl_check /dev/mtd0
	Memory region info:
	  Region size = 256 kb  Erase block size = 128 kb

	Partition header:
	  Formatted size = 125440 bytes, erase units = 2, transfer units = 1
	  Erase unit size = 128 kb, virtual block size = 512 bytes

	Erase unit 0:
	  Erase unit header is corrupt.

	Erase unit 1:
	  Logical unit 0, erase count = 0
	  Block allocation: 3 control, 208 data, 1 free, 44 deleted
	sh-2.03#


To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Erase unit header corrupted when using FTL
  2000-11-05 22:52 Erase unit header corrupted when using FTL Brian Kuschak
@ 2000-11-06  0:57 ` David Woodhouse
  2000-11-06  2:10   ` corruption with mtdblock Nicolas Pitre
  0 siblings, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-06  0:57 UTC (permalink / raw)
  To: Brian Kuschak; +Cc: 'mtd@infradead.org'

On Sun, 5 Nov 2000, Brian Kuschak wrote:

> I'm having a major problem when using the FTL.  I format and mount a
> filesystem on the FTL device, and do this:
> 
>   - copy a file onto the device
>   - sync
>   - remove the file 
>   - sync
>   - copy another file onto the device
>   - sync.
> 
> The file is ~84KB and the sector size is 128KB.  The second copy causes the
> FTL to start erasing the first erase unit.  At this point, the erase always
> fails.  Seems like the erase unit is not getting a valid header written to
> it after being erased.  I can verify that the erase callback is being
> called.  Any ideas?

Probably because we're sleeping in the request function. We need to
transplant the kernel_thread fix from the mtdblock code.

Are you using this on a PCMCIA device? If not, and if you're not in the
Free World, then you're probably breaking the law. 

You might consider using JFFS instead.


-- 
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* corruption with mtdblock
  2000-11-06  0:57 ` David Woodhouse
@ 2000-11-06  2:10   ` Nicolas Pitre
  2000-11-06  7:27     ` David Woodhouse
  0 siblings, 1 reply; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-06  2:10 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd


I've been trying to find a problem with the MTD block interface for two
days now without any luck.

On a 2.4.0-test10 kernel:

	insmod mtdcore.o
	insmod mtdram.o
	insmod mtdblock.o
	mke2fs /dev/mtdblock0
	mount /dev/mtdblock0 /mnt
	cp -a /sbin /mnt/sbin
	cat /dev/mtdblock0 > /dev/null
	[BOOM! random crash...]

Am I the only one to experience this?

Any ideas?


Nicolas



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06  2:10   ` corruption with mtdblock Nicolas Pitre
@ 2000-11-06  7:27     ` David Woodhouse
  2000-11-06 17:54       ` Nicolas Pitre
  0 siblings, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-06  7:27 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd

On Sun, 5 Nov 2000, Nicolas Pitre wrote:
> 	insmod mtdcore.o
> 	insmod mtdram.o
> 	insmod mtdblock.o
> 	mke2fs /dev/mtdblock0
> 	mount /dev/mtdblock0 /mnt
> 	cp -a /sbin /mnt/sbin
> 	cat /dev/mtdblock0 > /dev/null
> 	[BOOM! random crash...]
> 
> Am I the only one to experience this?

Nope. I can reproduce it. handle_mtdblock_request is calling
end_request() when req->bh->b_end_io is NULL, even though it wasn't
like that at the beginning of handle_mtdblock_request.

Looks like the following comment is incorrect:
 * The head of our request queue is considered active so there is no need 
 * to dequeue requests before we are done.

-- 
dwmw2

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 17:54       ` Nicolas Pitre
@ 2000-11-06 14:47         ` David Woodhouse
  2000-11-06 20:11           ` Nicolas Pitre
  2000-11-06 14:54         ` David Woodhouse
  1 sibling, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-06 14:47 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd


nico@cam.org said:
>  So in our case, q->head_active is 1 by default.
> Looking at __make_request() you can see the head of the queue is
> actually skipped when the queue is unplugged.  It is plugged only when
> actually empty and no request are processed until it gets unplugged
> again.  All this is done when io_request_lock is held.  That's why I
> came to the conclusion that requests don't have to be removed earlier.
> 

OK. How about 'plugging'?

The code looks like this...

        /*
         * skip first entry, for devices with active queue head
         */
        if (q->head_active && !q->plugged)
                head = head->next;


It may be that if our request function returns without first ending the 
request (as it does, because all it does is a wake_up()), the queue is 
'plugged' to wait for requests to be merged. 

Then the kernel thread is proceeding to deal with the requests while the 
ll_rw_blk code thinks that the driver's request function isn't running.

I'll see if I can find someone to explain the plugging stuff to me :)

--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 17:54       ` Nicolas Pitre
  2000-11-06 14:47         ` David Woodhouse
@ 2000-11-06 14:54         ` David Woodhouse
  2000-11-06 21:27           ` Nicolas Pitre
  1 sibling, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-06 14:54 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd


Try this:

Index: mtdblock.c
===================================================================
RCS file: /home/cvs/mtd/kernel/mtdblock.c,v
retrieving revision 1.29
diff -u -r1.29 mtdblock.c
--- mtdblock.c	2000/11/03 04:59:01	1.29
+++ mtdblock.c	2000/11/06 14:54:06
@@ -58,6 +58,11 @@
 #define blkdev_dequeue_request(req) do {CURRENT = req->next;} while (0)
 #else
 #define RQFUNC_ARG request_queue_t *q
+static void mtdblock_plug_device(request_queue_t *q, kdev_t dev)
+{
+	/* Do nothing. We don't want plugging */
+	return;
+}
 #endif
 
 
@@ -616,6 +621,7 @@
 	blk_dev[MAJOR_NR].request_fn = mtdblock_request;
 #else
 	blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), &mtdblock_request);
+	blk_queue_pluggable(BLK_DEFAULT_QUEUE(MAJOR_NR), &mtdblock_plug_device)
 #endif
 	kernel_thread (mtdblock_thread, NULL, CLONE_FS|CLONE_FILES|CLONE_SIGHAND);
 	return 0;


--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 20:11           ` Nicolas Pitre
@ 2000-11-06 15:19             ` David Woodhouse
  2000-11-07  9:04               ` Nicolas Pitre
  2000-11-06 15:31             ` David Woodhouse
  1 sibling, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-06 15:19 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd


nico@cam.org said:
>  Nope.  The only place where the queue can be plugged is:
>         if (list_empty(head)) {
>                 q->plug_device_fn(q, bh->b_rdev); /* is atomic */
> Therefore the queue can't be plugged until we emptied it.

But it starts off empty - so doesn't it get plugged the first time a 
request is added? But it doesn't actually bite us until requests are merged 
- when we do 'cat /dev/mtdblock0' or massive writes.

> To be sure I tried to disable plugging by providing a dummy plug
> function that does nothing.  No difference.

Ah. If something is mucking with the request at the head of the queue while 
 (q->head_active && !q->plugged) then I think that has to be a kernel bug.

Just before the end_request() in mtdblock_handle_request(), can you put in 
a sanity check for req->bh->b_end_io? If it's not a sane value, can you 
print the current values of q->head_active and q->plugged?

--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 20:11           ` Nicolas Pitre
  2000-11-06 15:19             ` David Woodhouse
@ 2000-11-06 15:31             ` David Woodhouse
  2000-11-06 21:24               ` Nicolas Pitre
  1 sibling, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-06 15:31 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd

OK. Upon prodding, it seems that I'd misunderstood beforehand. You _CAN_ 
sleep in a standard request function; you don't need to have a kernel 
thread. The only restriction is that you must not allocate anything with 
GFP_KERNEL, because it could deadlock waiting for you to write.

So we can throw out all my kernel thread crap.

--
dwmw2

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 21:24               ` Nicolas Pitre
@ 2000-11-06 17:31                 ` David Woodhouse
  2000-11-07  1:02                   ` Nicolas Pitre
  0 siblings, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-06 17:31 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd


nico@cam.org said:
> On Mon, 6 Nov 2000, David Woodhouse wrote:
> > OK. Upon prodding, it seems that I'd misunderstood beforehand. You
> > _CAN_ sleep in a standard request function; you don't need to have a 
> > kernel thread. 

> Are you certain of that?

> One place where the request function is called is within
> generic_unplug_device() which is called through a task queue (see
> blk_init_queue()).

I believe the tq_disk task queue is always run from process context. 

--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06  7:27     ` David Woodhouse
@ 2000-11-06 17:54       ` Nicolas Pitre
  2000-11-06 14:47         ` David Woodhouse
  2000-11-06 14:54         ` David Woodhouse
  0 siblings, 2 replies; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-06 17:54 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd

On Mon, 6 Nov 2000, David Woodhouse wrote:

> On Sun, 5 Nov 2000, Nicolas Pitre wrote:
> > 	insmod mtdcore.o
> > 	insmod mtdram.o
> > 	insmod mtdblock.o
> > 	mke2fs /dev/mtdblock0
> > 	mount /dev/mtdblock0 /mnt
> > 	cp -a /sbin /mnt/sbin
> > 	cat /dev/mtdblock0 > /dev/null
> > 	[BOOM! random crash...]
> > 
> > Am I the only one to experience this?
> 
> Nope. I can reproduce it. handle_mtdblock_request is calling
> end_request() when req->bh->b_end_io is NULL, even though it wasn't
> like that at the beginning of handle_mtdblock_request.

In fact, it's not always NULL but rather random garbage.

> Looks like the following comment is incorrect:
>  * The head of our request queue is considered active so there is no need 
>  * to dequeue requests before we are done.

If you look at drivers/block/ll_rw_blk.c

/**
 * blk_queue_headactive - indicate whether head of request queue may be active
 * @q:       The queue which this applies to.
 * @active:  A flag indication where the head of the queue is active.
 *
 * Description:
 *    The driver for a block device may choose to leave the currently active
 *    request on the request queue, removing it only when it has completed.
 *    The queue handling routines assume this by default for safety reasons
 *    and will not involve the head of the request queue in any merging or
 *    reordering of requests when the queue is unplugged (and thus may be
 *    working on this particular request).
 *
 *    If a driver removes requests from the queue before processing them, then
 *    it may indicate that it does so, there by allowing the head of the queue
 *    to be involved in merging and reordering.  This is done be calling
 *    blk_queue_headactive() with an @active flag of %0.
 *
 *    If a driver processes several requests at once, it must remove them (or
 *    at least all but one of them) from the request queue.
 *
 *    When a queue is plugged (see blk_queue_pluggable()) the head will be
 *    assumed to be inactive.
 **/

So in our case, q->head_active is 1 by default.

Looking at __make_request() you can see the head of the queue is actually
skipped when the queue is unplugged.  It is plugged only when actually
empty and no request are processed until it gets unplugged again.  All
this is done when io_request_lock is held.  That's why I came to the
conclusion that requests don't have to be removed earlier.

Nicolas

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 14:47         ` David Woodhouse
@ 2000-11-06 20:11           ` Nicolas Pitre
  2000-11-06 15:19             ` David Woodhouse
  2000-11-06 15:31             ` David Woodhouse
  0 siblings, 2 replies; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-06 20:11 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd



On Mon, 6 Nov 2000, David Woodhouse wrote:

> 
> nico@cam.org said:
> >  So in our case, q->head_active is 1 by default.
> > Looking at __make_request() you can see the head of the queue is
> > actually skipped when the queue is unplugged.  It is plugged only when
> > actually empty and no request are processed until it gets unplugged
> > again.  All this is done when io_request_lock is held.  That's why I
> > came to the conclusion that requests don't have to be removed earlier.
> > 
> 
> OK. How about 'plugging'?
> 
> The code looks like this...
> 
>         /*
>          * skip first entry, for devices with active queue head
>          */
>         if (q->head_active && !q->plugged)
>                 head = head->next;
> 
> 
> It may be that if our request function returns without first ending the 
> request (as it does, because all it does is a wake_up()), the queue is 
> 'plugged' to wait for requests to be merged. 

Nope.  The only place where the queue can be plugged is:

        if (list_empty(head)) {
                q->plug_device_fn(q, bh->b_rdev); /* is atomic */

Therefore the queue can't be plugged until we emptied it.

To be sure I tried to disable plugging by providing a dummy plug function
that does nothing.  No difference.


Nicolas



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 15:31             ` David Woodhouse
@ 2000-11-06 21:24               ` Nicolas Pitre
  2000-11-06 17:31                 ` David Woodhouse
  0 siblings, 1 reply; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-06 21:24 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd



On Mon, 6 Nov 2000, David Woodhouse wrote:

> 
> OK. Upon prodding, it seems that I'd misunderstood beforehand. You _CAN_ 
> sleep in a standard request function; you don't need to have a kernel 
> thread. 

Are you certain of that?

One place where the request function is called is within
generic_unplug_device() which is called through a task queue (see
blk_init_queue()).


Nicolas




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 14:54         ` David Woodhouse
@ 2000-11-06 21:27           ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-06 21:27 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd



On Mon, 6 Nov 2000, David Woodhouse wrote:

> 
> Try this:
> 
> Index: mtdblock.c
> ===================================================================
> RCS file: /home/cvs/mtd/kernel/mtdblock.c,v
> retrieving revision 1.29
> diff -u -r1.29 mtdblock.c
> --- mtdblock.c	2000/11/03 04:59:01	1.29
> +++ mtdblock.c	2000/11/06 14:54:06
> @@ -58,6 +58,11 @@
>  #define blkdev_dequeue_request(req) do {CURRENT = req->next;} while (0)
>  #else
>  #define RQFUNC_ARG request_queue_t *q
> +static void mtdblock_plug_device(request_queue_t *q, kdev_t dev)
> +{
> +	/* Do nothing. We don't want plugging */
> +	return;
> +}
>  #endif
>  
>  

Yup.  Already tried that like I said earlier.  No difference.


Nicolas



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 17:31                 ` David Woodhouse
@ 2000-11-07  1:02                   ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-07  1:02 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd

On Mon, 6 Nov 2000, David Woodhouse wrote:

> 
> nico@cam.org said:
> > On Mon, 6 Nov 2000, David Woodhouse wrote:
> > > OK. Upon prodding, it seems that I'd misunderstood beforehand. You
> > > _CAN_ sleep in a standard request function; you don't need to have a 
> > > kernel thread. 
> 
> > Are you certain of that?
> 
> > One place where the request function is called is within
> > generic_unplug_device() which is called through a task queue (see
> > blk_init_queue()).
> 
> I believe the tq_disk task queue is always run from process context. 

Hmmm...  Maybe.  However the code at many places where the tq_disk is run
seem to expect it is run asynchronously while some other places it is run
synchronously while sticking to a wait queue with the task state set to
TASK_UNINTERRUPTIBLE...  I'm afraid we could break something by suddenly
returning from run_task_queue() with the current process' state set to
TASK_RUNNING.

Anyway it doesn't explain the corruption problem.

Nicolas

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-06 15:19             ` David Woodhouse
@ 2000-11-07  9:04               ` Nicolas Pitre
  2000-11-08 21:50                 ` David Woodhouse
  0 siblings, 1 reply; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-07  9:04 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd



On Mon, 6 Nov 2000, David Woodhouse wrote:

> 
> nico@cam.org said:
> >  Nope.  The only place where the queue can be plugged is:
> >         if (list_empty(head)) {
> >                 q->plug_device_fn(q, bh->b_rdev); /* is atomic */
> > Therefore the queue can't be plugged until we emptied it.
> 
> But it starts off empty - so doesn't it get plugged the first time a 
> request is added? But it doesn't actually bite us until requests are merged 
> - when we do 'cat /dev/mtdblock0' or massive writes.
> 
> > To be sure I tried to disable plugging by providing a dummy plug
> > function that does nothing.  No difference.
> 
> Ah. If something is mucking with the request at the head of the queue while 
>  (q->head_active && !q->plugged) then I think that has to be a kernel bug.

OK!  Forget all this now.  It wasn't a kernel bug, of course.

The lesson is:  never use req->nr_sectors in a request function unless
you're playing special tricks with different segments and you know what
you're doing.  In the common case req->current_nr_sectors should be used.

It works beautifully now.


Nicolas



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-07  9:04               ` Nicolas Pitre
@ 2000-11-08 21:50                 ` David Woodhouse
  2000-11-09  0:28                   ` Nicolas Pitre
  0 siblings, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-08 21:50 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd

On Tue, 7 Nov 2000, Nicolas Pitre wrote:

> > Ah. If something is mucking with the request at the head of the queue while 
> >  (q->head_active && !q->plugged) then I think that has to be a kernel bug.
> 
> OK!  Forget all this now.  It wasn't a kernel bug, of course.
> 
> The lesson is:  never use req->nr_sectors in a request function unless
> you're playing special tricks with different segments and you know what
> you're doing.  In the common case req->current_nr_sectors should be used.
> 
> It works beautifully now.

I think we still need to disable plugging to protect the head of the 
list. It's a small race but it's non-zero.

-- 
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-08 21:50                 ` David Woodhouse
@ 2000-11-09  0:28                   ` Nicolas Pitre
  2000-11-09  7:58                     ` David Woodhouse
  2000-11-09  8:12                     ` David Woodhouse
  0 siblings, 2 replies; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-09  0:28 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd

On Wed, 8 Nov 2000, David Woodhouse wrote:

> On Tue, 7 Nov 2000, Nicolas Pitre wrote:
> 
> > > Ah. If something is mucking with the request at the head of the queue while 
> > >  (q->head_active && !q->plugged) then I think that has to be a kernel bug.
> > 
> > OK!  Forget all this now.  It wasn't a kernel bug, of course.
> > 
> > The lesson is:  never use req->nr_sectors in a request function unless
> > you're playing special tricks with different segments and you know what
> > you're doing.  In the common case req->current_nr_sectors should be used.
> > 
> > It works beautifully now.
> 
> I think we still need to disable plugging to protect the head of the 
> list. It's a small race but it's non-zero.

Please show me.

After reviewing the code again and again, I can't see where a race could
actually happen.  Plugging is also a good thing (tm) for flash because is
increases the chance of not trashing the mtdblock local cache.  Since the
request function is called only when the queue is unplugged, and because
the queue head is considered active, we can leave the current request on
the queue while it is being processed.  As long as there is a request on
the queue, it won't get re-plugged so we are still safe.  This prevents
unnecessary processing, makes the code look prettier and is more portable
between 2.2.x and 2.4.x.

Nicolas

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09  0:28                   ` Nicolas Pitre
@ 2000-11-09  7:58                     ` David Woodhouse
  2000-11-09 14:22                       ` Nicolas Pitre
  2000-11-09  8:12                     ` David Woodhouse
  1 sibling, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-09  7:58 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd

On Wed, 8 Nov 2000, Nicolas Pitre wrote:

> > I think we still need to disable plugging to protect the head of the 
> > list. It's a small race but it's non-zero.
> 
> Please show me.

	CPU 0			CPU 1

    handle_mtdblock_request()
	-> end_request()


	   * Request queue is now empty *

			    __make_request()
    io_request_unlock
			    	-> io_request_lock
				   plug_device_fn()
				   put request on queue
				   io_request_unlock


	   * Request queue is plugged, but !empty *

    io_request_lock
    if (!QUEUE_EMPTY)
	handle_mtdblock_request()  ---- BOOM!


I agree that plugging is useful for us. So rather than disabling it, how
about changing the check in mtdblock_thread() to:

	if (QUEUE_EMPTY || QUEUE_PLUGGED)

-- 
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09  0:28                   ` Nicolas Pitre
  2000-11-09  7:58                     ` David Woodhouse
@ 2000-11-09  8:12                     ` David Woodhouse
  2000-11-09 11:34                       ` David Woodhouse
  2000-11-09 14:34                       ` corruption with mtdblock Nicolas Pitre
  1 sibling, 2 replies; 29+ messages in thread
From: David Woodhouse @ 2000-11-09  8:12 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd


Upon further consideration, it may be cleaner (at the cost of _slightly_
higher io_request_lock contention) just to _not_ drop the spinlock between
finishing one request and checking to see if the queue is empty.

How about this?

 	exit_sighand(tsk);
 	exit_fs(tsk);
 
+	spin_lock_irq(&io_request_lock);
 	while (!leaving) {
 		add_wait_queue(&thr_wq, &wait);
 		set_current_state(TASK_INTERRUPTIBLE);
-		spin_lock_irq(&io_request_lock);
 		if (QUEUE_EMPTY) {
 			spin_unlock_irq(&io_request_lock);
 			schedule();
 			remove_wait_queue(&thr_wq, &wait); 
+			spin_lock_irq(&io_request_lock);
 		} else {
 			remove_wait_queue(&thr_wq, &wait); 
 			set_current_state(TASK_RUNNING);
 			handle_mtdblock_request();
-			spin_unlock_irq(&io_request_lock);
 		}
 	}
+	spin_unlock_irq(&io_request_lock);
 	up(&thread_sem);
 	return 0;
 }


-- 
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09  8:12                     ` David Woodhouse
@ 2000-11-09 11:34                       ` David Woodhouse
  2000-11-09 15:21                         ` Nicolas Pitre
  2000-11-09 14:34                       ` corruption with mtdblock Nicolas Pitre
  1 sibling, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-09 11:34 UTC (permalink / raw)
  Cc: Nicolas Pitre, mtd


dwmw2@infradead.org said:
>  Upon further consideration, it may be cleaner (at the cost of
> _slightly_ higher io_request_lock contention) just to _not_ drop the
> spinlock between finishing one request and checking to see if the
> queue is empty.

'further consideration' is all well and good - but upon consumption of 
generous quantities of coffee, it occurs to me that this doesn't protect us 
from eating the first request on the queue if one arrives while the kernel 
thread is still initialising itself.

Defining a QUEUE_PLUGGED macro isn't that ugly, is it?

--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09  7:58                     ` David Woodhouse
@ 2000-11-09 14:22                       ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-09 14:22 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd



On Thu, 9 Nov 2000, David Woodhouse wrote:

> I agree that plugging is useful for us. So rather than disabling it, how
> about changing the check in mtdblock_thread() to:
> 
> 	if (QUEUE_EMPTY || QUEUE_PLUGGED)

Agreed.


Nicolas



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09 14:34                       ` corruption with mtdblock Nicolas Pitre
@ 2000-11-09 14:23                         ` David Woodhouse
  0 siblings, 0 replies; 29+ messages in thread
From: David Woodhouse @ 2000-11-09 14:23 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd


nico@cam.org said:
>  Yep!  This way there is no opened window between the two QUEUE_EMPTY
> tests which is clearer.

/me waits for Nico to read the third mail.... 

:)

--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09  8:12                     ` David Woodhouse
  2000-11-09 11:34                       ` David Woodhouse
@ 2000-11-09 14:34                       ` Nicolas Pitre
  2000-11-09 14:23                         ` David Woodhouse
  1 sibling, 1 reply; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-09 14:34 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd



On Thu, 9 Nov 2000, David Woodhouse wrote:

> 
> Upon further consideration, it may be cleaner (at the cost of _slightly_
> higher io_request_lock contention) just to _not_ drop the spinlock between
> finishing one request and checking to see if the queue is empty.
> 
> How about this?
> 
>  	exit_sighand(tsk);
>  	exit_fs(tsk);
>  
> +	spin_lock_irq(&io_request_lock);
>  	while (!leaving) {
>  		add_wait_queue(&thr_wq, &wait);
>  		set_current_state(TASK_INTERRUPTIBLE);
> -		spin_lock_irq(&io_request_lock);
>  		if (QUEUE_EMPTY) {
>  			spin_unlock_irq(&io_request_lock);
>  			schedule();
>  			remove_wait_queue(&thr_wq, &wait); 
> +			spin_lock_irq(&io_request_lock);
>  		} else {
>  			remove_wait_queue(&thr_wq, &wait); 
>  			set_current_state(TASK_RUNNING);
>  			handle_mtdblock_request();
> -			spin_unlock_irq(&io_request_lock);
>  		}
>  	}
> +	spin_unlock_irq(&io_request_lock);
>  	up(&thread_sem);
>  	return 0;
>  }

Yep!  This way there is no opened window between the two QUEUE_EMPTY
tests which is clearer.


Nicolas



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09 11:34                       ` David Woodhouse
@ 2000-11-09 15:21                         ` Nicolas Pitre
  2000-11-09 22:36                           ` David Woodhouse
  0 siblings, 1 reply; 29+ messages in thread
From: Nicolas Pitre @ 2000-11-09 15:21 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd



Oh!  here's the third one!  It was sunk in my inbox... ;)

On Thu, 9 Nov 2000, David Woodhouse wrote:

> 
> dwmw2@infradead.org said:
> >  Upon further consideration, it may be cleaner (at the cost of
> > _slightly_ higher io_request_lock contention) just to _not_ drop the
> > spinlock between finishing one request and checking to see if the
> > queue is empty.
> 
> 'further consideration' is all well and good - but upon consumption of 
> generous quantities of coffee, it occurs to me that this doesn't protect us 
> from eating the first request on the queue if one arrives while the kernel 
> thread is still initialising itself.

Well spotted.

(today i'm definitely destined to agree with you whatever you say...)

> Defining a QUEUE_PLUGGED macro isn't that ugly, is it?

No it isn't.


Nicolas



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: corruption with mtdblock
  2000-11-09 15:21                         ` Nicolas Pitre
@ 2000-11-09 22:36                           ` David Woodhouse
  2000-11-11 23:38                             ` Detecting the Disk-On-Chip 2800 Adam Agnew
  0 siblings, 1 reply; 29+ messages in thread
From: David Woodhouse @ 2000-11-09 22:36 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: mtd

OK, I've committed a version which checks for QUEUE_PLUGGED as discussed.
I've defined it as follows:

#if LINUX_VERSION_CODE < 0x20300
#define QUEUE_PLUGGED (blk_dev[MAJOR_NR].plug_tq.sync)
#else
#define QUEUE_PLUGGED (blk_dev[MAJOR_NR].request_queue.plugged)
#endif

The version number is suspect, but I doubt anyone's really running early
2.3 kernels any more so it's not the end of the world.

I still don't have a 2.2 kernel tree at home - so can someone verify that
I've got that version right?

-- 
dwmw2

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Detecting the Disk-On-Chip 2800
  2000-11-09 22:36                           ` David Woodhouse
@ 2000-11-11 23:38                             ` Adam Agnew
  2000-11-12 23:27                               ` David Woodhouse
  2000-11-13  0:24                               ` Ollie Lho
  0 siblings, 2 replies; 29+ messages in thread
From: Adam Agnew @ 2000-11-11 23:38 UTC (permalink / raw)
  To: mtd

Is it common to be unable to detect the DOC after you've flashed a bios to
it? Can you get around that by explicitly stating the the memory address
and length for it?
- Adam Agnew



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Detecting the Disk-On-Chip 2800
  2000-11-11 23:38                             ` Detecting the Disk-On-Chip 2800 Adam Agnew
@ 2000-11-12 23:27                               ` David Woodhouse
  2000-11-13  0:24                               ` Ollie Lho
  1 sibling, 0 replies; 29+ messages in thread
From: David Woodhouse @ 2000-11-12 23:27 UTC (permalink / raw)
  To: Adam Agnew; +Cc: mtd

On Sat, 11 Nov 2000, Adam Agnew wrote:

> Is it common to be unable to detect the DOC after you've flashed a bios to
> it?

Yes. You've removed the 0x55 0xaa signature at the beginning of the device
which the probe code uses to detect it.

> Can you get around that by explicitly stating the the memory address
> and length for it?

You can disable the check for that signature by turning off
CONFIG_MTD_DOCPROBE_55AA. It should still do the rest of the probe as
normal.

-- 
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Detecting the Disk-On-Chip 2800
  2000-11-11 23:38                             ` Detecting the Disk-On-Chip 2800 Adam Agnew
  2000-11-12 23:27                               ` David Woodhouse
@ 2000-11-13  0:24                               ` Ollie Lho
  2000-11-13  6:35                                 ` Adam Agnew
  1 sibling, 1 reply; 29+ messages in thread
From: Ollie Lho @ 2000-11-13  0:24 UTC (permalink / raw)
  To: Adam Agnew; +Cc: mtd

Adam Agnew wrote:
> 
> Is it common to be unable to detect the DOC after you've flashed a bios to
> it? Can you get around that by explicitly stating the the memory address
> and length for it?
> - Adam Agnew
> 

Adam,
	You should ask me directly, that is what I did to the driver ;-). The
original firmware (IPL) in the DoC has 0x55AA signature in it. So the driver
probe for it to determine if there is a DoC on the bus. After we reprogramming
the IPL area, the signature is gone. You should CVS update your MTD source
tree and reconfigure the kernel. There are two options, one is PROBE_HIGH,
which will probe high address (near 4GB) rather than log address (under 1MB)
for DoC. The other is PROBE_55AA, which probe the signature. If you are playing
LinuxBIOS with SiS 630, please choose PROBE_HIGH == y, PROBE_55AA = n.

Ollie


To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Detecting the Disk-On-Chip 2800
  2000-11-13  0:24                               ` Ollie Lho
@ 2000-11-13  6:35                                 ` Adam Agnew
  0 siblings, 0 replies; 29+ messages in thread
From: Adam Agnew @ 2000-11-13  6:35 UTC (permalink / raw)
  To: Ollie Lho; +Cc: mtd

Awesome, thank you. Sorry, I haven't figured out yet who has done what,
but I really appreciate that contribution :)

On Mon, 13 Nov 2000, Ollie Lho wrote:

> Adam Agnew wrote:
> > 
> > Is it common to be unable to detect the DOC after you've flashed a bios to
> > it? Can you get around that by explicitly stating the the memory address
> > and length for it?
> > - Adam Agnew
> > 
> 
> Adam,
> 	You should ask me directly, that is what I did to the driver ;-). The
> original firmware (IPL) in the DoC has 0x55AA signature in it. So the driver
> probe for it to determine if there is a DoC on the bus. After we reprogramming
> the IPL area, the signature is gone. You should CVS update your MTD source
> tree and reconfigure the kernel. There are two options, one is PROBE_HIGH,
> which will probe high address (near 4GB) rather than log address (under 1MB)
> for DoC. The other is PROBE_55AA, which probe the signature. If you are playing
> LinuxBIOS with SiS 630, please choose PROBE_HIGH == y, PROBE_55AA = n.
> 
> Ollie
> 



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2000-11-13  6:35 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2000-11-05 22:52 Erase unit header corrupted when using FTL Brian Kuschak
2000-11-06  0:57 ` David Woodhouse
2000-11-06  2:10   ` corruption with mtdblock Nicolas Pitre
2000-11-06  7:27     ` David Woodhouse
2000-11-06 17:54       ` Nicolas Pitre
2000-11-06 14:47         ` David Woodhouse
2000-11-06 20:11           ` Nicolas Pitre
2000-11-06 15:19             ` David Woodhouse
2000-11-07  9:04               ` Nicolas Pitre
2000-11-08 21:50                 ` David Woodhouse
2000-11-09  0:28                   ` Nicolas Pitre
2000-11-09  7:58                     ` David Woodhouse
2000-11-09 14:22                       ` Nicolas Pitre
2000-11-09  8:12                     ` David Woodhouse
2000-11-09 11:34                       ` David Woodhouse
2000-11-09 15:21                         ` Nicolas Pitre
2000-11-09 22:36                           ` David Woodhouse
2000-11-11 23:38                             ` Detecting the Disk-On-Chip 2800 Adam Agnew
2000-11-12 23:27                               ` David Woodhouse
2000-11-13  0:24                               ` Ollie Lho
2000-11-13  6:35                                 ` Adam Agnew
2000-11-09 14:34                       ` corruption with mtdblock Nicolas Pitre
2000-11-09 14:23                         ` David Woodhouse
2000-11-06 15:31             ` David Woodhouse
2000-11-06 21:24               ` Nicolas Pitre
2000-11-06 17:31                 ` David Woodhouse
2000-11-07  1:02                   ` Nicolas Pitre
2000-11-06 14:54         ` David Woodhouse
2000-11-06 21:27           ` Nicolas Pitre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox