More comments

public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed

* More comments
@ 2000-03-12 12:20 Alexander Larsson
  2000-03-12 13:44 ` David Woodhouse
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Larsson @ 2000-03-12 12:20 UTC (permalink / raw)
  To: mtd

Ok. I've worked some more with mtd now.

What is the reason for having the read and write ops not always read/write
the number of bytes you specify? (I guess that this is what the size_t
*retlen argument means.)

What concurrency assumptions does mtd make? Since the ops can sleep
(could be waiting for a sector erase to finish) there must be some kind of
locking. Who does the locking? What is "int *lock" in struct mtd? (and
shouldn't it be a semaphore or a spinlock?) Which operations can be called
from interrupt-context?

/ Alex

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-12 12:20 More comments Alexander Larsson
@ 2000-03-12 13:44 ` David Woodhouse
  2000-03-12 17:16   ` Alexander Larsson
  0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-12 13:44 UTC (permalink / raw)
  To: Alexander Larsson; +Cc: mtd

alex@cendio.se said:
>  What is the reason for having the read and write ops not always read/
> write the number of bytes you specify? (I guess that this is what the
> size_t *retlen argument means.)

That's normal - the read() and write() calls of everything have always been
allowed to return a different number of bytes than the number they were asked
to transfer - and not only when the O_NONBLOCK flag is in force. I just chose
to return the actual number of bytes in a separate place to the error code,
that's all.

> What concurrency assumptions does mtd make? 

Absolutely none.

> Since the ops can sleep (could be waiting for a sector erase to finish) 
> there must be some kind of locking. Who does the locking?

Locking issues can be complex and hardware-dependent, depending on your paging 
strategy, on how many different windows you have available for paging, etc...

Therefore, any locking you require is entirely internal to your driver. You 
may be called concurrently by every processor in the system, and if you've 
called schedule() you can be called even more than that.

> What is "int *lock" in struct mtd? 

A mistake, which I believe crept in with mapped.c and its associated drivers.

> (and shouldn't it be a semaphore or a spinlock?) 

No, it should be deleted :)

> Which operations can be called from interrupt-context? 

To declare "operation XXX can be called from interrupt context" we have to 
impose that on the author of _every_ device driver. Unless you want to 
enlighten me, I can't see any reason why we'd want to do that - so my answer 
for now is 'none'.

This touches on one of the reasons why I'd like to phase out the mtd_point()
routines - because I don't like the locking issues involved. If your device
makes use of paging, then you'd have to lock the device at point() and leave it
locked until unpoint() - which could be a long time depending on the caller. So
while I'd like to have the facility, because it will allow us to do XIP, I'm
tempted to restrict it to only those devices which can do it without exclusive
locking - i.e. devices which are completely mem-mapped and not paged.

--
dwmw2

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-12 13:44 ` David Woodhouse
@ 2000-03-12 17:16   ` Alexander Larsson
  2000-03-12 17:26     ` David Woodhouse
  2000-03-14 17:13     ` David Woodhouse
  0 siblings, 2 replies; 10+ messages in thread
From: Alexander Larsson @ 2000-03-12 17:16 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd

On Sun, 12 Mar 2000, David Woodhouse wrote:

> alex@cendio.se said:
> >  What is the reason for having the read and write ops not always read/
> > write the number of bytes you specify? (I guess that this is what the
> > size_t *retlen argument means.)
> 
> That's normal - the read() and write() calls of everything have always been
> allowed to return a different number of bytes than the number they were asked
> to transfer - and not only when the O_NONBLOCK flag is in force. I just chose
> to return the actual number of bytes in a separate place to the error code,
> that's all.

I'm aware of the normal conventions for read (2), but is this good in the
kernelspace api? I mean, the only reason for a flash to not read all data
is if the flash is paged, and you don't want to bother changing page in
the read function. On the other hand, all places in the kernel that 
read data from the flash except the char device read op (ie. the block
device, mtd, flash filesystems etc) needs to loop on each read. This
looks like a lot of overhead code that is easy to forget (the current ftl
code doesn't seem to do any looping on reads).
 
> > What concurrency assumptions does mtd make? 
> 
> Absolutely none.
 Good.
 
> > What is "int *lock" in struct mtd? 
> 
> A mistake, which I believe crept in with mapped.c and its associated drivers.
> 
> > (and shouldn't it be a semaphore or a spinlock?) 
> 
> No, it should be deleted :)

 Exterminate it with extreme prejudice.
 
> This touches on one of the reasons why I'd like to phase out the mtd_point()
> routines - because I don't like the locking issues involved. If your device
> makes use of paging, then you'd have to lock the device at point() and leave it
> locked until unpoint() - which could be a long time depending on the caller. So
> while I'd like to have the facility, because it will allow us to do XIP, I'm
> tempted to restrict it to only those devices which can do it without exclusive
> locking - i.e. devices which are completely mem-mapped and not paged.

But if the device is fully memory mapped the current point()/unpoint()
api is a bit overweight. A pointer and size would be enough.

/ Alex




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-12 17:16   ` Alexander Larsson
@ 2000-03-12 17:26     ` David Woodhouse
  2000-03-13  9:06       ` Alexander Larsson
  2000-03-14 17:13     ` David Woodhouse
  1 sibling, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-12 17:26 UTC (permalink / raw)
  To: Alexander Larsson; +Cc: mtd

alex@cendio.se said:
>  I'm aware of the normal conventions for read (2), but is this good in
> the kernelspace api? I mean, the only reason for a flash to not read
> all data is if the flash is paged, and you don't want to bother
> changing page in the read function. On the other hand, all places in
> the kernel that  read data from the flash except the char device read
> op (ie. the block device, mtd, flash filesystems etc) needs to loop on
> each read.

In practice, this hasn't mattered because the devices allow you to read up to 
a block at a time, and the block device and flash filesystems usually want to 
read <= 1 block. The only place that has to retry is the mtd chardevice, which 
means the retry is in userspace.

However, that behaviour isn't guaranteed, so strictly speaking, perhaps we 
ought to change it. How about simply declaring that the read() functions MUST 
read up to the end of the block on which they start, and may not return an 
incomplete read unless requested to do a multi-block read.

That would make it safe for all current the kernel-space users, wouldn't it?

>  But if the device is fully memory mapped the current point()/
> unpoint() api is a bit overweight. A pointer and size would be enough.

Fine. Then it can go too. Just as soon as I've stopped ftl.c from using it :)

--
dwmw2

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-12 17:26     ` David Woodhouse
@ 2000-03-13  9:06       ` Alexander Larsson
  2000-03-13  9:15         ` David Woodhouse
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Larsson @ 2000-03-13  9:06 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd

On Sun, 12 Mar 2000, David Woodhouse wrote:

> 
> alex@cendio.se said:
> >  I'm aware of the normal conventions for read (2), but is this good in
> > the kernelspace api? I mean, the only reason for a flash to not read
> > all data is if the flash is paged, and you don't want to bother
> > changing page in the read function. On the other hand, all places in
> > the kernel that  read data from the flash except the char device read
> > op (ie. the block device, mtd, flash filesystems etc) needs to loop on
> > each read.
> 
> In practice, this hasn't mattered because the devices allow you to read up to 
> a block at a time, and the block device and flash filesystems usually want to 
> read <= 1 block. The only place that has to retry is the mtd chardevice, which 
> means the retry is in userspace.
> 
> However, that behaviour isn't guaranteed, so strictly speaking, perhaps we 
> ought to change it. How about simply declaring that the read() functions MUST 
> read up to the end of the block on which they start, and may not return an 
> incomplete read unless requested to do a multi-block read.

I don't know. It still sounds a bit strange to me. I can understand it
being better in the case of the char device, there you can avoid
sleeping in the kernel and just return to userspace. But I'm planning
to port Axis journaling flash file system to MTD, and it writes all
data in a linear log (like a circular buffer) meaning that writes could
easily be over block boundaries. That means i have to wrap all write
and reads in a function that loops over the read/write call.
 
> That would make it safe for all current the kernel-space users, wouldn't it?
> 
> >  But if the device is fully memory mapped the current point()/
> > unpoint() api is a bit overweight. A pointer and size would be enough.
> 
> Fine. Then it can go too. Just as soon as I've stopped ftl.c from using it :)

Hmm, i think i spoke to soon. It would still need some kind of locking.
How about an explicit read_lock()/read_unlock(). It could even use
rw_semaphores to allow concurrent reads from several readers.

/ Alex




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-13  9:06       ` Alexander Larsson
@ 2000-03-13  9:15         ` David Woodhouse
  2000-03-13  9:42           ` Alexander Larsson
  0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-13  9:15 UTC (permalink / raw)
  To: Alexander Larsson; +Cc: mtd


alex@cendio.se said:
>  That means i have to wrap all write and reads in a function that
> loops over the read/write call.

This has to be done somewhere. Where's best - in _every_ device driver or in
only those front ends which attempt non-block-aligned access?


> Hmm, i think i spoke to soon. It would still need some kind of
> locking. How about an explicit read_lock()/read_unlock(). It could
> even use rw_semaphores to allow concurrent reads from several readers.

Yes - that sounds like a good idea.



--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-13  9:15         ` David Woodhouse
@ 2000-03-13  9:42           ` Alexander Larsson
  2000-03-28 10:18             ` David Woodhouse
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Larsson @ 2000-03-13  9:42 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd

On Mon, 13 Mar 2000, David Woodhouse wrote:

> 
> alex@cendio.se said:
> >  That means i have to wrap all write and reads in a function that
> > loops over the read/write call.
> 
> This has to be done somewhere. Where's best - in _every_ device driver or in
> only those front ends which attempt non-block-aligned access?

I don't know. You seem to think that it's better to have it in the
front-ends, and I can accept that. I think the "_every_ driver" part is
a bit of an overstatement though, as I would think many drivers are
for fully memory-mapped flashes where this wouldn't be necessary (where,
in fact, looping in the frontend is unnecessary overhead).

/ Alex

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-12 17:16   ` Alexander Larsson
  2000-03-12 17:26     ` David Woodhouse
@ 2000-03-14 17:13     ` David Woodhouse
  1 sibling, 0 replies; 10+ messages in thread
From: David Woodhouse @ 2000-03-14 17:13 UTC (permalink / raw)
  To: Alexander Larsson; +Cc: mtd


alex@cendio.se said:
> 
>  Exterminate it with extreme prejudice.

Committed to CVS.

--
dwmw2




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-13  9:42           ` Alexander Larsson
@ 2000-03-28 10:18             ` David Woodhouse
  2000-03-28 11:30               ` Alexander Larsson
  0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-28 10:18 UTC (permalink / raw)
  To: Alexander Larsson; +Cc: mtd

alex@cendio.se said:
> On Mon, 13 Mar 2000, David Woodhouse wrote:
> > This has to be done somewhere. Where's best - in _every_ device driver
> > or in only those front ends which attempt non-block-aligned access?

> I don't know. You seem to think that it's better to have it in the
> front-ends, and I can accept that. I think the "_every_ driver" part
> is a bit of an overstatement though, as I would think many drivers are
> for fully memory-mapped flashes where this wouldn't be necessary
> (where, in fact, looping in the frontend is unnecessary overhead).
>  

OK. How about this:

You don't have to check the return from the mtd->write() function, it must 
always write the complete buffer or return failure.

_BUT_ you may never make a write request which crosses a block boundary.

So you still have to possibly issue more than one write request in the jffs 
code, but at least you can know beforehand and code accordingly, rather than 
having a naïve loop.

 (s/write/read/ throughout also applies, obviously.)

I'm putting together some documentation. 

	www.linux-mtd.infradead.org/tech/API.html
					/mtd_info.html
					/core.html
					/erase.html

This will contain some changes to what's currently implemented - most notably 
the notifier stuff will no longer have the horrible race conditions that it 
currently does.

--
dwmw2

To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: More comments
  2000-03-28 10:18             ` David Woodhouse
@ 2000-03-28 11:30               ` Alexander Larsson
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Larsson @ 2000-03-28 11:30 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd

On Tue, 28 Mar 2000, David Woodhouse wrote:
> alex@cendio.se said:
> > On Mon, 13 Mar 2000, David Woodhouse wrote:
> > > This has to be done somewhere. Where's best - in _every_ device driver
> > > or in only those front ends which attempt non-block-aligned access?
> 
> > I don't know. You seem to think that it's better to have it in the
> > front-ends, and I can accept that. I think the "_every_ driver" part
> > is a bit of an overstatement though, as I would think many drivers are
> > for fully memory-mapped flashes where this wouldn't be necessary
> > (where, in fact, looping in the frontend is unnecessary overhead).
> >  
> 
> OK. How about this:
> 
> You don't have to check the return from the mtd->write() function, it must 
> always write the complete buffer or return failure.
> 
> _BUT_ you may never make a write request which crosses a block boundary.
> 
> So you still have to possibly issue more than one write request in the jffs 
> code, but at least you can know beforehand and code accordingly, rather than 
> having a naïve loop.
> 
>  (s/write/read/ throughout also applies, obviously.)
 
Yes, this seems like a very good compromise. The jffs case is a bit
excpetional in this case.
 
> I'm putting together some documentation. 
> 
> 	www.linux-mtd.infradead.org/tech/API.html
> 					/mtd_info.html
> 					/core.html
> 					/erase.html
> 
> This will contain some changes to what's currently implemented - most notably 
> the notifier stuff will no longer have the horrible race conditions that it 
> currently does.

 I'll read it later tonigh. Review meetings all day...

/ Alex




To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2000-03-28 11:30 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2000-03-12 12:20 More comments Alexander Larsson
2000-03-12 13:44 ` David Woodhouse
2000-03-12 17:16   ` Alexander Larsson
2000-03-12 17:26     ` David Woodhouse
2000-03-13  9:06       ` Alexander Larsson
2000-03-13  9:15         ` David Woodhouse
2000-03-13  9:42           ` Alexander Larsson
2000-03-28 10:18             ` David Woodhouse
2000-03-28 11:30               ` Alexander Larsson
2000-03-14 17:13     ` David Woodhouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox