* More comments
@ 2000-03-12 12:20 Alexander Larsson
2000-03-12 13:44 ` David Woodhouse
0 siblings, 1 reply; 10+ messages in thread
From: Alexander Larsson @ 2000-03-12 12:20 UTC (permalink / raw)
To: mtd
Ok. I've worked some more with mtd now.
What is the reason for having the read and write ops not always read/write
the number of bytes you specify? (I guess that this is what the size_t
*retlen argument means.)
What concurrency assumptions does mtd make? Since the ops can sleep
(could be waiting for a sector erase to finish) there must be some kind of
locking. Who does the locking? What is "int *lock" in struct mtd? (and
shouldn't it be a semaphore or a spinlock?) Which operations can be called
from interrupt-context?
/ Alex
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-12 12:20 More comments Alexander Larsson
@ 2000-03-12 13:44 ` David Woodhouse
2000-03-12 17:16 ` Alexander Larsson
0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-12 13:44 UTC (permalink / raw)
To: Alexander Larsson; +Cc: mtd
alex@cendio.se said:
> What is the reason for having the read and write ops not always read/
> write the number of bytes you specify? (I guess that this is what the
> size_t *retlen argument means.)
That's normal - the read() and write() calls of everything have always been
allowed to return a different number of bytes than the number they were asked
to transfer - and not only when the O_NONBLOCK flag is in force. I just chose
to return the actual number of bytes in a separate place to the error code,
that's all.
> What concurrency assumptions does mtd make?
Absolutely none.
> Since the ops can sleep (could be waiting for a sector erase to finish)
> there must be some kind of locking. Who does the locking?
Locking issues can be complex and hardware-dependent, depending on your paging
strategy, on how many different windows you have available for paging, etc...
Therefore, any locking you require is entirely internal to your driver. You
may be called concurrently by every processor in the system, and if you've
called schedule() you can be called even more than that.
> What is "int *lock" in struct mtd?
A mistake, which I believe crept in with mapped.c and its associated drivers.
> (and shouldn't it be a semaphore or a spinlock?)
No, it should be deleted :)
> Which operations can be called from interrupt-context?
To declare "operation XXX can be called from interrupt context" we have to
impose that on the author of _every_ device driver. Unless you want to
enlighten me, I can't see any reason why we'd want to do that - so my answer
for now is 'none'.
This touches on one of the reasons why I'd like to phase out the mtd_point()
routines - because I don't like the locking issues involved. If your device
makes use of paging, then you'd have to lock the device at point() and leave it
locked until unpoint() - which could be a long time depending on the caller. So
while I'd like to have the facility, because it will allow us to do XIP, I'm
tempted to restrict it to only those devices which can do it without exclusive
locking - i.e. devices which are completely mem-mapped and not paged.
--
dwmw2
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-12 13:44 ` David Woodhouse
@ 2000-03-12 17:16 ` Alexander Larsson
2000-03-12 17:26 ` David Woodhouse
2000-03-14 17:13 ` David Woodhouse
0 siblings, 2 replies; 10+ messages in thread
From: Alexander Larsson @ 2000-03-12 17:16 UTC (permalink / raw)
To: David Woodhouse; +Cc: mtd
On Sun, 12 Mar 2000, David Woodhouse wrote:
> alex@cendio.se said:
> > What is the reason for having the read and write ops not always read/
> > write the number of bytes you specify? (I guess that this is what the
> > size_t *retlen argument means.)
>
> That's normal - the read() and write() calls of everything have always been
> allowed to return a different number of bytes than the number they were asked
> to transfer - and not only when the O_NONBLOCK flag is in force. I just chose
> to return the actual number of bytes in a separate place to the error code,
> that's all.
I'm aware of the normal conventions for read (2), but is this good in the
kernelspace api? I mean, the only reason for a flash to not read all data
is if the flash is paged, and you don't want to bother changing page in
the read function. On the other hand, all places in the kernel that
read data from the flash except the char device read op (ie. the block
device, mtd, flash filesystems etc) needs to loop on each read. This
looks like a lot of overhead code that is easy to forget (the current ftl
code doesn't seem to do any looping on reads).
> > What concurrency assumptions does mtd make?
>
> Absolutely none.
Good.
> > What is "int *lock" in struct mtd?
>
> A mistake, which I believe crept in with mapped.c and its associated drivers.
>
> > (and shouldn't it be a semaphore or a spinlock?)
>
> No, it should be deleted :)
Exterminate it with extreme prejudice.
> This touches on one of the reasons why I'd like to phase out the mtd_point()
> routines - because I don't like the locking issues involved. If your device
> makes use of paging, then you'd have to lock the device at point() and leave it
> locked until unpoint() - which could be a long time depending on the caller. So
> while I'd like to have the facility, because it will allow us to do XIP, I'm
> tempted to restrict it to only those devices which can do it without exclusive
> locking - i.e. devices which are completely mem-mapped and not paged.
But if the device is fully memory mapped the current point()/unpoint()
api is a bit overweight. A pointer and size would be enough.
/ Alex
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-12 17:16 ` Alexander Larsson
@ 2000-03-12 17:26 ` David Woodhouse
2000-03-13 9:06 ` Alexander Larsson
2000-03-14 17:13 ` David Woodhouse
1 sibling, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-12 17:26 UTC (permalink / raw)
To: Alexander Larsson; +Cc: mtd
alex@cendio.se said:
> I'm aware of the normal conventions for read (2), but is this good in
> the kernelspace api? I mean, the only reason for a flash to not read
> all data is if the flash is paged, and you don't want to bother
> changing page in the read function. On the other hand, all places in
> the kernel that read data from the flash except the char device read
> op (ie. the block device, mtd, flash filesystems etc) needs to loop on
> each read.
In practice, this hasn't mattered because the devices allow you to read up to
a block at a time, and the block device and flash filesystems usually want to
read <= 1 block. The only place that has to retry is the mtd chardevice, which
means the retry is in userspace.
However, that behaviour isn't guaranteed, so strictly speaking, perhaps we
ought to change it. How about simply declaring that the read() functions MUST
read up to the end of the block on which they start, and may not return an
incomplete read unless requested to do a multi-block read.
That would make it safe for all current the kernel-space users, wouldn't it?
> But if the device is fully memory mapped the current point()/
> unpoint() api is a bit overweight. A pointer and size would be enough.
Fine. Then it can go too. Just as soon as I've stopped ftl.c from using it :)
--
dwmw2
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-12 17:26 ` David Woodhouse
@ 2000-03-13 9:06 ` Alexander Larsson
2000-03-13 9:15 ` David Woodhouse
0 siblings, 1 reply; 10+ messages in thread
From: Alexander Larsson @ 2000-03-13 9:06 UTC (permalink / raw)
To: David Woodhouse; +Cc: mtd
On Sun, 12 Mar 2000, David Woodhouse wrote:
>
> alex@cendio.se said:
> > I'm aware of the normal conventions for read (2), but is this good in
> > the kernelspace api? I mean, the only reason for a flash to not read
> > all data is if the flash is paged, and you don't want to bother
> > changing page in the read function. On the other hand, all places in
> > the kernel that read data from the flash except the char device read
> > op (ie. the block device, mtd, flash filesystems etc) needs to loop on
> > each read.
>
> In practice, this hasn't mattered because the devices allow you to read up to
> a block at a time, and the block device and flash filesystems usually want to
> read <= 1 block. The only place that has to retry is the mtd chardevice, which
> means the retry is in userspace.
>
> However, that behaviour isn't guaranteed, so strictly speaking, perhaps we
> ought to change it. How about simply declaring that the read() functions MUST
> read up to the end of the block on which they start, and may not return an
> incomplete read unless requested to do a multi-block read.
I don't know. It still sounds a bit strange to me. I can understand it
being better in the case of the char device, there you can avoid
sleeping in the kernel and just return to userspace. But I'm planning
to port Axis journaling flash file system to MTD, and it writes all
data in a linear log (like a circular buffer) meaning that writes could
easily be over block boundaries. That means i have to wrap all write
and reads in a function that loops over the read/write call.
> That would make it safe for all current the kernel-space users, wouldn't it?
>
> > But if the device is fully memory mapped the current point()/
> > unpoint() api is a bit overweight. A pointer and size would be enough.
>
> Fine. Then it can go too. Just as soon as I've stopped ftl.c from using it :)
Hmm, i think i spoke to soon. It would still need some kind of locking.
How about an explicit read_lock()/read_unlock(). It could even use
rw_semaphores to allow concurrent reads from several readers.
/ Alex
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-13 9:06 ` Alexander Larsson
@ 2000-03-13 9:15 ` David Woodhouse
2000-03-13 9:42 ` Alexander Larsson
0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-13 9:15 UTC (permalink / raw)
To: Alexander Larsson; +Cc: mtd
alex@cendio.se said:
> That means i have to wrap all write and reads in a function that
> loops over the read/write call.
This has to be done somewhere. Where's best - in _every_ device driver or in
only those front ends which attempt non-block-aligned access?
> Hmm, i think i spoke to soon. It would still need some kind of
> locking. How about an explicit read_lock()/read_unlock(). It could
> even use rw_semaphores to allow concurrent reads from several readers.
Yes - that sounds like a good idea.
--
dwmw2
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-13 9:15 ` David Woodhouse
@ 2000-03-13 9:42 ` Alexander Larsson
2000-03-28 10:18 ` David Woodhouse
0 siblings, 1 reply; 10+ messages in thread
From: Alexander Larsson @ 2000-03-13 9:42 UTC (permalink / raw)
To: David Woodhouse; +Cc: mtd
On Mon, 13 Mar 2000, David Woodhouse wrote:
>
> alex@cendio.se said:
> > That means i have to wrap all write and reads in a function that
> > loops over the read/write call.
>
> This has to be done somewhere. Where's best - in _every_ device driver or in
> only those front ends which attempt non-block-aligned access?
I don't know. You seem to think that it's better to have it in the
front-ends, and I can accept that. I think the "_every_ driver" part is
a bit of an overstatement though, as I would think many drivers are
for fully memory-mapped flashes where this wouldn't be necessary (where,
in fact, looping in the frontend is unnecessary overhead).
/ Alex
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-12 17:16 ` Alexander Larsson
2000-03-12 17:26 ` David Woodhouse
@ 2000-03-14 17:13 ` David Woodhouse
1 sibling, 0 replies; 10+ messages in thread
From: David Woodhouse @ 2000-03-14 17:13 UTC (permalink / raw)
To: Alexander Larsson; +Cc: mtd
alex@cendio.se said:
>
> Exterminate it with extreme prejudice.
Committed to CVS.
--
dwmw2
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-13 9:42 ` Alexander Larsson
@ 2000-03-28 10:18 ` David Woodhouse
2000-03-28 11:30 ` Alexander Larsson
0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2000-03-28 10:18 UTC (permalink / raw)
To: Alexander Larsson; +Cc: mtd
alex@cendio.se said:
> On Mon, 13 Mar 2000, David Woodhouse wrote:
> > This has to be done somewhere. Where's best - in _every_ device driver
> > or in only those front ends which attempt non-block-aligned access?
> I don't know. You seem to think that it's better to have it in the
> front-ends, and I can accept that. I think the "_every_ driver" part
> is a bit of an overstatement though, as I would think many drivers are
> for fully memory-mapped flashes where this wouldn't be necessary
> (where, in fact, looping in the frontend is unnecessary overhead).
>
OK. How about this:
You don't have to check the return from the mtd->write() function, it must
always write the complete buffer or return failure.
_BUT_ you may never make a write request which crosses a block boundary.
So you still have to possibly issue more than one write request in the jffs
code, but at least you can know beforehand and code accordingly, rather than
having a naïve loop.
(s/write/read/ throughout also applies, obviously.)
I'm putting together some documentation.
www.linux-mtd.infradead.org/tech/API.html
/mtd_info.html
/core.html
/erase.html
This will contain some changes to what's currently implemented - most notably
the notifier stuff will no longer have the horrible race conditions that it
currently does.
--
dwmw2
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: More comments
2000-03-28 10:18 ` David Woodhouse
@ 2000-03-28 11:30 ` Alexander Larsson
0 siblings, 0 replies; 10+ messages in thread
From: Alexander Larsson @ 2000-03-28 11:30 UTC (permalink / raw)
To: David Woodhouse; +Cc: mtd
On Tue, 28 Mar 2000, David Woodhouse wrote:
> alex@cendio.se said:
> > On Mon, 13 Mar 2000, David Woodhouse wrote:
> > > This has to be done somewhere. Where's best - in _every_ device driver
> > > or in only those front ends which attempt non-block-aligned access?
>
> > I don't know. You seem to think that it's better to have it in the
> > front-ends, and I can accept that. I think the "_every_ driver" part
> > is a bit of an overstatement though, as I would think many drivers are
> > for fully memory-mapped flashes where this wouldn't be necessary
> > (where, in fact, looping in the frontend is unnecessary overhead).
> >
>
> OK. How about this:
>
> You don't have to check the return from the mtd->write() function, it must
> always write the complete buffer or return failure.
>
> _BUT_ you may never make a write request which crosses a block boundary.
>
> So you still have to possibly issue more than one write request in the jffs
> code, but at least you can know beforehand and code accordingly, rather than
> having a naïve loop.
>
> (s/write/read/ throughout also applies, obviously.)
Yes, this seems like a very good compromise. The jffs case is a bit
excpetional in this case.
> I'm putting together some documentation.
>
> www.linux-mtd.infradead.org/tech/API.html
> /mtd_info.html
> /core.html
> /erase.html
>
> This will contain some changes to what's currently implemented - most notably
> the notifier stuff will no longer have the horrible race conditions that it
> currently does.
I'll read it later tonigh. Review meetings all day...
/ Alex
To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2000-03-28 11:30 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2000-03-12 12:20 More comments Alexander Larsson
2000-03-12 13:44 ` David Woodhouse
2000-03-12 17:16 ` Alexander Larsson
2000-03-12 17:26 ` David Woodhouse
2000-03-13 9:06 ` Alexander Larsson
2000-03-13 9:15 ` David Woodhouse
2000-03-13 9:42 ` Alexander Larsson
2000-03-28 10:18 ` David Woodhouse
2000-03-28 11:30 ` Alexander Larsson
2000-03-14 17:13 ` David Woodhouse
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox