* Re: Execute in place
@ 2007-05-02 23:11 Al Boldi
2007-05-03 7:31 ` Dmitry Krivoschekov
0 siblings, 1 reply; 30+ messages in thread
From: Al Boldi @ 2007-05-02 23:11 UTC (permalink / raw)
To: linux-kernel
Hugh Dickins wrote:
> On Wed, 2 May 2007, Phillip Susi wrote:
> > Hugh Dickins wrote:
> > > tmpfs doesn't store its stuff in the page cache twice: that's true,
> > > and I didn't mean to imply otherwise. But tmpfs doesn't contain any
> > > support for rom memory: you'd have to copy from rom to tmpfs to use
> > > it.
> >
> > The question is, when you execute a binary on tmpfs, does its code
> > segment get mapped directly where it's at in the buffer cache, or does
> > it get copied to another page for the executing process? At least,
> > assuming this is possible due to the vma and file offsets of the segment
> > being aligned.
>
> Its pages are mapped directly into the executing process, without copying.
Thank GOD! Boy, was I worried there for a second.
Now, if there were only an easy way to make tmpfs persistent?
Thanks!
--
Al
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Execute in place
2007-05-02 23:11 Execute in place Al Boldi
@ 2007-05-03 7:31 ` Dmitry Krivoschekov
2007-05-03 11:33 ` Al Boldi
0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Krivoschekov @ 2007-05-03 7:31 UTC (permalink / raw)
To: Al Boldi; +Cc: linux-kernel
Al Boldi wrote:
> Hugh Dickins wrote:
>> On Wed, 2 May 2007, Phillip Susi wrote:
>>> Hugh Dickins wrote:
>>>> tmpfs doesn't store its stuff in the page cache twice: that's true,
>>>> and I didn't mean to imply otherwise. But tmpfs doesn't contain any
>>>> support for rom memory: you'd have to copy from rom to tmpfs to use
>>>> it.
>>> The question is, when you execute a binary on tmpfs, does its code
>>> segment get mapped directly where it's at in the buffer cache, or does
>>> it get copied to another page for the executing process? At least,
>>> assuming this is possible due to the vma and file offsets of the segment
>>> being aligned.
>> Its pages are mapped directly into the executing process, without copying.
>
> Thank GOD! Boy, was I worried there for a second.
>
> Now, if there were only an easy way to make tmpfs persistent?
>
>
It would not be a tmpfs (*temporary* fs) then, but something like this:
http://pramfs.sourceforge.net/
Regards,
Dmitry
* Re: Execute in place
2007-05-03 7:31 ` Dmitry Krivoschekov
@ 2007-05-03 11:33 ` Al Boldi
2007-05-03 17:38 ` Dmitry Krivoschekov
2007-05-07 16:46 ` H. Peter Anvin
0 siblings, 2 replies; 30+ messages in thread
From: Al Boldi @ 2007-05-03 11:33 UTC (permalink / raw)
To: Dmitry Krivoschekov; +Cc: linux-kernel
Dmitry Krivoschekov wrote:
> Al Boldi wrote:
> > Now, if there were only an easy way to make tmpfs persistent?
>
> It would not be a tmpfs (*temporary* fs) then,
Isn't everything really just temporary?
> but something like this
>
> http://pramfs.sourceforge.net/
Thanks a lot, but this seems to rely on a non-volatile RAM.
Would something like an mmap'd tmpfs be possible?
Thanks!
--
Al
* Re: Execute in place
2007-05-03 11:33 ` Al Boldi
@ 2007-05-03 17:38 ` Dmitry Krivoschekov
2007-05-07 16:46 ` H. Peter Anvin
1 sibling, 0 replies; 30+ messages in thread
From: Dmitry Krivoschekov @ 2007-05-03 17:38 UTC (permalink / raw)
To: Al Boldi; +Cc: linux-kernel
Al Boldi wrote:
> Dmitry Krivoschekov wrote:
>> Al Boldi wrote:
>>> Now, if there were only an easy way to make tmpfs persistent?
>> It would not be a tmpfs (*temporary* fs) then,
>
> Isn't everything really just temporary?
Would you like to talk about this?
Not with me, I'm not a psychoanalyst :)
>
>> but something like this
>>
>> http://pramfs.sourceforge.net/
>
> Thanks a lot, but this seems to rely on a non-volatile RAM.
No, it relies on normal battery-backed RAM, which of course
may be considered non-volatile.
Thanks,
Dmitry
* Re: Execute in place
2007-05-03 11:33 ` Al Boldi
2007-05-03 17:38 ` Dmitry Krivoschekov
@ 2007-05-07 16:46 ` H. Peter Anvin
2007-05-07 19:37 ` Al Boldi
1 sibling, 1 reply; 30+ messages in thread
From: H. Peter Anvin @ 2007-05-07 16:46 UTC (permalink / raw)
To: Al Boldi; +Cc: Dmitry Krivoschekov, linux-kernel
Al Boldi wrote:
>
> Isn't everything really just temporary?
>
> Would something like an mmap'd tmpfs be possible?
>
No. tmpfs relies on being able to leave data structures in the running
kernel. In particular, it has no metadata store at all.
The needs for a persistent filesystem are very different, regardless of
what the underlying medium is.
-hpa
* Re: Execute in place
2007-05-07 16:46 ` H. Peter Anvin
@ 2007-05-07 19:37 ` Al Boldi
2007-05-07 19:41 ` H. Peter Anvin
0 siblings, 1 reply; 30+ messages in thread
From: Al Boldi @ 2007-05-07 19:37 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Dmitry Krivoschekov, linux-kernel
H. Peter Anvin wrote:
> Al Boldi wrote:
> > Isn't everything really just temporary?
> >
> > Would something like an mmap'd tmpfs be possible?
>
> No. tmpfs relies on being able to leave data structures in the running
> kernel. In particular, it has no metadata store at all.
>
> The needs for a persistent filesystem are very different, regardless of
> what the underlying medium is.
Think Suspend-To-Disk; in that case tmpfs looks pretty persistent to me.
So what does STD do? It pushes memory out to swap.
Now, tmpfs could probably do the same thing on its own either to a private
swap or an mmap. I am suggesting mmap, as swap is currently really slow
with tmpfs, and switching it to use mmap may buy us 2 for the price of 1.
Thanks!
--
Al
* Re: Execute in place
2007-05-07 19:37 ` Al Boldi
@ 2007-05-07 19:41 ` H. Peter Anvin
2007-05-07 20:56 ` Al Boldi
0 siblings, 1 reply; 30+ messages in thread
From: H. Peter Anvin @ 2007-05-07 19:41 UTC (permalink / raw)
To: Al Boldi; +Cc: Dmitry Krivoschekov, linux-kernel
Al Boldi wrote:
> H. Peter Anvin wrote:
>> Al Boldi wrote:
>>> Isn't everything really just temporary?
>>>
>>> Would something like an mmap'd tmpfs be possible?
>> No. tmpfs relies on being able to leave data structures in the running
>> kernel. In particular, it has no metadata store at all.
>>
>> The needs for a persistent filesystem are very different, regardless of
>> what the underlying medium is.
>
> Think Suspend-To-Disk; in that case tmpfs looks pretty persistent to me.
>
> So what does STD do? It pushes memory out to swap.
It pushes ALL of (used) memory out to swap.
> Now, tmpfs could probably do the same thing on its own either to a private
> swap or an mmap. I am suggesting mmap, as swap is currently really slow
> with tmpfs, and switching it to use mmap may buy us 2 for the price of 1.
No.
First of all, it would have to map ALL of kernel memory, or you would
have to change the way things like dentries, inodes, and namespaces are
allocated in the kernel itself.
Second, you still would have no stability across kernel versions.
Third, you would have to accept total data loss on unclean shutdown,
because tmpfs doesn't care about coherency, *NOR SHOULD IT*. This is
the fundamental reason why allocating a large swap partition and making
/tmp a tmpfs can give a *huge* performance boost, even though it still
hits disk.
What you're talking about is, *and should be*, a different filesystem.
You will relatively quickly find that you have to deal with the same
kind of stuff that you have to in any filesystem.
-hpa
* Re: Execute in place
2007-05-07 19:41 ` H. Peter Anvin
@ 2007-05-07 20:56 ` Al Boldi
2007-05-08 6:06 ` H. Peter Anvin
0 siblings, 1 reply; 30+ messages in thread
From: Al Boldi @ 2007-05-07 20:56 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Dmitry Krivoschekov, linux-kernel
H. Peter Anvin wrote:
> Al Boldi wrote:
> > H. Peter Anvin wrote:
> >> Al Boldi wrote:
> >>> Isn't everything really just temporary?
> >>>
> >>> Would something like an mmap'd tmpfs be possible?
> >>
> >> No. tmpfs relies on being able to leave data structures in the running
> >> kernel. In particular, it has no metadata store at all.
> >>
> >> The needs for a persistent filesystem are very different, regardless of
> >> what the underlying medium is.
> >
> > Think Suspend-To-Disk; in that case tmpfs looks pretty persistent to me.
> >
> > So what does STD do? It pushes memory out to swap.
>
> It pushes ALL of (used) memory out to swap.
Ok.
> > Now, tmpfs could probably do the same thing on its own either to a
> > private swap or an mmap. I am suggesting mmap, as swap is currently
> > really slow with tmpfs, and switching it to use mmap may buy us 2 for
> > the price of 1.
>
> No.
>
> First of all, it would have to map ALL of kernel memory, or you would
> have to change the way things like dentries, inodes, and namespaces are
> allocated in the kernel itself.
That's probably one way of doing it.
> Second, you still would have no stability across kernel versions.
No problem. You could probably convince the user to do a tmpfs backup before
an upgrade.
> Third, you would have to accept total data loss on unclean shutdown,
> because tmpfs doesn't care about coherency, *NOR SHOULD IT*.
I think that's understood. But this risk could be reduced by instantiating
the latest sync'd mmap/swap.
> This is
> the fundamental reason why allocating a large swap partition and making
> /tmp a tmpfs can give a *huge* performance boost, even though it still
> hits disk.
Don't know what you mean here. Do you mean that tmpfs performance is
dependent on free swap-space being larger than some threshold? If so, what
threshold causes a tmpfs slowdown?
> What you're talking about is, *and should be*, a different filesystem.
> You will relatively quickly find that you have to deal with the same
> kind of stuff that you have to in any filesystem.
That's exactly what I want to avoid, as this would introduce a performance
penalty.
All we need is a periodically synced tmpfs to mmap, with a minimal stream
into page-cache algo on mount.
Thanks!
--
Al
* Re: Execute in place
2007-05-07 20:56 ` Al Boldi
@ 2007-05-08 6:06 ` H. Peter Anvin
2007-05-08 11:36 ` Al Boldi
0 siblings, 1 reply; 30+ messages in thread
From: H. Peter Anvin @ 2007-05-08 6:06 UTC (permalink / raw)
To: Al Boldi; +Cc: Dmitry Krivoschekov, linux-kernel
Al Boldi wrote:
>
>> What you're talking about is, *and should be*, a different filesystem.
>> You will relatively quickly find that you have to deal with the same
>> kind of stuff that you have to in any filesystem.
>
> That's exactly what I want to avoid, as this would introduce a performance
> penalty.
>
> All we need is a periodically synced tmpfs to mmap, with a minimal stream
> into page-cache algo on mount.
>
... which would be completely useless, because you wouldn't get any sort
of coherency; you would have pointers pointing into space, etc. etc.
Sorry, it's useless.
-hpa
* Re: Execute in place
2007-05-08 6:06 ` H. Peter Anvin
@ 2007-05-08 11:36 ` Al Boldi
2007-05-08 11:37 ` H. Peter Anvin
0 siblings, 1 reply; 30+ messages in thread
From: Al Boldi @ 2007-05-08 11:36 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Dmitry Krivoschekov, linux-kernel
H. Peter Anvin wrote:
> Al Boldi wrote:
> >> What you're talking about is, *and should be*, a different filesystem.
> >> You will relatively quickly find that you have to deal with the same
> >> kind of stuff that you have to in any filesystem.
> >
> > That's exactly what I want to avoid, as this would introduce a
> > performance penalty.
> >
> > All we need is a periodically synced tmpfs to mmap, with a minimal
> > stream into page-cache algo on mount.
>
> ... which would be completely useless, because you wouldn't get any sort
> of coherency; you would have pointers pointing into space, etc. etc.
You don't really think that anybody is suggesting to store the tmpfs data
without any coherency, do you?
I am suggesting that you can easily isolate tmpfs coherency from the rest of
the page-cache, by simply streaming tmpfs data out to an mmap and plugging
it with the tmpfs fat on umount. And streaming things back in on mount.
> Sorry, it's useless.
Persisting tmpfs is useful for the mere case of saving you the need to
repopulate on mount.
Thanks!
--
Al
* Re: Execute in place
2007-05-08 11:36 ` Al Boldi
@ 2007-05-08 11:37 ` H. Peter Anvin
2007-05-08 12:02 ` Al Boldi
0 siblings, 1 reply; 30+ messages in thread
From: H. Peter Anvin @ 2007-05-08 11:37 UTC (permalink / raw)
To: Al Boldi; +Cc: Dmitry Krivoschekov, linux-kernel
Al Boldi wrote:
>
> You don't really think that anybody is suggesting to store the tmpfs data
> without any coherency, do you?
>
> I am suggesting that you can easily isolate tmpfs coherency from the rest of
> the page-cache, by simply streaming tmpfs data out to an mmap and plugging
> it with the tmpfs fat on umount. And streaming things back in on mount.
>
You still don't get it, do you?
tmpfs metadata doesn't live in the pagecache.
-hpa
* Re: Execute in place
2007-05-08 11:37 ` H. Peter Anvin
@ 2007-05-08 12:02 ` Al Boldi
0 siblings, 0 replies; 30+ messages in thread
From: Al Boldi @ 2007-05-08 12:02 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Dmitry Krivoschekov, linux-kernel
H. Peter Anvin wrote:
> Al Boldi wrote:
> > You don't really think that anybody is suggesting to store the tmpfs
> > data without any coherency, do you?
> >
> > I am suggesting that you can easily isolate tmpfs coherency from the
> > rest of the page-cache, by simply streaming tmpfs data out to an mmap
> > and plugging it with the tmpfs fat on umount. And streaming things back
> > in on mount.
>
> You still don't get it, do you?
>
> tmpfs metadata doesn't live in the pagecache.
It does not matter; you can deduce it on umount.
Thanks!
--
Al
* Execute in place
@ 2007-05-01 21:55 Phillip Susi
2007-05-02 14:04 ` Hugh Dickins
0 siblings, 1 reply; 30+ messages in thread
From: Phillip Susi @ 2007-05-01 21:55 UTC (permalink / raw)
To: Linux-kernel
I seem to remember seeing some patches go by at some point that allowed
one of the rom type embedded system filesystems to directly execute
binaries out of the original rom memory rather than copying them to ram
first, then executing from there. I was wondering if rootfs or tmpfs
support such execute in place today, or if binaries executed from there
have their code segments duplicated in ram?
* Re: Execute in place
2007-05-01 21:55 Phillip Susi
@ 2007-05-02 14:04 ` Hugh Dickins
2007-05-02 14:38 ` Björn Steinbrink
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Hugh Dickins @ 2007-05-02 14:04 UTC (permalink / raw)
To: Phillip Susi; +Cc: Linux-kernel
On Tue, 1 May 2007, Phillip Susi wrote:
> I seem to remember seeing some patches go by at some point that allowed one of
> the rom type embedded system filesystems to directly execute binaries out of
> the original rom memory rather than copying them to ram first, then executing
> from there. I was wondering if rootfs or tmpfs support such execute in place
> today, or if binaries executed from there have their code segments duplicated
> in ram?
Only ext2 supports it today: see Documentation/filesystems/xip.txt
Hugh
* Re: Execute in place
2007-05-02 14:04 ` Hugh Dickins
@ 2007-05-02 14:38 ` Björn Steinbrink
2007-05-02 15:22 ` Hugh Dickins
2007-05-03 11:38 ` Erik Mouw
2007-05-03 12:12 ` Robin Getz
2 siblings, 1 reply; 30+ messages in thread
From: Björn Steinbrink @ 2007-05-02 14:38 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Phillip Susi, Linux-kernel
On 2007.05.02 15:04:44 +0100, Hugh Dickins wrote:
> On Tue, 1 May 2007, Phillip Susi wrote:
> > I seem to remember seeing some patches go by at some point that
> > allowed one of the rom type embedded system filesystems to directly
> > execute binaries out of the original rom memory rather than copying
> > them to ram first, then executing from there. I was wondering if
> > rootfs or tmpfs support such execute in place today, or if binaries
> > executed from there have their code segments duplicated in ram?
>
> Only ext2 supports it today: see Documentation/filesystems/xip.txt
As I understand it, xip avoids the page cache copy. But tmpfs already
lives in the page cache (or swap), so avoiding that "copy" is
impossible. But I always expected tmpfs to implicitly do its own kind
of xip, i.e. that it doesn't have to store its stuff in the page cache
twice. Are you saying that this isn't true?
According to Documentation/filesystems/xip.txt the ramdisk driver does
something similar, pushing the data into the page cache and discarding
the original data.
thanks,
Björn
* Re: Execute in place
2007-05-02 14:38 ` Björn Steinbrink
@ 2007-05-02 15:22 ` Hugh Dickins
2007-05-02 19:30 ` Phillip Susi
0 siblings, 1 reply; 30+ messages in thread
From: Hugh Dickins @ 2007-05-02 15:22 UTC (permalink / raw)
To: Björn Steinbrink; +Cc: Phillip Susi, Linux-kernel
On Wed, 2 May 2007, Björn Steinbrink wrote:
> On 2007.05.02 15:04:44 +0100, Hugh Dickins wrote:
> > On Tue, 1 May 2007, Phillip Susi wrote:
> > > I seem to remember seeing some patches go by at some point that
> > > > allowed one of the rom type embedded system filesystems to directly
> > > execute binaries out of the original rom memory rather than copying
> > > them to ram first, then executing from there. I was wondering if
> > > rootfs or tmpfs support such execute in place today, or if binaries
> > > executed from there have their code segments duplicated in ram?
> >
> > Only ext2 supports it today: see Documentation/filesystems/xip.txt
>
> As I understand it, xip avoids the page cache copy. But tmpfs already
> lives in the page cache (or swap), so avoiding that "copy" is
> impossible. But I always expected tmpfs to implicitly do its own kind
> of xip, i.e. that it doesn't have to store its stuff in the page cache
> twice. Are you saying that this isn't true?
tmpfs doesn't store its stuff in the page cache twice: that's true,
and I didn't mean to imply otherwise. But tmpfs doesn't contain any
support for rom memory: you'd have to copy from rom to tmpfs to use it.
Hugh
* Re: Execute in place
2007-05-02 15:22 ` Hugh Dickins
@ 2007-05-02 19:30 ` Phillip Susi
2007-05-02 20:34 ` Hugh Dickins
0 siblings, 1 reply; 30+ messages in thread
From: Phillip Susi @ 2007-05-02 19:30 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Björn Steinbrink, Linux-kernel
Hugh Dickins wrote:
> tmpfs doesn't store its stuff in the page cache twice: that's true,
> and I didn't mean to imply otherwise. But tmpfs doesn't contain any
> support for rom memory: you'd have to copy from rom to tmpfs to use it.
>
> Hugh
The question is, when you execute a binary on tmpfs, does its code
segment get mapped directly where it's at in the buffer cache, or does
it get copied to another page for the executing process? At least,
assuming this is possible due to the vma and file offsets of the segment
being aligned.
* Re: Execute in place
2007-05-02 19:30 ` Phillip Susi
@ 2007-05-02 20:34 ` Hugh Dickins
0 siblings, 0 replies; 30+ messages in thread
From: Hugh Dickins @ 2007-05-02 20:34 UTC (permalink / raw)
To: Phillip Susi; +Cc: Björn Steinbrink, Linux-kernel
On Wed, 2 May 2007, Phillip Susi wrote:
> Hugh Dickins wrote:
> > tmpfs doesn't store its stuff in the page cache twice: that's true,
> > and I didn't mean to imply otherwise. But tmpfs doesn't contain any
> > support for rom memory: you'd have to copy from rom to tmpfs to use it.
>
> The question is, when you execute a binary on tmpfs, does its code segment get
> mapped directly where it's at in the buffer cache, or does it get copied to
> another page for the executing process? At least, assuming this is possible
> due to the vma and file offsets of the segment being aligned.
Its pages are mapped directly into the executing process, without copying.
Hugh
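
The "mapped directly, without copying" behaviour can be observed from
userspace, at least indirectly. The sketch below is only an illustration
and not taken from any message in this thread: the function name
mappings_share_pages() and its dir parameter are made up, and it uses two
MAP_SHARED mappings, whereas program text is normally mapped read-only and
private. It maps the same file twice and checks that a store through one
mapping is visible through the other, which is what you would expect if
both mappings point at the same page-cache page.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Creates a one-page temporary file in 'dir', maps it twice with
 * MAP_SHARED, writes through the first mapping and reads through the
 * second. Returns 1 if the data is visible through both mappings,
 * 0 if it is not, and -1 on any error.
 */
int mappings_share_pages(const char *dir)
{
    char path[4096];
    long page = sysconf(_SC_PAGESIZE);
    int result = -1;

    snprintf(path, sizeof(path), "%s/xip-demo-XXXXXX", dir);
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives on until close() */

    if (ftruncate(fd, page) == 0) {
        char *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        char *b = mmap(NULL, (size_t)page, PROT_READ,
                       MAP_SHARED, fd, 0);

        if (a != MAP_FAILED && b != MAP_FAILED) {
            strcpy(a, "mapped, not copied");
            result = (strcmp(b, "mapped, not copied") == 0);
        }
        if (a != MAP_FAILED)
            munmap(a, (size_t)page);
        if (b != MAP_FAILED)
            munmap(b, (size_t)page);
    }
    close(fd);
    return result;
}
```

Pointing dir at a tmpfs mount (often /dev/shm, though that is an
assumption about the system) exercises the case discussed here; any
writable directory should give the same answer, since shared file
mappings are coherent on ordinary filesystems as well.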
* Re: Execute in place
2007-05-02 14:04 ` Hugh Dickins
2007-05-02 14:38 ` Björn Steinbrink
@ 2007-05-03 11:38 ` Erik Mouw
2007-05-03 15:37 ` Jörn Engel
2007-05-03 12:12 ` Robin Getz
2 siblings, 1 reply; 30+ messages in thread
From: Erik Mouw @ 2007-05-03 11:38 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Phillip Susi, Linux-kernel
On Wed, May 02, 2007 at 03:04:44PM +0100, Hugh Dickins wrote:
> On Tue, 1 May 2007, Phillip Susi wrote:
> > I seem to remember seeing some patches go by at some point that allowed one of
> > the rom type embedded system filesystems to directly execute binaries out of
> > the original rom memory rather than copying them to ram first, then executing
> > from there. I was wondering if rootfs or tmpfs support such execute in place
> > today, or if binaries executed from there have their code segments duplicated
> > in ram?
>
> Only ext2 supports it today: see Documentation/filesystems/xip.txt
IIRC JFFS2 also supports XIP.
Erik
--
They're all fools. Don't worry. Darwin may be slow, but he'll
eventually get them. -- Matthew Lammers in alt.sysadmin.recovery
* Re: Execute in place
2007-05-03 11:38 ` Erik Mouw
@ 2007-05-03 15:37 ` Jörn Engel
0 siblings, 0 replies; 30+ messages in thread
From: Jörn Engel @ 2007-05-03 15:37 UTC (permalink / raw)
To: Erik Mouw; +Cc: Hugh Dickins, Phillip Susi, Linux-kernel
On Thu, 3 May 2007 13:38:22 +0200, Erik Mouw wrote:
> On Wed, May 02, 2007 at 03:04:44PM +0100, Hugh Dickins wrote:
> >
> > Only ext2 supports it today: see Documentation/filesystems/xip.txt
>
> IIRC JFFS2 also supports XIP.
Definitely not. AXFS does, if you want to consider out-of-tree patches.
Jörn
--
Fools ignore complexity. Pragmatists suffer it.
Some can avoid it. Geniuses remove it.
-- Perlis's Programming Proverb #58, SIGPLAN Notices, Sept. 1982
* Re: Execute in place
2007-05-02 14:04 ` Hugh Dickins
2007-05-02 14:38 ` Björn Steinbrink
2007-05-03 11:38 ` Erik Mouw
@ 2007-05-03 12:12 ` Robin Getz
2 siblings, 0 replies; 30+ messages in thread
From: Robin Getz @ 2007-05-03 12:12 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Phillip Susi, Linux-kernel
On Wed 2 May 2007 10:04, Hugh Dickins pondered:
> On Tue, 1 May 2007, Phillip Susi wrote:
> > I seem to remember seeing some patches go by at some point that
> > allowed one of the rom type embedded system filesystems to directly
> > execute binaries out of the original rom memory rather than copying
> > them to ram first, then executing from there. I was wondering if
> > rootfs or tmpfs support such execute in place today, or if
> > binaries executed from there have their code segments duplicated
> > in ram?
>
> Only ext2 supports it today: see Documentation/filesystems/xip.txt
>
Depends on whether it is a noMMU or an MMU platform.
Since noMMU platforms can't re-arrange non-contiguous blocks (which appear in
a read/write ext2 file system), we need to use a read-only romfs, in which
applications are guaranteed to be contiguous by design.
I don't think the noMMU case is documented in xip.txt
-Robin
* Execute in place.
@ 2003-06-04 8:34 David Woodhouse
2003-06-04 9:57 ` Charles Manning
2003-06-04 9:57 ` Jörn Engel
0 siblings, 2 replies; 30+ messages in thread
From: David Woodhouse @ 2003-06-04 8:34 UTC (permalink / raw)
To: linux-mtd
[-- Attachment #1: Type: text/plain, Size: 1720 bytes --]
I've done some work on kernel XIP. The basic principle of operation is
that we copy any code which may need to run while the flash chips are in
anything but 'read' mode into RAM, and we disable interrupts while the
chips are busy. During erases, we poll not only for erase completion but
also for pending IRQs. If an IRQ is pending, we suspend the erase
operation, re-enable IRQs and call cond_resched().
The flash driver code is fairly simple, although it wants a little bit
of cleanup and I want to be convinced that the arch-specific bits are
done right before committing it.
The arch-specific parts are implemented for XScale only so far, but
should be relatively simple to do for new architectures. You need to
ensure that parts of kernel code can be loaded into RAM instead of ROM,
when marked with the '__xipram' attribute, that the udelay() function is
in that section, and also your software TLB handlers if you have them --
or indeed anything else which may be needed in the critical sections of
the flash chip driver. I've reintroduced the get_unaligned() macro to
potentially-unaligned data loads in the critical region, so alignment
fixups do _not_ need to be in RAM.
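
As a rough illustration of the two mechanisms in the paragraph above, the
following userspace sketch places a function in a separately named text
section and performs a potentially unaligned load without relying on an
alignment-fixup handler. It assumes GCC or Clang targeting ELF. The macro
XIPRAM_DEMO stands in for the '__xipram' marker, and load_unaligned_u32()
and read_status_word() are hypothetical names; the real get_unaligned()
macro and the real section handling live in the kernel and may look quite
different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for '__xipram': ask the compiler to emit the
 * function into a section named ".text.xipram", the name the patch uses
 * in delay.S. In an XIP kernel the linker script and boot code would
 * arrange for that section to be copied to RAM; here the section only
 * gets a distinct name.
 */
#define XIPRAM_DEMO __attribute__((__section__(".text.xipram"), __noinline__))

/*
 * Load a 32-bit value from an address that may not be 4-byte aligned.
 * Copying byte-wise via memcpy() avoids an unaligned word access, so no
 * alignment trap (and no fixup handler) is involved.
 */
static inline uint32_t load_unaligned_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Example of a routine that would have to run while flash is busy. */
XIPRAM_DEMO uint32_t read_status_word(const unsigned char *buf, size_t offset)
{
    return load_unaligned_u32(buf + offset);
}
```

Running "objdump -h" on the resulting object file should list a
.text.xipram section next to .text, which is the hook a linker script
needs in order to place that code elsewhere.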
You also need to provide suitable xip_cli(), xip_sti(), and
xip_irqpending() functions or macros in include/linux/mtd/xip.h. I
suspect that wants moving to include/asm-$(ARCH)/xip.h or similar.
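
To make the control flow described earlier in this message concrete, here
is a small simulation of the erase loop: interrupts are masked while the
chip is busy, the loop polls both for completion and for a pending IRQ,
and on a pending IRQ it suspends the erase, lets the interrupt run, and
then resumes. Only the names xip_cli(), xip_sti() and xip_irqpending()
come from the message; their signatures here, the struct sim_flash state
and xip_erase_block() are assumptions made up for the sketch, and no real
flash commands are issued.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated chip and CPU state (hypothetical, for illustration only). */
struct sim_flash {
    int erase_ticks_left;  /* polls remaining until the erase completes */
    int irq_every;         /* an IRQ becomes pending every N polls; 0 = never */
    int polls;             /* number of status polls performed so far */
    int suspends;          /* number of times the erase was suspended */
    bool irqs_enabled;     /* simulated CPU interrupt mask */
};

/* Stand-ins for the arch hooks named in the message. */
static void xip_cli(struct sim_flash *f) { f->irqs_enabled = false; }
static void xip_sti(struct sim_flash *f) { f->irqs_enabled = true; }
static bool xip_irqpending(const struct sim_flash *f)
{
    return f->irq_every > 0 && f->polls % f->irq_every == 0;
}

/*
 * Erase one block. Returns how often the erase had to be suspended.
 * In a real XIP kernel this function and everything it calls would have
 * to execute from RAM, because the flash is not readable while erasing.
 */
int xip_erase_block(struct sim_flash *f)
{
    xip_cli(f);                        /* chip is about to leave read mode */

    while (f->erase_ticks_left > 0) {
        f->polls++;
        f->erase_ticks_left--;         /* simulated erase progress */

        if (f->erase_ticks_left > 0 && xip_irqpending(f)) {
            /* Suspend the erase: the chip is readable again. */
            f->suspends++;
            xip_sti(f);
            /* The real driver would call cond_resched() at this point. */
            xip_cli(f);
            /* Resume the erase and keep polling. */
        }
    }

    xip_sti(f);                        /* erase done, chip back in read mode */
    return f->suspends;
}
```

With erase_ticks_left = 10 and irq_every = 3 the loop polls ten times and
suspends at polls 3, 6 and 9; with irq_every = 0 it never suspends. The
point of the structure is that interrupt latency is bounded by one poll
interval rather than by the full erase time.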
It's implemented for Intel command set chips only so far; adding similar
support to the other command sets is left as an exercise for the reader.
Two patches attached -- the arch-specific patch to 2.4.19-rmk7-pxa1 to
provide the generic .xipram support, and the patch against MTD CVS to
add the chip support.
Comments welcome.
--
dwmw2
[-- Attachment #2: 02-xscale-xipram.patch --]
[-- Type: text/x-patch, Size: 10371 bytes --]
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/Makefile linux-stage3-pxaxip/arch/arm/Makefile
--- linux-stage2-cramfs/arch/arm/Makefile Thu May 15 16:58:44 2003
+++ linux-stage3-pxaxip/arch/arm/Makefile Fri May 16 22:50:01 2003
@@ -19,7 +19,7 @@ endif
#CFLAGS :=$(CFLAGS:-O2=-Os)
ifeq ($(CONFIG_DEBUG_INFO),y)
-CFLAGS +=-g
+CFLAGS +=-g -mapcs-frame
endif
# Select CPU dependent flags. Note that order of declaration is important;
@@ -278,7 +278,8 @@ vmlinux: arch/arm/vmlinux.lds
arch/arm/vmlinux.lds: arch/arm/Makefile $(LDSCRIPT) \
$(wildcard include/config/cpu/32.h) \
$(wildcard include/config/cpu/26.h) \
- $(wildcard include/config/arch/*.h)
+ $(wildcard include/config/arch/*.h) \
+ $(wildcard include/config/xip/kernel.h)
@echo ' Generating $@'
@sed 's/TEXTADDR/$(TEXTADDR)/;s/DATAADDR/$(DATAADDR)/' $(LDSCRIPT) >$@
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/boot/Makefile linux-stage3-pxaxip/arch/arm/boot/Makefile
--- linux-stage2-cramfs/arch/arm/boot/Makefile Thu May 15 16:58:44 2003
+++ linux-stage3-pxaxip/arch/arm/boot/Makefile Fri May 16 22:50:01 2003
@@ -146,10 +146,11 @@ zImage: compressed/vmlinux
ifeq ($(CONFIG_XIP_KERNEL),y)
xipImage: $(CONFIGURE) $(SYSTEM)
- $(OBJCOPY) -S -O binary -R .data $(SYSTEM) vmlinux-text.bin
- $(OBJCOPY) -S -O binary -R .init -R .text -R __ex_table -R __ksymtab $(SYSTEM) vmlinux-data.bin
- cat vmlinux-text.bin vmlinux-data.bin > $@
- $(RM) -f vmlinux-text.bin vmlinux-data.bin
+ $(OBJCOPY) -S -O binary -R .data -R .xipram $(SYSTEM) vmlinux-text.bin
+ $(OBJCOPY) -S -O binary -j .xipram $(SYSTEM) vmlinux-xipram.bin
+ $(OBJCOPY) -S -O binary -j .data $(SYSTEM) vmlinux-data.bin
+ cat vmlinux-text.bin vmlinux-xipram.bin vmlinux-data.bin > $@
+ $(RM) -f vmlinux-text.bin vmlinux-xipram.bin vmlinux-data.bin
endif
bootpImage: bootp/bootp
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/kernel/head-armv.S linux-stage3-pxaxip/arch/arm/kernel/head-armv.S
--- linux-stage2-cramfs/arch/arm/kernel/head-armv.S Thu May 15 16:58:44 2003
+++ linux-stage3-pxaxip/arch/arm/kernel/head-armv.S Fri May 16 22:50:01 2003
@@ -227,7 +227,7 @@ __ret: ldr lr, __switch_data
.align 5
__mmap_switched:
#ifdef CONFIG_XIP_KERNEL
- ldr r3, ETEXT @ data section copy
+ ldr r3, EXIP @ data section copy
ldr r4, SDATA
ldr r5, EDATA
1:
@@ -337,15 +337,29 @@ __create_page_tables:
mov r0, #TEXTADDR & 0xff000000
add r0, r0, #TEXTADDR & 0x00f00000 @ virt kernel start
- add r0, r4, r0, lsr #18
- add r2, r3, #4 << 20 @ kernel + 4MB
+ ldr r2, SXIP
+ sub r2, r2, #1
+ mov r2, r2, lsr #20
+ mov r2, r2, lsl #20
+ sub r2, r2, r0 @ length of XIP text
+
+ add r2, r2, r3
+ add r0, r4, r0, lsr #18
1:
str r3, [r0], #4
add r3, r3, #1 << 20
cmp r3, r2
bne 1b
+ add r2, r2, #0x300000 @ xipram and .data
+ sub r3, r3, #1 << 20 @ one overlapping pgdir
+1:
+ str r3, [r0], #4
+ add r3, r3, #1 << 20
+ cmp r3, r2
+ bne 1b
+
bic r3, r4, #0x000ff000 @ ram start
add r0, r4, r3, lsr #18
orr r3, r3, r8
@@ -530,8 +544,8 @@ __lookup_architecture_type:
PGTBL: .long SYMBOL_NAME(swapper_pg_dir)
-ETEXT: .long SYMBOL_NAME(_endtext)
SDATA: .long SYMBOL_NAME(_sdata)
EDATA: .long SYMBOL_NAME(__bss_start)
-
+SXIP: .long SYMBOL_NAME(_xipram_start)
+EXIP: .long SYMBOL_NAME(_xipram_end)
#endif
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/kernel/setup.c linux-stage3-pxaxip/arch/arm/kernel/setup.c
--- linux-stage2-cramfs/arch/arm/kernel/setup.c Thu May 15 16:58:44 2003
+++ linux-stage3-pxaxip/arch/arm/kernel/setup.c Fri May 16 22:50:01 2003
@@ -56,7 +56,8 @@ extern void reboot_setup(char *str);
extern int root_mountflags;
extern int _stext, _text, _etext, _edata, _end;
#ifdef CONFIG_XIP_KERNEL
-extern int _endtext, _sdata;
+extern int _sdata;
+extern char _xipram_start, _xipram_end, _romend;
#endif
@@ -371,6 +372,24 @@ void __init setup_initrd(unsigned int st
#endif
}
+#ifdef CONFIG_XIP_KERNEL
+extern char _xipram_start, _xipram_end;
+char *xipram_copy;
+
+static void __init setup_xipram(void)
+{
+ unsigned long xipram_len = &_xipram_end-&_xipram_start;
+ xipram_copy = alloc_bootmem_pages(xipram_len);
+
+ memcpy(xipram_copy, &_romend, &_xipram_end-&_xipram_start);
+
+ init_mm.start_code = (unsigned long) xipram_copy;
+ init_mm.end_code = (unsigned long) xipram_copy + xipram_len;
+ init_mm.start_data = (unsigned long) &_sdata;
+
+}
+#endif
+
static void __init
request_standard_resources(struct meminfo *mi, struct machine_desc *mdesc)
{
@@ -379,11 +398,7 @@ request_standard_resources(struct meminf
kernel_code.start = __virt_to_phys(init_mm.start_code);
kernel_code.end = __virt_to_phys(init_mm.end_code - 1);
-#ifndef CONFIG_XIP_KERNEL
- kernel_data.start = __virt_to_phys(init_mm.end_code);
-#else
kernel_data.start = __virt_to_phys(init_mm.start_data);
-#endif
kernel_data.end = __virt_to_phys(init_mm.brk - 1);
for (i = 0; i < mi->nr_banks; i++) {
@@ -641,13 +656,11 @@ void __init setup_arch(char **cmdline_p)
meminfo.bank[0].size = MEM_SIZE;
}
+ /* These three get overwritten by setup_xipram() if appropriate */
init_mm.start_code = (unsigned long) &_text;
-#ifndef CONFIG_XIP_KERNEL
init_mm.end_code = (unsigned long) &_etext;
-#else
- init_mm.end_code = (unsigned long) &_endtext;
- init_mm.start_data = (unsigned long) &_sdata;
-#endif
+ init_mm.start_data = (unsigned long) &_etext;
+
init_mm.end_data = (unsigned long) &_edata;
init_mm.brk = (unsigned long) &_end;
@@ -655,6 +668,9 @@ void __init setup_arch(char **cmdline_p)
saved_command_line[COMMAND_LINE_SIZE-1] = '\0';
parse_cmdline(&meminfo, cmdline_p, from);
bootmem_init(&meminfo);
+#ifdef CONFIG_XIP_KERNEL
+ setup_xipram();
+#endif
paging_init(&meminfo, mdesc);
request_standard_resources(&meminfo, mdesc);
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/lib/delay.S linux-stage3-pxaxip/arch/arm/lib/delay.S
--- linux-stage2-cramfs/arch/arm/lib/delay.S Mon May 5 21:39:59 2003
+++ linux-stage3-pxaxip/arch/arm/lib/delay.S Fri May 16 22:50:01 2003
@@ -9,7 +9,13 @@
*/
#include <linux/linkage.h>
#include <asm/assembler.h>
+#include <linux/config.h>
+
+#ifndef CONFIG_XIP_KERNEL
.text
+#else
+ .section ".text.xipram","ax",%progbits
+#endif
LC0: .word SYMBOL_NAME(loops_per_jiffy)
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/mm/init.c linux-stage3-pxaxip/arch/arm/mm/init.c
--- linux-stage2-cramfs/arch/arm/mm/init.c Thu May 15 16:58:44 2003
+++ linux-stage3-pxaxip/arch/arm/mm/init.c Fri May 16 22:50:01 2003
@@ -52,7 +52,7 @@ static unsigned long totalram_pages;
extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
extern char _stext, _text, _etext, _end, __init_begin, __init_end;
#ifdef CONFIG_XIP_KERNEL
-extern char _endtext, _sdata;
+extern char _endtext, _sdata, _xipram_start, _xipram_end;
#endif
extern unsigned long phys_initrd_start;
extern unsigned long phys_initrd_size;
@@ -612,11 +612,12 @@ void __init mem_init(void)
#ifndef CONFIG_XIP_KERNEL
codepages = &_etext - &_text;
datapages = &_end - &_etext;
+ initpages = &__init_end - &__init_begin;
#else
- codepages = &_endtext - &_text;
+ codepages = &_xipram_end - &_xipram_start;
datapages = &_end - &_sdata;
+ initpages = &_xipram_start - 0x100000 - &_stext;
#endif
- initpages = &__init_end - &__init_begin;
high_memory = (void *)__va(meminfo.end);
max_mapnr = virt_to_page(high_memory) - mem_map;
@@ -653,11 +654,17 @@ void __init mem_init(void)
}
printk(" = %luMB total\n", num_physpages >> (20 - PAGE_SHIFT));
+#ifndef CONFIG_XIP_KERNEL
printk(KERN_NOTICE "Memory: %luKB available (%dK code, "
"%dK data, %dK init)\n",
(unsigned long) nr_free_pages() << (PAGE_SHIFT-10),
codepages >> 10, datapages >> 10, initpages >> 10);
-
+#else
+ printk(KERN_NOTICE "Memory: %luKB available (%dK code, "
+ "%dK data, %dK ROM)\n",
+ (unsigned long) nr_free_pages() << (PAGE_SHIFT-10),
+ codepages >> 10, datapages >> 10, initpages >> 10);
+#endif
if (PAGE_SIZE >= 16384 && num_physpages <= 128) {
extern int sysctl_overcommit_memory;
/*
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/mm/mm-armv.c linux-stage3-pxaxip/arch/arm/mm/mm-armv.c
--- linux-stage2-cramfs/arch/arm/mm/mm-armv.c Thu May 15 16:58:44 2003
+++ linux-stage3-pxaxip/arch/arm/mm/mm-armv.c Fri May 16 22:50:01 2003
@@ -62,8 +62,9 @@ __setup("nowb", nowrite_setup);
#define clean_cache_area(start,size) \
cpu_cache_clean_invalidate_range((unsigned long)start, ((unsigned long)start) + size, 0);
-
-
+#ifdef CONFIG_XIP_KERNEL
+extern char _xipram_start, _xipram_end;
+#endif
/*
* need to get a 16k page for level 1
*/
@@ -358,7 +359,7 @@ void __init memtable_init(struct meminfo
#ifdef CONFIG_XIP_KERNEL
p->physical = KERNEL_XIP_BASE_PHYS;
p->virtual = KERNEL_XIP_BASE_VIRT;
- p->length = PGDIR_SIZE * 8;
+ p->length = (((unsigned long)(&_xipram_end)) & ~(PGDIR_SIZE-1)) - KERNEL_XIP_BASE_VIRT;
p->domain = DOMAIN_KERNEL;
p->prot_read = 0; /* r=0, b=0 --> read-only for kernel mode */
p->prot_write = 0;
@@ -366,7 +367,22 @@ void __init memtable_init(struct meminfo
p->bufferable = 1;
p ++;
-#endif
+
+ if (&_xipram_start != (&_xipram_end)) {
+ extern char *xipram_copy;
+
+ p->physical = virt_to_phys(xipram_copy);
+ p->virtual = &_xipram_start;
+ p->length = ((&_xipram_end-&_xipram_start) +PAGE_SIZE-1) & PAGE_MASK;
+ p->domain = DOMAIN_KERNEL;
+ p->prot_read = 0; /* r=0, b=0 --> read-only for kernel mode */
+ p->prot_write = 0;
+ p->cacheable = 1;
+ p->bufferable = 1;
+
+ p ++;
+ }
+#endif /* CONFIG_XIP_KERNEL */
/*
* Go through the initial mappings, but clear out any
diff -uNrp --exclude CVS linux-stage2-cramfs/arch/arm/vmlinux-armv-xip.lds.in linux-stage3-pxaxip/arch/arm/vmlinux-armv-xip.lds.in
--- linux-stage2-cramfs/arch/arm/vmlinux-armv-xip.lds.in Thu May 15 16:58:44 2003
+++ linux-stage3-pxaxip/arch/arm/vmlinux-armv-xip.lds.in Fri May 16 22:50:01 2003
@@ -74,9 +74,23 @@ SECTIONS
__start___ksymtab = .;
*(__ksymtab)
__stop___ksymtab = .;
+
+ . = ALIGN(4096);
}
- _endtext = .;
+ _romend = .;
+
+ /* Add 1MiB so that the .xipram functions can be accessed
+ through a section mapping to start with. */
+ . = . + 0x100000;
+
+ .xipram : {
+ _xipram_start = .;
+
+ *(.text.xipram)
+
+ _xipram_end = .;
+ }
. = DATAADDR;
[-- Attachment #3: 06a-mtdcvs-xip.patch --]
[-- Type: text/x-patch, Size: 22315 bytes --]
Index: drivers/mtd/chips/cfi_cmdset_0001.c
===================================================================
RCS file: /home/cvs/mtd/drivers/mtd/chips/cfi_cmdset_0001.c,v
retrieving revision 1.123
diff -u -p -r1.123 cfi_cmdset_0001.c
--- drivers/mtd/chips/cfi_cmdset_0001.c 28 May 2003 12:51:48 -0000 1.123
+++ drivers/mtd/chips/cfi_cmdset_0001.c 4 Jun 2003 08:11:47 -0000
@@ -24,6 +24,7 @@
#include <linux/init.h>
#include <asm/io.h>
#include <asm/byteorder.h>
+#include <asm/unaligned.h>
#include <linux/errno.h>
#include <linux/slab.h>
@@ -33,6 +34,7 @@
#include <linux/mtd/mtd.h>
#include <linux/mtd/compatmac.h>
#include <linux/mtd/cfi.h>
+#include <linux/mtd/xip.h>
// debugging, turns off buffer write mode if set to 1
#define FORCE_WORD_WRITE 0
@@ -124,7 +126,7 @@ static void cfi_tell_features(struct cfi
* this module is non-zero, i.e. between inter_module_get and
* inter_module_put. Keith Owens <kaos@ocs.com.au> 29 Oct 2000.
*/
-struct mtd_info *cfi_cmdset_0001(struct map_info *map, int primary)
+struct mtd_info * __xipram cfi_cmdset_0001(struct map_info *map, int primary)
{
struct cfi_private *cfi = map->fldrv_priv;
int i;
@@ -144,21 +146,25 @@ struct mtd_info *cfi_cmdset_0001(struct
if (!adr)
return NULL;
- /* Switch it into Query Mode */
- cfi_send_gen_cmd(0x98, 0x55, base, map, cfi, cfi->device_type, NULL);
-
extp = kmalloc(sizeof(*extp), GFP_KERNEL);
if (!extp) {
printk(KERN_ERR "Failed to allocate memory\n");
return NULL;
}
+
+ /* Switch it into Query Mode */
+ xip_cli();
+ cfi_send_gen_cmd(0x98, 0x55, base, map, cfi, cfi->device_type, NULL);
/* Read in the Extended Query Table */
for (i=0; i<sizeof(*extp); i++) {
((unsigned char *)extp)[i] =
cfi_read_query(map, (base+((adr+i)*ofs_factor)));
}
-
+
+ cfi_send_gen_cmd(0xff, 0x55, base, map, cfi, cfi->device_type, NULL);
+ xip_sti();
+
if (extp->MajorVersion != '1' ||
(extp->MinorVersion < '0' || extp->MinorVersion > '3')) {
printk(KERN_WARNING " Unknown IntelExt Extended Query "
@@ -207,7 +213,7 @@ struct mtd_info *cfi_cmdset_0001(struct
return cfi_intelext_setup(map);
}
-static struct mtd_info *cfi_intelext_setup(struct map_info *map)
+static struct mtd_info * cfi_intelext_setup(struct map_info *map)
{
struct cfi_private *cfi = map->fldrv_priv;
struct mtd_info *mtd;
@@ -308,6 +314,36 @@ static struct mtd_info *cfi_intelext_set
/*
* *********** CHIP ACCESS FUNCTIONS ***********
*/
+#ifdef CONFIG_XIP_KERNEL
+static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr, int mode)
+{
+ while (chip->state != FL_READY) {
+ DECLARE_WAITQUEUE(wait, current);
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ add_wait_queue(&chip->wq, &wait);
+ spin_unlock(chip->mutex);
+ schedule();
+ remove_wait_queue(&chip->wq, &wait);
+ spin_lock(chip->mutex);
+ }
+ /* We _know_ the chip is in FL_READY mode, or we wouldn't be here */
+ return 0;
+}
+
+static void __xipram put_chip(struct map_info *map, struct flchip *chip, unsigned long adr)
+{
+ struct cfi_private *cfi = map->fldrv_priv;
+
+ /* We also know there wasn't a command suspended without its
+ controlling process knowing about it. */
+ if (chip->state != FL_READY && chip->state != FL_POINT) {
+ cfi_write(map, CMD(0xff), adr);
+ chip->state = FL_READY;
+ }
+ wake_up(&chip->wq);
+}
+
+#else /* !CONFIG_XIP_KERNEL */
static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr, int mode)
{
@@ -406,6 +442,7 @@ static int get_chip(struct map_info *map
spin_lock(chip->mutex);
goto resettime;
}
+ /* Not reached */
}
static void put_chip(struct map_info *map, struct flchip *chip, unsigned long adr)
@@ -439,6 +476,7 @@ static void put_chip(struct map_info *ma
}
wake_up(&chip->wq);
}
+#endif /* !CONFIG_XIP_KERNEL */
static int do_point_onechip (struct map_info *map, struct flchip *chip, loff_t adr, size_t len)
{
@@ -627,7 +665,7 @@ static int cfi_intelext_read (struct mtd
return ret;
}
-static int cfi_intelext_read_prot_reg (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf, int base_offst, int reg_sz)
+static int __xipram cfi_intelext_read_prot_reg (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf, int base_offst, int reg_sz)
{
struct map_info *map = mtd->priv;
struct cfi_private *cfi = map->fldrv_priv;
@@ -656,6 +694,7 @@ static int cfi_intelext_read_prot_reg (s
return (len-count)?:ret;
}
+ xip_cli();
if (chip->state != FL_JEDEC_QUERY) {
cfi_write(map, CMD(0x90), chip->start);
chip->state = FL_JEDEC_QUERY;
@@ -669,6 +708,7 @@ static int cfi_intelext_read_prot_reg (s
}
put_chip(map, chip, chip->start);
+ xip_sti();
spin_unlock(chip->mutex);
/* Move on to the next chip */
@@ -719,7 +759,7 @@ static int cfi_intelext_read_fact_prot_r
}
-static int do_write_oneword(struct map_info *map, struct flchip *chip, unsigned long adr, cfi_word datum)
+static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip, unsigned long adr, cfi_word datum)
{
struct cfi_private *cfi = map->fldrv_priv;
cfi_word status, status_OK;
@@ -739,6 +779,7 @@ static int do_write_oneword(struct map_i
}
ENABLE_VPP(map);
+ xip_cli();
cfi_write(map, CMD(0x40), adr);
cfi_write(map, datum, adr);
chip->state = FL_WRITING;
@@ -750,6 +791,7 @@ static int do_write_oneword(struct map_i
timeo = jiffies + (HZ/2);
z = 0;
for (;;) {
+#ifndef CONFIG_XIP_KERNEL
if (chip->state != FL_WRITING) {
/* Someone's suspended the write. Sleep */
DECLARE_WAITQUEUE(wait, current);
@@ -763,11 +805,12 @@ static int do_write_oneword(struct map_i
spin_lock(chip->mutex);
continue;
}
-
+#endif
status = cfi_read(map, adr);
if ((status & status_OK) == status_OK)
break;
+#ifndef CONFIG_XIP_KERNEL
/* OK Still waiting */
if (time_after(jiffies, timeo)) {
chip->state = FL_STATUS;
@@ -775,7 +818,7 @@ static int do_write_oneword(struct map_i
ret = -EIO;
goto out;
}
-
+#endif
/* Latency issues. Drop the lock, wait a while and retry */
spin_unlock(chip->mutex);
z++;
@@ -802,6 +845,7 @@ static int do_write_oneword(struct map_i
}
out:
put_chip(map, chip, adr);
+ xip_sti();
spin_unlock(chip->mutex);
return ret;
@@ -930,7 +974,7 @@ static int cfi_intelext_write_words (str
}
-static inline int do_write_buffer(struct map_info *map, struct flchip *chip,
+static int __xipram do_write_buffer(struct map_info *map, struct flchip *chip,
unsigned long adr, const u_char *buf, int len)
{
struct cfi_private *cfi = map->fldrv_priv;
@@ -952,6 +996,8 @@ static inline int do_write_buffer(struct
return ret;
}
+ ENABLE_VPP(map);
+ xip_cli();
if (chip->state != FL_STATUS)
cfi_write(map, CMD(0x70), cmd_adr);
@@ -962,11 +1008,14 @@ static inline int do_write_buffer(struct
So we must check here and reset those bits if they're set. Otherwise
we're just pissing in the wind */
if (status & CMD(0x30)) {
+#ifdef CONFIG_XIP_KERNEL
+ cfi_write(map, CMD(0xFF), cmd_adr);
+#endif
printk(KERN_WARNING "SR.4 or SR.5 bits set in buffer write (status %x). Clearing.\n", status);
cfi_write(map, CMD(0x50), cmd_adr);
cfi_write(map, CMD(0x70), cmd_adr);
}
- ENABLE_VPP(map);
+
chip->state = FL_WRITING_TO_BUFFER;
z = 0;
@@ -982,10 +1031,13 @@ static inline int do_write_buffer(struct
spin_lock(chip->mutex);
if (++z > 20) {
+ cfi_word status2;
/* Argh. Not ready for write to buffer */
cfi_write(map, CMD(0x70), cmd_adr);
- chip->state = FL_STATUS;
- printk(KERN_ERR "Chip not ready for buffer write. Xstatus = %llx, status = %llx\n", (__u64)status, (__u64)cfi_read(map, cmd_adr));
+ status2 = cfi_read(map, cmd_adr);
+ cfi_write(map, CMD(0xFF), cmd_adr);
+ chip->state = FL_READY;
+ printk(KERN_ERR "Chip not ready for buffer write. Xstatus = %llx, status = %llx\n", (__u64)status, (__u64)status2);
/* Odd. Clear status bits */
cfi_write(map, CMD(0x50), cmd_adr);
cfi_write(map, CMD(0x70), cmd_adr);
@@ -1002,11 +1054,11 @@ static inline int do_write_buffer(struct
if (cfi_buswidth_is_1()) {
map_write8 (map, *((__u8*)buf)++, adr+z);
} else if (cfi_buswidth_is_2()) {
- map_write16 (map, *((__u16*)buf)++, adr+z);
+ map_write16 (map, get_unaligned(((__u16*)buf)++), adr+z);
} else if (cfi_buswidth_is_4()) {
- map_write32 (map, *((__u32*)buf)++, adr+z);
+ map_write32 (map, get_unaligned(((__u32*)buf)++), adr+z);
} else if (cfi_buswidth_is_8()) {
- map_write64 (map, *((__u64*)buf)++, adr+z);
+ map_write64 (map, get_unaligned(((__u64*)buf)++), adr+z);
} else {
ret = -EINVAL;
goto out;
@@ -1023,6 +1075,7 @@ static inline int do_write_buffer(struct
timeo = jiffies + (HZ/2);
z = 0;
for (;;) {
+#ifndef CONFIG_XIP_KERNEL
if (chip->state != FL_WRITING) {
/* Someone's suspended the write. Sleep */
DECLARE_WAITQUEUE(wait, current);
@@ -1035,11 +1088,12 @@ static inline int do_write_buffer(struct
spin_lock(chip->mutex);
continue;
}
-
+#endif
status = cfi_read(map, cmd_adr);
if ((status & status_OK) == status_OK)
break;
+#ifndef CONFIG_XIP_KERNEL
/* OK Still waiting */
if (time_after(jiffies, timeo)) {
chip->state = FL_STATUS;
@@ -1047,7 +1101,7 @@ static inline int do_write_buffer(struct
ret = -EIO;
goto out;
}
-
+#endif
/* Latency issues. Drop the lock, wait a while and retry */
spin_unlock(chip->mutex);
cfi_udelay(1);
@@ -1076,6 +1130,7 @@ static inline int do_write_buffer(struct
out:
put_chip(map, chip, cmd_adr);
+ xip_sti();
spin_unlock(chip->mutex);
return ret;
}
@@ -1248,13 +1303,12 @@ static int cfi_intelext_varsize_frob(str
}
-static int do_erase_oneblock(struct map_info *map, struct flchip *chip, unsigned long adr, void *thunk)
+static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip, unsigned long adr, void *thunk)
{
struct cfi_private *cfi = map->fldrv_priv;
cfi_word status, status_OK;
unsigned long timeo;
int retries = 3;
- DECLARE_WAITQUEUE(wait, current);
int ret = 0;
adr += chip->start;
@@ -1271,6 +1325,7 @@ static int do_erase_oneblock(struct map_
}
ENABLE_VPP(map);
+ xip_cli();
/* Clear the status register first */
cfi_write(map, CMD(0x50), adr);
@@ -1280,17 +1335,20 @@ static int do_erase_oneblock(struct map_
chip->state = FL_ERASING;
chip->erase_suspended = 0;
+#ifndef CONFIG_XIP_KERNEL
spin_unlock(chip->mutex);
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout((chip->erase_time*HZ)/(2*1000));
spin_lock(chip->mutex);
-
+#endif
/* FIXME. Use a timer to check this, and return immediately. */
/* Once the state machine's known to be working I'll do that */
timeo = jiffies + (HZ*20);
for (;;) {
+#ifndef CONFIG_XIP_KERNEL
if (chip->state != FL_ERASING) {
+ DECLARE_WAITQUEUE(wait, current);
/* Someone's suspended the erase. Sleep */
set_current_state(TASK_UNINTERRUPTIBLE);
add_wait_queue(&chip->wq, &wait);
@@ -1306,11 +1364,12 @@ static int do_erase_oneblock(struct map_
timeo = jiffies + (HZ*20); /* FIXME */
chip->erase_suspended = 0;
}
-
+#endif
status = cfi_read(map, adr);
if ((status & status_OK) == status_OK)
break;
-
+
+#ifndef CONFIG_XIP_KERNEL
/* OK Still waiting */
if (time_after(jiffies, timeo)) {
cfi_write(map, CMD(0x70), adr);
@@ -1324,12 +1383,29 @@ static int do_erase_oneblock(struct map_
spin_unlock(chip->mutex);
return -EIO;
}
-
/* Latency issues. Drop the lock, wait a while and retry */
spin_unlock(chip->mutex);
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(1);
spin_lock(chip->mutex);
+#else /* XIP */
+ if (xip_irqpending()) {
+ cfi_write(map, CMD(0xB0), adr);
+ cfi_write(map, CMD(0x70), adr);
+ for (;;) {
+ status = cfi_read(map, adr);
+ if ((status & status_OK) == status_OK)
+ break;
+ }
+ cfi_write(map, CMD(0xFF), adr);
+ xip_sti();
+// printk("IRQ was pending. Erase suspended\n");
+ cond_resched();
+ xip_cli();
+ cfi_write(map, CMD(0xd0), adr);
+ cfi_write(map, CMD(0x70), adr);
+ }
+#endif /* XIP */
}
DISABLE_VPP(map);
@@ -1340,6 +1416,9 @@ static int do_erase_oneblock(struct map_
chip->state = FL_STATUS;
status = cfi_read(map, adr);
+#ifdef CONFIG_XIP_KERNEL
+ cfi_write(map, CMD(0xFF), adr);
+#endif
/* check for lock bit */
if (status & CMD(0x3a)) {
unsigned char chipstatus = status;
@@ -1377,8 +1456,10 @@ static int do_erase_oneblock(struct map_
}
}
- wake_up(&chip->wq);
+ put_chip(map, chip, adr);
spin_unlock(chip->mutex);
+ xip_sti();
+ DISABLE_VPP(map);
return ret;
}
@@ -1445,11 +1526,17 @@ static int do_printlockstatus_oneblock(s
{
struct cfi_private *cfi = map->fldrv_priv;
int ofs_factor = cfi->interleave * cfi->device_type;
+ int d;
+ xip_cli();
cfi_send_gen_cmd(0x90, 0x55, 0, map, cfi, cfi->device_type, NULL);
- printk(KERN_DEBUG "block status register for 0x%08lx is %x\n",
- adr, cfi_read_query(map, adr+(2*ofs_factor)));
+ d = cfi_read_query(map, adr+(2*ofs_factor));
+
cfi_send_gen_cmd(0xff, 0x55, 0, map, cfi, cfi->device_type, NULL);
+ xip_sti();
+
+ printk(KERN_DEBUG "block status register for 0x%08lx is %x\n",
+ adr, d);
return 0;
}
@@ -1478,6 +1565,7 @@ static int do_xxlock_oneblock(struct map
}
ENABLE_VPP(map);
+ xip_cli();
cfi_write(map, CMD(0x60), adr);
if (thunk == DO_XXLOCK_ONEBLOCK_LOCK) {
@@ -1489,10 +1577,11 @@ static int do_xxlock_oneblock(struct map
} else
BUG();
+#ifndef CONFIG_XIP_KERNEL
spin_unlock(chip->mutex);
schedule_timeout(HZ);
spin_lock(chip->mutex);
-
+#endif
/* FIXME. Use a timer to check this, and return immediately. */
/* Once the state machine's known to be working I'll do that */
@@ -1502,7 +1591,8 @@ static int do_xxlock_oneblock(struct map
status = cfi_read(map, adr);
if ((status & status_OK) == status_OK)
break;
-
+
+#ifndef CONFIG_XIP_KERNEL
/* OK Still waiting */
if (time_after(jiffies, timeo)) {
cfi_write(map, CMD(0x70), adr);
@@ -1512,7 +1602,7 @@ static int do_xxlock_oneblock(struct map
spin_unlock(chip->mutex);
return -EIO;
}
-
+#endif
/* Latency issues. Drop the lock, wait a while and retry */
spin_unlock(chip->mutex);
cfi_udelay(1);
@@ -1523,6 +1613,7 @@ static int do_xxlock_oneblock(struct map
chip->state = FL_STATUS;
put_chip(map, chip, adr);
spin_unlock(chip->mutex);
+ xip_sti();
return 0;
}
Index: drivers/mtd/chips/cfi_probe.c
===================================================================
RCS file: /home/cvs/mtd/drivers/mtd/chips/cfi_probe.c,v
retrieving revision 1.71
diff -u -p -r1.71 cfi_probe.c
--- drivers/mtd/chips/cfi_probe.c 28 May 2003 12:51:48 -0000 1.71
+++ drivers/mtd/chips/cfi_probe.c 4 Jun 2003 08:11:47 -0000
@@ -17,6 +17,7 @@
#include <linux/mtd/map.h>
#include <linux/mtd/cfi.h>
+#include <linux/mtd/xip.h>
#include <linux/mtd/gen_probe.h>
//#define DEBUG_CFI
@@ -25,9 +26,9 @@
static void print_cfi_ident(struct cfi_ident *);
#endif
-static int cfi_probe_chip(struct map_info *map, __u32 base,
+static int __xipram cfi_probe_chip(struct map_info *map, __u32 base,
struct flchip *chips, struct cfi_private *cfi);
-static int cfi_chip_setup(struct map_info *map, struct cfi_private *cfi);
+static int __xipram cfi_chip_setup(struct map_info *map, struct cfi_private *cfi);
struct mtd_info *cfi_probe(struct map_info *map);
@@ -35,7 +36,7 @@ struct mtd_info *cfi_probe(struct map_in
in: interleave,type,mode
ret: table index, <0 for error
*/
-static inline int qry_present(struct map_info *map, __u32 base,
+static inline int __xipram qry_present(struct map_info *map, __u32 base,
struct cfi_private *cfi)
{
int osf = cfi->interleave * cfi->device_type; // scale factor
@@ -48,7 +49,7 @@ static inline int qry_present(struct map
return 0; // nothing found
}
-static int cfi_probe_chip(struct map_info *map, __u32 base,
+static int __xipram cfi_probe_chip(struct map_info *map, __u32 base,
struct flchip *chips, struct cfi_private *cfi)
{
int i;
@@ -66,10 +67,13 @@ static int cfi_probe_chip(struct map_inf
return 0;
}
cfi_send_gen_cmd(0xF0, 0, base, map, cfi, cfi->device_type, NULL);
+ xip_cli();
cfi_send_gen_cmd(0x98, 0x55, base, map, cfi, cfi->device_type, NULL);
- if (!qry_present(map,base,cfi))
+ if (!qry_present(map,base,cfi)) {
+ xip_sti();
return 0;
+ }
if (!cfi->numchips) {
/* This is the first time we're called. Set up the CFI
@@ -88,6 +92,7 @@ static int cfi_probe_chip(struct map_inf
/* If the QRY marker goes away, it's an alias */
if (!qry_present(map, chips[i].start, cfi)) {
+ xip_sti();
printk(KERN_DEBUG "%s: Found an alias at 0x%x for the chip at 0x%lx\n",
map->name, base, chips[i].start);
return 0;
@@ -99,26 +104,27 @@ static int cfi_probe_chip(struct map_inf
cfi_send_gen_cmd(0xF0, 0, base, map, cfi, cfi->device_type, NULL);
if (qry_present(map, base, cfi)) {
+ xip_sti();
printk(KERN_DEBUG "%s: Found an alias at 0x%x for the chip at 0x%lx\n",
map->name, base, chips[i].start);
return 0;
}
}
- }
+ }
+ /* Put it back into Read Mode */
+ cfi_send_gen_cmd(0xF0, 0, base, map, cfi, cfi->device_type, NULL);
+ xip_sti();
+
/* OK, if we got to here, then none of the previous chips appear to
be aliases for the current one. */
if (cfi->numchips == MAX_CFI_CHIPS) {
printk(KERN_WARNING"%s: Too many flash chips detected. Increase MAX_CFI_CHIPS from %d.\n", map->name, MAX_CFI_CHIPS);
- /* Doesn't matter about resetting it to Read Mode - we're not going to talk to it anyway */
return -1;
}
chips[cfi->numchips].start = base;
chips[cfi->numchips].state = FL_READY;
cfi->numchips++;
-
- /* Put it back into Read Mode */
- cfi_send_gen_cmd(0xF0, 0, base, map, cfi, cfi->device_type, NULL);
printk(KERN_INFO "%s: Found %d x%d devices at 0x%x in %d-bit mode\n",
map->name, cfi->interleave, cfi->device_type*8, base,
@@ -127,36 +133,50 @@ static int cfi_probe_chip(struct map_inf
return 1;
}
-static int cfi_chip_setup(struct map_info *map,
+/* Called with IRQs already disabled, in the XIP case */
+static int __xipram cfi_chip_setup(struct map_info *map,
struct cfi_private *cfi)
{
int ofs_factor = cfi->interleave*cfi->device_type;
__u32 base = 0;
int num_erase_regions = cfi_read_query(map, base + (0x10 + 28)*ofs_factor);
int i;
+ unsigned char cfi_copy[sizeof(struct cfi_ident) + 32];
+
+ if (!num_erase_regions) {
+ /* Put it back into Read Mode */
+ cfi_send_gen_cmd(0xF0, 0, base, map, cfi, cfi->device_type, NULL);
+ xip_sti();
+ return 0;
+ }
+
+ cfi->cfi_mode = CFI_MODE_CFI;
+ cfi->fast_prog=1; /* CFI supports fast programming */
+
+ memset(cfi_copy,0,sizeof(cfi_copy));
+
+ if (num_erase_regions)
+ /* Read the CFI info structure */
+ for (i=0; i<(sizeof(struct cfi_ident) + num_erase_regions * 4); i++) {
+ cfi_copy[i] = cfi_read_query(map,base + (0x10 + i)*ofs_factor);
+ }
+
+ /* Put it back into Read Mode */
+ cfi_send_gen_cmd(0xF0, 0, base, map, cfi, cfi->device_type, NULL);
+ xip_sti();
#ifdef DEBUG_CFI
printk("Number of erase regions: %d\n", num_erase_regions);
#endif
- if (!num_erase_regions)
- return 0;
cfi->cfiq = kmalloc(sizeof(struct cfi_ident) + num_erase_regions * 4, GFP_KERNEL);
if (!cfi->cfiq) {
printk(KERN_WARNING "%s: kmalloc failed for CFI ident structure\n", map->name);
return 0;
}
-
- memset(cfi->cfiq,0,sizeof(struct cfi_ident));
-
- cfi->cfi_mode = CFI_MODE_CFI;
- cfi->fast_prog=1; /* CFI supports fast programming */
-
- /* Read the CFI info structure */
- for (i=0; i<(sizeof(struct cfi_ident) + num_erase_regions * 4); i++) {
- ((unsigned char *)cfi->cfiq)[i] = cfi_read_query(map,base + (0x10 + i)*ofs_factor);
- }
-
+
+ memcpy(cfi->cfiq, cfi_copy, sizeof(struct cfi_ident) + num_erase_regions * 4);
+
/* Do any necessary byteswapping */
cfi->cfiq->P_ID = le16_to_cpu(cfi->cfiq->P_ID);
@@ -166,6 +186,7 @@ static int cfi_chip_setup(struct map_inf
cfi->cfiq->InterfaceDesc = le16_to_cpu(cfi->cfiq->InterfaceDesc);
cfi->cfiq->MaxBufWriteSize = le16_to_cpu(cfi->cfiq->MaxBufWriteSize);
+
#ifdef DEBUG_CFI
/* Dump the information therein */
print_cfi_ident(cfi->cfiq);
@@ -180,8 +201,6 @@ static int cfi_chip_setup(struct map_inf
(cfi->cfiq->EraseRegionInfo[i] & 0xffff) + 1);
#endif
}
- /* Put it back into Read Mode */
- cfi_send_gen_cmd(0xF0, 0, base, map, cfi, cfi->device_type, NULL);
return 1;
}
Index: include/linux/mtd/cfi.h
===================================================================
RCS file: /home/cvs/mtd/include/linux/mtd/cfi.h,v
retrieving revision 1.35
diff -u -p -r1.35 cfi.h
--- include/linux/mtd/cfi.h 28 May 2003 15:37:32 -0000 1.35
+++ include/linux/mtd/cfi.h 4 Jun 2003 08:11:48 -0000
@@ -458,7 +458,7 @@ static inline __u8 cfi_read_query(struct
static inline void cfi_udelay(int us)
{
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0) && !defined (CONFIG_XIP_KERNEL)
unsigned long t = us * HZ / 1000000;
if (t) {
set_current_state(TASK_UNINTERRUPTIBLE);
@@ -467,7 +467,9 @@ static inline void cfi_udelay(int us)
}
#endif
udelay(us);
+#ifndef CONFIG_XIP_KERNEL
cond_resched();
+#endif
}
static inline void cfi_spin_lock(spinlock_t *mutex)
--- /dev/null Thu Jan 30 10:24:37 2003
+++ include/linux/mtd/xip.h Wed Jun 4 09:10:34 2003
@@ -0,0 +1,31 @@
+/*
+ * $Id$
+ */
+
+#ifndef __LINUX_MTD_XIP_H__
+#define __LINUX_MTD_XIP_H__
+
+
+
+#ifdef CONFIG_XIP_KERNEL
+#ifdef CONFIG_ARCH_PXA
+#include <asm/arch/pxa-regs.h>
+#define xip_irqpending() (ICIP & ICMR)
+#else
+#warning "No known way to check for IRQ pending on this arch"
+#define xip_irqpending() (0)
+#endif /* ARCH */
+
+
+#define xip_cli() do { cli(); } while(0)
+#define xip_sti() do { sti(); } while(0)
+
+#define __xipram __attribute__ ((__section__ (".text.xipram")))
+#else /* !CONFIG_XIP_KERNEL */
+#define xip_cli() do { } while(0)
+#define xip_sti() do { } while(0)
+#define xip_irqpending() (0)
+#define __xipram
+#endif /* CONFIG_XIP_KERNEL */
+
+#endif /* __LINUX_MTD_XIP_H__ */
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Execute in place.
2003-06-04 8:34 David Woodhouse
@ 2003-06-04 9:57 ` Charles Manning
2003-06-04 10:02 ` David Woodhouse
2003-06-04 9:57 ` Jörn Engel
1 sibling, 1 reply; 30+ messages in thread
From: Charles Manning @ 2003-06-04 9:57 UTC (permalink / raw)
To: David Woodhouse, linux-mtd
David
Is there any discussion leading up to XIP? ie. Why? ... and why not!
A few questions/comments inline.
On Wednesday 04 June 2003 20:34, David Woodhouse wrote:
> I've done some work on kernel XIP. The basic principle of operation is
> that we copy any code which may need to run while the flash chips are in
> anything but 'read' mode into RAM, and we disable interrupts while the
> chips are busy. During erases, we poll not only for erase completion but
> also for pending IRQs. If an IRQ is pending, we suspend the erase
> operation, re-enable IRQs and call cond_resched().
It scares me that this adds a potentially huge interrupt latency. Some NOR
devices need a good 20usec or so to get into erase suspend state. There are
many situations (eg ARM FIQs) where this would not be a nice thing to do.
Thus, mixing XIP with say a file system which might cause erases at
(relatively) arbitrary times needs to be done with great caution. Men with
red flags needed up front.
>
> The flash driver code is fairly simple, although it wants a little bit
> of cleanup and I want to be convinced that the arch-specific bits are
> done right before committing it.
>
> The arch-specific parts are implemented for XScale only so far, but
> should be relatively simple to do for new architectures. You need to
> ensure that parts of kernel code can be loaded into RAM instead of ROM,
> when marked with the '__xipram' attribute, that the udelay() function is
> in that section, and also your software TLB handlers if you have them --
> or indeed anything else which may be needed in the critical sections of
> the flash chip driver. I've reintroduced the get_unaligned() macro to
> potentially-unaligned data loads in the critical region, so alignment
> fixups do _not_ need to be in RAM.
>
> You also need to provide suitable xip_cli(), xip_sti(), and
> xip_irqpending() functions or macros in include/linux/mtd/xip.h. I
> suspect that wants moving to include/asm-$(ARCH)/xip.h or similar.
The fundamental implementation seems quite simple and elegant, IMHO.
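For readers who have not met the section trick before, here is a rough
user-space sketch of what the '__xipram' marking amounts to. It is not the
kernel code: demo_xipram and demo_status_ready are made-up names, and the
sketch assumes GCC or Clang on an ELF target (elsewhere the marker expands
to nothing, much as __xipram does without CONFIG_XIP_KERNEL). In the real
patch the linker script then collects .text.xipram and the boot code copies
it to RAM; none of that is shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GCC/Clang on ELF. The marker only names the output section;
 * relocating that section into RAM is the linker script's and the boot
 * code's job, which this sketch does not attempt. */
#if defined(__GNUC__) && defined(__ELF__)
#define demo_xipram __attribute__((__section__(".text.xipram"), __noinline__))
#else
#define demo_xipram
#endif

/* Hypothetical stand-in for a helper that must stay callable while the
 * flash is not in read mode, e.g. a status-register test. */
static int demo_xipram demo_status_ready(uint32_t status, uint32_t status_ok)
{
	/* A zero mask would report "ready" for any status, so reject it. */
	assert(status_ok != 0);
	return (status & status_ok) == status_ok;
}
```

On an ELF build, "objdump -h" on the object file should list a .text.xipram
section holding the function.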
-- Charles
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Execute in place.
2003-06-04 9:57 ` Charles Manning
@ 2003-06-04 10:02 ` David Woodhouse
2003-06-04 21:02 ` Charles Manning
0 siblings, 1 reply; 30+ messages in thread
From: David Woodhouse @ 2003-06-04 10:02 UTC (permalink / raw)
To: manningc2; +Cc: linux-mtd
On Wed, 2003-06-04 at 10:57, Charles Manning wrote:
> Is there any discussion leading up to XIP? ie. Why? ... and why not!
Why? Because your hardware designer didn't realise that RAM is cheaper
than flash, and gave you more flash than you need and not enough RAM.
Hence you want to run stuff uncompressed and use up all that nice
expensive flash space, while retaining your precious RAM.
Or, more charitably, perhaps you _really_ care about power consumption
and observe that flash takes less power than RAM.
Why not? Because it's mutually exclusive with compression and there are
interesting performance issues.
> It scares me that this adds a potentially huge interrupt latency. Some NOR
> devices need a good 20usec or so to get into erase suspend state. There are
> many situations (eg ARM FIQs) where this would not be a nice thing to do.
> Thus, mixing XIP with say a file system which might cause erases at
> (relatively) arbitrary times needs to be done with great caution. Men with
> red flags needed up front.
True. In practice it doesn't suck anywhere near as much as I expected it
to. Mount an empty flash as JFFS2, causing the whole thing to be erased
and cleanmarkers to be written, while flood-pinging... observe IRQ
latency really isn't that bad.
If you're doing pseudo-DMA tricks with FIQ, then of course you are going
to have problems, but that's not really the common case.
> The fundamental implementation seems quite simple and elegant, IMHO.
Good. My main concern was that there would be some platform where we
can't actually poll for interrupts like this, and we have to redirect
the IRQ vectors to somewhere in RAM, etc. I suppose that can still be
hidden behind xip_cli(), xip_sti() and xip_irqpending() though.
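To make that concrete, a rough user-space model of two ways the three hooks
could be filled in. This is only a sketch under stated assumptions: the
struct xip_hooks table, the sim_* "registers" and the RAM stub are all
invented for illustration (the patch uses plain macros and the real ICIP and
ICMR registers), and the second platform is purely hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The three hooks the flash driver relies on (hypothetical table form). */
struct xip_hooks {
	void (*cli)(void);
	void (*sti)(void);
	int  (*irqpending)(void);
};

/* Platform A: pollable controller, modelled on the ICIP & ICMR test.
 * sim_icip and sim_icmr are simulated, not the real registers. */
static uint32_t sim_icip, sim_icmr;
static int sim_irqs_on = 1;

static void poll_cli(void) { sim_irqs_on = 0; }
static void poll_sti(void) { sim_irqs_on = 1; }
static int  poll_irqpending(void) { return (sim_icip & sim_icmr) != 0; }

/* Platform B (hypothetical): nothing to poll, so cli() redirects the IRQ
 * vector to a stub living in RAM, which only notes that an IRQ arrived. */
static int vector_in_ram;
static volatile int stub_fired;

static void stub_cli(void) { stub_fired = 0; vector_in_ram = 1; }
static void stub_sti(void) { vector_in_ram = 0; }
static int  stub_irqpending(void) { return stub_fired; }
static void ram_stub_handler(void) { if (vector_in_ram) stub_fired = 1; }

static const struct xip_hooks poll_hooks = { poll_cli, poll_sti, poll_irqpending };
static const struct xip_hooks stub_hooks = { stub_cli, stub_sti, stub_irqpending };

/* Driver side: one critical section. irq_source, if non-NULL, simulates an
 * interrupt arriving while the flash is busy. Returns nonzero if the
 * driver would have to suspend the operation. */
static int busy_section_sees_irq(const struct xip_hooks *h, void (*irq_source)(void))
{
	int pending;

	assert(h && h->cli && h->sti && h->irqpending);
	h->cli();
	if (irq_source)
		irq_source();
	pending = h->irqpending();
	h->sti();
	return pending;
}

/* Small drivers for the two models. */
static int demo_poll_case(uint32_t icip, uint32_t icmr)
{
	sim_icip = icip;
	sim_icmr = icmr;
	return busy_section_sees_irq(&poll_hooks, 0);
}

static int demo_stub_case(int irq_arrives)
{
	return busy_section_sees_irq(&stub_hooks, irq_arrives ? ram_stub_handler : 0);
}
```

The point of the model is only that busy_section_sees_irq() never needs to
know which of the two schemes sits underneath.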
--
dwmw2
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Execute in place.
2003-06-04 10:02 ` David Woodhouse
@ 2003-06-04 21:02 ` Charles Manning
0 siblings, 0 replies; 30+ messages in thread
From: Charles Manning @ 2003-06-04 21:02 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd
On Wednesday 04 June 2003 22:02, you wrote:
> On Wed, 2003-06-04 at 10:57, Charles Manning wrote:
> > Is there any discussion leading up to XIP? ie. Why? ... and why not!
The main reason I raised this is that I would not like to think that people
are going to say "Ooooh lookie - nice new XIP stuff" and design for it
without understanding the pros and cons. Generally I think though that having
XIP is a GoodThing as it does provide more design options and flexibility to
redeploy old hw designs.
>
> Why? Because your hardware designer didn't realise that RAM is cheaper
> than flash, and gave you more flash than you need and not enough RAM.
> Hence you want to run stuff uncompressed and use up all that nice
> expensive flash space, while retaining your precious RAM.
>
> Or, more charitably, perhaps you _really_ care about power consumption
> and observe that flash takes less power than RAM.
Hmmmm. I don't think it is as straightforward as that at a systems level,
but anyway...
[snip]
> > The fundamental implementation seems quite simple and elegant, IMHO.
>
> Good. My main concern was that there would be some platform where we
> can't actually poll for interrupts like this, and we have to redirect
> the IRQ vectors to somewhere in RAM, etc. I suppose that can still be
> hidden behind xip_cli(), xip_sti() and xip_irqpending() though.
Of the micros I have worked with, the most opaque interrupt controller (and
hence hardest to deal with) is the 8259 as used with x86 (well excluding the
8048 and 8051 which are not worthy of consideration here). The 8259 is
sufficiently transparent to provide this polling. The trend with more modern
devices has been to make the peripherals (including the interrupt controller)
more transparent, thus making this less of a problem.
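As a rough sketch of what such polling could look like on an 8259: select
the interrupt request register with OCW3, read it back from the command
port, and discard whatever the mask register blocks. The port accessors
below are simulated (sim_inb, sim_outb and the sim_* state are made up so
the example runs in user space), only the master controller at the usual PC
ports is modelled, and the cascaded slave is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define PIC1_CMD	0x20	/* master 8259 command port on a PC */
#define PIC1_DATA	0x21	/* master 8259 data port (mask register) */
#define OCW3_READ_IRR	0x0A	/* OCW3: next read of the command port returns IRR */

/* Simulated chip state, standing in for real port I/O. */
static uint8_t sim_irr, sim_imr, sim_read_select;

static void sim_outb(uint8_t val, uint16_t port)
{
	if (port == PIC1_CMD)
		sim_read_select = val;
	else if (port == PIC1_DATA)
		sim_imr = val;
}

static uint8_t sim_inb(uint16_t port)
{
	if (port == PIC1_CMD)
		return sim_read_select == OCW3_READ_IRR ? sim_irr : 0;
	return port == PIC1_DATA ? sim_imr : 0xFF;
}

/* Hypothetical irq-pending test for an 8259. Note the polarity: a set bit
 * in the mask register disables that line. */
static int pic_irqpending(void)
{
	uint8_t irr, imr;

	sim_outb(OCW3_READ_IRR, PIC1_CMD);
	assert(sim_read_select == OCW3_READ_IRR);	/* model sanity check */
	irr = sim_inb(PIC1_CMD);
	imr = sim_inb(PIC1_DATA);
	return (irr & (uint8_t)~imr) != 0;
}

/* Load the simulated registers and run the test. */
static int demo_pic_case(uint8_t irr, uint8_t imr)
{
	sim_irr = irr;
	sim_imr = imr;
	return pic_irqpending();
}
```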
There I've made a definitive statement - watch me make an idiot of myself!
-- Charles
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Execute in place.
2003-06-04 8:34 David Woodhouse
2003-06-04 9:57 ` Charles Manning
@ 2003-06-04 9:57 ` Jörn Engel
2003-06-04 10:06 ` David Woodhouse
1 sibling, 1 reply; 30+ messages in thread
From: Jörn Engel @ 2003-06-04 9:57 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd
On Wed, 4 June 2003 09:34:50 +0100, David Woodhouse wrote:
>
> I've done some work on kernel XIP. The basic principle of operation is
> that we copy any code which may need to run while the flash chips are in
> anything but 'read' mode into RAM, and we disable interrupts while the
> chips are busy. During erases, we poll not only for erase completion but
> also for pending IRQs. If an IRQ is pending, we suspend the erase
> operation, re-enable IRQs and call cond_resched().
Maybe a stupid question, but anyway...
For my flash chips, an erase operation is in the ballpark of a second.
That is very long for interrupts, so I would assume, we never complete
an erase without being interrupted. Does this mean that a formerly
suspended and resumed erase operation completes quicker than a fresh
one? Or do we need really fast flash chips?
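To make the question concrete, here is a minimal sketch of the polling loop described in the quoted text, run against a simulated chip rather than real flash. The names xip_cli(), xip_sti() and xip_irqpending() appear in this thread, but their bodies here are stand-ins; struct sim_chip, chip_erase_done(), xip_erase_block() and the constants are hypothetical, and the model simply assumes that a suspended erase keeps its progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbers: erase work units, and polls between IRQs. */
enum { ERASE_TICKS = 1000, IRQ_PERIOD = 10 };

/* Simulated flash chip. */
struct sim_chip {
    int progress;    /* erase work done so far */
    bool suspended;  /* true while the erase is suspended */
};

/* Simulated interrupt source. */
static int sim_clock;
static bool sim_irq_flag;

static bool xip_irqpending(void) { return sim_irq_flag; }
static void xip_cli(void) { /* real code would mask IRQs at the CPU */ }
static void xip_sti(void) { sim_irq_flag = false; /* handlers run here */ }

/* One status poll: advance the erase unless suspended, tick the clock. */
static bool chip_erase_done(struct sim_chip *c)
{
    if (!c->suspended && c->progress < ERASE_TICKS)
        c->progress++;
    if (++sim_clock % IRQ_PERIOD == 0)
        sim_irq_flag = true;
    return c->progress >= ERASE_TICKS;
}

/* Erase one block; returns how often the erase had to be suspended. */
int xip_erase_block(struct sim_chip *c)
{
    int suspends = 0;

    sim_clock = 0;
    sim_irq_flag = false;
    c->progress = 0;
    c->suspended = false;

    xip_cli();
    while (!chip_erase_done(c)) {
        if (xip_irqpending()) {
            c->suspended = true;   /* erase suspend: chip back in read mode */
            xip_sti();             /* the real code also calls cond_resched() */
            suspends++;
            xip_cli();
            c->suspended = false;  /* erase resume */
        }
    }
    xip_sti();

    assert(c->progress == ERASE_TICKS);
    return suspends;
}
```

With these numbers xip_erase_block() suspends the erase 99 times and still completes. That only holds because the model keeps the progress counter across a suspend, which is exactly the property being asked about.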
Jörn
--
Ninety percent of everything is crap.
-- Sturgeon's Law
* Re: Execute in place.
2003-06-04 9:57 ` Jörn Engel
@ 2003-06-04 10:06 ` David Woodhouse
2003-06-04 10:08 ` David Woodhouse
2003-06-04 20:48 ` Charles Manning
0 siblings, 2 replies; 30+ messages in thread
From: David Woodhouse @ 2003-06-04 10:06 UTC (permalink / raw)
To: Jörn Engel; +Cc: linux-mtd
On Wed, 2003-06-04 at 10:57, Jörn Engel wrote:
> For my flash chips, an erase operation is in the ballpark of a second.
> That is very long for interrupts, so I would assume, we never complete
> an erase without being interrupted. Does this mean that a formerly
> suspended and resumed erase operation completes quicker than a fresh
> one?
That appears to be the case, yes. Certainly, in the common case of it
being 10ms before you get the next timer tick and consider rescheduling,
the chip's made enough progress that it doesn't have to start again from
scratch.
Someone with more knowledge of flash chip internals could possibly give
a more coherent and informative answer -- but I was concerned about the
possibility you raise and that's partly why I was doing the flood-ping
testing.
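As a rough back-of-envelope check, the figures in this thread (an erase of about a second, a 10ms timer tick) can be turned into a suspend count and a total time. This is only a sketch: estimate_erase() and struct erase_estimate are hypothetical names, the per-suspend overhead is an assumed parameter rather than a datasheet value, and time spent in handlers while the erase is suspended is ignored.

```c
#include <assert.h>

struct erase_estimate {
    unsigned int suspends;  /* timer ticks landing inside the erase */
    unsigned int total_us;  /* erase time plus suspend/resume overhead */
};

/* All arguments are in microseconds. */
struct erase_estimate estimate_erase(unsigned int erase_us,
                                     unsigned int tick_us,
                                     unsigned int overhead_us)
{
    struct erase_estimate e;

    assert(tick_us > 0);
    e.suspends = erase_us / tick_us;
    e.total_us = erase_us + e.suspends * overhead_us;
    return e;
}
```

For a 1,000,000us erase and a 10,000us tick this gives 100 suspends. With an assumed 50us of overhead per suspend/resume cycle the total comes to 1,005,000us, so the timer tick alone should not stretch the erase by much as long as progress survives each suspend.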
--
dwmw2
* Re: Execute in place.
2003-06-04 10:06 ` David Woodhouse
@ 2003-06-04 10:08 ` David Woodhouse
2003-06-04 12:28 ` Jörn Engel
2003-06-04 20:48 ` Charles Manning
1 sibling, 1 reply; 30+ messages in thread
From: David Woodhouse @ 2003-06-04 10:08 UTC (permalink / raw)
To: Jörn Engel; +Cc: linux-mtd
On Wed, 2003-06-04 at 11:06, David Woodhouse wrote:
> On Wed, 2003-06-04 at 10:57, Jörn Engel wrote:
> > For my flash chips, an erase operation is in the ballpark of a second.
> > That is very long for interrupts, so I would assume, we never complete
> > an erase without being interrupted. Does this mean that a formerly
> > suspended and resumed erase operation completes quicker than a fresh
> > one?
>
> That appears to be the case, yes. Certainly, in the common case of it
> being 10ms before you get the next timer tick and consider rescheduling,
> the chip's made enough progress that it doesn't have to start again from
> scratch.
Oh, and bear in mind that if erases were happening and you were trying
to read from the filesystem, that was causing many many erase suspends
to happen too anyway -- if this was a problem, it would have bitten us
as soon as we started suspending erases to permit reads to happen.
--
dwmw2
* Re: Execute in place.
2003-06-04 10:08 ` David Woodhouse
@ 2003-06-04 12:28 ` Jörn Engel
0 siblings, 0 replies; 30+ messages in thread
From: Jörn Engel @ 2003-06-04 12:28 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd
On Wed, 4 June 2003 11:08:48 +0100, David Woodhouse wrote:
> On Wed, 2003-06-04 at 11:06, David Woodhouse wrote:
> > On Wed, 2003-06-04 at 10:57, Jörn Engel wrote:
> > > For my flash chips, an erase operation is in the ballpark of a second.
> > > That is very long for interrupts, so I would assume, we never complete
> > > an erase without being interrupted. Does this mean that a formerly
> > > suspended and resumed erase operation completes quicker than a fresh
> > > one?
> >
> > That appears to be the case, yes. Certainly, in the common case of it
> > being 10ms before you get the next timer tick and consider rescheduling,
> > the chip's made enough progress that it doesn't have to start again from
> > scratch.
>
> Oh, and bear in mind that if erases were happening and you were trying
> to read from the filesystem, that was causing many many erase suspends
> to happen too anyway -- if this was a problem, it would have bitten us
> as soon as we started suspending erases to permit reads to happen.
Right. The great goddess Empiria smiles upon us.
Jörn
--
The cheapest, fastest and most reliable components of a computer
system are those that aren't there.
-- Gordon Bell, DEC laboratories
* Re: Execute in place.
2003-06-04 10:06 ` David Woodhouse
2003-06-04 10:08 ` David Woodhouse
@ 2003-06-04 20:48 ` Charles Manning
1 sibling, 0 replies; 30+ messages in thread
From: Charles Manning @ 2003-06-04 20:48 UTC (permalink / raw)
To: David Woodhouse, Jörn Engel; +Cc: linux-mtd
On Wednesday 04 June 2003 22:06, David Woodhouse wrote:
> On Wed, 2003-06-04 at 10:57, Jörn Engel wrote:
> > For my flash chips, an erase operation is in the ballpark of a second.
> > That is very long for interrupts, so I would assume, we never complete
> > an erase without being interrupted. Does this mean that a formerly
> > suspended and resumed erase operation completes quicker than a fresh
> > one?
>
> That appears to be the case, yes. Certainly, in the common case of it
> being 10ms before you get the next timer tick and consider rescheduling,
> the chip's made enough progress that it doesn't have to start again from
> scratch.
>
> Someone with more knowledge of flash chip internals could possibly give
> a more coherent and informative answer -- but I was concerned about the
> possibility you raise and that's partly why I was doing the flood-ping
> testing.
That's the whole purpose of erase suspend/resume. Inside the flash there is a
wee counter thing that drives the erase state machine. When you erase suspend
it stops; when you resume it continues where it left off.
Of course if you abort the erase (like you'd do for NAND and those NORs that
might not have erase suspend), then the state machine starts from scratch. In
this case you cannot guarantee termination since you might be forever
restarting the erase and it never gets a chance to complete.
NAND does not bug me since NAND and XIP are mutually exclusive.
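The difference between suspending and aborting can be shown with a toy model of that counter. This is a sketch under stated assumptions, not a description of any real chip: erase_terminates() and its parameters are hypothetical, one loop step stands for one unit of erase work, and an interruption arrives every `period` steps.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if an erase needing `work` steps finishes within `budget`
 * steps when it is interrupted every `period` steps.
 */
bool erase_terminates(int work, int period, bool can_suspend, int budget)
{
    int counter = 0;  /* the chip's internal erase progress counter */

    assert(work > 0 && period > 0);
    for (int step = 1; step <= budget; step++) {
        counter++;
        if (counter >= work)
            return true;
        if (step % period == 0 && !can_suspend)
            counter = 0;  /* abort: the state machine starts from scratch */
        /* with erase suspend the counter is left where it was */
    }
    return false;
}
```

With work = 1000 and period = 10, the suspend case finishes after 1000 steps, while the abort case never gets past a count of 10 however large the budget is. An aborted erase only completes if the work fits between two interruptions.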
-- Charles
end of thread, other threads:[~2007-05-08 11:59 UTC | newest]
Thread overview: 30+ messages
2007-05-02 23:11 Execute in place Al Boldi
2007-05-03 7:31 ` Dmitry Krivoschekov
2007-05-03 11:33 ` Al Boldi
2007-05-03 17:38 ` Dmitry Krivoschekov
2007-05-07 16:46 ` H. Peter Anvin
2007-05-07 19:37 ` Al Boldi
2007-05-07 19:41 ` H. Peter Anvin
2007-05-07 20:56 ` Al Boldi
2007-05-08 6:06 ` H. Peter Anvin
2007-05-08 11:36 ` Al Boldi
2007-05-08 11:37 ` H. Peter Anvin
2007-05-08 12:02 ` Al Boldi
-- strict thread matches above, loose matches on Subject: below --
2007-05-01 21:55 Phillip Susi
2007-05-02 14:04 ` Hugh Dickins
2007-05-02 14:38 ` Björn Steinbrink
2007-05-02 15:22 ` Hugh Dickins
2007-05-02 19:30 ` Phillip Susi
2007-05-02 20:34 ` Hugh Dickins
2007-05-03 11:38 ` Erik Mouw
2007-05-03 15:37 ` Jörn Engel
2007-05-03 12:12 ` Robin Getz
2003-06-04 8:34 David Woodhouse
2003-06-04 9:57 ` Charles Manning
2003-06-04 10:02 ` David Woodhouse
2003-06-04 21:02 ` Charles Manning
2003-06-04 9:57 ` Jörn Engel
2003-06-04 10:06 ` David Woodhouse
2003-06-04 10:08 ` David Woodhouse
2003-06-04 12:28 ` Jörn Engel
2003-06-04 20:48 ` Charles Manning