* JFFS3 and RAM consumprion reincarnated
@ 2005-03-01 16:28 Artem B. Bityuckiy
  2005-03-02 14:44 ` Jörn Engel
  0 siblings, 1 reply; 8+ messages in thread
From: Artem B. Bityuckiy @ 2005-03-01 16:28 UTC
  To: MTD List

Hello,

I'd like to continue the JFFS3 RAM consumption discussion. Please take a
glimpse at
http://lists.infradead.org/pipermail/linux-mtd/2005-January/011671.html
and follow the conversation at
http://lists.infradead.org/pipermail/linux-mtd/2005-January/011716.html
to refresh your memory.

Well, I suppose the reader knows what Summary and ICP are. If not,
follow the links above.

We've stopped on the design like this: each inode has ICP which is stored
on flash. The ICP is being outdated by GC and users. Since it is
inefficient to rewrite new inode every time, we won't rewrite it but add
node_ref instead. Sometimes we'll flush ICP and free node_refs.

Example:

1. We have inode X with 10 nodes (node 1, node 2, ... node 10) and an ICP
node which describes them. So far so good. On an iget() call we read the
ICP and have the inode built.

2. Suppose GC has moved node 1 of our inode. Its position has changed,
so the corresponding ICP entry is now obsolete. In this case JFFS3 does
not rewrite a new ICP. Instead, it allocates a node_ref for node 1 and
keeps it in-core. Now, on an iget() call, we read the ICP and node 1's
node_ref, and this is sufficient to build our inode.

Analogously, if other nodes are moved, we just allocate more node_refs.
Sometimes (I don't specify when) we might rewrite the old obsolete ICP
and free the node_refs. And so on.

I think it makes sense to introduce a JFFS3 parameter that limits the
amount of in-core RAM that JFFS3 might consume. Let's call this
parameter JFFS3_MAXRAM. If JFFS3_MAXRAM = 0, we rewrite a new ICP every
time any of its entries is obsoleted. If JFFS3_MAXRAM is very large,
JFFS3 = JFFS2 with respect to memory consumption.

Comments?

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
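The ICP plus node_ref scheme described in the message above can be sketched in user-space C. This is only an illustration under stated assumptions: the struct layouts, the function names (`node_moved`, `flush_icp`, `node_position`), the fixed `MAX_NODES` limit and the per-inode accounting of the budget are all hypothetical and do not come from any JFFS2/JFFS3 source. The variable `jffs3_maxram` stands in for the proposed JFFS3_MAXRAM parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 16            /* illustrative per-inode limit (assumption) */

/* One ICP entry as it would be stored on flash: where a node lives. */
struct icp_entry {
    uint32_t node_id;
    uint32_t flash_ofs;
};

/* In-core override for an ICP entry that became stale (hypothetical layout). */
struct node_ref {
    uint32_t node_id;
    uint32_t flash_ofs;
};

struct inode_state {
    struct icp_entry icp[MAX_NODES];    /* stands in for the on-flash ICP */
    size_t icp_len;
    struct node_ref refs[MAX_NODES];    /* in-core overlay */
    size_t nrefs;
    unsigned icp_rewrites;              /* how often the ICP was rewritten */
};

/* Byte budget for in-core node_refs; 0 means "rewrite the ICP every time". */
static size_t jffs3_maxram;

static size_t refs_bytes(const struct inode_state *ino)
{
    return ino->nrefs * sizeof(struct node_ref);
}

/* Fold all overrides back into the ICP and free the node_refs. */
static void flush_icp(struct inode_state *ino)
{
    for (size_t r = 0; r < ino->nrefs; r++)
        for (size_t i = 0; i < ino->icp_len; i++)
            if (ino->icp[i].node_id == ino->refs[r].node_id)
                ino->icp[i].flash_ofs = ino->refs[r].flash_ofs;
    ino->nrefs = 0;
    ino->icp_rewrites++;
}

/* GC (or a user write) moved a node: record its new position in-core. */
static void node_moved(struct inode_state *ino, uint32_t node_id,
                       uint32_t new_ofs)
{
    size_t r;

    for (r = 0; r < ino->nrefs; r++)
        if (ino->refs[r].node_id == node_id)
            break;
    if (r == ino->nrefs) {              /* no node_ref for this node yet */
        assert(ino->nrefs < MAX_NODES);
        ino->refs[ino->nrefs++].node_id = node_id;
    }
    ino->refs[r].flash_ofs = new_ofs;

    if (refs_bytes(ino) > jffs3_maxram) /* over budget: rewrite the ICP */
        flush_icp(ino);
}

/* What iget() would consult: a node_ref wins over the stale ICP entry. */
static uint32_t node_position(const struct inode_state *ino, uint32_t node_id)
{
    for (size_t r = 0; r < ino->nrefs; r++)
        if (ino->refs[r].node_id == node_id)
            return ino->refs[r].flash_ofs;
    for (size_t i = 0; i < ino->icp_len; i++)
        if (ino->icp[i].node_id == node_id)
            return ino->icp[i].flash_ofs;
    return UINT32_MAX;                  /* unknown node */
}
```

With `jffs3_maxram = 0` every call to `node_moved()` triggers an ICP rewrite, and with a very large value the overlay simply grows, which matches the two extremes named in the message. A real implementation would presumably account the budget across all inodes rather than per inode.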
* Re: JFFS3 and RAM consumprion reincarnated
  2005-03-01 16:28 JFFS3 and RAM consumprion reincarnated Artem B. Bityuckiy
@ 2005-03-02 14:44 ` Jörn Engel
  2005-03-03 11:09   ` Artem B. Bityuckiy
  2005-03-03 18:34   ` Jared Hulbert
  0 siblings, 2 replies; 8+ messages in thread
From: Jörn Engel @ 2005-03-02 14:44 UTC
  To: Artem B. Bityuckiy; +Cc: MTD List

On Tue, 1 March 2005 16:28:36 +0000, Artem B. Bityuckiy wrote:
>
> We've stopped on the design like this: each inode has ICP which is stored
> on flash. The ICP is being outdated by GC and users. Since it is
> inefficient to rewrite new inode every time, we won't rewrite it but add
> node_ref instead. Sometimes we'll flush ICP and free node_refs.

I agree with tglx, your approach is complicated (aka horrible). How
about something much simpler:
o The ICP is just a list of erase blocks.
o For any non-obsolete node belonging to an inode, the containing
  erase block number *must* be part of ICP.
o If an erase block doesn't contain non-obsolete nodes for this inode,
  its number *should* not be part of ICP.
o The ICP *can* be stored in flash.

Advantages over current design:
o Lower memory consumption, as we don't track individual nodes anymore.

Advantages over your old ICP concept:
o GC and write are simple. They simply add the current eraseblock to
  the ICP list, if it isn't part of it already.

Disadvantages:
o Whenever we need to check the full node list, we take a few more
  indirections.
o Removal of erase blocks from the ICP list is done on lookup.
  "Whoops, this erase block doesn't contain any of my nodes."

Ultimate advantage: Design can be explained in less words. :)

Jörn

--
My second remark is that our intellectual powers are rather geared to
master static relations and that our powers to visualize processes
evolving in time are relatively poorly developed.
-- Edsger W. Dijkstra
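The erase-block-list variant proposed in the message above can be illustrated with a small user-space C sketch. Everything named here (`struct inode_blocks`, `ibl_add`, `ibl_prune`, the `IBL_MAX` cap, and the stand-in predicate `demo_has_nodes`) is hypothetical and chosen only for the example; the message itself gives no data layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IBL_MAX 32      /* illustrative cap on blocks per inode (assumption) */

/* Per-inode list of erase blocks that may hold valid nodes of the inode. */
struct inode_blocks {
    uint32_t blk[IBL_MAX];
    size_t n;
};

static bool ibl_contains(const struct inode_blocks *l, uint32_t eb)
{
    for (size_t i = 0; i < l->n; i++)
        if (l->blk[i] == eb)
            return true;
    return false;
}

/*
 * Called by both the write path and GC after a node of this inode was
 * placed in erase block eb: add the block unless it is already listed.
 */
static void ibl_add(struct inode_blocks *l, uint32_t eb)
{
    if (ibl_contains(l, eb))
        return;
    assert(l->n < IBL_MAX);
    l->blk[l->n++] = eb;
}

/*
 * Lazy cleanup on lookup. has_nodes() is a hypothetical callback that
 * answers "does block eb still hold a valid node of this inode?", for
 * instance by consulting the block's summary. Blocks that fail the
 * check are dropped from the list.
 */
static void ibl_prune(struct inode_blocks *l, bool (*has_nodes)(uint32_t eb))
{
    size_t keep = 0;

    for (size_t i = 0; i < l->n; i++)
        if (has_nodes(l->blk[i]))
            l->blk[keep++] = l->blk[i];
    l->n = keep;
}

/* Stand-in predicate for the example: pretend block 7 holds no nodes. */
static bool demo_has_nodes(uint32_t eb)
{
    return eb != 7;
}
```

The list is allowed to contain stale entries (the "should not" rule) but never to miss a block with valid nodes (the "must" rule), which is why `ibl_add()` runs eagerly while `ibl_prune()` can be deferred to lookup time.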
* Re: JFFS3 and RAM consumprion reincarnated
  2005-03-02 14:44 ` Jörn Engel
@ 2005-03-03 11:09   ` Artem B. Bityuckiy
  2005-03-04 16:24     ` Jörn Engel
  2005-03-03 18:34   ` Jared Hulbert
  1 sibling, 1 reply; 8+ messages in thread
From: Artem B. Bityuckiy @ 2005-03-03 11:09 UTC
  To: Jörn Engel; +Cc: MTD List

Dear Joern,

> I agree with tglx, your approach is complicated (aka horrible). How
> about something much simpler:
Well, this is one of the approaches I've already proposed :-)

> o The ICP is just a list of erase blocks.
> o For any non-obsolete node belonging to an inode, the containing
>   erase block number *must* be part of ICP.
> o If an erase block doesn't contain non-obsolete nodes for this inode,
>   its number *should* not be part of ICP.
> o The ICP *can* be stored in flash.
Ok.

> Advantages over current design:
> o Lower memory consumption, as we don't track individual nodes anymore.
Right.

> Advantages over your old ICP concept:
> o GC and write are simple. They simply add the current eraseblock to
>   the ICP list, if it isn't part of it already.
Right.

> Disadvantages:
> o Whenever we need to check the full node list, we take a few more
>   indirections.
I'll elaborate on your "whenever" - this is the iget() call at most. The
per-inode list might also be required when we deal with
deletion/deleted direntries.

> o Removal of erase blocks from the ICP list is done on lookup.
>   "Whoops, this erase block doesn't contain any of my nodes."
Right.

> Ultimate advantage: Design can be explained in less words. :)
I don't think this is so important :-)

Again:
> o Whenever we need to check the full node list, we take a few more
>   indirections.
Well, this might be a serious disadvantage. Conceivably, there is a
large file which is distributed over a lot of blocks. The iget() of the
corresponding inode implies:

for (all blocks relating to our inode)
{
    Read_the_block_summary();
    Identify_the_position_of_all_the_inode's_nodes();
    for (all the nodes found)
    {
        Read_the_node();
    }
}

We risk ending up with an extremely slow iget().

But yes, as I wrote earlier and as you have affirmed, this is a fairly
simple and elegant idea. ICP is not needed here at all while summary
nodes are obligatory. And this fits well with the idea of a superblock
which is distributed and encompasses summaries.

BTW, I assume that the technique we're talking about is applied *only*
to inodes that aren't in the inode cache (i.e., iget() hasn't been
called for them yet). (Is there some term for them?) Those inodes that
are in the inode cache do not need this since they keep track of nodes
in the fragtree/dirent list.

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
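To make the iget() cost concern above concrete, here is a minimal user-space C sketch of the pseudocode loop that only counts flash reads. The summary layout (`struct sum_entry`, `struct block_summary`, a fixed `NODES_PER_BLOCK`) and the function `iget_scan` are assumptions made for illustration, not the actual JFFS2/JFFS3 summary format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NODES_PER_BLOCK 4   /* illustrative summary size (assumption) */

/* One summary entry: which inode a node belongs to and where it sits. */
struct sum_entry {
    uint32_t ino;           /* 0 marks an unused slot */
    uint32_t ofs;           /* offset of the node inside the block */
};

/* A block summary describes every node in its erase block. */
struct block_summary {
    struct sum_entry e[NODES_PER_BLOCK];
};

/* Flash accesses needed to build one inode. */
struct iget_cost {
    unsigned summary_reads;
    unsigned node_reads;
};

/*
 * Walk the erase blocks listed for the inode, read each block summary,
 * and count one node read per summary entry that belongs to ino.
 * flash[] is indexed by erase block number; ino must be non-zero.
 */
static struct iget_cost iget_scan(const struct block_summary *flash,
                                  const uint32_t *blocks, size_t nblocks,
                                  uint32_t ino)
{
    struct iget_cost c = { 0, 0 };

    assert(flash != NULL && ino != 0);
    for (size_t b = 0; b < nblocks; b++) {
        const struct block_summary *s = &flash[blocks[b]];

        c.summary_reads++;                      /* Read_the_block_summary() */
        for (size_t i = 0; i < NODES_PER_BLOCK; i++)
            if (s->e[i].ino == ino)
                c.node_reads++;                 /* Read_the_node() */
    }
    return c;
}
```

The total is one summary read per listed block plus one read per node, so a file spread thinly over many blocks pays mostly for summaries. How slow that is in practice is exactly what the thread leaves open.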
* Re: JFFS3 and RAM consumprion reincarnated
  2005-03-03 11:09   ` Artem B. Bityuckiy
@ 2005-03-04 16:24     ` Jörn Engel
  2005-03-05 11:15       ` Artem B. Bityuckiy
  0 siblings, 1 reply; 8+ messages in thread
From: Jörn Engel @ 2005-03-04 16:24 UTC
  To: Artem B. Bityuckiy; +Cc: MTD List

On Thu, 3 March 2005 11:09:12 +0000, Artem B. Bityuckiy wrote:
>
> > I agree with tglx, your approach is complicated (aka horrible). How
> > about something much simpler:
> Well, this is one of the approaches I've already proposed :-)

Ok, sorry. -ENOTIME, didn't read all previous postings, just your
references.

> > Disadvantages:
> > o Whenever we need to check the full node list, we take a few more
> >   indirections.
> I'll elaborate on your "whenever" - this is the iget() call at most. The
> per-inode list might also be required when we deal with
> deletion/deleted direntries.

Yup. Thanks for checking, I was too lazy.

> > Ultimate advantage: Design can be explained in less words. :)
> I don't think this is so important :-)

Not in English, no. But a simple design is an indication of a simple
implementation - more robust, less buggy, pick your favorite
attribute.

> Again:
> > o Whenever we need to check the full node list, we take a few more
> >   indirections.
> Well, this might be a serious disadvantage. Conceivably, there is a
> large file which is distributed over a lot of blocks. The iget() of the
> corresponding inode implies:
>
> for (all blocks relating to our inode)
> {
>     Read_the_block_summary();
>     Identify_the_position_of_all_the_inode's_nodes();
>     for (all the nodes found)
>     {
>         Read_the_node();
>     }
> }

Almost. Unless I misread Read_the_block_summary() and you mean "take
the list of erase blocks from ICP".

> We risk ending up with an extremely slow iget().

Hence, we should cache this. Extremely slow iget() under memory
pressure is fine, still much faster than OOM. Without memory
pressure, we'd have current performance.

> But yes, as I wrote earlier and as you have affirmed, this is a fairly
> simple and elegant idea. ICP is not needed here at all while summary
> nodes are obligatory. And this fits well with the idea of a superblock
> which is distributed and encompasses summaries.

Well, I'd still store *some* information, namely the full list of
erase blocks containing nodes. Not sure if that is necessary, maybe
you're right and we should get rid of this information as well.

Time to code and test things.

Jörn

--
Measure. Don't tune for speed until you've measured, and even then
don't unless one part of the code overwhelms the rest.
-- Rob Pike
* Re: JFFS3 and RAM consumprion reincarnated
  2005-03-04 16:24     ` Jörn Engel
@ 2005-03-05 11:15       ` Artem B. Bityuckiy
  2005-03-05 11:27         ` Jörn Engel
  0 siblings, 1 reply; 8+ messages in thread
From: Artem B. Bityuckiy @ 2005-03-05 11:15 UTC
  To: Jörn Engel; +Cc: MTD List

On Fri, 2005-03-04 at 17:24 +0100, Jörn Engel wrote:
> Not in English, no. But a simple design is an indication of a simple
> implementation - more robust, less buggy, pick your favorite
> attribute.
Simple and clear design is certainly of high priority. But again, I'm
afraid the performance will suffer too much.

> Almost. Unless I misread Read_the_block_summary() and you mean "take
> the list of erase blocks from ICP"
ICP contains per-inode information. Physical nodes are placed
everywhere.
Summary node contains per-block information, i.e., one summary node
describes all the nodes in the current block. We suppose JFFS3 supports
summaries.

Consequently, Read_the_block_summary() means read the summary node, no
need to scan the block.

>
> > We risk ending up with an extremely slow iget().
>
> Hence, we should cache this. Extremely slow iget() under memory
> pressure is fine, still much faster than OOM. Without memory
> pressure, we'd have current performance.
Hmm. Do you know whether it is possible to register a JFFS2-specific
"reap" function?

>
> > But yes, as I wrote earlier and as you have affirmed, this is a fairly
> > simple and elegant idea. ICP is not needed here at all while summary
> > nodes are obligatory. And this fits well with the idea of a superblock
> > which is distributed and encompasses summaries.
>
> Well, I'd still store *some* information, namely the full list of
> erase blocks containing nodes. Not sure if that is necessary, maybe
> you're right and we should get rid of this information as well.
No need to create ICP to store this IMO.

> Time to code and test things.
I think it is a bit early. We need to discuss and agree on something.
Then document it and agree again. :-) If there are several approaches,
I'd like to design them all :-)
I'll try to gather everything together and document this. I'd be happy
to get some help :=)

Furthermore, I'd like to discuss several extra ideas, e.g.:
* Separate user writes and GC writes into different blocks.
* Deletion direntries processing. It is far from good in JFFS3.

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
* Re: JFFS3 and RAM consumprion reincarnated
  2005-03-05 11:15       ` Artem B. Bityuckiy
@ 2005-03-05 11:27         ` Jörn Engel
  0 siblings, 0 replies; 8+ messages in thread
From: Jörn Engel @ 2005-03-05 11:27 UTC
  To: Artem B. Bityuckiy; +Cc: MTD List

On Sat, 5 March 2005 14:15:40 +0300, Artem B. Bityuckiy wrote:
> On Fri, 2005-03-04 at 17:24 +0100, Jörn Engel wrote:
> > Not in English, no. But a simple design is an indication of a simple
> > implementation - more robust, less buggy, pick your favorite
> > attribute.
> Simple and clear design is certainly of high priority. But again, I'm
> afraid the performance will suffer too much.

I don't care. Code, test. If tests agree with you, you win.

> > Almost. Unless I misread Read_the_block_summary() and you mean "take
> > the list of erase blocks from ICP"
> ICP contains per-inode information. Physical nodes are placed
> everywhere.
> Summary node contains per-block information, i.e., one summary node
> describes all the nodes in the current block. We suppose JFFS3 supports
> summaries.
>
> Consequently, Read_the_block_summary() means read the summary node, no
> need to scan the block.

Ok, sorry. s/ICP/IBL/

IBL is the Inode Block List, which contains all erase blocks
containing valid nodes. It may contain more, but not less.

> > Hence, we should cache this. Extremely slow iget() under memory
> > pressure is fine, still much faster than OOM. Without memory
> > pressure, we'd have current performance.
> Hmm. Do you know whether it is possible to register a JFFS2-specific
> "reap" function?

set_shrinker

> > Well, I'd still store *some* information, namely the full list of
> > erase blocks containing nodes. Not sure if that is necessary, maybe
> > you're right and we should get rid of this information as well.
> No need to create ICP to store this IMO.

Could be. The IBL will reduce the number of erase blocks to test
(performance), but increase the memory consumption again. As usual,
we need to test to see if the tradeoff is worth it.

> > Time to code and test things.
> I think it is a bit early. We need to discuss and agree on something.
> Then document it and agree again. :-) If there are several approaches,
> I'd like to design them all :-)
> I'll try to gather everything together and document this. I'd be happy
> to get some help :=)

I personally hate to discuss, agree, document, carve in stone and have
the pope sprinkle holy water over something that may not survive the
first contact with reality. Get some cold hard numbers, no one will
disagree with those.

If the case was obvious, yes, we could go down your path. But our
discussion already proves it's not.

> Furthermore, I'd like to discuss several extra ideas, e.g.:
> * Separate user writes and GC writes into different blocks.
> * Deletion direntries processing. It is far from good in JFFS3.

Sure. Separate threads?

Jörn

--
It's not whether you win or lose, it's how you place the blame.
-- unknown
* Re: JFFS3 and RAM consumprion reincarnated
  2005-03-02 14:44 ` Jörn Engel
  2005-03-03 11:09   ` Artem B. Bityuckiy
@ 2005-03-03 18:34   ` Jared Hulbert
  2005-03-04 16:07     ` Jörn Engel
  1 sibling, 1 reply; 8+ messages in thread
From: Jared Hulbert @ 2005-03-03 18:34 UTC
  To: Jörn Engel; +Cc: MTD List

On Wed, 2 Mar 2005 15:44:07 +0100, Jörn Engel
<joern@wohnheim.fh-wedel.de> wrote:
> I agree with tglx, your approach is complicated (aka horrible).

Can you point me to the arguments where you and tglx explain why it's
'horrible'? I was thinking it was a rather nice idea myself.
* Re: JFFS3 and RAM consumprion reincarnated
  2005-03-03 18:34   ` Jared Hulbert
@ 2005-03-04 16:07     ` Jörn Engel
  0 siblings, 0 replies; 8+ messages in thread
From: Jörn Engel @ 2005-03-04 16:07 UTC
  To: Jared Hulbert; +Cc: MTD List

On Thu, 3 March 2005 10:34:52 -0800, Jared Hulbert wrote:
> On Wed, 2 Mar 2005 15:44:07 +0100, Jörn Engel
> <joern@wohnheim.fh-wedel.de> wrote:
> > I agree with tglx, your approach is complicated (aka horrible).
>
> Can you point me to the arguments where you and tglx explain why it's
> 'horrible'. I was thinking it was a rather nice idea myself.

Follow one of the links Artem had in the original mail to this thread.
One of tglx's comments was "this is horrible" or similar.

Jörn

--
Good warriors cause others to come to them and do not go to others.
-- Sun Tzu
end of thread

Thread overview: 8+ messages
2005-03-01 16:28 JFFS3 and RAM consumprion reincarnated Artem B. Bityuckiy
2005-03-02 14:44 ` Jörn Engel
2005-03-03 11:09   ` Artem B. Bityuckiy
2005-03-04 16:24     ` Jörn Engel
2005-03-05 11:15       ` Artem B. Bityuckiy
2005-03-05 11:27         ` Jörn Engel
2005-03-03 18:34   ` Jared Hulbert
2005-03-04 16:07     ` Jörn Engel