From: ebiederman@lnxi.com (Eric W. Biederman)
To: David Woodhouse
Cc: linux-mtd@lists.infradead.org
Date: 26 Jan 2004 02:23:23 -0700
Subject: Re: Q: Filesystem choice..

David Woodhouse writes:

> On Mon, 2004-01-26 at 00:09 -0700, Eric W. Biederman wrote:
> > Has anyone gotten as far as a proof? Or are there some informal
> > things that almost make up a proof, so I could get a feel? Reserving
> > more than a single erase block is going to be hard to swallow for
> > such a small filesystem.
>
> You need to have enough space to let garbage collection make progress.
> Which means it has to be able to GC a whole erase block into space
> elsewhere, then erase it. That's basically one block you require.
>
> Except you have to account for write errors or power cycles during a GC
> write, wasting some of your free space. You have to account for the
> possibility that what started off as a single 4KiB node in the original
> block now hits the end of the new erase block and is split between that
> and the start of another, so effectively it grew because it has an
> extra node header now. And of course when you do that you get worse
> compression ratios too, since 2KiB blocks compress less effectively
> than 4KiB blocks do.

Compression is an interesting question. Do you encode the uncompressed
size of a block in bytes? If so, I don't think it would be too difficult
to get your uncompressed block size > page size. With the page cache
there is no real reason to require a block size <= page size; you just
need what amounts to scatter/gather support.

My real question here is: how difficult is it to disable compression?
Or can compression be deliberately disabled on a per-file basis? For the
two primary files I am thinking of using, neither one would need
compression. A file of my BIOS settings would be dense and quite small
(128 bytes on a big day). A kernel is already compressed and carries its
own decompressor, and whole-file compression is more effective than
compressing small blocks.

> When you get down to the kind of sizes you're talking about, I suspect
> we need to be thinking in bytes rather than blocks -- because there
> isn't just one threshold; there's many, of which three are particularly
> relevant:

That makes sense. This at least looks like a viable alternative for the
1MB case.

[snip actual formulas]

> You want resv_blocks_write to be larger than resv_blocks_deletion, and
> I suspect you could get away with values of 2 and 1.5 respectively, if
> we were counting bytes rather than whole eraseblocks.

I have a truly perverse case I would like to ask your opinion about: a
filesystem composed of two 8K erase blocks. That is one of the weird
special cases that flash chips often support. I could only store my
parameter file in there, but it would be interesting.
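Roughly, the byte-granularity accounting I have in mind looks like the
sketch below. None of these names are real jffs2 symbols -- it is purely
illustrative of tracking the free space and the two reserves in bytes,
the way you suggest:

/*
 * Illustrative only: made-up names, not the real jffs2 structures.
 * The point is just that free space, the write reserve and the
 * deletion reserve are all accounted in bytes, not erase blocks.
 */
#include <stddef.h>

struct tiny_fs {
	size_t block_size;		/* e.g. 8192 */
	size_t nr_blocks;		/* e.g. 2 */
	size_t used_bytes;		/* live nodes, headers included */
	size_t dirty_bytes;		/* obsoleted nodes awaiting GC */
	size_t resv_write_bytes;	/* reserve for ordinary writes */
	size_t resv_delete_bytes;	/* smaller reserve for deletion marks */
};

size_t free_bytes(const struct tiny_fs *fs)
{
	return fs->block_size * fs->nr_blocks
	       - fs->used_bytes - fs->dirty_bytes;
}

/* An ordinary write of len bytes must leave the write reserve intact. */
int can_write(const struct tiny_fs *fs, size_t len)
{
	return free_bytes(fs) >= len + fs->resv_write_bytes;
}

/* Deletion markers get the looser threshold, so space can always be
 * obsoleted and reclaimed by GC even when ordinary writes are refused. */
int can_delete(const struct tiny_fs *fs, size_t len)
{
	return free_bytes(fs) >= len + fs->resv_delete_bytes;
}

Whether write and deletion reserves of roughly 2 and 1.5 blocks' worth
of bytes leave anything usable when there are only two blocks total is
exactly the part I am unsure about.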
If I counted bytes very carefully and never got above half a block full,
I suspect it would work, and be useful. I'd just have to make certain
the degenerate case matched the original jffs.

And a last question: jffs2 rounds all erase blocks up to a common size,
doesn't it?

Eric
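P.S. To put back-of-the-envelope numbers on the perverse case: two 8KiB
blocks give 16KiB total. Keeping one whole block free, so GC can always
copy the other block's live nodes into it, leaves 8KiB, and staying
under half a block of live data means roughly 4KiB of valid nodes plus
headers -- still comfortable headroom for a parameter file of ~128
bytes, even with a node header per version written.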