From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.nokia.com ([192.100.105.134] helo=mgw-mx09.nokia.com)
 by bombadil.infradead.org with esmtps (Exim 4.69 #1 (Red Hat Linux))
 id 1OC7Qy-0002NV-Jk
 for linux-mtd@lists.infradead.org; Wed, 12 May 2010 08:36:53 +0000
Subject: Re: UBIL design doc
From: Artem Bityutskiy
To: Brijesh Singh
References: <1273475736.2209.88.camel@localhost> <1273650099.22706.41.camel@localhost>
Content-Type: text/plain; charset="UTF-8"
Date: Wed, 12 May 2010 11:35:41 +0300
Message-ID: <1273653341.22706.46.camel@localhost>
Mime-Version: 1.0
Content-Transfer-Encoding: 8bit
Cc: Thomas Gleixner, linux-mtd@lists.infradead.org, rohitvdongre@gmail.com
Reply-To: dedekind1@gmail.com
List-Id: Linux MTD discussion mailing list

On Wed, 2010-05-12 at 13:33 +0530, Brijesh Singh wrote:
> Hi,
>
> On Wed, May 12, 2010 at 1:11 PM, Artem Bityutskiy wrote:
> > On Tue, 2010-05-11 at 21:17 +0200, Thomas Gleixner wrote:
> >> On Mon, 10 May 2010, Artem Bityutskiy wrote:
> >>
> >> > On Sun, 2010-05-09 at 01:09 +0530, Brijesh Singh wrote:
> >> > > Hi,
> >> > > I am forwarding you the design document for ubi with log. Please
> >> > > find the ubil document at
> >> > > http://git.infradead.org/users/brijesh/ubil_results/blob_plain/HEAD:/UBIL
> >> > > design document.pdf
> >>
> >> @Brijesh, thanks for tackling this !
> >>
> >> > Hi guys,
> >> >
> >> > I've read the document. Looks very promising. Here is some feedback.
> >> >
> >> > 1. SB PEB wear-out. What if the eraseblock lifetime is, say, 10000
> >> > erase cycles? Won't the SB PEB wear out very quickly? Why did you not
> >> > go for the chaining approach which I described in the old JFFS3 design
> >> > doc?
> >> >
> >> > If we do not implement chaining, we should at least design it and make
> >> > sure UBIL can be extended later so that SB chaining could be added.
> >>
> >> The super block needs to be scanned for from the beginning of flash
> >> anyway due to bad blocks. Putting it into a fixed position (first good
> >> erase block) is a very bad design decision vs. wear leveling.
> >>
> >> The super block must be moveable like any other block, though we can
> >> keep it as close to the start of flash as possible.
> >>
> >> Also chaining has a tradeoff. The more chains you need to walk the
> >> closer you get to the point where you are equally bad as a full scan.
> >
> > Well, every new chain member reduces the superblock wear speed by order
> > 2, so the chain would have 2-4 eraseblocks in most cases, I guess,
> > which is not bad.
> >
> > In contrast, moving the SB 3-4 eraseblocks further reduces the
> > load merely by a factor of 3-4.
> >
> >> > 2. SB PEB at the end. I think this is a very bad idea. Imagine you have
> >> > to do UBIL images for production at the factory. With your design you
> >> > have the following drawbacks:
> >> >
> >> >    a. NAND flash has initial bad blocks, and you do not know how many,
> >> >       and where they sit. These may be the last 8 eraseblocks. So, when
> >> >       you prepare an image (say, with the ubinize user-space tool), where
> >> >       will you put the second SB PEB?
> >> >
> >> >    b. Currently, UBI/UBIFS images are small. E.g., if you make a
> >> >       UBI/UBIFS image for 1GiB flash, and you have just a few KiB of files,
> >> >       your image will be a few megs - it will contain the files, and all
> >> >       the needed UBI/UBIFS meta-data.
> >> >
> >> >       So now what will the image size be for UBIL? 1GiB, and this is bad.
> >> >       You then will transfer 1GiB of data to the devices during flashing
> >> >       or you will have to invent ways to work around this. Do you need
> >> >       these complexities?
> >> >
> >> > I think the second SB PEB should not be at the end.
> >>
> >> I think we do not need a second SB at all. UBI should not depend on
> >> the super block in any way.
> >> The super block is an optimization for the
> >> common case - nothing more.
> >
> > Yeah, if we preserve the headers we can always fall back to scanning
> > should something be broken.
> >
> >>
> >> > 3. Backward-compatibility. In UBIL you removed EC and VID headers in
> >> > PEBs. That's fine for optimization purposes. But it has drawbacks:
> >> >
> >> >    a. If any of the UBIL meta-data blocks like SB, CMT or log are
> >> >       corrupted - that's it - we are screwed. You can no longer
> >> >       reconstruct the data by scanning. The robustness goes down.
> >> >
> >> >    b. Backward compatibility - UBI will not be able to attach UBIL
> >> >       images. This is not very nice.
> >> >
> >> > So, I think you should keep EC and VID headers in PEBs. And you should
> >> > make the SB/CMT/log blocks a new type of UBI volume with
> >> > UBI_COMPAT_DELETE or UBI_COMPAT_PRESERVE or UBI_COMPAT_RO type. In this
> >> > case UBI will attach UBIL volumes just fine.
> >> >
> >> > Then, you can add an _option_ to have no EC/VID headers in PEBs. This
> >> > then can be used for performance, if one wants to sacrifice robustness.
> >> > But this should be the second step. In this case, you will just need to
> >> > put a VID header with UBI_COMPAT_REJECT flag to the first PEB.
> >>
> >> I don't think it's a good idea to kill the EC/VID headers. It not only
> >> violates backwards compatibility, it also fundamentally weakens UBI's
> >> reliability for no good reason and I doubt that the performance win is
> >> big enough to make it worth it.
> >>
> >> The performance gain is at attach time by getting rid of the flash
> >> scan, but not by getting rid of writing the EC/VID headers.
> >
> > Well, there are some space savings as well.
> >
> >>
> >> The logging is a speed up / optimization for the common case, but it
> >> needs to preserve full reconstruction via scanning all eraseblocks and
> >> checking the EC/VID headers. That also allows retrofitting on existing
> >> devices.
> >>
> >> I'd rather see the super block / log volume as a checkpointing
> >> mechanism which provides a snapshot of the EC/VID headers at a given
> >> point and a list of eraseblocks which need to be scanned at attach
> >> time.
> >>
> >>
> >> That has two main advantages:
> >> 1) It limits the number of log writes
> >> 2) It allows full backward and forward compatibility
> >
> > I think this is what they do, but they for some reason removed the
> > headers. If they add them back, it should look like you described.
> >
> > We should preserve the headers. It is always easy to disable them later,
> > if someone needs this for optimization purposes. E.g., we can add a
> > ubi_compat=0 option or something like that.
> >
> >> Looking at
> >> http://git.infradead.org/users/brijesh/ubil_results/blob/HEAD:/nand_mount_time.pdf
> >> I still see a linear - though less steep - attach time. For the 1GB
> >> flash size it's still 0.8s which is nice progress vs. the 2s for the
> >> non logging case. But that's surprising as one would expect that
> >> logging would provide a more aggressive and non linear gain.
> >>
> >> Just doing the simple math:
> >>
> >> 1GB FLASH with erase block size 128K and page size 2k, that
> >> translates to 8192 erase blocks
> >>
> >> So UBI scans 8192 erase block EC/VID headers in 2 seconds. That
> >> equals 8192 FLASH pages.
> >>
> >> UBIL needs 0.8 seconds. That means that UBIL still scans ~3236 FLASH
> >> pages (or spends the equivalent time) to achieve the same result.
> >>
> >> That looks wrong. Care to explain ?
> >
> > I suspect these are implementation issues. I did not look at the code,
> > but I suspect they read the whole CMT block and populate all the EB
> > associations at scan time. However, they could populate them lazily,
> > which would optimize things.
> I am trying to summarize what I have understood.
> I will send the patches if this is correct.
> 1) Commit will have ec and vid headers just as any other UBI block.
> The compat flag helps in backward compatibility.
> 2) Chained SB will locate the commit. It will be part of an internal volume as well.
> 3) Commit will be called on unmount.
> 4) Any unclean un-mount will lead to flash scanning just as UBI.

No! Why do you have the log then? Unclean reboots are handled by the log.
Scanning happens only when you have a _corrupted_ SB, or a corrupted CMT
or log. Then you fall back to scanning.

> Any thing goes bad, normal scanning becomes recovery.
> 5) Not sure if log is required in first place. But it could be an option.
> Is that correct?

No, at least I did not suggest that you get rid of the log. It is needed
to handle unclean reboots.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
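
The attach rules stated in this reply, and the scan-time arithmetic quoted
earlier in the thread, can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not UBI or UBIL code: the Metadata
flags, the attach_path and equivalent_pages names, and the path labels are all
hypothetical, and the "commit-only" branch for a clean unmount is an assumption
rather than something the thread specifies. Straight proportional scaling of
the 2 s full scan gives about 3277 page reads for 0.8 s, in the same range as
the ~3236 quoted above.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    """Hypothetical validity flags for the on-flash UBIL structures."""
    sb_ok: bool    # super block found and not corrupted
    cmt_ok: bool   # commit (CMT) snapshot readable
    log_ok: bool   # log readable
    clean: bool    # previous session ended with a commit on unmount


def attach_path(md: Metadata) -> str:
    """Pick an attach strategy following the rules stated in the reply."""
    if not (md.sb_ok and md.cmt_ok and md.log_ok):
        # Corrupted SB, CMT, or log: fall back to scanning EC/VID headers.
        return "full-scan"
    if md.clean:
        # Assumption: after a clean unmount the commit alone is current.
        return "commit-only"
    # Unclean reboot: the log covers changes made since the last commit.
    return "commit+log-replay"


def equivalent_pages(total_pebs: int, scan_seconds: float,
                     attach_seconds: float) -> float:
    """Scale the full-scan page count by the ratio of attach times."""
    return total_pebs * attach_seconds / scan_seconds


flash_bytes = 1024 ** 3      # 1 GiB flash, as in the thread
peb_bytes = 128 * 1024       # 128 KiB eraseblocks
pebs = flash_bytes // peb_bytes

print(pebs)                                      # eraseblocks on the device
print(round(equivalent_pages(pebs, 2.0, 0.8)))   # page reads equivalent to 0.8 s
print(attach_path(Metadata(True, True, True, clean=False)))
```

The point of the first function is that an unclean reboot by itself never
selects the full scan; only corrupted metadata does.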