Date: Wed, 12 May 2010 12:58:01 +0200 (CEST)
From: Thomas Gleixner
To: Brijesh Singh
Cc: rohitvdongre@gmail.com, linux-mtd@lists.infradead.org, dedekind1@gmail.com
Subject: Re: UBIL design doc
References: <1273475736.2209.88.camel@localhost> <1273650099.22706.41.camel@localhost> <1273653341.22706.46.camel@localhost>
List-Id: Linux MTD discussion mailing list

On Wed, 12 May 2010, Brijesh Singh wrote:
> On Wed, May 12, 2010 at 2:05 PM, Artem Bityutskiy wrote:
> > On Wed, 2010-05-12 at 13:33 +0530, Brijesh Singh wrote:
> >> 4) Any unclean un-mount will lead to flash scanning, just as in UBI.
> >
> > No! Why do you have the log then? Unclean reboots are handled by the log.
> >
> > Scanning happens only when you have a _corrupted_ SB, or a corrupted cmt,
> > or log. Then you fall back to scanning.
> >
> >> If anything goes bad, normal scanning becomes recovery.
> >> 5) Not sure if the log is required in the first place. But it could be
> >> an option. Is that correct?
> >
> > No, at least I did not suggest that you get rid of the log. It is needed
> > to handle unclean reboots.
>
> The log is written for each EC or VID change, so the frequency of log
> writes is the same as the frequency of these headers. If we keep both,
> there will be one log write penalty per write/erase, so write performance
> will drop considerably.

True, but the reliability will drop as well. Losing a log block is going
to be fatal, as there is no way to reconstruct it, while losing a single
block in UBI is not inevitably fatal.
Back when UBI was designed and written, we discussed a different approach
for avoiding the full flash scan while keeping the reliability intact:

A superblock in the first couple of erase blocks points to a snapshot
block. The snapshot block(s) contain a compressed EC/VID header snapshot.
A defined number of blocks in that snapshot is marked NEED_SCAN. At the
point of creating the snapshot these blocks are empty and belong to the
blocks with the lowest erase count.

Now when a UBI client (filesystem, ...) requests an erase block, one of
those NEED_SCAN marked blocks is handed out. Blocks which are handed back
from the client for erasure and which are not marked NEED_SCAN are erased
and not given out as long as there are still enough empty blocks marked
NEED_SCAN available. When we run out of NEED_SCAN marked blocks, we write
a new snapshot with a new set of NEED_SCAN blocks.

So at attach time we read the snapshot and scan the few NEED_SCAN blocks.
They are either empty or assigned to a volume. If assigned, they can
replace an already existing logical erase block reference in the
snapshot, so we know that we need to put the original physical erase
block onto a lazy background scan list.

With that approach we keep the reliability of UBI untouched, with the
penalty of scanning a limited number of erase blocks at attach time. It
also limits the number of writes to the snapshot/log significantly. For
devices with a low write frequency that means the snapshot block can stay
untouched for a very long time.

The speed penalty is constant and does not depend on the number of log
entries written after the snapshot. Your full log approach is going to be
slower once the number of log entries is greater than the number of
NEED_SCAN marked blocks. If we assume a page read time of 1 ms and 64
NEED_SCAN blocks, then we are talking about a constant overhead of 64 ms.
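The attach-time bookkeeping described above could be sketched roughly
like this (struct layout, field names and the NEED_SCAN flag value are
all invented for illustration; the real on-flash formats would of course
look different):

```c
#include <assert.h>

#define NEED_SCAN 0x01  /* PEB was empty when the snapshot was written */

/* Hypothetical, simplified snapshot entry -- 16 bytes per PEB as in the
 * example numbers below. */
struct snap_entry {
	int vol_id;             /* -1: unassigned */
	int lnum;               /* logical erase block number, -1: none */
	unsigned int ec;        /* erase counter */
	unsigned int flags;
};

/* What scanning a single PEB at attach time would report. */
struct scan_result {
	int empty;              /* still erased, nothing written */
	int vol_id;
	int lnum;
};

/*
 * Attach: walk the snapshot and scan only the NEED_SCAN PEBs.  If such
 * a PEB got assigned after the snapshot was written, it supersedes the
 * stale mapping for the same (vol_id, lnum); the old PEB is put onto
 * the lazy background scan list.  Returns the number of lazy entries.
 */
static int attach(struct snap_entry *snap, int nblocks,
		  const struct scan_result *scan /* indexed by PEB */,
		  int *lazy_list)
{
	int i, j, nlazy = 0;

	for (i = 0; i < nblocks; i++) {
		if (!(snap[i].flags & NEED_SCAN))
			continue;
		if (scan[i].empty)
			continue;       /* still free, nothing to do */

		/* PEB was assigned after the snapshot was written. */
		snap[i].vol_id = scan[i].vol_id;
		snap[i].lnum = scan[i].lnum;

		/* Retire the stale mapping for the same LEB. */
		for (j = 0; j < nblocks; j++) {
			if (j != i && snap[j].vol_id == scan[i].vol_id &&
			    snap[j].lnum == scan[i].lnum) {
				lazy_list[nlazy++] = j;
				snap[j].vol_id = -1;
				snap[j].lnum = -1;
				break;
			}
		}
	}
	return nlazy;
}
```

The point of the sketch is that the attach cost is bounded by the number
of NEED_SCAN entries, not by the write history since the snapshot.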
So let's look at the full picture:

    Flash size:                  1 GiB
    Erase block size:            128 KiB
    Page size:                   2 KiB
    Subpage size:                1 KiB
    Number of erase blocks:      8192
    Snapshot size per block:     16 bytes
    Full snapshot size:          128 KiB
    Full snapshot pages:         64
    Number of NEED_SCAN blocks:  64
    Blocks to scan for finding the super block(s): 64

So with the assumption of a page read time of 1 ms, the total time for
building the initial data structures in RAM is 3 * 64 ms.

So yes, it _IS_ 3 times the time which we need for your log approach
(assuming that the super block is the first good block and the number of
log entries after the snapshot is 0).

Once we agree that a movable super block is the correct way, the speed
advantage of your log approach is 64 ms (still assuming that the number
of log entry pages is 0).

Now take the log entries into account. Once you have to read 64 pages
worth of log entries, which happens in the above example after exactly
128 entries, the speed advantage is exactly zero. From that point on
it's going to be worse.

Thoughts?

       tglx
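The break-even arithmetic above can be written down mechanically. The
1 ms page read time and the packing of two log entries per 2 KiB page
(one per 1 KiB subpage write) are the assumptions from the example, and
both variants are assumed to search 64 blocks for a movable super block:

```c
#include <assert.h>

enum {
	PAGE_READ_MS         = 1,   /* assumed page read time */
	SB_SEARCH_BLOCKS     = 64,  /* blocks scanned to find the SB */
	SNAPSHOT_PAGES       = 64,  /* 8192 blocks * 16 bytes / 2 KiB */
	NEED_SCAN_BLOCKS     = 64,
	LOG_ENTRIES_PER_PAGE = 2,   /* 1 KiB subpage writes, 2 KiB pages */
};

/* Snapshot approach: constant cost, independent of write history. */
static int snapshot_attach_ms(void)
{
	return (SB_SEARCH_BLOCKS + SNAPSHOT_PAGES + NEED_SCAN_BLOCKS)
		* PAGE_READ_MS;
}

/* Full-log approach: SB search plus snapshot plus one page read per
 * LOG_ENTRIES_PER_PAGE log entries written since the snapshot. */
static int log_attach_ms(int log_entries)
{
	int log_pages = (log_entries + LOG_ENTRIES_PER_PAGE - 1)
			/ LOG_ENTRIES_PER_PAGE;

	return (SB_SEARCH_BLOCKS + SNAPSHOT_PAGES + log_pages)
		* PAGE_READ_MS;
}
```

With these numbers the log approach starts out 64 ms ahead, draws level
at exactly 128 log entries (64 pages), and loses from then on.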