From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marek Vasut
Date: Mon, 21 Mar 2016 17:49:27 +0100
Subject: [U-Boot] [RFC V2 PATCH 0/3] Add cache for block devices
In-Reply-To: <56EFFBAD.4070700@cox.net>
References: <1458524727-4643-1-git-send-email-eric@nelint.com>
 <56EF559D.3040608@denx.de> <56EFFBAD.4070700@cox.net>
Message-ID: <56F02617.6010700@denx.de>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On 03/21/2016 02:48 PM, Eric Nelson wrote:
> Hi Marek,
>
> On 03/20/2016 06:59 PM, Marek Vasut wrote:
>> On 03/21/2016 02:45 AM, Eric Nelson wrote:
>>> Here's a more full-featured implementation of a cache for block
>>> devices that uses a small linked list of cache blocks.
>>
>> Why do you use a linked list? You have four entries, so you could
>> just as well use a fixed array. Maybe you should implement an
>> adaptive cache which would use the unpopulated malloc area and hash
>> the sector number(s) into that area?
>>
>
> I was looking for a simple implementation that would allow tweaking of
> the max entries/size per entry.
>
> We could get higher performance through hashing, but with such a
> small cache, it's probably not worth the extra code.

The hashing function can be a simple modulo on the sector number ;-)
That'd be less code than linked lists.

> Using an array and re-allocating on changes to the max entries variable
> is feasible, but I think it would be slightly more code.

That would indeed be more code.

>>> Experimentation loading a 4.5 MiB kernel from the root directory of
>>> a FAT filesystem shows that a single cache entry of a single
>>> block is the only
>>
>> only ... what? This is where things started to get interesting, but
>> you leave us hanging :)
>>
>
> Oops.
>
> ... I was planning on re-wording that.
>
> My testing showed no gain in performance (additional cache hits) past a
> single entry of a single block.
> This was done on a small (32 MiB)
> partition with a small number of files (~10), and only a single
> read was skipped.

I'd kinda expect that, indeed.

> => blkc c ; blkc i ; blkc 0 0 ;
> changed to max of 0 entries of 0 blocks each
> => load mmc 0 10008000 /zImage
> reading /zImage
> 4955304 bytes read in 247 ms (19.1 MiB/s)
> => blkc
> block cache:
>   0 hits
>   7 misses
>   0 entries in cache
>   trace off
>   max blocks/entry 0
>   max entries 0
> => blkc c ; blkc i ; blkc 1 1 ;
> changed to max of 1 entries of 1 blocks each
> => load mmc 0 10008000 /zImage
> reading /zImage
> 4955304 bytes read in 243 ms (19.4 MiB/s)
> => blkc
> block cache:
>   1 hits
>   6 misses
>   1 entries in cache
>   trace off
>   max blocks/entry 1
>   max entries 1
>
> I don't believe that enabling the cache is worth the extra code
> for this use case.
>
> By comparison, loading a 150 MiB compressed disk image from
> ext4 showed a 30x speedup with the V1 patch (single block,
> single entry), from ~150 s to 5 s.
>
> Without some form of cache, the 150 s was long enough to make
> a user (me) think something was broken.

I'm obviously loving this improvement.

Best regards,
Marek Vasut
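
P.S. The "simple modulo on the sector number" scheme could look roughly
like the sketch below: a direct-mapped cache where the sector number
modulo the slot count picks the slot. This is only an illustration of
the idea; the names (`blkcache_read`, `blkcache_fill`), the slot count,
and the block size are assumptions, not U-Boot's actual block-cache
interface.

```c
#include <stdint.h>
#include <string.h>

#define BLK_SZ      512	/* assumed bytes per block */
#define CACHE_SLOTS 4	/* hypothetical slot count */

typedef uint64_t lbaint_t;	/* stand-in for U-Boot's lbaint_t */

struct blk_slot {
	lbaint_t blknr;		/* sector number held in this slot */
	int valid;		/* non-zero once the slot holds data */
	uint8_t data[BLK_SZ];
};

static struct blk_slot cache[CACHE_SLOTS];

/* Direct-mapped lookup: blknr % CACHE_SLOTS selects the slot. */
static uint8_t *blkcache_read(lbaint_t blknr)
{
	struct blk_slot *s = &cache[blknr % CACHE_SLOTS];

	if (s->valid && s->blknr == blknr)
		return s->data;	/* hit */
	return NULL;		/* miss: caller reads the device */
}

/* Fill (or overwrite) the slot for blknr after a device read. */
static void blkcache_fill(lbaint_t blknr, const uint8_t *data)
{
	struct blk_slot *s = &cache[blknr % CACHE_SLOTS];

	s->blknr = blknr;
	s->valid = 1;
	memcpy(s->data, data, BLK_SZ);
}
```

Two sectors that collide modulo the slot count simply evict each other,
so there is no list walking and no allocation at all.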