Re: [RFC][PATCH] ubi: Implement a read cache

From: Richard Weinberger <richard@nod.at>
To: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: linux-mtd@lists.infradead.org, david@sigma-star.at,
	computersforpeace@gmail.com, dedekind1@gmail.com
Subject: Re: [RFC][PATCH] ubi: Implement a read cache
Date: Mon, 18 Jul 2016 16:30:31 +0200
Message-ID: <578CE807.5060204@nod.at>
In-Reply-To: <20160718162141.1d061791@bbrezillon>

Boris,

On 18.07.2016 at 16:21, Boris Brezillon wrote:
> Hi Richard,
> 
> On Fri, 15 Jul 2016 14:57:34 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>> Implement a simple read cache such that small adjacent reads
>> within the same NAND page can be cached.
>> When a read smaller than a NAND page is requested, UBI reads the full
>> page and caches it.
>> The NAND core already has such a cache, but it will go away soon.
>>
>> To be able to benchmark better, there is also a debugfs file which reports
>> cache stats: misses, hits and no_query.
>> Misses and hits should be clear, no_query is incremented when the cache
>> cannot be used, i.e. for reads larger than a page size.
>>
>> This patch is just a PoC.
>>
> 
> Just a few comments on the approach. I'll try to review the code later.
> 
>> Signed-off-by: Richard Weinberger <richard@nod.at>
>> ---
>> Hi!
>>
>> This is an initial draft for a read cache implemented in UBI.
>>
>> The cache sits in ubi_eba_read_leb() and is used when the read length is
>> smaller than a NAND page and does not cross a page boundary. This can be
>> improved later.
> 
> Hm, we should probably use the cache for aligned accesses and
> multi-page reads.
> 
>> Currently the cache consists of 61 cache lines where the PEB acts
>> as selector.
> 
> If we go for a multi-page cache, we should probably use an LRU cache
> instead of this 'simple PEB hash + no collision' approach. Remember
> that reading a page from NAND is more time consuming than iterating
> over a small array of NAND page buffers.

Yeah, this implementation more or less moves the cache we have in nand_base.c
into UBI.
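
To make the lookup scheme concrete, here is a minimal userspace sketch of a
PEB-selected cache of this kind. It is not the patch code. The names
(rc_read(), fake_read_page(), struct read_cache), the "pnum modulo number of
lines" selector and the 16KiB page size are assumptions made for illustration
only, and fake_read_page() merely stands in for a real flash read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RC_LINES 61		/* number of cache lines, as in the RFC */
#define RC_PAGE_SIZE 16384	/* assumed NAND page size (16KiB) */

enum rc_result { RC_HIT, RC_MISS, RC_NO_QUERY };

struct rc_line {
	bool valid;
	int pnum;		/* PEB the cached page belongs to */
	size_t page_off;	/* page-aligned offset inside that PEB */
	uint8_t data[RC_PAGE_SIZE];
};

struct read_cache {
	struct rc_line lines[RC_LINES];
	unsigned long hits, misses, no_query;
};

/* Hypothetical stand-in for a flash read: fills a deterministic pattern. */
static void fake_read_page(int pnum, size_t page_off, uint8_t *buf)
{
	for (size_t i = 0; i < RC_PAGE_SIZE; i++)
		buf[i] = (uint8_t)(pnum * 31 + page_off / RC_PAGE_SIZE + i);
}

/*
 * Serve a read of 'len' bytes at 'offset' inside PEB 'pnum'.
 * Returns RC_NO_QUERY without touching 'out' when the cache cannot be
 * used; the caller would then read the flash directly.
 */
static enum rc_result rc_read(struct read_cache *rc, int pnum, size_t offset,
			      size_t len, uint8_t *out)
{
	size_t page_off = offset - offset % RC_PAGE_SIZE;
	enum rc_result res = RC_HIT;
	struct rc_line *line;

	assert(pnum >= 0);

	/* Only reads shorter than a page that stay inside one page qualify. */
	if (len == 0 || len >= RC_PAGE_SIZE ||
	    offset + len > page_off + RC_PAGE_SIZE) {
		rc->no_query++;
		return RC_NO_QUERY;
	}

	/* The PEB number selects the line (assumed: simple modulo). */
	line = &rc->lines[pnum % RC_LINES];
	if (!line->valid || line->pnum != pnum || line->page_off != page_off) {
		/* Miss: read the full page and overwrite the line. */
		fake_read_page(pnum, page_off, line->data);
		line->valid = true;
		line->pnum = pnum;
		line->page_off = page_off;
		rc->misses++;
		res = RC_MISS;
	} else {
		rc->hits++;
	}

	memcpy(out, line->data + (offset - page_off), len);
	return res;
}
```

In this sketch two PEBs that map to the same line (5 and 66, for example)
simply evict each other, and a second page of the same PEB evicts the first.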

>> Maybe fewer cache lines would also do it, this needs to be benchmarked
>> in more detail.
> 
> See my comment below.
> 
>>
>> I already did some benchmarks.
>> The system I've used has a 4GiB MLC NAND with 16KiB page size.
>> Userspace is a Debian 8 with systemd.
>>
>> The following tasks have been benchmarked so far.
>> 1. system bootup (userspace time as reported by systemd-analyze)
>> 2. walking /usr, since this involves a lot of TNC lookups this is interesting
>> 3. removing a 1GiB file, it involves also a lot of TNC lookups
>>
>> Baseline
>> ========
>>
>> 1. bootup time: 18 seconds
>>
>> 2. time find /usr > /dev/null
>> real    0m19.869s
>> user    0m0.120s
>> sys     0m7.300s
>>
>> 3. time rm /root/viel
>> real    0m32.992s
>> user    0m0.000s
>> sys     0m12.380s
>>
>> While the results for 1 and 2 are okay, we observe that removing the file
>> took a rather long time.
>> The NAND driver on this target supports subpage reads and therefore the NAND
>> page cache was never used.
>> Since removing a large file leads to a lot of TNC lookups, UBIFS issues many
>> reads with a length around 200 bytes and the overhead of every single mtd_read()
>> kicks in.
>>
>> Baseline + NAND cache (hence, subpage read support disabled in our driver)
>> ==========================================================================
>>
>> 1. bootup time: 19.5 seconds
>>
>> 2. time find /usr > /dev/null
>> real    0m24.555s
>> user    0m0.150s
>> sys     0m8.220s
>>
>> 3. time rm /root/viel
>> real    0m5.327s
>> user    0m0.000s
>> sys     0m3.300s
>>
>> We observe that the bootup took a little longer, but the jitter here is about one
>> second.
>> Walking /usr took about 5 seconds longer, not sure why. This needs more inspection.
>>
>> The most interesting result is that removing the large file took only 5 seconds;
>> compared to 33 seconds this is a huge speedup.
>> Since the znodes of the same file are adjacent data structures on the flash
>> the cache on NAND level helped a lot.
>>
>> UBI read cache with 1 cache line
>> ================================
>>
>> 1. bootup time: 19.5 seconds
> 
> Hm, I would expect an attach time close to what we have in Baseline
> with subpage read. Do you know where the difference comes from?

Boottime is pure userspace boot time.
I suspect systemd "jitter". Debian 8 starts a lot of stuff
in parallel...

As soon as I have time, I'll benchmark UBI attach time.
Or when David has time...

>> cache usage:
>> hits: 1546
>> misses: 6887
>> no_query: 649
>>
>>
>> 2. time find /usr > /dev/null
>> real    0m24.297s
>> user    0m0.100s
>> sys     0m7.800s
>>
>> cache usage:
>> hits: 4068
>> misses: 17669
>> no_query: 107
>>
>> 3. time rm /root/viel
>> real    0m5.457s
>> user    0m0.000s
>> sys     0m2.730s
>>
>> cache usage:
>> hits: 34517
>> misses: 2490
>> no_query: 212
>>
>> The results are more or less the same as with caching on the NAND side.
>> So, we could drop the NAND cache and have it implemented in UBI.
>> As expected, the cache helps most when deleting large files.
>>
>> UBI read cache with 61 cache lines
>> ==================================
>>
>> 1. bootup time: 17.5 seconds
>> hits: 2991
>> misses: 5436
>> no_query: 649
>>
>> 2. time find /usr > /dev/null
>> real    0m20.557s
>> user    0m0.120s
>> sys     0m7.080s
>>
>> hits: 7064
>> misses: 14145
>> no_query: 116
>>
>> 3. time rm /root/viel
>> real    0m5.244s
>> user    0m0.000s
>> sys     0m2.840s
>>
>> hits: 34228
>> misses: 2248
>> no_query: 202
>>
>> With more cache lines we can reduce the boot time a bit, we also observe
>> that, compared to a single line, the cache hit rate is better.
>> Same for walking /usr.
>> Removing a large file is also fast.
>> So, having more cache lines helps but we need to figure out first how
>> many lines make sense.
> 
> That's true, but if we go for an adjustable UBI write-unit cache, I'd
> prefer to have an implementation relying on the page-cache mechanism so
> that the system can adjust the cache size (by reclaiming cached pages)
> on-demand instead of hardcoding the number of cache lines at compile
> time.
> 
> IMO, we should either implement a basic 'single page cache' mechanism
> at the UBI level to replace the one at the NAND level and still benefit
> from subpage reads for EC and VID headers, or go for the more complex
> page-cache based mechanism.
> 
> Artem, Brian, any opinion?
> 
> BTW, this implementation still assumes that the upper layer will try to
> fill NAND pages entirely. This seems to be true for UBIFS, so I'm not
> sure we should bother about other cases now, but I'd still like to have
> your opinions.

Yeah, it does what nand_base.c does and helps fulfill UBIFS' assumptions.
I'm not sure whether a complex LRU cache is worth the performance gain.
But we'll see. :-)
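
For reference, the hit rates implied by the counters quoted above can be
derived with a small helper like the one below. This is only a sketch: the
struct and function names are made up for illustration, and "hit rate" is
taken here to mean hits / (hits + misses), leaving no_query reads out since
they never consult the cache.

```c
#include <assert.h>
#include <stdio.h>

struct rc_stats {
	unsigned long hits, misses, no_query;
};

struct rc_sample {
	const char *workload;
	int lines;		/* number of cache lines in that run */
	struct rc_stats stats;	/* counters as reported in the benchmarks */
};

static const struct rc_sample samples[] = {
	{ "bootup",    1, {  1546,  6887, 649 } },
	{ "bootup",   61, {  2991,  5436, 649 } },
	{ "find /usr", 1, {  4068, 17669, 107 } },
	{ "find /usr", 61, {  7064, 14145, 116 } },
	{ "rm 1GiB",   1, { 34517,  2490, 212 } },
	{ "rm 1GiB",  61, { 34228,  2248, 202 } },
};

/* Percentage of cache-eligible reads (hits + misses) served from the cache. */
static double rc_hit_rate(const struct rc_stats *s)
{
	unsigned long lookups;

	assert(s != NULL);
	lookups = s->hits + s->misses;
	return lookups ? 100.0 * (double)s->hits / (double)lookups : 0.0;
}

/* Print one line per benchmark run. */
static void rc_print_report(FILE *out)
{
	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		fprintf(out, "%-10s %2d line(s): %5.1f%% hit rate\n",
			samples[i].workload, samples[i].lines,
			rc_hit_rate(&samples[i].stats));
}
```

Calling rc_print_report(stdout) gives roughly 18% vs. 35% for bootup, 19% vs.
33% for walking /usr, and 93% vs. 94% for the rm case, i.e. the extra lines
matter for the mixed workloads but hardly at all for the large file removal.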

Thanks,
//richard

Thread overview: 4+ messages
2016-07-15 12:57 [RFC][PATCH] ubi: Implement a read cache Richard Weinberger
2016-07-18  8:05 ` David Gstir
2016-07-18 14:21 ` Boris Brezillon
2016-07-18 14:30   ` Richard Weinberger [this message]
