public inbox for linux-kernel@vger.kernel.org
From: Kent Overstreet <kent.overstreet@gmail.com>
To: linux-kernel@vger.kernel.org
Subject: Using the page cache while maintaining a reserve? Or do it some other way?
Date: Sat, 18 Sep 2010 03:52:53 -0700
Message-ID: <4C949A05.5060209@gmail.com>

I've been writing bcache - a patch to use SSDs to cache arbitrary block
devices. The biggest thing left before it can be considered
production-ready (besides lots more testing) is making memory
allocation deadlock-proof.

I've been using the page cache for bcache's btree; it might be 50-100 MB
of data with a decent-sized SSD, and I don't want to roll my own cache if
I don't have to.

The problem is I need to maintain a reserve of enough memory for new 
btree nodes to split all the way to the root, and bcache's btree nodes 
are sized to the SSD's erase block. Since most consumer SSDs seem to 
have 512k erase blocks, that means a couple megabytes of reserve. (Yes, 
bcache can efficiently use 512k btree nodes, with 16 byte keys - pretty 
cool, huh? :)
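
To make the sizing concrete, here is a minimal sketch of the arithmetic.
The function names and the tree depth are assumptions for illustration,
not taken from the bcache source, and any per-node header is ignored:

```c
#include <assert.h>
#include <stddef.h>

/* Keys that fit in one node, ignoring any per-node header (an assumption). */
size_t keys_per_node(size_t node_bytes, size_t key_bytes)
{
	assert(key_bytes > 0);
	return node_bytes / key_bytes;
}

/*
 * Worst-case bytes needed for a split that propagates to the root:
 * one new sibling at each of `depth` levels, plus one new root.
 * `depth` counts levels including the root and is a hypothetical input.
 */
size_t split_reserve_bytes(size_t node_bytes, unsigned depth)
{
	return node_bytes * ((size_t)depth + 1);
}
```

With 512k nodes and 16 byte keys this gives 32768 keys per node, and for
an assumed depth of 3 the reserve comes to 4 nodes, i.e. 2 MB, which is
in line with the "couple megabytes" above.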

Just keeping a couple megabytes of memory around unused is a little
silly when the cache could easily be the reserve - except I don't see
any way of doing this with the page cache. I imagine it'd probably be
possible to rip pages belonging to bcache's mapping out of the page
cache, but I don't think I'd be able to pick out the oldest ones that
way, and besides, it doesn't sound like a very sane thing to do.
Guaranteeing the reserve by holding more buckets pinned in memory from
within bcache might not be so bad, but it'd probably end up tricky and
error prone at best.

So, I imagine I'm going to need a dedicated cache in order to do this
right - if I maintain an LRU for the whole btree, implementing the
reserve is easy; I'd just need to make the pages reclaimable or
otherwise respond to memory pressure (which is what I have not figured
out how to do). But I can't be the first person to have this problem, so
I'm hoping someone will have some suggestions or there'll be an easier
solution... (It seems like the kind of thing filesystems might run into...)
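
Roughly what I have in mind, as a user-space model only: every name here
is hypothetical, malloc() stands in for the kernel allocator, and
cache_shrink() stands in for whatever hook ends up responding to memory
pressure. Nodes live on an LRU; if allocation fails, the oldest unpinned
node is reused, and shrinking never drops below the reserve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model; none of these names come from bcache. */
struct node {
	struct node *prev, *next;	/* LRU links; head.next is newest */
	int pinned;			/* nonzero while in use: not evictable */
	void *data;			/* node-sized buffer */
};

struct node_cache {
	struct node head;		/* sentinel of a circular LRU list */
	size_t node_bytes;
	size_t nr_nodes;		/* buffers currently owned */
	size_t reserve;			/* shrinking stops at this many */
};

void cache_init(struct node_cache *c, size_t node_bytes, size_t reserve)
{
	assert(node_bytes > 0);
	c->head.prev = c->head.next = &c->head;
	c->node_bytes = node_bytes;
	c->nr_nodes = 0;
	c->reserve = reserve;
}

static void lru_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void lru_add_front(struct node_cache *c, struct node *n)
{
	n->next = c->head.next;
	n->prev = &c->head;
	c->head.next->prev = n;
	c->head.next = n;
}

/* Walk from the old end of the list; NULL if everything is pinned. */
static struct node *lru_oldest_unpinned(struct node_cache *c)
{
	for (struct node *n = c->head.prev; n != &c->head; n = n->prev)
		if (!n->pinned)
			return n;
	return NULL;
}

/* Returns a pinned node. Falls back to reusing the oldest unpinned node
 * when may_alloc is 0 or when allocation fails. */
struct node *cache_get(struct node_cache *c, int may_alloc)
{
	struct node *n = NULL;

	if (may_alloc) {
		n = malloc(sizeof(*n));
		if (n && !(n->data = malloc(c->node_bytes))) {
			free(n);
			n = NULL;
		}
		if (n)
			c->nr_nodes++;
	}
	if (!n) {
		n = lru_oldest_unpinned(c);
		if (!n)
			return NULL;
		lru_del(n);
	}
	n->pinned = 1;
	lru_add_front(c, n);
	return n;
}

void cache_put(struct node *n)
{
	n->pinned = 0;
}

/* Memory pressure: free up to nr of the oldest unpinned nodes, but keep
 * at least `reserve` buffers. Returns the number freed. */
size_t cache_shrink(struct node_cache *c, size_t nr)
{
	size_t freed = 0;

	while (freed < nr && c->nr_nodes > c->reserve) {
		struct node *n = lru_oldest_unpinned(c);

		if (!n)
			break;
		lru_del(n);
		free(n->data);
		free(n);
		c->nr_nodes--;
		freed++;
	}
	return freed;
}
```

Note the model only guarantees that `reserve` buffers exist, not that
they are unpinned at the moment a split needs them; a real version would
have to account for that.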

I'd also like to get my code reviewed more, but as the patch is almost
6k lines now, that's inevitably not going to be easy. I don't see any
way to usefully split much of it out to make it easier to digest - if
anyone has any suggestions, I'm all ears.

http://bcache.evilpiepirate.org
git://evilpiepirate.org/~kent/linux-bcache.git
git://evilpiepirate.org/~kent/bcache-tools.git
