* rbd caching
From: Sage Weil @ 2012-05-05 23:51 UTC
  To: ceph-devel

The wip-rbd-wt branch has some potential changes to the rbd cache code.  
Currently, each image gets its own separate cache using its own pool of 
memory.  The ObjectCacher module has essentially three knobs:

 max size -- total bytes we'll cache
 max dirty -- max dirty bytes before a write will block and wait for writeback
 target dirty -- threshold at which we'll start writing back older buffers

(There's also a hard-coded max age value (~1 second) after which a dirty 
buffer will be written; that needs to be turned into a tunable.)
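
As a rough illustration of how these knobs could interact, here is a minimal sketch in C.  It is not the ObjectCacher code: the struct, the function names, and the exact comparisons are assumptions made up for this example, and max_dirty_age just stands in for the hard-coded ~1 second value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three knobs plus the max-age value. */
struct cache_knobs {
    uint64_t max_size;       /* total bytes the cache may hold */
    uint64_t max_dirty;      /* dirty-byte ceiling; writers block beyond it */
    uint64_t target_dirty;   /* dirty-byte level at which writeback starts */
    double   max_dirty_age;  /* seconds a buffer may stay dirty */
};

/*
 * Assumed rule: start writing back older buffers once dirty data exceeds
 * target_dirty, or once the oldest dirty buffer has reached the max age.
 */
bool should_start_writeback(const struct cache_knobs *k,
                            uint64_t dirty_bytes, double oldest_dirty_age)
{
    assert(k->target_dirty <= k->max_dirty);
    if (dirty_bytes == 0)
        return false;
    return dirty_bytes > k->target_dirty ||
           oldest_dirty_age >= k->max_dirty_age;
}

/*
 * Assumed rule: a write of len bytes has to wait for writeback if it
 * would push the dirty total past max_dirty (overflow-safe comparison).
 */
bool write_must_block(const struct cache_knobs *k,
                      uint64_t dirty_bytes, uint64_t len)
{
    if (len == 0)
        return false;
    if (dirty_bytes >= k->max_dirty)
        return true;
    return len > k->max_dirty - dirty_bytes;
}
```

The sketch leaves out max_size and eviction entirely; it only models the two dirty thresholds and the age check.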

The first several patches add 'write-thru' caching support.  This is done 
by simply setting the max_dirty value to 0.  A write will then block until 
it is written to the cluster, but we still get the benefits of the read 
cache.
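
To make the write-thru behavior concrete, here is a toy model in C that only tracks byte counters.  The struct and both functions are hypothetical and the "flush" is a stand-in for the real writeback to the cluster; the point is just that with max_dirty set to 0 every write waits for all of its bytes, yet the data stays cached (clean) for later reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte counters for one cache; not the ObjectCacher state. */
struct cache_state {
    uint64_t max_dirty;    /* 0 selects write-thru */
    uint64_t clean_bytes;  /* cached data that matches the cluster */
    uint64_t dirty_bytes;  /* cached data not yet written back */
};

/* Stand-in for synchronous writeback: turns len dirty bytes into clean ones. */
static void flush_bytes(struct cache_state *c, uint64_t len)
{
    assert(len <= c->dirty_bytes);
    c->dirty_bytes -= len;
    c->clean_bytes += len;
}

/*
 * Buffer a write of len bytes and return how many bytes the caller had
 * to wait on.  Anything above max_dirty is flushed before returning.
 */
uint64_t cache_write(struct cache_state *c, uint64_t len)
{
    uint64_t waited = 0;

    c->dirty_bytes += len;
    if (c->dirty_bytes > c->max_dirty) {
        waited = c->dirty_bytes - c->max_dirty;
        flush_bytes(c, waited);
    }
    return waited;
}
```

With max_dirty = 0 the return value always equals len and dirty_bytes is back at 0 when the call returns; with a non-zero max_dirty small writes return 0 and are written back later.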

The second set of patches restructures the way the cache itself is managed.  
One goal is to be able to control cache behavior on a per-image basis 
(this one write-thru, that one write-back, etc.).  Another goal is to 
share a single pool of memory for several images.  The librbd.h calls to 
do this currently look something like this:

int rbd_cache_create(rados_t cluster, rbd_cache_t *cache, uint64_t max_size,
		     uint64_t max_dirty, uint64_t target_dirty);
int rbd_cache_destroy(rbd_cache_t cache);
int rbd_open_cached(rados_ioctx_t io, const char *name, rbd_image_t *image,
		 const char *snap_name, rbd_cache_t cache);

Setting the cache tunables should probably be broken out into several 
different calls, so that it is possible to add new ones in the future.  
Beyond that, though, the limitation here is that you can set the 
target_dirty or max_dirty for a _cache_, and then have multiple images 
share that cache, but you can't then set a max_dirty limit for an 
individual image.

Does it matter?  Ideally, I suppose, you could set:

 - per-cache size
 - per-cache max_dirty
 - per-cache target_dirty
 - per-image max_dirty  (0 for write-thru)
 - per-image target_dirty

and then share a single cache for many images, and the flushing logic 
could observe both sets of dirty limits.  That just means calls to set 
max_dirty and target_dirty for individual images, too.
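
One way the flushing logic might observe both sets of limits is sketched below in C.  None of these types or functions exist in librbd or in the wip-rbd-wt branch; they are hypothetical names for this example, and the rule that a write blocks or writeback starts as soon as either the per-image or the per-cache limit is exceeded is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of dirty limits, used at both levels. */
struct dirty_limits {
    uint64_t max_dirty;     /* 0 means write-thru at that level */
    uint64_t target_dirty;
};

/* Hypothetical shared cache: dirty_bytes is the sum over all images. */
struct shared_cache {
    struct dirty_limits limits;
    uint64_t dirty_bytes;
};

/* Hypothetical per-image view onto the shared cache. */
struct image_view {
    struct dirty_limits limits;
    uint64_t dirty_bytes;
};

/* True if dirty + len > limit, written so that it cannot overflow. */
static bool exceeds(uint64_t dirty, uint64_t len, uint64_t limit)
{
    if (dirty > limit)
        return true;
    return len > limit - dirty;
}

/* Assumed rule: a write blocks if it would break either max_dirty. */
bool write_must_block(const struct shared_cache *c,
                      const struct image_view *img, uint64_t len)
{
    assert(img->dirty_bytes <= c->dirty_bytes);
    if (len == 0)
        return false;
    return exceeds(img->dirty_bytes, len, img->limits.max_dirty) ||
           exceeds(c->dirty_bytes, len, c->limits.max_dirty);
}

/* Assumed rule: writeback starts once either target_dirty is exceeded. */
bool should_writeback(const struct shared_cache *c,
                      const struct image_view *img)
{
    assert(img->dirty_bytes <= c->dirty_bytes);
    return exceeds(img->dirty_bytes, 0, img->limits.target_dirty) ||
           exceeds(c->dirty_bytes, 0, c->limits.target_dirty);
}
```

With this shape, an image whose max_dirty is 0 behaves as write-thru even when the shared cache has plenty of dirty headroom, and a write-back image still blocks once the cache-wide max_dirty is reached.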

Is it worth the complexity?  In the end, this will be wired up to the qemu 
writeback options, so the range of actual usage will fall within 
whatever is doable with those options and generic 'rbd cache size = ..' 
tunables, most likely...

sage


