* [Qemu-devel] [PATCH v3 0/3] Clean unused entries in the qcow2 L2/refcount cache @ 2015-05-27 9:46 Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 1/3] qcow2: mark the memory as no longer needed after qcow2_cache_empty() Alberto Garcia ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-27 9:46 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi v3: - Add 'cache-clean-interval' field to ImageInfoSpecificQCow2. v2: https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg05316.html - Clarify that the block-commit mentioned in the first patch refers to the HMP commit command. - Check the value of cache_clean_interval and cast it accordingly to prevent it from overflowing. v1: https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg03510.html Regards, Berto Alberto Garcia (3): qcow2: mark the memory as no longer needed after qcow2_cache_empty() qcow2: add option to clean unused cache entries after some time qcow2: reorder fields in Qcow2CachedTable to reduce padding block/qcow2-cache.c | 57 ++++++++++++++++++++++++++++++++++++++++++++- block/qcow2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 13 +++++++++-- 4 files changed, 137 insertions(+), 3 deletions(-) -- 2.1.4 ^ permalink raw reply [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH 1/3] qcow2: mark the memory as no longer needed after qcow2_cache_empty() 2015-05-27 9:46 [Qemu-devel] [PATCH v3 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia @ 2015-05-27 9:46 ` Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 3/3] qcow2: reorder fields in Qcow2CachedTable to reduce padding Alberto Garcia 2 siblings, 0 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-27 9:46 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi After having emptied the cache, the data in the cache tables is no longer useful, so we can tell the kernel that we are done with it. In Linux this frees the resources associated with it. The effect of this can be seen in the HMP commit operation: it moves data from the top to the base image (and fills both caches), then it empties the top image. At this point the data in that cache is no longer needed so it's just wasting memory. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> --- block/qcow2-cache.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c index ed92a09..ed14a92 100644 --- a/block/qcow2-cache.c +++ b/block/qcow2-cache.c @@ -22,8 +22,10 @@ * THE SOFTWARE. */ +#include <sys/mman.h> #include "block/block_int.h" #include "qemu-common.h" +#include "qemu/osdep.h" #include "qcow2.h" #include "trace.h" @@ -60,6 +62,22 @@ static inline int qcow2_cache_get_table_idx(BlockDriverState *bs, return idx; } +static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c, + int i, int num_tables) +{ +#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID + BDRVQcowState *s = bs->opaque; + void *t = qcow2_cache_get_table_addr(bs, c, i); + long align = sysconf(_SC_PAGESIZE); + size_t mem_size = (size_t) s->cluster_size * num_tables; + size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t; + size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align); + if (length > 0) { + qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED); + } +#endif +} + Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) { BDRVQcowState *s = bs->opaque; @@ -237,6 +255,8 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c) c->entries[i].lru_counter = 0; } + qcow2_cache_table_release(bs, c, 0, c->size); + c->lru_counter = 0; return 0; -- 2.1.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-27 9:46 [Qemu-devel] [PATCH v3 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 1/3] qcow2: mark the memory as no longer needed after qcow2_cache_empty() Alberto Garcia @ 2015-05-27 9:46 ` Alberto Garcia 2015-05-27 12:27 ` Eric Blake 2015-05-28 14:56 ` Max Reitz 2015-05-27 9:46 ` [Qemu-devel] [PATCH 3/3] qcow2: reorder fields in Qcow2CachedTable to reduce padding Alberto Garcia 2 siblings, 2 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-27 9:46 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi This adds a new 'cache-clean-interval' option that cleans all qcow2 cache entries that haven't been used in a certain interval, given in seconds. This allows setting a large L2 cache size so it can handle scenarios with lots of I/O and at the same time use little memory during periods of inactivity. This feature currently relies on MADV_DONTNEED to free that memory, so it is not useful in systems that don't follow that behavior. Signed-off-by: Alberto Garcia <berto@igalia.com> Cc: Max Reitz <mreitz@redhat.com> --- block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ block/qcow2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 13 +++++++++-- 4 files changed, 116 insertions(+), 2 deletions(-) diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c index ed14a92..a215f5b 100644 --- a/block/qcow2-cache.c +++ b/block/qcow2-cache.c @@ -43,6 +43,7 @@ struct Qcow2Cache { bool depends_on_flush; void *table_array; uint64_t lru_counter; + uint64_t cache_clean_lru_counter; }; static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs, @@ -78,6 +79,40 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c, #endif } +static inline bool can_clean_entry(Qcow2Cache *c, int i) +{ + Qcow2CachedTable *t = &c->entries[i]; + return t->ref == 0 && !t->dirty && t->offset != 0 && + t->lru_counter <= c->cache_clean_lru_counter; +} + +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c) +{ + int i = 0; + while (i < c->size) { + int to_clean = 0; + + /* Skip the entries that we don't need to clean */ + while (i < c->size && !can_clean_entry(c, i)) { + i++; + } + + /* And count how many we can clean in a row */ + while (i < c->size && can_clean_entry(c, i)) { + c->entries[i].offset = 0; + c->entries[i].lru_counter = 0; + i++; + to_clean++; + } + + if (to_clean > 0) { + qcow2_cache_table_release(bs, c, i - to_clean, to_clean); + } + } + + c->cache_clean_lru_counter = c->lru_counter; +} + Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) { BDRVQcowState *s = bs->opaque; diff --git a/block/qcow2.c b/block/qcow2.c index f7b4cc6..abafcdf 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -468,6 +468,11 @@ static QemuOptsList qcow2_runtime_opts = { .type = QEMU_OPT_SIZE, .help = "Maximum refcount block cache size", }, + { + .name = QCOW2_OPT_CACHE_CLEAN_INTERVAL, + .type = QEMU_OPT_NUMBER, + .help = "Clean unused cache entries after this time (in seconds)", + }, { /* end of list */ } }, }; @@ -483,6 +488,49 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = { [QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2, }; +static void cache_clean_timer_cb(void *opaque) +{ + BlockDriverState *bs = opaque; + BDRVQcowState *s = bs->opaque; + qcow2_cache_clean_unused(bs, s->l2_table_cache); + qcow2_cache_clean_unused(bs, s->refcount_block_cache); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + (int64_t) s->cache_clean_interval * 1000); +} + +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_interval > 0) { + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, + SCALE_MS, cache_clean_timer_cb, + bs); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + (int64_t) s->cache_clean_interval * 1000); + } +} + +static void cache_clean_timer_del(BlockDriverState *bs) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_timer) { + timer_del(s->cache_clean_timer); + timer_free(s->cache_clean_timer); + s->cache_clean_timer = NULL; + } +} + +static void qcow2_detach_aio_context(BlockDriverState *bs) +{ + cache_clean_timer_del(bs); +} + +static void qcow2_attach_aio_context(BlockDriverState *bs, + AioContext *new_context) +{ + cache_clean_timer_init(bs, new_context); +} + static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size, uint64_t *refcount_cache_size, Error **errp) { @@ -552,6 +600,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, const char *opt_overlap_check, *opt_overlap_check_template; int overlap_check_template = 0; uint64_t l2_cache_size, refcount_cache_size; + uint64_t cache_clean_interval; ret = bdrv_pread(bs->file, 0, &header, sizeof(header)); if (ret < 0) { @@ -839,6 +888,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } + cache_clean_interval = + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); + if (cache_clean_interval > UINT_MAX) { + error_setg(errp, "Cache clean interval too big"); + ret = -EINVAL; + goto fail; + } + s->cache_clean_interval = cache_clean_interval; + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); + s->cluster_cache = g_malloc(s->cluster_size); /* one more sector for decompressed data alignment */ s->cluster_data = qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS @@ -1004,6 +1063,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, qemu_vfree(s->l1_table); /* else pre-write overlap checks in cache_destroy may crash */ s->l1_table = NULL; + cache_clean_timer_del(bs); if (s->l2_table_cache) { qcow2_cache_destroy(bs, s->l2_table_cache); } @@ -1458,6 +1518,7 @@ static void qcow2_close(BlockDriverState *bs) } } + cache_clean_timer_del(bs); qcow2_cache_destroy(bs, s->l2_table_cache); qcow2_cache_destroy(bs, s->refcount_block_cache); @@ -2552,6 +2613,8 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs) }; } + spec_info->qcow2->cache_clean_interval = s->cache_clean_interval; + return spec_info; } @@ -2970,6 +3033,9 @@ BlockDriver bdrv_qcow2 = { .create_opts = &qcow2_create_opts, .bdrv_check = qcow2_check, .bdrv_amend_options = qcow2_amend_options, + + .bdrv_detach_aio_context = qcow2_detach_aio_context, + .bdrv_attach_aio_context = qcow2_attach_aio_context, }; static void bdrv_qcow2_init(void) diff --git a/block/qcow2.h b/block/qcow2.h index 0076512..2c45eb2 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -93,6 +93,7 @@ #define QCOW2_OPT_CACHE_SIZE "cache-size" #define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size" #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size" +#define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval" typedef struct QCowHeader { uint32_t magic; @@ -236,6 +237,8 @@ typedef struct BDRVQcowState { Qcow2Cache* l2_table_cache; Qcow2Cache* refcount_block_cache; + QEMUTimer *cache_clean_timer; + unsigned cache_clean_interval; uint8_t *cluster_cache; uint8_t *cluster_data; @@ -581,6 +584,7 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, Qcow2Cache *dependency); void qcow2_cache_depends_on_flush(Qcow2Cache *c); +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, diff --git a/qapi/block-core.json b/qapi/block-core.json index 863ffea..3a9b590 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -41,6 +41,10 @@ # @corrupt: #optional true if the image has been marked corrupt; only valid for # compat >= 1.1 (since 2.2) # +# @cache-clean-interval: interval in seconds after which unused L2 and +# refcount cache entries are removed. If 0 then +# this feature is not enabled (since 2.4) +# # @refcount-bits: width of a refcount entry in bits (since 2.3) # # Since: 1.7 @@ -50,7 +54,8 @@ 'compat': 'str', '*lazy-refcounts': 'bool', '*corrupt': 'bool', - 'refcount-bits': 'int' + 'refcount-bits': 'int', + 'cache-clean-interval': 'int' } } ## @@ -1538,6 +1543,9 @@ # @refcount-cache-size: #optional the maximum size of the refcount block cache # in bytes (since 2.2) # +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount +# caches. The interval is in seconds (since 2.4) +# # Since: 1.7 ## { 'struct': 'BlockdevOptionsQcow2', @@ -1549,7 +1557,8 @@ '*overlap-check': 'Qcow2OverlapChecks', '*cache-size': 'int', '*l2-cache-size': 'int', - '*refcount-cache-size': 'int' } } + '*refcount-cache-size': 'int', + '*cache-clean-interval': 'int' } } ## -- 2.1.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-27 9:46 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia @ 2015-05-27 12:27 ` Eric Blake 2015-05-27 14:34 ` Alberto Garcia 2015-05-28 14:47 ` Max Reitz 2015-05-28 14:56 ` Max Reitz 1 sibling, 2 replies; 29+ messages in thread From: Eric Blake @ 2015-05-27 12:27 UTC (permalink / raw) To: Alberto Garcia, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi, Max Reitz [-- Attachment #1: Type: text/plain, Size: 2971 bytes --] On 05/27/2015 03:46 AM, Alberto Garcia wrote: > This adds a new 'cache-clean-interval' option that cleans all qcow2 > cache entries that haven't been used in a certain interval, given in > seconds. > > This allows setting a large L2 cache size so it can handle scenarios > with lots of I/O and at the same time use little memory during periods > of inactivity. > > This feature currently relies on MADV_DONTNEED to free that memory, so > it is not useful in systems that don't follow that behavior. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > Cc: Max Reitz <mreitz@redhat.com> > --- > block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ > block/qcow2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > block/qcow2.h | 4 ++++ > qapi/block-core.json | 13 +++++++++-- > 4 files changed, 116 insertions(+), 2 deletions(-) > > +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) > +{ > + BDRVQcowState *s = bs->opaque; > + if (s->cache_clean_interval > 0) { > + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, > + SCALE_MS, cache_clean_timer_cb, > + bs); > + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + > + (int64_t) s->cache_clean_interval * 1000); > + } > +} > + This function sets up a timer for non-zero interval, but does nothing if interval is zero. [1] > @@ -839,6 +888,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, > goto fail; > } > > + cache_clean_interval = > + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); > + if (cache_clean_interval > UINT_MAX) { > + error_setg(errp, "Cache clean interval too big"); > + ret = -EINVAL; > + goto fail; > + } If you type the qapi code as 'uint32' rather than 'int', you could skip the error checking here because the parser would have already clamped things. But I can live with this as-is. > + s->cache_clean_interval = cache_clean_interval; > + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); [1] But here, you are unconditionally calling init, whether the new value is 0 or nonzero. Can a block reopen ever cause an existing BDS to change its interval, in which case I could create a BDS originally with a timer, then reopen it without a timer, and init() would have to remove the existing timer? If I'm reading this patch correctly, right now the interval is a write-once deal (no way to change it after the fact), so your code is okay; but should a separate patch be added to allow adjusting the interval, via a reopen operation? Reviewed-by: Eric Blake <eblake@redhat.com> -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-27 12:27 ` Eric Blake @ 2015-05-27 14:34 ` Alberto Garcia 2015-05-28 14:47 ` Max Reitz 1 sibling, 0 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-27 14:34 UTC (permalink / raw) To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi, Max Reitz On Wed 27 May 2015 02:27:35 PM CEST, Eric Blake <eblake@redhat.com> wrote: >> +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) > > This function sets up a timer for non-zero interval, but does nothing > if interval is zero. [1] Right, zero means there's no interval. I could clarify that in the documentation (I did it in the ImageInfoSpecificQCow2 part, though). >> + if (cache_clean_interval > UINT_MAX) { >> + error_setg(errp, "Cache clean interval too big"); >> + ret = -EINVAL; >> + goto fail; >> + } > > If you type the qapi code as 'uint32' rather than 'int', you could > skip the error checking here because the parser would have already > clamped things. But I can live with this as-is. Hmmm... the UINT_MAX limit is just a way to prevent an overflow because of an abnormal value, not because the limit itself make sense (it's ~136 years, way beyond any reasonable value for that setting). I'm not sure if it's a good idea to expose that in the API, it gives the idea that there's a reason why it is a 32-bit integer, but there's none. I would actually be ok with having a smaller limit (I don't know, 1 year?), but there's also no easy way to choose one I guess, so I decided to let the user choose as long as it doesn't break anything. >> + s->cache_clean_interval = cache_clean_interval; >> + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); > > [1] But here, you are unconditionally calling init, whether the new > value is 0 or nonzero. Instead of having the same check in all places where we call cache_clean_timer_init() (there's two at the moment) I think it's enough with checking the value in the function where it is actually used (which is anyway a good idea). > Can a block reopen ever cause an existing BDS to change its interval, > in which case I could create a BDS originally with a timer, then > reopen it without a timer, and init() would have to remove the > existing timer? If I'm reading this patch correctly, right now the > interval is a write-once deal (no way to change it after the fact), so > your code is okay; but should a separate patch be added to allow > adjusting the interval, via a reopen operation? I have no objections against changing the interval, I just didn't consider that case. If we want to support it it should be as simple as timer_del(); s->interval = new_interval; timer_init(); Berto ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-27 12:27 ` Eric Blake 2015-05-27 14:34 ` Alberto Garcia @ 2015-05-28 14:47 ` Max Reitz 1 sibling, 0 replies; 29+ messages in thread From: Max Reitz @ 2015-05-28 14:47 UTC (permalink / raw) To: Eric Blake, Alberto Garcia, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On 27.05.2015 14:27, Eric Blake wrote: > On 05/27/2015 03:46 AM, Alberto Garcia wrote: >> This adds a new 'cache-clean-interval' option that cleans all qcow2 >> cache entries that haven't been used in a certain interval, given in >> seconds. >> >> This allows setting a large L2 cache size so it can handle scenarios >> with lots of I/O and at the same time use little memory during periods >> of inactivity. >> >> This feature currently relies on MADV_DONTNEED to free that memory, so >> it is not useful in systems that don't follow that behavior. >> >> Signed-off-by: Alberto Garcia <berto@igalia.com> >> Cc: Max Reitz <mreitz@redhat.com> >> --- >> block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ >> block/qcow2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++ >> block/qcow2.h | 4 ++++ >> qapi/block-core.json | 13 +++++++++-- >> 4 files changed, 116 insertions(+), 2 deletions(-) >> >> +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) >> +{ >> + BDRVQcowState *s = bs->opaque; >> + if (s->cache_clean_interval > 0) { >> + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, >> + SCALE_MS, cache_clean_timer_cb, >> + bs); >> + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + >> + (int64_t) s->cache_clean_interval * 1000); >> + } >> +} >> + > This function sets up a timer for non-zero interval, but does nothing if > interval is zero. [1] > >> @@ -839,6 +888,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, >> goto fail; >> } >> >> + cache_clean_interval = >> + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); >> + if (cache_clean_interval > UINT_MAX) { >> + error_setg(errp, "Cache clean interval too big"); >> + ret = -EINVAL; >> + goto fail; >> + } > If you type the qapi code as 'uint32' rather than 'int', you could skip > the error checking here because the parser would have already clamped > things. But I can live with this as-is. Well, for blockdev-add, yes, but I don't think that applies when the option has been passed on the command line. Max >> + s->cache_clean_interval = cache_clean_interval; >> + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); > [1] But here, you are unconditionally calling init, whether the new > value is 0 or nonzero. Can a block reopen ever cause an existing BDS to > change its interval, in which case I could create a BDS originally with > a timer, then reopen it without a timer, and init() would have to remove > the existing timer? If I'm reading this patch correctly, right now the > interval is a write-once deal (no way to change it after the fact), so > your code is okay; but should a separate patch be added to allow > adjusting the interval, via a reopen operation? > > Reviewed-by: Eric Blake <eblake@redhat.com> > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-27 9:46 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 2015-05-27 12:27 ` Eric Blake @ 2015-05-28 14:56 ` Max Reitz 2015-05-28 15:02 ` Alberto Garcia 2015-05-28 15:13 ` Eric Blake 1 sibling, 2 replies; 29+ messages in thread From: Max Reitz @ 2015-05-28 14:56 UTC (permalink / raw) To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On 27.05.2015 11:46, Alberto Garcia wrote: > This adds a new 'cache-clean-interval' option that cleans all qcow2 > cache entries that haven't been used in a certain interval, given in > seconds. > > This allows setting a large L2 cache size so it can handle scenarios > with lots of I/O and at the same time use little memory during periods > of inactivity. > > This feature currently relies on MADV_DONTNEED to free that memory, so > it is not useful in systems that don't follow that behavior. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > Cc: Max Reitz <mreitz@redhat.com> > --- > block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ > block/qcow2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > block/qcow2.h | 4 ++++ > qapi/block-core.json | 13 +++++++++-- > 4 files changed, 116 insertions(+), 2 deletions(-) > > diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c > index ed14a92..a215f5b 100644 > --- a/block/qcow2-cache.c > +++ b/block/qcow2-cache.c > @@ -43,6 +43,7 @@ struct Qcow2Cache { > bool depends_on_flush; > void *table_array; > uint64_t lru_counter; > + uint64_t cache_clean_lru_counter; > }; > > static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs, > @@ -78,6 +79,40 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c, > #endif > } > > +static inline bool can_clean_entry(Qcow2Cache *c, int i) > +{ > + Qcow2CachedTable *t = &c->entries[i]; > + return t->ref == 0 && !t->dirty && t->offset != 0 && > + t->lru_counter <= c->cache_clean_lru_counter; > +} > + > +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c) > +{ > + int i = 0; > + while (i < c->size) { > + int to_clean = 0; > + > + /* Skip the entries that we don't need to clean */ > + while (i < c->size && !can_clean_entry(c, i)) { > + i++; > + } > + > + /* And count how many we can clean in a row */ > + while (i < c->size && can_clean_entry(c, i)) { > + c->entries[i].offset = 0; > + c->entries[i].lru_counter = 0; > + i++; > + to_clean++; > + } > + > + if (to_clean > 0) { > + qcow2_cache_table_release(bs, c, i - to_clean, to_clean); > + } > + } > + > + c->cache_clean_lru_counter = c->lru_counter; > +} > + > Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) > { > BDRVQcowState *s = bs->opaque; > diff --git a/block/qcow2.c b/block/qcow2.c > index f7b4cc6..abafcdf 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -468,6 +468,11 @@ static QemuOptsList qcow2_runtime_opts = { > .type = QEMU_OPT_SIZE, > .help = "Maximum refcount block cache size", > }, > + { > + .name = QCOW2_OPT_CACHE_CLEAN_INTERVAL, > + .type = QEMU_OPT_NUMBER, > + .help = "Clean unused cache entries after this time (in seconds)", > + }, > { /* end of list */ } > }, > }; > @@ -483,6 +488,49 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = { > [QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2, > }; > > +static void cache_clean_timer_cb(void *opaque) > +{ > + BlockDriverState *bs = opaque; > + BDRVQcowState *s = bs->opaque; > + qcow2_cache_clean_unused(bs, s->l2_table_cache); > + qcow2_cache_clean_unused(bs, s->refcount_block_cache); > + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + > + (int64_t) s->cache_clean_interval * 1000); > +} > + > +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) > +{ > + BDRVQcowState *s = bs->opaque; > + if (s->cache_clean_interval > 0) { > + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, > + SCALE_MS, cache_clean_timer_cb, > + bs); > + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + > + (int64_t) s->cache_clean_interval * 1000); > + } > +} > + > +static void cache_clean_timer_del(BlockDriverState *bs) > +{ > + BDRVQcowState *s = bs->opaque; > + if (s->cache_clean_timer) { > + timer_del(s->cache_clean_timer); > + timer_free(s->cache_clean_timer); > + s->cache_clean_timer = NULL; > + } > +} > + > +static void qcow2_detach_aio_context(BlockDriverState *bs) > +{ > + cache_clean_timer_del(bs); > +} > + > +static void qcow2_attach_aio_context(BlockDriverState *bs, > + AioContext *new_context) > +{ > + cache_clean_timer_init(bs, new_context); > +} > + > static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size, > uint64_t *refcount_cache_size, Error **errp) > { > @@ -552,6 +600,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, > const char *opt_overlap_check, *opt_overlap_check_template; > int overlap_check_template = 0; > uint64_t l2_cache_size, refcount_cache_size; > + uint64_t cache_clean_interval; > > ret = bdrv_pread(bs->file, 0, &header, sizeof(header)); > if (ret < 0) { > @@ -839,6 +888,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, > goto fail; > } > > + cache_clean_interval = > + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); > + if (cache_clean_interval > UINT_MAX) { > + error_setg(errp, "Cache clean interval too big"); > + ret = -EINVAL; > + goto fail; > + } > + s->cache_clean_interval = cache_clean_interval; > + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); > + > s->cluster_cache = g_malloc(s->cluster_size); > /* one more sector for decompressed data alignment */ > s->cluster_data = qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS > @@ -1004,6 +1063,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, > qemu_vfree(s->l1_table); > /* else pre-write overlap checks in cache_destroy may crash */ > s->l1_table = NULL; > + cache_clean_timer_del(bs); > if (s->l2_table_cache) { > qcow2_cache_destroy(bs, s->l2_table_cache); > } > @@ -1458,6 +1518,7 @@ static void qcow2_close(BlockDriverState *bs) > } > } > > + cache_clean_timer_del(bs); > qcow2_cache_destroy(bs, s->l2_table_cache); > qcow2_cache_destroy(bs, s->refcount_block_cache); > > @@ -2552,6 +2613,8 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs) > }; > } > > + spec_info->qcow2->cache_clean_interval = s->cache_clean_interval; > + > return spec_info; > } > > @@ -2970,6 +3033,9 @@ BlockDriver bdrv_qcow2 = { > .create_opts = &qcow2_create_opts, > .bdrv_check = qcow2_check, > .bdrv_amend_options = qcow2_amend_options, > + > + .bdrv_detach_aio_context = qcow2_detach_aio_context, > + .bdrv_attach_aio_context = qcow2_attach_aio_context, > }; > > static void bdrv_qcow2_init(void) > diff --git a/block/qcow2.h b/block/qcow2.h > index 0076512..2c45eb2 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -93,6 +93,7 @@ > #define QCOW2_OPT_CACHE_SIZE "cache-size" > #define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size" > #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size" > +#define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval" > > typedef struct QCowHeader { > uint32_t magic; > @@ -236,6 +237,8 @@ typedef struct BDRVQcowState { > > Qcow2Cache* l2_table_cache; > Qcow2Cache* refcount_block_cache; > + QEMUTimer *cache_clean_timer; > + unsigned cache_clean_interval; > > uint8_t *cluster_cache; > uint8_t *cluster_data; > @@ -581,6 +584,7 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, > Qcow2Cache *dependency); > void qcow2_cache_depends_on_flush(Qcow2Cache *c); > > +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c); > int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c); > > int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 863ffea..3a9b590 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -41,6 +41,10 @@ > # @corrupt: #optional true if the image has been marked corrupt; only valid for > # compat >= 1.1 (since 2.2) > # > +# @cache-clean-interval: interval in seconds after which unused L2 and > +# refcount cache entries are removed. If 0 then > +# this feature is not enabled (since 2.4) > +# > # @refcount-bits: width of a refcount entry in bits (since 2.3) > # > # Since: 1.7 > @@ -50,7 +54,8 @@ > 'compat': 'str', > '*lazy-refcounts': 'bool', > '*corrupt': 'bool', > - 'refcount-bits': 'int' > + 'refcount-bits': 'int', > + 'cache-clean-interval': 'int' > } } I'm not too happy about making this part of ImageInfoSpecificQCow2. Two reasons for this: First, it's eventually part of ImageInfo, which is defined as "Information about a QEMU image file", but this option cannot be set in the image file itself but is only a run-time option. Second, my original goal for ImageInfoSpecific was to provide more information through qemu-img info, and this value will look pretty strange there. I don't know how to resolve this quandary, though. Since qemu cannot change this interval by itself, I think not providing an interface for reading it is fine. On the other hand, if Eric finds such an interface absolutely mandatory, I can't think of a better place to return it than here. The only solution which would satisfy both requirements would be another structure which contains "online" flags, and thus is not evaluated by qemu-img info, but only by QMP commands. But that's ugly. Max > ## > @@ -1538,6 +1543,9 @@ > # @refcount-cache-size: #optional the maximum size of the refcount block cache > # in bytes (since 2.2) > # > +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount > +# caches. The interval is in seconds (since 2.4) > +# > # Since: 1.7 > ## > { 'struct': 'BlockdevOptionsQcow2', > @@ -1549,7 +1557,8 @@ > '*overlap-check': 'Qcow2OverlapChecks', > '*cache-size': 'int', > '*l2-cache-size': 'int', > - '*refcount-cache-size': 'int' } } > + '*refcount-cache-size': 'int', > + '*cache-clean-interval': 'int' } } > > > ## ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 14:56 ` Max Reitz @ 2015-05-28 15:02 ` Alberto Garcia 2015-05-28 15:04 ` Max Reitz 2015-05-28 15:13 ` Eric Blake 1 sibling, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-28 15:02 UTC (permalink / raw) To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On Thu 28 May 2015 04:56:34 PM CEST, Max Reitz wrote: >> 'compat': 'str', >> '*lazy-refcounts': 'bool', >> '*corrupt': 'bool', >> - 'refcount-bits': 'int' >> + 'refcount-bits': 'int', >> + 'cache-clean-interval': 'int' >> } } > > I'm not too happy about making this part of ImageInfoSpecificQCow2. > Two reasons for this: First, it's eventually part of ImageInfo, which > is defined as "Information about a QEMU image file", but this option > cannot be set in the image file itself but is only a run-time option. That's a valid point. Now that I think of it, do we actually have a way to retrieve the sizes of the L2 and refcount caches? I think cache-size, l2-cache-size, and refcount-cache-size are already write-only values. Berto ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:02 ` Alberto Garcia @ 2015-05-28 15:04 ` Max Reitz 2015-05-28 15:06 ` Max Reitz 0 siblings, 1 reply; 29+ messages in thread From: Max Reitz @ 2015-05-28 15:04 UTC (permalink / raw) To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On 28.05.2015 17:02, Alberto Garcia wrote: > On Thu 28 May 2015 04:56:34 PM CEST, Max Reitz wrote: >>> 'compat': 'str', >>> '*lazy-refcounts': 'bool', >>> '*corrupt': 'bool', >>> - 'refcount-bits': 'int' >>> + 'refcount-bits': 'int', >>> + 'cache-clean-interval': 'int' >>> } } >> I'm not too happy about making this part of ImageInfoSpecificQCow2. >> Two reasons for this: First, it's eventually part of ImageInfo, which >> is defined as "Information about a QEMU image file", but this option >> cannot be set in the image file itself but is only a run-time option. > That's a valid point. Now that I think of it, do we actually have a way > to retrieve the sizes of the L2 and refcount caches? I think cache-size, > l2-cache-size, and refcount-cache-size are already write-only values. No, in fact we don't. Well, except for if you manage to retrieve the JSON filename for the qcow2 BDS, it should be part of that. :-P Max ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:04 ` Max Reitz @ 2015-05-28 15:06 ` Max Reitz 0 siblings, 0 replies; 29+ messages in thread From: Max Reitz @ 2015-05-28 15:06 UTC (permalink / raw) To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On 28.05.2015 17:04, Max Reitz wrote: > On 28.05.2015 17:02, Alberto Garcia wrote: >> On Thu 28 May 2015 04:56:34 PM CEST, Max Reitz wrote: >>>> 'compat': 'str', >>>> '*lazy-refcounts': 'bool', >>>> '*corrupt': 'bool', >>>> - 'refcount-bits': 'int' >>>> + 'refcount-bits': 'int', >>>> + 'cache-clean-interval': 'int' >>>> } } >>> I'm not too happy about making this part of ImageInfoSpecificQCow2. >>> Two reasons for this: First, it's eventually part of ImageInfo, which >>> is defined as "Information about a QEMU image file", but this option >>> cannot be set in the image file itself but is only a run-time option. >> That's a valid point. Now that I think of it, do we actually have a way >> to retrieve the sizes of the L2 and refcount caches? I think cache-size, >> l2-cache-size, and refcount-cache-size are already write-only values. > > No, in fact we don't. Well, except for if you manage to retrieve the > JSON filename for the qcow2 BDS, it should be part of that. :-P And I just noticed, passing that option will result in qemu returning the JSON filename when queried: $ qemu-img info "json:{'l2-cache-size':65536,'file.filename':'test.qcow2'}" image: json:{"driver": "qcow2", "l2-cache-size": 65536, "file": {"driver": "file", "filename": "test.qcow2"}} […] Max ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 14:56 ` Max Reitz 2015-05-28 15:02 ` Alberto Garcia @ 2015-05-28 15:13 ` Eric Blake 2015-05-28 15:14 ` Max Reitz 2015-05-28 16:44 ` Kevin Wolf 1 sibling, 2 replies; 29+ messages in thread From: Eric Blake @ 2015-05-28 15:13 UTC (permalink / raw) To: Max Reitz, Alberto Garcia, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi [-- Attachment #1: Type: text/plain, Size: 2843 bytes --] On 05/28/2015 08:56 AM, Max Reitz wrote: > On 27.05.2015 11:46, Alberto Garcia wrote: >> This adds a new 'cache-clean-interval' option that cleans all qcow2 >> cache entries that haven't been used in a certain interval, given in >> seconds. >> >> This allows setting a large L2 cache size so it can handle scenarios >> with lots of I/O and at the same time use little memory during periods >> of inactivity. >> >> This feature currently relies on MADV_DONTNEED to free that memory, so >> it is not useful in systems that don't follow that behavior. >> >> +++ b/qapi/block-core.json >> @@ -41,6 +41,10 @@ >> # @corrupt: #optional true if the image has been marked corrupt; >> only valid for >> # compat >= 1.1 (since 2.2) >> # >> +# @cache-clean-interval: interval in seconds after which unused L2 and >> +# refcount cache entries are removed. If 0 then >> +# this feature is not enabled (since 2.4) >> +# >> # @refcount-bits: width of a refcount entry in bits (since 2.3) >> # >> # Since: 1.7 >> @@ -50,7 +54,8 @@ >> 'compat': 'str', >> '*lazy-refcounts': 'bool', >> '*corrupt': 'bool', >> - 'refcount-bits': 'int' >> + 'refcount-bits': 'int', >> + 'cache-clean-interval': 'int' >> } } > > I'm not too happy about making this part of ImageInfoSpecificQCow2. Two > reasons for this: First, it's eventually part of ImageInfo, which is > defined as "Information about a QEMU image file", but this option cannot > be set in the image file itself but is only a run-time option. > > Second, my original goal for ImageInfoSpecific was to provide more > information through qemu-img info, and this value will look pretty > strange there. > > I don't know how to resolve this quandary, though. Since qemu cannot > change this interval by itself, I think not providing an interface for > reading it is fine. On the other hand, if Eric finds such an interface > absolutely mandatory, I can't think of a better place to return it than > here. Can we mark the parameter optional, and only provide it when it is non-zero? That way, qemu-img (which cannot set an interval) will not report it, and the only time it will appear is if it was set as part of opening the block device under qemu. > > The only solution which would satisfy both requirements would be another > structure which contains "online" flags, and thus is not evaluated by > qemu-img info, but only by QMP commands. But that's ugly. > Yeah, I'm not sure such duplication helps. I'd still like it reported somewhere, though, as it is nice to query that a requested setting is actually working. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:13 ` Eric Blake @ 2015-05-28 15:14 ` Max Reitz 2015-05-28 15:19 ` Alberto Garcia 2015-05-28 16:44 ` Kevin Wolf 1 sibling, 1 reply; 29+ messages in thread From: Max Reitz @ 2015-05-28 15:14 UTC (permalink / raw) To: Eric Blake, Alberto Garcia, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On 28.05.2015 17:13, Eric Blake wrote: > On 05/28/2015 08:56 AM, Max Reitz wrote: >> On 27.05.2015 11:46, Alberto Garcia wrote: >>> This adds a new 'cache-clean-interval' option that cleans all qcow2 >>> cache entries that haven't been used in a certain interval, given in >>> seconds. >>> >>> This allows setting a large L2 cache size so it can handle scenarios >>> with lots of I/O and at the same time use little memory during periods >>> of inactivity. >>> >>> This feature currently relies on MADV_DONTNEED to free that memory, so >>> it is not useful in systems that don't follow that behavior. >>> >>> +++ b/qapi/block-core.json >>> @@ -41,6 +41,10 @@ >>> # @corrupt: #optional true if the image has been marked corrupt; >>> only valid for >>> # compat >= 1.1 (since 2.2) >>> # >>> +# @cache-clean-interval: interval in seconds after which unused L2 and >>> +# refcount cache entries are removed. If 0 then >>> +# this feature is not enabled (since 2.4) >>> +# >>> # @refcount-bits: width of a refcount entry in bits (since 2.3) >>> # >>> # Since: 1.7 >>> @@ -50,7 +54,8 @@ >>> 'compat': 'str', >>> '*lazy-refcounts': 'bool', >>> '*corrupt': 'bool', >>> - 'refcount-bits': 'int' >>> + 'refcount-bits': 'int', >>> + 'cache-clean-interval': 'int' >>> } } >> I'm not too happy about making this part of ImageInfoSpecificQCow2. Two >> reasons for this: First, it's eventually part of ImageInfo, which is >> defined as "Information about a QEMU image file", but this option cannot >> be set in the image file itself but is only a run-time option. >> >> Second, my original goal for ImageInfoSpecific was to provide more >> information through qemu-img info, and this value will look pretty >> strange there. >> >> I don't know how to resolve this quandary, though. Since qemu cannot >> change this interval by itself, I think not providing an interface for >> reading it is fine. On the other hand, if Eric finds such an interface >> absolutely mandatory, I can't think of a better place to return it than >> here. > Can we mark the parameter optional, and only provide it when it is > non-zero? That way, qemu-img (which cannot set an interval) will not > report it, and the only time it will appear is if it was set as part of > opening the block device under qemu. That sounds good. Max >> The only solution which would satisfy both requirements would be another >> structure which contains "online" flags, and thus is not evaluated by >> qemu-img info, but only by QMP commands. But that's ugly. >> > Yeah, I'm not sure such duplication helps. I'd still like it reported > somewhere, though, as it is nice to query that a requested setting is > actually working. > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:14 ` Max Reitz @ 2015-05-28 15:19 ` Alberto Garcia 2015-05-28 15:23 ` Max Reitz 0 siblings, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-28 15:19 UTC (permalink / raw) To: Max Reitz, Eric Blake, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On Thu 28 May 2015 05:14:12 PM CEST, Max Reitz wrote: >>>> 'compat': 'str', >>>> '*lazy-refcounts': 'bool', >>>> '*corrupt': 'bool', >>>> - 'refcount-bits': 'int' >>>> + 'refcount-bits': 'int', >>>> + 'cache-clean-interval': 'int' >>>> } } >>> I'm not too happy about making this part of ImageInfoSpecificQCow2. >>> Two reasons for this: First, it's eventually part of ImageInfo, >>> which is defined as "Information about a QEMU image file", but this >>> option cannot be set in the image file itself but is only a run-time >>> option. >>> >> Can we mark the parameter optional, and only provide it when it is >> non-zero? That way, qemu-img (which cannot set an interval) will not >> report it, and the only time it will appear is if it was set as part >> of opening the block device under qemu. > > That sounds good. But what do we do with the other parameters (refcount-cache-size, l2-cache-size)? We cannot have the same solution there because they don't belong to the image file either, and they're never going go be zero. Berto ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:19 ` Alberto Garcia @ 2015-05-28 15:23 ` Max Reitz 2015-05-28 15:30 ` Alberto Garcia 2015-05-28 19:41 ` Eric Blake 0 siblings, 2 replies; 29+ messages in thread From: Max Reitz @ 2015-05-28 15:23 UTC (permalink / raw) To: Alberto Garcia, Eric Blake, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On 28.05.2015 17:19, Alberto Garcia wrote: > On Thu 28 May 2015 05:14:12 PM CEST, Max Reitz wrote: >>>>> 'compat': 'str', >>>>> '*lazy-refcounts': 'bool', >>>>> '*corrupt': 'bool', >>>>> - 'refcount-bits': 'int' >>>>> + 'refcount-bits': 'int', >>>>> + 'cache-clean-interval': 'int' >>>>> } } >>>> I'm not too happy about making this part of ImageInfoSpecificQCow2. >>>> Two reasons for this: First, it's eventually part of ImageInfo, >>>> which is defined as "Information about a QEMU image file", but this >>>> option cannot be set in the image file itself but is only a run-time >>>> option. >>>> >>> Can we mark the parameter optional, and only provide it when it is >>> non-zero? That way, qemu-img (which cannot set an interval) will not >>> report it, and the only time it will appear is if it was set as part >>> of opening the block device under qemu. >> That sounds good. > But what do we do with the other parameters (refcount-cache-size, > l2-cache-size)? We cannot have the same solution there because they > don't belong to the image file either, and they're never going go be > zero. Pssht, don't mention it, or Eric will notice. :-) Well, one solution would be to remember whether they have been set explicitly or not. But that gets ugly really quickly. Maybe Kevin's series helps there, but I don't know. Of course, the simplest solution is to worry about cache-clean-interval for now and worry about the cache sizes later… But that basically means "We'll never worry about them unless someone complains", so… Max ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:23 ` Max Reitz @ 2015-05-28 15:30 ` Alberto Garcia 2015-05-28 19:41 ` Eric Blake 1 sibling, 0 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-28 15:30 UTC (permalink / raw) To: Max Reitz, Eric Blake, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On Thu 28 May 2015 05:23:40 PM CEST, Max Reitz wrote: >>>> Can we mark the parameter optional, and only provide it when it is >>>> non-zero? That way, qemu-img (which cannot set an interval) will >>>> not report it, and the only time it will appear is if it was set as >>>> part of opening the block device under qemu. >>> That sounds good. >> But what do we do with the other parameters (refcount-cache-size, >> l2-cache-size)? We cannot have the same solution there because they >> don't belong to the image file either, and they're never going go be >> zero. > Of course, the simplest solution is to worry about cache-clean-interval > for now and worry about the cache sizes later… But that basically means > "We'll never worry about them unless someone complains", so… My worry is that the solution works in this case because the default value happens to be 0 and means "disabled", so it makes sense, but it doesn't work in general with other parameters, so we would be adding something that we know it's ad-hoc. Berto ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:23 ` Max Reitz 2015-05-28 15:30 ` Alberto Garcia @ 2015-05-28 19:41 ` Eric Blake 2015-05-29 8:32 ` Kevin Wolf 1 sibling, 1 reply; 29+ messages in thread From: Eric Blake @ 2015-05-28 19:41 UTC (permalink / raw) To: Max Reitz, Alberto Garcia, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi [-- Attachment #1: Type: text/plain, Size: 1417 bytes --] On 05/28/2015 09:23 AM, Max Reitz wrote: >>>> Can we mark the parameter optional, and only provide it when it is >>>> non-zero? That way, qemu-img (which cannot set an interval) will not >>>> report it, and the only time it will appear is if it was set as part >>>> of opening the block device under qemu. >>> That sounds good. >> But what do we do with the other parameters (refcount-cache-size, >> l2-cache-size)? We cannot have the same solution there because they >> don't belong to the image file either, and they're never going go be >> zero. > > Pssht, don't mention it, or Eric will notice. :-) > > Well, one solution would be to remember whether they have been set > explicitly or not. But that gets ugly really quickly. Maybe Kevin's > series helps there, but I don't know. > > Of course, the simplest solution is to worry about cache-clean-interval > for now and worry about the cache sizes later… But that basically means > "We'll never worry about them unless someone complains", so… Hmm, now that we have three pieces of data, I'm starting to lean more towards ImageInfoSpecific being the wrong place for this after all; it would still be nice to advertise all three, but where? Is BlockdevCacheInfo more appropriate (as a sub-struct of BlockDeviceInfo)? -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 19:41 ` Eric Blake @ 2015-05-29 8:32 ` Kevin Wolf 0 siblings, 0 replies; 29+ messages in thread From: Kevin Wolf @ 2015-05-29 8:32 UTC (permalink / raw) To: Eric Blake Cc: qemu-block, Alberto Garcia, qemu-devel, Stefan Hajnoczi, Max Reitz [-- Attachment #1: Type: text/plain, Size: 1798 bytes --] Am 28.05.2015 um 21:41 hat Eric Blake geschrieben: > On 05/28/2015 09:23 AM, Max Reitz wrote: > > >>>> Can we mark the parameter optional, and only provide it when it is > >>>> non-zero? That way, qemu-img (which cannot set an interval) will not > >>>> report it, and the only time it will appear is if it was set as part > >>>> of opening the block device under qemu. > >>> That sounds good. > >> But what do we do with the other parameters (refcount-cache-size, > >> l2-cache-size)? We cannot have the same solution there because they > >> don't belong to the image file either, and they're never going go be > >> zero. > > > > Pssht, don't mention it, or Eric will notice. :-) > > > > Well, one solution would be to remember whether they have been set > > explicitly or not. But that gets ugly really quickly. Maybe Kevin's > > series helps there, but I don't know. > > > > Of course, the simplest solution is to worry about cache-clean-interval > > for now and worry about the cache sizes later… But that basically means > > "We'll never worry about them unless someone complains", so… > > Hmm, now that we have three pieces of data, I'm starting to lean more > towards ImageInfoSpecific being the wrong place for this after all; it > would still be nice to advertise all three, but where? Is > BlockdevCacheInfo more appropriate (as a sub-struct of BlockDeviceInfo)? BlockDeviceInfo contains runtime information, so that looks like the right place indeed. As these options aren't present for all devices, I don't think we should extend BlockdevCacheInfo, but rather introduce a BlockDeviceInfoSpecific that works like ImageInfoSpecific, just for runtime information. I agree with Berto that that's out of scope for this series, though. Kevin [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 15:13 ` Eric Blake 2015-05-28 15:14 ` Max Reitz @ 2015-05-28 16:44 ` Kevin Wolf 2015-05-28 17:03 ` Alberto Garcia 1 sibling, 1 reply; 29+ messages in thread From: Kevin Wolf @ 2015-05-28 16:44 UTC (permalink / raw) To: Eric Blake Cc: qemu-block, Alberto Garcia, qemu-devel, Stefan Hajnoczi, Max Reitz [-- Attachment #1: Type: text/plain, Size: 3400 bytes --] Am 28.05.2015 um 17:13 hat Eric Blake geschrieben: > On 05/28/2015 08:56 AM, Max Reitz wrote: > > On 27.05.2015 11:46, Alberto Garcia wrote: > >> This adds a new 'cache-clean-interval' option that cleans all qcow2 > >> cache entries that haven't been used in a certain interval, given in > >> seconds. > >> > >> This allows setting a large L2 cache size so it can handle scenarios > >> with lots of I/O and at the same time use little memory during periods > >> of inactivity. > >> > >> This feature currently relies on MADV_DONTNEED to free that memory, so > >> it is not useful in systems that don't follow that behavior. > >> > > >> +++ b/qapi/block-core.json > >> @@ -41,6 +41,10 @@ > >> # @corrupt: #optional true if the image has been marked corrupt; > >> only valid for > >> # compat >= 1.1 (since 2.2) > >> # > >> +# @cache-clean-interval: interval in seconds after which unused L2 and > >> +# refcount cache entries are removed. If 0 then > >> +# this feature is not enabled (since 2.4) > >> +# > >> # @refcount-bits: width of a refcount entry in bits (since 2.3) > >> # > >> # Since: 1.7 > >> @@ -50,7 +54,8 @@ > >> 'compat': 'str', > >> '*lazy-refcounts': 'bool', > >> '*corrupt': 'bool', > >> - 'refcount-bits': 'int' > >> + 'refcount-bits': 'int', > >> + 'cache-clean-interval': 'int' > >> } } > > > > I'm not too happy about making this part of ImageInfoSpecificQCow2. Two > > reasons for this: First, it's eventually part of ImageInfo, which is > > defined as "Information about a QEMU image file", but this option cannot > > be set in the image file itself but is only a run-time option. > > > > Second, my original goal for ImageInfoSpecific was to provide more > > information through qemu-img info, and this value will look pretty > > strange there. > > > > I don't know how to resolve this quandary, though. Since qemu cannot > > change this interval by itself, I think not providing an interface for > > reading it is fine. On the other hand, if Eric finds such an interface > > absolutely mandatory, I can't think of a better place to return it than > > here. > > Can we mark the parameter optional, and only provide it when it is > non-zero? That way, qemu-img (which cannot set an interval) will not > report it, and the only time it will appear is if it was set as part of > opening the block device under qemu. No, that wouldn't be right. It's not information tied to the image file. > > The only solution which would satisfy both requirements would be another > > structure which contains "online" flags, and thus is not evaluated by > > qemu-img info, but only by QMP commands. But that's ugly. > > > > Yeah, I'm not sure such duplication helps. I'd still like it reported > somewhere, though, as it is nice to query that a requested setting is > actually working. This isn't duplicated information. You can have ImageInfoSpecificQCow2 show lazy_refcounts=off because that is what the image file contains, but the real value in effect could be lazy_refcounts=on. Options stored in the image and runtime options are two different pieces of information, even if the former specify the defaults for the latter. So I think it should be possible to query both of them. Kevin [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-28 16:44 ` Kevin Wolf @ 2015-05-28 17:03 ` Alberto Garcia 0 siblings, 0 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-28 17:03 UTC (permalink / raw) To: Kevin Wolf, Eric Blake; +Cc: qemu-block, qemu-devel, Stefan Hajnoczi, Max Reitz On Thu 28 May 2015 06:44:55 PM CEST, Kevin Wolf wrote: >> Yeah, I'm not sure such duplication helps. I'd still like it >> reported somewhere, though, as it is nice to query that a requested >> setting is actually working. > > This isn't duplicated information. You can have ImageInfoSpecificQCow2 > show lazy_refcounts=off because that is what the image file contains, > but the real value in effect could be lazy_refcounts=on. That's right, ImageInfoSpecificQCow2 returns the value from the header (s->compatible_features), not the runtime value (s->use_lazy_refcounts). I think I'll resend the patch without the ImageInfo change. Querying the runtime values is a task on its own. Berto ^ permalink raw reply [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH 3/3] qcow2: reorder fields in Qcow2CachedTable to reduce padding 2015-05-27 9:46 [Qemu-devel] [PATCH v3 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 1/3] qcow2: mark the memory as no longer needed after qcow2_cache_empty() Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia @ 2015-05-27 9:46 ` Alberto Garcia 2 siblings, 0 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-27 9:46 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi Changing the current ordering saves 8 bytes per cache entry in x86_64. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> --- block/qcow2-cache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c index a215f5b..43590ff 100644 --- a/block/qcow2-cache.c +++ b/block/qcow2-cache.c @@ -31,9 +31,9 @@ typedef struct Qcow2CachedTable { int64_t offset; - bool dirty; uint64_t lru_counter; int ref; + bool dirty; } Qcow2CachedTable; struct Qcow2Cache { -- 2.1.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH v4 0/3] Clean unused entries in the qcow2 L2/refcount cache @ 2015-05-29 9:24 Alberto Garcia 2015-05-29 9:24 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 0 siblings, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-29 9:24 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi v4: - Revert the 'cache-clean-interval' change. This should probably go into a new BlockDeviceInfoSpecific struct (along with other settings), but is out of the scope for this series. v3: https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg05473.html - Add 'cache-clean-interval' field to ImageInfoSpecificQCow2. v2: https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg05316.html - Clarify that the block-commit mentioned in the first patch refers to the HMP commit command. - Check the value of cache_clean_interval and cast it accordingly to prevent it from overflowing. v1: https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg03510.html Regards, Berto Alberto Garcia (3): qcow2: mark the memory as no longer needed after qcow2_cache_empty() qcow2: add option to clean unused cache entries after some time qcow2: reorder fields in Qcow2CachedTable to reduce padding block/qcow2-cache.c | 57 +++++++++++++++++++++++++++++++++++++++++++++- block/qcow2.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 6 ++++- 4 files changed, 129 insertions(+), 2 deletions(-) -- 2.1.4 ^ permalink raw reply [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-29 9:24 [Qemu-devel] [PATCH v4 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia @ 2015-05-29 9:24 ` Alberto Garcia 2015-06-02 11:04 ` Kevin Wolf 0 siblings, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-29 9:24 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi This adds a new 'cache-clean-interval' option that cleans all qcow2 cache entries that haven't been used in a certain interval, given in seconds. This allows setting a large L2 cache size so it can handle scenarios with lots of I/O and at the same time use little memory during periods of inactivity. This feature currently relies on MADV_DONTNEED to free that memory, so it is not useful in systems that don't follow that behavior. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> --- block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ block/qcow2.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 6 ++++- 4 files changed, 108 insertions(+), 1 deletion(-) diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c index ed14a92..a215f5b 100644 --- a/block/qcow2-cache.c +++ b/block/qcow2-cache.c @@ -43,6 +43,7 @@ struct Qcow2Cache { bool depends_on_flush; void *table_array; uint64_t lru_counter; + uint64_t cache_clean_lru_counter; }; static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs, @@ -78,6 +79,40 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c, #endif } +static inline bool can_clean_entry(Qcow2Cache *c, int i) +{ + Qcow2CachedTable *t = &c->entries[i]; + return t->ref == 0 && !t->dirty && t->offset != 0 && + t->lru_counter <= c->cache_clean_lru_counter; +} + +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c) +{ + int i = 0; + while (i < c->size) { + int to_clean = 0; + + /* Skip the entries that we don't need to clean */ + while (i < c->size && !can_clean_entry(c, i)) { + i++; + } + + /* And count how many we can clean in a row */ + while (i < c->size && can_clean_entry(c, i)) { + c->entries[i].offset = 0; + c->entries[i].lru_counter = 0; + i++; + to_clean++; + } + + if (to_clean > 0) { + qcow2_cache_table_release(bs, c, i - to_clean, to_clean); + } + } + + c->cache_clean_lru_counter = c->lru_counter; +} + Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) { BDRVQcowState *s = bs->opaque; diff --git a/block/qcow2.c b/block/qcow2.c index f7b4cc6..8c565e9 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -468,6 +468,11 @@ static QemuOptsList qcow2_runtime_opts = { .type = QEMU_OPT_SIZE, .help = "Maximum refcount block cache size", }, + { + .name = QCOW2_OPT_CACHE_CLEAN_INTERVAL, + .type = QEMU_OPT_NUMBER, + .help = "Clean unused cache entries after this time (in seconds)", + }, { /* end of list */ } }, }; @@ -483,6 +488,49 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = { [QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2, }; +static void cache_clean_timer_cb(void *opaque) +{ + BlockDriverState *bs = opaque; + BDRVQcowState *s = bs->opaque; + qcow2_cache_clean_unused(bs, s->l2_table_cache); + qcow2_cache_clean_unused(bs, s->refcount_block_cache); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + (int64_t) s->cache_clean_interval * 1000); +} + +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_interval > 0) { + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, + SCALE_MS, cache_clean_timer_cb, + bs); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + (int64_t) s->cache_clean_interval * 1000); + } +} + +static void cache_clean_timer_del(BlockDriverState *bs) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_timer) { + timer_del(s->cache_clean_timer); + timer_free(s->cache_clean_timer); + s->cache_clean_timer = NULL; + } +} + +static void qcow2_detach_aio_context(BlockDriverState *bs) +{ + cache_clean_timer_del(bs); +} + +static void qcow2_attach_aio_context(BlockDriverState *bs, + AioContext *new_context) +{ + cache_clean_timer_init(bs, new_context); +} + static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size, uint64_t *refcount_cache_size, Error **errp) { @@ -552,6 +600,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, const char *opt_overlap_check, *opt_overlap_check_template; int overlap_check_template = 0; uint64_t l2_cache_size, refcount_cache_size; + uint64_t cache_clean_interval; ret = bdrv_pread(bs->file, 0, &header, sizeof(header)); if (ret < 0) { @@ -839,6 +888,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } + cache_clean_interval = + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); + if (cache_clean_interval > UINT_MAX) { + error_setg(errp, "Cache clean interval too big"); + ret = -EINVAL; + goto fail; + } + s->cache_clean_interval = cache_clean_interval; + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); + s->cluster_cache = g_malloc(s->cluster_size); /* one more sector for decompressed data alignment */ s->cluster_data = qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS @@ -1004,6 +1063,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, qemu_vfree(s->l1_table); /* else pre-write overlap checks in cache_destroy may crash */ s->l1_table = NULL; + cache_clean_timer_del(bs); if (s->l2_table_cache) { qcow2_cache_destroy(bs, s->l2_table_cache); } @@ -1458,6 +1518,7 @@ static void qcow2_close(BlockDriverState *bs) } } + cache_clean_timer_del(bs); qcow2_cache_destroy(bs, s->l2_table_cache); qcow2_cache_destroy(bs, s->refcount_block_cache); @@ -2970,6 +3031,9 @@ BlockDriver bdrv_qcow2 = { .create_opts = &qcow2_create_opts, .bdrv_check = qcow2_check, .bdrv_amend_options = qcow2_amend_options, + + .bdrv_detach_aio_context = qcow2_detach_aio_context, + .bdrv_attach_aio_context = qcow2_attach_aio_context, }; static void bdrv_qcow2_init(void) diff --git a/block/qcow2.h b/block/qcow2.h index 0076512..2c45eb2 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -93,6 +93,7 @@ #define QCOW2_OPT_CACHE_SIZE "cache-size" #define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size" #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size" +#define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval" typedef struct QCowHeader { uint32_t magic; @@ -236,6 +237,8 @@ typedef struct BDRVQcowState { Qcow2Cache* l2_table_cache; Qcow2Cache* refcount_block_cache; + QEMUTimer *cache_clean_timer; + unsigned cache_clean_interval; uint8_t *cluster_cache; uint8_t *cluster_data; @@ -581,6 +584,7 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, Qcow2Cache *dependency); void qcow2_cache_depends_on_flush(Qcow2Cache *c); +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, diff --git a/qapi/block-core.json b/qapi/block-core.json index 863ffea..f42338d 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1538,6 +1538,9 @@ # @refcount-cache-size: #optional the maximum size of the refcount block cache # in bytes (since 2.2) # +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount +# caches. The interval is in seconds (since 2.4) +# # Since: 1.7 ## { 'struct': 'BlockdevOptionsQcow2', @@ -1549,7 +1552,8 @@ '*overlap-check': 'Qcow2OverlapChecks', '*cache-size': 'int', '*l2-cache-size': 'int', - '*refcount-cache-size': 'int' } } + '*refcount-cache-size': 'int', + '*cache-clean-interval': 'int' } } ## -- 2.1.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-29 9:24 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia @ 2015-06-02 11:04 ` Kevin Wolf 0 siblings, 0 replies; 29+ messages in thread From: Kevin Wolf @ 2015-06-02 11:04 UTC (permalink / raw) To: Alberto Garcia; +Cc: qemu-block, qemu-devel, Stefan Hajnoczi, Max Reitz Am 29.05.2015 um 11:24 hat Alberto Garcia geschrieben: > This adds a new 'cache-clean-interval' option that cleans all qcow2 > cache entries that haven't been used in a certain interval, given in > seconds. > > This allows setting a large L2 cache size so it can handle scenarios > with lots of I/O and at the same time use little memory during periods > of inactivity. > > This feature currently relies on MADV_DONTNEED to free that memory, so > it is not useful in systems that don't follow that behavior. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > Reviewed-by: Max Reitz <mreitz@redhat.com> > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 863ffea..f42338d 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -1538,6 +1538,9 @@ > # @refcount-cache-size: #optional the maximum size of the refcount block cache > # in bytes (since 2.2) > # > +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount > +# caches. The interval is in seconds (since 2.4) Should add that 0 means disabling the cleaning, and 0 is the default. Kevin ^ permalink raw reply [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH v2 0/3] Clean unused entries in the qcow2 L2/refcount cache @ 2015-05-26 17:14 Alberto Garcia 2015-05-26 17:14 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 0 siblings, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-26 17:14 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi New version of the series. v2: - Clarify that the block-commit mentioned in the first patch refers to the HMP commit command. - Check the value of cache_clean_interval and cast it accordingly to prevent it from overflowing. v1: https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg03510.html Regards, Berto Alberto Garcia (3): qcow2: mark the memory as no longer needed after qcow2_cache_empty() qcow2: add option to clean unused cache entries after some time qcow2: reorder fields in Qcow2CachedTable to reduce padding block/qcow2-cache.c | 57 +++++++++++++++++++++++++++++++++++++++++++++- block/qcow2.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 6 ++++- 4 files changed, 129 insertions(+), 2 deletions(-) -- 2.1.4 ^ permalink raw reply [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-26 17:14 [Qemu-devel] [PATCH v2 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia @ 2015-05-26 17:14 ` Alberto Garcia 2015-05-26 17:26 ` Max Reitz 2015-05-26 18:52 ` Eric Blake 0 siblings, 2 replies; 29+ messages in thread From: Alberto Garcia @ 2015-05-26 17:14 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi This adds a new 'cache-clean-interval' option that cleans all qcow2 cache entries that haven't been used in a certain interval, given in seconds. This allows setting a large L2 cache size so it can handle scenarios with lots of I/O and at the same time use little memory during periods of inactivity. This feature currently relies on MADV_DONTNEED to free that memory, so it is not useful in systems that don't follow that behavior. Signed-off-by: Alberto Garcia <berto@igalia.com> --- block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ block/qcow2.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 6 ++++- 4 files changed, 108 insertions(+), 1 deletion(-) diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c index ed14a92..a215f5b 100644 --- a/block/qcow2-cache.c +++ b/block/qcow2-cache.c @@ -43,6 +43,7 @@ struct Qcow2Cache { bool depends_on_flush; void *table_array; uint64_t lru_counter; + uint64_t cache_clean_lru_counter; }; static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs, @@ -78,6 +79,40 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c, #endif } +static inline bool can_clean_entry(Qcow2Cache *c, int i) +{ + Qcow2CachedTable *t = &c->entries[i]; + return t->ref == 0 && !t->dirty && t->offset != 0 && + t->lru_counter <= c->cache_clean_lru_counter; +} + +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c) +{ + int i = 0; + while (i < c->size) { + int to_clean = 0; + + /* Skip the entries that we don't need to clean */ + while (i < c->size && !can_clean_entry(c, i)) { + i++; + } + + /* And count how many we can clean in a row */ + while (i < c->size && can_clean_entry(c, i)) { + c->entries[i].offset = 0; + c->entries[i].lru_counter = 0; + i++; + to_clean++; + } + + if (to_clean > 0) { + qcow2_cache_table_release(bs, c, i - to_clean, to_clean); + } + } + + c->cache_clean_lru_counter = c->lru_counter; +} + Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) { BDRVQcowState *s = bs->opaque; diff --git a/block/qcow2.c b/block/qcow2.c index f7b4cc6..8c565e9 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -468,6 +468,11 @@ static QemuOptsList qcow2_runtime_opts = { .type = QEMU_OPT_SIZE, .help = "Maximum refcount block cache size", }, + { + .name = QCOW2_OPT_CACHE_CLEAN_INTERVAL, + .type = QEMU_OPT_NUMBER, + .help = "Clean unused cache entries after this time (in seconds)", + }, { /* end of list */ } }, }; @@ -483,6 +488,49 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = { [QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2, }; +static void cache_clean_timer_cb(void *opaque) +{ + BlockDriverState *bs = opaque; + BDRVQcowState *s = bs->opaque; + qcow2_cache_clean_unused(bs, s->l2_table_cache); + qcow2_cache_clean_unused(bs, s->refcount_block_cache); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + (int64_t) s->cache_clean_interval * 1000); +} + +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_interval > 0) { + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, + SCALE_MS, cache_clean_timer_cb, + bs); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + (int64_t) s->cache_clean_interval * 1000); + } +} + +static void cache_clean_timer_del(BlockDriverState *bs) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_timer) { + timer_del(s->cache_clean_timer); + timer_free(s->cache_clean_timer); + s->cache_clean_timer = NULL; + } +} + +static void qcow2_detach_aio_context(BlockDriverState *bs) +{ + cache_clean_timer_del(bs); +} + +static void qcow2_attach_aio_context(BlockDriverState *bs, + AioContext *new_context) +{ + cache_clean_timer_init(bs, new_context); +} + static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size, uint64_t *refcount_cache_size, Error **errp) { @@ -552,6 +600,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, const char *opt_overlap_check, *opt_overlap_check_template; int overlap_check_template = 0; uint64_t l2_cache_size, refcount_cache_size; + uint64_t cache_clean_interval; ret = bdrv_pread(bs->file, 0, &header, sizeof(header)); if (ret < 0) { @@ -839,6 +888,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } + cache_clean_interval = + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); + if (cache_clean_interval > UINT_MAX) { + error_setg(errp, "Cache clean interval too big"); + ret = -EINVAL; + goto fail; + } + s->cache_clean_interval = cache_clean_interval; + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); + s->cluster_cache = g_malloc(s->cluster_size); /* one more sector for decompressed data alignment */ s->cluster_data = qemu_try_blockalign(bs->file, QCOW_MAX_CRYPT_CLUSTERS @@ -1004,6 +1063,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, qemu_vfree(s->l1_table); /* else pre-write overlap checks in cache_destroy may crash */ s->l1_table = NULL; + cache_clean_timer_del(bs); if (s->l2_table_cache) { qcow2_cache_destroy(bs, s->l2_table_cache); } @@ -1458,6 +1518,7 @@ static void qcow2_close(BlockDriverState *bs) } } + cache_clean_timer_del(bs); qcow2_cache_destroy(bs, s->l2_table_cache); qcow2_cache_destroy(bs, s->refcount_block_cache); @@ -2970,6 +3031,9 @@ BlockDriver bdrv_qcow2 = { .create_opts = &qcow2_create_opts, .bdrv_check = qcow2_check, .bdrv_amend_options = qcow2_amend_options, + + .bdrv_detach_aio_context = qcow2_detach_aio_context, + .bdrv_attach_aio_context = qcow2_attach_aio_context, }; static void bdrv_qcow2_init(void) diff --git a/block/qcow2.h b/block/qcow2.h index 0076512..2c45eb2 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -93,6 +93,7 @@ #define QCOW2_OPT_CACHE_SIZE "cache-size" #define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size" #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size" +#define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval" typedef struct QCowHeader { uint32_t magic; @@ -236,6 +237,8 @@ typedef struct BDRVQcowState { Qcow2Cache* l2_table_cache; Qcow2Cache* refcount_block_cache; + QEMUTimer *cache_clean_timer; + unsigned cache_clean_interval; uint8_t *cluster_cache; uint8_t *cluster_data; @@ -581,6 +584,7 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, Qcow2Cache *dependency); void qcow2_cache_depends_on_flush(Qcow2Cache *c); +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, diff --git a/qapi/block-core.json b/qapi/block-core.json index 863ffea..f42338d 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1538,6 +1538,9 @@ # @refcount-cache-size: #optional the maximum size of the refcount block cache # in bytes (since 2.2) # +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount +# caches. The interval is in seconds (since 2.4) +# # Since: 1.7 ## { 'struct': 'BlockdevOptionsQcow2', @@ -1549,7 +1552,8 @@ '*overlap-check': 'Qcow2OverlapChecks', '*cache-size': 'int', '*l2-cache-size': 'int', - '*refcount-cache-size': 'int' } } + '*refcount-cache-size': 'int', + '*cache-clean-interval': 'int' } } ## -- 2.1.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-26 17:14 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia @ 2015-05-26 17:26 ` Max Reitz 2015-05-26 18:52 ` Eric Blake 1 sibling, 0 replies; 29+ messages in thread From: Max Reitz @ 2015-05-26 17:26 UTC (permalink / raw) To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi On 26.05.2015 19:14, Alberto Garcia wrote: > This adds a new 'cache-clean-interval' option that cleans all qcow2 > cache entries that haven't been used in a certain interval, given in > seconds. > > This allows setting a large L2 cache size so it can handle scenarios > with lots of I/O and at the same time use little memory during periods > of inactivity. > > This feature currently relies on MADV_DONTNEED to free that memory, so > it is not useful in systems that don't follow that behavior. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > --- > block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ > block/qcow2.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > block/qcow2.h | 4 ++++ > qapi/block-core.json | 6 ++++- > 4 files changed, 108 insertions(+), 1 deletion(-) Reviewed-by: Max Reitz <mreitz@redhat.com> ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-26 17:14 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 2015-05-26 17:26 ` Max Reitz @ 2015-05-26 18:52 ` Eric Blake 2015-05-26 19:10 ` Alberto Garcia 1 sibling, 1 reply; 29+ messages in thread From: Eric Blake @ 2015-05-26 18:52 UTC (permalink / raw) To: Alberto Garcia, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi, Max Reitz [-- Attachment #1: Type: text/plain, Size: 1451 bytes --] On 05/26/2015 11:14 AM, Alberto Garcia wrote: > This adds a new 'cache-clean-interval' option that cleans all qcow2 > cache entries that haven't been used in a certain interval, given in > seconds. > > This allows setting a large L2 cache size so it can handle scenarios > with lots of I/O and at the same time use little memory during periods > of inactivity. > > This feature currently relies on MADV_DONTNEED to free that memory, so > it is not useful in systems that don't follow that behavior. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > --- > block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++ > block/qcow2.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > block/qcow2.h | 4 ++++ > qapi/block-core.json | 6 ++++- > 4 files changed, 108 insertions(+), 1 deletion(-) > > +++ b/qapi/block-core.json > @@ -1538,6 +1538,9 @@ > # @refcount-cache-size: #optional the maximum size of the refcount block cache > # in bytes (since 2.2) > # > +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount > +# caches. The interval is in seconds (since 2.4) > +# > # Since: 1.7 Is there any QMP command that can query the current interval? I'm not a big fan of write-only interfaces. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-26 18:52 ` Eric Blake @ 2015-05-26 19:10 ` Alberto Garcia 2015-05-26 19:15 ` Eric Blake 0 siblings, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-26 19:10 UTC (permalink / raw) To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi, Max Reitz On Tue 26 May 2015 08:52:41 PM CEST, Eric Blake <eblake@redhat.com> wrote: >> +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount >> +# caches. The interval is in seconds (since 2.4) >> +# >> # Since: 1.7 > > Is there any QMP command that can query the current interval? I'm not > a big fan of write-only interfaces. That's a good suggestion, thanks. Do we have any API where we could add this? I would rather not have a separate command just for this option, if possible. Berto ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-26 19:10 ` Alberto Garcia @ 2015-05-26 19:15 ` Eric Blake 0 siblings, 0 replies; 29+ messages in thread From: Eric Blake @ 2015-05-26 19:15 UTC (permalink / raw) To: Alberto Garcia, qemu-devel Cc: Kevin Wolf, qemu-block, Stefan Hajnoczi, Max Reitz [-- Attachment #1: Type: text/plain, Size: 979 bytes --] On 05/26/2015 01:10 PM, Alberto Garcia wrote: > On Tue 26 May 2015 08:52:41 PM CEST, Eric Blake <eblake@redhat.com> wrote: > >>> +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount >>> +# caches. The interval is in seconds (since 2.4) >>> +# >>> # Since: 1.7 >> >> Is there any QMP command that can query the current interval? I'm not >> a big fan of write-only interfaces. > > That's a good suggestion, thanks. > > Do we have any API where we could add this? I would rather not have a > separate command just for this option, if possible. I agree that a new command is not necessary, and that the best is to modify an existing command. It looks like ImageInfoSpecificQCow2 is the corresponding struct to edit, at which point ImageInfo (part of query-block) will automatically expose it. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH 0/3] Clean unused entries in the qcow2 L2/refcount cache @ 2015-05-18 16:48 Alberto Garcia 2015-05-18 16:48 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 0 siblings, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-18 16:48 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi These patches use MADV_DONTNEED to clean unused cache entries. Under Linux it frees the memory used by those pages. Berto Alberto Garcia (3): qcow2: mark the memory as no longer needed after qcow2_cache_empty() qcow2: add option to clean unused cache entries after some time qcow2: reorder fields in Qcow2CachedTable to reduce padding block/qcow2-cache.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++- block/qcow2.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 6 +++++- 4 files changed, 121 insertions(+), 2 deletions(-) -- 2.1.4 ^ permalink raw reply [flat|nested] 29+ messages in thread
* [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-18 16:48 [Qemu-devel] [PATCH 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia @ 2015-05-18 16:48 ` Alberto Garcia 2015-05-26 16:07 ` Max Reitz 0 siblings, 1 reply; 29+ messages in thread From: Alberto Garcia @ 2015-05-18 16:48 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz, Stefan Hajnoczi This adds a new 'cache-clean-interval' option that cleans all qcow2 cache entries that haven't been used in a certain interval, given in seconds. This allows setting a large L2 cache size so it can handle scenarios with lots of I/O and at the same time use little memory during periods of inactivity. This feature currently relies on MADV_DONTNEED to free that memory, so it is not useful in systems that don't follow that behavior. Signed-off-by: Alberto Garcia <berto@igalia.com> --- block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++++++ block/qcow2.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++ block/qcow2.h | 4 ++++ qapi/block-core.json | 6 +++++- 4 files changed, 100 insertions(+), 1 deletion(-) diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c index ed14a92..a215f5b 100644 --- a/block/qcow2-cache.c +++ b/block/qcow2-cache.c @@ -43,6 +43,7 @@ struct Qcow2Cache { bool depends_on_flush; void *table_array; uint64_t lru_counter; + uint64_t cache_clean_lru_counter; }; static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs, @@ -78,6 +79,40 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c, #endif } +static inline bool can_clean_entry(Qcow2Cache *c, int i) +{ + Qcow2CachedTable *t = &c->entries[i]; + return t->ref == 0 && !t->dirty && t->offset != 0 && + t->lru_counter <= c->cache_clean_lru_counter; +} + +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c) +{ + int i = 0; + while (i < c->size) { + int to_clean = 0; + + /* Skip the entries that we don't need to clean */ + while (i < c->size && !can_clean_entry(c, i)) { + i++; + } + + /* And count how many we can clean in a row */ + while (i < c->size && can_clean_entry(c, i)) { + c->entries[i].offset = 0; + c->entries[i].lru_counter = 0; + i++; + to_clean++; + } + + if (to_clean > 0) { + qcow2_cache_table_release(bs, c, i - to_clean, to_clean); + } + } + + c->cache_clean_lru_counter = c->lru_counter; +} + Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) { BDRVQcowState *s = bs->opaque; diff --git a/block/qcow2.c b/block/qcow2.c index b9a72e3..f6ed39f 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -468,6 +468,11 @@ static QemuOptsList qcow2_runtime_opts = { .type = QEMU_OPT_SIZE, .help = "Maximum refcount block cache size", }, + { + .name = QCOW2_OPT_CACHE_CLEAN_INTERVAL, + .type = QEMU_OPT_NUMBER, + .help = "Clean unused cache entries after this time (in seconds)", + }, { /* end of list */ } }, }; @@ -483,6 +488,49 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = { [QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2, }; +static void cache_clean_timer_cb(void *opaque) +{ + BlockDriverState *bs = opaque; + BDRVQcowState *s = bs->opaque; + qcow2_cache_clean_unused(bs, s->l2_table_cache); + qcow2_cache_clean_unused(bs, s->refcount_block_cache); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + s->cache_clean_interval * 1000); +} + +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_interval > 0) { + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, + SCALE_MS, cache_clean_timer_cb, + bs); + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + s->cache_clean_interval * 1000); + } +} + +static void cache_clean_timer_del(BlockDriverState *bs) +{ + BDRVQcowState *s = bs->opaque; + if (s->cache_clean_timer) { + timer_del(s->cache_clean_timer); + timer_free(s->cache_clean_timer); + s->cache_clean_timer = NULL; + } +} + +static void qcow2_detach_aio_context(BlockDriverState *bs) +{ + cache_clean_timer_del(bs); +} + +static void qcow2_attach_aio_context(BlockDriverState *bs, + AioContext *new_context) +{ + cache_clean_timer_init(bs, new_context); +} + static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size, uint64_t *refcount_cache_size, Error **errp) { @@ -838,6 +886,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, ret = -ENOMEM; goto fail; } + s->cache_clean_interval = + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); s->cluster_cache = g_malloc(s->cluster_size); /* one more sector for decompressed data alignment */ @@ -1004,6 +1055,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, qemu_vfree(s->l1_table); /* else pre-write overlap checks in cache_destroy may crash */ s->l1_table = NULL; + cache_clean_timer_del(bs); if (s->l2_table_cache) { qcow2_cache_destroy(bs, s->l2_table_cache); } @@ -1453,6 +1505,7 @@ static void qcow2_close(BlockDriverState *bs) } } + cache_clean_timer_del(bs); qcow2_cache_destroy(bs, s->l2_table_cache); qcow2_cache_destroy(bs, s->refcount_block_cache); @@ -2964,6 +3017,9 @@ BlockDriver bdrv_qcow2 = { .create_opts = &qcow2_create_opts, .bdrv_check = qcow2_check, .bdrv_amend_options = qcow2_amend_options, + + .bdrv_detach_aio_context = qcow2_detach_aio_context, + .bdrv_attach_aio_context = qcow2_attach_aio_context, }; static void bdrv_qcow2_init(void) diff --git a/block/qcow2.h b/block/qcow2.h index 0076512..2c45eb2 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -93,6 +93,7 @@ #define QCOW2_OPT_CACHE_SIZE "cache-size" #define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size" #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size" +#define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval" typedef struct QCowHeader { uint32_t magic; @@ -236,6 +237,8 @@ typedef struct BDRVQcowState { Qcow2Cache* l2_table_cache; Qcow2Cache* refcount_block_cache; + QEMUTimer *cache_clean_timer; + unsigned cache_clean_interval; uint8_t *cluster_cache; uint8_t *cluster_data; @@ -581,6 +584,7 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, Qcow2Cache *dependency); void qcow2_cache_depends_on_flush(Qcow2Cache *c); +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, diff --git a/qapi/block-core.json b/qapi/block-core.json index 863ffea..f42338d 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1538,6 +1538,9 @@ # @refcount-cache-size: #optional the maximum size of the refcount block cache # in bytes (since 2.2) # +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount +# caches. The interval is in seconds (since 2.4) +# # Since: 1.7 ## { 'struct': 'BlockdevOptionsQcow2', @@ -1549,7 +1552,8 @@ '*overlap-check': 'Qcow2OverlapChecks', '*cache-size': 'int', '*l2-cache-size': 'int', - '*refcount-cache-size': 'int' } } + '*refcount-cache-size': 'int', + '*cache-clean-interval': 'int' } } ## -- 2.1.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time 2015-05-18 16:48 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia @ 2015-05-26 16:07 ` Max Reitz 0 siblings, 0 replies; 29+ messages in thread From: Max Reitz @ 2015-05-26 16:07 UTC (permalink / raw) To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block On 18.05.2015 18:48, Alberto Garcia wrote: > This adds a new 'cache-clean-interval' option that cleans all qcow2 > cache entries that haven't been used in a certain interval, given in > seconds. > > This allows setting a large L2 cache size so it can handle scenarios > with lots of I/O and at the same time use little memory during periods > of inactivity. > > This feature currently relies on MADV_DONTNEED to free that memory, so > it is not useful in systems that don't follow that behavior. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > --- > block/qcow2-cache.c | 35 ++++++++++++++++++++++++++++++++ > block/qcow2.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > block/qcow2.h | 4 ++++ > qapi/block-core.json | 6 +++++- > 4 files changed, 100 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c > index ed14a92..a215f5b 100644 > --- a/block/qcow2-cache.c > +++ b/block/qcow2-cache.c > @@ -43,6 +43,7 @@ struct Qcow2Cache { > bool depends_on_flush; > void *table_array; > uint64_t lru_counter; > + uint64_t cache_clean_lru_counter; > }; > > static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs, > @@ -78,6 +79,40 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c, > #endif > } > > +static inline bool can_clean_entry(Qcow2Cache *c, int i) > +{ > + Qcow2CachedTable *t = &c->entries[i]; > + return t->ref == 0 && !t->dirty && t->offset != 0 && > + t->lru_counter <= c->cache_clean_lru_counter; > +} > + > +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c) > +{ > + int i = 0; > + while (i < c->size) { > + int to_clean = 0; > + > + /* Skip the entries that we don't need to clean */ > + while (i < c->size && !can_clean_entry(c, i)) { > + i++; > + } > + > + /* And count how many we can clean in a row */ > + while (i < c->size && can_clean_entry(c, i)) { > + c->entries[i].offset = 0; > + c->entries[i].lru_counter = 0; > + i++; > + to_clean++; > + } > + > + if (to_clean > 0) { > + qcow2_cache_table_release(bs, c, i - to_clean, to_clean); > + } > + } > + > + c->cache_clean_lru_counter = c->lru_counter; > +} > + > Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) > { > BDRVQcowState *s = bs->opaque; > diff --git a/block/qcow2.c b/block/qcow2.c > index b9a72e3..f6ed39f 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -468,6 +468,11 @@ static QemuOptsList qcow2_runtime_opts = { > .type = QEMU_OPT_SIZE, > .help = "Maximum refcount block cache size", > }, > + { > + .name = QCOW2_OPT_CACHE_CLEAN_INTERVAL, > + .type = QEMU_OPT_NUMBER, > + .help = "Clean unused cache entries after this time (in seconds)", > + }, > { /* end of list */ } > }, > }; > @@ -483,6 +488,49 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = { > [QCOW2_OL_INACTIVE_L2_BITNR] = QCOW2_OPT_OVERLAP_INACTIVE_L2, > }; > > +static void cache_clean_timer_cb(void *opaque) > +{ > + BlockDriverState *bs = opaque; > + BDRVQcowState *s = bs->opaque; > + qcow2_cache_clean_unused(bs, s->l2_table_cache); > + qcow2_cache_clean_unused(bs, s->refcount_block_cache); > + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + > + s->cache_clean_interval * 1000); > +} > + > +static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context) > +{ > + BDRVQcowState *s = bs->opaque; > + if (s->cache_clean_interval > 0) { > + s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL, > + SCALE_MS, cache_clean_timer_cb, > + bs); > + timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + > + s->cache_clean_interval * 1000); Can overflow. > + } > +} > + > +static void cache_clean_timer_del(BlockDriverState *bs) > +{ > + BDRVQcowState *s = bs->opaque; > + if (s->cache_clean_timer) { > + timer_del(s->cache_clean_timer); > + timer_free(s->cache_clean_timer); > + s->cache_clean_timer = NULL; > + } > +} > + > +static void qcow2_detach_aio_context(BlockDriverState *bs) > +{ > + cache_clean_timer_del(bs); > +} > + > +static void qcow2_attach_aio_context(BlockDriverState *bs, > + AioContext *new_context) > +{ > + cache_clean_timer_init(bs, new_context); > +} > + > static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size, > uint64_t *refcount_cache_size, Error **errp) > { > @@ -838,6 +886,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, > ret = -ENOMEM; > goto fail; > } > + s->cache_clean_interval = > + qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0); Can overflow. Apart from these two things (which are by no means really bad, but it'd better not to introduce them in the first place...), the patch looks good. Max > + cache_clean_timer_init(bs, bdrv_get_aio_context(bs)); > > s->cluster_cache = g_malloc(s->cluster_size); > /* one more sector for decompressed data alignment */ > @@ -1004,6 +1055,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, > qemu_vfree(s->l1_table); > /* else pre-write overlap checks in cache_destroy may crash */ > s->l1_table = NULL; > + cache_clean_timer_del(bs); > if (s->l2_table_cache) { > qcow2_cache_destroy(bs, s->l2_table_cache); > } > @@ -1453,6 +1505,7 @@ static void qcow2_close(BlockDriverState *bs) > } > } > > + cache_clean_timer_del(bs); > qcow2_cache_destroy(bs, s->l2_table_cache); > qcow2_cache_destroy(bs, s->refcount_block_cache); > > @@ -2964,6 +3017,9 @@ BlockDriver bdrv_qcow2 = { > .create_opts = &qcow2_create_opts, > .bdrv_check = qcow2_check, > .bdrv_amend_options = qcow2_amend_options, > + > + .bdrv_detach_aio_context = qcow2_detach_aio_context, > + .bdrv_attach_aio_context = qcow2_attach_aio_context, > }; > > static void bdrv_qcow2_init(void) > diff --git a/block/qcow2.h b/block/qcow2.h > index 0076512..2c45eb2 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -93,6 +93,7 @@ > #define QCOW2_OPT_CACHE_SIZE "cache-size" > #define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size" > #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size" > +#define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval" > > typedef struct QCowHeader { > uint32_t magic; > @@ -236,6 +237,8 @@ typedef struct BDRVQcowState { > > Qcow2Cache* l2_table_cache; > Qcow2Cache* refcount_block_cache; > + QEMUTimer *cache_clean_timer; > + unsigned cache_clean_interval; > > uint8_t *cluster_cache; > uint8_t *cluster_data; > @@ -581,6 +584,7 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, > Qcow2Cache *dependency); > void qcow2_cache_depends_on_flush(Qcow2Cache *c); > > +void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c); > int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c); > > int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 863ffea..f42338d 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -1538,6 +1538,9 @@ > # @refcount-cache-size: #optional the maximum size of the refcount block cache > # in bytes (since 2.2) > # > +# @cache-clean-interval: #optional clean unused entries in the L2 and refcount > +# caches. The interval is in seconds (since 2.4) > +# > # Since: 1.7 > ## > { 'struct': 'BlockdevOptionsQcow2', > @@ -1549,7 +1552,8 @@ > '*overlap-check': 'Qcow2OverlapChecks', > '*cache-size': 'int', > '*l2-cache-size': 'int', > - '*refcount-cache-size': 'int' } } > + '*refcount-cache-size': 'int', > + '*cache-clean-interval': 'int' } } > > > ## ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2015-06-02 11:04 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-05-27 9:46 [Qemu-devel] [PATCH v3 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 1/3] qcow2: mark the memory as no longer needed after qcow2_cache_empty() Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 2015-05-27 12:27 ` Eric Blake 2015-05-27 14:34 ` Alberto Garcia 2015-05-28 14:47 ` Max Reitz 2015-05-28 14:56 ` Max Reitz 2015-05-28 15:02 ` Alberto Garcia 2015-05-28 15:04 ` Max Reitz 2015-05-28 15:06 ` Max Reitz 2015-05-28 15:13 ` Eric Blake 2015-05-28 15:14 ` Max Reitz 2015-05-28 15:19 ` Alberto Garcia 2015-05-28 15:23 ` Max Reitz 2015-05-28 15:30 ` Alberto Garcia 2015-05-28 19:41 ` Eric Blake 2015-05-29 8:32 ` Kevin Wolf 2015-05-28 16:44 ` Kevin Wolf 2015-05-28 17:03 ` Alberto Garcia 2015-05-27 9:46 ` [Qemu-devel] [PATCH 3/3] qcow2: reorder fields in Qcow2CachedTable to reduce padding Alberto Garcia -- strict thread matches above, loose matches on Subject: below -- 2015-05-29 9:24 [Qemu-devel] [PATCH v4 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia 2015-05-29 9:24 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 2015-06-02 11:04 ` Kevin Wolf 2015-05-26 17:14 [Qemu-devel] [PATCH v2 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia 2015-05-26 17:14 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 2015-05-26 17:26 ` Max Reitz 2015-05-26 18:52 ` Eric Blake 2015-05-26 19:10 ` Alberto Garcia 2015-05-26 19:15 ` Eric Blake 2015-05-18 16:48 [Qemu-devel] [PATCH 0/3] Clean unused entries in the qcow2 L2/refcount cache Alberto Garcia 2015-05-18 16:48 ` [Qemu-devel] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time Alberto Garcia 2015-05-26 16:07 ` Max Reitz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).