* [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Alberto Garcia @ 2015-02-05 12:55 UTC
  To: qemu-devel; Cc: Kevin Wolf, Alberto Garcia, Stefan Hajnoczi

The current algorithm to replace entries from the L2 cache gives
priority to newer hits by dividing the hit count of all existing
entries by two every time there is a cache miss.

However, if there are several cache misses the hit count of the
existing entries can easily go down to 0. This will result in those
entries being replaced even when there are others that have never been
used.

This problem is more noticeable with larger disk images and cache
sizes, since the chances of having several misses before the cache is
full are higher.

If we make sure that the hit count can never go down to 0 again,
unused entries will always have priority.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 904f6b1..b115549 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -253,7 +253,9 @@ static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
 
         /* Give newer hits priority */
         /* TODO Check how to optimize the replacement strategy */
-        c->entries[i].cache_hits /= 2;
+        if (c->entries[i].cache_hits > 1) {
+            c->entries[i].cache_hits /= 2;
+        }
     }
 
     if (min_index == -1) {
-- 
2.1.4
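To see the failure mode described in the commit message in isolation, here is a
minimal standalone sketch (not QEMU code; the entry layout and hit counts are
made-up examples). Three valid entries with some hit history sit in front of two
never-used slots, and the same "first minimum wins" scan used by
qcow2_cache_find_entry_to_replace() picks the victim after a few misses:

/* Standalone illustration of the eviction behaviour described above.
 * Entries 0-2 are valid cached tables, entries 3-4 have never been used
 * (hit count 0).  The hit counts chosen here are arbitrary examples. */
#include <stdio.h>

#define ENTRIES 5

/* Same selection rule as the real code: first entry with the lowest count. */
static int pick_victim(const int *hits)
{
    int min_index = 0;
    for (int i = 1; i < ENTRIES; i++) {
        if (hits[i] < hits[min_index]) {
            min_index = i;
        }
    }
    return min_index;
}

int main(void)
{
    int plain[ENTRIES]   = { 2, 40, 17, 0, 0 };   /* last two slots unused */
    int clamped[ENTRIES] = { 2, 40, 17, 0, 0 };

    for (int miss = 0; miss < 3; miss++) {        /* three cache misses */
        for (int i = 0; i < ENTRIES; i++) {
            plain[i] /= 2;                        /* old behaviour */
            if (clamped[i] > 1) {                 /* patched behaviour */
                clamped[i] /= 2;
            }
        }
    }

    printf("plain halving: victim = entry %d\n", pick_victim(plain));
    printf("clamped at 1:  victim = entry %d\n", pick_victim(clamped));
    return 0;
}

With plain halving, entry 0 decays to 0, ties with the empty slots and is
evicted first because it comes earlier in the array; with the clamp it stays at
1, so the never-used entry 3 is picked instead.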
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Max Reitz @ 2015-02-05 13:48 UTC
  To: Alberto Garcia, qemu-devel; Cc: Kevin Wolf, Stefan Hajnoczi

On 2015-02-05 at 07:55, Alberto Garcia wrote:
> The current algorithm to replace entries from the L2 cache gives
> priority to newer hits by dividing the hit count of all existing
> entries by two every time there is a cache miss.
>
> However, if there are several cache misses the hit count of the
> existing entries can easily go down to 0. This will result in those
> entries being replaced even when there are others that have never been
> used.
>
> This problem is more noticeable with larger disk images and cache
> sizes, since the chances of having several misses before the cache is
> full are higher.
>
> If we make sure that the hit count can never go down to 0 again,
> unused entries will always have priority.
>
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cache.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> index 904f6b1..b115549 100644
> --- a/block/qcow2-cache.c
> +++ b/block/qcow2-cache.c
> @@ -253,7 +253,9 @@ static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
>
>          /* Give newer hits priority */
>          /* TODO Check how to optimize the replacement strategy */
> -        c->entries[i].cache_hits /= 2;
> +        if (c->entries[i].cache_hits > 1) {
> +            c->entries[i].cache_hits /= 2;
> +        }
>      }
>
>      if (min_index == -1) {

Hm, I can't see where the code is actually giving priority to unused
entries. qcow2_cache_find_entry_to_replace() is the only place which
selects the entry to be used, and it does not check entries[i].offset,
which it should in order to determine whether there are unused entries.

Furthermore, I think we should prioritize clean entries over dirty
ones; but I guess that's what the TODO comment is hinting at...

Max
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Alberto Garcia @ 2015-02-05 13:59 UTC
  To: Max Reitz; Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

On Thu, Feb 05, 2015 at 08:48:30AM -0500, Max Reitz wrote:
> > -        c->entries[i].cache_hits /= 2;
> > +        if (c->entries[i].cache_hits > 1) {
> > +            c->entries[i].cache_hits /= 2;
> > +        }
> >      }
> >
> >      if (min_index == -1) {
>
> Hm, I can't see where the code is actually giving priority to unused
> entries. qcow2_cache_find_entry_to_replace() is the only place which
> selects the entry to be used

Yes, and it looks for the one with the lowest cache hit count. That is
the only criterion:

    if (c->entries[i].cache_hits < min_count) {
        min_index = i;
        min_count = c->entries[i].cache_hits;
    }

If there are several with the same hit count then the first one is
chosen.

Since dividing the hit count by two every time there's a cache miss can
make it go down to zero, an existing entry with cache_hits == 0 will
always be chosen before any empty one located afterwards in the array.

By never allowing the hit count to go down to zero, we make sure that
all unused entries are chosen first before a valid one is discarded.

Berto
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Max Reitz @ 2015-02-05 14:03 UTC
  To: Alberto Garcia; Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

On 2015-02-05 at 08:59, Alberto Garcia wrote:
> On Thu, Feb 05, 2015 at 08:48:30AM -0500, Max Reitz wrote:
>
>>> -        c->entries[i].cache_hits /= 2;
>>> +        if (c->entries[i].cache_hits > 1) {
>>> +            c->entries[i].cache_hits /= 2;
>>> +        }
>>>      }
>>>
>>>      if (min_index == -1) {
>>
>> Hm, I can't see where the code is actually giving priority to unused
>> entries. qcow2_cache_find_entry_to_replace() is the only place which
>> selects the entry to be used
>
> Yes, and it looks for the one with the lowest cache hit count. That is
> the only criterion:
>
>     if (c->entries[i].cache_hits < min_count) {
>         min_index = i;
>         min_count = c->entries[i].cache_hits;
>     }
>
> If there are several with the same hit count then the first one is
> chosen.
>
> Since dividing the hit count by two every time there's a cache miss can
> make it go down to zero, an existing entry with cache_hits == 0 will
> always be chosen before any empty one located afterwards in the array.
>
> By never allowing the hit count to go down to zero, we make sure that
> all unused entries are chosen first before a valid one is discarded.

Oh, right. I was wondering because cache_hits is not reset to 0 in
qcow2_cache_entry_flush(); but that function is not meant for emptying
an entry but only for making sure it's not dirty, and
qcow2_cache_empty() indeed sets cache_hits to 0.

Thus:

Reviewed-by: Max Reitz <mreitz@redhat.com>
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Kevin Wolf @ 2015-02-05 14:17 UTC
  To: Alberto Garcia; Cc: qemu-devel, Stefan Hajnoczi, Max Reitz

On 05.02.2015 at 14:59, Alberto Garcia wrote:
> On Thu, Feb 05, 2015 at 08:48:30AM -0500, Max Reitz wrote:
>
> > > -        c->entries[i].cache_hits /= 2;
> > > +        if (c->entries[i].cache_hits > 1) {
> > > +            c->entries[i].cache_hits /= 2;
> > > +        }
> > >      }
> > >
> > >      if (min_index == -1) {
> >
> > Hm, I can't see where the code is actually giving priority to unused
> > entries. qcow2_cache_find_entry_to_replace() is the only place which
> > selects the entry to be used
>
> Yes, and it looks for the one with the lowest cache hit count. That is
> the only criterion:
>
>     if (c->entries[i].cache_hits < min_count) {
>         min_index = i;
>         min_count = c->entries[i].cache_hits;
>     }
>
> If there are several with the same hit count then the first one is
> chosen.
>
> Since dividing the hit count by two every time there's a cache miss can
> make it go down to zero, an existing entry with cache_hits == 0 will
> always be chosen before any empty one located afterwards in the array.
>
> By never allowing the hit count to go down to zero, we make sure that
> all unused entries are chosen first before a valid one is discarded.

But does this actually improve a lot? cache_hits is only 0 for the
first few accesses and it never becomes 0 again after that. The
result might be that soon all the entries have cache_hits == 1, and
we get the same problem as you're describing - only the first few
entries will be reused.

If this happens a lot in practice and we get many more cache misses
than the cache hits needed to keep tables in memory, we may need to
rethink the whole eviction strategy rather than changing just a small
detail.

Do you have any specific workload that is improved with your patch, and
do you have numbers for the effect of the change?

Kevin
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Alberto Garcia @ 2015-02-05 14:42 UTC
  To: Kevin Wolf; Cc: qemu-devel, Stefan Hajnoczi, Max Reitz

On Thu, Feb 05, 2015 at 03:17:19PM +0100, Kevin Wolf wrote:
> > By never allowing the hit count to go down to zero, we make sure
> > that all unused entries are chosen first before a valid one is
> > discarded.
>
> But does this actually improve a lot? cache_hits is only 0 for the
> first few accesses and it never becomes 0 again after that. The
> result might be that soon all the entries have cache_hits == 1, and
> we get the same problem as you're describing - only the first few
> entries will be reused.

I targeted the specific case where the size of the L2 cache is set so
that it's big enough for the whole disk.

I have a setup with a 16GB disk image and a 2MB L2 cache (that's 32
entries). If I do random rw, after two minutes I get an average of
3K IOPS, and more than half (19) of the cache entries are still
unused.

With the patch the cache fills up immediately and I get sustained
~20K IOPS from the very beginning.

Berto
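As a side note on the numbers above: the relation between cache size and
covered disk area follows directly from the qcow2 geometry, assuming the
default 64 KiB cluster size and 8-byte L2 entries (a back-of-the-envelope
sketch, not part of the original message):

/* Back-of-the-envelope check of "2MB L2 cache = 32 entries = 16GB covered",
 * assuming qcow2's default 64 KiB cluster size.  Each L2 table is one
 * cluster of 8-byte entries, so one cached table maps
 * (cluster_size / 8) * cluster_size bytes of guest data. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t cluster_size = 64 * 1024;                    /* 64 KiB       */
    uint64_t entries_per_l2 = cluster_size / 8;           /* 8192 entries */
    uint64_t data_per_l2 = entries_per_l2 * cluster_size; /* 512 MiB      */

    uint64_t cache_size = 2 * 1024 * 1024;                /* 2 MiB        */
    uint64_t cached_tables = cache_size / cluster_size;   /* 32 tables    */

    printf("one L2 table covers %" PRIu64 " MiB of guest data\n",
           data_per_l2 >> 20);
    printf("%" PRIu64 " cached tables cover %" PRIu64 " GiB\n",
           cached_tables, (cached_tables * data_per_l2) >> 30);
    return 0;
}

Each cached L2 table therefore maps 512 MiB of guest data, so 32 tables are
exactly enough for a 16 GB image.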
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Alberto Garcia @ 2015-02-06 9:44 UTC
  To: Kevin Wolf; Cc: qemu-devel, Stefan Hajnoczi, Max Reitz

On Thu, Feb 05, 2015 at 03:17:19PM +0100, Kevin Wolf wrote:
> > By never allowing the hit count to go down to zero, we make sure
> > that all unused entries are chosen first before a valid one is
> > discarded.
>
> But does this actually improve a lot? cache_hits is only 0 for the
> first few accesses and it never becomes 0 again after that. The
> result might be that soon all the entries have cache_hits == 1, and
> we get the same problem as you're describing - only the first few
> entries will be reused.

I wanted to measure the actual impact of this, so I modified the
current code to implement a quick and dirty LRU algorithm and made
some numbers.

First of all, it's true that if the cache is not big enough and there
are a lot of cache misses, the entries being reused are always the
first ones, whereas with LRU all of them are evicted at some point (I
actually measured this and the results are very clear).

However, this doesn't seem to translate into actual performance
improvements in my tests.

I compared all three algorithms (the original one, my patched version
and my LRU version) using a 20GB disk image and different L2 cache
sizes. I tested using fio doing random reads for one minute. Here are
the results:

|---------------+-----------------------------|
|               |         Average IOPS        |
|---------------+----------+---------+--------|
| Cache entries | Original | Patched |   LRU  |
|---------------+----------+---------+--------|
|    16 (8GB)   |   4.0 K  |  4.9 K  |  5.1 K |
|    24 (12GB)  |   4.1 K  |  7.2 K  |  7.3 K |
|    32 (16GB)  |   4.1 K  | 12.8 K  | 12.7 K |
|    40 (20GB)  |   4.0 K  | 64.0 K  | 63.6 K |
|---------------+----------+---------+--------|

And these are the results with a 40GB disk image (I skipped the
original code in this case since its limits are clear):

|---------------+------------------|
|               |   Average IOPS   |
|---------------+---------+--------|
| Cache entries | Patched |   LRU  |
|---------------+---------+--------|
|    16 (8GB)   |  3.7 K  |  3.7 K |
|    40 (20GB)  |  5.4 K  |  5.8 K |
|    72 (36GB)  | 20.8 K  | 21.4 K |
|    78 (39GB)  | 42.6 K  | 42.4 K |
|    80 (40GB)  | 64.2 K  | 64.1 K |
|---------------+---------+--------|

So it seems that under this scenario, if the cache is not big enough
for the whole disk, the eviction algorithm doesn't really make a
difference.

This makes sense: since I'm doing random reads, the previous usage of
a cache entry doesn't have an influence on the probability that it
will be used again.

Under a different scenario the results might be different, but I
don't have a use case with data to prove that. That's why I chose
this simple change to the current algorithm rather than attempting a
complete rewrite.

Berto
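A rough way to see why the eviction policy washes out here (an illustrative
model, not a measurement from the thread): under uniform random reads,
whichever c tables the cache happens to hold, the next request finds its L2
table cached with probability of roughly c/N, where N is the number of tables
needed to cover the whole image, regardless of which c tables the policy kept.
A small sketch for the 20GB case (40 tables):

/* Rough model (illustrative only): expected L2 hit rate for uniform
 * random reads is about cached_tables / total_tables, independent of
 * the eviction policy.  The figures mirror the 20GB case above, which
 * needs 40 L2 tables for full coverage. */
#include <stdio.h>

int main(void)
{
    int total_tables = 40;                     /* 20GB image            */
    int cache_sizes[] = { 16, 24, 32, 40 };    /* cache sizes tested above */

    for (unsigned i = 0; i < sizeof(cache_sizes) / sizeof(cache_sizes[0]); i++) {
        double hit_rate = (double)cache_sizes[i] / total_tables;
        printf("%2d cached tables -> ~%3.0f%% expected L2 hit rate\n",
               cache_sizes[i], hit_rate * 100.0);
    }
    return 0;
}

This is consistent with the measured numbers only jumping sharply once the
cache covers the whole image: until then, misses still dominate the average
latency.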
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Kevin Wolf @ 2015-02-06 11:18 UTC
  To: Alberto Garcia; Cc: qemu-devel, Stefan Hajnoczi, Max Reitz

On 06.02.2015 at 10:44, Alberto Garcia wrote:
> On Thu, Feb 05, 2015 at 03:17:19PM +0100, Kevin Wolf wrote:
>
> > > By never allowing the hit count to go down to zero, we make sure
> > > that all unused entries are chosen first before a valid one is
> > > discarded.
> >
> > But does this actually improve a lot? cache_hits is only 0 for the
> > first few accesses and it never becomes 0 again after that. The
> > result might be that soon all the entries have cache_hits == 1, and
> > we get the same problem as you're describing - only the first few
> > entries will be reused.
>
> I wanted to measure the actual impact of this, so I modified the
> current code to implement a quick and dirty LRU algorithm and made
> some numbers.

Cool, thanks. I think switching to an LRU algorithm is an option that
might make sense for the general case, so this is something to
consider.

> First of all, it's true that if the cache is not big enough and there
> are a lot of cache misses, the entries being reused are always the
> first ones, whereas with LRU all of them are evicted at some point (I
> actually measured this and the results are very clear).

Good to have this confirmed.

> However, this doesn't seem to translate into actual performance
> improvements in my tests.
>
> I compared all three algorithms (the original one, my patched version
> and my LRU version) using a 20GB disk image and different L2 cache
> sizes. I tested using fio doing random reads for one minute. Here are
> the results:
>
> |---------------+-----------------------------|
> |               |         Average IOPS        |
> |---------------+----------+---------+--------|
> | Cache entries | Original | Patched |   LRU  |
> |---------------+----------+---------+--------|
> |    16 (8GB)   |   4.0 K  |  4.9 K  |  5.1 K |
> |    24 (12GB)  |   4.1 K  |  7.2 K  |  7.3 K |
> |    32 (16GB)  |   4.1 K  | 12.8 K  | 12.7 K |
> |    40 (20GB)  |   4.0 K  | 64.0 K  | 63.6 K |
> |---------------+----------+---------+--------|
>
> And these are the results with a 40GB disk image (I skipped the
> original code in this case since its limits are clear):
>
> |---------------+------------------|
> |               |   Average IOPS   |
> |---------------+---------+--------|
> | Cache entries | Patched |   LRU  |
> |---------------+---------+--------|
> |    16 (8GB)   |  3.7 K  |  3.7 K |
> |    40 (20GB)  |  5.4 K  |  5.8 K |
> |    72 (36GB)  | 20.8 K  | 21.4 K |
> |    78 (39GB)  | 42.6 K  | 42.4 K |
> |    80 (40GB)  | 64.2 K  | 64.1 K |
> |---------------+---------+--------|
>
> So it seems that under this scenario, if the cache is not big enough
> for the whole disk, the eviction algorithm doesn't really make a
> difference.
>
> This makes sense: since I'm doing random reads, the previous usage of
> a cache entry doesn't have an influence on the probability that it
> will be used again.

Indeed, random reads across the whole disk are the worst case for
caching. Your patches and LRU both "only" help in that they increase
the share of cached metadata and therefore the chance of randomly
picking something that is cached. It doesn't really matter which part
of the metadata is cached.

> Under a different scenario the results might be different, but I
> don't have a use case with data to prove that. That's why I chose
> this simple change to the current algorithm rather than attempting a
> complete rewrite.

I was going to suggest a test run for a different scenario, like with
many interleaved sequential reads, but thinking a bit more about how
to do it so that the eviction algorithm could make a significant
change, it occurred to me that it might not be all that easy (and
therefore possibly not practically relevant either).

So yes, I think you have convinced me that your patch is a worthwhile
improvement that can be made independently from changes to the
eviction strategy.

Kevin
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Alberto Garcia @ 2015-02-06 13:46 UTC
  To: Kevin Wolf; Cc: qemu-devel, Stefan Hajnoczi, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

On Fri, Feb 06, 2015 at 12:18:07PM +0100, Kevin Wolf wrote:
> > I wanted to measure the actual impact of this, so I modified the
> > current code to implement a quick and dirty LRU algorithm and made
> > some numbers.
>
> Cool, thanks. I think switching to an LRU algorithm is an option
> that might make sense for the general case, so this is something to
> consider.

I'm anyway sharing the implementation in case you want to give it a
look.

> I was going to suggest a test run for a different scenario, like
> with many interleaved sequential reads, but thinking a bit more
> about how to do it so that the eviction algorithm could make a
> significant change, it occurred to me that it might not be all that
> easy (and therefore possibly not practically relevant either).

Yes, that was exactly my point.

Thanks a lot for your comments and quick review,

Berto

[-- Attachment #2: lru.diff --]
[-- Type: text/x-diff, Size: 2819 bytes --]

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index b115549..0dd0676 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -28,11 +28,11 @@
 #include "trace.h"
 
 typedef struct Qcow2CachedTable {
-    void*   table;
-    int64_t offset;
-    bool    dirty;
-    int     cache_hits;
-    int     ref;
+    void*    table;
+    int64_t  offset;
+    bool     dirty;
+    unsigned lru_count;
+    int      ref;
 } Qcow2CachedTable;
 
 struct Qcow2Cache {
@@ -40,6 +40,7 @@ struct Qcow2Cache {
     struct Qcow2Cache* depends;
     int size;
     bool depends_on_flush;
+    unsigned max_lru_count;
 };
 
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
@@ -228,16 +229,18 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
     for (i = 0; i < c->size; i++) {
         assert(c->entries[i].ref == 0);
         c->entries[i].offset = 0;
-        c->entries[i].cache_hits = 0;
+        c->entries[i].lru_count = 0;
     }
 
+    c->max_lru_count = 0;
+
     return 0;
 }
 
 static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
 {
     int i;
-    int min_count = INT_MAX;
+    int min_count = UINT_MAX;
     int min_index = -1;
 
 
@@ -246,15 +249,9 @@ static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
             continue;
         }
 
-        if (c->entries[i].cache_hits < min_count) {
+        if (c->entries[i].lru_count < min_count) {
             min_index = i;
-            min_count = c->entries[i].cache_hits;
-        }
-
-        /* Give newer hits priority */
-        /* TODO Check how to optimize the replacement strategy */
-        if (c->entries[i].cache_hits > 1) {
-            c->entries[i].cache_hits /= 2;
+            min_count = c->entries[i].lru_count;
         }
     }
 
@@ -310,17 +307,26 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
         }
     }
 
-    /* Give the table some hits for the start so that it won't be replaced
-     * immediately. The number 32 is completely arbitrary. */
-    c->entries[i].cache_hits = 32;
+    c->entries[i].lru_count = ++c->max_lru_count;
     c->entries[i].offset = offset;
 
     /* And return the right table */
 found:
-    c->entries[i].cache_hits++;
+    if (c->entries[i].lru_count < c->max_lru_count) {
+        c->entries[i].lru_count = ++c->max_lru_count;
+    }
     c->entries[i].ref++;
     *table = c->entries[i].table;
 
+    /* Reset LRU counters before they overflow */
+    if (c->max_lru_count == UINT_MAX) {
+        unsigned f;
+        for (f = 0; f < c->size; f++) {
+            c->entries[f].lru_count = 0;
+        }
+        c->max_lru_count = 0;
+    }
+
     trace_qcow2_cache_get_done(qemu_coroutine_self(), c == s->l2_table_cache,
                                i);
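Stripped of the QEMU plumbing, the attached diff boils down to an LRU scheme
built on a single global counter. The following is a distilled sketch of that
idea (hypothetical type and function names, not the actual patch): every
access stamps an entry with an increasing counter, the victim is the
unreferenced entry with the smallest stamp, and all stamps are reset before
the counter can overflow.

/* Distilled core of the counter-based LRU in the attached diff -- a sketch
 * of the idea, not the actual QEMU code. */
#include <limits.h>
#include <stdio.h>

typedef struct Entry {
    unsigned lru_count;   /* 0 means never used */
    int ref;              /* entries in use must not be evicted */
} Entry;

typedef struct Cache {
    Entry *entries;
    int size;
    unsigned max_lru_count;
} Cache;

/* Mark entry i as most recently used (mirrors the "found:" path above). */
static void cache_touch(Cache *c, int i)
{
    if (c->entries[i].lru_count < c->max_lru_count ||
        c->entries[i].lru_count == 0) {
        c->entries[i].lru_count = ++c->max_lru_count;
    }
    /* Reset all stamps before the global counter overflows, as the diff
     * above does; the ordering is lost, but only once every UINT_MAX
     * accesses. */
    if (c->max_lru_count == UINT_MAX) {
        for (int f = 0; f < c->size; f++) {
            c->entries[f].lru_count = 0;
        }
        c->max_lru_count = 0;
    }
}

/* Pick the least recently used entry that is not currently referenced. */
static int cache_find_victim(Cache *c)
{
    unsigned min_count = UINT_MAX;
    int min_index = -1;

    for (int i = 0; i < c->size; i++) {
        if (c->entries[i].ref == 0 && c->entries[i].lru_count < min_count) {
            min_index = i;
            min_count = c->entries[i].lru_count;
        }
    }
    return min_index;   /* -1 if every entry is currently referenced */
}

int main(void)
{
    Entry entries[4] = { { 0 } };
    Cache c = { entries, 4, 0 };

    cache_touch(&c, 0);
    cache_touch(&c, 1);
    cache_touch(&c, 0);   /* entry 0 is now the most recently used */

    /* Entries 2 and 3 were never touched, so one of them goes first. */
    printf("victim: entry %d\n", cache_find_victim(&c));
    return 0;
}

The overflow reset throws the ordering away, but it can only happen once every
UINT_MAX accesses, which fits the "quick and dirty" framing in the message
above.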
* Re: [Qemu-devel] [PATCH] block: Give always priority to unused entries in the qcow2 L2 cache
  From: Kevin Wolf @ 2015-02-06 14:09 UTC
  To: Alberto Garcia; Cc: qemu-devel, Stefan Hajnoczi

On 05.02.2015 at 13:55, Alberto Garcia wrote:
> The current algorithm to replace entries from the L2 cache gives
> priority to newer hits by dividing the hit count of all existing
> entries by two every time there is a cache miss.
>
> However, if there are several cache misses the hit count of the
> existing entries can easily go down to 0. This will result in those
> entries being replaced even when there are others that have never been
> used.
>
> This problem is more noticeable with larger disk images and cache
> sizes, since the chances of having several misses before the cache is
> full are higher.
>
> If we make sure that the hit count can never go down to 0 again,
> unused entries will always have priority.
>
> Signed-off-by: Alberto Garcia <berto@igalia.com>

Thanks, applied to the block branch.

Kevin