* qemu CI & ccache: cache size is too small
@ 2024-05-27 10:49 Michael Tokarev
2024-05-27 11:19 ` Thomas Huth
2024-06-03 11:25 ` Daniel P. Berrangé
0 siblings, 2 replies; 6+ messages in thread
From: Michael Tokarev @ 2024-05-27 10:49 UTC (permalink / raw)
To: QEMU Developers, Daniel P. Berrange
Hi!
Noticed today that a rebuild of basically the same tree (a few commits apart)
in CI results in a ccache hit rate of just 11%:
https://gitlab.com/mjt0k/qemu/-/jobs/6947445337#L5054
while it should be near 100%. What's interesting in there is:
1) cache size is close to max cache size,
and more important,
2) cleanups performed 78
so it has to remove old entries before it finished the build.
So effectively, our ccache usage is an extra burden, not help.
It should be increased at least, I think. But it's actually difficult
to say: is the cache shared between all builds, or is it unique
for each build config? If it is the former, it shouldn't even
work, since different ccache versions use different formats for the files
in the cache.
What's unique in my pipeline run - I ran just a single build job
in two pipelines, nothing more.
Thanks,
/mjt
--
GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
New key: rsa4096/61AD3D98ECDF2C8E 9D8B E14E 3F2A 9DD7 9199 28F1 61AD 3D98 ECDF 2C8E
Old key: rsa2048/457CE0A0804465C5 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt
* Re: qemu CI & ccache: cache size is too small
2024-05-27 10:49 qemu CI & ccache: cache size is too small Michael Tokarev
@ 2024-05-27 11:19 ` Thomas Huth
2024-05-27 11:38 ` Michael Tokarev
2024-06-03 11:25 ` Daniel P. Berrangé
1 sibling, 1 reply; 6+ messages in thread
From: Thomas Huth @ 2024-05-27 11:19 UTC (permalink / raw)
To: qemu-devel; +Cc: Stefan Hajnoczi, Daniel P. Berrange
On 27/05/2024 12.49, Michael Tokarev wrote:
> Hi!
>
> Noticed today that a rebuild of basically the same tree (a few commits apart)
> in CI results in a ccache hit rate of just 11%:
>
> https://gitlab.com/mjt0k/qemu/-/jobs/6947445337#L5054
For me, the results look better:
https://gitlab.com/thuth/qemu/-/jobs/6918599017#L4954
> while it should be near 100%. What's interesting in there is:
>
> 1) cache size is close to max cache size,
> and more important,
> 2) cleanups performed 78
>
> so it has to remove old entries before it finished the build.
Did you maybe switch between master and stable branches before that run? ...
I guess that could have invalidated most of the cached files since we
switched from CentOS 8 to 9 recently...?
Thomas
* Re: qemu CI & ccache: cache size is too small
2024-05-27 11:19 ` Thomas Huth
@ 2024-05-27 11:38 ` Michael Tokarev
2024-06-03 11:29 ` Daniel P. Berrangé
0 siblings, 1 reply; 6+ messages in thread
From: Michael Tokarev @ 2024-05-27 11:38 UTC (permalink / raw)
To: Thomas Huth, qemu-devel; +Cc: Stefan Hajnoczi, Daniel P. Berrange
27.05.2024 14:19, Thomas Huth wrote:
> On 27/05/2024 12.49, Michael Tokarev wrote:
>> Hi!
>>
>> Noticed today that a rebuild of basically the same tree (a few commits apart)
>> in CI results in a ccache hit rate of just 11%:
>>
>> https://gitlab.com/mjt0k/qemu/-/jobs/6947445337#L5054
>
> For me, the results look better:
>
> https://gitlab.com/thuth/qemu/-/jobs/6918599017#L4954
Yeah, it's a bit better, but still not good enough.
I dunno how many changes the source had between the two runs.
It still had 11 cleanups, and the cache size is at the same level.
(It is an older ccache, too).
>> while it should be near 100%. What's interesting in there is:
>>
>> 1) cache size is close to max cache size,
>> and more important,
>> 2) cleanups performed 78
>>
>> so it has to remove old entries before it finished the build.
>
> Did you maybe switch between master and stable branches before that run? ... I guess that could have invalidated most of the cached files since we
> switched from CentOS 8 to 9 recently...?
Nope, nothing else ran between the two and it was just a few
source-level commits (stable-8.2 pick ups), without changing
gitlab/containers/etc configuration.
I increased cache size to 900M and did another test run, here are
the results: https://gitlab.com/mjt0k/qemu/-/jobs/6947894974#L5054
cache directory                     /builds/mjt0k/qemu/ccache
primary config                      /builds/mjt0k/qemu/ccache/ccache.conf
secondary config      (readonly)    /etc/ccache.conf
stats updated                       Mon May 27 11:17:44 2024
stats zeroed                        Mon May 27 11:10:22 2024
cache hit (direct)                  1862
cache hit (preprocessed)             274
cache miss                          1219
cache hit rate                     63.67 %
called for link                      285
called for preprocessing              71
compiler produced empty output         5
preprocessor error                     2
no input file                          6
cleanups performed                     0
files in cache                      9948
cache size                         654.6 MB
max cache size                     900.0 MB
Keep in mind that the previous run used CCACHE_SIZE=500M and had
multiple cleanups, so 63% is actually more than I'd have expected.
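
The reported hit rate can be reproduced from the counters in the statistics above. The following is a minimal sketch, assuming the rate is simply hits (direct plus preprocessed) divided by hits plus misses; the function and variable names are illustrative and not part of ccache.

```python
# Counters copied from the ccache statistics quoted above.
direct_hits = 1862
preprocessed_hits = 274
misses = 1219


def hit_rate(direct: int, preprocessed: int, missed: int) -> float:
    """Return the hit rate in percent over all cacheable lookups.

    Assumption: rate = hits / (hits + misses), where hits counts both
    direct and preprocessed hits.
    """
    hits = direct + preprocessed
    return 100.0 * hits / (hits + missed)


rate = hit_rate(direct_hits, preprocessed_hits, misses)
print(f"cache hit rate: {rate:.2f} %")
```

Under that assumption the result agrees with the "cache hit rate" line in the job log.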
Thanks,
/mjt
* Re: qemu CI & ccache: cache size is too small
2024-05-27 10:49 qemu CI & ccache: cache size is too small Michael Tokarev
2024-05-27 11:19 ` Thomas Huth
@ 2024-06-03 11:25 ` Daniel P. Berrangé
1 sibling, 0 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2024-06-03 11:25 UTC (permalink / raw)
To: Michael Tokarev; +Cc: QEMU Developers
On Mon, May 27, 2024 at 01:49:41PM +0300, Michael Tokarev wrote:
> Hi!
>
> Noticed today that a rebuild of basically the same tree (a few commits apart)
> in CI results in a ccache hit rate of just 11%:
>
> https://gitlab.com/mjt0k/qemu/-/jobs/6947445337#L5054
>
> while it should be near 100%. What's interesting in there is:
>
> 1) cache size is close to max cache size,
> and more important,
> 2) cleanups performed 78
>
> so it has to remove old entries before it finished the build.
>
> So effectively, our ccache usage is an extra burden, not help.
I think this ends up being different per job. If I try the
'build-system-fedora' job, for example, I get a 99% cache
hit rate, and 0.2 GB usage of cache storage
https://gitlab.com/berrange/qemu/-/jobs/6876054586
  $ ccache --show-stats
  Cacheable calls:    3018 / 3208 (94.08%)
    Hits:               49 / 3018 ( 1.62%)
      Direct:            0 /   49 ( 0.00%)
      Preprocessed:     49 /   49 (100.0%)
    Misses:           2969 / 3018 (98.38%)
  Uncacheable calls:   190 / 3208 ( 5.92%)
  Local storage:
    Cache size (GB):   0.2 /  0.5 (30.55%)
    Hits:               49 / 3018 ( 1.62%)
    Misses:           2969 / 3018 (98.38%)
If I compare the jobs, the big differences are the target lists:
CentOS: '--target-list=ppc64-softmmu or1k-softmmu s390x-softmmu x86_64-softmmu rx-softmmu sh4-softmmu'
Fedora: '--target-list=microblaze-softmmu mips-softmmu xtensa-softmmu m68k-softmmu riscv32-softmmu ppc-softmmu sparc64-softmmu'
And then a few minor things:
CentOS: '--disable-nettle' '--enable-gcrypt' '--enable-vfio-user-server' '--enable-modules' '--enable-trace-backends=dtrace'
Fedora: '--disable-gcrypt' '--enable-nettle'
The crypto options won't make a difference to caching. Modules ought not to
make a difference either, as that just moves some .o files from the executable
into a .so, without adding many more executables.
The trace backends will add quite a few .o files, but I'm not sure that
will impact cache.
IOW, I bet the target list makes the biggest difference to the amount of data
that needs to be cached, which would explain the different cache usage.
I wonder what the picture looks like for cache hits / cache disk usage
across all the other jobs. Is CentOS an outlier, or is Fedora an outlier?
We do want the cache hit rate to be at the 90+% mark if possible, as it has
a big impact on build time.
> It should be increased at least, I think. But it's actually difficult
> to say: is the cache shared between all builds, or is it unique
> for each build config? If it is the former, it shouldn't even
> work, since different ccache versions use different formats for the files
> in the cache.
It is unique per job per buildtest-template.yml:
  cache:
    paths:
      - ccache
    key: "$CI_JOB_NAME"
    when: always
> What's unique in my pipeline run - I ran just a single build job
> in two pipelines, nothing more.
In my test I ran a job, then re-ran it in the same pipeline.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: qemu CI & ccache: cache size is too small
2024-05-27 11:38 ` Michael Tokarev
@ 2024-06-03 11:29 ` Daniel P. Berrangé
2024-06-03 12:09 ` Michael Tokarev
0 siblings, 1 reply; 6+ messages in thread
From: Daniel P. Berrangé @ 2024-06-03 11:29 UTC (permalink / raw)
To: Michael Tokarev; +Cc: Thomas Huth, qemu-devel, Stefan Hajnoczi
On Mon, May 27, 2024 at 02:38:08PM +0300, Michael Tokarev wrote:
> 27.05.2024 14:19, Thomas Huth wrote:
> > On 27/05/2024 12.49, Michael Tokarev wrote:
> > > Hi!
> > >
> > > Noticed today that a rebuild of basically the same tree (a few commits apart)
> > > in CI results in a ccache hit rate of just 11%:
> > >
> > > https://gitlab.com/mjt0k/qemu/-/jobs/6947445337#L5054
> >
> > For me, the results look better:
> >
> > https://gitlab.com/thuth/qemu/-/jobs/6918599017#L4954
>
> Yeah, it's a bit better, but still not good enough.
> I dunno how many changes the source had between the two runs.
> It still had 11 cleanups, and the cache size is at the same level.
> (It is an older ccache, too).
>
> > > while it should be near 100%. What's interesting in there is:
> > >
> > > 1) cache size is close to max cache size,
> > > and more important,
> > > 2) cleanups performed 78
> > >
> > > so it has to remove old entries before it finished the build.
> >
> > Did you maybe switch between master and stable branches before that run?
> > ... I guess that could have invalidated most of the cached files since
> > we switched from CentOS 8 to 9 recently...?
>
> Nope, nothing else ran between the two and it was just a few
> source-level commits (stable-8.2 pick ups), without changing
> gitlab/containers/etc configuration.
>
> I increased cache size to 900M and did another test run, here are
> the results: https://gitlab.com/mjt0k/qemu/-/jobs/6947894974#L5054
>
> cache directory                     /builds/mjt0k/qemu/ccache
> primary config                      /builds/mjt0k/qemu/ccache/ccache.conf
> secondary config      (readonly)    /etc/ccache.conf
> stats updated                       Mon May 27 11:17:44 2024
> stats zeroed                        Mon May 27 11:10:22 2024
> cache hit (direct)                  1862
> cache hit (preprocessed)             274
> cache miss                          1219
> cache hit rate                     63.67 %
> called for link                      285
> called for preprocessing              71
> compiler produced empty output         5
> preprocessor error                     2
> no input file                          6
> cleanups performed                     0
> files in cache                      9948
> cache size                         654.6 MB
> max cache size                     900.0 MB
>
> Keep in mind that the previous run used CCACHE_SIZE=500M and had
> multiple cleanups, so 63% is actually more than I'd have expected.
Given your original job had a cache of 447 MB, and the new cache is 654 MB, the
old cache is 68% of the size of the new cache. So effectively your 63% is
a cache hit rate in the low 90s (63 / 68 is roughly 93%) for what was present.
This would suggest a cache size of 700 MB is more appropriate, unless some
other jobs have even higher usage needs.
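
The estimate above can be checked with a few lines of arithmetic. This is only a sketch, under the assumption (implicit in the reasoning above) that the achievable hits scale linearly with the fraction of the new cache's content that was already present in the old one; the variable names are illustrative.

```python
# Figures taken from the thread; sizes are in MB.
old_cache_mb = 447.0       # cache size after the run with the 500M limit
new_cache_mb = 654.0       # cache size after the run with the 900M limit
observed_hit_rate = 63.67  # percent, from the ccache statistics

# Fraction of the new cache's content that the old cache could have held.
size_ratio = old_cache_mb / new_cache_mb

# Hit rate relative to what was actually present in the old cache.
# Assumption: hits are proportional to the cached bytes available.
effective_hit_rate = observed_hit_rate / size_ratio

print(f"old/new cache size: {100 * size_ratio:.1f} %")
print(f"effective hit rate: {effective_hit_rate:.1f} %")
```

Under that assumption the effective figure comes out at roughly 93%.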
With regards,
Daniel
* Re: qemu CI & ccache: cache size is too small
2024-06-03 11:29 ` Daniel P. Berrangé
@ 2024-06-03 12:09 ` Michael Tokarev
0 siblings, 0 replies; 6+ messages in thread
From: Michael Tokarev @ 2024-06-03 12:09 UTC (permalink / raw)
To: Daniel P. Berrangé; +Cc: Thomas Huth, qemu-devel, Stefan Hajnoczi
03.06.2024 14:29, Daniel P. Berrangé wrote:
> Given your original job had a cache of 447 MB, and the new cache is 654 MB, the
> old cache is 68% of the size of the new cache. So effectively your 63% is
> a cache hit rate in the low 90s (63 / 68 is roughly 93%) for what was present.
Don't forget how old items are evicted from the cache. If we have
N files to compile but the cache can only fit N-1 files, the cache hit ratio
might be near zero, provided we compile files in the same order and the oldest
files get evicted.
When doing the compiles I forgot to reset the cache stats before the second run
(with the larger cache); the hit ratio should've been about 100% there.
So the cache needs to be large enough to hold the WHOLE compilation, plus a fair
bit more so it won't evict things which can be reused in favor of changed
files.
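
The N versus N-1 effect can be illustrated with a toy simulation. This is a minimal sketch assuming a strict least-recently-used cache with equal-sized entries and a fixed compile order; it is a simplified model, not ccache's actual cleanup algorithm, and the function name is hypothetical.

```python
from collections import OrderedDict


def second_build_hit_ratio(num_files: int, capacity: int) -> float:
    """Compile num_files objects twice, in the same order, through a
    strict LRU cache of `capacity` entries, and return the fraction of
    cache hits seen during the second build."""
    cache = OrderedDict()  # keys are file indices, ordered oldest first
    hits = 0
    for build in range(2):
        for f in range(num_files):
            if f in cache:
                cache.move_to_end(f)  # mark as most recently used
                if build == 1:
                    hits += 1
            else:
                cache[f] = None
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict the oldest entry
    return hits / num_files


# One entry short: each miss evicts exactly the file needed next.
print(second_build_hit_ratio(num_files=1000, capacity=999))
# Enough room for the whole build: every lookup in the rebuild hits.
print(second_build_hit_ratio(num_files=1000, capacity=1000))
```

In this model a cache that is one entry too small gives no hits at all on the rebuild, while one that fits the whole build gives only hits, which is why some headroom above the full build size matters.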
> This would suggest a cache size of 700 MB is more appropriate, unless some
> other jobs have even higher usage needs.
Yes, that seems right. I'd keep it at 800MB if possible.
/mjt