* sstate compression
From: Phil Blundell @ 2012-01-04 15:05 UTC (permalink / raw)
To: openembedded-core
Has anybody experimented with the effects of using different/no
compression for the sstate packages?
I happened to notice that, for one of my builds, oe was spending a
noticeable amount of time gzipping the tarfiles for sstate. Obviously
this indicates that I should get myself a better computer, but I do
wonder whether there is a different tradeoff that can be made here.
Taking one of my webkit sstate archives as an example, the uncompressed
tarfile is 112M.
gzip (default compression) reduces this to 28M in about 4.2 seconds.
gzip -1 reduces it to 33M in about 1.75 seconds.
lzop reduces it to 40M in about 0.75 seconds.
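As a rough way to reproduce this kind of comparison, here is a minimal Python sketch (not taken from the thread) that times zlib, the DEFLATE implementation gzip is built on, at levels 1, 6 and 9 on an in-memory buffer. Level 6 corresponds to gzip's default. The function names and the synthetic sample are assumptions for illustration only. lzop is not covered because the Python standard library has no LZO binding, and absolute numbers will differ from the ones quoted above.

```python
import time
import zlib


def measure(data, level):
    """Return (compressed size in bytes, elapsed seconds) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Sanity check: the round trip must reproduce the input exactly.
    assert zlib.decompress(compressed) == data
    return len(compressed), elapsed


def compare_levels(data, levels=(1, 6, 9)):
    """Map each compression level to its (size, seconds) measurement."""
    return {level: measure(data, level) for level in levels}


if __name__ == "__main__":
    # Synthetic, highly compressible sample; a real sstate tarball read
    # from disk would give more meaningful numbers.
    sample = b"sstate archive contents " * 200_000
    for level, (size, seconds) in compare_levels(sample).items():
        print(f"level {level}: {size} bytes in {seconds:.3f} s")
```

Running it against the bytes of an actual uncompressed sstate tarball would be the more useful experiment.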
Presumably the same sort of considerations apply to the compressed data
inside .ipks, though of course the file sizes there tend to be rather
smaller so I guess the impact is less.
I'm currently setting GZIP="-1" in my local configuration since that's
easy to arrange and seems to produce a worthwhile benefit. It's not
totally obvious to me whether switching to lzo would be a net win or a
loss, but either way this would involve hacking sstate.bbclass and I was
too lazy to do that so far.
p.
* Re: sstate compression
From: Richard Purdie @ 2012-01-04 15:33 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On Wed, 2012-01-04 at 15:05 +0000, Phil Blundell wrote:
> Has anybody experimented with the effects of using different/no
> compression for the sstate packages?
>
> I happened to notice that, for one of my builds, oe was spending a
> noticeable amount of time gzipping the tarfiles for sstate. Obviously
> this indicates that I should get myself a better computer, but I do
> wonder whether there is a different tradeoff that can be made here.
>
> Taking one of my webkit sstate archives as an example, the uncompressed
> tarfile is 112M.
>
> gzip (default compression) reduces this to 28M in about 4.2 seconds.
> gzip -1 reduces it to 33M in about 1.75 seconds.
> lzop reduces it to 40M in about 0.75 seconds.
>
> Presumably the same sort of considerations apply to the compressed data
> inside .ipks, though of course the file sizes there tend to be rather
> smaller so I guess the impact is less.
>
> I'm currently setting GZIP="-1" in my local configuration since that's
> easy to arrange and seems to produce a worthwhile benefit. It's not
> totally obvious to me whether switching to lzo would be a net win or a
> loss, but either way this would involve hacking sstate.bbclass and I was
> too lazy to do that so far.
There has already been a patch from Matthew, posted on this mailing list,
allowing different compression mechanisms. It's not been merged yet as I'd
really prefer one format and we didn't (and still don't) have good
statistics about what the tradeoffs are.
Where the sstate tarballs are hitting the network, size is probably more
important than speed. There are also a number of outstanding bugs to do
with bootstrap of sstate alongside gzip-native, tar-native and so on,
since things can fail when these are half installed in a sysroot and
something executes using them :/.
Cheers,
Richard
* Re: sstate compression
From: McClintock Matthew-B29882 @ 2012-01-04 16:31 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On Wed, Jan 4, 2012 at 9:33 AM, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
> There has already been a patch allowing different compression mechanisms
> from Matthew posted on this mailing list. It's not been merged yet as I'd
> really prefer one format and we didn't (and still don't) have good
> statistics about what the tradeoffs are.
I think there will always be size/speed tradeoffs and we should keep
multiple types available.
-M
* Re: sstate compression
From: Chris Larson @ 2012-01-04 16:32 UTC (permalink / raw)
To: McClintock Matthew-B29882,
Patches and discussions about the oe-core layer
On Wed, Jan 4, 2012 at 9:31 AM, McClintock Matthew-B29882
<B29882@freescale.com> wrote:
> On Wed, Jan 4, 2012 at 9:33 AM, Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
>> There has already been a patch allowing different compression mechanisms
>> from Matthew posted on this mailing list. It's not been merged yet as I'd
>> really prefer one format and we didn't (and still don't) have good
>> statistics about what the tradeoffs are.
>
> I think there will always be size/speed tradeoffs and we should keep
> multiple types available.
Agreed. Sstate can get truly massive; being able to opt into xz or
something would be lovely for us as well.
--
Christopher Larson
* Re: sstate compression
From: Phil Blundell @ 2012-01-04 16:41 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On Wed, 2012-01-04 at 09:32 -0700, Chris Larson wrote:
> Agreed. Sstate can get truly massive; being able to opt into xz or
> something would be lovely for us as well.
Yes, that would be nice. xz does seem to compress nearly twice as well
as "gzip -9", though it takes about six times as long to do it. (16M
and 55 seconds vs 28M and 9.5 seconds for my webkit testcase.)
Personally I don't care about the disk space as much as the build time,
but I can see that sstate could start to become quite unwieldy if you
have a lot of packages in there.
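To make the arithmetic behind "nearly twice as well" and "about six times as long" explicit, here is a small Python calculation using only the figures quoted above. The variable names are illustrative.

```python
# Figures quoted in this message for the webkit test case.
gzip9_size_mb, gzip9_seconds = 28, 9.5
xz_size_mb, xz_seconds = 16, 55

# How much larger the "gzip -9" output is than the xz output.
size_ratio = gzip9_size_mb / xz_size_mb
# How much longer xz takes than "gzip -9".
time_ratio = xz_seconds / gzip9_seconds

print(f"gzip -9 output is {size_ratio:.2f}x the size of the xz output")  # 1.75x
print(f"xz takes {time_ratio:.1f}x as long as gzip -9")                  # 5.8x
```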
Alternatively, maybe we could have sstate.bbclass accept multiple
compression types when reading (i.e. look for .tar.xz, .tar.gz, .tar.lzo
etc in turn), make it use the fastest reasonable compression when
generating the archives in the first place, and then folks who want to
either put them into long-term storage or send them over slow links can
post-process the sstate-cache by transcoding them into .xz or whatever
format.
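A minimal sketch of what that could look like, written as standalone Python rather than as a change to sstate.bbclass. The suffix order, function names and file layout are assumptions for illustration; .tar.lzo appears only in the lookup because the Python standard library cannot read or write LZO.

```python
import gzip
import lzma
import shutil
from pathlib import Path

# Hypothetical lookup order: smallest format first, then the faster ones.
SUFFIXES = (".tar.xz", ".tar.gz", ".tar.lzo")


def find_archive(directory, stem):
    """Return the first existing archive for stem, trying each suffix in turn."""
    for suffix in SUFFIXES:
        candidate = Path(directory) / (stem + suffix)
        if candidate.exists():
            return candidate
    return None


def transcode_gz_to_xz(gz_path):
    """Recompress foo.tar.gz as foo.tar.xz alongside it and return the new path."""
    gz_path = Path(gz_path)
    xz_path = gz_path.with_suffix(".xz")  # replaces only the trailing ".gz"
    # Stream the data so large archives are never held in memory at once.
    with gzip.open(gz_path, "rb") as src, lzma.open(xz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return xz_path
```

The original .tar.gz is left in place here; whether a post-processing pass should delete it is a policy decision this sketch does not make.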
p.
* Re: sstate compression
From: Richard Purdie @ 2012-01-04 16:47 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On Wed, 2012-01-04 at 16:41 +0000, Phil Blundell wrote:
> On Wed, 2012-01-04 at 09:32 -0700, Chris Larson wrote:
> > Agreed. Sstate can get truly massive; being able to opt into xz or
> > something would be lovely for us as well.
>
> Yes, that would be nice. xz does seem to compress nearly twice as well
> as "gzip -9", though it takes about six times as long to do it. (16M
> and 55 seconds vs 28M and 9.5 seconds for my webkit testcase.)
> Personally I don't care about the disk space as much as the build time,
> but I can see that sstate could start to become quite unwieldy if you
> have a lot of packages in there.
>
> Alternatively, maybe we could have sstate.bbclass accept multiple
> compression types when reading (i.e. look for .tar.xz, .tar.gz, .tar.lzo
> etc in turn), make it use the fastest reasonable compression when
> generating the archives in the first place, and then folks who want to
> either put them into long-term storage or send them over slow links can
> post-process the sstate-cache by transcoding them into .xz or whatever
> format.
Just to note that looking for multiple versions can cause a fair bit of
network traffic as for http:// mirror urls it will have to wget each in
turn. Better would be one file name and dynamic detection of the
compression format I guess.
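Dynamic detection of this kind usually means checking the leading bytes of the file against known signatures. A minimal Python sketch, assuming the archive keeps a single name and only gzip, xz, bzip2 and lzop need to be told apart; the function names are hypothetical and this is not how sstate.bbclass works:

```python
# Leading "magic" bytes of each compressed format.
MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"BZh", "bzip2"),
    (b"\x89LZO\x00\r\n\x1a\n", "lzop"),
)


def detect_compression(header):
    """Return the format name matching the leading bytes, or "unknown"."""
    for magic, name in MAGIC:
        if header.startswith(magic):
            return name
    return "unknown"


def detect_file(path):
    """Read just enough of the file to cover the longest signature."""
    with open(path, "rb") as handle:
        return detect_compression(handle.read(16))
```

An uncompressed tarball falls through to "unknown" in this sketch, since the tar signature sits at an offset inside the first header block rather than at the start of the file.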
Cheers,
Richard
* Re: sstate compression
From: Phil Blundell @ 2012-01-04 16:53 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On Wed, 2012-01-04 at 16:47 +0000, Richard Purdie wrote:
> Just to note that looking for multiple versions can cause a fair bit of
> network traffic as for http:// mirror urls it will have to wget each in
> turn.
True, though I suppose if the fetching was to be moved into python
(rather than an external wget) then it would just be repeated GETs over
a single persistent connection which wouldn't be all that much overhead.
And, even with non-persistent connections, the amount of data involved
in establishing an extra TCP connection and sending a GET is fairly
negligible compared to the size of the download you're going to end up
doing.
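A rough sketch of the in-Python probing idea, using http.client from the standard library. It is not how the fetcher works: the helper name, host and paths are hypothetical, and it probes with HEAD requests rather than the GETs mentioned above so that no archive body is transferred until a match is found.

```python
import http.client


def first_available(conn, base_path, suffixes=(".tar.xz", ".tar.gz", ".tar.lzo")):
    """Probe base_path + suffix over one connection; return the first hit."""
    for suffix in suffixes:
        path = base_path + suffix
        conn.request("HEAD", path)
        response = conn.getresponse()
        response.read()  # finish this response so the connection can be reused
        if response.status == 200:
            return path
    return None


# Usage sketch (hypothetical mirror and archive name):
#   conn = http.client.HTTPConnection("sstate.example.com")
#   path = first_available(conn, "/sstate-cache/sstate-webkit")
```

Whether the requests really share one TCP connection depends on the server honouring keep-alive; if it closes the connection, http.client opens a new one for the next request.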
> Better would be one file name and dynamic detection of the
> compression format I guess.
Yes, or that. It's a shame that "tar -a" doesn't have the capability to
determine the compression method using magic numbers instead of the
filename.
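For comparison, Python's tarfile module chooses the decompressor from the file contents when an archive is opened with mode "r:*", so the suffix plays no part. A minimal sketch; the helper name is made up, and this covers gzip, bzip2 and xz only, since tarfile has no LZO support:

```python
import tarfile


def list_members(path):
    """List member names, letting tarfile work out the compression itself."""
    # Mode "r:*" means "read with transparent compression": tarfile tries
    # its supported formats against the data instead of trusting the name.
    with tarfile.open(path, mode="r:*") as archive:
        return archive.getnames()
```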
p.
* Re: sstate compression
From: Richard Purdie @ 2012-01-04 16:58 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On Wed, 2012-01-04 at 16:53 +0000, Phil Blundell wrote:
> On Wed, 2012-01-04 at 16:47 +0000, Richard Purdie wrote:
> > Just to note that looking for multiple versions can cause a fair bit of
> > network traffic as for http:// mirror urls it will have to wget each in
> > turn.
>
> True, though I suppose if the fetching was to be moved into python
> (rather than an external wget) then it would just be repeated GETs over
> a single persistent connection which wouldn't be all that much overhead.
> And, even with non-persistent connections, the amount of data involved
> in establishing an extra TCP connection and sending a GET is fairly
> negligible compared to the size of the download you're going to end up
> doing.
The sstate code currently calls into the fetcher code, so this would be
best as a fetcher enhancement, but that complicates it as it would have to
be general code.
> > Better would be one file name and dynamic detection of the
> > compression format I guess.
>
> Yes, or that. It's a shame that "tar -a" doesn't have the capability to
> determine the compression method using magic numbers instead of the
> filename.
Indeed...
Cheers,
Richard
* Re: sstate compression
From: McClintock Matthew-B29882 @ 2012-01-04 16:55 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On Wed, Jan 4, 2012 at 10:47 AM, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
> On Wed, 2012-01-04 at 16:41 +0000, Phil Blundell wrote:
>> On Wed, 2012-01-04 at 09:32 -0700, Chris Larson wrote:
>> > Agreed. Sstate can get truly massive; being able to opt into xz or
>> > something would be lovely for us as well.
>>
>> Yes, that would be nice. xz does seem to compress nearly twice as well
>> as "gzip -9", though it takes about six times as long to do it. (16M
>> and 55 seconds vs 28M and 9.5 seconds for my webkit testcase.)
>> Personally I don't care about the disk space as much as the build time,
>> but I can see that sstate could start to become quite unwieldy if you
>> have a lot of packages in there.
>>
>> Alternatively, maybe we could have sstate.bbclass accept multiple
>> compression types when reading (i.e. look for .tar.xz, .tar.gz, .tar.lzo
>> etc in turn), make it use the fastest reasonable compression when
>> generating the archives in the first place, and then folks who want to
>> either put them into long-term storage or send them over slow links can
>> post-process the sstate-cache by transcoding them into .xz or whatever
>> format.
>
> Just to note that looking for multiple versions can cause a fair bit of
> network traffic as for http:// mirror urls it will have to wget each in
> turn. Better would be one file name and dynamic detection of the
> compression format I guess.
The patch I submitted just looks for the type you have selected, so
there is no looking for gzip or xz sstate-cache. It just checks one,
like we currently do.
-M