* Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) @ 2012-03-29 20:53 Eric Bénard 2012-03-29 22:03 ` Richard Purdie 0 siblings, 1 reply; 20+ messages in thread
From: Eric Bénard @ 2012-03-29 20:53 UTC (permalink / raw)
To: openembedded-core

Hi,

I noticed that in from-scratch builds for qemuarm the longest time is spent fetching sources, especially those fetched using git (linux-yocto, for example) and svn (gcc, eglibc & co).

To reduce the fetch time, would it make sense to:
- fetch gcc/glibc & co from the archive of a stable version and then apply patches on top of it (maybe patches stored in an archive fetched from OE's website and applied in bulk, or patches stored in OE)
- do the same thing for the linux-yocto kernel, or add a --reference option to the git fetcher so that we can provide a local tree as a reference?

Do you have other ideas (apart from using a local mirror) to optimize the fetch time?

Thanks,
Eric

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-29 20:53 Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) Eric Bénard @ 2012-03-29 22:03 ` Richard Purdie 2012-03-30 1:03 ` Bruce Ashfield 2012-03-30 8:50 ` Eric Bénard 0 siblings, 2 replies; 20+ messages in thread
From: Richard Purdie @ 2012-03-29 22:03 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Thu, 2012-03-29 at 22:53 +0200, Eric Bénard wrote:
> I noticed in from scratch builds for qemuarm that the longest time is
> taken in fetching sources, especially those fetched using git
> (linux-yocto for example) & svn (gcc, eglibc & co).

Are you timing these as fetches from the source control systems or from the mirror tarballs of the repositories? The tarballs should be faster...

> To reduce the fetch time would that make sense to
> - fetch gcc/glibc & co from the archive of a stable version and then
> apply patches on top of it (maybe patches stored in an archive
> fetched from oe's website and applied in bulk or patches stored in OE)

Unfortunately the patches tend to get unwieldy. The tarballs of the svn repos on the mirror should be about equal in size to the upstream archive in this case.

> - do the same thing for the linux-yocto kernel or add a --reference
> option to the git fetcher so that we can provide a local tree as a
> reference ?

This is effectively how the repositories in DL_DIR are used. If you place a tree in the right place there, it should reuse references...

> Do you have other ideas (apart from using a local mirror) to optimize
> the fetch time ?

I'd be interested firstly to understand whether you're using the SCM directly or using the mirror tarballs, as that should make a big difference. In the standard configuration it should be using mirror tarballs...

Cheers,

Richard
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-29 22:03 ` Richard Purdie @ 2012-03-30 1:03 ` Bruce Ashfield 2012-03-30 6:44 ` Samuel Stirtzel 2012-03-30 7:00 ` Martin Jansa 2012-03-30 8:50 ` Eric Bénard 1 sibling, 2 replies; 20+ messages in thread
From: Bruce Ashfield @ 2012-03-30 1:03 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Thu, Mar 29, 2012 at 6:03 PM, Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
> On Thu, 2012-03-29 at 22:53 +0200, Eric Bénard wrote:
>> I noticed in from scratch builds for qemuarm that the longest time is
>> taken in fetching sources, especially those fetched using git
>> (linux-yocto for example) & svn (gcc, eglibc & co).
>
> Are you timing these as fetches from the source control systems or from
> the mirror tarballs of the repositories. The tarballs should be
> faster...
>
>> To reduce the fetch time would that make sense to
>> - fetch gcc/glibc & co from the archive of a stable version and then
>> apply patches on top of it (maybe patches stored in an archive
>> fetched from oe's website and applied in bulk or patches stored in OE)
>
> Unfortunately the patches tend to get unwieldy. The tarballs of the svn
> repos on the mirror should be about equal in size to the upstream
> archive in this case.
>
>> - do the same thing for the linux-yocto kernel or add a --reference
>> option to the git fetcher so that we can provide a local tree as a
>> reference ?
>
> This is effectively how the repositories in DL_DIR are used. If you
> place a tree in the right place there, it should reuse references...

Agreed .. they definitely do here.

Richard probably recalls me asking for a --reference option several years ago as well .. but in the end, at some point the initial fetch happens and then the blobs are re-used. So setting up local mirrors or pre-fetching are options to make sure that the first download is primed and ready to go. For most builds I do, any fetching just happens in the background and doesn't get in the way.

>> Do you have other ideas (apart from using a local mirror) to optimize
>> the fetch time ?
>
> I'd be interested firstly to understand if you're using the SCM directly
> or using the mirror tarballs as that should make a big difference. In
> the standard configuration it should be using mirror tarballs...

As would I, since there are some ideas, but they either break workflows, don't follow best practices or compromise the completeness of the data.

Cheers,

Bruce

--
"Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end"
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 1:03 ` Bruce Ashfield @ 2012-03-30 6:44 ` Samuel Stirtzel 2012-03-30 9:21 ` Paul Eggleton 2012-03-30 9:32 ` Richard Purdie 2012-03-30 7:00 ` Martin Jansa 1 sibling, 2 replies; 20+ messages in thread
From: Samuel Stirtzel @ 2012-03-30 6:44 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

2012/3/30 Bruce Ashfield <bruce.ashfield@gmail.com>:
> [...]
> So setting up local mirrors, or pre-fetching
> are options to make sure that the first download is primed and ready to
> go. For most builds I do, any time fetching just happens in the background
> and doesn't get in the way.
> [...]

Hi,

this might be a bit off-topic, but another idea would be to add a separate threading mechanism for fetching.

The current threading helps to use the CPU and memory to their optimum, but sometimes you have to wait for a download to finish. Instead, there could be a separate set of threads that only download the sources and make optimal use of the bandwidth too.

This would also allow files to be fetched while the normal threads are busy configuring/building/packaging recipes.

The downside is that it would require some sort of inter-process communication. Or it could be regulated with a simple check of whether the download has finished.

How does this idea sound to you?

--
Regards
Samuel
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 6:44 ` Samuel Stirtzel @ 2012-03-30 9:21 ` Paul Eggleton 2012-03-30 9:32 ` Richard Purdie 1 sibling, 0 replies; 20+ messages in thread
From: Paul Eggleton @ 2012-03-30 9:21 UTC (permalink / raw)
To: Samuel Stirtzel; +Cc: openembedded-core

On Friday 30 March 2012 08:44:56 Samuel Stirtzel wrote:
> this might be a bit off-topic, but another idea would be to add a
> separate threading mechanism for fetching.
>
> Current threading can help to use the CPU and memory load to its optimum,
> but sometimes you have to wait for a download to finish..
> Instead there could be a separate set of threads that only download
> the sources and make optimal use of the bandwidth too.
>
> This would also allow to fetch files when the normal threads are busy
> with configuring/building/packaging recipes.

What you're really suggesting here is a modified BitBake scheduler that understands that fetch tasks, which require network bandwidth, are different from other tasks such as compile ones, which stress the CPU. It sounds like it might be worth investigating at least. FYI, BitBake's schedulers are pluggable and not particularly complicated (see bitbake/lib/bb/runqueue.py).

Cheers,
Paul

--
Paul Eggleton
Intel Open Source Technology Centre
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 6:44 ` Samuel Stirtzel 2012-03-30 9:21 ` Paul Eggleton @ 2012-03-30 9:32 ` Richard Purdie 2012-03-30 10:07 ` Samuel Stirtzel 1 sibling, 1 reply; 20+ messages in thread
From: Richard Purdie @ 2012-03-30 9:32 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Fri, 2012-03-30 at 08:44 +0200, Samuel Stirtzel wrote:
> this might be a bit off-topic, but another idea would be to add a
> separate threading mechanism for fetching.
>
> Current threading can help to use the CPU and memory load to its optimum,
> but sometimes you have to wait for a download to finish..
> Instead there could be a separate set of threads that only download
> the sources and make optimal use of the bandwidth too.
>
> This would also allow to fetch files when the normal threads are busy
> with configuring/building/packaging recipes.
>
> The downside would be that it requires some sort of inter process
> communication.
> Or it could be regulated with a simple check if the download is finished..
>
> How does this idea sound to you?

It's easier than you think to do this: bitbake has a pluggable scheduler implementation, so you'd just have to write one which excludes "fetch" operations from the total thread count.

Sadly this isn't really the place where most people have a bottleneck in day-to-day usage of the system. People have tried various algorithms for enhancing the scheduler and, as far as I know, never found anything that makes a significant difference, much to everyone's surprise :/.

Cheers,

Richard
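
[Editorial sketch] The suggestion above (do not count "fetch" tasks against the thread limit) can be illustrated with a toy simulation. This is not BitBake code and does not use the scheduler interface in bitbake/lib/bb/runqueue.py; the Task class, the simulate() function, the task names and the durations are all hypothetical and exist only to show the effect on total build time. Fetch tasks can still declare dependencies, as any other task would.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record, not BitBake's internal representation."""
    name: str          # e.g. "a:fetch" or "a:compile"
    duration: int      # made-up time units
    deps: tuple = ()   # names of tasks that must finish first

    @property
    def is_fetch(self):
        return self.name.endswith(":fetch")


def simulate(tasks, cpu_threads, fetch_is_free):
    """Return the total time needed to run all tasks.

    With fetch_is_free=True, fetch tasks do not occupy one of the
    cpu_threads slots; otherwise every task occupies a slot.
    """
    pending = {t.name: t for t in tasks}
    done = set()
    running = []       # heap of (finish_time, name, held_slot)
    busy_slots = 0
    now = 0

    while pending or running:
        # Start every pending task whose dependencies are met and that fits.
        for task in list(pending.values()):
            if not all(dep in done for dep in task.deps):
                continue
            needs_slot = not (fetch_is_free and task.is_fetch)
            if needs_slot and busy_slots >= cpu_threads:
                continue
            if needs_slot:
                busy_slots += 1
            heapq.heappush(running, (now + task.duration, task.name, needs_slot))
            del pending[task.name]

        if not running:
            raise ValueError("unsatisfiable dependencies: %s" % sorted(pending))

        # Advance time to the next completion and release its slot.
        now, name, held_slot = heapq.heappop(running)
        done.add(name)
        if held_slot:
            busy_slots -= 1

    return now


# Two recipes, each with a fetch followed by a compile (invented numbers).
EXAMPLE_TASKS = [
    Task("a:fetch", 4),
    Task("a:compile", 6, deps=("a:fetch",)),
    Task("b:fetch", 4),
    Task("b:compile", 6, deps=("b:fetch",)),
]

print(simulate(EXAMPLE_TASKS, cpu_threads=1, fetch_is_free=False))  # 20
print(simulate(EXAMPLE_TASKS, cpu_threads=1, fetch_is_free=True))   # 16
```

In this invented example the gain comes only from overlapping the second download with the first compile, which is consistent with the remark above that the benefit in day-to-day use may be small.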
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 9:32 ` Richard Purdie @ 2012-03-30 10:07 ` Samuel Stirtzel 2012-03-30 10:45 ` Richard Purdie 0 siblings, 1 reply; 20+ messages in thread
From: Samuel Stirtzel @ 2012-03-30 10:07 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

2012/3/30 Richard Purdie <richard.purdie@linuxfoundation.org>:
> On Fri, 2012-03-30 at 08:44 +0200, Samuel Stirtzel wrote:
>> [...]
>> How does this idea sound to you?
>
> Its easier than you think to do this, bitbake has a pluggable scheduler
> implementation so you'd just have to write one which ignores "fetch"
> operations from the total thread count.
>
> Sadly this isn't really the place most people have a bottleneck in day
> to day usage of the system. People have tried various algorithms for
> enhancing the scheduler and as far as I know never found anything that
> makes a significant difference, much to everyone's surprise :/.

Of course this will only reduce the time for recipes when they are built for the first time, or when the version/URL changes. It is not that important, I agree, but it would improve the situation for first-time users or new installations.

Example for 2 threads:
http://pastebin.com/kviwQZJ3

It is very likely that the current situation also uses CPU and network resources at the same time, but it might occur that the build task has to wait for a download to finish or vice versa.

Ignoring fetch tasks from the thread count would only do half of the job and _could_ cause network bottlenecks ;)
Fetching should be "independent" from the dependency chain. E.g.: it should not wait with the downloads for dependencies to finish building; the download sequence should still match the dependency chain sequence.

If it is really that easy, then I will look into it.

--
Regards
Samuel
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 10:07 ` Samuel Stirtzel @ 2012-03-30 10:45 ` Richard Purdie 2012-04-02 8:15 ` Samuel Stirtzel 0 siblings, 1 reply; 20+ messages in thread
From: Richard Purdie @ 2012-03-30 10:45 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Fri, 2012-03-30 at 12:07 +0200, Samuel Stirtzel wrote:
> Of course this will only reduce the time of recipes if they are built
> for the first time,
> or when the version/URL changes.
> It is not that important, I agree,
> but it would improve the situation for first time users, or new installations.
>
> Example for 2 threads:
> http://pastebin.com/kviwQZJ3
> It is very likely that the current situation also uses cpu and network
> resources at the same time,
> but it might occur that the build-task has to wait for a download to
> finish or vice versa.
>
> Ignoring fetch tasks from the thread count would only do half of the
> job and _could_ cause network bottlenecks ;)
> Fetching should be "independent" from the dependency chain.

This simply isn't true, and there is also no benefit to splitting them to be independent. The fetch tasks have dependencies just like any other task (for example, git:// urls depend on git-native being built unless it's in ASSUME_PROVIDED).

> E.g.: it should not wait with the downloads for dependencies to finish building,
> the download sequence should still match the dependency chain sequence.

I'm afraid I don't understand what you mean. I think you will find that if you exclude the "fetch" tasks from the normal "cpu" thread count you will get the behaviour you are describing.

Cheers,

Richard
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 10:45 ` Richard Purdie @ 2012-04-02 8:15 ` Samuel Stirtzel 0 siblings, 0 replies; 20+ messages in thread
From: Samuel Stirtzel @ 2012-04-02 8:15 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

2012/3/30 Richard Purdie <richard.purdie@linuxfoundation.org>:
> On Fri, 2012-03-30 at 12:07 +0200, Samuel Stirtzel wrote:
>> [...]
>> Ignoring fetch tasks from the thread count would only do half of the
>> job and _could_ cause network bottlenecks ;)
>> Fetching should be "independent" from the dependency chain.
>
> This simply isn't true and there is also no benefit to splitting them to
> be independent. The fetch tasks have dependencies just like any other
> task (for example git:// urls depend on git-native being built unless
> its in ASSUME_PROVIDED).

You are right, my mistake. Adding some line like PARALLEL_FETCH to the config will do the rest.

>> E.g.: it should not wait with the downloads for dependencies to finish building,
>> the download sequence should still match the dependency chain sequence.
>
> I'm afraid I don't understand what you mean. I think you will find that
> if you exclude the "fetch" tasks from the normal "cpu" thread count you
> will get the behaviour you are describing.

I was erroneously assuming that a download only starts after all dependencies have finished building, but of course I only inferred this because the threads were blocked by the build tasks. So yes, the method you mentioned will work.

--
Regards
Samuel
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 1:03 ` Bruce Ashfield 2012-03-30 6:44 ` Samuel Stirtzel @ 2012-03-30 7:00 ` Martin Jansa 2012-03-30 10:06 ` Richard Purdie 1 sibling, 1 reply; 20+ messages in thread
From: Martin Jansa @ 2012-03-30 7:00 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Thu, Mar 29, 2012 at 09:03:15PM -0400, Bruce Ashfield wrote:
> On Thu, Mar 29, 2012 at 6:03 PM, Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> > On Thu, 2012-03-29 at 22:53 +0200, Eric Bénard wrote:
> >> [...]
> >> - do the same thing for the linux-yocto kernel or add a --reference
> >> option to the git fetcher so that we can provide a local tree as a
> >> reference ?
> >
> > This is effectively how the repositories in DL_DIR are used. If you
> > place a tree in the right place there, it should reuse references...
>
> Agreed .. they definitely do here.

What's the right place?

I guess the idea was to use --reference for e.g. some other kernel recipe's source checkout.

And I guess that building linux-foo won't notice that there is e.g.
/OE/downloads/git2/gitorious.org.shr.linux.git
from which it can share a lot of objects using --reference.

Bob Ham (rah on #oe) said that he is working on some sort of support for --reference with bitbake after I refused to add yet another linux-bar recipe to meta-smartphone, but I'm not sure how he plans to implement it so that it is useful and works out of the box in different environments with different sources available in the downloads dir.

Cheers,

--
Martin 'JaMa' Jansa jabber: Martin.Jansa@gmail.com
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 7:00 ` Martin Jansa @ 2012-03-30 10:06 ` Richard Purdie 0 siblings, 0 replies; 20+ messages in thread
From: Richard Purdie @ 2012-03-30 10:06 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Fri, 2012-03-30 at 09:00 +0200, Martin Jansa wrote:
> On Thu, Mar 29, 2012 at 09:03:15PM -0400, Bruce Ashfield wrote:
> > On Thu, Mar 29, 2012 at 6:03 PM, Richard Purdie
> > <richard.purdie@linuxfoundation.org> wrote:
> > >> - do the same thing for the linux-yocto kernel or add a --reference
> > >> option to the git fetcher so that we can provide a local tree as a
> > >> reference ?
> > >
> > > This is effectively how the repositories in DL_DIR are used. If you
> > > place a tree in the right place there, it should reuse references...
> >
> > Agreed .. they definitely do here.
>
> What's right place?
>
> I guess the idea was to use --reference for e.g. some other kernel recipe
> sources checkout.
>
> And I guess that building linux-foo won't notice that there is e.g.
> /OE/downloads/git2/gitorious.org.shr.linux.git:
> from which it can share a lot of objects using --reference
>
> Bob Ham (rah on #oe) said that he is working on some sort of support
> for --reference with bitbake after I've refused to add just another
> linux-bar recipe to meta-smartphone, but not sure how he plans to
> implement it to be useful and working oob in different env with
> different sources available in downloads dir.

You could conceivably symlink all your different kernel directories together within git2/. As far as I can tell, the fetcher simply wouldn't care in most cases. The branch structures could get a little mangled, I guess, and you wouldn't want to share the resulting mirror tarballs.

There is an argument for using one large shared container for all the git objects as another way of solving this. I don't know how well git supports that, but at the object level it's a non-issue at least.

Cheers,

Richard
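
[Editorial sketch] On the "right place" question: the two paths quoted in this thread (downloads/git2/git.yoctoproject.org.linux-yocto-3.2 and downloads/git2/gitorious.org.shr.linux.git) suggest that the directory under git2/ is the URL host followed by the path, with "/" replaced by ".". The helper below encodes only that inferred rule. The function name is hypothetical, the gitorious URL is a guess at what produced Martin's path, and the actual fetcher may treat other cases (ports, protocols, special characters) differently.

```python
from urllib.parse import urlparse


def git2_dirname(url):
    """Guess the directory name used under DL_DIR/git2 for a git:// URL.

    Rule inferred from the paths quoted in the thread, not taken from
    the BitBake sources: host + path, with '/' replaced by '.'.
    """
    parts = urlparse(url)
    return parts.netloc + parts.path.replace("/", ".")


# URL and resulting directory name both appear in the thread.
print(git2_dirname("git://git.yoctoproject.org/linux-yocto-3.2"))

# Hypothetical URL chosen to match the gitorious path quoted above.
print(git2_dirname("git://gitorious.org/shr/linux.git"))
```

If the rule holds, a pre-existing bare clone placed or symlinked at DL_DIR/git2/<that name> is what the fetcher would pick up and reuse.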
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-29 22:03 ` Richard Purdie 2012-03-30 1:03 ` Bruce Ashfield @ 2012-03-30 8:50 ` Eric Bénard 2012-03-30 15:12 ` Richard Purdie 1 sibling, 1 reply; 20+ messages in thread
From: Eric Bénard @ 2012-03-30 8:50 UTC (permalink / raw)
To: openembedded-core

On Thu, 29 Mar 2012 23:03:13 +0100, Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
> On Thu, 2012-03-29 at 22:53 +0200, Eric Bénard wrote:
> > I noticed in from scratch builds for qemuarm that the longest time is
> > taken in fetching sources, especially those fetched using git
> > (linux-yocto for example) & svn (gcc, eglibc & co).
>
> Are you timing these as fetches from the source control systems or from
> the mirror tarballs of the repositories. The tarballs should be
> faster...

The default configuration seems to fetch from the source control systems, as I always see very long fetch times for gcc/eglibc/linux-yocto (despite having a 2.2 MBytes/s downlink DSL line).

> > To reduce the fetch time would that make sense to
> > - fetch gcc/glibc & co from the archive of a stable version and then
> > apply patches on top of it (maybe patches stored in an archive
> > fetched from oe's website and applied in bulk or patches stored in OE)
>
> Unfortunately the patches tend to get unwieldy. The tarballs of the svn
> repos on the mirror should be about equal in size to the upstream
> archive in this case.

I don't think it's a size problem; rather, fetching through svn or git is far less efficient than http or ftp, especially from GNU's svn, which may be overloaded.

Moreover, in a pure OE context we have no interest in all the source history provided by svn or git, and it makes for a very big volume to download.

> > - do the same thing for the linux-yocto kernel or add a --reference
> > option to the git fetcher so that we can provide a local tree as a
> > reference ?
>
> This is effectively how the repositories in DL_DIR are used. If you
> place a tree in the right place there, it should reuse references...
>
> > Do you have other ideas (apart from using a local mirror) to optimize
> > the fetch time ?
>
> I'd be interested firstly to understand if you're using the SCM directly
> or using the mirror tarballs as that should make a big difference. In
> the standard configuration it should be using mirror tarballs...

That doesn't seem to be the case. From a clean oe-core + bitbake clone:

    . ./openembedded-core/oe-init-build-env
    edit local.conf to select qemuarm and set BB_NUMBER_THREADS to 8
    bitbake core-image-minimal -c fetchall

and then I see bitbake stop at around 209 or 214 tasks waiting, and I see this in ps:

    /home/ebenard/OE-CORE/build/tmp-eglibc/sysroots/x86_64-linux/usr/bin/git.real clone --bare --mirror git://git.yoctoproject.org/linux-yocto-3.2 /home/ebenard/OE-CORE/build/downloads/git2/git.yoctoproject.org.linux-yocto-3.2

and

    svn co -r 184847 http://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch@184847 ...

Both are actually fetching at only around 200 KiB/s, which lasts for quite a long time as (from another downloads dir) the final tree sizes are huge:

    du -s git.yoctoproject.org.linux-yocto-3.2/
    610824 git.yoctoproject.org.linux-yocto-3.2/
    du -s gcc.gnu.org/
    1602496 gcc.gnu.org/
    du -s www.eglibc.org/
    625048 www.eglibc.org

If I launch at the same time:

    wget ftp://ftp.gnu.org/gnu/gcc/gcc-4.6.3/gcc-4.6.3.tar.bz2

I get a download speed close to 1MB/s and the file to download is only 64MB, which would save bandwidth.

Eric
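
[Editorial sketch] The figures above can be turned into a rough time estimate. This assumes that du -s reports 1 KiB blocks (the GNU default) and that the amount transferred is close to the final on-disk size; the second assumption probably overstates the svn cases, since a working copy also stores pristine copies of each file. The function name is hypothetical and the results are ballpark values only.

```python
# Sizes in KiB as printed by `du -s` in the message above.
SIZES_KIB = {
    "linux-yocto-3.2 (git)": 610824,
    "gcc (svn)": 1602496,
    "eglibc (svn)": 625048,
}
SCM_RATE_KIB_S = 200          # observed SCM fetch rate, about 200 KiB/s
TARBALL_KIB = 64 * 1024       # gcc-4.6.3.tar.bz2, about 64 MB
TARBALL_RATE_KIB_S = 1024     # observed wget rate, about 1 MB/s


def minutes(size_kib, rate_kib_s):
    """Transfer time in minutes for a given size and sustained rate."""
    return size_kib / rate_kib_s / 60


for name, size in SIZES_KIB.items():
    print(f"{name}: roughly {minutes(size, SCM_RATE_KIB_S):.0f} min")

print(f"gcc tarball: roughly {minutes(TARBALL_KIB, TARBALL_RATE_KIB_S):.0f} min")
```

Under these assumptions the kernel and eglibc fetches each take on the order of 50 minutes and the gcc checkout more than two hours, against about a minute for the release tarball.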
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 8:50 ` Eric Bénard @ 2012-03-30 15:12 ` Richard Purdie 2012-03-30 15:24 ` Eric Bénard 0 siblings, 1 reply; 20+ messages in thread
From: Richard Purdie @ 2012-03-30 15:12 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Fri, 2012-03-30 at 10:50 +0200, Eric Bénard wrote:
> On Thu, 29 Mar 2012 23:03:13 +0100,
> Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
>
> > On Thu, 2012-03-29 at 22:53 +0200, Eric Bénard wrote:
> > > I noticed in from scratch builds for qemuarm that the longest time is
> > > taken in fetching sources, especially those fetched using git
> > > (linux-yocto for example) & svn (gcc, eglibc & co).
> >
> > Are you timing these as fetches from the source control systems or from
> > the mirror tarballs of the repositories. The tarballs should be
> > faster...
> >
> the default configuration seems to fetch from source control systems
> as I always see very long time to fetch gcc/eglibc/linux-yocto
> (despite having a 2.2 MBytes/s downlink DSL line).

If you're hitting the SCMs I can understand the frustration.

> I don't think that's a size problem but that fetching through svn or
> git is far less efficient than http or ftp especially from gnu's svn
> which may be overloaded.

Agreed.

> Moreover in a pure OE context we have no interest of all the source
> history provided by svn or git and that makes a very big volume to
> download.

The fetcher will deal with this well in the svn case. In the git case, we made a decision to include history since it's not that much more expensive. Both these assumptions are based on a working, up-to-date mirror.

> . ./openembedded-core/oe-init-build-env
> edit local.conf to select qemuarm & BB_NUMBER_THREADS to 8
> bitbake core-image-minimal -c fetchall
>
> and then I see bitbake stops at around 209 or 214 tasks waiting and
> I see that in ps :
> /home/ebenard/OE-CORE/build/tmp-eglibc/sysroots/x86_64-linux/usr/bin/git.real
> clone --bare --mirror
> git://git.yoctoproject.org/linux-yocto-3.2 /home/ebenard/OE-CORE/build/downloads/git2/git.yoctoproject.org.linux-yocto-3.2
> and
> svn co -r 184847
> http://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch@184847
>
> ... and both are actually fetching at only around 200 KiB/s which last
> for quite a long time as (from an other downloads dir) the final
> tree size are huge :
> du -s git.yoctoproject.org.linux-yocto-3.2/
> 610824 git.yoctoproject.org.linux-yocto-3.2/
> du -s gcc.gnu.org/
> 1602496 gcc.gnu.org/
> du -s www.eglibc.org/
> 625048 www.eglibc.org
> If I launch at the same time :
> wget ftp://ftp.gnu.org/gnu/gcc/gcc-4.6.3/gcc-4.6.3.tar.bz2
> I get a download speed close to 1MB/s and the file to download is only
> 64MB which would save bandwidth.

Try adding this to your configuration:

    PREMIRRORS = "\
    git://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n \
    svn://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n"

and see if that helps the performance. We might consider making this the default for OE-Core, although some people are nervous about doing this...

Cheers,

Richard
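
[Editorial sketch] Each PREMIRRORS entry above is a pair: a pattern for the original URL and the mirror to try first. The toy below shows which URLs those two patterns would redirect. It is a deliberate simplification and an assumption on my part: it treats the pattern as a single regular expression over the whole URL, whereas the real fetcher is understood to compare URL components individually. The function names and the svn:// and http:// test URLs are hypothetical.

```python
import re

# The value from the message above; "\n" is a literal backslash-n
# separator inside the variable, so raw strings are used here.
PREMIRRORS = (
    r"git://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n "
    r"svn://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n"
)


def parse_mirrors(value):
    """Split a PREMIRRORS-style string into (pattern, mirror) pairs."""
    entries = []
    for line in value.replace("\\n", "\n").splitlines():
        fields = line.split()
        if len(fields) == 2:
            entries.append((fields[0], fields[1]))
    return entries


def premirror_for(url, entries):
    """Return the first mirror whose pattern matches url, else None.

    Simplified whole-string regex match, not BitBake's real logic.
    """
    for pattern, mirror in entries:
        if re.fullmatch(pattern, url):
            return mirror
    return None


entries = parse_mirrors(PREMIRRORS)
for url in (
    "git://git.yoctoproject.org/linux-yocto-3.2",   # from the thread
    "svn://gcc.gnu.org/svn/gcc/branches",            # hypothetical
    "http://example.com/foo.tar.gz",                 # hypothetical
):
    print(url, "->", premirror_for(url, entries))
```

With these two entries only git:// and svn:// sources are diverted to the mirror; plain http/ftp tarballs are still fetched from their upstream locations.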
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) 2012-03-30 15:12 ` Richard Purdie @ 2012-03-30 15:24 ` Eric Bénard 2012-03-30 15:49 ` Bruce Ashfield 2012-03-30 16:02 ` Richard Purdie 0 siblings, 2 replies; 20+ messages in thread
From: Eric Bénard @ 2012-03-30 15:24 UTC (permalink / raw)
To: openembedded-core

On Fri, 30 Mar 2012 16:12:44 +0100, Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
> On Fri, 2012-03-30 at 10:50 +0200, Eric Bénard wrote:
> > the default configuration seems to fetch from source control systems
> > as I always see very long time to fetch gcc/eglibc/linux-yocto
> > (despite having a 2.2 MBytes/s downlink DSL line).
>
> If you're hitting the SCMs I can understand the frustration.

That's not frustration, that's feedback on the default behaviour. But I agree with you that it could be frustrating for someone trying OE-core for the first time ;-)

> Try adding this to your configuration:
>
> PREMIRRORS = "\
> git://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n \
> svn://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n"
>
> and see if that helps the performance. It might be we consider making
> this the default for OE-Core although some people are nervous about
> doing this...

Sure, that will help: in my work setup I have my own mirrors configured. But here again, that's not what a new user will have, and in this case I'm testing the plain default configuration to help find bugs or things to improve in the release.

I think fetching from git or svn should not be the first thing to do in recipes like gcc, eglibc, linux & co, where we are based on a stable released version: it doesn't bring real added value to the user in an OE context and it wastes bandwidth (a tbz2 kernel is around 75MB, a git one is around 600MB).

Eric
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto)
  2012-03-30 15:24 ` Eric Bénard
@ 2012-03-30 15:49 ` Bruce Ashfield
  2012-03-30 15:55 ` Eric Bénard
  2012-03-30 16:02 ` Richard Purdie
  1 sibling, 1 reply; 20+ messages in thread
From: Bruce Ashfield @ 2012-03-30 15:49 UTC
To: Patches and discussions about the oe-core layer

On Fri, Mar 30, 2012 at 11:24 AM, Eric Bénard <eric@eukrea.com> wrote:
> On Fri, 30 Mar 2012 16:12:44 +0100,
> Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
>> On Fri, 2012-03-30 at 10:50 +0200, Eric Bénard wrote:
>> > the default configuration seems to fetch from source control systems
>> > as I always see a very long time to fetch gcc/eglibc/linux-yocto
>> > (despite having a 2.2 MBytes/s downlink DSL line).
>>
>> If you're hitting the SCMs I can understand the frustration.
>>
> that's not a frustration, that's feedback on the default
> behaviour. But I agree with you that could be a frustration for someone
> trying OE-core for the first time ;-)
>
>> Try adding this to your configuration:
>>
>> PREMIRRORS = "\
>> git://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n \
>> svn://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n"
>>
>> and see if that helps the performance. It might be we consider making
>> this the default for OE-Core although some people are nervous about
>> doing this...
>>
> sure that will help : in my work setup I have my own mirrors configured
> but here again, that's not what a new user will have and in that
> case, I'm testing the plain default configuration to help find bugs
> or things to improve the release.
>
> I think fetching from git or svn should not be the first thing to do in
> recipes like gcc, eglibc, linux & co where we are based on a
> stable released version : this doesn't bring real added value to the
> user in OE context and this wastes bandwidth (a tbz2 kernel is around

s/user/developer/ and there is value in having git history. I know we'd
never do without it in our shop.

I suggested shallow clones and some other options to Richard a few weeks
ago, or some other hybrid models. They all vary in terms of nastiness and
have some good and bad points.

But from a kernel guy's point of view, you definitely want to work
inside git, but I can see that from a non-kernel point of view, build and
boot is all that really matters.

Cheers,

Bruce

> 75MB, a git one is around 600MB).
>
> Eric

--
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end"
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto)
  2012-03-30 15:49 ` Bruce Ashfield
@ 2012-03-30 15:55 ` Eric Bénard
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Bénard @ 2012-03-30 15:55 UTC
To: openembedded-core

On Fri, 30 Mar 2012 11:49:33 -0400,
Bruce Ashfield <bruce.ashfield@gmail.com> wrote:
> On Fri, Mar 30, 2012 at 11:24 AM, Eric Bénard <eric@eukrea.com> wrote:
> > I think fetching from git or svn should not be the first thing to do in
> > recipes like gcc, eglibc, linux & co where we are based on a
> > stable released version : this doesn't bring real added value to the
> > user in OE context and this wastes bandwidth (a tbz2 kernel is around
>
> s/user/developer/ and there is value in having git history. I know we'd
> never do without it in our shop.
>
> I suggested shallow clones and some other options to Richard a few weeks
> ago, or some other hybrid models. They all vary in terms of nastiness and
> have some good and bad points.
>
> But from a kernel guy's point of view, you definitely want to work
> inside git, but

Do you mean you work in the git tree of linux-yocto *directly inside*
OE's sources / downloads ?

Eric
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto)
  2012-03-30 15:24 ` Eric Bénard
  2012-03-30 15:49 ` Bruce Ashfield
@ 2012-03-30 16:02 ` Richard Purdie
  2012-03-30 16:17 ` Eric Bénard
  1 sibling, 1 reply; 20+ messages in thread
From: Richard Purdie @ 2012-03-30 16:02 UTC
To: Patches and discussions about the oe-core layer

On Fri, 2012-03-30 at 17:24 +0200, Eric Bénard wrote:
> On Fri, 30 Mar 2012 16:12:44 +0100,
> Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
> > On Fri, 2012-03-30 at 10:50 +0200, Eric Bénard wrote:
> > > the default configuration seems to fetch from source control systems
> > > as I always see a very long time to fetch gcc/eglibc/linux-yocto
> > > (despite having a 2.2 MBytes/s downlink DSL line).
> >
> > If you're hitting the SCMs I can understand the frustration.
> >
> that's not a frustration, that's feedback on the default
> behaviour. But I agree with you that could be a frustration for someone
> trying OE-core for the first time ;-)
>
> > Try adding this to your configuration:
> >
> > PREMIRRORS = "\
> > git://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n \
> > svn://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \n"
> >
> > and see if that helps the performance. It might be we consider making
> > this the default for OE-Core although some people are nervous about
> > doing this...
> >
> sure that will help : in my work setup I have my own mirrors configured
> but here again, that's not what a new user will have and in that
> case, I'm testing the plain default configuration to help find bugs
> or things to improve the release.
>
> I think fetching from git or svn should not be the first thing to do in
> recipes like gcc, eglibc, linux & co where we are based on a
> stable released version : this doesn't bring real added value to the
> user in OE context and this wastes bandwidth (a tbz2 kernel is around
> 75MB, a git one is around 600MB).

We've gone around in circles on this. We did use tarballs for gcc,
people complained. We switched to svn, you're not happy and probably
others aren't. We can't win.

Adding the PREMIRRORS makes the situation better, I agree it's not
perfect. The original question was how can we speed it up and this is an
easy way to do so for the default user case without changing anything
major.

I hear what you're saying on the tarball vs. SCM issue but using
tarballs does break use cases some users do use, the opposite is not
true. It's also ultimately down to the maintainers of recipes.

The gcc issue is more maintainable the way it's set up now compared to
large numbers of patches (which took an age to apply) and doesn't have
much of an additional bandwidth cost.

The linux-yocto kernel recipe heavily uses the SCM to do things so
whilst it does have a higher download cost, it also adds value and is
ultimately a maintainer's choice too.

So whilst I hear what you're saying, I don't think we can change
anything other than the PREMIRROR...

Cheers,

Richard
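As a rough illustration of what the PREMIRRORS value suggested in this thread expresses, the sketch below splits it into (pattern, mirror) pairs and checks which source URIs a pattern covers. This is a simplification and not BitBake's code: as far as I understand, the real fetcher compares the URI components separately and applies further substitution rules, whereas this only does a whole-string regex match. The function names and the `svn://` example URI are invented for the illustration.

```python
import re

# The value from the thread, with the line continuations joined. Entries
# are separated by a literal backslash-n, and each entry reads
# "<source pattern> <mirror URL>".
PREMIRRORS = (
    "git://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \\n "
    "svn://.*/.* http://downloads.yoctoproject.org/mirror/sources/ \\n"
)


def parse_premirrors(value):
    """Split the variable into a list of (pattern, mirror) pairs."""
    pairs = []
    for entry in value.split("\\n"):
        fields = entry.split()
        if len(fields) == 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def first_mirror(src_uri, pairs):
    """Return the mirror of the first pattern matching the whole URI.

    Simplified whole-string match; returns None when nothing matches.
    """
    for pattern, mirror in pairs:
        if re.fullmatch(pattern, src_uri):
            return mirror
    return None


pairs = parse_premirrors(PREMIRRORS)
for uri in (
    "git://git.yoctoproject.org/linux-yocto-3.2",
    "svn://gcc.gnu.org/svn/gcc/branches",  # illustrative svn URI
    "ftp://ftp.gnu.org/gnu/gcc/gcc-4.6.3/gcc-4.6.3.tar.bz2",
):
    print(uri, "->", first_mirror(uri, pairs))
```

With these two patterns, only the git and svn sources are redirected to the mirror; the ftp tarball matches nothing and would still be fetched from its original location.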
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto)
  2012-03-30 16:02 ` Richard Purdie
@ 2012-03-30 16:17 ` Eric Bénard
  2012-03-30 17:33 ` Bruce Ashfield
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Bénard @ 2012-03-30 16:17 UTC
To: openembedded-core

On Fri, 30 Mar 2012 17:02:24 +0100,
Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
> The linux-yocto kernel recipe heavily uses the SCM to do things so
> whilst it does have a higher download cost, it also adds value and is
> ultimately a maintainer's choice too.
>
OK now that I've given a closer look at the linux-yocto recipes &
bbclass I understand better why you need it in that recipe and that
this recipe is heavily based on git's features.

> So whilst I hear what you're saying, I don't think we can change
> anything other than the PREMIRROR...
>
then maybe for the new users testing OE, having PREMIRRORs set in the
default configuration would be a great thing so that they don't believe
OE is a big slow beast just because they have to wait hours for git or
svn to fetch sources.

Eric
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto)
  2012-03-30 16:17 ` Eric Bénard
@ 2012-03-30 17:33 ` Bruce Ashfield
  2012-03-30 18:36 ` Eric Bénard
  0 siblings, 1 reply; 20+ messages in thread
From: Bruce Ashfield @ 2012-03-30 17:33 UTC
To: Patches and discussions about the oe-core layer

On Fri, Mar 30, 2012 at 12:17 PM, Eric Bénard <eric@eukrea.com> wrote:
> On Fri, 30 Mar 2012 17:02:24 +0100,
> Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
>> The linux-yocto kernel recipe heavily uses the SCM to do things so
>> whilst it does have a higher download cost, it also adds value and is
>> ultimately a maintainer's choice too.
>>
> OK now that I've given a closer look at the linux-yocto recipes &
> bbclass I understand better why you need it in that recipe and that
> this recipe is heavily based on git's features.

There are alternatives that I'm going to be exploring going forward,
just nothing that we can bring in during the stabilization cycle. The
recipes manipulate git and use it to construct what you build, they
don't absolutely require a full git history, so there are some potential
savings to be had.

It just obviously limits flexibility if a derived recipe wants to merge
branches and histories to construct what is built. So having a
simple/shallow history for basic builds while not breaking more complex
cases probably hits the sweet spot.

Cheers,

Bruce

>
>> So whilst I hear what you're saying, I don't think we can change
>> anything other than the PREMIRROR...
>>
> then maybe for the new users testing OE, having PREMIRRORs set in the
> default configuration would be a great thing so that they don't believe
> OE is a big slow beast just because they have to wait hours for git or
> svn to fetch sources.
>
> Eric

--
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end"
* Re: Fetch time optimization (svn : gcc/eglibc - git : linux-yocto)
  2012-03-30 17:33 ` Bruce Ashfield
@ 2012-03-30 18:36 ` Eric Bénard
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Bénard @ 2012-03-30 18:36 UTC
To: openembedded-core

On Fri, 30 Mar 2012 13:33:05 -0400,
Bruce Ashfield <bruce.ashfield@gmail.com> wrote:
> There are alternatives that I'm going to be exploring going forward,
> just nothing
> that we can bring in during the stabilization cycle. The recipes manipulate git
> and use it to construct what you build, they don't absolutely require a full
> git history, so there are some potential savings to be had.
>
> It just obviously limits flexibility if a derived recipe wants to merge branches
> and histories to construct what is built. So having a simple/shallow history for
> basic builds while not breaking more complex cases probably hits the sweet
> spot.
>
OK in the end all the slow download problems I met while testing
oe-core & qemuarm from scratch were due to a problem on the server
hosting yocto's git and mirror services (so setting PREMIRROR to use
yocto's mirror didn't improve the situation).

Now that this problem is fixed on the yocto server, the time to
download the linux-yocto kernel went from 90-120 minutes down to 20-30
minutes, which seems more reasonable !

So there was really a problem but I was not looking in the right
direction to fix it :-(

Eric
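For a sense of scale, the before and after download times can be turned into an implied average transfer rate. This sketch assumes the linux-yocto fetch moves roughly the 610824 KiB reported by `du -s` earlier in the thread, which is only an approximation; the helper name `implied_rate_kib_s` is made up for the illustration.

```python
# Implied average transfer rate for the linux-yocto fetch.
# Assumption: the fetch transfers about as much data as the on-disk
# size of the bare clone quoted earlier in the thread (in KiB).
LINUX_YOCTO_KIB = 610824


def implied_rate_kib_s(size_kib, minutes):
    """Return the average rate in KiB/s needed to move size_kib in minutes."""
    return size_kib / (minutes * 60)


for label, minutes in (
    ("before the server fix, slow end", 120),
    ("before the server fix, fast end", 90),
    ("after the server fix, slow end", 30),
    ("after the server fix, fast end", 20),
):
    rate = implied_rate_kib_s(LINUX_YOCTO_KIB, minutes)
    print(f"{label}: {minutes} min, about {rate:.0f} KiB/s")
```

Going from 90-120 minutes to 20-30 minutes is a speed-up of between 3x and 6x, whatever the exact amount of data transferred.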
Thread overview: 20+ messages

2012-03-29 20:53 Fetch time optimization (svn : gcc/eglibc - git : linux-yocto) Eric Bénard
2012-03-29 22:03 ` Richard Purdie
2012-03-30  1:03 ` Bruce Ashfield
2012-03-30  6:44 ` Samuel Stirtzel
2012-03-30  9:21 ` Paul Eggleton
2012-03-30  9:32 ` Richard Purdie
2012-03-30 10:07 ` Samuel Stirtzel
2012-03-30 10:45 ` Richard Purdie
2012-04-02  8:15 ` Samuel Stirtzel
2012-03-30  7:00 ` Martin Jansa
2012-03-30 10:06 ` Richard Purdie
2012-03-30  8:50 ` Eric Bénard
2012-03-30 15:12 ` Richard Purdie
2012-03-30 15:24 ` Eric Bénard
2012-03-30 15:49 ` Bruce Ashfield
2012-03-30 15:55 ` Eric Bénard
2012-03-30 16:02 ` Richard Purdie
2012-03-30 16:17 ` Eric Bénard
2012-03-30 17:33 ` Bruce Ashfield
2012-03-30 18:36 ` Eric Bénard