* Performance improvements and machine build configuration
@ 2012-10-23 18:45 Elvis Dowson
2012-10-23 19:39 ` Chris Tapp
2012-10-23 21:19 ` McClintock Matthew-B29882
0 siblings, 2 replies; 7+ messages in thread
From: Elvis Dowson @ 2012-10-23 18:45 UTC (permalink / raw)
To: Yocto Discussion Mailing List
Hi,
I noticed that between commits
http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=0260bb5c6978839c068007fcff2f704937805faf
and
http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=a3d5e9e6b7729319c518dcaf25bbe0643bfb25db
the build time has improved by around 7 minutes on my machine when building a core-image-minimal rootfs for the Xilinx ZC-702 FPGA with dual ARM Cortex-A9 CPUs.
commit id 0260bb5c6978839c068007fcff2f704937805faf took 29 minutes
commit id a3d5e9e6b7729319c518dcaf25bbe0643bfb25db took 22 minutes
The build machine is an Intel i7 3770K overclocked to 4.2GHz, with 16GB of RAM at 1600MHz and two 120GB Intel 330 series SSDs striped in a RAID0 array (write performance of 838MB/s, read performance of around 600MB/s). A Corsair HT100 liquid CPU cooler keeps the CPU at around 52 degrees centigrade during the build process. The motherboard is a Gigabyte GA-Z77X-UP5TH:
http://www.gigabyte.com/products/product-page.aspx?pid=4279#ov
This motherboard has a Thunderbolt display port, so I can re-use my existing Apple Thunderbolt display. I've run Ubuntu 12.04.1 LTS and Ubuntu 12.10 on it, and both appear to work after a few tweaks.
The only curious thing I've noticed is that I don't see a large performance difference between a standard 3TB Seagate Barracuda 7200 RPM HDD and the two Intel 330 series SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than the write performance, as is normally the case with a single SSD. I'm using the motherboard's hardware RAID support on a 6Gb/s SATA 3 port.
The 3TB HDD took approximately 2 or 3 minutes longer than the 2 x 120GB RAID0 SSD configuration for commit id 0260bb5c6978839c068007fcff2f704937805faf (31 minutes vs. 29 minutes).
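To put the timings quoted so far in relative terms, here is a minimal Python sketch. It uses only the whole-minute figures from this message; the function name `percent_faster` is just an illustrative choice, and single-run timings like these carry some noise.

```python
def percent_faster(slow_minutes: float, fast_minutes: float) -> float:
    """Return how much less wall-clock time the faster build took, in percent."""
    return (slow_minutes - fast_minutes) / slow_minutes * 100.0


# Older commit vs. newer commit, both on the RAID0 SSD array.
commit_gain = percent_faster(29, 22)
# 3TB HDD vs. RAID0 SSDs, both at the older commit.
ssd_gain = percent_faster(31, 29)

print(f"newer commit:       {commit_gain:.1f}% less build time")
print(f"RAID0 SSDs vs. HDD: {ssd_gain:.1f}% less build time")
```

The change between the two commits is worth several times more than the change of storage, which is what prompts the question about where the bottleneck is.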
My local.conf parallelism settings were set to 6 threads for both bitbake and make, on a quad-core system (8 virtual CPU cores).
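For anyone who wants to reproduce the setup, the two local.conf variables involved are BB_NUMBER_THREADS and PARALLEL_MAKE. The sketch below only formats those two lines for a chosen thread count; the helper name `parallelism_settings` is hypothetical, and the value 6 is simply the one used here, not a recommendation.

```python
def parallelism_settings(threads: int) -> str:
    """Build the two local.conf lines that set bitbake and make parallelism."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return (
        f'BB_NUMBER_THREADS = "{threads}"\n'
        f'PARALLEL_MAKE = "-j {threads}"\n'
    )


# Print the fragment for the 6-thread configuration described above.
print(parallelism_settings(6), end="")
```

The printed lines can be pasted into conf/local.conf in the build directory.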
Has anyone tried Yocto builds on a 6-core, 8-core or 10-core Xeon system? How do those figures compare? I'm thinking that my current bottleneck for Yocto build workloads might be the CPU and not the HDD (?!), which I find curious and would like to confirm.
Best regards,
Elvis Dowson
* Re: Performance improvements and machine build configuration
2012-10-23 18:45 Performance improvements and machine build configuration Elvis Dowson
@ 2012-10-23 19:39 ` Chris Tapp
2012-10-25 8:53 ` Elvis Dowson
2012-10-23 21:19 ` McClintock Matthew-B29882
1 sibling, 1 reply; 7+ messages in thread
From: Chris Tapp @ 2012-10-23 19:39 UTC (permalink / raw)
To: Elvis Dowson; +Cc: Yocto Discussion Mailing List
On 23 Oct 2012, at 19:45, Elvis Dowson wrote:
> My local.conf parallelism settings were set to 6 threads for bitbake and make, for the quad-core (virtual 8 cpu cores) system.
>
> Has anyone tried yocto builds with a 6-core, 8-core or 10-core Xeon processor system? How do those figures fare? I'm thinking my current bottleneck might be the CPU and not the HDD (?!), for the yocto build workloads, which I find curious and would like to confirm.
I did quite a bit of experimenting with this a while back (similar spec, but with a nearly 1000MB/s read/write SSD array). The CPU was quad-core with hyper-threading, so 8 virtual cores. I generally run with 16 threads and 16 parallel make jobs, as I find that the main performance hit is running out of stuff to keep all the cores busy.
Most of the time all 8 cores are maxed out, but around when the kernel gets built (and the cross tools needed for it) I see the total CPU use drop to about 25%. This isn't because the system is I/O bound; it simply doesn't have enough tasks ready to run at that point in time.
I estimate that my 55 min build times would come down by 10 to 15 minutes if I could keep the CPUs busy (still, much better than the 10 hour build times on my previous system!).
I tried 'tinkering' with the run queue priority order, but all I proved was that inverting it (i.e. giving low priority to the things that were previously given high priority) made no measurable difference to my build times! I'm trying not to think too much about that one ;-)
Chris Tapp
opensource@keylevel.com
www.keylevel.com
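One way to read the numbers in this reply is as a back-of-the-envelope model, sketched below in Python. It assumes (this is not stated in the message) that the whole saving comes from a single phase running at about 25% CPU, and that the same work would scale perfectly to 100%. Under those assumptions a phase of length t at utilisation u would shrink to t*u, so the saving is t*(1-u). The function name is illustrative.

```python
def starved_phase_minutes(saving_minutes: float, utilisation: float) -> float:
    """Length of a low-utilisation phase that would give the stated saving
    if it ran at 100% CPU instead (assumes perfect scaling)."""
    if not 0.0 < utilisation < 1.0:
        raise ValueError("utilisation must be between 0 and 1 (exclusive)")
    # saving = t * (1 - u)  =>  t = saving / (1 - u)
    return saving_minutes / (1.0 - utilisation)


low = starved_phase_minutes(10, 0.25)
high = starved_phase_minutes(15, 0.25)
print(f"implied starved phase: {low:.1f} to {high:.1f} minutes of a 55 minute build")
```

Read this way, the estimate implies that a noticeable share of the 55 minute build is spent waiting on the task graph rather than on CPU or disk, which matches the observation that more threads do not help during that phase.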
* Re: Performance improvements and machine build configuration
2012-10-23 19:39 ` Chris Tapp
@ 2012-10-25 8:53 ` Elvis Dowson
0 siblings, 0 replies; 7+ messages in thread
From: Elvis Dowson @ 2012-10-25 8:53 UTC (permalink / raw)
To: Chris Tapp; +Cc: Yocto Discussion Mailing List
Hi Chris,
On Oct 23, 2012, at 11:39 PM, Chris Tapp <opensource@keylevel.com> wrote:
> I did quite a bit of experimenting with this a while back (similar spec, but with nearly 1000MB/s read/write SSD array). CPU was quad core with hyper-threading, so 8 virtual cores. I generally run with 16 threads, 16 parallel make as I find that the main performance hit is running out of stuff to keep all the cores busy.
>
> Most of the time all 8 cores are maxed out, but around when the kernel gets built (and cross tools needed for it) I see the total CPU use drop to about 25%. This isn't because the system is I/O bound; it simply doesn't have enough tasks ready to run at that point in time.
>
> I estimate that my 55 min build times would come down by 10 to 15 minutes if I could keep the CPUs busy (still, much better than the 10 hour build times on my previous system!).
With the poky/master branch at commit 33440ee70623394d06a4b214c2be10788cba6d08, which is the tip of the master branch, I tried two builds:
01. parallelism set to 16, which took 23 minutes 21 seconds.
02. parallelism set to 6, which took less time, at 22 minutes 13 seconds.
Therefore, for a quad-core machine (Intel i7-3770K overclocked to 4.2GHz, 16GB 1600MHz RAM), setting the parallelism parameters to 6 appears to be better than setting them to 16.
Run # 01
========
BB_NUMBER_THREADS = "16"
PARALLEL_MAKE = "-j 16"
Build Configuration:
BB_VERSION = "1.16.0"
TARGET_ARCH = "arm"
TARGET_OS = "linux-gnueabi"
MACHINE = "zynq-zc702"
DISTRO = "poky"
DISTRO_VERSION = "1.3+snapshot-20121025"
TUNE_FEATURES = "armv7a vfp neon cortexa9"
TARGET_FPU = "vfp-neon"
meta
meta-yocto = "master:33440ee70623394d06a4b214c2be10788cba6d08"
toolchain-layer = "master:55855cd569fbff7182974ca08b1de8435bf0f597"
meta-zynq-balister = "master-xilinx-zc702-gcc-4.7:d168cea411034d1f1530e4eacf6eb3ce4affd1c8"
NOTE: Resolving any missing task queue dependencies
NOTE: Preparing runqueue
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: validating kernel configuration
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
** NOTE: There were 0 required options requested that do not
have a corresponding value present in the final ".config" file.
This is a violation of the policy defined by the higher level config
The full list can be found in your kernel src dir at:
meta/cfg/standard/zynq-zc702/missing_required.cfg
NOTE: Tasks Summary: Attempted 1396 tasks of which 227 didn't need to be rerun and all succeeded.
real 23m21.545s
user 99m42.990s
sys 11m20.835s
Run # 02
========
BB_NUMBER_THREADS = "6"
PARALLEL_MAKE = "-j 6"
Build Configuration:
BB_VERSION = "1.16.0"
TARGET_ARCH = "arm"
TARGET_OS = "linux-gnueabi"
MACHINE = "zynq-zc702"
DISTRO = "poky"
DISTRO_VERSION = "1.3+snapshot-20121025"
TUNE_FEATURES = "armv7a vfp neon cortexa9"
TARGET_FPU = "vfp-neon"
meta
meta-yocto = "master:33440ee70623394d06a4b214c2be10788cba6d08"
toolchain-layer = "master:55855cd569fbff7182974ca08b1de8435bf0f597"
meta-zynq-balister = "master-xilinx-zc702-gcc-4.7:d168cea411034d1f1530e4eacf6eb3ce4affd1c8"
NOTE: Resolving any missing task queue dependencies
NOTE: Preparing runqueue
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: validating kernel configuration
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
** NOTE: There were 0 required options requested that do not
have a corresponding value present in the final ".config" file.
This is a violation of the policy defined by the higher level config
The full list can be found in your kernel src dir at:
meta/cfg/standard/zynq-zc702/missing_required.cfg
NOTE: Tasks Summary: Attempted 1396 tasks of which 227 didn't need to be rerun and all succeeded.
real 22m13.749s
user 96m16.053s
sys 11m40.320s
Best regards,
Elvis Dowson
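The real/user/sys figures from the two runs above also give a rough measure of how many cores were busy on average: (user + sys) / real. The Python sketch below does that arithmetic on the values exactly as posted. The helper names are illustrative, and the ratio is only an approximation, since it averages over the whole build and counts hyper-threads as full cores.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '23m21.545s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def average_busy_cores(real: str, user: str, system: str) -> float:
    """Total CPU time divided by wall-clock time."""
    return (to_seconds(user) + to_seconds(system)) / to_seconds(real)


run_16 = average_busy_cores("23m21.545s", "99m42.990s", "11m20.835s")
run_6 = average_busy_cores("22m13.749s", "96m16.053s", "11m40.320s")

print(f"-j 16: {run_16:.2f} cores busy on average")
print(f"-j 6:  {run_6:.2f} cores busy on average")
```

Both runs average a little under five busy cores out of eight virtual ones, so neither setting keeps the CPU saturated for the whole build, which is consistent with Chris's point about the task queue running dry.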
* Re: Performance improvements and machine build configuration
2012-10-23 18:45 Performance improvements and machine build configuration Elvis Dowson
2012-10-23 19:39 ` Chris Tapp
@ 2012-10-23 21:19 ` McClintock Matthew-B29882
2012-10-23 21:24 ` Ross Burton
1 sibling, 1 reply; 7+ messages in thread
From: McClintock Matthew-B29882 @ 2012-10-23 21:19 UTC (permalink / raw)
To: Elvis Dowson; +Cc: Yocto Discussion Mailing List
On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com> wrote:
> The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
You probably don't use much disk I/O with 16GB of memory for a build.
-M
* Re: Performance improvements and machine build configuration
2012-10-23 21:19 ` McClintock Matthew-B29882
@ 2012-10-23 21:24 ` Ross Burton
2012-10-23 21:37 ` McClintock Matthew-B29882
0 siblings, 1 reply; 7+ messages in thread
From: Ross Burton @ 2012-10-23 21:24 UTC (permalink / raw)
To: McClintock Matthew-B29882; +Cc: Yocto Discussion Mailing List
On Tuesday, 23 October 2012 at 22:19, McClintock Matthew-B29882 wrote:
> On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com> wrote:
> > The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>
> You probably don't use much disk I/O with 16GB of memory for a build.
My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic. I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
Ross
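The message above says only "according to /proc" for the cache figure. Assuming that means the `Cached:` line of /proc/meminfo (an assumption, not something the message states), a minimal Python sketch for reading it could look like this. The sample text is invented for illustration and is not taken from any machine in this thread.

```python
def cached_gib(meminfo_text: str) -> float:
    """Return the page-cache size in GiB from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        # "SwapCached:" is a different field, so match the exact prefix.
        if line.startswith("Cached:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib / (1024 * 1024)
    raise ValueError("no 'Cached:' line found")


# Made-up sample input; on a Linux host pass open("/proc/meminfo").read().
sample = "MemTotal:       16384000 kB\nSwapCached:            0 kB\nCached:         12582912 kB\n"
print(f"page cache: {cached_gib(sample):.1f} GiB")
```

A large page cache after a build shows that file data is being held in RAM, but on its own it does not say whether the disk was the limiting factor during the build.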
* Re: Performance improvements and machine build configuration
2012-10-23 21:24 ` Ross Burton
@ 2012-10-23 21:37 ` McClintock Matthew-B29882
2012-10-23 21:55 ` Ross Burton
0 siblings, 1 reply; 7+ messages in thread
From: McClintock Matthew-B29882 @ 2012-10-23 21:37 UTC (permalink / raw)
To: Ross Burton; +Cc: McClintock Matthew-B29882, Yocto Discussion Mailing List
On Tue, Oct 23, 2012 at 4:24 PM, Ross Burton <ross.burton@intel.com> wrote:
> On Tuesday, 23 October 2012 at 22:19, McClintock Matthew-B29882 wrote:
>> On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com> wrote:
>> > The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>>
>> You probably don't use much disk I/O with 16GB of memory for a build.
> My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic. I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
Frantic, but was it actually limiting the build time? It would seem not, according to Elvis's observations.
-M
* Re: Performance improvements and machine build configuration
2012-10-23 21:37 ` McClintock Matthew-B29882
@ 2012-10-23 21:55 ` Ross Burton
0 siblings, 0 replies; 7+ messages in thread
From: Ross Burton @ 2012-10-23 21:55 UTC (permalink / raw)
To: McClintock Matthew-B29882; +Cc: Yocto Discussion Mailing List
On Tuesday, 23 October 2012 at 22:37, McClintock Matthew-B29882 wrote:
> > My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic. I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
>
> Frantic but was it actually limiting the build time? It would seem not according to Elvis observations.
No idea, it's always had 16G. One day I'll do some benchmarking.
Ross