* Performance improvements and machine build configuration
@ 2012-10-23 18:45 Elvis Dowson
2012-10-23 19:39 ` Chris Tapp
2012-10-23 21:19 ` McClintock Matthew-B29882
0 siblings, 2 replies; 7+ messages in thread
From: Elvis Dowson @ 2012-10-23 18:45 UTC (permalink / raw)
To: Yocto Discussion Mailing List
Hi,
I noticed that between commits
http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=0260bb5c6978839c068007fcff2f704937805faf
and
http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=a3d5e9e6b7729319c518dcaf25bbe0643bfb25db
the build time has improved by around 7 minutes for my machine configuration, when building a core-image-minimal rootfs for the Xilinx ZC-702 FPGA board with dual ARM Cortex-A9 CPUs.
commit id 0260bb5c6978839c068007fcff2f704937805faf took 29 minutes
commit id a3d5e9e6b7729319c518dcaf25bbe0643bfb25db took 22 minutes
The build machine is an Intel i7 3770K over-clocked to 4.2GHz, with 16GB of RAM at 1600MHz. Storage is two 120GB Intel 330 series SSDs striped in a RAID0 array, with a write performance of 838MB/s and a read performance of around 600MB/s. A Corsair HT100 liquid CPU cooler keeps the CPU at around 52 degrees Celsius during the build process. The motherboard is a Gigabyte GA-Z77X-UP5TH:
http://www.gigabyte.com/products/product-page.aspx?pid=4279#ov
This motherboard has a Thunderbolt display port, so I can re-use my existing Apple Thunderbolt display. I've run Ubuntu 12.04.1 LTS and Ubuntu 12.10 on it, and both appear to work after a few tweaks.
The only curious thing that I've noticed is that I don't see a large performance difference between a standard 3TB Seagate Barracuda 7200 RPM HDD and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than the write performance, as is normally the case with a single SSD. I'm using the motherboard's hardware RAID support on a 6Gb/s SATA 3 port.
The 3TB HDD took approximately 2 minutes longer than the 2 x 120GB RAID0 SSD configuration for commit id 0260bb5c6978839c068007fcff2f704937805faf (31 minutes vs. 29 minutes).
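To put these timings in relative terms, here is a minimal Python sketch that computes the percentage reductions from the whole-minute figures quoted above. The function name percent_reduction is only illustrative, and the inputs are single runs, so the results are indicative at best.

```python
def percent_reduction(before_min, after_min):
    """Percentage by which the build time dropped, relative to the slower run."""
    return 100.0 * (before_min - after_min) / before_min

# Newer commit vs. older commit on the RAID0 SSD array (29 min -> 22 min).
commit_gain = percent_reduction(29, 22)

# RAID0 SSD array vs. single HDD on the older commit (31 min -> 29 min).
ssd_gain = percent_reduction(31, 29)

print(f"commit change: {commit_gain:.1f}% less build time")
print(f"SSD vs. HDD:   {ssd_gain:.1f}% less build time")
```

On these numbers the commit change matters several times more than the storage change.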
My local.conf parallelism settings were set to 6 threads for both bitbake and make, for the quad-core system (8 virtual CPU cores).
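For reference, a 6-thread setup like this corresponds to the two local.conf variables BB_NUMBER_THREADS and PARALLEL_MAKE. The sketch below prints them for a given thread count. The helper name parallelism_settings is hypothetical, and starting from os.cpu_count() is an assumed rule of thumb to tune from, not a recommendation.

```python
import os

def parallelism_settings(threads):
    """Return the two local.conf lines for a given level of parallelism."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return (
        f'BB_NUMBER_THREADS = "{threads}"\n'
        f'PARALLEL_MAKE = "-j {threads}"'
    )

# The setup described above: 6 threads on 8 logical CPUs.
print(parallelism_settings(6))

# Assumption: start from the logical CPU count the OS reports, then tune.
logical_cpus = os.cpu_count() or 1
print(parallelism_settings(logical_cpus))
```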
Has anyone tried Yocto builds with a 6-core, 8-core or 10-core Xeon processor system? How do those figures fare? I'm thinking my current bottleneck might be the CPU and not the HDD (?!) for Yocto build workloads, which I find curious and would like to confirm.
Best regards,
Elvis Dowson
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance improvements and machine build configuration
2012-10-23 18:45 Performance improvements and machine build configuration Elvis Dowson
@ 2012-10-23 19:39 ` Chris Tapp
2012-10-25 8:53 ` Elvis Dowson
2012-10-23 21:19 ` McClintock Matthew-B29882
1 sibling, 1 reply; 7+ messages in thread
From: Chris Tapp @ 2012-10-23 19:39 UTC (permalink / raw)
To: Elvis Dowson; +Cc: Yocto Discussion Mailing List
On 23 Oct 2012, at 19:45, Elvis Dowson wrote:
> Hi,
> I noticed that between commits
>
> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=0260bb5c6978839c068007fcff2f704937805faf
>
> and
>
> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=a3d5e9e6b7729319c518dcaf25bbe0643bfb25db
>
> the build time has improved by around 7 minutes for my machine configuration, for building a core-image-minimal rootfs for the Xilinx ZC-702 FPGA with dual ARM Cortex A-9 CPUs.
>
> commit id 0260bb5c6978839c068007fcff2f704937805faf took 29 minutes
> commit id a3d5e9e6b7729319c518dcaf25bbe0643bfb25db took 22 minutes
>
> The machine configuration is an Intel i7 3770K over-clocked to 4.2GHz, with 16GB RAM at 1600Mhz, two 120GB SSDs configured into a striped disk array (Intel 330 series SSDs) with a write performance of 838MB/s and read performance of around 600MB/s, in RAID0 configuration, with a Corsair HT100 liquid CPU cooler keeping the CPU cool at around 52 degree centigrade during the build process. The motherboard is a gigabyte GA-Z77X-UP5TH
>
> http://www.gigabyte.com/products/product-page.aspx?pid=4279#ov
>
> This motherboard has a thunderbolt display port, so I can re-use my existing Apple Thunderbolt display. I've run Ubuntu 12.04.1 LTS and Ubuntu 12.10, and it appears to work after a few tweaks.
>
> The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>
> The 3TB HDD took the approximately 2 or 3 minutes longer than the 120GB x 2 RAID0 SSD configuration for commit id 0260bb5c6978839c068007fcff2f704937805faf (31 minutes vs. 29 minutes).
>
> My local.conf parallelism settings were set to 6 threads for bitbake and make, for the quad-core (virtual 8 cpu cores)system.
>
> Has anyone tried yocto builds with a 6-core, 8-core or 10-core Xeon processor system? How do those figures fare? I'm thinking my current bottleneck might be the CPU and not the HDD (?!), for the yocto build workloads, which I find curious and would like to confirm.
I did quite a bit of experimenting with this a while back (similar spec, but with a nearly 1000MB/s read/write SSD array). The CPU was quad-core with hyper-threading, so 8 virtual cores. I generally run with 16 threads and 16 parallel make jobs, as I find that the main performance hit is running out of stuff to keep all the cores busy.
Most of the time all 8 cores are maxed out, but around when the kernel gets built (and cross tools needed for it) I see the total CPU use drop to about 25%. This isn't because the system is I/O bound; it simply doesn't have enough tasks ready to run at that point in time.
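One way to tell "nothing ready to run" apart from "waiting on the disk" is to compare the idle and iowait counters in /proc/stat over an interval. This is a minimal sketch, assuming Linux and the standard field order of the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal). The function names and the two sample lines are made up for illustration.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]

def cpu_breakdown(before_line, after_line):
    """Return busy/idle/iowait percentages between two samples of the 'cpu' line."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    delta = [b - a for a, b in zip(before, after)]
    # Sum user..steal only; guest time is already accounted for in user time.
    total = sum(delta[:8])
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    idle, iowait = delta[3], delta[4]
    busy = total - idle - iowait
    return {
        "busy": 100.0 * busy / total,
        "idle": 100.0 * idle / total,
        "iowait": 100.0 * iowait / total,
    }

# Illustrative samples only (not measured): mostly idle, very little iowait.
sample_before = "cpu 100 0 50 800 50 0 0 0 0 0"
sample_after = "cpu 200 0 100 1400 100 0 0 0 0 0"
breakdown = cpu_breakdown(sample_before, sample_after)
print(breakdown)
```

On a live system, the two lines would be the first line of /proc/stat read a few seconds apart. High idle with low iowait would match the description above. High iowait would point at storage instead.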
I estimate that my 55 min build times would come down by 10 to 15 minutes if I could keep the CPUs busy (still, much better than the 10 hour build times on my previous system!).
I tried 'tinkering' with the run queue priority order, but all I proved was that inverting it (i.e. making the things that were previously given high priority low priority) made no measurable difference to my build times! I'm trying not to think too much about that one ;-)
Chris Tapp
opensource@keylevel.com
www.keylevel.com
* Re: Performance improvements and machine build configuration
2012-10-23 18:45 Performance improvements and machine build configuration Elvis Dowson
2012-10-23 19:39 ` Chris Tapp
@ 2012-10-23 21:19 ` McClintock Matthew-B29882
2012-10-23 21:24 ` Ross Burton
1 sibling, 1 reply; 7+ messages in thread
From: McClintock Matthew-B29882 @ 2012-10-23 21:19 UTC (permalink / raw)
To: Elvis Dowson; +Cc: Yocto Discussion Mailing List
On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com> wrote:
> The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
You probably don't use much disk I/O with 16GB of memory for a build.
-M
* Re: Performance improvements and machine build configuration
2012-10-23 21:19 ` McClintock Matthew-B29882
@ 2012-10-23 21:24 ` Ross Burton
2012-10-23 21:37 ` McClintock Matthew-B29882
0 siblings, 1 reply; 7+ messages in thread
From: Ross Burton @ 2012-10-23 21:24 UTC (permalink / raw)
To: McClintock Matthew-B29882; +Cc: Yocto Discussion Mailing List
On Tuesday, 23 October 2012 at 22:19, McClintock Matthew-B29882 wrote:
> On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com (mailto:elvis.dowson@gmail.com)> wrote:
> > The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>
>
>
> You probably don't use much disk I/O with 16GB of memory for a build.
My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic. I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
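The "cache" figure can be read from /proc/meminfo. This is a minimal sketch, assuming the usual "Name: value kB" layout of that file. The function names are illustrative and the sample text is invented to resemble a 16G machine, not copied from a real system.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values

def cached_gib(meminfo):
    """Size of the page cache ('Cached') in GiB."""
    return meminfo["Cached"] / (1024 * 1024)

# Invented sample; on a real machine, read the text from /proc/meminfo instead.
sample = """\
MemTotal:       16777216 kB
MemFree:         1048576 kB
Cached:         12582912 kB
HugePages_Total:       0
"""

info = parse_meminfo(sample)
total_gib = info["MemTotal"] / (1024 * 1024)
print(f"page cache: {cached_gib(info):.1f} GiB of {total_gib:.1f} GiB")
```

A large Cached value after a build only shows that file data passed through RAM. It does not show whether the build waited on the disk.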
Ross
* Re: Performance improvements and machine build configuration
2012-10-23 21:24 ` Ross Burton
@ 2012-10-23 21:37 ` McClintock Matthew-B29882
2012-10-23 21:55 ` Ross Burton
0 siblings, 1 reply; 7+ messages in thread
From: McClintock Matthew-B29882 @ 2012-10-23 21:37 UTC (permalink / raw)
To: Ross Burton; +Cc: McClintock Matthew-B29882, Yocto Discussion Mailing List
On Tue, Oct 23, 2012 at 4:24 PM, Ross Burton <ross.burton@intel.com> wrote:
> On Tuesday, 23 October 2012 at 22:19, McClintock Matthew-B29882 wrote:
>> On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com (mailto:elvis.dowson@gmail.com)> wrote:
>> > The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>>
>>
>>
>> You probably don't use much disk I/O with 16GB of memory for a build.
> My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic. I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
Frantic, but was it actually limiting the build time? It would seem not,
according to Elvis's observations.
-M
* Re: Performance improvements and machine build configuration
2012-10-23 21:37 ` McClintock Matthew-B29882
@ 2012-10-23 21:55 ` Ross Burton
0 siblings, 0 replies; 7+ messages in thread
From: Ross Burton @ 2012-10-23 21:55 UTC (permalink / raw)
To: McClintock Matthew-B29882; +Cc: Yocto Discussion Mailing List
On Tuesday, 23 October 2012 at 22:37, McClintock Matthew-B29882 wrote:
> > My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic. I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
>
>
> Frantic but was it actually limiting the build time? It would seem not
> according to Elvis observations.
No idea, it's always had 16G. One day I'll do some benchmarking.
Ross
* Re: Performance improvements and machine build configuration
2012-10-23 19:39 ` Chris Tapp
@ 2012-10-25 8:53 ` Elvis Dowson
0 siblings, 0 replies; 7+ messages in thread
From: Elvis Dowson @ 2012-10-25 8:53 UTC (permalink / raw)
To: Chris Tapp; +Cc: Yocto Discussion Mailing List
Hi Chris,
On Oct 23, 2012, at 11:39 PM, Chris Tapp <opensource@keylevel.com> wrote:
> On 23 Oct 2012, at 19:45, Elvis Dowson wrote:
>
>> I noticed that between commits
>>
>> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=0260bb5c6978839c068007fcff2f704937805faf
>>
>> and
>>
>> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=a3d5e9e6b7729319c518dcaf25bbe0643bfb25db
>>
>> the build time has improved by around 7 minutes for my machine configuration, for building a core-image-minimal rootfs for the Xilinx ZC-702 FPGA with dual ARM Cortex A-9 CPUs.
>>
>> commit id 0260bb5c6978839c068007fcff2f704937805faf took 29 minutes
>> commit id a3d5e9e6b7729319c518dcaf25bbe0643bfb25db took 22 minutes
>>
>> The machine configuration is an Intel i7 3770K over-clocked to 4.2GHz, with 16GB RAM at 1600Mhz, two 120GB SSDs configured into a striped disk array (Intel 330 series SSDs) with a write performance of 838MB/s and read performance of around 600MB/s, in RAID0 configuration, with a Corsair HT100 liquid CPU cooler keeping the CPU cool at around 52 degree centigrade during the build process. The motherboard is a gigabyte GA-Z77X-UP5TH
>>
>> http://www.gigabyte.com/products/product-page.aspx?pid=4279#ov
>>
>> This motherboard has a thunderbolt display port, so I can re-use my existing Apple Thunderbolt display. I've run Ubuntu 12.04.1 LTS and Ubuntu 12.10, and it appears to work after a few tweaks.
>>
>> The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>>
>> The 3TB HDD took the approximately 2 or 3 minutes longer than the 120GB x 2 RAID0 SSD configuration for commit id 0260bb5c6978839c068007fcff2f704937805faf (31 minutes vs. 29 minutes).
>>
>> My local.conf parallelism settings were set to 6 threads for bitbake and make, for the quad-core (virtual 8 cpu cores)system.
>>
>> Has anyone tried yocto builds with a 6-core, 8-core or 10-core Xeon processor system? How do those figures fare? I'm thinking my current bottleneck might be the CPU and not the HDD (?!), for the yocto build workloads, which I find curious and would like to confirm.
>
>
> I did quite a bit of experimenting with this a while back (similar spec, but with nearly 1000MB/s read/write SDD array). CPU was quad core with hyper-threading, so 8 virtual cores. I generally run with 16 threads, 16 parallel make as I find that the main performance hit is running out of stuff to keep all the cores busy.
>
> Most of the time all 8 cores are maxed out, but around when the kernel gets built (and cross tools needed for it) I see the total CPU use drop to about 25%. This isn't because the system is I/O bound; it simply doesn't have enough tasks ready to run at that point in time.
>
> I estimate that my 55 min build times would come down by 10 to 15 minutes if I could keep the CPUs busy (still, much better than the 10 hour build times on my previous system!).
>
With poky/master commit 33440ee70623394d06a4b214c2be10788cba6d08, which is the tip of the master branch, I tried two builds:
01. parallelism set to 16, which took 23 minutes 21 seconds.
02. parallelism set to 6, which took less time at 22 minutes 13 seconds.
Therefore, for a quad-core machine (Intel i7-3770K over-clocked to 4.2GHz, 16GB of 1600MHz RAM), setting the parallelism parameters to 6 appears to be slightly better than setting them to 16: the difference is about 68 seconds, or roughly 5% of the build time.
Run # 01
========
BB_NUMBER_THREADS = "16"
PARALLEL_MAKE = "-j 16"
Build Configuration:
BB_VERSION = "1.16.0"
TARGET_ARCH = "arm"
TARGET_OS = "linux-gnueabi"
MACHINE = "zynq-zc702"
DISTRO = "poky"
DISTRO_VERSION = "1.3+snapshot-20121025"
TUNE_FEATURES = "armv7a vfp neon cortexa9"
TARGET_FPU = "vfp-neon"
meta
meta-yocto = "master:33440ee70623394d06a4b214c2be10788cba6d08"
toolchain-layer = "master:55855cd569fbff7182974ca08b1de8435bf0f597"
meta-zynq-balister = "master-xilinx-zc702-gcc-4.7:d168cea411034d1f1530e4eacf6eb3ce4affd1c8"
NOTE: Resolving any missing task queue dependencies
NOTE: Preparing runqueue
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: validating kernel configuration
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
** NOTE: There were 0 required options requested that do not
have a corresponding value present in the final ".config" file.
This is a violation of the policy defined by the higher level config
The full list can be found in your kernel src dir at:
meta/cfg/standard/zynq-zc702/missing_required.cfg
NOTE: Tasks Summary: Attempted 1396 tasks of which 227 didn't need to be rerun and all succeeded.
real 23m21.545s
user 99m42.990s
sys 11m20.835s
Run # 02
========
BB_NUMBER_THREADS = "6"
PARALLEL_MAKE = "-j 6"
Build Configuration:
BB_VERSION = "1.16.0"
TARGET_ARCH = "arm"
TARGET_OS = "linux-gnueabi"
MACHINE = "zynq-zc702"
DISTRO = "poky"
DISTRO_VERSION = "1.3+snapshot-20121025"
TUNE_FEATURES = "armv7a vfp neon cortexa9"
TARGET_FPU = "vfp-neon"
meta
meta-yocto = "master:33440ee70623394d06a4b214c2be10788cba6d08"
toolchain-layer = "master:55855cd569fbff7182974ca08b1de8435bf0f597"
meta-zynq-balister = "master-xilinx-zc702-gcc-4.7:d168cea411034d1f1530e4eacf6eb3ce4affd1c8"
NOTE: Resolving any missing task queue dependencies
NOTE: Preparing runqueue
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: validating kernel configuration
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
** NOTE: There were 0 required options requested that do not
have a corresponding value present in the final ".config" file.
This is a violation of the policy defined by the higher level config
The full list can be found in your kernel src dir at:
meta/cfg/standard/zynq-zc702/missing_required.cfg
NOTE: Tasks Summary: Attempted 1396 tasks of which 227 didn't need to be rerun and all succeeded.
real 22m13.749s
user 96m16.053s
sys 11m40.320s
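As a rough cross-check, the real/user/sys figures from the two runs can be turned into an average number of busy CPUs, (user + sys) / real. This is a minimal sketch. The function names are illustrative, and the 8 logical CPUs are taken from the quad-core, hyper-threaded machine described earlier in the thread.

```python
def parse_duration(text):
    """Convert a shell 'time' value such as '23m21.545s' into seconds."""
    minutes, _, seconds = text.strip().rstrip("s").partition("m")
    return int(minutes) * 60 + float(seconds)

def average_busy_cpus(real, user, system):
    """Average number of CPUs kept busy: CPU time consumed divided by wall-clock time."""
    return (parse_duration(user) + parse_duration(system)) / parse_duration(real)

LOGICAL_CPUS = 8  # quad-core with hyper-threading

# (real, user, sys) exactly as reported for the two runs above.
runs = {
    "-j 16": ("23m21.545s", "99m42.990s", "11m20.835s"),
    "-j 6": ("22m13.749s", "96m16.053s", "11m40.320s"),
}

results = {}
for label, (real, user, system) in runs.items():
    busy = average_busy_cpus(real, user, system)
    results[label] = busy
    print(f"{label}: {busy:.2f} CPUs busy on average "
          f"({100.0 * busy / LOGICAL_CPUS:.0f}% of {LOGICAL_CPUS})")
```

Both runs average a little under 5 busy CPUs out of 8, about 60%. That is consistent with the builds running out of ready tasks for part of the time, rather than with the thread count being the limiting factor.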
Best regards,
Elvis Dowson
Thread overview: 7+ messages
2012-10-23 18:45 Performance improvements and machine build configuration Elvis Dowson
2012-10-23 19:39 ` Chris Tapp
2012-10-25 8:53 ` Elvis Dowson
2012-10-23 21:19 ` McClintock Matthew-B29882
2012-10-23 21:24 ` Ross Burton
2012-10-23 21:37 ` McClintock Matthew-B29882
2012-10-23 21:55 ` Ross Burton