* Performance improvements and machine build configuration
@ 2012-10-23 18:45 Elvis Dowson
  2012-10-23 19:39 ` Chris Tapp
  2012-10-23 21:19 ` McClintock Matthew-B29882
  0 siblings, 2 replies; 7+ messages in thread
From: Elvis Dowson @ 2012-10-23 18:45 UTC (permalink / raw)
  To: Yocto Discussion Mailing List

Hi,
      I noticed that between commits 

http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=0260bb5c6978839c068007fcff2f704937805faf

and 

http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=a3d5e9e6b7729319c518dcaf25bbe0643bfb25db

the build time has improved by around 7 minutes on my machine configuration, when building a core-image-minimal rootfs for the Xilinx ZC-702 FPGA with dual ARM Cortex-A9 CPUs.

commit id 0260bb5c6978839c068007fcff2f704937805faf  took 29 minutes
commit id a3d5e9e6b7729319c518dcaf25bbe0643bfb25db  took 22 minutes

The build machine is an Intel i7 3770K over-clocked to 4.2GHz, with 16GB of RAM at 1600MHz. Storage is two 120GB Intel 330 series SSDs striped into a RAID0 array, with a write performance of 838MB/s and a read performance of around 600MB/s. A Corsair HT100 liquid CPU cooler keeps the CPU at around 52 degrees Celsius during the build process. The motherboard is a Gigabyte GA-Z77X-UP5TH:

http://www.gigabyte.com/products/product-page.aspx?pid=4279#ov

This motherboard has a Thunderbolt display port, so I can re-use my existing Apple Thunderbolt display. I've run Ubuntu 12.04.1 LTS and Ubuntu 12.10 on it, and they appear to work after a few tweaks.

The only curious thing I've noticed is that I don't see a large performance difference between a standard 3TB Seagate Barracuda 7200 RPM HDD and the two Intel 330 series SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than the write performance, as is normally the case with a single SSD. I'm using the motherboard's hardware RAID support on a 6Gb/s SATA 3 port.

The 3TB HDD took approximately 2 to 3 minutes longer than the 2 x 120GB RAID0 SSD configuration for commit id 0260bb5c6978839c068007fcff2f704937805faf (31 minutes vs. 29 minutes).
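To put the timings quoted so far on a common scale, here is a minimal sketch that expresses each difference as a percentage of the slower build. The constant and function names are illustrative only; the minute values are the ones reported above, taken at face value from single runs.

```python
# Build times in minutes, as quoted in this message (single runs each).
OLD_COMMIT_SSD = 29  # commit 0260bb5c..., RAID0 SSD array
NEW_COMMIT_SSD = 22  # commit a3d5e9e6..., RAID0 SSD array
OLD_COMMIT_HDD = 31  # commit 0260bb5c..., 3TB HDD


def percent_saved(baseline: float, improved: float) -> float:
    """Return how much shorter `improved` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


if __name__ == "__main__":
    newer = percent_saved(OLD_COMMIT_SSD, NEW_COMMIT_SSD)
    ssd = percent_saved(OLD_COMMIT_HDD, OLD_COMMIT_SSD)
    print(f"newer commit vs. older commit: {newer:.1f}% less build time")
    print(f"SSD array vs. HDD:             {ssd:.1f}% less build time")
```

On these figures the change between commits is several times larger than the change from swapping the storage, which is what prompts the question below.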

My local.conf parallelism settings were 6 threads for both bitbake and make, on a quad-core system (8 virtual CPU cores).

Has anyone tried Yocto builds on a 6-core, 8-core or 10-core Xeon system? How do those figures compare? I suspect that my current bottleneck for Yocto build workloads is the CPU and not the HDD (?!), which I find curious and would like to confirm.

Best regards,

Elvis Dowson


* Re: Performance improvements and machine build configuration
  2012-10-23 18:45 Performance improvements and machine build configuration Elvis Dowson
@ 2012-10-23 19:39 ` Chris Tapp
  2012-10-25  8:53   ` Elvis Dowson
  2012-10-23 21:19 ` McClintock Matthew-B29882
  1 sibling, 1 reply; 7+ messages in thread
From: Chris Tapp @ 2012-10-23 19:39 UTC (permalink / raw)
  To: Elvis Dowson; +Cc: Yocto Discussion Mailing List

On 23 Oct 2012, at 19:45, Elvis Dowson wrote:

> Hi,
>      I noticed that between commits 
> 
> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=0260bb5c6978839c068007fcff2f704937805faf
> 
> and 
> 
> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=a3d5e9e6b7729319c518dcaf25bbe0643bfb25db
> 
> the build time has improved by around 7 minutes for my machine configuration, for building a core-image-minimal rootfs for the Xilinx ZC-702 FPGA with dual ARM Cortex A-9 CPUs.
> 
> commit id 0260bb5c6978839c068007fcff2f704937805faf        took 29 minutes
> commit id a3d5e9e6b7729319c518dcaf25bbe0643bfb25db  took 22 minutes
> 
> The machine configuration is an Intel i7 3770K over-clocked to 4.2GHz, with 16GB RAM at 1600Mhz, two 120GB SSDs configured into a striped disk array (Intel 330 series SSDs) with a write performance of 838MB/s and read performance of around 600MB/s, in RAID0 configuration, with a Corsair HT100 liquid CPU cooler keeping the CPU cool at around 52 degree centigrade during the build process. The motherboard is a gigabyte GA-Z77X-UP5TH
> 
> http://www.gigabyte.com/products/product-page.aspx?pid=4279#ov
> 
> This motherboard has a thunderbolt display port, so I can re-use my existing Apple Thunderbolt display. I've run Ubuntu 12.04.1 LTS and Ubuntu 12.10, and it appears to work after a few tweaks.
> 
> The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
> 
> The 3TB HDD took the approximately 2 or 3 minutes longer than the 120GB x 2 RAID0 SSD configuration for commit id 0260bb5c6978839c068007fcff2f704937805faf (31 minutes vs. 29 minutes).
> 
> My local.conf parallelism settings were set to 6 threads for bitbake and make, for the quad-core (virtual 8 cpu cores)system.
> 
> Has anyone tried yocto builds with a 6-core, 8-core or 10-core Xeon processor system? How do those figures fare? I'm thinking my current bottleneck might be the CPU and not the HDD (?!), for the yocto build workloads, which I find curious and would like to confirm. 


I did quite a bit of experimenting with this a while back (similar spec, but with a nearly 1000MB/s read/write SSD array). The CPU was a quad core with hyper-threading, so 8 virtual cores. I generally run with 16 bitbake threads and 16 parallel make jobs, as I find that the main performance hit is running out of work to keep all the cores busy.

Most of the time all 8 cores are maxed out, but around the time the kernel (and the cross tools needed for it) gets built I see the total CPU use drop to about 25%. This isn't because the system is I/O bound; it simply doesn't have enough tasks ready to run at that point in time.
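The message does not say how the 25% figure was measured. One way to sample total CPU use on Linux during a build is to read the aggregate `cpu` line of `/proc/stat` twice and compare the tick counters. The sketch below assumes that approach; the function names are illustrative.

```python
import time
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Order: user nice system idle iowait irq softirq steal.
    # Guest fields are already included in user/nice, so they are skipped.
    ticks = [int(value) for value in fields[1:9]]
    if len(ticks) < 5:
        raise ValueError("too few counters on the 'cpu' line")
    idle = ticks[3] + ticks[4]  # idle + iowait both count as "not busy"
    return idle, sum(ticks)


def busy_fraction(before: str, after: str) -> float:
    """Fraction of CPU time spent busy between two 'cpu' line samples."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        raise ValueError("samples must be taken in increasing time order")
    return 1.0 - (idle1 - idle0) / elapsed


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as handle:
        return handle.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(1.0)
    second = read_cpu_line()
    print(f"CPU busy over the last second: {busy_fraction(first, second):.0%}")
```

Counting iowait as idle time keeps the two cases apart: a build that is waiting on disk and a build that has run out of runnable tasks both show low busy time, but only the former shows a growing iowait counter.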

I estimate that my 55-minute build times would come down by 10 to 15 minutes if I could keep the CPUs busy (still, much better than the 10-hour build times on my previous system!).

I tried 'tinkering' with the run queue priority order, but all I proved was that inverting it (i.e. giving low priority to the things that were previously given high priority) made no measurable difference to my build times! I'm trying not to think too much about that one ;-)

Chris Tapp

opensource@keylevel.com
www.keylevel.com






* Re: Performance improvements and machine build configuration
  2012-10-23 18:45 Performance improvements and machine build configuration Elvis Dowson
  2012-10-23 19:39 ` Chris Tapp
@ 2012-10-23 21:19 ` McClintock Matthew-B29882
  2012-10-23 21:24   ` Ross Burton
  1 sibling, 1 reply; 7+ messages in thread
From: McClintock Matthew-B29882 @ 2012-10-23 21:19 UTC (permalink / raw)
  To: Elvis Dowson; +Cc: Yocto Discussion Mailing List

On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com> wrote:
> The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.

You probably don't use much disk I/O with 16GB of memory for a build.

-M



* Re: Performance improvements and machine build configuration
  2012-10-23 21:19 ` McClintock Matthew-B29882
@ 2012-10-23 21:24   ` Ross Burton
  2012-10-23 21:37     ` McClintock Matthew-B29882
  0 siblings, 1 reply; 7+ messages in thread
From: Ross Burton @ 2012-10-23 21:24 UTC (permalink / raw)
  To: McClintock Matthew-B29882; +Cc: Yocto Discussion Mailing List

On Tuesday, 23 October 2012 at 22:19, McClintock Matthew-B29882 wrote:
> On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com (mailto:elvis.dowson@gmail.com)> wrote:
> > The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
> 
> 
> 
> You probably don't use much disk I/O with 16GB of memory for a build.
My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic.  I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
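The message only says "according to /proc". Assuming the figure comes from the `Cached` field of `/proc/meminfo`, it can be read with a small parser like the sketch below. The function names are illustrative, and the values in that file are reported in kB.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Map each /proc/meminfo field name to its numeric value."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        name, separator, rest = line.partition(":")
        parts = rest.split()
        if separator and parts:
            values[name.strip()] = int(parts[0])
    return values


def cached_gib(text: str) -> float:
    """Return the page cache size in GiB from /proc/meminfo contents."""
    return parse_meminfo(text)["Cached"] / (1024 ** 2)


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:  # Linux only
        print(f"page cache: {cached_gib(handle.read()):.1f} GiB")
```

A large `Cached` value after a build shows that files were kept in RAM. It does not show by itself whether the disk limited the build time.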

Ross 





* Re: Performance improvements and machine build configuration
  2012-10-23 21:24   ` Ross Burton
@ 2012-10-23 21:37     ` McClintock Matthew-B29882
  2012-10-23 21:55       ` Ross Burton
  0 siblings, 1 reply; 7+ messages in thread
From: McClintock Matthew-B29882 @ 2012-10-23 21:37 UTC (permalink / raw)
  To: Ross Burton; +Cc: McClintock Matthew-B29882, Yocto Discussion Mailing List

On Tue, Oct 23, 2012 at 4:24 PM, Ross Burton <ross.burton@intel.com> wrote:
> On Tuesday, 23 October 2012 at 22:19, McClintock Matthew-B29882 wrote:
>> On Tue, Oct 23, 2012 at 1:45 PM, Elvis Dowson <elvis.dowson@gmail.com (mailto:elvis.dowson@gmail.com)> wrote:
>> > The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>>
>>
>>
>> You probably don't use much disk I/O with 16GB of memory for a build.
> My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic.  I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.

Frantic, but was it actually limiting the build time? It would seem not,
according to Elvis's observations.

-M



* Re: Performance improvements and machine build configuration
  2012-10-23 21:37     ` McClintock Matthew-B29882
@ 2012-10-23 21:55       ` Ross Burton
  0 siblings, 0 replies; 7+ messages in thread
From: Ross Burton @ 2012-10-23 21:55 UTC (permalink / raw)
  To: McClintock Matthew-B29882; +Cc: Yocto Discussion Mailing List

On Tuesday, 23 October 2012 at 22:37, McClintock Matthew-B29882 wrote:
> > My machine has 16G of RAM, and after a good build will have 12G of "cache" (according to /proc), but the disk activity light was frantic. I can only imagine it would be more frantic with less RAM to act as an over-sized disk cache.
> 
> 
> Frantic but was it actually limiting the build time? It would seem not
> according to Elvis observations.

No idea, it's always had 16G.  One day I'll do some benchmarking.

Ross 





* Re: Performance improvements and machine build configuration
  2012-10-23 19:39 ` Chris Tapp
@ 2012-10-25  8:53   ` Elvis Dowson
  0 siblings, 0 replies; 7+ messages in thread
From: Elvis Dowson @ 2012-10-25  8:53 UTC (permalink / raw)
  To: Chris Tapp; +Cc: Yocto Discussion Mailing List

Hi Chris,
                  
On Oct 23, 2012, at 11:39 PM, Chris Tapp <opensource@keylevel.com> wrote:

> On 23 Oct 2012, at 19:45, Elvis Dowson wrote:
> 
>> I noticed that between commits 
>> 
>> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=0260bb5c6978839c068007fcff2f704937805faf
>> 
>> and 
>> 
>> http://git.yoctoproject.org/cgit/cgit.cgi/poky/commit/?id=a3d5e9e6b7729319c518dcaf25bbe0643bfb25db
>> 
>> the build time has improved by around 7 minutes for my machine configuration, for building a core-image-minimal rootfs for the Xilinx ZC-702 FPGA with dual ARM Cortex A-9 CPUs.
>> 
>> commit id 0260bb5c6978839c068007fcff2f704937805faf        took 29 minutes
>> commit id a3d5e9e6b7729319c518dcaf25bbe0643bfb25db  took 22 minutes
>> 
>> The machine configuration is an Intel i7 3770K over-clocked to 4.2GHz, with 16GB RAM at 1600Mhz, two 120GB SSDs configured into a striped disk array (Intel 330 series SSDs) with a write performance of 838MB/s and read performance of around 600MB/s, in RAID0 configuration, with a Corsair HT100 liquid CPU cooler keeping the CPU cool at around 52 degree centigrade during the build process. The motherboard is a gigabyte GA-Z77X-UP5TH
>> 
>> http://www.gigabyte.com/products/product-page.aspx?pid=4279#ov
>> 
>> This motherboard has a thunderbolt display port, so I can re-use my existing Apple Thunderbolt display. I've run Ubuntu 12.04.1 LTS and Ubuntu 12.10, and it appears to work after a few tweaks.
>> 
>> The only curious thing that I've noticed is that I don't see a large performance improvement using a standard 3TB Seagate Barracuda 7200 RPM HDD, and the two Intel Series 330 SSDs in a striped RAID0 configuration. The read (600MB/s) / write (838MB/s) figures are impressive, although I expected the read performance to be higher than write performance, as is normally with a single SSD. I'm using the motherboard's hardware RAID support on a 6GB/s SATA 3 port.
>> 
>> The 3TB HDD took the approximately 2 or 3 minutes longer than the 120GB x 2 RAID0 SSD configuration for commit id 0260bb5c6978839c068007fcff2f704937805faf (31 minutes vs. 29 minutes).
>> 
>> My local.conf parallelism settings were set to 6 threads for bitbake and make, for the quad-core (virtual 8 cpu cores)system.
>> 
>> Has anyone tried yocto builds with a 6-core, 8-core or 10-core Xeon processor system? How do those figures fare? I'm thinking my current bottleneck might be the CPU and not the HDD (?!), for the yocto build workloads, which I find curious and would like to confirm. 
> 
> 
> I did quite a bit of experimenting with this a while back (similar spec, but with nearly 1000MB/s read/write SDD array). CPU was quad core with hyper-threading, so 8 virtual cores. I generally run with 16 threads, 16 parallel make as I find that the main performance hit is running out of stuff to keep all the cores busy.
> 
> Most of the time all 8 cores are maxed out, but around when the kernel gets built (and cross tools needed for it) I see the total CPU use drop to about 25%. This isn't because the system is I/O bound; it simply doesn't have enough tasks ready to run at that point in time.
> 
> I estimate that my 55 min build times would come down by 10 to 15 minutes if I could keep the CPUs busy (still, much better than the 10 hour build times on my previous system!).
> 

With the poky/master branch at commit 33440ee70623394d06a4b214c2be10788cba6d08, which is the tip of the master branch, I tried two builds:

01. parallelism set to 16, which took 23 minutes 21 seconds.

02. parallelism set to 6, which took less time at 22 minutes 13 seconds.

Therefore, for a quad-core machine (Intel i7-3770K over-clocked to 4.2GHz, 16GB of 1600MHz RAM), setting the parallelism parameters to 6 appears to be slightly better than setting them to 16, by a little over a minute in this pair of runs.


Run # 01
========

BB_NUMBER_THREADS = "16"
PARALLEL_MAKE = "-j 16"

Build Configuration:
BB_VERSION        = "1.16.0"
TARGET_ARCH       = "arm"
TARGET_OS         = "linux-gnueabi"
MACHINE           = "zynq-zc702"
DISTRO            = "poky"
DISTRO_VERSION    = "1.3+snapshot-20121025"
TUNE_FEATURES     = "armv7a vfp neon cortexa9"
TARGET_FPU        = "vfp-neon"
meta              
meta-yocto        = "master:33440ee70623394d06a4b214c2be10788cba6d08"
toolchain-layer   = "master:55855cd569fbff7182974ca08b1de8435bf0f597"
meta-zynq-balister = "master-xilinx-zc702-gcc-4.7:d168cea411034d1f1530e4eacf6eb3ce4affd1c8"

NOTE: Resolving any missing task queue dependencies
NOTE: Preparing runqueue
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: validating kernel configuration
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
** NOTE: There were 0 required options requested that do not
         have a corresponding value present in the final ".config" file.
         This is a violation of the policy defined by the higher level config
The full list can be found in your kernel src dir at:
meta/cfg/standard/zynq-zc702/missing_required.cfg
NOTE: Tasks Summary: Attempted 1396 tasks of which 227 didn't need to be rerun and all succeeded.

real	23m21.545s
user	99m42.990s
sys	11m20.835s



Run # 02
========

BB_NUMBER_THREADS = "6"
PARALLEL_MAKE = "-j 6"

Build Configuration:
BB_VERSION        = "1.16.0"
TARGET_ARCH       = "arm"
TARGET_OS         = "linux-gnueabi"
MACHINE           = "zynq-zc702"
DISTRO            = "poky"
DISTRO_VERSION    = "1.3+snapshot-20121025"
TUNE_FEATURES     = "armv7a vfp neon cortexa9"
TARGET_FPU        = "vfp-neon"
meta              
meta-yocto        = "master:33440ee70623394d06a4b214c2be10788cba6d08"
toolchain-layer   = "master:55855cd569fbff7182974ca08b1de8435bf0f597"
meta-zynq-balister = "master-xilinx-zc702-gcc-4.7:d168cea411034d1f1530e4eacf6eb3ce4affd1c8"

NOTE: Resolving any missing task queue dependencies
NOTE: Preparing runqueue
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: validating kernel configuration
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
cat: meta/cfg/standard/zynq-zc702/specified.cfg: No such file or directory
** NOTE: There were 0 required options requested that do not
         have a corresponding value present in the final ".config" file.
         This is a violation of the policy defined by the higher level config
The full list can be found in your kernel src dir at:
meta/cfg/standard/zynq-zc702/missing_required.cfg
NOTE: Tasks Summary: Attempted 1396 tasks of which 227 didn't need to be rerun and all succeeded.

real	22m13.749s
user	96m16.053s
sys	11m40.320s
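The two `time` summaries above can also be read as a rough measure of how many cores were busy on average: CPU time (user + sys) divided by wall-clock time (real). The sketch below does that arithmetic on the figures from Run # 01 and Run # 02. The function names are illustrative, and one run per setting is too few to treat the result as more than an indication.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '23m21.545s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def average_busy_cores(real: str, user: str, sys_time: str) -> float:
    """CPU seconds consumed per wall-clock second."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


# (real, user, sys) exactly as printed in the two runs above.
RUNS = {
    "Run # 01 (16 threads, -j 16)": ("23m21.545s", "99m42.990s", "11m20.835s"),
    "Run # 02 (6 threads, -j 6)": ("22m13.749s", "96m16.053s", "11m40.320s"),
}

if __name__ == "__main__":
    for label, (real, user, sys_time) in RUNS.items():
        cores = average_busy_cores(real, user, sys_time)
        print(f"{label}: {to_seconds(real):.0f}s wall, "
              f"{cores:.2f} cores busy on average")
```

Both runs come out a little under 5 busy cores on average, out of 8 virtual cores, which is consistent with the earlier observation in this thread that the build sometimes runs out of tasks that are ready to run.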


Best regards,

Elvis Dowson



