Buildroot Archive on lore.kernel.org
From: Yann E. MORIN <yann.morin.1998@free.fr>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCHv5 2/4] support/scripts: add size-stats script
Date: Thu, 3 Sep 2015 12:27:24 +0200
Message-ID: <20150903102724.GB3689@free.fr>
In-Reply-To: <1441228505-23235-3-git-send-email-thomas.petazzoni@free-electrons.com>

Thomas, All,

On 2015-09-02 23:15 +0200, Thomas Petazzoni spake thusly:
> This new script uses the data collected by the step_pkg_size
> instrumentation hook to generate a pie chart of the size contribution
> of each package to the target root filesystem, and two CSV files with
> statistics about the package size and file size. To achieve this, it
> looks at each file in $(TARGET_DIR), and using the
> packages-file-list.txt information collected by the step_pkg_size
> hook, it determines to which package the file belongs. It is therefore
> able to give the size installed by each package.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
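
For readers following along, the attribution step described above can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the code from the patch: attribute_sizes() and demo() are
hypothetical names, the sample entries are made up, Python 3 is assumed, and
each line of packages-file-list.txt is assumed to have the form
"pkg,./path".

```python
import collections
import os
import tempfile


def attribute_sizes(target_dir, file_list_lines):
    """Return {package: total size in bytes} for regular files in target_dir.

    Hypothetical helper: file_list_lines is assumed to hold "pkg,./path"
    entries, as collected by the step_pkg_size hook.
    """
    # Map each relative path to the package that installed it.
    owner = {}
    for line in file_list_lines:
        pkg, path = line.strip().split(",", 1)
        owner[os.path.normpath(path)] = pkg  # drops the leading "./"

    sizes = collections.defaultdict(int)
    for root, _, files in os.walk(target_dir):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # symlinks are skipped, as in the patch
            rel = os.path.relpath(full, target_dir)
            # Files not listed by any package are grouped under "unknown".
            sizes[owner.get(rel, "unknown")] += os.path.getsize(full)
    return dict(sizes)


def demo():
    """Build a tiny throwaway tree (made-up contents) and attribute it."""
    with tempfile.TemporaryDirectory() as target:
        os.makedirs(os.path.join(target, "bin"))
        with open(os.path.join(target, "bin", "busybox"), "wb") as f:
            f.write(b"x" * 300)
        with open(os.path.join(target, "stray"), "wb") as f:
            f.write(b"y" * 20)
        return attribute_sizes(target, ["busybox,./bin/busybox\n"])
```

Calling demo() attributes the 300-byte file to "busybox" and the unlisted
20-byte file to "unknown", which mirrors the WARNING path in the script.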

Tested-by: "Yann E. MORIN" <yann.morin.1998@free.fr>

I'm not in a position to review the code, however; I'll leave that to
our Python experts. ;-)

Regards,
Yann E. MORIN.

> ---
>  support/scripts/size-stats | 217 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 217 insertions(+)
>  create mode 100755 support/scripts/size-stats
> 
> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
> new file mode 100755
> index 0000000..54685f6
> --- /dev/null
> +++ b/support/scripts/size-stats
> @@ -0,0 +1,217 @@
> +#!/usr/bin/env python
> +
> +# Copyright (C) 2014 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> +
> +import sys
> +import os
> +import os.path
> +import argparse
> +import csv
> +import collections
> +
> +try:
> +    import matplotlib
> +    matplotlib.use('Agg')
> +    import matplotlib.font_manager as fm
> +    import matplotlib.pyplot as plt
> +except ImportError:
> +    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
> +    sys.exit(1)
> +
> +colors = ['#e60004', '#009836', '#2e1d86', '#ffed00',
> +          '#0068b5', '#f28e00', '#940084', '#97c000']
> +
> +#
> +# This function adds a new file to 'filesdict', after checking its
> +# size. The 'filesdict' contains the relative path of the file as the
> +# key, and as the value a tuple containing the name of the package to
> +# which the file belongs and the size of the file.
> +#
> +# filesdict: the dict to which the file is added
> +# relpath: relative path of the file
> +# abspath: absolute path to the file
> +# pkg: package to which the file belongs
> +#
> +def add_file(filesdict, relpath, abspath, pkg):
> +    if not os.path.exists(abspath):
> +        return
> +    if os.path.islink(abspath):
> +        return
> +    sz = os.stat(abspath).st_size
> +    filesdict[relpath] = (pkg, sz)
> +
> +#
> +# This function returns a dict where each key is the path of a file in
> +# the root filesystem, and the value is a tuple containing two
> +# elements: the name of the package to which this file belongs and the
> +# size of the file.
> +#
> +# builddir: path to the Buildroot output directory
> +#
> +def build_package_dict(builddir):
> +    filesdict = {}
> +    with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
> +        for l in filelistf.readlines():
> +            pkg, fpath = l.split(",", 1)
> +            # remove the initial './' in each file path
> +            fpath = fpath.strip()[2:]
> +            fullpath = os.path.join(builddir, "target", fpath)
> +            add_file(filesdict, fpath, fullpath, pkg)
> +    return filesdict
> +
> +#
> +# This function builds a dictionary that contains the name of a
> +# package as key, and the size of the files installed by this package
> +# as the value.
> +#
> +# filesdict: dictionary with the name of the files as key, and as
> +# value a tuple containing the name of the package to which the file
> +# belongs, and the size of the file. As returned by
> +# build_package_dict.
> +#
> +# builddir: path to the Buildroot output directory
> +#
> +def build_package_size(filesdict, builddir):
> +    pkgsize = collections.defaultdict(int)
> +
> +    for root, _, files in os.walk(os.path.join(builddir, "target")):
> +        for f in files:
> +            fpath = os.path.join(root, f)
> +            if os.path.islink(fpath):
> +                continue
> +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
> +            if frelpath not in filesdict:
> +                print("WARNING: %s is not part of any package" % frelpath)
> +                pkg = "unknown"
> +            else:
> +                pkg = filesdict[frelpath][0]
> +
> +            pkgsize[pkg] += os.path.getsize(fpath)
> +
> +    return pkgsize
> +
> +#
> +# Given a dict returned by build_package_size(), this function
> +# generates a pie chart of the size installed by each package.
> +#
> +# pkgsize: dictionary with the name of the package as a key, and the
> +# size as the value, as returned by build_package_size.
> +#
> +# outputf: output file for the graph
> +#
> +def draw_graph(pkgsize, outputf):
> +    total = sum(pkgsize.values())
> +    labels = []
> +    values = []
> +    other_value = 0
> +    for (p, sz) in pkgsize.items():
> +        if sz < (total * 0.01):
> +            other_value += sz
> +        else:
> +            labels.append("%s (%d kB)" % (p, sz / 1000.))
> +            values.append(sz)
> +    labels.append("Other (%d kB)" % (other_value / 1000.))
> +    values.append(other_value)
> +
> +    plt.figure()
> +    patches, texts, autotexts = plt.pie(values, labels=labels,
> +                                        autopct='%1.1f%%', shadow=True,
> +                                        colors=colors)
> +    # Reduce text size
> +    proptease = fm.FontProperties()
> +    proptease.set_size('xx-small')
> +    plt.setp(autotexts, fontproperties=proptease)
> +    plt.setp(texts, fontproperties=proptease)
> +
> +    plt.suptitle("Filesystem size per package", fontsize=18, y=.97)
> +    plt.title("Total filesystem size: %d kB" % (total / 1000.), fontsize=10, y=.96)
> +    plt.savefig(outputf)
> +
> +#
> +# Generate a CSV file with statistics about the size of each file, its
> +# size contribution to the package and to the overall system.
> +#
> +# filesdict: dictionary with the name of the files as key, and as
> +# value a tuple containing the name of the package to which the file
> +# belongs, and the size of the file. As returned by
> +# build_package_dict.
> +#
> +# pkgsizes: dictionary with the name of the package as a key, and the
> +# size as the value, as returned by build_package_size.
> +#
> +# outputf: output CSV file
> +#
> +def gen_files_csv(filesdict, pkgsizes, outputf):
> +    total = 0
> +    for (p, sz) in pkgsizes.items():
> +        total += sz
> +    with open(outputf, 'w') as csvfile:
> +        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
> +        wr.writerow(["File name",
> +                     "Package name",
> +                     "File size",
> +                     "Package size",
> +                     "File size in package (%)",
> +                     "File size in system (%)"])
> +        for f, (pkgname, filesize) in filesdict.items():
> +            pkgsize = pkgsizes[pkgname]
> +            wr.writerow([f, pkgname, filesize, pkgsize,
> +                         "%.1f" % (float(filesize) / pkgsize * 100),
> +                         "%.1f" % (float(filesize) / total * 100)])
> +
> +
> +#
> +# Generate a CSV file with statistics about the size of each package,
> +# and their size contribution to the overall system.
> +#
> +# pkgsizes: dictionary with the name of the package as a key, and the
> +# size as the value, as returned by build_package_size.
> +#
> +# outputf: output CSV file
> +#
> +def gen_packages_csv(pkgsizes, outputf):
> +    total = sum(pkgsizes.values())
> +    with open(outputf, 'w') as csvfile:
> +        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
> +        wr.writerow(["Package name", "Package size", "Package size in system (%)"])
> +        for (pkg, size) in pkgsizes.items():
> +            wr.writerow([pkg, size, "%.1f" % (float(size) / total * 100)])
> +
> +parser = argparse.ArgumentParser(description='Draw size statistics graphs')
> +
> +parser.add_argument("--builddir", '-i', metavar="BUILDDIR", required=True,
> +                    help="Buildroot output directory")
> +parser.add_argument("--graph", '-g', metavar="GRAPH",
> +                    help="Graph output file (.pdf or .png extension)")
> +parser.add_argument("--file-size-csv", '-f', metavar="FILE_SIZE_CSV",
> +                    help="CSV output file with file size statistics")
> +parser.add_argument("--package-size-csv", '-p', metavar="PKG_SIZE_CSV",
> +                    help="CSV output file with package size statistics")
> +args = parser.parse_args()
> +
> +# Find out which package installed what files
> +pkgdict = build_package_dict(args.builddir)
> +
> +# Collect the size installed by each package
> +pkgsize = build_package_size(pkgdict, args.builddir)
> +
> +if args.graph:
> +    draw_graph(pkgsize, args.graph)
> +if args.file_size_csv:
> +    gen_files_csv(pkgdict, pkgsize, args.file_size_csv)
> +if args.package_size_csv:
> +    gen_packages_csv(pkgsize, args.package_size_csv)
> -- 
> 2.5.1
> 
> _______________________________________________
> buildroot mailing list
> buildroot at busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

Thread overview: 17+ messages
2015-09-02 21:15 [Buildroot] [PATCHv5 0/4] Generate package size statistics Thomas Petazzoni
2015-09-02 21:15 ` [Buildroot] [PATCHv5 1/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
2015-09-03 10:25   ` Yann E. MORIN
2015-09-09 13:11   ` Vicente Olivert Riera
2015-09-09 13:46     ` Thomas Petazzoni
2015-09-09 13:50       ` Vicente Olivert Riera
2015-09-09 14:10         ` Thomas Petazzoni
2015-09-02 21:15 ` [Buildroot] [PATCHv5 2/4] support/scripts: add size-stats script Thomas Petazzoni
2015-09-03 10:27   ` Yann E. MORIN [this message]
2015-09-02 21:15 ` [Buildroot] [PATCHv5 3/4] Makefile: implement a size-stats target Thomas Petazzoni
2015-09-03 10:29   ` Yann E. MORIN
2015-09-03 12:21     ` Thomas Petazzoni
2015-09-03 12:36       ` Yann E. MORIN
2015-09-02 21:15 ` [Buildroot] [PATCHv5 4/4] docs/manual: add section about size graphing Thomas Petazzoni
2015-09-03 10:37   ` Yann E. MORIN
2015-09-03 10:43     ` Yann E. MORIN
2015-09-15  2:54       ` Ryan Barnett