Buildroot Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [Buildroot] [PATCHv2 0/4] Generate package size statistics
@ 2014-12-01 21:41 Thomas Petazzoni
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Thomas Petazzoni @ 2014-12-01 21:41 UTC (permalink / raw)
  To: buildroot

Hello,

Buildroot currently doesn't provide any tool to help users analyze the
size contribution of the different files and packages to the overall
size of the filesystem. For people having tight system size
requirements, providing such tools could be useful.

This set of four patches implement tools that allow to generate a pie
chart graph of the size contribution of each package, plus CSV files
giving the detailed size statistics per-package and per-file.

In sequence:

 - The first patch changes the external toolchain logic to separate
   staging installation from target installation. Currently, target
   files are installed in the staging installation step, which is
   wrong and defeats the logic used in the following patches.

 - The second patch adds a new pkg_step hook which collects information
   about which file was installed by which package. The file size is
   not collected at this point, since binary stripping takes place at
   the very end of the build. This hook is conditionally enabled
   depending on a new Config.in option, because the hook calculates
   the md5 of all installed files at the beginning and end of each
   package installation, which can be considered time consuming.

   The information about which files are installed by which packages
   is stored in $(O)/build/packages-file-list.txt.

 - The third patch adds a Python script responsible for reading the
   data in $(O)/build/packages-file-list.txt, looking at the size of
   installed files, and generating the graph and CSV files.

 - The fourth patch creates a new 'size-stats' make target to allow
   users to easily generate the graph and CSV files.

Changes since v1:

 - Improve the logic to properly take into account packages that
   overwrite files installed by other packages.

 - Make the new pkg_step hook optional.

 - Extend the facility to not only generate a graph, but also CSV
   files. As a consequence, everything was renamed from 'graph-size'
   to 'size-stats'.

Best regards,

Thomas

Thomas Petazzoni (4):
  toolchain-external: split target installation from staging
    installation
  pkg-generic: add step_pkg_size global instrumentation hook
  support/scripts: add size-stats script
  Makefile: implement a size-stats target

 Config.in                                          |   9 +
 Makefile                                           |  11 +
 package/pkg-generic.mk                             |  36 ++++
 support/scripts/size-stats                         | 225 +++++++++++++++++++++
 toolchain/toolchain-external/toolchain-external.mk |  36 +++-
 5 files changed, 310 insertions(+), 7 deletions(-)
 create mode 100755 support/scripts/size-stats

-- 
2.1.0

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation
  2014-12-01 21:41 [Buildroot] [PATCHv2 0/4] Generate package size statistics Thomas Petazzoni
@ 2014-12-01 21:41 ` Thomas Petazzoni
  2014-12-02 11:00   ` Jérôme Pouiller
  2015-01-10 17:02   ` Thomas Petazzoni
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 17+ messages in thread
From: Thomas Petazzoni @ 2014-12-01 21:41 UTC (permalink / raw)
  To: buildroot

Currently, all the installation work of the toolchain-external package
is done during the install-staging step. However, in order to be able
to properly collect the size added by each package to the target
filesystem, we need to make sure that toolchain-external installs its
files to $(TARGET_DIR) during the install-target step.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 toolchain/toolchain-external/toolchain-external.mk | 36 +++++++++++++++++-----
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/toolchain/toolchain-external/toolchain-external.mk b/toolchain/toolchain-external/toolchain-external.mk
index 51e0af5..933f812 100644
--- a/toolchain/toolchain-external/toolchain-external.mk
+++ b/toolchain/toolchain-external/toolchain-external.mk
@@ -538,7 +538,7 @@ endif
 #                       considered when searching libraries for copy
 #                       to the target filesystem.
 
-define TOOLCHAIN_EXTERNAL_INSTALL_CORE
+define TOOLCHAIN_EXTERNAL_INSTALL_TARGET_LIBS
 	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
 	if test -z "$${SYSROOT_DIR}" ; then \
 		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
@@ -563,8 +563,6 @@ define TOOLCHAIN_EXTERNAL_INSTALL_CORE
 			$(call copy_toolchain_lib_root,$${ARCH_SYSROOT_DIR},$${SUPPORT_LIB_DIR},$${ARCH_LIB_DIR},$$libs,/usr/lib); \
 		done ; \
 	fi ; \
-	$(call MESSAGE,"Copying external toolchain sysroot to staging...") ; \
-	$(call copy_toolchain_sysroot,$${SYSROOT_DIR},$${ARCH_SYSROOT_DIR},$${ARCH_SUBDIR},$${ARCH_LIB_DIR},$${SUPPORT_LIB_DIR}) ; \
 	if test "$(BR2_TOOLCHAIN_EXTERNAL_GDB_SERVER_COPY)" = "y"; then \
 		$(call MESSAGE,"Copying gdbserver") ; \
 		gdbserver_found=0 ; \
@@ -582,6 +580,26 @@ define TOOLCHAIN_EXTERNAL_INSTALL_CORE
 	fi
 endef
 
+define TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS
+	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
+	if test -z "$${SYSROOT_DIR}" ; then \
+		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
+		exit 1 ; \
+	fi ; \
+	ARCH_SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
+	ARCH_LIB_DIR="$(call toolchain_find_libdir,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
+	SUPPORT_LIB_DIR="" ; \
+	if test `find $${ARCH_SYSROOT_DIR} -name 'libstdc++.a' | wc -l` -eq 0 ; then \
+		LIBSTDCPP_A_LOCATION=$$(LANG=C $(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS) -print-file-name=libstdc++.a) ; \
+		if [ -e "$${LIBSTDCPP_A_LOCATION}" ]; then \
+			SUPPORT_LIB_DIR=`readlink -f $${LIBSTDCPP_A_LOCATION} | sed -r -e 's:libstdc\+\+\.a::'` ; \
+		fi ; \
+	fi ; \
+	ARCH_SUBDIR=`echo $${ARCH_SYSROOT_DIR} | sed -r -e "s:^$${SYSROOT_DIR}(.*)/$$:\1:"` ; \
+	$(call MESSAGE,"Copying external toolchain sysroot to staging...") ; \
+	$(call copy_toolchain_sysroot,$${SYSROOT_DIR},$${ARCH_SYSROOT_DIR},$${ARCH_SUBDIR},$${ARCH_LIB_DIR},$${SUPPORT_LIB_DIR})
+endef
+
 # Special installation target used on the Blackfin architecture when
 # FDPIC is not the primary binary format being used, but the user has
 # nonetheless requested the installation of the FDPIC libraries to the
@@ -680,15 +698,19 @@ define TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT
 	fi
 endef
 
+define TOOLCHAIN_EXTERNAL_INSTALL_STAGING_CMDS
+	$(TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS)
+	$(TOOLCHAIN_EXTERNAL_INSTALL_WRAPPER)
+	$(TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT)
+endef
+
 # Even though we're installing things in both the staging, the host
 # and the target directory, we do everything within the
 # install-staging step, arbitrarily.
-define TOOLCHAIN_EXTERNAL_INSTALL_STAGING_CMDS
-	$(TOOLCHAIN_EXTERNAL_INSTALL_CORE)
+define TOOLCHAIN_EXTERNAL_INSTALL_TARGET_CMDS
+	$(TOOLCHAIN_EXTERNAL_INSTALL_TARGET_LIBS)
 	$(TOOLCHAIN_EXTERNAL_INSTALL_BFIN_FDPIC)
 	$(TOOLCHAIN_EXTERNAL_INSTALL_BFIN_FLAT)
-	$(TOOLCHAIN_EXTERNAL_INSTALL_WRAPPER)
-	$(TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT)
 endef
 
 $(eval $(generic-package))
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-12-01 21:41 [Buildroot] [PATCHv2 0/4] Generate package size statistics Thomas Petazzoni
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
@ 2014-12-01 21:41 ` Thomas Petazzoni
  2014-12-02 11:00   ` Jérôme Pouiller
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script Thomas Petazzoni
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target Thomas Petazzoni
  3 siblings, 1 reply; 17+ messages in thread
From: Thomas Petazzoni @ 2014-12-01 21:41 UTC (permalink / raw)
  To: buildroot

This patch adds a global instrumentation hook that collects the list
of files installed in $(TARGET_DIR) by each package, and stores this
list into a file called $(BUILD_DIR)/packages-file-list.txt. It can
later be used to determine the size contribution of each package to
the target root filesystem.

Note that in order to detect if a file installed by one package is
later overriden by another package, we calculate the md5 of installed
files and compare them at each installation of a new package.

This commit also adds a Config.in option to enable the collection of
this data, as calculating the md5 of all installed files at the
beginning and end of the installation of each package can be
considered a time-consuming process which maybe some users will not be
willing to suffer from.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 Config.in              |  9 +++++++++
 package/pkg-generic.mk | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/Config.in b/Config.in
index 1aa1080..328654c 100644
--- a/Config.in
+++ b/Config.in
@@ -569,6 +569,15 @@ config BR2_GLOBAL_PATCH_DIR
 	  Otherwise, if the directory <global-patch-dir>/<packagename> exists,
 	  then all *.patch files in the directory will be applied.
 
+config BR2_COLLECT_FILE_SIZE_STATS
+	bool "collect statistics about installed file size"
+	help
+	  Enable this option to let Buildroot collect data about the
+	  installed files. When this option is enabled, you will be
+	  able to use the 'size-stats' make target, which will
+	  generate a graph and CSV files giving statistics about the
+	  installed size of each file and each package.
+
 endmenu
 
 source "toolchain/Config.in"
diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index 9643a30..82f8ff8 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -55,6 +55,42 @@ define step_time
 endef
 GLOBAL_INSTRUMENTATION_HOOKS += step_time
 
+# Hooks to collect statistics about installed files
+ifeq ($(BR2_COLLECT_FILE_SIZE_STATS),y)
+
+# This hook will be called before the target installation of a
+# package. We store in a file named $(1).filelist_before the list of
+# files currently installed in the target. Note that the MD5 is also
+# stored, in order to identify if the files are overwritten.
+define step_pkg_size_start
+	(cd $(TARGET_DIR) ; find . -type f | xargs md5sum) | sort > \
+		$(BUILD_DIR)/$(1).filelist_before
+endef
+
+# This hook will be called after the target installation of a
+# package. We store in a file named $(1).filelist_after the list
+# of files (and their MD5) currently installed in the target. We then
+# do a diff with the $(1).filelist_before to compute the list of
+# files installed by this package.
+define step_pkg_size_end
+	(cd $(TARGET_DIR); find . -type f | xargs md5sum) | sort > \
+		$(BUILD_DIR)/$(1).filelist_after
+	comm -13 $(BUILD_DIR)/$(1).filelist_before $(BUILD_DIR)/$(1).filelist_after | \
+		while read hash file ; do \
+			echo "$(1),$${file}" >> $(BUILD_DIR)/packages-file-list.txt ; \
+		done
+	$(RM) -f $(BUILD_DIR)/$(1).filelist_before \
+		$(BUILD_DIR)/$(1).filelist_after
+endef
+
+define step_pkg_size
+	$(if $(filter install-target,$(2)),\
+		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
+		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
+endef
+GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
+endif
+
 # User-supplied script
 ifneq ($(BR2_INSTRUMENTATION_SCRIPTS),)
 define step_user
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script
  2014-12-01 21:41 [Buildroot] [PATCHv2 0/4] Generate package size statistics Thomas Petazzoni
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
@ 2014-12-01 21:41 ` Thomas Petazzoni
  2014-12-02 11:01   ` Jérôme Pouiller
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target Thomas Petazzoni
  3 siblings, 1 reply; 17+ messages in thread
From: Thomas Petazzoni @ 2014-12-01 21:41 UTC (permalink / raw)
  To: buildroot

This new script uses the data collected by the step_pkg_size
instrumentation hook to generate a pie chart of the size contribution
of each package to the target root filesystem, and two CSV files with
statistics about the package size and file size. To achieve this, it
looks at each file in $(TARGET_DIR), and using the
packages-file-list.txt information collected by the step_pkg_size
hook, it determines to which package the file belongs. It is therefore
able to give the size installed by each package.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 support/scripts/size-stats | 225 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 225 insertions(+)
 create mode 100755 support/scripts/size-stats

diff --git a/support/scripts/size-stats b/support/scripts/size-stats
new file mode 100755
index 0000000..7dc28a0
--- /dev/null
+++ b/support/scripts/size-stats
@@ -0,0 +1,225 @@
+#!/usr/bin/env python
+
+# Copyright (C) 2014 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+import sys
+import os
+import os.path
+import argparse
+import csv
+
+try:
+    import matplotlib.font_manager as fm
+    import matplotlib.pyplot as plt
+except ImportError:
+    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
+    exit(1)
+
+colors = ['#e60004', '#009836', '#2e1d86', '#ffed00',
+          '#0068b5', '#f28e00', '#940084', '#97c000']
+
+#
+# This function returns a dict containing as keys the files present in
+# the filesystem skeleton, and as value, the string "skeleton". It is
+# used to simulate a fake "skeleton" package, to assign the files from
+# the skeleton to some package.
+#
+# builddir: path to the Buildroot output directory
+#
+def build_skeleton_dict(builddir):
+    skeleton_files = {}
+    for root, _, files in os.walk("system/skeleton"):
+        for f in files:
+            if f == ".empty":
+                continue
+            frelpath = os.path.relpath(os.path.join(root, f), "system/skeleton")
+            # Get the real size of the installed file
+            targetpath = os.path.join(builddir, "target", frelpath)
+            if os.path.islink(targetpath):
+                continue
+            sz = os.stat(targetpath).st_size
+            skeleton_files[frelpath] = { 'pkg': "skeleton", 'size': sz }
+    return skeleton_files
+
+#
+# This function returns a dict where each key is the path of a file in
+# the root filesystem, and the value is a dict containing two
+# elements: the name of the package to which this file belongs (key:
+# pkg) and the size of the file (key: size).
+#
+# builddir: path to the Buildroot output directory
+#
+def build_package_dict(builddir):
+    pkgdict = {}
+    with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
+        for l in filelistf.readlines():
+            f = l.split(",")
+            fpath = f[1].strip().replace("./", "")
+            fullpath = os.path.join(builddir, "target", fpath)
+            if not os.path.exists(fullpath):
+                continue
+            pkg = f[0]
+            sz = os.stat(fullpath).st_size
+            pkgdict[fpath] = { 'pkg': pkg, 'size': sz }
+    pkgdict.update(build_skeleton_dict(builddir))
+    return pkgdict
+
+#
+# This function builds a dictionary that contains the name of a
+# package as key, and the size of the files installed by this package
+# as the value.
+#
+# pkgdict: dictionary with the name of the files as key, and as value
+# a dict containing the name of the package to which the files
+# belongs, and the size of the file. As returned by
+# build_package_dict.
+#
+# builddir: path to the Buildroot output directory
+#
+def build_package_size(pkgdict, builddir):
+    pkgsize = {}
+
+    for root, _, files in os.walk(os.path.join(builddir, "target")):
+        for f in files:
+            fpath = os.path.join(root, f)
+            if os.path.islink(fpath):
+                continue
+            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
+            if not frelpath in pkgdict:
+                print("WARNING: %s is not part of any package" % frelpath)
+                pkg = "unknown"
+            else:
+                pkg = pkgdict[frelpath]["pkg"]
+
+            if pkg in pkgsize:
+                pkgsize[pkg] += os.path.getsize(fpath)
+            else:
+                pkgsize[pkg] = os.path.getsize(fpath)
+
+    return pkgsize
+
+#
+# Given a dict returned by build_package_size(), this function
+# generates a pie chart of the size installed by each package.
+#
+# pkgsize: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output file for the graph
+#
+def draw_graph(pkgsize, outputf):
+    total = 0
+    for (p, sz) in pkgsize.items():
+        total += sz
+    labels = []
+    values = []
+    other_value = 0
+    for (p, sz) in pkgsize.items():
+        if sz < (total * 0.01):
+            other_value += sz
+        else:
+            labels.append("%s (%d kB)" % (p, sz / 1000.))
+            values.append(sz)
+    labels.append("Other (%d kB)" % (other_value / 1000.))
+    values.append(other_value)
+
+    plt.figure()
+    patches, texts, autotexts = plt.pie(values, labels=labels,
+                                        autopct='%1.1f%%', shadow=True,
+                                        colors=colors)
+    # Reduce text size
+    proptease = fm.FontProperties()
+    proptease.set_size('xx-small')
+    plt.setp(autotexts, fontproperties=proptease)
+    plt.setp(texts, fontproperties=proptease)
+
+    plt.title('Size per package')
+    plt.savefig(outputf)
+
+#
+# Generate a CSV file with statistics about the size of each file, its
+# size contribution to the package and to the overall system.
+#
+# pkgdict: dictionary with the name of the files as key, and as value
+# a dict containing the name of the package to which the files
+# belongs, and the size of the file. As returned by
+# build_package_dict.
+#
+# pkgsize: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output CSV file
+#
+def gen_files_csv(pkgdict, pkgsizes, outputf):
+    total = 0
+    for (p, sz) in pkgsizes.items():
+        total += sz
+    with open(outputf, 'w') as csvfile:
+        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
+        wr.writerow(["File name",
+                     "Package name",
+                     "File size",
+                     "Package size",
+                     "File size in package (%)",
+                     "File size in system (%)"])
+        for (f, info) in pkgdict.items():
+            pkgname = info["pkg"]
+            filesize = info["size"]
+            pkgsize = pkgsizes[pkgname]
+            wr.writerow([f, pkgname, filesize, pkgsize, "%.4f" % (float(filesize) / pkgsize * 100), "%.4f" % (float(filesize) / total * 100)])
+
+
+#
+# Generate a CSV file with statistics about the size of each package,
+# and their size contribution to the overall system.
+#
+# pkgsize: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output CSV file
+#
+def gen_packages_csv(pkgsizes, outputf):
+    total = 0
+    for (p, sz) in pkgsizes.items():
+        total += sz
+    with open(outputf, 'w') as csvfile:
+        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
+        wr.writerow(["Package name", "Package size", "Package size in system (%)"])
+        for (pkg, size) in pkgsizes.items():
+            wr.writerow([pkg, size, "%.4f" % (float(size) / total * 100)])
+
+parser = argparse.ArgumentParser(description='Draw build time graphs')
+
+parser.add_argument("--builddir", '-i', metavar="BUILDDIR", required=True,
+                    help="Buildroot output directory")
+parser.add_argument("--graph", '-g', metavar="GRAPH",
+                    help="Graph output file (.pdf or .png extension)")
+parser.add_argument("--file-size-csv", '-f', metavar="FILE_SIZE_CSV",
+                    help="CSV output file with file size statistics")
+parser.add_argument("--package-size-csv", '-p', metavar="PKG_SIZE_CSV",
+                    help="CSV output file with package size statistics")
+args = parser.parse_args()
+
+pkgdict = build_package_dict(args.builddir)
+pkgsize = build_package_size(pkgdict, args.builddir)
+
+if args.graph:
+    draw_graph(pkgsize, args.graph)
+if args.file_size_csv:
+    gen_files_csv(pkgdict, pkgsize, args.file_size_csv)
+if args.package_size_csv:
+    gen_packages_csv(pkgsize, args.package_size_csv)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target
  2014-12-01 21:41 [Buildroot] [PATCHv2 0/4] Generate package size statistics Thomas Petazzoni
                   ` (2 preceding siblings ...)
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script Thomas Petazzoni
@ 2014-12-01 21:41 ` Thomas Petazzoni
  2015-01-12 22:47   ` Romain Naour
  3 siblings, 1 reply; 17+ messages in thread
From: Thomas Petazzoni @ 2014-12-01 21:41 UTC (permalink / raw)
  To: buildroot

This commit implements a size-stats target that calls the script of
the same name to generate the graph and CSV files related to package
and file sizes.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 Makefile | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Makefile b/Makefile
index ad018c8..56b2b93 100644
--- a/Makefile
+++ b/Makefile
@@ -677,6 +677,16 @@ graph-depends: graph-depends-requirements
 	|tee $(BASE_DIR)/graphs/$(@).dot \
 	|dot $(BR2_GRAPH_DOT_OPTS) -T$(BR_GRAPH_OUT) -o $(BASE_DIR)/graphs/$(@).$(BR_GRAPH_OUT)
 
+size-stats:
+	@[ -f $(O)/build/packages-file-list.txt ] || \
+		{ echo "ERROR: No package size information available, please rebuild with BR2_COLLECT_FILE_SIZE_STATS" ; exit 1; }
+	@$(INSTALL) -d $(O)/graphs
+	@cd "$(CONFIG_DIR)"; \
+	$(TOPDIR)/support/scripts/graph-size --builddir $(O) \
+		--graph $(O)/graphs/graph-size.$(BR_GRAPH_OUT) \
+		--file-size-csv $(O)/build/file-size-stats.csv \
+		--package-size-csv $(O)/build/package-size-stats.csv
+
 else # ifeq ($(BR2_HAVE_DOT_CONFIG),y)
 
 all: menuconfig
@@ -889,6 +899,7 @@ endif
 	@echo '  manual-epub            - build manual in ePub'
 	@echo '  graph-build            - generate graphs of the build times'
 	@echo '  graph-depends          - generate graph of the dependency tree'
+	@echo '  size-stats             - generate stats of the filesystem size'
 	@echo
 	@echo 'Miscellaneous:'
 	@echo '  source                 - download all sources needed for offline-build'
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
@ 2014-12-02 11:00   ` Jérôme Pouiller
  2015-01-10 17:02   ` Thomas Petazzoni
  1 sibling, 0 replies; 17+ messages in thread
From: Jérôme Pouiller @ 2014-12-02 11:00 UTC (permalink / raw)
  To: buildroot

On Monday 01 December 2014 22:41:37 Thomas Petazzoni wrote:
> Currently, all the installation work of the toolchain-external package
> is done during the install-staging step. However, in order to be able
> to properly collect the size added by each package to the target
> filesystem, we need to make sure that toolchain-external installs its
> files to $(TARGET_DIR) during the install-target step.
Great, this also fix:

   rm -r target build/*/.stamp_target_installed && make

Tested-by: J?r?me Pouiller <jezz@sysmic.org>

> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  toolchain/toolchain-external/toolchain-external.mk | 36 +++++++++++++++++-----
>  1 file changed, 29 insertions(+), 7 deletions(-)

-- 
J?r?me Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
@ 2014-12-02 11:00   ` Jérôme Pouiller
  2014-12-02 12:23     ` Thomas Petazzoni
  0 siblings, 1 reply; 17+ messages in thread
From: Jérôme Pouiller @ 2014-12-02 11:00 UTC (permalink / raw)
  To: buildroot

Hello Thomas,

On Monday 01 December 2014 22:41:38 Thomas Petazzoni wrote:
> This patch adds a global instrumentation hook that collects the list
> of files installed in $(TARGET_DIR) by each package, and stores this
> list into a file called $(BUILD_DIR)/packages-file-list.txt. It can
> later be used to determine the size contribution of each package to
> the target root filesystem.
> 
> Note that in order to detect if a file installed by one package is
> later overriden by another package, we calculate the md5 of installed
> files and compare them at each installation of a new package.
> 
> This commit also adds a Config.in option to enable the collection of
> this data, as calculating the md5 of all installed files at the
> beginning and end of the installation of each package can be
> considered a time-consuming process which maybe some users will not be
> willing to suffer from.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  Config.in              |  9 +++++++++
>  package/pkg-generic.mk | 36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/Config.in b/Config.in
> index 1aa1080..328654c 100644
> --- a/Config.in
> +++ b/Config.in
> @@ -569,6 +569,15 @@ config BR2_GLOBAL_PATCH_DIR
>  	  Otherwise, if the directory <global-patch-dir>/<packagename> exists,
>  	  then all *.patch files in the directory will be applied.
>  
> +config BR2_COLLECT_FILE_SIZE_STATS
> +	bool "collect statistics about installed file size"
> +	help
> +	  Enable this option to let Buildroot collect data about the
> +	  installed files. When this option is enabled, you will be
> +	  able to use the 'size-stats' make target, which will
> +	  generate a graph and CSV files giving statistics about the
> +	  installed size of each file and each package.
> +
>  endmenu
>  
>  source "toolchain/Config.in"
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index 9643a30..82f8ff8 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -55,6 +55,42 @@ define step_time
>  endef
>  GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  
> +# Hooks to collect statistics about installed files
> +ifeq ($(BR2_COLLECT_FILE_SIZE_STATS),y)
> +
> +# This hook will be called before the target installation of a
> +# package. We store in a file named $(1).filelist_before the list of
> +# files currently installed in the target. Note that the MD5 is also
> +# stored, in order to identify if the files are overwritten.
> +define step_pkg_size_start
> +	(cd $(TARGET_DIR) ; find . -type f | xargs md5sum) | sort > \
> +		$(BUILD_DIR)/$(1).filelist_before
> +endef
I think this does not work if filename contains spaces.


> +# This hook will be called after the target installation of a
> +# package. We store in a file named $(1).filelist_after the list
> +# of files (and their MD5) currently installed in the target. We then
> +# do a diff with the $(1).filelist_before to compute the list of
> +# files installed by this package.
> +define step_pkg_size_end
> +	(cd $(TARGET_DIR); find . -type f | xargs md5sum) | sort > \
> +		$(BUILD_DIR)/$(1).filelist_after
> +	comm -13 $(BUILD_DIR)/$(1).filelist_before $(BUILD_DIR)/$(1).filelist_after | \
> +		while read hash file ; do \
> +			echo "$(1),$${file}" >> $(BUILD_DIR)/packages-file-list.txt ; \
> +		done
Does it would make sense if we also record removed lines? We may wrote 
another script that detect if a file was in conflict between two packages.

> +	$(RM) -f $(BUILD_DIR)/$(1).filelist_before \
> +		$(BUILD_DIR)/$(1).filelist_after
> +endef
> +
> +define step_pkg_size
> +	$(if $(filter install-target,$(2)),\
> +		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> +		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> +endef
> +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
> +endif
> +
>  # User-supplied script
>  ifneq ($(BR2_INSTRUMENTATION_SCRIPTS),)
>  define step_user
> 

-- 
J?r?me Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script Thomas Petazzoni
@ 2014-12-02 11:01   ` Jérôme Pouiller
  2014-12-02 12:28     ` Thomas Petazzoni
  0 siblings, 1 reply; 17+ messages in thread
From: Jérôme Pouiller @ 2014-12-02 11:01 UTC (permalink / raw)
  To: buildroot

Hello Thomas,

I have a few comments below.

On Monday 01 December 2014 22:41:39 Thomas Petazzoni wrote:
> This new script uses the data collected by the step_pkg_size
> instrumentation hook to generate a pie chart of the size contribution
> of each package to the target root filesystem, and two CSV files with
> statistics about the package size and file size. To achieve this, it
> looks at each file in $(TARGET_DIR), and using the
> packages-file-list.txt information collected by the step_pkg_size
> hook, it determines to which package the file belongs. It is therefore
> able to give the size installed by each package.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  support/scripts/size-stats | 225 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 225 insertions(+)
>  create mode 100755 support/scripts/size-stats
> 
> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
> new file mode 100755
> index 0000000..7dc28a0
> --- /dev/null
> +++ b/support/scripts/size-stats
> @@ -0,0 +1,225 @@
> +#!/usr/bin/env python
[...]
> +#
> +# This function returns a dict containing as keys the files present in
> +# the filesystem skeleton, and as value, the string "skeleton". It is
> +# used to simulate a fake "skeleton" package, to assign the files from
> +# the skeleton to some package.
> +#
> +# builddir: path to the Buildroot output directory
> +#
> +def build_skeleton_dict(builddir):
> +    skeleton_files = {}
> +    for root, _, files in os.walk("system/skeleton"):
> +        for f in files:
> +            if f == ".empty":
> +                continue
> +            frelpath = os.path.relpath(os.path.join(root, f), "system/skeleton")
> +            # Get the real size of the installed file
> +            targetpath = os.path.join(builddir, "target", frelpath)
> +            if os.path.islink(targetpath):
> +                continue
> +            sz = os.stat(targetpath).st_size
> +            skeleton_files[frelpath] = { 'pkg': "skeleton", 'size': sz }
> +    return skeleton_files
Is it possible to rely on Kconfiglib in order to support customized skeleton
and overlays? Or you think skeleton and overlays (and post-build scripts) 
should be managed by package infra (like https://patchwork.ozlabs.org/patch/399413/)?

> +#
> +# This function returns a dict where each key is the path of a file in
> +# the root filesystem, and the value is a dict containing two
> +# elements: the name of the package to which this file belongs (key:
> +# pkg) and the size of the file (key: size).
> +#
> +# builddir: path to the Buildroot output directory
> +#
> +def build_package_dict(builddir):
> +    pkgdict = {}
> +    with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
> +        for l in filelistf.readlines():
> +            f = l.split(",")
> +            fpath = f[1].strip().replace("./", "")
> +            fullpath = os.path.join(builddir, "target", fpath)
> +            if not os.path.exists(fullpath):
> +                continue
It means the file was remove by another package. You may emit a warning
there?

> +            pkg = f[0]
> +            sz = os.stat(fullpath).st_size
> +            pkgdict[fpath] = { 'pkg': pkg, 'size': sz }
If pkgdict[fpath] is already defined, it means:
  a. pkg == pkgdict[fpath].pkg -> Package was reinstalled
  b. pkg != pkgdict[fpath].pkg -> File was overwritten by another package

You may emit a warning is second case?


> +    pkgdict.update(build_skeleton_dict(builddir))
> +    return pkgdict
> +
[...]

-- 
J?r?me Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-12-02 11:00   ` Jérôme Pouiller
@ 2014-12-02 12:23     ` Thomas Petazzoni
  2014-12-02 13:22       ` Jérôme Pouiller
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Petazzoni @ 2014-12-02 12:23 UTC (permalink / raw)
  To: buildroot

Dear J?r?me Pouiller,

On Tue, 02 Dec 2014 12:00:51 +0100, J?r?me Pouiller wrote:

> > +# This hook will be called before the target installation of a
> > +# package. We store in a file named $(1).filelist_before the list of
> > +# files currently installed in the target. Note that the MD5 is also
> > +# stored, in order to identify if the files are overwritten.
> > +define step_pkg_size_start
> > +	(cd $(TARGET_DIR) ; find . -type f | xargs md5sum) | sort > \
> > +		$(BUILD_DIR)/$(1).filelist_before
> > +endef
> I think this does not work if filename contains spaces.

Hum, yes, very possible. But is Buildroot really working fine as a
whole if some file in the target filesystem has some spaces?

> > +# This hook will be called after the target installation of a
> > +# package. We store in a file named $(1).filelist_after the list
> > +# of files (and their MD5) currently installed in the target. We then
> > +# do a diff with the $(1).filelist_before to compute the list of
> > +# files installed by this package.
> > +define step_pkg_size_end
> > +	(cd $(TARGET_DIR); find . -type f | xargs md5sum) | sort > \
> > +		$(BUILD_DIR)/$(1).filelist_after
> > +	comm -13 $(BUILD_DIR)/$(1).filelist_before $(BUILD_DIR)/$(1).filelist_after | \
> > +		while read hash file ; do \
> > +			echo "$(1),$${file}" >> $(BUILD_DIR)/packages-file-list.txt ; \
> > +		done
> Does it would make sense if we also record removed lines? We may wrote 
> another script that detect if a file was in conflict between two packages.

I'm not sure to follow you here. We already take care of packages
installing the same file, that's the whole point of storing the MD5 of
each file. By using comm -13, we keep only the lines that are unique in
the second file (compared to the first file). So we keep lines for
either new files added by this package, or files already installed but
overwritten by the package (detected using the MD5).

Thanks,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script
  2014-12-02 11:01   ` Jérôme Pouiller
@ 2014-12-02 12:28     ` Thomas Petazzoni
  2014-12-02 13:24       ` Jérôme Pouiller
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Petazzoni @ 2014-12-02 12:28 UTC (permalink / raw)
  To: buildroot

Dear J?r?me Pouiller,

On Tue, 02 Dec 2014 12:01:11 +0100, J?r?me Pouiller wrote:

> Is it possible to rely on Kconfiglib in order to support customized skeleton
> and overlays? Or you think skeleton and overlays (and post-build scripts) 
> should be managed by package infra (like https://patchwork.ozlabs.org/patch/399413/)?

Ultimately, yes, I think the skeleton should be a package. Regarding
overlays, maybe not, since we want overlays to be copied on each "make"
invocation, to make them easy to install. So it would indeed be
necessary to add some support for overlays, and post-build script as
well.

I guess we could consider this a follow-up improvement, but I'm willing
to work on this topic once the basic stuff is merged.

> > +#
> > +# This function returns a dict where each key is the path of a file in
> > +# the root filesystem, and the value is a dict containing two
> > +# elements: the name of the package to which this file belongs (key:
> > +# pkg) and the size of the file (key: size).
> > +#
> > +# builddir: path to the Buildroot output directory
> > +#
> > +def build_package_dict(builddir):
> > +    pkgdict = {}
> > +    with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
> > +        for l in filelistf.readlines():
> > +            f = l.split(",")
> > +            fpath = f[1].strip().replace("./", "")
> > +            fullpath = os.path.join(builddir, "target", fpath)
> > +            if not os.path.exists(fullpath):
> > +                continue
> It means the file was remove by another package. You may emit a warning
> there?

Hum, indeed, emitting a warning here seems useful.

> > +            pkg = f[0]
> > +            sz = os.stat(fullpath).st_size
> > +            pkgdict[fpath] = { 'pkg': pkg, 'size': sz }
> If pkgdict[fpath] is already defined, it means:
>   a. pkg == pkgdict[fpath].pkg -> Package was reinstalled
>   b. pkg != pkgdict[fpath].pkg -> File was overwritten by another package
> 
> You may emit a warning is second case?

Well, it depends on whether we consider overwritten files as normal or
not. For now, the main case where we overwrite things in Buildroot is
when a package such as coreutils, installs some commands that are
"better" than the Busybox ones, in which case coreutils wins over
Busybox. But since Busybox installs symlinks, and this tool doesn't
track symlinks, we don't consider this as a file being overwritten.

So maybe I could have a warning here.

Regarding "package was reinstalled", this entire script/logic is meant
to be used after a "make clean all" cycle. I don't think it is worth
bothering with supporting package re-installation and other crazy things
that can happen outside of a normal "make clean all" cycle.

Thanks for the review!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-12-02 12:23     ` Thomas Petazzoni
@ 2014-12-02 13:22       ` Jérôme Pouiller
  2014-12-02 13:40         ` Jérôme Pouiller
  0 siblings, 1 reply; 17+ messages in thread
From: Jérôme Pouiller @ 2014-12-02 13:22 UTC (permalink / raw)
  To: buildroot

On Tuesday 02 December 2014 13:23:49 Thomas Petazzoni wrote:
> Dear J?r?me Pouiller,
> 
> On Tue, 02 Dec 2014 12:00:51 +0100, J?r?me Pouiller wrote:
> 
> > > +# This hook will be called before the target installation of a
> > > +# package. We store in a file named $(1).filelist_before the list of
> > > +# files currently installed in the target. Note that the MD5 is also
> > > +# stored, in order to identify if the files are overwritten.
> > > +define step_pkg_size_start
> > > +	(cd $(TARGET_DIR) ; find . -type f | xargs md5sum) | sort > \
> > > +		$(BUILD_DIR)/$(1).filelist_before
> > > +endef
> > I think this does not work if filename contains spaces.
> 
> Hum, yes, very possible. But is Buildroot really working fine as a
> whole if some file in the target filesystem has some spaces?
I don't know, but adding -print0/--null is cheap.


> > > +# This hook will be called after the target installation of a
> > > +# package. We store in a file named $(1).filelist_after the list
> > > +# of files (and their MD5) currently installed in the target. We then
> > > +# do a diff with the $(1).filelist_before to compute the list of
> > > +# files installed by this package.
> > > +define step_pkg_size_end
> > > +	(cd $(TARGET_DIR); find . -type f | xargs md5sum) | sort > \
> > > +		$(BUILD_DIR)/$(1).filelist_after
> > > +	comm -13 $(BUILD_DIR)/$(1).filelist_before $(BUILD_DIR)/$(1).filelist_after | \
> > > +		while read hash file ; do \
> > > +			echo "$(1),$${file}" >> $(BUILD_DIR)/packages-file-list.txt ; \
> > > +		done
> > Does it would make sense if we also record removed lines? We may wrote 
> > another script that detect if a file was in conflict between two packages.
> 
> I'm not sure to follow you here. We already take care of packages
> installing the same file, that's the whole point of storing the MD5 of
> each file. By using comm -13, we keep only the lines that are unique in
> the second file (compared to the first file). So we keep lines for
> either new files added by this package, or files already installed but
> overwritten by the package (detected using the MD5).
Recording deleted files has no interest for current purpose. However, I 
though to use packages-file-list.txt for other scripts, and especially, 
to detect suspicious file modifications.

I agree current format is enough to give information about overwrote
files, but it may be handier to exploit with file removal information.
(In add, in case of file removal, it is not possible to find guilty
package).




I just noticed another thing. To make this feature compatible to 
BR2_JLEVEL, we just need to manage a mutex in step_pkg_size hook. Do 
you planned to add one?



-- 
J?r?me Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script
  2014-12-02 12:28     ` Thomas Petazzoni
@ 2014-12-02 13:24       ` Jérôme Pouiller
  0 siblings, 0 replies; 17+ messages in thread
From: Jérôme Pouiller @ 2014-12-02 13:24 UTC (permalink / raw)
  To: buildroot

On Tuesday 02 December 2014 13:28:44 Thomas Petazzoni wrote:
> Dear J?r?me Pouiller,
> 
> On Tue, 02 Dec 2014 12:01:11 +0100, J?r?me Pouiller wrote:
[...]
> > > +            pkg = f[0]
> > > +            sz = os.stat(fullpath).st_size
> > > +            pkgdict[fpath] = { 'pkg': pkg, 'size': sz }
> > If pkgdict[fpath] is already defined, it means:
> >   a. pkg == pkgdict[fpath].pkg -> Package was reinstalled
> >   b. pkg != pkgdict[fpath].pkg -> File was overwritten by another package
> > 
> > You may emit a warning is second case?
> 
> Well, it depends on whether we consider overwritten files as normal or
> not. For now, the main case where we overwrite things in Buildroot is
> when a package such as coreutils, installs some commands that are
> "better" than the Busybox ones, in which case coreutils wins over
> Busybox. But since Busybox installs symlinks, and this tool doesn't
> track symlinks, we don't consider this as a file being overwritten.
> 
> So maybe I could have a warning here.
> 
> Regarding "package was reinstalled", this entire script/logic is meant
> to be used after a "make clean all" cycle. I don't think it is worth
> bothering with supporting package re-installation and other crazy things
> that can happen outside of a normal "make clean all" cycle.
Your current implementation seems to be resistant enough to package 
reinstall.


-- 
J?r?me Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-12-02 13:22       ` Jérôme Pouiller
@ 2014-12-02 13:40         ` Jérôme Pouiller
  0 siblings, 0 replies; 17+ messages in thread
From: Jérôme Pouiller @ 2014-12-02 13:40 UTC (permalink / raw)
  To: buildroot

On Tuesday 02 December 2014 14:22:11 J?r?me Pouiller wrote:
> On Tuesday 02 December 2014 13:23:49 Thomas Petazzoni wrote:
[...]
> I just noticed another thing. To make this feature compatible to 
> BR2_JLEVEL, we just need to manage a mutex in step_pkg_size hook. Do 
  ^^^^^^^^^^
Sure, I meant "top-level parallel make".

> you planned to add one?

-- 
J?r?me Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
  2014-12-02 11:00   ` Jérôme Pouiller
@ 2015-01-10 17:02   ` Thomas Petazzoni
  1 sibling, 0 replies; 17+ messages in thread
From: Thomas Petazzoni @ 2015-01-10 17:02 UTC (permalink / raw)
  To: buildroot

Hello,

On Mon,  1 Dec 2014 22:41:37 +0100, Thomas Petazzoni wrote:
> Currently, all the installation work of the toolchain-external package
> is done during the install-staging step. However, in order to be able
> to properly collect the size added by each package to the target
> filesystem, we need to make sure that toolchain-external installs its
> files to $(TARGET_DIR) during the install-target step.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  toolchain/toolchain-external/toolchain-external.mk | 36 +++++++++++++++++-----
>  1 file changed, 29 insertions(+), 7 deletions(-)

Since this has been tested by J?r?me, I applied it.

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target
  2014-12-01 21:41 ` [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target Thomas Petazzoni
@ 2015-01-12 22:47   ` Romain Naour
  2015-01-13  8:12     ` Thomas Petazzoni
  0 siblings, 1 reply; 17+ messages in thread
From: Romain Naour @ 2015-01-12 22:47 UTC (permalink / raw)
  To: buildroot

Hi Thomas,

Le 01/12/2014 22:41, Thomas Petazzoni a ?crit :
> This commit implements a size-stats target that calls the script of
> the same name to generate the graph and CSV files related to package
> and file sizes.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  Makefile | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/Makefile b/Makefile
> index ad018c8..56b2b93 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -677,6 +677,16 @@ graph-depends: graph-depends-requirements
>  	|tee $(BASE_DIR)/graphs/$(@).dot \
>  	|dot $(BR2_GRAPH_DOT_OPTS) -T$(BR_GRAPH_OUT) -o $(BASE_DIR)/graphs/$(@).$(BR_GRAPH_OUT)
>  
> +size-stats:
> +	@[ -f $(O)/build/packages-file-list.txt ] || \
> +		{ echo "ERROR: No package size information available, please rebuild with BR2_COLLECT_FILE_SIZE_STATS" ; exit 1; }
> +	@$(INSTALL) -d $(O)/graphs
> +	@cd "$(CONFIG_DIR)"; \
> +	$(TOPDIR)/support/scripts/graph-size --builddir $(O) \

graph-size -> size-stats ?

> +		--graph $(O)/graphs/graph-size.$(BR_GRAPH_OUT) \
> +		--file-size-csv $(O)/build/file-size-stats.csv \
> +		--package-size-csv $(O)/build/package-size-stats.csv
> +
>  else # ifeq ($(BR2_HAVE_DOT_CONFIG),y)
>  
>  all: menuconfig
> @@ -889,6 +899,7 @@ endif
>  	@echo '  manual-epub            - build manual in ePub'
>  	@echo '  graph-build            - generate graphs of the build times'
>  	@echo '  graph-depends          - generate graph of the dependency tree'
> +	@echo '  size-stats             - generate stats of the filesystem size'
>  	@echo
>  	@echo 'Miscellaneous:'
>  	@echo '  source                 - download all sources needed for offline-build'
> 

But with that fixed I get:
$ make O=test/xfsprogs/ size-stats
Traceback (most recent call last):
  File "/home/naourr/git/buildroot/support/scripts/size-stats", line 217, in <module>
    pkgdict = build_package_dict(args.builddir)
  File "/home/naourr/git/buildroot/support/scripts/size-stats", line 68, in build_package_dict
    with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
IOError: [Errno 2] No such file or directory: 'test/xfsprogs/build/packages-file-list.txt'

But this file exist:
$ ls test/xfsprogs/build/packages-file-list.txt
test/xfsprogs/build/packages-file-list.txt

Also I noticed that the man pages (and headers) 
are taken into account in packages-file-list.txt:
[...]
tmux,./usr/bin/tmux
tmux,./usr/share/man/man1/tmux.1

Since they are removed at the end, I think we want to 
remove them from packages-file-list.txt during
TARGET_DIR cleanup ?

Best regards,
Romain

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target
  2015-01-12 22:47   ` Romain Naour
@ 2015-01-13  8:12     ` Thomas Petazzoni
  2015-01-13 23:06       ` Romain Naour
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Petazzoni @ 2015-01-13  8:12 UTC (permalink / raw)
  To: buildroot

Dear Romain Naour,

On Mon, 12 Jan 2015 23:47:28 +0100, Romain Naour wrote:

> > +size-stats:
> > +	@[ -f $(O)/build/packages-file-list.txt ] || \
> > +		{ echo "ERROR: No package size information available, please rebuild with BR2_COLLECT_FILE_SIZE_STATS" ; exit 1; }
> > +	@$(INSTALL) -d $(O)/graphs
> > +	@cd "$(CONFIG_DIR)"; \
> > +	$(TOPDIR)/support/scripts/graph-size --builddir $(O) \
> 
> graph-size -> size-stats ?

Will fix, thanks for noticing (I renamed the script late in my
development).

> But with that fixed I get:
> $ make O=test/xfsprogs/ size-stats
> Traceback (most recent call last):
>   File "/home/naourr/git/buildroot/support/scripts/size-stats", line 217, in <module>
>     pkgdict = build_package_dict(args.builddir)
>   File "/home/naourr/git/buildroot/support/scripts/size-stats", line 68, in build_package_dict
>     with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
> IOError: [Errno 2] No such file or directory: 'test/xfsprogs/build/packages-file-list.txt'
> 
> But this file exist:
> $ ls test/xfsprogs/build/packages-file-list.txt
> test/xfsprogs/build/packages-file-list.txt

Might be that I didn't properly take into account out-of-tree build. My
bad, I'll have a look.

> Also I noticed that the man pages (and headers) 
> are taken into account in packages-file-list.txt:
> [...]
> tmux,./usr/bin/tmux
> tmux,./usr/share/man/man1/tmux.1
> 
> Since they are removed at the end, I think we want to 
> remove them from packages-file-list.txt during
> TARGET_DIR cleanup ?

It does not really matter if there are more files in
packages-file-list.txt that there really are in TARGET_DIR, because the
logic of the size-stats script is to travel through TARGET_DIR, and for
each file in TARGET_DIR, lookup in packages-file-list.txt which package
was the last one to install this file. So all files that are in
packages-file-list.txt but not in TARGET_DIR are purely and simply
ignored.

Thanks for your review and testing!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target
  2015-01-13  8:12     ` Thomas Petazzoni
@ 2015-01-13 23:06       ` Romain Naour
  0 siblings, 0 replies; 17+ messages in thread
From: Romain Naour @ 2015-01-13 23:06 UTC (permalink / raw)
  To: buildroot

Hi Thomas,

Le 13/01/2015 09:12, Thomas Petazzoni a ?crit :
> Dear Romain Naour,
> 
> On Mon, 12 Jan 2015 23:47:28 +0100, Romain Naour wrote:
> 
>>> +size-stats:
>>> +	@[ -f $(O)/build/packages-file-list.txt ] || \
>>> +		{ echo "ERROR: No package size information available, please rebuild with BR2_COLLECT_FILE_SIZE_STATS" ; exit 1; }
>>> +	@$(INSTALL) -d $(O)/graphs
>>> +	@cd "$(CONFIG_DIR)"; \
>>> +	$(TOPDIR)/support/scripts/graph-size --builddir $(O) \
>>
>> graph-size -> size-stats ?
> 
> Will fix, thanks for noticing (I renamed the script late in my
> development).
> 
>> But with that fixed I get:
>> $ make O=test/xfsprogs/ size-stats
>> Traceback (most recent call last):
>>   File "/home/naourr/git/buildroot/support/scripts/size-stats", line 217, in <module>
>>     pkgdict = build_package_dict(args.builddir)
>>   File "/home/naourr/git/buildroot/support/scripts/size-stats", line 68, in build_package_dict
>>     with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
>> IOError: [Errno 2] No such file or directory: 'test/xfsprogs/build/packages-file-list.txt'
>>
>> But this file exist:
>> $ ls test/xfsprogs/build/packages-file-list.txt
>> test/xfsprogs/build/packages-file-list.txt
> 
> Might be that I didn't properly take into account out-of-tree build. My
> bad, I'll have a look.

The problem is that the size-stats script is called from CONFIG_DIR
I removed this line "@cd "$(CONFIG_DIR)"; \" and I get this graph (see attached
file).

> 
>> Also I noticed that the man pages (and headers) 
>> are taken into account in packages-file-list.txt:
>> [...]
>> tmux,./usr/bin/tmux
>> tmux,./usr/share/man/man1/tmux.1
>>
>> Since they are removed at the end, I think we want to 
>> remove them from packages-file-list.txt during
>> TARGET_DIR cleanup ?
> 
> It does not really matter if there are more files in
> packages-file-list.txt that there really are in TARGET_DIR, because the
> logic of the size-stats script is to travel through TARGET_DIR, and for
> each file in TARGET_DIR, lookup in packages-file-list.txt which package
> was the last one to install this file. So all files that are in
> packages-file-list.txt but not in TARGET_DIR are purely and simply
> ignored.

Ok, I haven't reviewed the size-stats script yet.
Thank for the explanation.

Best regards,
Romain
> 
> Thanks for your review and testing!
> 
> Thomas
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph-size.pdf
Type: application/pdf
Size: 18491 bytes
Desc: not available
URL: <http://lists.busybox.net/pipermail/buildroot/attachments/20150114/e0d04737/attachment.pdf>

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2015-01-13 23:06 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-01 21:41 [Buildroot] [PATCHv2 0/4] Generate package size statistics Thomas Petazzoni
2014-12-01 21:41 ` [Buildroot] [PATCHv2 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
2014-12-02 11:00   ` Jérôme Pouiller
2015-01-10 17:02   ` Thomas Petazzoni
2014-12-01 21:41 ` [Buildroot] [PATCHv2 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
2014-12-02 11:00   ` Jérôme Pouiller
2014-12-02 12:23     ` Thomas Petazzoni
2014-12-02 13:22       ` Jérôme Pouiller
2014-12-02 13:40         ` Jérôme Pouiller
2014-12-01 21:41 ` [Buildroot] [PATCHv2 3/4] support/scripts: add size-stats script Thomas Petazzoni
2014-12-02 11:01   ` Jérôme Pouiller
2014-12-02 12:28     ` Thomas Petazzoni
2014-12-02 13:24       ` Jérôme Pouiller
2014-12-01 21:41 ` [Buildroot] [PATCHv2 4/4] Makefile: implement a size-stats target Thomas Petazzoni
2015-01-12 22:47   ` Romain Naour
2015-01-13  8:12     ` Thomas Petazzoni
2015-01-13 23:06       ` Romain Naour

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox