* [PATCH 0/2] Preserve hard links
@ 2011-02-09  5:51 Mark Hatle
  2011-02-09  5:51 ` [PATCH 1/2] package.bbclass: Preserve hard links! Mark Hatle
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mark Hatle @ 2011-02-09  5:51 UTC
  To: poky

While working on another part of the system, I noticed that Poky didn't appear
to be preserving any hard links in the system.  This change not only
preserves hardlinks during packaging, but also shrinks the overall
disk space required for a build.  In my poky-image-minimal my required
footprint shrank by 200+ MB.

For the second patch, I took a guess as to the right components to
modify.  It has passed the testing I've performed so far, but I'm not
100% sure that it's correct.  I believe a bit of additional review may be
necessary.

Pull URL: git://git.pokylinux.org/poky-contrib.git
  Branch: mhatle/hardlink
  Browse: http://git.pokylinux.org/cgit.cgi/poky-contrib/log/?h=mhatle/hardlink

Thanks,
    Mark Hatle <mark.hatle@windriver.com>
---


Mark Hatle (2):
  package.bbclass: Preserve hard links!
  Misc hard link fixes

 meta/classes/libc-package.bbclass     |   12 ++++++------
 meta/classes/package.bbclass          |   18 +++++++++++++++++-
 meta/classes/populate_sdk_deb.bbclass |    2 +-
 meta/classes/sourcepkg.bbclass        |    2 +-
 meta/classes/staging.bbclass          |    2 +-
 5 files changed, 26 insertions(+), 10 deletions(-)

-- 
1.7.3.4
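
The footprint saving reported above follows from how disk usage is accounted: names that share an inode occupy storage only once, while a plain copy turns each name into its own file. The following is a minimal sketch of that accounting, assuming a POSIX filesystem. `tree_footprint` and `demo` are hypothetical helpers written for illustration and are not part of Poky.

```python
import os
import tempfile


def tree_footprint(root):
    """Return (apparent_bytes, unique_bytes) for files under root.

    apparent_bytes counts every name; unique_bytes counts each
    (st_dev, st_ino) pair once, so hard-linked names are not double counted.
    """
    apparent = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks hold no file data of their own
            st = os.lstat(path)
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique


def demo():
    """Build a tiny tree containing one hard link and measure it."""
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "busybox")
        with open(original, "wb") as handle:
            handle.write(b"x" * 4096)
        os.link(original, os.path.join(root, "sh"))
        return tree_footprint(root)
```

The gap between the two numbers is the space that a copy which discards hard links would consume in addition.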




* [PATCH 1/2] package.bbclass: Preserve hard links!
  2011-02-09  5:51 [PATCH 0/2] Preserve hard links Mark Hatle
@ 2011-02-09  5:51 ` Mark Hatle
  2011-02-09  5:51 ` [PATCH 2/2] Misc hard link fixes Mark Hatle
  2011-02-09 12:20 ` [PATCH 0/2] Preserve hard links Richard Purdie
  2 siblings, 0 replies; 5+ messages in thread
From: Mark Hatle @ 2011-02-09  5:51 UTC
  To: poky

Hard links were not being preserved in the copy from the install image
-> package.  They were also being discarded in the package ->
packages-split copy.

By preserving the hard links we have the potential to save a ton of rootfs
space.

Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
---
 meta/classes/package.bbclass |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/meta/classes/package.bbclass b/meta/classes/package.bbclass
index 8f58ad0..0698f64 100644
--- a/meta/classes/package.bbclass
+++ b/meta/classes/package.bbclass
@@ -323,7 +323,8 @@ python perform_packagecopy () {
 	# Start by package population by taking a copy of the installed 
 	# files to operate on
 	os.system('rm -rf %s/*' % (dvar))
-	os.system('cp -pPR %s/* %s/' % (dest, dvar))
+	# Preserve sparse files and hard links
+	os.system('tar -cf - -C %s -ps . | tar -xf - -C %s' % (dest, dvar))
 }
 
 python populate_packages () {
@@ -383,6 +384,7 @@ python populate_packages () {
 
 		filesvar = bb.data.getVar('FILES', localdata, True) or ""
 		files = filesvar.split()
+		file_links = {}
 		for file in files:
 			if os.path.isabs(file):
 				file = '.' + file
@@ -406,9 +408,23 @@ python populate_packages () {
 				bb.mkdirhier(os.path.join(root,file))
 				os.chmod(os.path.join(root,file), os.stat(file).st_mode)
 				continue
+
 			fpath = os.path.join(root,file)
 			dpath = os.path.dirname(fpath)
 			bb.mkdirhier(dpath)
+
+			# Check if this is a hardlink to something... if it is
+			# attempt to preserve the link information, instead of copy.
+			if not os.path.islink(file):
+				s = os.stat(file)
+				if s.st_nlink > 1:
+					file_reference = "%d_%d" % (s.st_dev, s.st_ino)
+					if file_reference not in file_links:
+						# Save the reference for next time...
+						file_links[file_reference] = fpath
+					else:
+						os.link(file_links[file_reference], fpath)
+						continue
 			ret = bb.copyfile(file, fpath)
 			if ret is False or ret == 0:
 				raise bb.build.FuncFailed("File population failed")
-- 
1.7.3.4
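
The populate_packages change above can be read as a small standalone routine: remember where the first name of each inode was copied, and link later names to that copy instead of copying again. The following is a sketch only, assuming a POSIX filesystem and Python 3. `copy_preserving_links` and `demo` are hypothetical names, `shutil.copy2` stands in for `bb.copyfile`, and a tuple key replaces the patch's "%d_%d" string.

```python
import os
import shutil
import tempfile


def copy_preserving_links(src_root, dest_root, rel_paths):
    """Copy rel_paths from src_root to dest_root, re-creating hard links.

    The first name seen for an inode is copied; later names for the same
    (st_dev, st_ino) pair are linked to that first copy.
    """
    first_copy = {}  # (st_dev, st_ino) -> destination path of the first copy
    for rel in rel_paths:
        src = os.path.join(src_root, rel)
        dest = os.path.join(dest_root, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        if not os.path.islink(src):
            st = os.stat(src)
            if st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                if key in first_copy:
                    os.link(first_copy[key], dest)
                    continue
                first_copy[key] = dest
        # Symlinks are copied as symlinks rather than followed.
        shutil.copy2(src, dest, follow_symlinks=False)


def demo():
    """Copy two hard-linked names and report whether they stay linked."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "package")
        dest = os.path.join(tmp, "packages-split")
        os.makedirs(os.path.join(src, "bin"))
        with open(os.path.join(src, "bin", "gzip"), "w") as handle:
            handle.write("payload")
        os.link(os.path.join(src, "bin", "gzip"),
                os.path.join(src, "bin", "gunzip"))
        copy_preserving_links(src, dest, ["bin/gzip", "bin/gunzip"])
        first = os.stat(os.path.join(dest, "bin", "gzip"))
        second = os.stat(os.path.join(dest, "bin", "gunzip"))
        return first.st_ino == second.st_ino, first.st_nlink
```

Links are only re-created among the names passed in one call, so a file whose other names live outside `rel_paths` is simply copied.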




* [PATCH 2/2] Misc hard link fixes
  2011-02-09  5:51 [PATCH 0/2] Preserve hard links Mark Hatle
  2011-02-09  5:51 ` [PATCH 1/2] package.bbclass: Preserve hard links! Mark Hatle
@ 2011-02-09  5:51 ` Mark Hatle
  2011-02-09 12:20 ` [PATCH 0/2] Preserve hard links Richard Purdie
  2 siblings, 0 replies; 5+ messages in thread
From: Mark Hatle @ 2011-02-09  5:51 UTC
  To: poky

I searched the various classes and looked for copies that should attempt to
preserve hardlinks.  This fixes the majority of these copies by switching to
using tar as the copy method.  It also has the side effect of preserving sparse
files.

Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
---
 meta/classes/libc-package.bbclass     |   12 ++++++------
 meta/classes/populate_sdk_deb.bbclass |    2 +-
 meta/classes/sourcepkg.bbclass        |    2 +-
 meta/classes/staging.bbclass          |    2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/meta/classes/libc-package.bbclass b/meta/classes/libc-package.bbclass
index 733f26b..c9d81f0 100644
--- a/meta/classes/libc-package.bbclass
+++ b/meta/classes/libc-package.bbclass
@@ -104,24 +104,24 @@ TMP_LOCALE="/tmp/locale${libdir}/locale"
 do_prep_locale_tree() {
 	treedir=${WORKDIR}/locale-tree
 	rm -rf $treedir
-	mkdir -p $treedir/bin $treedir/lib $treedir/${datadir} $treedir/${libdir}/locale
-	cp -pPR ${PKGD}${datadir}/i18n $treedir/${datadir}/i18n
+	mkdir -p $treedir/${base_bindir} $treedir/${base_libdir} $treedir/${datadir} $treedir/${libdir}/locale
+	tar -cf - -C ${PKGD}${datadir} -ps i18n | tar -xf - -C $treedir/${datadir}
 	# unzip to avoid parsing errors
 	for i in $treedir/${datadir}/i18n/charmaps/*gz; do 
 		gunzip $i
 	done
-	cp -pPR ${PKGD}${base_libdir}/* $treedir/lib
+	tar -cf - -C ${PKGD}${base_libdir} -ps . | tar -xf - -C $treedir/${base_libdir}
 	if [ -f ${STAGING_DIR_NATIVE}${prefix_native}/lib/libgcc_s.* ]; then
-		cp -pPR ${STAGING_DIR_NATIVE}/${prefix_native}/lib/libgcc_s.* $treedir/lib
+		tar -cf - -C ${STAGING_DIR_NATIVE}/${prefix_native}/${base_libdir} -ps libgcc_s.* | tar -xf - -C $treedir/${base_libdir}
 	fi
-	install -m 0755 ${PKGD}${bindir}/localedef $treedir/bin
+	install -m 0755 ${PKGD}${bindir}/localedef $treedir/${base_bindir}
 }
 
 do_collect_bins_from_locale_tree() {
 	treedir=${WORKDIR}/locale-tree
 
 	mkdir -p ${PKGD}${libdir}
-	cp -pPR $treedir/${libdir}/locale ${PKGD}${libdir}
+	tar -cf - -C $treedir/${libdir} -ps locale | tar -xf - -C ${PKGD}${libdir}
 }
 
 inherit qemu
diff --git a/meta/classes/populate_sdk_deb.bbclass b/meta/classes/populate_sdk_deb.bbclass
index d563c28..a5b6384 100644
--- a/meta/classes/populate_sdk_deb.bbclass
+++ b/meta/classes/populate_sdk_deb.bbclass
@@ -6,7 +6,7 @@ populate_sdk_post_deb () {
 
 	local target_rootfs=$1
 
-	cp -r ${STAGING_ETCDIR_NATIVE}/apt ${target_rootfs}/etc
+	tar -cf - -C ${STAGING_ETCDIR_NATIVE} -ps apt | tar -xf - -C ${target_rootfs}/etc
 }
 
 fakeroot populate_sdk_deb () {
diff --git a/meta/classes/sourcepkg.bbclass b/meta/classes/sourcepkg.bbclass
index f738553..f12a195 100644
--- a/meta/classes/sourcepkg.bbclass
+++ b/meta/classes/sourcepkg.bbclass
@@ -41,7 +41,7 @@ sourcepkg_do_create_orig_tgz(){
 	echo $src_tree
 	oenote "Creating .orig.tar.gz in ${DEPLOY_DIR_SRC}/${P}.orig.tar.gz"
 	tar cvzf ${DEPLOY_DIR_SRC}/${P}.orig.tar.gz --exclude-from temp/exclude-from-file $src_tree
-	cp -pPR $src_tree $src_tree.orig
+	tar -cf - -C $src_tree -ps . | tar -xf - -C $src_tree.orig
 }
 
 sourcepkg_do_archive_bb() {
diff --git a/meta/classes/staging.bbclass b/meta/classes/staging.bbclass
index a713734..fef6457 100644
--- a/meta/classes/staging.bbclass
+++ b/meta/classes/staging.bbclass
@@ -17,7 +17,7 @@ sysroot_stage_dir() {
 	# However we always want to stage a $src itself, even if it's empty
 	mkdir -p "$dest"
 	if [ -d "$src" ]; then
-		cp -fpPR "$src"/* "$dest"
+		tar -cf - -C "$src" -ps . | tar -xf - -C "$dest"
 	fi
 }
 
-- 
1.7.3.4
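
The tar pipe used throughout this patch can also be driven from Python without going through a shell. The following is a minimal sketch, assuming a `tar` binary on PATH that reads and writes archives on stdin/stdout. `tar_copy` and `demo` are hypothetical names and do not appear in the patches.

```python
import os
import subprocess
import tempfile


def tar_copy(src, dest):
    """Copy the contents of src into dest through a tar pipe."""
    os.makedirs(dest, exist_ok=True)  # tar -C fails if dest does not exist
    producer = subprocess.Popen(
        ["tar", "-cf", "-", "-C", src, "."],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        ["tar", "-xpf", "-", "-C", dest],
        stdin=producer.stdout,
    )
    # Drop our copy of the pipe so the producer sees a broken pipe if the
    # consumer exits early.
    producer.stdout.close()
    consumer_status = consumer.wait()
    producer_status = producer.wait()
    if producer_status != 0 or consumer_status != 0:
        raise RuntimeError(
            "tar copy failed: %d/%d" % (producer_status, consumer_status))


def demo():
    """Copy a directory holding one hard link and check it survives."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        dest = os.path.join(tmp, "dest")
        os.makedirs(src)
        with open(os.path.join(src, "a"), "w") as handle:
            handle.write("data")
        os.link(os.path.join(src, "a"), os.path.join(src, "b"))
        tar_copy(src, dest)
        first = os.stat(os.path.join(dest, "a"))
        second = os.stat(os.path.join(dest, "b"))
        return first.st_ino == second.st_ino
```

Passing argument lists avoids shell quoting problems with paths that contain spaces. The sketch creates the destination first because the extracting tar cannot change into a missing directory. It leaves out sparse handling: in GNU tar the documented sparse option is `-S`/`--sparse`, while lowercase `-s` selects `--same-order`, so it may be worth confirming which one is intended before relying on sparse preservation.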




* Re: [PATCH 0/2] Preserve hard links
  2011-02-09  5:51 [PATCH 0/2] Preserve hard links Mark Hatle
  2011-02-09  5:51 ` [PATCH 1/2] package.bbclass: Preserve hard links! Mark Hatle
  2011-02-09  5:51 ` [PATCH 2/2] Misc hard link fixes Mark Hatle
@ 2011-02-09 12:20 ` Richard Purdie
  2011-02-09 13:10   ` Richard Purdie
  2 siblings, 1 reply; 5+ messages in thread
From: Richard Purdie @ 2011-02-09 12:20 UTC
  To: Mark Hatle; +Cc: poky

On Tue, 2011-02-08 at 23:51 -0600, Mark Hatle wrote:
> While working on another part of the system, I noticed that Poky didn't appear
> to be preserving any hard links in the system.  This change not only
> preserves hardlinks during packaging, but also shrinks the overall
> disk space required for a build.  In my poky-image-minimal my required
> footprint shrank by 200+ MB.
> 
> The second part of this patch I took a guess as to the right components to
> modify.  It has passed the testing I've performed so far, but I'm not
> 100% sure that it's correct.  I believe a bit of additional review may be
> necessary.
> 
> Pull URL: git://git.pokylinux.org/poky-contrib.git
>   Branch: mhatle/hardlink
>   Browse: http://git.pokylinux.org/cgit.cgi/poky-contrib/log/?h=mhatle/hardlink

This looks good, my main worry is the copytree() function in
meta/lib/oe/path.py as I suspect that will still smash things for
sstate.

I'm seriously considering making that function an os.system call to tar
as I suspect performance would improve substantially. I need some
numbers to back up that idea first though.

I'm probably going to hold off on these patches for 24 hours as there is
risk in them, I'd like to fix the above issue too and at the moment I
want to see us have a "successful" build and stabilise a bit after the
fetcher churn.

Cheers,

Richard





* Re: [PATCH 0/2] Preserve hard links
  2011-02-09 12:20 ` [PATCH 0/2] Preserve hard links Richard Purdie
@ 2011-02-09 13:10   ` Richard Purdie
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Purdie @ 2011-02-09 13:10 UTC
  To: Mark Hatle; +Cc: poky

On Wed, 2011-02-09 at 12:20 +0000, Richard Purdie wrote:
> On Tue, 2011-02-08 at 23:51 -0600, Mark Hatle wrote:
> > While working on another part of the system, I noticed that Poky didn't appear
> > to be preserving any hard links in the system.  This change not only
> > preserves hardlinks during packaging, but also shrinks the overall
> > disk space required for a build.  In my poky-image-minimal my required
> > footprint shrank by 200+ MB.
> > 
> > The second part of this patch I took a guess as to the right components to
> > modify.  It has passed the testing I've performed so far, but I'm not
> > 100% sure that it's correct.  I believe a bit of additional review may be
> > necessary.
> > 
> > Pull URL: git://git.pokylinux.org/poky-contrib.git
> >   Branch: mhatle/hardlink
> >   Browse: http://git.pokylinux.org/cgit.cgi/poky-contrib/log/?h=mhatle/hardlink
> 
> This looks good, my main worry is the copytree() function in
> meta/lib/oe/path.py as I suspect that will still smash things for
> sstate.
> 
> I'm seriously considering making that function an os.system call to tar
> as I suspect performance would improve substantially. I need some
> numbers to back up that idea first though.
> 
> I'm probably going to hold off on these patches for 24 hours as there is
> risk in them, I'd like to fix the above issue too and at the moment I
> want to see us have a "successful" build and stabilise a bit after the
> fetcher churn.

I played with this a little and came up with:

http://git.pokylinux.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/hardlink&id=2131c96ced468f30c3a6f7039617799606faf983

before and after profiles for a copy of the glibc package directory
using oe.path.copytree():

sh-4.1$ head p1 -n 20
Wed Feb  9 12:35:01 2011    profile-testlog.bb-do_crazylog.log

         406162 function calls (397328 primitive calls) in 3.440 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    32674    0.687    0.000    0.687    0.000 {method 'write' of 'file' objects}
    39040    0.320    0.000    0.320    0.000 {method 'read' of 'file' objects}
     6369    0.227    0.000    2.605    0.000 /rphome/poky/bitbake/lib/bb/utils.py:730(copyfile)
        2    0.209    0.104    0.209    0.104 {posix.unlink}
    26452    0.183    0.000    0.183    0.000 {posix.lstat}
    12740    0.179    0.000    0.179    0.000 {open}
    20008    0.150    0.000    0.150    0.000 {posix.chmod}
    21815    0.142    0.000    0.142    0.000 {posix.stat}
     6369    0.137    0.000    1.144    0.000 /usr/lib/python2.6/shutil.py:24(copyfileobj)
    12741    0.127    0.000    0.127    0.000 {method 'close' of 'file' objects}
     6369    0.120    0.000    0.120    0.000 {posix.rename}
    901/1    0.113    0.000    3.181    3.181 /rphome/poky/meta/lib/oe/path.py:55(copytree)
    13639    0.092    0.000    0.092    0.000 {posix.utime}
sh-4.1$ head p2 -n 20
Wed Feb  9 12:37:14 2011    profile-testlog.bb-do_crazylog.log

         15823 function calls (7889 primitive calls) in 1.650 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.577    1.577    1.577    1.577 {posix.system}
10072/2570    0.026    0.000    0.029    0.000 /rphome/poky/bitbake/lib/bb/data_smart.py:388(_keys)
        2    0.021    0.010    0.021    0.010 {open}
     2650    0.004    0.000    0.004    0.000 {method 'add' of 'set' objects}
       21    0.002    0.000    0.002    0.000 {compile}
        1    0.002    0.002    0.016    0.016 /rphome/poky/bitbake/lib/bb/data_smart.py:400(__len__)
        1    0.002    0.002    0.032    0.032 /usr/lib/python2.6/_abcoll.py:346(keys)
   129/42    0.001    0.000    0.011    0.000 /rphome/poky/bitbake/lib/bb/data_smart.py:103(expandWithRefs)
   138/68    0.001    0.000    0.010    0.000 {built-in method sub}
      269    0.001    0.000    0.003    0.000 /rphome/poky/bitbake/lib/bb/data_smart.py:295(getVarFlag)
      294    0.001    0.000    0.001    0.000 /rphome/poky/bitbake/lib/bb/data_smart.py:201(_findVar)
   197/75    0.001    0.000    0.011    0.000 /rphome/poky/bitbake/lib/bb/data_smart.py:256(getVar)
    96/44    0.001    0.000    0.010    0.000 /rphome/poky/bitbake/lib/bb/data_smart.py:55(var_sub)

Of note is the drop from 406,000 function calls to 16,000 and the speed
change from 3.4 seconds to 1.6 seconds. If you didn't wipe out the
destination directory first, it would take up to 15 seconds.

Since this is a core function heavily used by sstate, speeding it up is
a good move.

I'm going to run some further tests on my branch with a view to merging
this and Mark's fixes.

Cheers,

Richard
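
Before/after listings like the ones above can be produced with the standard cProfile and pstats modules. The following is a minimal sketch, not the harness used for the numbers in this thread. `profile_call` and `busy_work` are hypothetical names, and `busy_work` is only a stand-in workload to be replaced by the copy routine under test.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" sorts by time spent inside each function itself, which
    # pstats labels "internal time" in the report header.
    stats.sort_stats("tottime").print_stats(15)
    return result, buffer.getvalue()


def busy_work(count):
    """Stand-in workload; replace with the copy routine being compared."""
    return sum(len(str(i)) for i in range(count))
```

Calling `profile_call` once per implementation gives two reports that can be compared side by side. When the work moves into an external process, as with the single `posix.system` entry in the second listing, the profiler only sees the one call, which probably accounts for much of the drop in function-call count.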





