* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
@ 2013-12-15 23:05 Yann E. MORIN
2013-12-16 7:32 ` Thomas De Schampheleire
0 siblings, 1 reply; 9+ messages in thread
From: Yann E. MORIN @ 2013-12-15 23:05 UTC (permalink / raw)
To: buildroot
From: "Yann E. MORIN" <yann.morin.1998@free.fr>
As reported by Ryan, it is not well-known that most tools can deal
efficiently with big sparse files.
Add a section in the manual about this, with tar and cp used as
examples, and a hint to the man pages for the others.
Reported-by: Ryan Barnett <rjbarnet@rockwellcollins.com>
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Ryan Barnett <rjbarnet@rockwellcollins.com>
Cc: Peter Korsgaard <jacmet@uclibc.org>
---
changes v1 -> v2:
- remove 'dd' since it can be dangerous (Peter)
- add a sentence that sparse files are to be used only on the build
machine, not while transferring to the target device (Peter)
---
docs/manual/common-usage.txt | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/docs/manual/common-usage.txt b/docs/manual/common-usage.txt
index 1290dfc..736ff57 100644
--- a/docs/manual/common-usage.txt
+++ b/docs/manual/common-usage.txt
@@ -100,3 +100,41 @@ or +g+++ for building helper-binaries on your host, then do
--------------------
$ make HOSTCXX=g++-4.3-HEAD HOSTCC=gcc-4.3-HEAD
--------------------
+
+Dealing efficiently with filesystem images
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Filesystem images can get pretty big, depending on the filesystem you choose,
+the number of packages, whether you provisionned free space... Yet, some
+locations in the filesystems images may just be _empty_ (eg. a long run of
+'zeroes'); such a file is called a _sparse_ file.
+
+Most tools can handle sparse files efficiently, and will only store or write
+those parts of a sparse file that are not empty.
+
+For example:
+
+* +tar+ accepts the +-S+ option to tell it to only store non-zero blocks
+ of sparse files:
+** +tar cf archive.tar -S [files...]+ will efficiently store sparse files
+ in a tarball
+** +tar xf archive.tar -S+ will efficiently store sparse files extracted
+ from a tarball
+
+* +cp+ accepts the +--sparse=WHEN+ option (+WHEN+ is one of +auto+,
+ +never+ or +always+):
+** +cp --sparse=always source.file dest.file+ will make +dest.file+ a
+ sparse file if +source.file+ has long runs of zeroes
+
+Other tools may have similar options. Please consult their own man pages.
+
+You can use sparse files if you need to store the filesystem images (eg.
+to transfer from one machine to another), or if you need to send them (eg.
+to the Q&A team).
+
+Note however that flashing a filesystem image to a device while using the
+sparse mode of +dd+ may result in a broken filesystem (eg. the block bitmap
+of an ext2 filesystem may be corrupted; or, if you have sparse files in
+your filesystem, those parts may not be all-zeroes when read back). You
+should only use sparse files when handling files on the build machine, not
+when transferring them to an actual device that will be used on the target.
--
1.8.1.2
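
To illustrate the tar and cp usage described in the patch above, here is a
minimal sketch run entirely on the build machine. It assumes GNU coreutils
and GNU tar; the file names (image.sparse, image.copy, archive.tar) and the
64 MiB size are made up for the example and are not taken from Buildroot.

```shell
#!/bin/sh
# Sketch only: hypothetical file names, GNU coreutils and GNU tar assumed.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Create a 64 MiB file that is one big hole, then write a few bytes
# at the start without truncating the file.
truncate -s 64M image.sparse
printf 'rootfs data' | dd of=image.sparse conv=notrunc 2>/dev/null

# Apparent size versus space actually allocated on disk (both in KiB).
# How small the second number is depends on the underlying filesystem.
du -k --apparent-size image.sparse
du -k image.sparse

# Copy, re-creating holes wherever the source has long runs of zeroes.
cp --sparse=always image.sparse image.copy

# Archive and extract with sparse handling enabled.
tar cf archive.tar -S image.sparse
mkdir extracted
tar xf archive.tar -S -C extracted

# The apparent sizes should all match; only the on-disk usage differs.
apparent_src=$(stat -c %s image.sparse)
apparent_copy=$(stat -c %s image.copy)
apparent_extracted=$(stat -c %s extracted/image.sparse)
echo "$apparent_src $apparent_copy $apparent_extracted"
```

The copies keep the same content and apparent size as the original; whether
they actually occupy less disk space depends on the filesystem holding them.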
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
2013-12-15 23:05 [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files Yann E. MORIN
@ 2013-12-16 7:32 ` Thomas De Schampheleire
2013-12-16 22:35 ` Yann E. MORIN
0 siblings, 1 reply; 9+ messages in thread
From: Thomas De Schampheleire @ 2013-12-16 7:32 UTC (permalink / raw)
To: buildroot
Hi Yann,
On Mon, Dec 16, 2013 at 12:05 AM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
> From: "Yann E. MORIN" <yann.morin.1998@free.fr>
>
> As reported by Ryan, it is not well-known that most tools can deal
> efficiently with big sparse files.
>
> Add a section in the manual about this, with tar and cp used as
> examples, and a hint to the man pages for the others.
>
> Reported-by: Ryan Barnett <rjbarnet@rockwellcollins.com>
> Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
> Cc: Ryan Barnett <rjbarnet@rockwellcollins.com>
> Cc: Peter Korsgaard <jacmet@uclibc.org>
>
> ---
> changes v1 -> v2:
> - remove 'dd' since it can be dangerous (Peter)
> - add a sentence that sparse files are to be used only on the build
> machine, not while transferring to the target device (Peter)
> ---
> docs/manual/common-usage.txt | 38 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 38 insertions(+)
>
> diff --git a/docs/manual/common-usage.txt b/docs/manual/common-usage.txt
> index 1290dfc..736ff57 100644
> --- a/docs/manual/common-usage.txt
> +++ b/docs/manual/common-usage.txt
> @@ -100,3 +100,41 @@ or +g+++ for building helper-binaries on your host, then do
> --------------------
> $ make HOSTCXX=g++-4.3-HEAD HOSTCC=gcc-4.3-HEAD
> --------------------
> +
> +Dealing efficiently with filesystem images
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Filesystem images can get pretty big, depending on the filesystem you choose,
> +the number of packages, whether you provisionned free space... Yet, some
provisioned
> +locations in the filesystems images may just be _empty_ (eg. a long run of
> +'zeroes'); such a file is called a _sparse_ file.
> +
> +Most tools can handle sparse files efficiently, and will only store or write
> +those parts of a sparse file that are not empty.
> +
> +For example:
> +
> +* +tar+ accepts the +-S+ option to tell it to only store non-zero blocks
> + of sparse files:
> +** +tar cf archive.tar -S [files...]+ will efficiently store sparse files
> + in a tarball
> +** +tar xf archive.tar -S+ will efficiently store sparse files extracted
> + from a tarball
> +
> +* +cp+ accepts the +--sparse=WHEN+ option (+WHEN+ is one of +auto+,
> + +never+ or +always+):
> +** +cp --sparse=always source.file dest.file+ will make +dest.file+ a
> + sparse file if +source.file+ has long runs of zeroes
> +
> +Other tools may have similar options. Please consult their own man pages.
I find the usage of 'own' strange here. Suggestions:
Please consult their man pages.
Please consult their respective man pages.
> +
> +You can use sparse files if you need to store the filesystem images (eg.
> +to transfer from one machine to another), or if you need to send them (eg.
> +to the Q&A team).
> +
> +Note however that flashing a filesystem image to a device while using the
> +sparse mode of +dd+ may result in a broken filesystem (eg. the block bitmap
> +of an ext2 filesystem may be corrupted; or, if you have sparse files in
> +your filesystem, those parts may not be all-zeroes when read back). You
> +should only use sparse files when handling files on the build machine, not
> +when transferring them to an actual device that will be used on the target.
> --
Best regards,
Thomas
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
2013-12-16 7:32 ` Thomas De Schampheleire
@ 2013-12-16 22:35 ` Yann E. MORIN
0 siblings, 0 replies; 9+ messages in thread
From: Yann E. MORIN @ 2013-12-16 22:35 UTC (permalink / raw)
To: buildroot
Thomas, All,
On 2013-12-16 08:32 +0100, Thomas De Schampheleire spake thusly:
> On Mon, Dec 16, 2013 at 12:05 AM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
[--SNIP--]
> > +Filesystem images can get pretty big, depending on the filesystem you choose,
> > +the number of packages, whether you provisionned free space... Yet, some
>
> provisioned
Fixed.
[--SNIP--]
> > +Other tools may have similar options. Please consult their own man pages.
>
> I find the usage of 'own' strange here. Suggestions:
> Please consult their man pages.
> Please consult their respective man pages.
Fixed, I went for the second variant.
I'll resend in a moment. Thank you! :-)
Regards,
Yann E. MORIN.
--
.-----------------.--------------------.------------------.--------------------.
| Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ |
| +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. |
'------------------------------^-------^------------------^--------------------'
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
@ 2013-12-15 18:56 Yann E. MORIN
2013-12-15 19:54 ` Peter Korsgaard
0 siblings, 1 reply; 9+ messages in thread
From: Yann E. MORIN @ 2013-12-15 18:56 UTC (permalink / raw)
To: buildroot
From: "Yann E. MORIN" <yann.morin.1998@free.fr>
As reported by Ryan, it is not well-known that most tools can deal
efficiently with big sparse files.
Add a section in the manual about this, with three tools as examples,
and a hint to the man pages for the others.
Reported-by: Ryan Barnett <rjbarnet@rockwellcollins.com>
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Ryan Barnett <rjbarnet@rockwellcollins.com>
---
docs/manual/common-usage.txt | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/docs/manual/common-usage.txt b/docs/manual/common-usage.txt
index 1290dfc..46183ec 100644
--- a/docs/manual/common-usage.txt
+++ b/docs/manual/common-usage.txt
@@ -100,3 +100,36 @@ or +g+++ for building helper-binaries on your host, then do
--------------------
$ make HOSTCXX=g++-4.3-HEAD HOSTCC=gcc-4.3-HEAD
--------------------
+
+Dealing efficiently with filesystem images
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Filesystem images can get pretty big, depending on the filesystem you choose,
+the number of packages, whether you provisionned free space... Yet, some
+locations in the filesystems images may just be _empty_ (eg. a long run of
+'zeroes'); such a file is called a _sparse_ file.
+
+Most tools can handle sparse files efficiently, and will only store or write
+those parts of a sparse file that are not empty.
+
+For example:
+
+* +tar+ accepts the +-S+ option to tell it to only store non-zero blocks
+ of sparse files:
+** +tar cf archive.tar -S [files...]+ will efficiently store sparse files
+ in a tarball
+** +tar xf archive.tar -S+ will efficiently store sparse files extracted
+ from a tarball
+
+* +cp+ accepts the +--sparse=WHEN+ option (+WHEN+ is one of +auto+,
+ +never+ or +always+):
+** +cp --sparse=always source.file dest.file+ will make +dest.file+ a
+ sparse file if +source.file+ has long runs of zeroes
+
+* +dd+ accepts the +sparse+ value in a +conv+ list (specifying the
+ block-size is recommended):
+** +dd if=image.file of=/dev/device bs=131072 conv=sparse+ will only
+ write to +/dev/device+ the blocks of +image.file+ that are not made
+ only of zeroes
+
+Other tools may have similar options. Please consult their own man pages.
--
1.8.1.2
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
2013-12-15 18:56 Yann E. MORIN
@ 2013-12-15 19:54 ` Peter Korsgaard
2013-12-15 22:25 ` Yann E. MORIN
0 siblings, 1 reply; 9+ messages in thread
From: Peter Korsgaard @ 2013-12-15 19:54 UTC (permalink / raw)
To: buildroot
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:
> From: "Yann E. MORIN" <yann.morin.1998@free.fr>
> As reported by Ryan, it is not well-known that most tools can deal
> efficiently with big sparse files.
> Add a section in the manual about this, with three tools as examples,
> and a hint to the man pages for the others.
...
> +* +tar+ accepts the +-S+ option to tell it to only store non-zero blocks
> + of sparse files:
> +** +tar cf archive.tar -S [files...]+ will efficiently store sparse files
> + in a tarball
> +** +tar xf archive.tar -S+ will efficiently store sparse files extracted
> + from a tarball
> +
> +* +cp+ accepts the +--sparse=WHEN+ option (+WHEN+ is one of +auto+,
> + +never+ or +always+):
> +** +cp --sparse=always source.file dest.file+ will make +dest.file+ a
> + sparse file if +source.file+ has long runs of zeroes
> +
> +* +dd+ accepts the +sparse+ value in a +conv+ list (specifying the
> + block-size is recommended):
We should probably note that these things are only supported by (recent
versions of) the big GNU versions of the tools, not the busybox variants
(which people are likely to be using).
> +** +dd if=image.file of=/dev/device bs=131072 conv=sparse+ will only
> + write to +/dev/device+ the blocks of +image.file+ that are not made
> + only of zeroes
Is that safe advice when writing to devices? If the image file contains
128K of spaces in the used part of the filesystem (E.G. as part of a
file or in the block bitmap or wherever), then that data will not be
written and you will instead read back whatever was at that location in
the device before.
I think that you should be using bigger block sizes, tools that
understand the filesystem layout or resize afterwards (E.G. resize2fs)
instead.
--
Bye, Peter Korsgaard
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
2013-12-15 19:54 ` Peter Korsgaard
@ 2013-12-15 22:25 ` Yann E. MORIN
2013-12-15 22:38 ` Peter Korsgaard
0 siblings, 1 reply; 9+ messages in thread
From: Yann E. MORIN @ 2013-12-15 22:25 UTC (permalink / raw)
To: buildroot
Peter, All,
On 2013-12-15 20:54 +0100, Peter Korsgaard spake thusly:
> >>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:
>
> > From: "Yann E. MORIN" <yann.morin.1998@free.fr>
> > As reported by Ryan, it is not well-known that most tools can deal
> > efficiently with big sparse files.
>
> > Add a section in the manual about this, with three tools as examples,
> > and a hint to the man pages for the others.
>
> ...
>
> > +* +tar+ accepts the +-S+ option to tell it to only store non-zero blocks
> > + of sparse files:
> > +** +tar cf archive.tar -S [files...]+ will efficiently store sparse files
> > + in a tarball
> > +** +tar xf archive.tar -S+ will efficiently store sparse files extracted
> > + from a tarball
> > +
> > +* +cp+ accepts the +--sparse=WHEN+ option (+WHEN+ is one of +auto+,
> > + +never+ or +always+):
> > +** +cp --sparse=always source.file dest.file+ will make +dest.file+ a
> > + sparse file if +source.file+ has long runs of zeroes
> > +
> > +* +dd+ accepts the +sparse+ value in a +conv+ list (specifying the
> > + block-size is recommended):
>
> We should probably note that these things are only supported by (recent
> versions of) the big GNU versions of the tools, not the busybox variants
> (which people are likely to be using).
As said on IRC: this is about handling those files at build time, not
on the target.
> > +** +dd if=image.file of=/dev/device bs=131072 conv=sparse+ will only
> > + write to +/dev/device+ the blocks of +image.file+ that are not made
> > + only of zeroes
>
> Is that safe advice when writing to devices? If the image file contains
> 128K of spaces in the used part of the filesystem (E.G. as part of a
> file or in the block bitmap or wherever), then that data will not be
> written and you will instead read back whatever was at that location in
> the device before.
Yes, that's true. But those are just examples of how to handle big sparse
files, and in no way the recommended way of flashing a device.
I'll remove it before re-submitting. After all, we hint to the man pages
of other tools, so it's thoroughly documented. And if anyone is foolish
enough to try that, then we can say 'we never wrote that!' :-)
> I think that you should be using bigger block sizes, tools that
> understand the filesystem layout or resize afterwards (E.G. resize2fs)
> instead.
Not sure I follow you on that one. What if the user enters a large
number of blocks for his ext2 filesystem? Those will be empty
(zero-filled), but the image file will not be made sparse. So there is
no 'fs resize' or such in the process.
Regards,
Yann E. MORIN.
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
2013-12-15 22:25 ` Yann E. MORIN
@ 2013-12-15 22:38 ` Peter Korsgaard
2013-12-15 22:48 ` Yann E. MORIN
0 siblings, 1 reply; 9+ messages in thread
From: Peter Korsgaard @ 2013-12-15 22:38 UTC (permalink / raw)
To: buildroot
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:
Hi,
> I'll remove it before re-submitting. After all, we hint to the man pages
> of other tools, so it's thoroughly documented. And if anyone is foolish
> enough to try that, then we can say 'we never wrote that!' :-)
Indeed, thanks ;)
>> I think that you should be using bigger block sizes, tools that
>> understand the filesystem layout or resize afterwards (E.G. resize2fs)
>> instead.
> Not sure I follow you on that one. What if the user enters a large
> number of blocks for his ext2 filesystem? Those will be empty
> (zero-filled), but the image file will not be made sparse. So there is
> no 'fs resize' or such in the process.
What I meant was simply that if you want to end up with a filesystem
with lots of free space and don't want to waste time writing zeroes to
the unused areas, it is safer to:
- create the filesystem spanning the entire partition yourself on the
fly (mkfs + tar xf output/images/rootfs.tar)
- or resize fs to the full partition size after writing the image
(dd if=output/images/rootfs.ext2 + resize2fs)
--
Bye, Peter Korsgaard
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
2013-12-15 22:38 ` Peter Korsgaard
@ 2013-12-15 22:48 ` Yann E. MORIN
2013-12-15 23:28 ` Peter Korsgaard
0 siblings, 1 reply; 9+ messages in thread
From: Yann E. MORIN @ 2013-12-15 22:48 UTC (permalink / raw)
To: buildroot
Peter, All,
On 2013-12-15 23:38 +0100, Peter Korsgaard spake thusly:
> >>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:
> >> I think that you should be using bigger block sizes, tools that
> >> understand the filesystem layout or resize afterwards (E.G. resize2fs)
> >> instead.
>
> > Not sure I follow you on that one. What if the user enters a large
> > number of blocks for his ext2 filesystem? Those will be empty
> > (zero-filled), but the image file will not be made sparse. So there is
> > no 'fs resize' or such in the process.
>
> What I meant was simply that if you want to end up with a filesystem
> with lots of free space and don't want to waste time writing zeroes to
> the unused areas, it is safer to:
>
> - create the filesystem spanning the entire partition yourself on the
> fly (mkfs + tar xf output/images/rootfs.tar)
>
> - or resize fs to the full partition size after writing the image
> (dd if=output/images/rootfs.ext2 + resize2fs)
I see, but in that case, we should not offer the user the ability to
specify the number of blocks to use for the ext2 filesystem, with
BR2_TARGET_ROOTFS_EXT2_BLOCKS, for example (and ext2 is the only
writable filesystem for which we do it).
Should I cook up a patch to remove BR2_TARGET_ROOTFS_EXT2_BLOCKS?
Regards,
Yann E. MORIN.
* [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files
2013-12-15 22:48 ` Yann E. MORIN
@ 2013-12-15 23:28 ` Peter Korsgaard
0 siblings, 0 replies; 9+ messages in thread
From: Peter Korsgaard @ 2013-12-15 23:28 UTC (permalink / raw)
To: buildroot
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:
>> What I meant was simply that if you want to end up with a filesystem
>> with lots of free space and don't want to waste time writing zeroes to
>> the unused areas, it is safer to:
>>
>> - create the filesystem spanning the entire partition yourself on the
>> fly (mkfs + tar xf output/images/rootfs.tar)
>>
>> - or resize fs to the full partition size after writing the image
>> (dd if=output/images/rootfs.ext2 + resize2fs)
> I see, but in that case, we should not offer the user the ability to
> specify the number of blocks to use for the ext2 filesystem, with
> BR2_TARGET_ROOTFS_EXT2_BLOCKS, for example (and ext2 is the only
> writable filesystem for which we do it).
Historically, part of the reason why we had that option was that our
size estimate wasn't perfect, so we sometimes didn't foresee enough
space. Next to that, some people might be using it already for
provide-me-a-bit-extra-space (I know I have used it in the past). It's
not that the option is really bad, just that it is more efficient to not
transfer / write all those zeroes for the unallocated space if you can.
> Should I cook up a patch to remove BR2_TARGET_ROOTFS_EXT2_BLOCKS?
I would prefer to keep it. It doesn't add much complexity in Buildroot.
--
Bye, Peter Korsgaard
Thread overview: 9+ messages
2013-12-15 23:05 [Buildroot] [PATCH] manual: add section about dealing efficiently with big image files Yann E. MORIN
2013-12-16 7:32 ` Thomas De Schampheleire
2013-12-16 22:35 ` Yann E. MORIN
-- strict thread matches above, loose matches on Subject: below --
2013-12-15 18:56 Yann E. MORIN
2013-12-15 19:54 ` Peter Korsgaard
2013-12-15 22:25 ` Yann E. MORIN
2013-12-15 22:38 ` Peter Korsgaard
2013-12-15 22:48 ` Yann E. MORIN
2013-12-15 23:28 ` Peter Korsgaard