From: John Reiser <jreiser-Po6cBsTGB2ZWk0Htik3J/w@public.gmane.org>
To: initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: building initramfs is slow
Date: Thu, 18 Aug 2011 11:18:33 -0700
Message-ID: <4E4D5779.6090209@bitwagon.com>
Building an initramfs is unreasonably slow. On Fedora 16
dracut-011 takes almost a minute when installing a new kernel:
    real  56s
    user  20s  \__ dracut is CPU bound, not I/O bound.
    sys   31s  /
The final "gzip -9" takes 12 seconds, and the cpio 1 second,
which leaves 43 seconds for the rest of dracut. That's a
factor of 3 or 4 too long. The output initramfs is 14.9MB
(41MB unzipped) and contains 1619 files, including 367 .ko
kernel modules.
Running
    strace -o strace.out -f -e trace=execve dracut test.img
and applying some text processing to strace.out shows
    12518 SIGCHLD (processes terminated)
So dracut fondles each file with an average of (12518 / 1619)
= 7.7 processes. No wonder building an initramfs is slow!
Again in strace.out:
     8917 execve (address-space images instantiated)
and taking (#SIGCHLD - #execve) gives:
     3601 fork-and-no-exec (shell builtins that need a process)
because there is almost no chaining of execve without a fork.
The sorted histogram of execve begins:
    3803 execve("/bin/egrep"
    1343 execve("/bin/cp"
     858 execve("/lib64/ld-linux-x86-64.so.2"
     760 execve("/usr/bin/ldd"
     375 execve("/sbin/modinfo"
     359 execve("/bin/chmod"
     344 execve("/bin/rm"
     341 execve("/sbin/modprobe"
     256 execve("/bin/mkdir"
     222 execve("/bin/readlink"
     100 execve("/bin/cat"
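
The text processing behind these counts can be sketched roughly as
follows. This is a reconstruction, not the exact commands used for the
numbers above. The function names are made up for illustration, and the
sketch assumes that strace logs each SIGCHLD delivery on a line
containing "--- SIGCHLD" and starts each execve attempt with
execve("path". Failed execve attempts (for example during a PATH
search) would be counted too.

```bash
# Sketch: summarize a log written by
#   strace -o strace.out -f -e trace=execve ...
# All function names here are hypothetical.

# Number of SIGCHLD deliveries, i.e. child processes that terminated.
count_sigchld() {
    grep -c -e '--- SIGCHLD' "$1"
}

# Number of execve attempts (lines that start an execve call).
count_execve() {
    grep -c -e 'execve("' "$1"
}

# Histogram of executed paths, most frequent first.
execve_histogram() {
    sed -n 's/.*\(execve("[^"]*"\).*/\1/p' "$1" | sort | uniq -c | sort -rn
}
```

For example, "execve_histogram strace.out | head -n 11" would print the
top of the histogram.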
This data, and a glance at the source of dracut, suggests
considering the bash shell regexp operator "[[ string =~ pattern ]]"
and the expansion substitution operator "${parameter/pattern/string}"
to replace most instances of egrep.
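
As a minimal sketch of what such a replacement might look like (the
helper names and patterns below are illustrative assumptions, not code
taken from dracut):

```bash
# Sketch only: helper names and patterns are hypothetical.

# Before (forks plus an execve of egrep):
#   if echo "$path" | egrep -q '\.ko(\.gz)?$'; then ...
# After (no new process at all):
is_kernel_module() {
    local re='\.ko(\.gz)?$'
    [[ $1 =~ $re ]]
}

# Derive a module name from a path using only parameter expansion:
# strip the directory and the suffixes, then turn every '-' into '_'.
module_name() {
    local base=${1##*/}
    base=${base%.gz}
    base=${base%.ko}
    printf '%s\n' "${base//-/_}"
}
```

Keeping the regexp in a variable avoids quoting surprises on the right
side of "=~". Note that the pattern in "${parameter/pattern/string}" is
a shell glob, not a regexp, so not every egrep call translates directly.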
The uses of cp, ldd, chmod, and modinfo should be investigated for
the possibility of batching more than one file at a time. Operating
inside one directory at a time can effectively remove the threat of
exceeding the size limit on the arglist to execve (historically
32 pages, or 128KB with 4KB pages; since Linux 2.6.23, one quarter
of the stack size limit).
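
One possible shape for per-directory batching is sketched below. The
function name is hypothetical, and the sketch assumes GNU cp (for the
"-t" target-directory option) and bash arrays. It is not how dracut
currently installs files.

```bash
# Sketch: copy every regular file in one source directory into a
# destination directory with a single cp invocation, instead of one
# cp per file. copy_dir_files is a made-up name.
copy_dir_files() {
    local src=$1 dst=$2 f
    local -a files=()
    for f in "$src"/*; do
        if [[ -f $f ]]; then
            files+=("$f")
        fi
    done
    # Nothing to copy: succeed without running cp at all.
    (( ${#files[@]} )) || return 0
    mkdir -p -- "$dst" && cp -p -t "$dst" -- "${files[@]}"
}
```

The same array could be handed to a single chmod. Where the list might
still grow too large, xargs splits it into several invocations on its
own.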
Using pipelines (possibly including bash's "while read fname ; do")
to filter streamed lists of filenames can reduce overhead significantly
in contrast to "for fname in ...; do <<execve>>". A pipeline may also
introduce effective parallelism.
"sort --unique" (or "sort -u") handily removes duplicates.
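
A small sketch of the streamed style, under the assumption that a list
of file names arrives on stdin, one per line (the function name is
invented for this example):

```bash
# Sketch: read file names on stdin and print each distinct parent
# directory once. Only shell builtins run inside the loop; the single
# external process is the sort at the end of the pipeline.
unique_parent_dirs() {
    local fname dir
    while IFS= read -r fname; do
        [[ $fname == */* ]] || continue   # skip names without a directory
        dir=${fname%/*}
        printf '%s\n' "${dir:-/}"         # "/vmlinuz" has parent "/"
    done | sort -u
}
```

The output could then feed one "xargs mkdir -p" rather than one mkdir
per file.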
In most cases "cat filename |" should be replaced with ordinary
redirection "< filename", and similarly "$(cat filename)" should
be "$(< filename)". If SELinux denies access by dracut (etc.)
but allows /bin/cat, so that cat really is needed, then a comment
saying so is REQUIRED.
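
Both substitutions in a minimal bash sketch (the function names are
placeholders, not existing dracut helpers):

```bash
# Sketch: the two cat-free idioms, wrapped in hypothetical helpers.

# Instead of:  contents=$(cat "$file")
file_contents() {
    local c
    c=$(< "$1") || return
    printf '%s\n' "$c"
}

# Instead of:  cat "$file" | while read line; do ...; done
count_nonblank() {
    local line n=0
    while IFS= read -r line; do
        if [[ -n $line ]]; then
            n=$((n + 1))
        fi
    done < "$1"
    printf '%s\n' "$n"
}
```

A side benefit of the redirected loop is that it runs in the current
shell, so variables set inside it (n above) are still visible after
"done", which is not the case when the loop sits at the end of a pipe.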
Yes, I'm going to work on it.