From: John Reiser <jreiser-Po6cBsTGB2ZWk0Htik3J/w@public.gmane.org>
To: initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: building initramfs is slow
Date: Thu, 18 Aug 2011 11:18:33 -0700
Message-ID: <4E4D5779.6090209@bitwagon.com>
Building an initramfs is unreasonably slow. On Fedora 16
dracut-011 takes almost a minute when installing a new kernel:
    real  56s
    user  20s  \__ dracut is CPU bound, not I/O bound.
    sys   31s  /
The final "gzip -9" takes 12 seconds, and the cpio 1 second,
which leaves 43 seconds for the rest of dracut. That's a
factor of 3 or 4 too long. The output initramfs is 14.9MB
(41MB unzipped) and contains 1619 files, including 367 .ko
kernel modules.
Running
    strace -o strace.out -f -e trace=execve dracut test.img
and applying some text processing to strace.out shows
    12518 SIGCHLD (processes terminated)
So dracut fondles each file with an average of (12518 / 1619)
= 7.7 processes. No wonder building an initramfs is slow!
Again in strace.out:
     8917 execve (address-space images instantiated)
and taking (#SIGCHLD - #execve) gives:
     3601 fork-and-no-exec (shell builtins that need a process)
because there is almost no chaining of execve without a fork.
The sorted histogram of execve begins:
    3803 execve("/bin/egrep"
    1343 execve("/bin/cp"
     858 execve("/lib64/ld-linux-x86-64.so.2"
     760 execve("/usr/bin/ldd"
     375 execve("/sbin/modinfo"
     359 execve("/bin/chmod"
     344 execve("/bin/rm"
     341 execve("/sbin/modprobe"
     256 execve("/bin/mkdir"
     222 execve("/bin/readlink"
     100 execve("/bin/cat"
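
The text processing itself is not shown above. One way to get comparable
counts is sketched here. It assumes the usual "strace -f -o" line layout (a
PID, then either the syscall or a "--- SIGCHLD" signal line). The function
names are invented for illustration, and failed execve attempts are counted
along with successful ones.

```shell
#!/bin/bash
# Hypothetical helpers. Assumed input lines look like:
#   1234 execve("/bin/cp", ["cp", "a", "b"], [/* 20 vars */]) = 0
#   1234 --- SIGCHLD (Child exited) @ 0 (0) ---

count_sigchld() {      # number of SIGCHLD lines (terminated children)
    grep -c -- '--- SIGCHLD' "$1"
}

count_execve() {       # number of execve calls, successful or not
    grep -c ' execve("' "$1"
}

execve_histogram() {   # calls per program, most frequent first
    grep -o 'execve("[^"]*"' "$1" | sort | uniq -c | sort -rn
}
```

Usage would be along the lines of "execve_histogram strace.out | head".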
This data, together with a glance at the source of dracut, suggests
replacing most instances of egrep with the bash shell regexp operator
"[[ string =~ pattern ]]" and the expansion substitution operator
"${parameter/pattern/string}".
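
A minimal sketch of the two builtins standing in for an external filter.
The function names and the patterns are illustrative only and are not
taken from the dracut source.

```shell
#!/bin/bash
# Hypothetical examples; patterns are illustrative, not from dracut.

# Instead of:  echo "$f" | egrep -q '\.ko(\.gz)?$'
is_kmod() {
    local re='\.ko(\.gz)?$'    # keep the ERE in a variable so quoting stays simple
    [[ $1 =~ $re ]]            # builtin match: no egrep is executed
}

# Instead of:  echo "$f" | sed 's/-/_/'
dash_to_underscore() {
    # ${parameter/pattern/string}: the pattern is a glob, not a regexp,
    # and a single "/" replaces only the first match ("//" replaces all).
    printf '%s\n' "${1/-/_}"
}
```

For example, "is_kmod ext4.ko && echo yes" prints "yes", and
"dash_to_underscore snd-hda-intel" prints "snd_hda-intel".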
The uses of cp, ldd, chmod, and modinfo should be investigated for
the possibility of batching more than one file at a time. Operating
inside one directory at a time can effectively remove the threat of
exceeding the 32KB limit on the arglist to execve.
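
As a rough sketch of batching within one directory: the helper below
(a made-up name, not part of dracut) copies any number of files from a
single source directory with one cp, passing short relative names.

```shell
#!/bin/bash
# Hypothetical helper: one cp per source directory instead of one per file.
#   usage: copy_batch SRCDIR DESTDIR NAME...
copy_batch() {
    local src=$1 dest=$2
    shift 2
    mkdir -p -- "$dest" || return
    # Make DESTDIR absolute before changing directory.
    dest=$(CDPATH= cd -- "$dest" && pwd) || return
    # Subshell so the caller's working directory is untouched; inside
    # SRCDIR the arguments are short relative names.
    ( CDPATH= cd -- "$src" && cp -p -- "$@" "$dest" )
}
```

With this, "copy_batch /lib/modules/$kver/kernel/fs/ext4 "$initdir/mods"
ext4.ko" would be one execve no matter how many names follow; the paths
here are placeholders.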
Using pipelines (possibly including bash's "while read fname ; do")
to filter streamed lists of filenames can reduce overhead significantly
in contrast to "for fname in ...; do <<execve>>". A pipeline may also
introduce effective parallelism.
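
The streaming style might look like the sketch below. The function name
and the "*.ko" filter are assumptions for illustration; "sort -u" is the
portable spelling of "sort --uniq".

```shell
#!/bin/bash
# Hypothetical filter: read file names on stdin, drop duplicates, and keep
# only kernel modules. One sort process serves the whole list, and the
# loop body uses only builtins, so there is no execve per name.
unique_kmods() {
    sort -u | while IFS= read -r fname; do
        if [[ $fname == *.ko ]]; then
            printf '%s\n' "$fname"
        fi
    done
}
```

For example, "printf '%s\n' b.ko a.ko x.txt a.ko | unique_kmods" prints
a.ko and then b.ko.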
"sort --uniq" handily removes duplicates.
In most cases "cat filename |" should be replaced with ordinary
redirection "< filename", and similarly "$(cat filename)" should
be "$(< filename)". If SELinux denies access by dracut (etc.)
but allows /bin/cat, then keep the cat, and a comment saying so is REQUIRED.
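
Both replacements, sketched with made-up helper names:

```shell
#!/bin/bash
# Hypothetical helpers showing the two cat-free forms.

# Instead of:  contents=$(cat "$file")
slurp() {
    printf '%s\n' "$(< "$1")"    # bash reads the file itself; no cat is executed
}

# Instead of:  cat "$file" | while read line; do ...
count_entries() {                # counts lines that are neither blank nor comments
    local n=0 line
    while IFS= read -r line; do
        [[ $line == \#* || -z $line ]] || n=$((n + 1))
    done < "$1"
    printf '%s\n' "$n"
}
```

Because the loop in count_entries is fed by redirection and not by a
pipe, it runs in the current shell and the counter survives the loop.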
Yes, I'm going to work on it.
--