From: Peter Seebach <peter.seebach@windriver.com>
To: "openembedded-core@lists.openembedded.org"
<openembedded-core@lists.openembedded.org>
Subject: [PATCH 0/3] Pseudo performance changes...
Date: Sat, 16 Feb 2013 20:23:51 -0600
Message-ID: <1361067834-23267-1-git-send-email-peter.seebach@windriver.com>
Unlike most of my submissions, these aren't patches against oe-core; rather,
they're patches against pseudo. If I can get some confirmation that they do
what I think they do, and some review, I'm planning to make this into
pseudo 1.5, and send a patch "soonish" to merge that into oe-core.
What this does: Fix a number of build performance issues. By far the
largest change is actually not so much a problem with pseudo as a problem
that pseudo can solve by brute force. Packaging systems (at least RPM and
SMART) do a lot of fsync() and fdatasync() calls. That usually implies
flushing EVERYTHING that's been written, not just one specific file. And
that, in turn, results in a severe performance hit.
So, for instance, on one of my test workstations, this cuts a do_rootfs
with about 1200 RPMs from about 22 minutes to about 4.5 minutes. Yeah.
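
As a rough illustration of the brute-force idea, here is a minimal sketch in C, assuming the wrapper simply reports success instead of syncing when a switch is set. The names sketch_fsync and force_async are invented for this example and are not taken from the patches; the other sync-family calls would follow the same shape.

```c
#include <unistd.h>

/* Hypothetical switch for this sketch: nonzero means "skip real syncs". */
static int force_async = 1;

/* Hypothetical wrapper name. With force_async set, report success without
 * asking the kernel to flush anything; otherwise call the real fsync(). */
int sketch_fsync(int fd) {
    if (force_async) {
        return 0;
    }
    return fsync(fd);
}
```

With the switch set, callers such as a package manager see every sync "succeed" immediately. The trade-off is that data is only as durable as the kernel's normal writeback makes it, which is usually acceptable for a build tree that can be regenerated.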
The other changes aren't as dramatic for that case, but have very significant
performance impact for at least some workloads. The first is switching to
using an in-memory database for the files database, dumping it to disk only
when the pseudo daemon is idle or shutting down. This doesn't produce huge
benefits in all cases, but for workloads with a lot of parallelism, it can
produce a very noticeable reduction in how much pseudo slows things down.
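
The "write to memory, dump when idle" pattern can be sketched as below. This is a toy model in plain C, not pseudo's database code: the struct layout, the function names, and the text dump format are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FILES 64

/* Hypothetical record; the real schema lives in pseudo_db.c. */
struct file_entry {
    char path[128];
    unsigned int mode;
};

struct mem_db {
    struct file_entry rows[MAX_FILES];
    int count;
    int dirty;                 /* set by every write, cleared by a flush */
    const char *backing_path;  /* where the dump goes */
};

/* Writes touch memory only. Returns 0 on success, -1 if the table is full. */
int mem_db_add(struct mem_db *db, const char *path, unsigned int mode) {
    if (db->count >= MAX_FILES) {
        return -1;
    }
    struct file_entry *e = &db->rows[db->count++];
    strncpy(e->path, path, sizeof e->path - 1);
    e->path[sizeof e->path - 1] = '\0';
    e->mode = mode;
    db->dirty = 1;
    return 0;
}

/* Meant to be called when the daemon is idle or shutting down.
 * Returns the number of rows written, 0 if nothing changed, -1 on I/O error. */
int mem_db_flush(struct mem_db *db) {
    if (!db->dirty) {
        return 0;
    }
    FILE *f = fopen(db->backing_path, "w");
    if (!f) {
        return -1;
    }
    for (int i = 0; i < db->count; i++) {
        fprintf(f, "%o %s\n", db->rows[i].mode, db->rows[i].path);
    }
    if (fclose(f) != 0) {
        return -1;
    }
    db->dirty = 0;
    return db->count;
}
```

The point of the dirty flag is that a burst of writes costs no disk I/O at all, and an idle flush with nothing changed is free.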
The second is a fairly major protocol change. In short, with this patch,
pseudo clients only wait for a server response when they need information
from the server in order to continue. That's OP_FSTAT, OP_STAT,
OP_MAY_UNLINK, and OP_MKNOD. Everything else just silently assumes that
it probably succeeded.
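
The client-side decision described above can be sketched as a single predicate. The four operations that wait are the ones named in this message; the enum itself is an illustrative subset with made-up ordering (the real list is generated from enums/op.in), and op_needs_reply is a hypothetical name.

```c
#include <stdbool.h>

/* Illustrative subset only; values and ordering are assumptions. */
enum sketch_op {
    OP_CHMOD,
    OP_CHOWN,
    OP_FSTAT,
    OP_MAY_UNLINK,
    OP_MKNOD,
    OP_RENAME,
    OP_STAT,
    OP_UNLINK
};

/* True when the client needs information back from the server before it
 * can continue; everything else is sent without waiting for a response. */
bool op_needs_reply(enum sketch_op op) {
    switch (op) {
    case OP_FSTAT:
    case OP_STAT:
    case OP_MAY_UNLINK:
    case OP_MKNOD:
        return true;
    default:
        return false;
    }
}
```

A client would send the message either way and only block on a read when op_needs_reply() returns true, which is where the savings come from for write-heavy workloads.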
How much does this matter? Between the protocol change and the memory
DB, a trivial unpack of a tarball (lots of writes to the database, very
few reads) can be about 4x faster. Removing files gains much less, though
it may still be slightly faster.
This is most noticeable, by far, when running more than one build, or
when running builds while doing other things. It has a much smaller effect
on builds with no shared state (compile time still dominates that), but
even there I'm seeing decreases from ~83 minutes to ~64 from just the
fsync and memory changes. I'm still waiting for my real test case (multiple
simultaneous builds which need compiles) to complete.
Peter Seebach (3):
Use in-memory database for files
allow pseudo to force asynchronous behavior
If you don't want the answer, don't ask the question.
ChangeLog.txt | 10 ++++
Makefile.in | 6 +-
configure | 51 ++++++++++++++++
enums/msg_type.in | 1 +
enums/op.in | 46 +++++++-------
enums/query_type.in | 18 +++---
guts/COPYRIGHT | 2 +-
maketables | 41 +++++++------
makewrappers | 13 +++-
ports/darwin/guts/open.c | 5 +-
ports/linux/guts/openat.c | 21 ++++++-
ports/unix/guts/fchmod.c | 16 ++---
ports/unix/guts/fchmodat.c | 31 ++++++----
ports/unix/guts/fchown.c | 16 ++---
ports/unix/guts/fchownat.c | 16 ++---
ports/unix/guts/fdatasync.c | 16 +++++
ports/unix/guts/fsync.c | 16 +++++
ports/unix/guts/mknodat.c | 4 +-
ports/unix/guts/msync.c | 16 +++++
ports/unix/guts/sync.c | 16 +++++
ports/unix/guts/sync_file_range.c | 13 ++++
ports/unix/guts/syncfs.c | 13 ++++
ports/unix/wrapfuncs.in | 9 +++
pseudo.c | 3 +-
pseudo_client.c | 22 ++++---
pseudo_db.c | 119 ++++++++++++++++++++++++++++++++++++-
pseudo_db.h | 1 +
pseudo_server.c | 32 ++++++----
templates/wrapfuncs.c | 2 +
29 files changed, 453 insertions(+), 122 deletions(-)
create mode 100644 ports/unix/guts/fdatasync.c
create mode 100644 ports/unix/guts/fsync.c
create mode 100644 ports/unix/guts/msync.c
create mode 100644 ports/unix/guts/sync.c
create mode 100644 ports/unix/guts/sync_file_range.c
create mode 100644 ports/unix/guts/syncfs.c
--
1.7.9.5