From: Theodore Tso <tytso@mit.edu>
To: Bryan Henderson <hbryan@us.ibm.com>
Cc: Bodo Eggert <7eggert@gmx.de>, Andreas Dilger <adilger@sun.com>,
Andreas Dilger <adilger@clusterfs.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
Adrian Bunk <bunk@kernel.org>, David Chinner <dgc@sgi.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, Ric Wheeler <ric@emc.com>,
Valerie Henson <val@vahconsulting.com>,
Valdis.Kletnieks@vt.edu
Subject: Re: [RFC] Parallelize IO for e2fsck
Date: Sat, 26 Jan 2008 08:21:24 -0500
Message-ID: <20080126132124.GA8348@mit.edu>
In-Reply-To: <OF9B297CB7.4B613CFE-ON882573DC.00092EE4-882573DC.000A9B77@us.ibm.com>

On Fri, Jan 25, 2008 at 05:55:51PM -0800, Bryan Henderson wrote:
> I was surprised to see AIX do late allocation by default, because IBM's
> traditional style is bulletproof systems. A system where a process can be
> killed at unpredictable times because of resource demands of unrelated
> processes doesn't really fit that style.
>
> It's really a fairly unusual application that benefits from late
> allocation: one that creates a lot more virtual memory than it ever
> touches. For example, a sparse array. Or am I missing something?

I guess it depends on how far you try to do "bulletproof". OSF/1 used
to use "bulletproof" as its default --- and I had to turn it off on
tsx-11.mit.edu (the first North American ftp server for Linux :-),
because the difference was something like 50 ftp daemons versus over
500 on the same server. It reserved VM space for the text segment of
every single process, since at least in theory, it's possible for
every single text page to get modified using ptrace if (for example) a
debugger were to set a breakpoint on every single page of every
single text segment of every single ftp daemon.
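
To make the arithmetic concrete, here is a toy model of the two accounting policies. It is only a sketch: the function name and every size in it are made up for illustration and are not the real tsx-11 or OSF/1 figures. The point is just that charging each process for a private copy of its text shrinks the process count by the factor (text + private) / private.

```python
def max_processes(commit_limit_kb, text_kb, private_kb, reserve_text):
    """Count identical processes that fit under a VM commit limit (toy model).

    reserve_text=True  -> strict policy: each process is charged for its whole
                          text segment, since ptrace could dirty every page.
    reserve_text=False -> text is assumed to stay clean and file-backed, so
                          only private data is charged.
    """
    per_process_kb = private_kb + (text_kb if reserve_text else 0)
    return commit_limit_kb // per_process_kb


# Hypothetical numbers, chosen only for illustration.
LIMIT_KB = 64 * 1024
print("strict:", max_processes(LIMIT_KB, text_kb=900, private_kb=100, reserve_text=True))
print("late:  ", max_processes(LIMIT_KB, text_kb=900, private_kb=100, reserve_text=False))
```

With a text segment much larger than the per-process private data, the strict count comes out roughly an order of magnitude lower, which is the shape of the ftp daemon problem described above.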

You can also see potential problems for Java programs. Suppose you
had some gigantic Java application (say, Lotus Notes, or WebSphere
Application Server) which is taking up many, many, MANY gigabytes of
VM space. Now suppose the Java application needs to fork and exec
some trivial helper program. For that tiny instant, between the fork
and exec, the VM requirements in "bulletproof" mode would double,
since while 99.9999% of the time programs will immediately discard the
VM upon the exec, there is always the possibility that the child
process will touch every single data page, forcing a copy on write,
and never do the exec.
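
The fork/exec window can be sketched the same way. Again this is a toy model under stated assumptions, not any real kernel's accounting: `fork_commit_kb` is a hypothetical helper, and it only counts the parent's private (copy-on-write) pages.

```python
def fork_commit_kb(parent_private_kb, child_touched_kb, strict):
    """Return (reserved, actually_used) in KB for parent + child after fork().

    strict=True reserves a full second copy of the parent's private pages
    up front; strict=False (late allocation) reserves only what is in use.
    """
    # Copy-on-write: the child only consumes new memory for pages it writes.
    used_kb = parent_private_kb + min(child_touched_kb, parent_private_kb)
    reserved_kb = 2 * parent_private_kb if strict else used_kb
    return reserved_kb, used_kb


# Hypothetical case: an 8 GB parent whose child dirties 64 KB before exec().
parent_kb = 8 * 1024 * 1024
print(fork_commit_kb(parent_kb, 64, strict=True))
print(fork_commit_kb(parent_kb, 64, strict=False))
```

Under the strict policy the reservation doubles even though the memory actually used barely moves, so the fork can fail on a machine that has plenty of room for what the child really does.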

There are of course different levels of "bulletproof" between the
extremes of "totally bulletproof" and "late binding" from an
algorithmic standpoint. For example, you could ignore the needed
pages caused by ptrace(); more challenging would be how to handle
the fork/exec semantics, although there could be kludges such as
strongly encouraging applications to use an old-fashioned BSD-style
vfork() to guarantee that the child couldn't double VM requirements
between the vfork() and exec(). I certainly can't say for sure what
the AIX designers had in mind, and why they didn't choose one of the
more intermediate design choices.
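
For what the "spawn a helper without a fork-style copy" idea looks like from an application, here is a minimal sketch. It swaps in `os.posix_spawn()` (Python 3.8+, Unix only) as a stand-in for vfork() followed by exec(), since Python has no direct vfork() binding; `spawn_helper` is a hypothetical wrapper name. Whether the underlying libc really avoids a copy-on-write duplicate of the caller is an implementation detail, so treat this as an illustration of the calling pattern and not as a guarantee about VM accounting.

```python
import os
import sys


def spawn_helper(argv):
    """Run a helper via posix_spawn() and return its exit status.

    The intent is that the OS can start the helper without first creating a
    copy-on-write duplicate of the caller; whether it does is up to the libc.
    """
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Helper was killed by a signal: report it as a negative number.
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    # Trivial helper: a second interpreter that exits immediately.
    print(spawn_helper([sys.executable, "-c", "raise SystemExit(3)"]))
```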

However, it is fair to say that "100% bulletproof" can require
reserving far more VM resources than you might first expect. Even a
company which is highly incented to sell large amounts of hardware,
such as Digital, might not have wanted their OS to be only able to
support an embarrassingly small number of simultaneous ftpd
connections. I know this for sure because the OSF/1 documentation,
when discussing their VM tuning knobs, specifically talked about the
scenario that I ran into with tsx-11.mit.edu.

Regards,
- Ted