From: Christoph Hellwig <hch@infradead.org>
To: hendriks@lanl.gov
Cc: torvalds@transmeta.com, linux-kernel@vger.kernel.org
Subject: Re: Syscall numbers for BProc
Date: Thu, 10 Apr 2003 07:29:10 +0100 [thread overview]
Message-ID: <20030410072910.A15440@infradead.org> (raw)
In-Reply-To: <20030405201537.GA18755@lanl.gov>; from hendriks@lanl.gov on Sat, Apr 05, 2003 at 01:15:37PM -0700
On Sat, Apr 05, 2003 at 01:15:37PM -0700, hendriks@lanl.gov wrote:
> The reason it is the way it is because when I'm trying to avoid
> stomping on other syscalls, having a small foot print is a good thing.
Adding more syscalls isn't really a big deal - whether you add one or
a bunch of them in a diff doesn't really matter.
> Breaking out every call into a separate syscall number would also make
> it more difficult to add new features in the future.
Which is a good thing :) Having syscall multiplexers leads to very
messy APIs like the one you proposed :)
> Since our nodes are running *nothing* but the Bproc slave, you can't
> log in some other way to kill the slave and then reboot and you can't
> run shutdown -r or something like that becuase there are no init
> scripts.
We have a reboot notifier call chain in the kernel.
> > Should be read() on a special file.
>
> It started life like that but then I liked the idea of being able to
> do it from any node in the system. (remember no shared fs)
You have this no shared fs argument a few times - why don't you _add_
a shared virtual filesystem for kerne, information? This would clean
up many of the messier APIs.
> > I'm pretty sure this would better be a /proc/<pid>/image file you
> > can read from.
>
> I'm a little fuzzy on what you mean here. If you're suggesting that a
> process read from its own /proc/pid/image, then that's hard because
> the process is changing while you do it. In the 3rd party case (which
> vmadump doesn't support) it gets more tricky because you need to make
> sure the process is stopped and the CPU state stored while you're
> reading this.
Okay, you're right - this should be a syscall.
> VMADump doesn't depend on BProc at all. You will, however, need a
> system call for it the way it's written now :)
Yeah, conviencded. Care to submit a separated out vmadump aptch with
the syscalls for 2.5?
>
> > > 0x1030 - VMAD_LIB_CLEAR - clear the library list
> > > no arguments
> >
> > What library lists are all those calls about? Needs more explanation.
>
> If you look at the virtual memory space of a dynamically linked
> program, the percentage of space used by the program itself (i.e. not
> libraries) is often very small. In an effort to make process
> migration really cheap, we're willing to say that files X, Y and Z are
> available on the machine where we'll be restoring the process image.
> The candidates for remote caching are, obviously, large shared
> libraries.
>
> So, the dumper needs to know what it can expect to find on the remote
> system and what it can't. That's where the library list comes in. It
> probably should just be called the remote file list or something.
> It's a gross hack where we tell the kernel code what it doesn't need
> to dump. Anything that isn't dumped gets stored in the dump file as a
> reference to a file. (e.g. map X bytes of /lib/libc-2.3.2.so @ offset
> Y)
>
> And yeah, this might be cleaner as a writable special file but this
> was easy given the big syscall mux.
I don't think you really want a device for this. It's more an attribute
of the mapping, so a MAP_ALWAYS_LOCAL flag to mmap sounds like the right
thing.
next prev parent reply other threads:[~2003-04-10 6:17 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-04-04 19:32 Syscall numbers for BProc Erik Hendriks
2003-04-04 19:35 ` Christoph Hellwig
2003-04-04 20:43 ` hendriks
2003-04-04 21:54 ` H. Peter Anvin
2003-04-05 0:44 ` hendriks
2003-04-05 5:45 ` Christoph Hellwig
2003-04-05 20:15 ` hendriks
2003-04-08 19:19 ` H. Peter Anvin
2003-04-10 6:29 ` Christoph Hellwig [this message]
2003-04-14 17:18 ` hendriks
2003-04-14 17:32 ` Richard B. Johnson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20030410072910.A15440@infradead.org \
--to=hch@infradead.org \
--cc=hendriks@lanl.gov \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@transmeta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox