public inbox for linux-kernel@vger.kernel.org
* [BK PATCHES] initramfs merge, part 1 of N
@ 2002-11-02  8:13 Jeff Garzik
  2002-11-02  8:18 ` Jeff Garzik
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Jeff Garzik @ 2002-11-02  8:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LKML, hpa, viro

[-- Attachment #1: Type: text/plain, Size: 6229 bytes --]

Linus,

Attached below is the first of several changes for initramfs / early
userspace.

This change is intentionally very simple, not really proving its worth 
until next week when patches 2 and 3 in this series arrive in your 
inbox.  A description of "the future" follows the description of this
specific cset.

1) Introduce init/initramfs.c itself, which is a module that 
uncompresses a .cpio.gz archive, and uses it to populate rootfs with 
files very early in the bootup process (between signals_init and
proc_root_init in init/main.c).  People will see a small listing in 
dmesg of unpacked files.  We need to keep this for now (and for now it's 
small), but we may want to remove this output or turn the knob down to 
KERN_DEBUG before 2.6.x release:
    -> file1
    -> file2
    -> etc...

(architecture maintainers note!)
2) Introduce ARCHBLOBLFLAGS in arch/$arch/Makefile, for turning an 
arbitrary binary object into a .o file using objcopy.

3) Link the initramfs cpio archive into the vmlinux image via
arch/$arch/vmlinux.lds.S, in the init section.

4) Introduce the new linux/usr directory.  Currently it is not very 
interesting, only containing a small host-built proggie that generates 
the initial cpio archive, gen_init_cpio.  This program will go away when 
early userspace is further along.  It currently exists to show initramfs 
is working, by allowing us to remove three simple lines from 
init/do_mounts.c.
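
To make the archive format concrete: each entry that gen_init_cpio emits
starts with a 110-byte ASCII header (the 6-byte magic "070701" followed by
thirteen 8-digit hex fields), and parse_header() in init/initramfs.c decodes
the first twelve of those fields.  The following is a standalone userspace
sketch of that layout, not the kernel code.  struct newc_hdr, newc_parse()
and newc_format() are names invented for this illustration, and the sketch
assumes every field value fits in 32 bits.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NEWC_HDR_LEN 110	/* 6-byte magic + 13 fields of 8 hex digits */

/* Hypothetical container for this sketch; the patch uses file-scope variables. */
struct newc_hdr {
	unsigned long ino, mode, uid, gid, nlink, mtime, filesize;
	unsigned long major, minor, rmajor, rminor, namesize, chksum;
};

/* Decode one 110-byte header.  Returns 0 on success, -1 on a bad magic. */
static int newc_parse(const char *s, struct newc_hdr *h)
{
	unsigned long f[13];
	char buf[9];
	int i;

	if (strncmp(s, "070701", 6) != 0)
		return -1;
	buf[8] = '\0';
	/* Skip the magic, then read thirteen fixed-width hex fields. */
	for (i = 0, s += 6; i < 13; i++, s += 8) {
		memcpy(buf, s, 8);
		f[i] = strtoul(buf, NULL, 16);
	}
	h->ino = f[0];       h->mode = f[1];     h->uid = f[2];
	h->gid = f[3];       h->nlink = f[4];    h->mtime = f[5];
	h->filesize = f[6];  h->major = f[7];    h->minor = f[8];
	h->rmajor = f[9];    h->rminor = f[10];  h->namesize = f[11];
	h->chksum = f[12];
	return 0;
}

/* Encode a header in the same field order as gen_init_cpio's sprintf(). */
static int newc_format(char *out, size_t outlen, const struct newc_hdr *h)
{
	/* Mask each field to 32 bits so it always prints as 8 hex digits. */
#define F(x) ((unsigned long)((x) & 0xffffffffUL))
	return snprintf(out, outlen,
		"070701%08lX%08lX%08lX%08lX%08lX%08lX%08lX"
		"%08lX%08lX%08lX%08lX%08lX%08lX",
		F(h->ino), F(h->mode), F(h->uid), F(h->gid), F(h->nlink),
		F(h->mtime), F(h->filesize), F(h->major), F(h->minor),
		F(h->rmajor), F(h->rminor), F(h->namesize), F(h->chksum));
#undef F
}
```

Formatting a header with newc_format() into a buffer of at least
NEWC_HDR_LEN + 1 bytes and feeding it back to newc_parse() should return
the same field values.  The NUL-terminated name (namesize bytes) and its
padding follow the header in the archive.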



The Future.

Early userspace is going to be merged in a series of evolutionary 
changes, following what I call "The Al Viro model."  NO KERNEL BEHAVIOR 
SHOULD CHANGE.  [that's for the lkml listeners, not you <g>]  "make" 
will continue to simply Do The Right Thing(tm) on all platforms, while 
the kernel image continues to get progressively smaller.  Here is the 
initial plan for early userspace, i.e. the patches you are going to be 
seeing next week:

#2 - merge klibc.

As I said earlier, I am not sure if we will wind up removing klibc just 
before 2.6.x release or not.  Comments welcome.  But for now, klibc will 
be merged into the kernel tarball, because otherwise version drift 
during the evolution of early userspace will be a huge PITA, and slow 
things down.  It is a tiny libc written specifically for the kernel.

This patch will add klibc to the build system, and create a tiny, 
statically-linked binary "kinit".  kinit is the beginnings of early 
userspace.  Some tiny, token amount of do_mounts.c code will be moving 
into kinit in patch #2, only enough to prove the system is working.

#3 - move initrd to userspace

Unfortunately we don't start seeing tangible benefits to early userspace 
until this patch, but that's how evolution works :)  Here, initrd 
unpacking code is moved to userspace, as much as possible.  Some initrd 
code will inevitably stay in the kernel, because it is arch-specific how 
to grab the initrd image from bootmem [or wherever], but the vast
majority of initrd code goes poof (yay!).  No initrd behavior will 
change at all, from current kernels.  It is simply getting moved to 
early userspace.  Users will not need to do anything on their end to 
make sure their existing setups continue to work -- any such actions are 
a bug on my part.

This patch will also turn "kinit" into a shared binary, and introduce 
the gzip binary into early userspace.  [see "Items For Discussion" 
below, too, WRT this.]

#4 - move mounting root to userspace

People probably breathed a sigh of relief at patch #3; they will heave a
bigger sigh for this patch :)   This moves mounting of the root 
filesystem to early userspace, including getting rid of 
NFSroot/bootp/dhcp code in the kernel.

#N - to infinity... and beyond!

I, and hopefully others, will continue in the series of evolutionary 
patches, moving more and more stuff to early userspace.  There are a lot 
of possibilities, and I will be looking for input from others on useful 
things to move, as well as continuing my own work of finding items that 
can be moved.



Items For Discussion

#1 - shared kinit

"kinit" is _the_ early userspace binary -- but not necessarily the only 
one.  Peter Anvin and Russell King have several binaries in the klibc 
tarball, gzip, ash, and several smaller utilities.  Peter also put work 
into making klibc a shared object -- that doesn't need an shlib loader. 
 It's pretty nifty how he does it, IMO:  klibc.so becomes an ELF 
interpreter for any klibc-linked binaries.  klibc-linked binaries are, 
to the ELF system, static binaries, but they wind up sharing klibc.so 
anyway due to this trick.

Anyway, there is a certain elegance in adding code to kinit instead of
an explosion of binaries and shell scripts.  The other side of that coin 
is that with elegance you sacrifice some ease of making changes.  I am 
60% certain we want a shared klibc and multiple binaries, but am willing 
to be convinced in either direction.  If you think about it, there _are_ 
several benefits to leaving kinit as the lone binary in the stock kernel 
early userspace build, so the decision is not as cut-n-dry as it may 
immediately seem.

#2 - klibc in the kernel tarball

It's going in, for now.  That's not open to discussion.  However, the 
future is...   I know the old maxim of "once it's in, it never goes
away."  Maybe that's the case, and if so, no big deal.  I know at least 
a couple people who would like to see it leave the kernel tarball before 
2.6.0 is released.  I solicit comments on this item, though I think we 
won't be in a position to answer this question until 2.6.0 release
is near.



That's it for now.  Questions, comments, and flames welcome.  If I 
missed some early userspace benefits, let me know.  If you know of good 
things to move to early userspace, let me know.  If there are upcoming 
bumps in the road I did not mention, let me know.

Credits:  Al Viro for the initramfs work.  hpa, rmk, Greg KH, Alan, and 
innumerable others have contributed ideas if not actual code towards 
this effort.  And thanks to our Emperor Penguin for giving me a break,
when I blew the Halloween deadline spending 24 hours debugging an 
_incredibly_ stupid bug on my part.  I deserve not one but two brown 
paper bags for that one.



[-- Attachment #2: minitramfs-2.5.txt --]
[-- Type: text/plain, Size: 811 bytes --]

Linus, please do a

	bk pull http://gkernel.bkbits.net/minitramfs-2.5

This will update the following files:

 Makefile                |    2 
 arch/i386/Makefile      |    1 
 arch/i386/vmlinux.lds.S |    4 
 init/Makefile           |    2 
 init/do_mounts.c        |    4 
 init/initramfs.c        |  466 +++++++++++++++++++++++++++++++++++++++++++++++-
 init/main.c             |    2 
 usr/Makefile            |   18 +
 usr/gen_init_cpio.c     |  137 ++++++++++++++
 9 files changed, 629 insertions(+), 7 deletions(-)

through these ChangeSets:

<jgarzik@redhat.com> (02/11/01 1.862.2.2)
   Kill stupid bug in initramfs that prevented it from working.
   (thanks to Al Viro for his patience, I owe him one)

<jgarzik@redhat.com> (02/11/01 1.858.2.1)
   Minimal initramfs support (based on Al Viro's work).


[-- Attachment #3: patch --]
[-- Type: text/plain, Size: 16072 bytes --]

diff -Nru a/Makefile b/Makefile
--- a/Makefile	Sat Nov  2 02:34:50 2002
+++ b/Makefile	Sat Nov  2 02:34:50 2002
@@ -209,7 +209,7 @@
 drivers-y	:= drivers/ sound/
 net-y		:= net/
 libs-y		:= lib/
-core-y		:=
+core-y		:= usr/
 SUBDIRS		:=
 
 ifeq ($(filter $(noconfig_targets),$(MAKECMDGOALS)),)
diff -Nru a/arch/i386/Makefile b/arch/i386/Makefile
--- a/arch/i386/Makefile	Sat Nov  2 02:34:50 2002
+++ b/arch/i386/Makefile	Sat Nov  2 02:34:50 2002
@@ -18,6 +18,7 @@
 
 LDFLAGS		:= -m elf_i386
 OBJCOPYFLAGS	:= -O binary -R .note -R .comment -S
+ARCHBLOBLFLAGS	:= -I binary -O elf32-i386 -B i386
 LDFLAGS_vmlinux := -e stext
 
 CFLAGS += -pipe
diff -Nru a/arch/i386/vmlinux.lds.S b/arch/i386/vmlinux.lds.S
--- a/arch/i386/vmlinux.lds.S	Sat Nov  2 02:34:50 2002
+++ b/arch/i386/vmlinux.lds.S	Sat Nov  2 02:34:50 2002
@@ -77,6 +77,10 @@
 	*(.initcall7.init)
   }
   __initcall_end = .;
+  . = ALIGN(4096);
+  __initramfs_start = .;
+  .init.ramfs : { *(.init.initramfs) }
+  __initramfs_end = .;
   . = ALIGN(32);
   __per_cpu_start = .;
   .data.percpu  : { *(.data.percpu) }
diff -Nru a/init/Makefile b/init/Makefile
--- a/init/Makefile	Sat Nov  2 02:34:50 2002
+++ b/init/Makefile	Sat Nov  2 02:34:50 2002
@@ -2,7 +2,7 @@
 # Makefile for the linux kernel.
 #
 
-obj-y    := main.o version.o do_mounts.o
+obj-y    := main.o version.o do_mounts.o initramfs.o
 
 # files to be removed upon make clean
 clean-files := ../include/linux/compile.h
diff -Nru a/init/do_mounts.c b/init/do_mounts.c
--- a/init/do_mounts.c	Sat Nov  2 02:34:50 2002
+++ b/init/do_mounts.c	Sat Nov  2 02:34:50 2002
@@ -748,9 +748,7 @@
 		mount_initrd = 0;
 	real_root_dev = ROOT_DEV;
 #endif
-	sys_mkdir("/dev", 0700);
-	sys_mkdir("/root", 0700);
-	sys_mknod("/dev/console", S_IFCHR|0600, MKDEV(TTYAUX_MAJOR, 1));
+
 #ifdef CONFIG_DEVFS_FS
 	sys_mount("devfs", "/dev", "devfs", 0, NULL);
 	do_devfs = 1;
diff -Nru a/init/initramfs.c b/init/initramfs.c
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/init/initramfs.c	Sat Nov  2 02:34:50 2002
@@ -0,0 +1,462 @@
+#define __KERNEL_SYSCALLS__
+#include <linux/init.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/fcntl.h>
+#include <linux/unistd.h>
+#include <linux/delay.h>
+
+static void __init error(char *x)
+{
+	panic("populate_root: %s\n", x);
+}
+
+static void __init *malloc(int size)
+{
+	return kmalloc(size, GFP_KERNEL);
+}
+
+static void __init free(void *where)
+{
+	kfree(where);
+}
+
+asmlinkage long sys_mkdir(char *name, int mode);
+asmlinkage long sys_mknod(char *name, int mode, dev_t dev);
+asmlinkage long sys_symlink(char *old, char *new);
+asmlinkage long sys_link(char *old, char *new);
+asmlinkage long sys_write(int fd, void *buf, ssize_t size);
+asmlinkage long sys_chown(char *name, uid_t uid, gid_t gid);
+asmlinkage long sys_lchown(char *name, uid_t uid, gid_t gid);
+asmlinkage long sys_fchown(int fd, uid_t uid, gid_t gid);
+asmlinkage long sys_chmod(char *name, mode_t mode);
+asmlinkage long sys_fchmod(int fd, mode_t mode);
+
+/* link hash */
+
+static struct hash {
+	int ino, minor, major;
+	struct hash *next;
+	char *name;
+} *head[32];
+
+static inline int hash(int major, int minor, int ino)
+{
+	unsigned long tmp = ino + minor + (major << 3);
+	tmp += tmp >> 5;
+	return tmp & 31;
+}
+
+static char __init *find_link(int major, int minor, int ino, char *name)
+{
+	struct hash **p, *q;
+	for (p = head + hash(major, minor, ino); *p; p = &(*p)->next) {
+		if ((*p)->ino != ino)
+			continue;
+		if ((*p)->minor != minor)
+			continue;
+		if ((*p)->major != major)
+			continue;
+		return (*p)->name;
+	}
+	q = (struct hash *)malloc(sizeof(struct hash));
+	if (!q)
+		error("can't allocate link hash entry");
+	q->ino = ino;
+	q->minor = minor;
+	q->major = major;
+	q->name = name;
+	q->next = NULL;
+	*p = q;
+	return NULL;
+}
+
+static void __init free_hash(void)
+{
+	struct hash **p, *q;
+	for (p = head; p < head + 32; p++) {
+		while (*p) {
+			q = *p;
+			*p = q->next;
+			free(q);
+		}
+	}
+}
+
+/* cpio header parsing */
+
+static __initdata unsigned long ino, major, minor, nlink;
+static __initdata mode_t mode;
+static __initdata unsigned long body_len, name_len;
+static __initdata uid_t uid;
+static __initdata gid_t gid;
+static __initdata dev_t rdev;
+
+static void __init parse_header(char *s)
+{
+	unsigned long parsed[12];
+	char buf[9];
+	int i;
+
+	buf[8] = '\0';
+	for (i = 0, s += 6; i < 12; i++, s += 8) {
+		memcpy(buf, s, 8);
+		parsed[i] = simple_strtoul(buf, NULL, 16);
+	}
+	ino = parsed[0];
+	mode = parsed[1];
+	uid = parsed[2];
+	gid = parsed[3];
+	nlink = parsed[4];
+	body_len = parsed[6];
+	major = parsed[7];
+	minor = parsed[8];
+	rdev = MKDEV(parsed[9], parsed[10]);
+	name_len = parsed[11];
+}
+
+/* FSM */
+
+enum state {
+	Start,
+	Collect,
+	GotHeader,
+	SkipIt,
+	GotName,
+	CopyFile,
+	GotSymlink,
+	Reset
+} state, next_state;
+
+char *victim;
+unsigned count;
+loff_t this_header, next_header;
+
+static inline void eat(unsigned n)
+{
+	victim += n;
+	this_header += n;
+	count -= n;
+}
+
+#define N_ALIGN(len) ((((len) + 1) & ~3) + 2)
+
+static __initdata char *collected;
+static __initdata int remains;
+static __initdata char *collect;
+
+static void __init read_into(char *buf, unsigned size, enum state next)
+{
+	if (count >= size) {
+		collected = victim;
+		eat(size);
+		state = next;
+	} else {
+		collect = collected = buf;
+		remains = size;
+		next_state = next;
+		state = Collect;
+	}
+}
+
+static __initdata char *header_buf, *symlink_buf, *name_buf;
+
+static int __init do_start(void)
+{
+	read_into(header_buf, 110, GotHeader);
+	return 0;
+}
+
+static int __init do_collect(void)
+{
+	unsigned n = remains;
+	if (count < n)
+		n = count;
+	memcpy(collect, victim, n);
+	eat(n);
+	collect += n;
+	if (remains -= n)
+		return 1;
+	state = next_state;
+	return 0;
+}
+
+static int __init do_header(void)
+{
+	parse_header(collected);
+	next_header = this_header + N_ALIGN(name_len) + body_len;
+	next_header = (next_header + 3) & ~3;
+	if (name_len <= 0 || name_len > PATH_MAX)
+		state = SkipIt;
+	else if (S_ISLNK(mode)) {
+		if (body_len > PATH_MAX)
+			state = SkipIt;
+		else {
+			collect = collected = symlink_buf;
+			remains = N_ALIGN(name_len) + body_len;
+			next_state = GotSymlink;
+			state = Collect;
+		}
+	} else if (body_len && !S_ISREG(mode))
+		state = SkipIt;
+	else
+		read_into(name_buf, N_ALIGN(name_len), GotName);
+	return 0;
+}
+
+static int __init do_skip(void)
+{
+	if (this_header + count <= next_header) {
+		eat(count);
+		return 1;
+	} else {
+		eat(next_header - this_header);
+		state = next_state;
+		return 0;
+	}
+}
+
+static int __init do_reset(void)
+{
+	while(count && *victim == '\0')
+		eat(1);
+	if (count && (this_header & 3))
+		error("broken padding");
+	return 1;
+}
+
+static int __init maybe_link(void)
+{
+	if (nlink >= 2) {
+		char *old = find_link(major, minor, ino, collected);
+		if (old)
+			return (sys_link(old, collected) < 0) ? -1 : 1;
+	}
+	return 0;
+}
+
+static __initdata int wfd;
+
+static int __init do_name(void)
+{
+	state = SkipIt;
+	next_state = Start;
+	if (strcmp(collected, "TRAILER!!!") == 0) {
+		free_hash();
+		next_state = Reset;
+		return 0;
+	}
+	printk(KERN_INFO "-> %s\n", collected);
+	if (S_ISREG(mode)) {
+		if (maybe_link() >= 0) {
+			wfd = sys_open(collected, O_WRONLY|O_CREAT, mode);
+			if (wfd >= 0) {
+				sys_fchown(wfd, uid, gid);
+				sys_fchmod(wfd, mode);
+				state = CopyFile;
+			}
+		}
+	} else if (S_ISDIR(mode)) {
+		sys_mkdir(collected, mode);
+		sys_chown(collected, uid, gid);
+	} else if (S_ISBLK(mode) || S_ISCHR(mode) ||
+		   S_ISFIFO(mode) || S_ISSOCK(mode)) {
+		if (maybe_link() == 0) {
+			sys_mknod(collected, mode, rdev);
+			sys_chown(collected, uid, gid);
+		}
+	} else
+		panic("populate_root: bogus mode: %o\n", mode);
+	return 0;
+}
+
+static int __init do_copy(void)
+{
+	if (count >= body_len) {
+		sys_write(wfd, victim, body_len);
+		sys_close(wfd);
+		eat(body_len);
+		state = SkipIt;
+		return 0;
+	} else {
+		sys_write(wfd, victim, count);
+		body_len -= count;
+		eat(count);
+		return 1;
+	}
+}
+
+static int __init do_symlink(void)
+{
+	collected[N_ALIGN(name_len) + body_len] = '\0';
+	sys_symlink(collected + N_ALIGN(name_len), collected);
+	sys_lchown(collected, uid, gid);
+	state = SkipIt;
+	next_state = Start;
+	return 0;
+}
+
+static __initdata int (*actions[])(void) = {
+	[Start]		do_start,
+	[Collect]	do_collect,
+	[GotHeader]	do_header,
+	[SkipIt]	do_skip,
+	[GotName]	do_name,
+	[CopyFile]	do_copy,
+	[GotSymlink]	do_symlink,
+	[Reset]		do_reset,
+};
+
+static int __init write_buffer(char *buf, unsigned len)
+{
+	count = len;
+	victim = buf;
+
+	while (!actions[state]())
+		;
+	return len - count;
+}
+
+static void __init flush_buffer(char *buf, unsigned len)
+{
+	int written;
+	while ((written = write_buffer(buf, len)) < len) {
+		char c = buf[written];
+		if (c == '0') {
+			buf += written;
+			len -= written;
+			state = Start;
+			continue;
+		} else
+			error("junk in compressed archive");
+	}
+}
+
+/*
+ * gzip declarations
+ */
+
+#define OF(args)  args
+
+#ifndef memzero
+#define memzero(s, n)     memset ((s), 0, (n))
+#endif
+
+typedef unsigned char  uch;
+typedef unsigned short ush;
+typedef unsigned long  ulg;
+
+#define WSIZE 0x8000    /* window size--must be a power of two, and */
+			/*  at least 32K for zip's deflate method */
+
+static uch *inbuf;
+static uch *window;
+
+static unsigned insize;  /* valid bytes in inbuf */
+static unsigned inptr;   /* index of next byte to be processed in inbuf */
+static unsigned outcnt;  /* bytes in output buffer */
+static long bytes_out;
+
+#define get_byte()  (inptr < insize ? inbuf[inptr++] : -1)
+		
+/* Diagnostic functions (stubbed out) */
+#define Assert(cond,msg)
+#define Trace(x)
+#define Tracev(x)
+#define Tracevv(x)
+#define Tracec(c,x)
+#define Tracecv(c,x)
+
+#define STATIC static
+
+static void flush_window(void);
+static void error(char *m);
+static void gzip_mark(void **);
+static void gzip_release(void **);
+
+#include "../lib/inflate.c"
+
+static void __init gzip_mark(void **ptr)
+{
+}
+
+static void __init gzip_release(void **ptr)
+{
+}
+
+/* ===========================================================================
+ * Write the output window window[0..outcnt-1] and update crc and bytes_out.
+ * (Used for the decompressed data only.)
+ */
+static void __init flush_window(void)
+{
+	ulg c = crc;         /* temporary variable */
+	unsigned n;
+	uch *in, ch;
+
+	flush_buffer(window, outcnt);
+	in = window;
+	for (n = 0; n < outcnt; n++) {
+		ch = *in++;
+		c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
+	}
+	crc = c;
+	bytes_out += (ulg)outcnt;
+	outcnt = 0;
+}
+
+static void __init unpack_to_rootfs(char *buf, unsigned len)
+{
+	int written;
+	header_buf = malloc(110);
+	symlink_buf = malloc(PATH_MAX + N_ALIGN(PATH_MAX) + 1);
+	name_buf = malloc(N_ALIGN(PATH_MAX));
+	window = malloc(WSIZE);
+	if (!window || !header_buf || !symlink_buf || !name_buf)
+		error("can't allocate buffers");
+	state = Start;
+	this_header = 0;
+	while (len) {
+		loff_t saved_offset = this_header;
+		if (*buf == '0' && !(this_header & 3)) {
+			state = Start;
+			written = write_buffer(buf, len);
+			buf += written;
+			len -= written;
+			continue;
+		} else if (!*buf) {
+			buf++;
+			len--;
+			this_header++;
+			continue;
+		}
+		this_header = 0;
+		insize = len;
+		inbuf = buf;
+		inptr = 0;
+		outcnt = 0;		/* bytes in output buffer */
+		bytes_out = 0;
+		crc = (ulg)0xffffffffL; /* shift register contents */
+		makecrc();
+		if (gunzip())
+			error("ungzip failed");
+		if (state != Reset)
+			error("junk in gzipped archive");
+		this_header = saved_offset + inptr;
+		buf += inptr;
+		len -= inptr;
+	}
+	free(window);
+	free(name_buf);
+	free(symlink_buf);
+	free(header_buf);
+}
+
+extern unsigned long __initramfs_start, __initramfs_end;
+
+void __init populate_rootfs(void)
+{
+	unpack_to_rootfs((void *) &__initramfs_start,
+			 &__initramfs_end - &__initramfs_start);
+}
diff -Nru a/init/main.c b/init/main.c
--- a/init/main.c	Sat Nov  2 02:34:50 2002
+++ b/init/main.c	Sat Nov  2 02:34:50 2002
@@ -72,6 +72,7 @@
 extern void pte_chain_init(void);
 extern void radix_tree_init(void);
 extern void free_initmem(void);
+extern void populate_rootfs(void);
 
 #ifdef CONFIG_TC
 extern void tc_init(void);
@@ -433,6 +434,7 @@
 	vfs_caches_init(num_physpages);
 	radix_tree_init();
 	signals_init();
+	populate_rootfs();
 #ifdef CONFIG_PROC_FS
 	proc_root_init();
 #endif
diff -Nru a/usr/Makefile b/usr/Makefile
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/usr/Makefile	Sat Nov  2 02:34:50 2002
@@ -0,0 +1,18 @@
+
+include arch/$(ARCH)/Makefile
+
+obj-y := initramfs_data.o
+
+host-progs := gen_init_cpio
+
+clean-files := initramfs_data.cpio.gz
+
+$(obj)/initramfs_data.o: $(obj)/initramfs_data.cpio.gz
+	$(OBJCOPY) $(ARCHBLOBLFLAGS) \
+		--rename-section .data=.init.initramfs \
+		$(obj)/initramfs_data.cpio.gz $(obj)/initramfs_data.o
+	$(STRIP) -s $(obj)/initramfs_data.o
+
+$(obj)/initramfs_data.cpio.gz: $(obj)/gen_init_cpio
+	( cd $(obj) ; ./gen_init_cpio | gzip -9c > initramfs_data.cpio.gz )
+
diff -Nru a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/usr/gen_init_cpio.c	Sat Nov  2 02:34:50 2002
@@ -0,0 +1,137 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <string.h>
+#include <unistd.h>
+#include <time.h>
+
+static unsigned int offset;
+static unsigned int ino = 721;
+
+static void push_rest(const char *name)
+{
+	unsigned int name_len = strlen(name) + 1;
+	unsigned int tmp_ofs;
+
+	fputs(name, stdout);
+	putchar(0);
+	offset += name_len;
+
+	tmp_ofs = name_len + 110;
+	while (tmp_ofs & 3) {
+		putchar(0);
+		offset++;
+		tmp_ofs++;
+	}
+}
+
+static void push_hdr(const char *s)
+{
+	fputs(s, stdout);
+	offset += 110;
+}
+
+static void cpio_trailer(void)
+{
+	char s[256];
+	const char *name = "TRAILER!!!";
+
+	sprintf(s, "%s%08X%08X%08lX%08lX%08X%08lX"
+	       "%08X%08X%08X%08X%08X%08X%08X",
+		"070701",		/* magic */
+		0,			/* ino */
+		0,			/* mode */
+		(long) 0,		/* uid */
+		(long) 0,		/* gid */
+		1,			/* nlink */
+		(long) 0,		/* mtime */
+		0,			/* filesize */
+		0,			/* major */
+		0,			/* minor */
+		0,			/* rmajor */
+		0,			/* rminor */
+		(unsigned int) strlen(name) + 1, /* namesize */
+		0);			/* chksum */
+	push_hdr(s);
+	push_rest(name);
+
+	while (offset % 512) {
+		putchar(0);
+		offset++;
+	}
+}
+
+static void cpio_mkdir(const char *name, unsigned int mode,
+		       uid_t uid, gid_t gid)
+{
+	char s[256];
+	time_t mtime = time(NULL);
+
+	sprintf(s,"%s%08X%08X%08lX%08lX%08X%08lX"
+	       "%08X%08X%08X%08X%08X%08X%08X",
+		"070701",		/* magic */
+		ino++,			/* ino */
+		S_IFDIR | mode,		/* mode */
+		(long) uid,		/* uid */
+		(long) gid,		/* gid */
+		2,			/* nlink */
+		(long) mtime,		/* mtime */
+		0,			/* filesize */
+		3,			/* major */
+		1,			/* minor */
+		0,			/* rmajor */
+		0,			/* rminor */
+		(unsigned int) strlen(name) + 1, /* namesize */
+		0);			/* chksum */
+	push_hdr(s);
+	push_rest(name);
+}
+
+static void cpio_mknod(const char *name, unsigned int mode,
+		       uid_t uid, gid_t gid, int dev_type,
+		       unsigned int maj, unsigned int min)
+{
+	char s[256];
+	time_t mtime = time(NULL);
+
+	if (dev_type == 'b')
+		mode |= S_IFBLK;
+	else
+		mode |= S_IFCHR;
+
+	sprintf(s,"%s%08X%08X%08lX%08lX%08X%08lX"
+	       "%08X%08X%08X%08X%08X%08X%08X",
+		"070701",		/* magic */
+		ino++,			/* ino */
+		mode,			/* mode */
+		(long) uid,		/* uid */
+		(long) gid,		/* gid */
+		1,			/* nlink */
+		(long) mtime,		/* mtime */
+		0,			/* filesize */
+		3,			/* major */
+		1,			/* minor */
+		maj,			/* rmajor */
+		min,			/* rminor */
+		(unsigned int) strlen(name) + 1, /* namesize */
+		0);			/* chksum */
+	push_hdr(s);
+	push_rest(name);
+}
+
+int main (int argc, char *argv[])
+{
+	cpio_mkdir("/dev", 0700, 0, 0);
+	cpio_mknod("/dev/console", 0600, 0, 0, 'c', 5, 1);
+	cpio_mkdir("/root", 0700, 0, 0);
+	cpio_trailer();
+
+	exit(0);
+
+	/* silence compiler warnings */
+	return 0;
+	(void) argc;
+	(void) argv;
+}
+
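
The name padding in the patch above is computed in two places: N_ALIGN()
in init/initramfs.c, and the padding loop in push_rest() in
usr/gen_init_cpio.c.  Both are meant to round the 110-byte header plus the
name up to a 4-byte boundary.  The standalone check below is a sketch that
compares the two computations.  padded_name_len() and padding_agrees() are
helpers invented here; only the N_ALIGN macro is copied from the patch.

```c
#include <stddef.h>

#define HDR_LEN 110

/* Copied from init/initramfs.c in the patch. */
#define N_ALIGN(len) ((((len) + 1) & ~3) + 2)

/* Mirrors push_rest(): pad the name until header + name is 4-byte aligned. */
static unsigned int padded_name_len(unsigned int name_len)
{
	unsigned int total = HDR_LEN + name_len;

	while (total & 3)
		total++;
	return total - HDR_LEN;
}

/* Returns 1 if both computations agree for every length in [1, max]. */
static int padding_agrees(unsigned int max)
{
	unsigned int len;

	for (len = 1; len <= max; len++)
		if (N_ALIGN(len) != padded_name_len(len))
			return 0;
	return 1;
}
```

Because 110 leaves a remainder of 2 when divided by 4, the padded name
length is always the smallest value of the form 4k + 2 that is at least
name_len.  For "/dev/console" (12 characters plus the NUL, so name_len is
13) both computations give 14.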


* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02  8:13 [BK PATCHES] initramfs merge, part 1 of N Jeff Garzik
@ 2002-11-02  8:18 ` Jeff Garzik
  2002-11-02  8:42 ` Aaron Lehmann
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Jeff Garzik @ 2002-11-02  8:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LKML, hpa, viro

Oh yeah... quick addition.

At some point in the evolution, I will add the ability to load initramfs 
in all the ways that initrd is currently loaded now (from the 
bootloader, etc.).  Substituting a custom initramfs cpio archive in the 
kernel link will also be added at a later time.





* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02  8:13 [BK PATCHES] initramfs merge, part 1 of N Jeff Garzik
  2002-11-02  8:18 ` Jeff Garzik
@ 2002-11-02  8:42 ` Aaron Lehmann
  2002-11-02  8:46   ` Jeff Garzik
  2002-11-02 19:01   ` Linus Torvalds
  2002-11-02 10:51 ` miltonm
  2002-11-02 17:12 ` Matt Porter
  3 siblings, 2 replies; 18+ messages in thread
From: Aaron Lehmann @ 2002-11-02  8:42 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linus Torvalds, LKML, hpa, viro

On Sat, Nov 02, 2002 at 03:13:45AM -0500, Jeff Garzik wrote:
> The Future.
> 
> Early userspace is going to be merged in a series of evolutionary 
> changes, following what I call "The Al Viro model."  NO KERNEL BEHAVIOR 
> SHOULD CHANGE.  [that's for the lkml listeners, not you <g>]  "make" 
> will continue to simply Do The Right Thing(tm) on all platforms, while 
> the kernel image continues to get progressively smaller.

Won't the initial userspace be linked into the kernel? If so, why will
the kernel image get smaller?


* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02  8:42 ` Aaron Lehmann
@ 2002-11-02  8:46   ` Jeff Garzik
  2002-11-02  8:50     ` H. Peter Anvin
  2002-11-02 19:01   ` Linus Torvalds
  1 sibling, 1 reply; 18+ messages in thread
From: Jeff Garzik @ 2002-11-02  8:46 UTC (permalink / raw)
  To: Aaron Lehmann; +Cc: Linus Torvalds, LKML, hpa, viro

Aaron Lehmann wrote:

>On Sat, Nov 02, 2002 at 03:13:45AM -0500, Jeff Garzik wrote:
>  
>
>>The Future.
>>
>>Early userspace is going to be merged in a series of evolutionary 
>>changes, following what I call "The Al Viro model."  NO KERNEL BEHAVIOR 
>>SHOULD CHANGE.  [that's for the lkml listeners, not you <g>]  "make" 
>>will continue to simply Do The Right Thing(tm) on all platforms, while 
>>the kernel image continues to get progressively smaller.
>>    
>>
>
>Won't the initial userspace be linked into the kernel? If so, why will
>the kernel image get smaller?
>  
>

Yes and no ;-)

Ignoring for a moment initramfses loaded from your bootloader (a la 
initrd)...   The amount of code that runs in kernel space shrinks, which 
is the main point of early userspace.  If you are talking in terms of 
overall kernel image size, yes, but the initramfs cpio archive is
ditched along with the rest of __init code, so you're really only
talking about wasting a couple of additional pages in vmlinux -- a 
slight increase in disk space usage, and that's it.

So runtime memory usage certainly does not increase...

    Jeff






* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02  8:46   ` Jeff Garzik
@ 2002-11-02  8:50     ` H. Peter Anvin
  0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2002-11-02  8:50 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Aaron Lehmann, Linus Torvalds, LKML, viro

Jeff Garzik wrote:
> Aaron Lehmann wrote:
> 
>> On Sat, Nov 02, 2002 at 03:13:45AM -0500, Jeff Garzik wrote:
>>  
>>
>>> The Future.
>>>
>>> Early userspace is going to be merged in a series of evolutionary 
>>> changes, following what I call "The Al Viro model."  NO KERNEL 
>>> BEHAVIOR SHOULD CHANGE.  [that's for the lkml listeners, not you 
>>> <g>]  "make" will continue to simply Do The Right Thing(tm) on all 
>>> platforms, while the kernel image continues to get progressively 
>>> smaller.
>>>   
>>
>>
>> Won't the initial userspace be linked into the kernel? If so, why will
>> the kernel image get smaller?
>>  
>>
> 
> Yes and no ;-)
> 
> Ignoring for a moment initramfses loaded from your bootloader (a la 
> initrd)...   The amount of code that runs in kernel space shrinks, which 
> is the main point of early userspace.  If you are talking in terms of 
> overall kernel image size, yes, but the initramfs cpio archive is
> ditched along with the rest of __init code, so you're really only
> talking about wasting a couple of additional pages in vmlinux -- a 
> slight increase in disk space usage, and that's it.
> 
> So runtime memory usage certainly does not increase...
> 

By the way, the final initramfs should typically be a union of whatever 
sources there are; with the ones linked into the kernel image unpacked 
first (so they can be overwritten if so specified to the bootloader.)

	-hpa
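
The ordering rule described here (unpack the archive linked into the
kernel first, then any archive supplied through the bootloader, so that a
later file replaces an earlier one with the same path) can be modelled in
a few lines.  This is only a toy model under stated assumptions: struct
rootfs, unpack_entry() and source_of() are hypothetical names, and an
in-memory table stands in for the real filesystem.

```c
#include <string.h>

#define MAX_FILES 16

/* Hypothetical stand-in for rootfs: each path remembers who wrote it last. */
struct file {
	const char *path;
	const char *source;
};

struct rootfs {
	struct file files[MAX_FILES];
	int count;
};

/* "Unpack" one entry: create the path, or overwrite it if it already exists. */
static int unpack_entry(struct rootfs *fs, const char *path, const char *source)
{
	int i;

	for (i = 0; i < fs->count; i++) {
		if (strcmp(fs->files[i].path, path) == 0) {
			fs->files[i].source = source;	/* later archive wins */
			return 0;
		}
	}
	if (fs->count == MAX_FILES)
		return -1;	/* table full */
	fs->files[fs->count].path = path;
	fs->files[fs->count].source = source;
	fs->count++;
	return 0;
}

/* Return the source that last wrote 'path', or NULL if it does not exist. */
static const char *source_of(const struct rootfs *fs, const char *path)
{
	int i;

	for (i = 0; i < fs->count; i++)
		if (strcmp(fs->files[i].path, path) == 0)
			return fs->files[i].source;
	return NULL;
}
```

Unpacking a built-in entry and then a bootloader-supplied entry with the
same path leaves a single file whose contents come from the bootloader
copy, while paths that appear only in the built-in archive are untouched.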




* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02  8:13 [BK PATCHES] initramfs merge, part 1 of N Jeff Garzik
  2002-11-02  8:18 ` Jeff Garzik
  2002-11-02  8:42 ` Aaron Lehmann
@ 2002-11-02 10:51 ` miltonm
  2002-11-02 17:12 ` Matt Porter
  3 siblings, 0 replies; 18+ messages in thread
From: miltonm @ 2002-11-02 10:51 UTC (permalink / raw)
  To: Linux Kernel


> Items For Discussion
> 
> #1 - shared kinit
> 
> "kinit" is _the_ early userspace binary -- but not necessarily the only
> one. Peter Anvin and Russell King have several binaries in the klibc
> tarball, gzip, ash, and several smaller utilities. Peter also put work
> into making klibc a shared object -- that doesn't need an shlib loader.
>  It's pretty nifty how he does it, IMO: klibc.so becomes an ELF
> interpreter for any klibc-linked binaries. klibc-linked binaries are,
> to the ELF system, static binaries, but they wind up sharing klibc.so
> anyway due to this trick.
> 
> Anyway, there is a certain elegance in adding code to kinit instead of
> an explosion of binaries and shell scripts. The other side of that coin
> is that with elegance you sacrifice some ease of making changes. I am
> 60% certain we want a shared klibc and multiple binaries, but am willing
> to be convinced in either direction. If you think about it, there _are_
> several benefits to leaving kinit as the lone binary in the stock kernel
> early userspace build, so the decision is not as cut-n-dry as it may
> immediately seem. 


One idea I experimented with some time ago (and can revive after
some sleep) is, rather than interpreting cpio in the kernel, to objcopy
a binary into an init and copy that into pagecache in a ramfs/libfs
file system.  The population code was all init functions, so that it
disappears at runtime.  /dev/initrd was left for userspace to
expand the rest of the loaders.  With libfs, the write code is
reinstated so that standard directories, device nodes, and the console
and initrd nodes can be created and opened in userspace, further
shrinking the statically linked-in code.

This argues that this initial code is unshared and uncompressed
(or rather, compressed like the rest of the kernel); for shared we
would have to copy a couple of pieces this way.  It traded cpio headers
and parsing for a table of offset,length,mode,name.
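
A minimal sketch of what such a table might look like, assuming one row
per file linked into the kernel image.  Only the four column names
(offset, length, mode, name) come from the description above.  The struct
name, the helper functions and the bounds check are assumptions made for
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical row layout for a linked-in file table. */
struct init_file {
	unsigned long offset;	/* where the file starts inside the blob */
	unsigned long length;	/* file size in bytes */
	unsigned int mode;	/* file type and permission bits */
	const char *name;	/* path to create in the ramfs */
};

/* Look up a file by name; returns NULL if the table has no such entry. */
static const struct init_file *find_init_file(const struct init_file *tab,
					      size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return &tab[i];
	return NULL;
}

/* Returns 1 if every entry lies inside a blob of 'blob_len' bytes. */
static int table_fits(const struct init_file *tab, size_t n,
		      unsigned long blob_len)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tab[i].offset > blob_len ||
		    tab[i].length > blob_len - tab[i].offset)
			return 0;
	return 1;
}
```

Compared with cpio, the lookup needs no header parsing at all, at the cost
of a build step that has to generate the table alongside the blob.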

I had this running on 2.4.19-pre10 (around the time of the kernel
summit, just before the fixed directory link counts went in) with
busybox.  (I separated the 2.4 compat vs 2.5 stuff at that time).

Comments?

milton


* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02 19:01   ` Linus Torvalds
@ 2002-11-02 12:07     ` H. Peter Anvin
  2002-11-02 20:24     ` Alexander Viro
  2002-11-02 23:46     ` Dave Cinege
  2 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2002-11-02 12:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Aaron Lehmann, Jeff Garzik, LKML, viro

Linus Torvalds wrote:
> 
> The real advantage to me is two-fold:
> 
>  - make it easier for people to customize their initial system without 
>    having to muck with kernel code or even use a different boot sequence.  
>    One example of this is the difference between vendor install kernels 
>    (using initrd) and a normal install kernel (which doesn't).
> 
>    So I'd much rather see us _always_ using initrd, and the difference 
>    between an install kernel and a regular kernel is really just the size 
>    of the initrd thing.
> 
>  - Many things are much more easily done in user space, because user space 
>    has protections, "infinite stack", and in general a lot better 
>    infrastructure (ie easier to debug etc). At the same time, many things 
>    need to be done _before_ the kernel is fully ready to hand over control 
>    to a normal user space: do ACPI parsing so that we can initialize the
>    devices so that we can use the "real" user space that is installed on
>    disk etc.
> 
>    Sometimes there is overlap between these two things (ie the "easier to 
>    do in user space" and "needs to be done before normal user space can be
>    loaded"). ACPI is one potential example. Mounting the root filesystem 
>    over NFS after having done DHCP or other auto-discovery is another.  
> 

I agree 100% with this.  I don't think <kernel>+<early userspace> will 
ever be smaller than the current kernel, but I have invested quite a bit 
of effort into it for exactly the reasons given above.

klibc binaries might not be what one usually tends to run, but during 
klibc development I could still use standard gdb, strace, and just plain 
"run it off the command line" debugging techniques from a full-blown 
environment.  When that doesn't work (like testing dynamic klibc), 
chroot will usually do the trick.  The compile-test-debug cycle is so 
much faster than for a kernel boot that it's just plain amazing.

	-hpa



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02 17:12 ` Matt Porter
@ 2002-11-02 12:14   ` H. Peter Anvin
  2002-11-02 20:37     ` an idling kernel Anu
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: H. Peter Anvin @ 2002-11-02 12:14 UTC (permalink / raw)
  To: Matt Porter; +Cc: Jeff Garzik, Linus Torvalds, LKML, viro

Matt Porter wrote:
> On Sat, Nov 02, 2002 at 03:13:45AM -0500, Jeff Garzik wrote:
> 
>>#4 - move mounting root to userspace
>>
>>People probably breathed a sigh of relief at patch #3, they will heave a 
>>bigger sigh for this patch :)   This moves mounting of the root 
>>filesystem to early userspace, including getting rid of 
>>NFSroot/bootp/dhcp code in the kernel.
> 
> 
> For those of us who only develop on nfsroot-based systems, does this
> step include adding userspace network interface configuration and
> bootp/dhcp client functionality to kinit?  I want to assume that
> "getting rid of NFSroot/bootp/dhcp" means moving that particular
> functionality as part of this step.  Just wondering what the
> short-term impact will be on the poor embedded guys. :)
> 

Probably not to kinit, but to early userspace, yes.  There is no real 
reason to put everything into kinit, and a lot of these things we have 
already written up as part of the klibc bundle.

	-hpa




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02  8:13 [BK PATCHES] initramfs merge, part 1 of N Jeff Garzik
                   ` (2 preceding siblings ...)
  2002-11-02 10:51 ` miltonm
@ 2002-11-02 17:12 ` Matt Porter
  2002-11-02 12:14   ` H. Peter Anvin
  3 siblings, 1 reply; 18+ messages in thread
From: Matt Porter @ 2002-11-02 17:12 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linus Torvalds, LKML, hpa, viro

On Sat, Nov 02, 2002 at 03:13:45AM -0500, Jeff Garzik wrote:
> #4 - move mounting root to userspace
> 
> People probably breathed a sigh of relief at patch #3, they will heave a 
> bigger sigh for this patch :)   This moves mounting of the root 
> filesystem to early userspace, including getting rid of 
> NFSroot/bootp/dhcp code in the kernel.

For those of us who only develop on nfsroot-based systems, does this
step include adding userspace network interface configuration and
bootp/dhcp client functionality to kinit?  I want to assume that
"getting rid of NFSroot/bootp/dhcp" means moving that particular
functionality as part of this step.  Just wondering what the
short-term impact will be on the poor embedded guys. :)

Regards,
-- 
Matt Porter
porter@cox.net
This is Linux Country. On a quiet night, you can hear Windows reboot.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02  8:42 ` Aaron Lehmann
  2002-11-02  8:46   ` Jeff Garzik
@ 2002-11-02 19:01   ` Linus Torvalds
  2002-11-02 12:07     ` H. Peter Anvin
                       ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Linus Torvalds @ 2002-11-02 19:01 UTC (permalink / raw)
  To: Aaron Lehmann; +Cc: Jeff Garzik, LKML, hpa, viro


On Sat, 2 Nov 2002, Aaron Lehmann wrote:
> 
> Won't the initial userspace be linked into the kernel? If so, why will
> the kernel image get smaller?

Note that the reason I personally really want initramfs is not to make the
kernel boot image smaller, or the kernel sources smaller. That won't
happen for a long time, since I suspect that we'll be carrying the
initramfs user space with us for quite a while (eventually it will
probably split up into a project of its own, but certainly for the
foreseeable future it would be very closely tied to the kernel).

The real advantage to me is two-fold:

 - make it easier for people to customize their initial system without 
   having to muck with kernel code or even use a different boot sequence.  
   One example of this is the difference between vendor install kernels 
   (using initrd) and a normal install kernel (which doesn't).

   So I'd much rather see us _always_ using initrd, and the difference 
   between an install kernel and a regular kernel is really just the size 
   of the initrd thing.

 - Many things are much more easily done in user space, because user space 
   has protections, "infinite stack", and in general a lot better 
   infrastructure (ie easier to debug etc). At the same time, many things 
   need to be done _before_ the kernel is fully ready to hand over control 
   to a normal user space: do ACPI parsing so that we can initialize the
   devices so that we can use the "real" user space that is installed on
   disk etc.

   Sometimes there is overlap between these two things (ie the "easier to 
   do in user space" and "needs to be done before normal user space can be
   loaded"). ACPI is one potential example. Mounting the root filesystem 
   over NFS after having done DHCP or other auto-discovery is another.  

So "shrinking the kernel" is not on my list here. It's really a matter of
"some initialization is better done in user space", and not primarily "we
want to make the kernel smaller". I'm not a big believer in microkernels 
and trying to get everything out of the kernel itself, but I _do_ believe 
that sometimes it's easier to just let the user make his own choices (while 
still giving him all the protection implied by running in user space).

		Linus


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02 19:01   ` Linus Torvalds
  2002-11-02 12:07     ` H. Peter Anvin
@ 2002-11-02 20:24     ` Alexander Viro
  2002-11-02 23:46     ` Dave Cinege
  2 siblings, 0 replies; 18+ messages in thread
From: Alexander Viro @ 2002-11-02 20:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Aaron Lehmann, Jeff Garzik, LKML, hpa



On Sat, 2 Nov 2002, Linus Torvalds wrote:

> Note that the reason I personally really want initramfs is not to make the
> kernel boot image smaller, or the kernel sources smaller. That won't
> happen for a long time, since I suspect that we'll be carrying the
> initramfs user space with us for quite a while (eventually it will
> probably split up into a project of its own, but certainly for the
> foreseeable future it would be very closely tied to the kernel).
> 
> The real advantage to me is two-fold:
[snip]

Let me add the third one: userland is more limited.  And no, that's not
a typo - and it's a good thing.  Userland has to use normal, regular
syscalls instead of poking its fingers into hell knows what parts of
kernel data structures.

Which means that it's more robust and that it doesn't stand in the way
of work on kernel.  90% of PITA with super.c used to be of that kind -
mounting root filesystem had been done with very ugly kludges and what's
more, these kludges got filtered down in the normal codepath.  Getting
rid of that took a _lot_ of very careful manipulations with the guts
of the thing.  And guess what?  There was no reason why all that black
magic would be necessary - current code uses normal, garden-variety
system calls.

In effect, we used to have special cases of mount(2), etc., with very
kludgy semantics.  They were not exposed to userland, but that didn't
make them less nasty or less painful to work with.  They still cluttered
the code, they still stood in the way of work on the thing and they still
were butt-ugly.

And that's what moving code to userland should prevent - it's much easier
to catch somebody bringing a patch with magical extension of system call
than to catch an attempt to sneak special-case code used only by kernel.

BTW, that's a thing we need to watch for - there obviously will be a lot
of patches moving stuff to userland and there will be a strong temptation
to add magic interfaces just for that.  _That_ should be prevented - it's
better to leave ugly crap as is than export the same crap to userland.
The point is to get the things cleaned up and make sure that they stay
clean, not to cement them in place by adding a magic ioctl/syscall/flag/whatnot.
We may very well end up extending existing interfaces, but we'd damn better
make sure that such additions make sense for generic use.

We have a lot of ugly crap that would be unnecessary if we had early
access to writable fs.  Basically, we got magic methods, magic codepaths,
etc. simply because the normal access to the functionality in question
required opened file descriptors.  Now we _do_ have a writable filesystem
mounted very early, so that cruft can be killed off.  And moving code
to userland acts as a filter - there we don't have access to magic, so
all such magic immediately shows up.  It could be done in the kernel
(and quite a few things had been done already), but move to userland
acts as a safeguard against reintroduction of magic crap.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* an idling kernel
  2002-11-02 12:14   ` H. Peter Anvin
@ 2002-11-02 20:37     ` Anu
  2002-11-02 22:16       ` Jos Hulzink
                         ` (2 more replies)
  2002-11-02 20:37     ` [BK PATCHES] initramfs merge, part 1 of N Alexander Viro
  2002-11-02 23:36     ` Matt Porter
  2 siblings, 3 replies; 18+ messages in thread
From: Anu @ 2002-11-02 20:37 UTC (permalink / raw)
  To: LKML

disclaimer: if this is the wrong ng to be posting this to, it's only due to
ignorance.. I don't know the first thing about where to post this
question..

----------------------------------------------------------------------

Hello,
	I'm ready to be beaten up for asking this question (I am not sure
which group to post to -- all this is new to me) but I was wondering how
one could figure out if the kernel was in idle mode (or idling).

I *have* tried to look for the answer and here is what I have come up with
so far:

Process 0 is the idle process.. but I don't understand how you can tell if
this means that the kernel is in idle mode. Do we just probe the state
field of all process entries and check to see if everyone is sleeping and
conclude that the kernel is idling??

for_each_process(p)
 {
    if(p->state == S)    /* S = sleeping */
     {
        countup++;
     }
 }

if countup == number of processes, then the kernel was idling?


-anu

********************************************************************************

			     Think, Train, Be

*******************************************************************************



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02 12:14   ` H. Peter Anvin
  2002-11-02 20:37     ` an idling kernel Anu
@ 2002-11-02 20:37     ` Alexander Viro
  2002-11-02 23:36     ` Matt Porter
  2 siblings, 0 replies; 18+ messages in thread
From: Alexander Viro @ 2002-11-02 20:37 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Matt Porter, Jeff Garzik, Linus Torvalds, LKML



On Sat, 2 Nov 2002, H. Peter Anvin wrote:

> Probably not to kinit, but to early userspace, yes.  There is no real 
> reason to put everything into kinit, and a lot of these things we have 
> already written up as part of the klibc bundle.

s/probably/definitely/

There are a lot of reasons for _not_ putting everything into one binary -
if nothing else, it allows us to deal with situations like
	/* do a lot of things that are OK for userland */
	/* do ugly magic */
	/* do a lot of things that are OK for userland */
without exporting ugly crap.

It's much better to have several userland helpers called from init sequence
to do sane stuff in userland and leave remaining crap where it is, than
to add user-visible interfaces that don't make sense.
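
A rough sketch of the "several userland helpers called from init sequence"
shape, under the assumption that each helper is an ordinary executable run
to completion in order.  run_helper() and run_sequence() are names invented
for this illustration; they are not part of kinit or klibc.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run one helper to completion; return its exit status, or -1 on failure. */
int run_helper(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execv(argv[0], argv);
		_exit(127);	/* only reached if execv() failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
 * Run the helpers in order and stop at the first one that does not
 * exit with status 0.  Returns how many helpers completed successfully.
 */
int run_sequence(char *const *helpers[], int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (run_helper(helpers[i]) != 0)
			break;
	return i;
}
```

Each helper only gets fork/exec/wait and whatever regular syscalls it makes
itself, so anything that still needs magic stays visible as a step that
could not be moved out.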


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: an idling kernel
  2002-11-02 20:37     ` an idling kernel Anu
@ 2002-11-02 22:16       ` Jos Hulzink
  2002-11-03  0:43       ` identifying the idling kernel and kernel hacking Anu
  2002-11-04 19:16       ` an idling kernel Werner Almesberger
  2 siblings, 0 replies; 18+ messages in thread
From: Jos Hulzink @ 2002-11-02 22:16 UTC (permalink / raw)
  To: Anu, LKML

Hi,

Well.. this mailing list is not that bad for questions like this. You got the 
idea somewhat right, though it is implemented quite differently. The big word 
here is scheduling. A scheduler is a piece of code that determines which 
thread is to be executed next. How this is done is something entire books are 
written about, and a topic that is often discussed on the lkml.

With Linux, the idle thread is entered when the scheduler finds no threads 
ready for execution (not only sleeping, but also waiting for disk etc.). With 
some BSD clones, there is something of an idle queue, a list of threads that 
are only to be executed when the system is actually idle. When the scheduler 
accesses this queue, you know it has nothing important to do anymore. Linux 
uses priority queueing (see the nice manual for info on that). But for both 
solutions the following holds: as soon as the scheduler reaches the end of 
the queue (Linux) / queues (some BSDs) without finding a thread that can be 
executed, the scheduler enters the real idle thread.

In short: you don't really have to count. You only have to check if you reach 
the end of your thread list. Checking if a thread is able to run is what your 
scheduler already does.
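
To make that concrete, here is a toy userspace model, not the real
scheduler code: the task list, the state names and pick_next() are all
invented for illustration.  Walking the list and finding nothing runnable
is what selects the idle task, so no separate counting pass is needed.

```c
#include <stddef.h>

/* Simplified, made-up task states for the model. */
enum task_state { RUNNABLE, SLEEPING, WAITING_IO };

struct task {
	int pid;
	enum task_state state;
};

/* pid 0 plays the role of the idle task; it is never on the list. */
static struct task idle_task = { 0, RUNNABLE };

/*
 * Return the first runnable task on the list.  Reaching the end of
 * the list without finding one means there is nothing to do, so the
 * idle task is returned instead.
 */
struct task *pick_next(struct task *tasks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tasks[i].state == RUNNABLE)
			return &tasks[i];
	return &idle_task;
}

/* The system is "idling" exactly when the idle task gets picked. */
int is_idle(struct task *tasks, size_t n)
{
	return pick_next(tasks, n)->pid == 0;
}
```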

See the scheduler code for more info.

Jos


On Saturday 02 November 2002 21:37, Anu wrote:
> disclaimer: if this is the wrong ng to be posting this to, it's only due to
> ignorance.. I don't know the first thing about where to post this
> question..
>
> ----------------------------------------------------------------------
>
> Hello,
> 	I'm ready to be beaten up for asking this question (I am not sure
> which group to post to -- all this is new to me) but I was wondering how
> one could figure out if the kernel was in idle mode (or idling).
>
> I *have* tried to look for the answer and here is what I have come up with
> so far:
>
> Process 0 is the idle process.. but I don't understand how you can tell if
> this means that the kernel is in idle mode. Do we just probe the state
> field of all process entries and check to see if everyone is sleeping and
> conclude that the kernel is idling??
>
> for_each_process(p)
>  {
>     if(p->state == S)    /* S = sleeping */
>      {
>         countup++;
>      }
>  }
>
> if countup == number of processes, then the kernel was idling?
>
>
> -anu
>
> ********************************************************************************
>
> 			     Think, Train, Be
>
> *******************************************************************************
>
>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02 12:14   ` H. Peter Anvin
  2002-11-02 20:37     ` an idling kernel Anu
  2002-11-02 20:37     ` [BK PATCHES] initramfs merge, part 1 of N Alexander Viro
@ 2002-11-02 23:36     ` Matt Porter
  2 siblings, 0 replies; 18+ messages in thread
From: Matt Porter @ 2002-11-02 23:36 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Matt Porter, Jeff Garzik, Linus Torvalds, LKML, viro

On Sat, Nov 02, 2002 at 04:14:34AM -0800, H. Peter Anvin wrote:
> Matt Porter wrote:
> > On Sat, Nov 02, 2002 at 03:13:45AM -0500, Jeff Garzik wrote:
> > 
> >>#4 - move mounting root to userspace
> >>
> >>People probably breathed a sigh of relief at patch #3, they will heave a 
> >>bigger sigh for this patch :)   This moves mounting of the root 
> >>filesystem to early userspace, including getting rid of 
> >>NFSroot/bootp/dhcp code in the kernel.
> > 
> > 
> > For those of us who only develop on nfsroot-based systems, does this
> > step include adding userspace network interface configuration and
> > bootp/dhcp client functionality to kinit?  I want to assume that
> > "getting rid of NFSroot/bootp/dhcp" means moving that particular
> > functionality as part of this step.  Just wondering what the
> > short-term impact will be on the poor embedded guys. :)
> > 
> 
> Probably not to kinit, but to early userspace, yes.  There is no real 
> reason to put everything into kinit, and a lot of these things we have 
> already written up as part of the klibc bundle.

Ok, sounds good.  I only mentioned kinit since Jeff's roadmap seemed
to be hazy on whether there was consensus on the single binary approach
versus several binaries.  For maintenance's sake, it seems that optional
separate binaries is the only way to go.  Glad to hear that this is the
plan.

Regards,
-- 
Matt Porter
porter@cox.net
This is Linux Country. On a quiet night, you can hear Windows reboot.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BK PATCHES] initramfs merge, part 1 of N
  2002-11-02 19:01   ` Linus Torvalds
  2002-11-02 12:07     ` H. Peter Anvin
  2002-11-02 20:24     ` Alexander Viro
@ 2002-11-02 23:46     ` Dave Cinege
  2 siblings, 0 replies; 18+ messages in thread
From: Dave Cinege @ 2002-11-02 23:46 UTC (permalink / raw)
  To: Linus Torvalds, Aaron Lehmann; +Cc: Jeff Garzik, LKML, hpa, viro

[-- Attachment #1: Type: text/plain, Size: 1386 bytes --]

On Saturday 02 November 2002 14:01, Linus Torvalds wrote:
>
> Note that the reason I personally really want initramfs is not to make the
> kernel boot image smaller, or the kernel sources smaller.

Again for your consideration:

Initrd Dynamic  (Dynamic Initial Ramdisk)

Initrd Dynamic allows extracting tar and tar.gz archives to the rootfs.
It additionally cleans up do_mounts, and rewrites the legacy initrd system. 

It provides the same functionality as initramfs but in a more mature and 
robust system. It does not depend on legacy initrd operation. It will
prepare 'early userspace' with klibc, et al. 

With your acceptance an additional patch will be forthcoming making
the legacy initrd system a compile time option, and moving the call to 
initrd_mount() from do_mounts, to main. (IE compile time initrd.o)

Further patches will purge specific legacy initrd operations from the
general code base and move them to initrd.c where appropriate. 

A patch against 2.5.45 is here and attached:
http://ftp.psychosis.com/linux/initrd-dyn/kernelpatches/2.5.45/initrd_dynamic-2.5.45.diff.gz

You can view the primary files involved here, already 'post-patched' 2.5.45:
http://ftp.psychosis.com/linux/initrd-dyn/kernelpatches/2.5.45/do_mounts.c
http://ftp.psychosis.com/linux/initrd-dyn/kernelpatches/2.5.45/initrd.c
http://ftp.psychosis.com/linux/initrd-dyn/kernelpatches/2.5.45/untar.c


[-- Attachment #2: initrd_dynamic-2.5.45.diff --]
[-- Type: text/x-diff, Size: 54330 bytes --]

diff -uNr linux-2.5.45-virgin/drivers/block/Kconfig linux-2.5.45-initrd_dyn/drivers/block/Kconfig
--- linux-2.5.45-virgin/drivers/block/Kconfig	2002-11-01 05:24:59.000000000 -0500
+++ linux-2.5.45-initrd_dyn/drivers/block/Kconfig	2002-11-01 01:55:37.000000000 -0500
@@ -331,6 +331,18 @@
 	  "real" root file system, etc. See <file:Documentation/initrd.txt>
 	  for details.
 
+config BLK_DEV_INITRD_UNTAR
+	bool "Initial RAM disk untar support (requires TMPFS)"
+	depends on BLK_DEV_INITRD && TMPFS
+	help
+	  Untar support to a tmpfs root. Say Y.
+
+config BLK_DEV_INITRD_GUNZIP
+	bool "Initial RAM disk gunzip support"
+	depends on BLK_DEV_INITRD
+	help
+	  Use gzip compressed images/archives with initrd. Say Y.
+
 config LBD
 	bool "Support for Large Block Devices"
 	depends on X86
diff -uNr linux-2.5.45-virgin/include/linux/root_dev.h linux-2.5.45-initrd_dyn/include/linux/root_dev.h
--- linux-2.5.45-virgin/include/linux/root_dev.h	2002-10-19 00:01:18.000000000 -0400
+++ linux-2.5.45-initrd_dyn/include/linux/root_dev.h	2002-11-01 03:11:42.000000000 -0500
@@ -3,6 +3,7 @@
 
 enum {
 	Root_NFS = MKDEV(UNNAMED_MAJOR, 255),
+	Root_TMPFS = MKDEV(UNNAMED_MAJOR, 12),
 	Root_RAM0 = MKDEV(RAMDISK_MAJOR, 0),
 	Root_RAM1 = MKDEV(RAMDISK_MAJOR, 1),
 	Root_FD0 = MKDEV(FLOPPY_MAJOR, 0),
diff -uNr linux-2.5.45-virgin/init/do_mounts.c linux-2.5.45-initrd_dyn/init/do_mounts.c
--- linux-2.5.45-virgin/init/do_mounts.c	2002-11-01 05:25:08.000000000 -0500
+++ linux-2.5.45-initrd_dyn/init/do_mounts.c	2002-11-01 01:56:50.000000000 -0500
@@ -20,8 +20,6 @@
 #include <linux/ext2_fs.h>
 #include <linux/romfs_fs.h>
 
-#define BUILD_CRAMDISK
-
 extern int get_filesystem_list(char * buf);
 
 extern asmlinkage long sys_mount(char *dev_name, char *dir_name, char *type,
@@ -37,24 +35,16 @@
 extern asmlinkage long sys_umount(char *name, int flags);
 extern asmlinkage long sys_ioctl(int fd, int cmd, unsigned long arg);
 
-#ifdef CONFIG_BLK_DEV_INITRD
-unsigned int real_root_dev;	/* do_proc_dointvec cannot handle kdev_t */
-static int __initdata mount_initrd = 1;
+int root_mountflags = MS_RDONLY | MS_VERBOSE;
 
-static int __init no_initrd(char *str)
-{
-	mount_initrd = 0;
-	return 1;
-}
+int __initdata rd_doload;	/* 1 = load initrd, 0 = don't load */
+int __initdata rd_prompt = 1;	/* 1 = prompt for initrd floppy, 0 = don't prompt */
+int __initdata rd_image_start;	/* starting block # of image */
 
-__setup("noinitrd", no_initrd);
-#else
-static int __initdata mount_initrd = 0;
+#ifdef CONFIG_BLK_DEV_INITRD_UNTAR
+static unsigned long __initdata tmpfs_root_fssize = 0;
 #endif
 
-int __initdata rd_doload;	/* 1 = load RAM disk, 0 = don't load */
-
-int root_mountflags = MS_RDONLY | MS_VERBOSE;
 static char root_device_name[64];
 static char saved_root_name[64];
 
@@ -63,13 +53,6 @@
 
 static int do_devfs = 0;
 
-static int __init load_ramdisk(char *str)
-{
-	rd_doload = simple_strtol(str,NULL,0) & 3;
-	return 1;
-}
-__setup("load_ramdisk=", load_ramdisk);
-
 static int __init readonly(char *str)
 {
 	if (*str)
@@ -85,7 +68,6 @@
 	root_mountflags &= ~MS_RDONLY;
 	return 1;
 }
-
 __setup("ro", readonly);
 __setup("rw", readwrite);
 
@@ -220,7 +202,6 @@
 	saved_root_name[63] = '\0';
 	return 1;
 }
-
 __setup("root=", root_dev_setup);
 
 static char * __initdata root_mount_data;
@@ -236,7 +217,6 @@
 	root_fs_names = str;
 	return 1;
 }
-
 __setup("rootflags=", root_data_setup);
 __setup("rootfstype=", fs_names_setup);
 
@@ -266,6 +246,20 @@
 	}
 	*s = '\0';
 }
+
+static void __init bad_root_panic(void)
+{
+        /*
+	 * Allow the user to distinguish between failed open
+	 * and bad superblock on root device.
+	 */
+	printk ("VFS: Cannot open root device \"%s\" or %s\n",
+		root_device_name, kdevname (to_kdev_t(ROOT_DEV)));
+	printk ("Please append a correct \"root=\" boot option\n");
+	panic("VFS: Unable to mount root fs on %s",
+		kdevname(to_kdev_t(ROOT_DEV)));
+}
+
 static void __init mount_block_root(char *name, int flags)
 {
 	char *fs_names = __getname();
@@ -284,15 +278,7 @@
 			case -EINVAL:
 				continue;
 		}
-	        /*
-		 * Allow the user to distinguish between failed open
-		 * and bad superblock on root device.
-		 */
-		printk ("VFS: Cannot open root device \"%s\" or %s\n",
-			root_device_name, kdevname (to_kdev_t(ROOT_DEV)));
-		printk ("Please append a correct \"root=\" boot option\n");
-		panic("VFS: Unable to mount root fs on %s",
-			kdevname(to_kdev_t(ROOT_DEV)));
+		bad_root_panic();
 	}
 	panic("VFS: Unable to mount root fs on %s", kdevname(to_kdev_t(ROOT_DEV)));
 out:
@@ -304,16 +290,63 @@
 		(current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY) ? " readonly" : "");
 }
  
-#ifdef CONFIG_ROOT_NFS
+static int __init mount_tmpfs_root(char *name)
+{
+#ifndef CONFIG_BLK_DEV_INITRD_UNTAR
+	if (ROOT_DEV != Root_TMPFS)
+		return 0;
+		
+	printk(KERN_ERR "VFS: Kernel does not support root fs on tmpfs.\n");
+	bad_root_panic();
+	return 0;
+#else
+	char data[16] = {0};
+	
+	if (ROOT_DEV != Root_TMPFS)
+		return 0;
+		
+	if (tmpfs_root_fssize > 0)
+		snprintf(data, sizeof(data), "size=%luk", tmpfs_root_fssize);
+
+	if (sys_mount(name,"/root","tmpfs",root_mountflags & ~MS_RDONLY,data) == 0) {
+		sys_chdir("/root");
+		ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
+		printk("VFS: Mounted root (tmpfs filesystem). [");
+		if (tmpfs_root_fssize) printk("%luKB",tmpfs_root_fssize); else printk("No");
+		printk(" Ceiling]\n");
+		return 1;
+	}
+	printk(KERN_ERR "VFS: Unable to mount root fs on tmpfs.\n");
+	return 0;
+#endif
+}
+ 
 static int __init mount_nfs_root(void)
 {
+#ifndef CONFIG_ROOT_NFS
+	if (ROOT_DEV != Root_NFS)
+		return 0;
+		
+	printk(KERN_ERR "VFS: Kernel does not support root fs via NFS.\n");
+	bad_root_panic();
+	return 0;
+#else
 	void *data = nfs_root_data();
 
-	if (data && sys_mount("/dev/root","/root","nfs",root_mountflags,data) == 0)
+	if (ROOT_DEV != Root_NFS)
+		return 0;
+
+	if (data && sys_mount("/dev/root","/root","nfs",root_mountflags,data) == 0) {
+		sys_chdir("/root");
+		ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
+		printk("VFS: Mounted root (nfs filesystem).\n");
 		return 1;
+	}
+	printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying initrd.\n");
+	root_dev_setup("/dev/ram0");
 	return 0;
-}
 #endif
+}
 
 static int __init create_dev(char *name, dev_t dev, char *devfs_name)
 {
@@ -336,398 +369,21 @@
 	return sys_symlink(path + n + 5, name);
 }
 
-#if defined(CONFIG_BLK_DEV_RAM) || defined(CONFIG_BLK_DEV_FD)
-static void __init change_floppy(char *fmt, ...)
-{
-	struct termios termios;
-	char buf[80];
-	char c;
-	int fd;
-	va_list args;
-	va_start(args, fmt);
-	vsprintf(buf, fmt, args);
-	va_end(args);
-	fd = open("/dev/root", O_RDWR | O_NDELAY, 0);
-	if (fd >= 0) {
-		sys_ioctl(fd, FDEJECT, 0);
-		close(fd);
-	}
-	printk(KERN_NOTICE "VFS: Insert %s and press ENTER\n", buf);
-	fd = open("/dev/console", O_RDWR, 0);
-	if (fd >= 0) {
-		sys_ioctl(fd, TCGETS, (long)&termios);
-		termios.c_lflag &= ~ICANON;
-		sys_ioctl(fd, TCSETSF, (long)&termios);
-		read(fd, &c, 1);
-		termios.c_lflag |= ICANON;
-		sys_ioctl(fd, TCSETSF, (long)&termios);
-		close(fd);
-	}
-}
-#endif
-
-#ifdef CONFIG_BLK_DEV_RAM
-
-int __initdata rd_prompt = 1;	/* 1 = prompt for RAM disk, 0 = don't prompt */
-
-static int __init prompt_ramdisk(char *str)
-{
-	rd_prompt = simple_strtol(str,NULL,0) & 1;
-	return 1;
-}
-__setup("prompt_ramdisk=", prompt_ramdisk);
-
-int __initdata rd_image_start;		/* starting block # of image */
-
-static int __init ramdisk_start_setup(char *str)
-{
-	rd_image_start = simple_strtol(str,NULL,0);
-	return 1;
-}
-__setup("ramdisk_start=", ramdisk_start_setup);
-
-static int __init crd_load(int in_fd, int out_fd);
-
-/*
- * This routine tries to find a RAM disk image to load, and returns the
- * number of blocks to read for a non-compressed image, 0 if the image
- * is a compressed image, and -1 if an image with the right magic
- * numbers could not be found.
- *
- * We currently check for the following magic numbers:
- * 	minix
- * 	ext2
- *	romfs
- * 	gzip
- */
-static int __init 
-identify_ramdisk_image(int fd, int start_block)
-{
-	const int size = 512;
-	struct minix_super_block *minixsb;
-	struct ext2_super_block *ext2sb;
-	struct romfs_super_block *romfsb;
-	int nblocks = -1;
-	unsigned char *buf;
-
-	buf = kmalloc(size, GFP_KERNEL);
-	if (buf == 0)
-		return -1;
-
-	minixsb = (struct minix_super_block *) buf;
-	ext2sb = (struct ext2_super_block *) buf;
-	romfsb = (struct romfs_super_block *) buf;
-	memset(buf, 0xe5, size);
-
-	/*
-	 * Read block 0 to test for gzipped kernel
-	 */
-	lseek(fd, start_block * BLOCK_SIZE, 0);
-	read(fd, buf, size);
-
-	/*
-	 * If it matches the gzip magic numbers, return -1
-	 */
-	if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236))) {
-		printk(KERN_NOTICE
-		       "RAMDISK: Compressed image found at block %d\n",
-		       start_block);
-		nblocks = 0;
-		goto done;
-	}
-
-	/* romfs is at block zero too */
-	if (romfsb->word0 == ROMSB_WORD0 &&
-	    romfsb->word1 == ROMSB_WORD1) {
-		printk(KERN_NOTICE
-		       "RAMDISK: romfs filesystem found at block %d\n",
-		       start_block);
-		nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS;
-		goto done;
-	}
-
-	/*
-	 * Read block 1 to test for minix and ext2 superblock
-	 */
-	lseek(fd, (start_block+1) * BLOCK_SIZE, 0);
-	read(fd, buf, size);
-
-	/* Try minix */
-	if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
-	    minixsb->s_magic == MINIX_SUPER_MAGIC2) {
-		printk(KERN_NOTICE
-		       "RAMDISK: Minix filesystem found at block %d\n",
-		       start_block);
-		nblocks = minixsb->s_nzones << minixsb->s_log_zone_size;
-		goto done;
-	}
-
-	/* Try ext2 */
-	if (ext2sb->s_magic == cpu_to_le16(EXT2_SUPER_MAGIC)) {
-		printk(KERN_NOTICE
-		       "RAMDISK: ext2 filesystem found at block %d\n",
-		       start_block);
-		nblocks = le32_to_cpu(ext2sb->s_blocks_count);
-		goto done;
-	}
-
-	printk(KERN_NOTICE
-	       "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
-	       start_block);
-	
-done:
-	lseek(fd, start_block * BLOCK_SIZE, 0);
-	kfree(buf);
-	return nblocks;
-}
-#endif
-
-static int __init rd_load_image(char *from)
-{
-	int res = 0;
-
-#ifdef CONFIG_BLK_DEV_RAM
-	int in_fd, out_fd;
-	int nblocks, rd_blocks, devblocks, i;
-	char *buf;
-	unsigned short rotate = 0;
-#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
-	char rotator[4] = { '|' , '/' , '-' , '\\' };
-#endif
-
-	out_fd = open("/dev/ram", O_RDWR, 0);
-	if (out_fd < 0)
-		goto out;
-
-	in_fd = open(from, O_RDONLY, 0);
-	if (in_fd < 0)
-		goto noclose_input;
-
-	nblocks = identify_ramdisk_image(in_fd, rd_image_start);
-	if (nblocks < 0)
-		goto done;
-
-	if (nblocks == 0) {
-#ifdef BUILD_CRAMDISK
-		if (crd_load(in_fd, out_fd) == 0)
-			goto successful_load;
+#ifdef CONFIG_BLK_DEV_INITRD 
+static int __init initrd_mount(void);	// in initrd.c
 #else
-		printk(KERN_NOTICE
-		       "RAMDISK: Kernel does not support compressed "
-		       "RAM disk images\n");
+static int __init initrd_mount(void) { return 0;}
 #endif
-		goto done;
-	}
-
-	/*
-	 * NOTE NOTE: nblocks suppose that the blocksize is BLOCK_SIZE, so
-	 * rd_load_image will work only with filesystem BLOCK_SIZE wide!
-	 * So make sure to use 1k blocksize while generating ext2fs
-	 * ramdisk-images.
-	 */
-	if (sys_ioctl(out_fd, BLKGETSIZE, (unsigned long)&rd_blocks) < 0)
-		rd_blocks = 0;
-	else
-		rd_blocks >>= 1;
-
-	if (nblocks > rd_blocks) {
-		printk("RAMDISK: image too big! (%d/%d blocks)\n",
-		       nblocks, rd_blocks);
-		goto done;
-	}
-		
-	/*
-	 * OK, time to copy in the data
-	 */
-	buf = kmalloc(BLOCK_SIZE, GFP_KERNEL);
-	if (buf == 0) {
-		printk(KERN_ERR "RAMDISK: could not allocate buffer\n");
-		goto done;
-	}
-
-	if (sys_ioctl(in_fd, BLKGETSIZE, (unsigned long)&devblocks) < 0)
-		devblocks = 0;
-	else
-		devblocks >>= 1;
-
-	if (strcmp(from, "/dev/initrd") == 0)
-		devblocks = nblocks;
-
-	if (devblocks == 0) {
-		printk(KERN_ERR "RAMDISK: could not determine device size\n");
-		goto done;
-	}
-
-	printk(KERN_NOTICE "RAMDISK: Loading %d blocks [%d disk%s] into ram disk... ", 
-		nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : "");
-	for (i=0; i < nblocks; i++) {
-		if (i && (i % devblocks == 0)) {
-			printk("done disk #%d.\n", i/devblocks);
-			rotate = 0;
-			if (close(in_fd)) {
-				printk("Error closing the disk.\n");
-				goto noclose_input;
-			}
-			change_floppy("disk #%d", i/devblocks+1);
-			in_fd = open(from, O_RDONLY, 0);
-			if (in_fd < 0)  {
-				printk("Error opening disk.\n");
-				goto noclose_input;
-			}
-			printk("Loading disk #%d... ", i/devblocks+1);
-		}
-		read(in_fd, buf, BLOCK_SIZE);
-		write(out_fd, buf, BLOCK_SIZE);
-#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
-		if (!(i % 16)) {
-			printk("%c\b", rotator[rotate & 0x3]);
-			rotate++;
-		}
-#endif
-	}
-	printk("done.\n");
-	kfree(buf);
-
-successful_load:
-	res = 1;
-done:
-	close(in_fd);
-noclose_input:
-	close(out_fd);
-out:
-	sys_unlink("/dev/ram");
-#endif
-	return res;
-}
-
-static int __init rd_load_disk(int n)
-{
-#ifdef CONFIG_BLK_DEV_RAM
-	if (rd_prompt)
-		change_floppy("root floppy disk to be loaded into RAM disk");
-	create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, n), NULL);
-#endif
-	return rd_load_image("/dev/root");
-}
 
 static void __init mount_root(void)
 {
-#ifdef CONFIG_ROOT_NFS
-	if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
-		if (mount_nfs_root()) {
-			sys_chdir("/root");
-			ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
-			printk("VFS: Mounted root (nfs filesystem).\n");
-			return;
-		}
-		printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n");
-		ROOT_DEV = Root_FD0;
-	}
-#endif
 	create_dev("/dev/root", ROOT_DEV, root_device_name);
-#ifdef CONFIG_BLK_DEV_FD
-	if (MAJOR(ROOT_DEV) == FLOPPY_MAJOR) {
-		/* rd_doload is 2 for a dual initrd/ramload setup */
-		if (rd_doload==2) {
-			if (rd_load_disk(1)) {
-				ROOT_DEV = Root_RAM1;
-				create_dev("/dev/root", ROOT_DEV, NULL);
-			}
-		} else
-			change_floppy("root floppy");
-	}
-#endif
-	mount_block_root("/dev/root", root_mountflags);
-}
-
-#ifdef CONFIG_BLK_DEV_INITRD
-static int old_fd, root_fd;
-static int do_linuxrc(void * shell)
-{
-	static char *argv[] = { "linuxrc", NULL, };
-	extern char * envp_init[];
-
-	close(old_fd);close(root_fd);
-	close(0);close(1);close(2);
-	setsid();
-	(void) open("/dev/console",O_RDWR,0);
-	(void) dup(0);
-	(void) dup(0);
-	return execve(shell, argv, envp_init);
-}
-
-#endif
-
-static void __init handle_initrd(void)
-{
-#ifdef CONFIG_BLK_DEV_INITRD
-	int error;
-	int i, pid;
-
-	create_dev("/dev/root.old", Root_RAM0, NULL);
-	/* mount initrd on rootfs' /root */
-	mount_block_root("/dev/root.old", root_mountflags & ~MS_RDONLY);
-	sys_mkdir("/old", 0700);
-	root_fd = open("/", 0, 0);
-	old_fd = open("/old", 0, 0);
-	/* move initrd over / and chdir/chroot in initrd root */
-	sys_chdir("/root");
-	sys_mount(".", "/", NULL, MS_MOVE, NULL);
-	sys_chroot(".");
-	mount_devfs_fs ();
 
-	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
-	if (pid > 0) {
-		while (pid != waitpid(-1, &i, 0))
-			yield();
-	}
-
-	/* move initrd to rootfs' /old */
-	sys_fchdir(old_fd);
-	sys_mount("/", ".", NULL, MS_MOVE, NULL);
-	/* switch root and cwd back to / of rootfs */
-	sys_fchdir(root_fd);
-	sys_chroot(".");
-	close(old_fd);
-	close(root_fd);
-	sys_umount("/old/dev", 0);
-
-	if (real_root_dev == Root_RAM0) {
-		sys_chdir("/old");
-		return;
-	}
-
-	ROOT_DEV = real_root_dev;
-	mount_root();
-
-	printk(KERN_NOTICE "Trying to move old root to /initrd ... ");
-	error = sys_mount("/old", "/root/initrd", NULL, MS_MOVE, NULL);
-	if (!error)
-		printk("okay\n");
-	else {
-		int fd = open("/dev/root.old", O_RDWR, 0);
-		printk("failed\n");
-		printk(KERN_NOTICE "Unmounting old root\n");
-		sys_umount("/old", MNT_DETACH);
-		printk(KERN_NOTICE "Trying to free ramdisk memory ... ");
-		if (fd < 0) {
-			error = fd;
-		} else {
-			error = sys_ioctl(fd, BLKFLSBUF, 0);
-			close(fd);
-		}
-		printk(!error ? "okay\n" : "failed\n");
-	}
-#endif
-}
-
-static int __init initrd_load(void)
-{
-#ifdef CONFIG_BLK_DEV_INITRD
-	create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, 0), NULL);
-	create_dev("/dev/initrd", MKDEV(RAMDISK_MAJOR, INITRD_MINOR), NULL);
-#endif
-	return rd_load_image("/dev/initrd");
+	if (mount_nfs_root())	return;
+	
+	if (initrd_mount())	return;
+	
+	mount_block_root("/dev/root", root_mountflags);
 }
 
 /*
@@ -735,7 +391,6 @@
  */
 void prepare_namespace(void)
 {
-	int is_floppy = MAJOR(ROOT_DEV) == FLOPPY_MAJOR;
 	if (saved_root_name[0]) {
 		char *p = saved_root_name;
 		ROOT_DEV = name_to_dev_t(p);
@@ -743,11 +398,6 @@
 			p += 5;
 		strcpy(root_device_name, p);
 	}
-#ifdef CONFIG_BLK_DEV_INITRD
-	if (!initrd_start)
-		mount_initrd = 0;
-	real_root_dev = ROOT_DEV;
-#endif
 	sys_mkdir("/dev", 0700);
 	sys_mkdir("/root", 0700);
 	sys_mknod("/dev/console", S_IFCHR|0600, MKDEV(TTYAUX_MAJOR, 1));
@@ -755,22 +405,14 @@
 	sys_mount("devfs", "/dev", "devfs", 0, NULL);
 	do_devfs = 1;
 #endif
-
 	create_dev("/dev/root", ROOT_DEV, NULL);
 
 	/* This has to be before mounting root, because even readonly mount of reiserfs would replay
 	   log corrupting stuff */
 	software_resume();
 
-	if (mount_initrd) {
-		if (initrd_load() && ROOT_DEV != Root_RAM0) {
-			handle_initrd();
-			goto out;
-		}
-	} else if (is_floppy && rd_doload && rd_load_disk(0))
-		ROOT_DEV = Root_RAM0;
 	mount_root();
-out:
+
 	sys_umount("/dev", 0);
 	sys_mount(".", "/", NULL, MS_MOVE, NULL);
 	sys_chroot(".");
@@ -778,149 +420,6 @@
 	mount_devfs_fs ();
 }
 
-#if defined(BUILD_CRAMDISK) && defined(CONFIG_BLK_DEV_RAM)
-
-/*
- * gzip declarations
- */
-
-#define OF(args)  args
-
-#ifndef memzero
-#define memzero(s, n)     memset ((s), 0, (n))
+#ifdef CONFIG_BLK_DEV_INITRD 
+#include "initrd.c"
 #endif
-
-typedef unsigned char  uch;
-typedef unsigned short ush;
-typedef unsigned long  ulg;
-
-#define INBUFSIZ 4096
-#define WSIZE 0x8000    /* window size--must be a power of two, and */
-			/*  at least 32K for zip's deflate method */
-
-static uch *inbuf;
-static uch *window;
-
-static unsigned insize;  /* valid bytes in inbuf */
-static unsigned inptr;   /* index of next byte to be processed in inbuf */
-static unsigned outcnt;  /* bytes in output buffer */
-static int exit_code;
-static long bytes_out;
-static int crd_infd, crd_outfd;
-
-#define get_byte()  (inptr < insize ? inbuf[inptr++] : fill_inbuf())
-		
-/* Diagnostic functions (stubbed out) */
-#define Assert(cond,msg)
-#define Trace(x)
-#define Tracev(x)
-#define Tracevv(x)
-#define Tracec(c,x)
-#define Tracecv(c,x)
-
-#define STATIC static
-
-static int  fill_inbuf(void);
-static void flush_window(void);
-static void *malloc(int size);
-static void free(void *where);
-static void error(char *m);
-static void gzip_mark(void **);
-static void gzip_release(void **);
-
-#include "../lib/inflate.c"
-
-static void __init *malloc(int size)
-{
-	return kmalloc(size, GFP_KERNEL);
-}
-
-static void __init free(void *where)
-{
-	kfree(where);
-}
-
-static void __init gzip_mark(void **ptr)
-{
-}
-
-static void __init gzip_release(void **ptr)
-{
-}
-
-
-/* ===========================================================================
- * Fill the input buffer. This is called only when the buffer is empty
- * and at least one byte is really needed.
- */
-static int __init fill_inbuf(void)
-{
-	if (exit_code) return -1;
-	
-	insize = read(crd_infd, inbuf, INBUFSIZ);
-	if (insize == 0) return -1;
-
-	inptr = 1;
-
-	return inbuf[0];
-}
-
-/* ===========================================================================
- * Write the output window window[0..outcnt-1] and update crc and bytes_out.
- * (Used for the decompressed data only.)
- */
-static void __init flush_window(void)
-{
-    ulg c = crc;         /* temporary variable */
-    unsigned n;
-    uch *in, ch;
-    
-    write(crd_outfd, window, outcnt);
-    in = window;
-    for (n = 0; n < outcnt; n++) {
-	    ch = *in++;
-	    c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
-    }
-    crc = c;
-    bytes_out += (ulg)outcnt;
-    outcnt = 0;
-}
-
-static void __init error(char *x)
-{
-	printk(KERN_ERR "%s", x);
-	exit_code = 1;
-}
-
-static int __init crd_load(int in_fd, int out_fd)
-{
-	int result;
-
-	insize = 0;		/* valid bytes in inbuf */
-	inptr = 0;		/* index of next byte to be processed in inbuf */
-	outcnt = 0;		/* bytes in output buffer */
-	exit_code = 0;
-	bytes_out = 0;
-	crc = (ulg)0xffffffffL; /* shift register contents */
-
-	crd_infd = in_fd;
-	crd_outfd = out_fd;
-	inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
-	if (inbuf == 0) {
-		printk(KERN_ERR "RAMDISK: Couldn't allocate gzip buffer\n");
-		return -1;
-	}
-	window = kmalloc(WSIZE, GFP_KERNEL);
-	if (window == 0) {
-		printk(KERN_ERR "RAMDISK: Couldn't allocate gzip window\n");
-		kfree(inbuf);
-		return -1;
-	}
-	makecrc();
-	result = gunzip();
-	kfree(inbuf);
-	kfree(window);
-	return result;
-}
-
-#endif  /* BUILD_CRAMDISK && CONFIG_BLK_DEV_RAM */
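
Note on the CRC step in flush_window(), which this patch moves from
init/do_mounts.c to init/initrd.c: each decompressed byte is folded into
a running CRC-32 with a table lookup. Below is a minimal userspace sketch
of that update, assuming the standard reflected polynomial 0xEDB88320 used
by gzip. The names crc_table_init() and crc_update() are hypothetical and
exist only for this illustration; the kernel code uses makecrc() and
crc_32_tab from lib/inflate.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lookup table for the byte-at-a-time CRC-32 update (hypothetical name). */
static uint32_t crc_table[256];

/* Build the table for the reflected gzip polynomial 0xEDB88320. */
static void crc_table_init(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;
		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
		crc_table[i] = c;
	}
}

/*
 * Fold n bytes into the running CRC, the same per-byte step that
 * flush_window() applies to the output window. The caller starts
 * with 0xffffffff and inverts the final value, as gzip does.
 */
static uint32_t crc_update(uint32_t crc, const unsigned char *p, size_t n)
{
	assert(p != NULL || n == 0);
	while (n--)
		crc = crc_table[(crc ^ *p++) & 0xff] ^ (crc >> 8);
	return crc;
}
```

Because the update is incremental, feeding the data in several chunks
(one per flushed window) gives the same result as a single pass.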
diff -uNr linux-2.5.45-virgin/init/initrd.c linux-2.5.45-initrd_dyn/init/initrd.c
--- linux-2.5.45-virgin/init/initrd.c	1969-12-31 19:00:00.000000000 -0500
+++ linux-2.5.45-initrd_dyn/init/initrd.c	2002-11-01 04:17:08.000000000 -0500
@@ -0,0 +1,890 @@
+/*
+ * Copyright 2002 Dave Cinege <dcinege@psychosis.com>
+ * GPL2 - Copyright notice may not be altered.
+ * initrd rewrite and untar to tmpfs additions
+ *
+ */
+ 
+/* 
+DOCS:
+bootloader	initrd=root.img
+cmdline		root=/dev/ram0
+
+Load raw image to /dev/ram0. Mount /dev/ram0 as primary root.
+Try to execute /linuxrc. If /linuxrc changes the root device
+pivot to that device and remount /dev/ram0 to /initrd.
+(DOING THIS IS DEPRECATED! Set root= to the final root,
+unless you want /dev/ram0 as your final root device.)
+
+bootloader	initrd=root.img
+cmdline		root=/dev/hda1
+Load raw image to /dev/ram0. Mount /dev/ram0 as primary root.
+Try to execute /linuxrc. After executing /linuxrc pivot to root= device
+and remount initrd to /initrd.
+
+cmdline		root=/dev/hda1 initrd_from_floppy=1
+Attempt to load raw image to /dev/ram0, from /dev/fd0.
+Mount /dev/ram0 as primary root. Try to execute /linuxrc.
+After executing /linuxrc pivot to root= device and remount /dev/ram0
+to /initrd. If the bootloader has already loaded an initrd image
+it is deallocated.
+
+
+bootloader	initrd=root.tgz,etc.tgz,myhostconifg.tgz
+cmdline		root=/dev/tmpfs initrd_tmpfs_fssize=10MB
+Mount /dev/tmpfs as the primary root. If initrd_tmpfs_fssize= is
+specified, the root will have a ceiling of that size.
+If initrd_tmpfs_fssize=0, there will be no ceiling limit.
+If initrd_tmpfs_fssize= is not specified, the root will have a
+ceiling equal to ramdisk_size.
+Extract tar.gz archives sequentially to the root.
+Try to execute /linuxrc. If /linuxrc changes the root device
+pivot to that device and remount /dev/tmpfs to /initrd.
+(DOING THIS IS DEPRECATED! Set root= to the final root,
+unless you want /dev/tmpfs as your final root device.)
+
+NOTE: Your bootloader must support loading multiple files
+sequentially into the initrd memory space. If your boot loader
+is outdated you can create a multi-archive file like this:
+  cat root.tgz etc.tgz myhostconifg.tgz > root_tgzs.img
+And load this single file.
+
+
+bootloader	initrd=root.tar
+cmdline		root=/dev/hda1
+Mount /dev/tmpfs as the primary root. 
+Since initrd_tmpfs_fssize= is not specified, the root will have a
+ceiling equal to ramdisk_size.
+Extract tar archive to the root.
+Try to execute /linuxrc. After executing /linuxrc pivot to root= device
+and remount initrd to /initrd.
+
+*/
+#include <linux/file.h>
+
+
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP
+static int __init gunzip_load(int in_fd, int out_fd, int size);
+#endif
+#ifdef CONFIG_BLK_DEV_INITRD_UNTAR
+static int __init initrd_untar(int in_fd);
+#endif
+
+extern asmlinkage long sys_access(const char * filename, int mode);
+
+unsigned int real_root_dev;	/* do_proc_dointvec cannot handle kdev_t */
+
+static int __initdata mount_initrd = 1;
+static int __initdata initrd_from_floppy = 0;
+static unsigned long __initdata initrd_fssize = 0;
+#ifdef CONFIG_BLK_DEV_INITRD_UNTAR
+static int __initdata initrd_tmpfs_fssize_speced = 0;
+#endif
+
+
+enum image_type {
+	UNKNOWN		= 0,
+	GZIPPED		= 2<<0,	
+	TAR		= 2<<1,
+	IMAGE		= 2<<2,
+	ROMFS		= 2<<3,
+	EXT2		= 2<<4,
+	MINIX		= 2<<5
+};
+
+static int __init no_initrd(char *str)
+{
+	mount_initrd = 0;
+	return 1;
+}
+__setup("noinitrd", no_initrd);
+
+
+#ifdef CONFIG_BLK_DEV_FD
+static int __init prompt_ramdisk(char *str)
+{
+	rd_prompt = simple_strtol(str,NULL,0) & 1;
+	return 1;
+}
+__setup("prompt_ramdisk=", prompt_ramdisk);
+
+static int __init load_floppy(char *str)
+{
+	initrd_from_floppy = 1;
+	return 1;
+}
+__setup("initrd_from_floppy",load_floppy);
+
+static int __init ramdisk_start_setup(char *str)
+{
+	rd_image_start = simple_strtol(str,NULL,0);
+	return 1;
+}
+__setup("ramdisk_start=", ramdisk_start_setup);
+
+static void __init change_floppy(char *fmt, ...)
+{
+	struct termios termios;
+	char buf[80];
+	char c;
+	int fd;
+	va_list args;
+	va_start(args, fmt);
+	vsprintf(buf, fmt, args);
+	va_end(args);
+	fd = open("/dev/fd0", O_RDWR | O_NDELAY, 0);
+	if (fd >= 0) {
+		sys_ioctl(fd, FDEJECT, 0);
+		close(fd);
+	}
+	printk(KERN_NOTICE "VFS: Insert %s and press ENTER\n", buf);
+	fd = open("/dev/console", O_RDWR, 0);
+	if (fd >= 0) {
+		sys_ioctl(fd, TCGETS, (long)&termios);
+		termios.c_lflag &= ~ICANON;
+		sys_ioctl(fd, TCSETSF, (long)&termios);
+		read(fd, &c, 1);
+		termios.c_lflag |= ICANON;
+		sys_ioctl(fd, TCSETSF, (long)&termios);
+		close(fd);
+	}
+}
+#endif
+
+/*
+ * This routine tries to identify the initrd image. It returns a bitmask
+ * of enum image_type flags (UNKNOWN if no tar or filesystem magic was
+ * found, -1 if the probe buffer could not be allocated). *nblocks
+ * receives the block count from a filesystem superblock, 0 for gzip or
+ * tar data, and stays -1 if nothing was recognized.
+ *
+ * We currently check for the following magic numbers:
+ *	tar
+ * 	minix
+ * 	ext2
+ *	romfs
+ * 	gzip
+ */
+static int __init initrd_identify(int in_fd, int start_block, int *nblocks)
+{
+	const int size = 512;
+	struct minix_super_block *minixsb;
+	struct ext2_super_block *ext2sb;
+	struct romfs_super_block *romfsb;
+
+	unsigned char *buf;
+	int fd, buf_fd = -1;
+	int initrd_type = 0;
+	
+	*nblocks = -1;
+
+	buf = kmalloc(size, GFP_KERNEL);
+	if (buf == 0)
+		return -1;
+	memset(buf, 0xe5, size);
+
+	/*
+	 * Read block 0 to test for gzipped kernel
+	 */
+	fd = in_fd;
+	lseek(fd, start_block * BLOCK_SIZE, 0);
+	read(fd, buf, size);
+
+	/*
+	 * If it matches the gzip magic, gunzip some bytes to compare
+	 */
+	if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236))) {
+		*nblocks = 0;
+		initrd_type = GZIPPED;
+#ifndef CONFIG_BLK_DEV_INITRD_GUNZIP
+		goto done;
+#else
+		memset(buf, 0xe5, size);
+		lseek(in_fd, start_block * BLOCK_SIZE, 0);
+
+		create_dev("/dev/ram1", MKDEV(RAMDISK_MAJOR, 1), NULL);
+		buf_fd = open("/dev/ram1", O_RDWR, 0);
+		if (buf_fd < 0)
+			goto done;
+		gunzip_load(in_fd, buf_fd, ((start_block+1) * BLOCK_SIZE) + size);
+		lseek(buf_fd, 0, 0);
+		read(buf_fd, buf, size);
+		fd = buf_fd;
+#endif
+	}
+
+	/* tar archive */	
+	if (strncmp(&buf[257],"ustar",5) == 0) {
+		*nblocks = 0;
+		initrd_type |= TAR;
+		goto done;
+	}
+
+	romfsb = (struct romfs_super_block *) buf;
+	minixsb = (struct minix_super_block *) buf;
+	ext2sb = (struct ext2_super_block *) buf;
+
+	/* romfs is at block zero too */
+	if (romfsb->word0 == ROMSB_WORD0 &&
+	    romfsb->word1 == ROMSB_WORD1) {
+		*nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS;
+		initrd_type |=  IMAGE | ROMFS;
+		goto done;
+	}
+
+	/*
+	 * Read block 1 to test for minix and ext2 superblock
+	 */
+	lseek(fd, (start_block+1) * BLOCK_SIZE, 0);
+	read(fd, buf, size);
+
+	/* Try minix */
+	if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
+	    minixsb->s_magic == MINIX_SUPER_MAGIC2) {
+		*nblocks = minixsb->s_nzones << minixsb->s_log_zone_size;
+		initrd_type |=  IMAGE | MINIX;
+		goto done;
+	}
+
+	/* Try ext2 */
+	if (ext2sb->s_magic == cpu_to_le16(EXT2_SUPER_MAGIC)) {
+		*nblocks = le32_to_cpu(ext2sb->s_blocks_count);
+		initrd_type |=  IMAGE | EXT2;
+		goto done;
+	}
+
+	/* It's.....Unknown! */
+	initrd_type = UNKNOWN;
+
+done:
+	lseek(fd, start_block * BLOCK_SIZE, 0);
+	kfree(buf);
+
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP
+	if (buf_fd > -1) {
+		close(buf_fd);
+		sys_unlink("/dev/ram1");
+	}
+#endif
+	
+	return initrd_type;
+}
+
+
+static int __init rd_load_image(int fd, int nblocks)
+{
+	int res = 0;
+
+	int in_fd, out_fd;
+	int rd_blocks, devblocks, i;
+	char *buf;
+	unsigned short rotate = 0;
+#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
+	char rotator[4] = { '|' , '/' , '-' , '\\' };
+#endif
+	in_fd = fd;
+
+	out_fd = open("/dev/ram0", O_RDWR, 0);
+	if (out_fd < 0)
+		goto out;
+		
+	if (nblocks < 0)
+		goto done;
+		
+	if (nblocks == 0) {
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP
+		if (gunzip_load(in_fd, out_fd, 0) == 0)
+			goto successful_load;
+#endif
+		goto done;
+	}
+
+	/*
+	 * NOTE: nblocks assumes that the block size is BLOCK_SIZE, so
+	 * rd_load_image will only work with filesystems whose block size
+	 * is BLOCK_SIZE. Make sure to use a 1k block size when generating
+	 * ext2fs ramdisk images.
+	 */
+	if (sys_ioctl(out_fd, BLKGETSIZE, (unsigned long)&rd_blocks) < 0)
+		rd_blocks = 0;
+	else
+		rd_blocks >>= 1;
+
+	if (nblocks > rd_blocks) {
+		printk("INITRD: image too big! (%d/%d blocks)\n",
+		       nblocks, rd_blocks);
+		goto done;
+	}
+		
+	/*
+	 * OK, time to copy in the data
+	 */
+	buf = kmalloc(BLOCK_SIZE, GFP_KERNEL);
+	if (buf == 0) {
+		printk(KERN_ERR "INITRD: could not allocate buffer\n");
+		goto done;
+	}
+
+	if (sys_ioctl(in_fd, BLKGETSIZE, (unsigned long)&devblocks) < 0)
+		devblocks = 0;
+	else
+		devblocks >>= 1;
+
+	// dc: FIX ME
+	if (initrd_start) //if (strcmp(from, "/dev/initrd") == 0)
+		devblocks = nblocks;
+
+	if (devblocks == 0) {
+		printk(KERN_ERR "INITRD: could not determine device size\n");
+		goto done;
+	}
+
+	printk(KERN_NOTICE "INITRD: Loading %d blocks [%d disk%s] into ram disk... ", 
+		nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : "");
+	for (i=0; i < nblocks; i++) {
+		if (i && (i % devblocks == 0)) {
+			printk("done disk #%d.\n", i/devblocks);
+			rotate = 0;
+			if (close(in_fd)) {
+				printk("Error closing the disk.\n");
+				goto noclose_input;
+			}
+			change_floppy("disk #%d", i/devblocks+1);
+			in_fd = open("/dev/fd0", O_RDONLY, 0);
+			if (in_fd < 0)  {
+				printk("Error opening disk.\n");
+				goto noclose_input;
+			}
+			printk("Loading disk #%d... ", i/devblocks+1);
+		}
+		read(in_fd, buf, BLOCK_SIZE);
+		write(out_fd, buf, BLOCK_SIZE);
+#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
+		if (!(i % 16)) {
+			printk("%c\b", rotator[rotate & 0x3]);
+			rotate++;
+		}
+#endif
+	}
+	printk("done.\n");
+	kfree(buf);
+
+successful_load:
+	res = 1;
+done:
+	close(in_fd);
+noclose_input:
+	close(out_fd);
+out:
+	return res;
+}
+
+
+static void __init initrd_free_memory(char *dev)
+{
+	int fd = open(dev, O_RDWR, 0);
+	if (fd >= 0) {
+		sys_ioctl(fd, BLKFLSBUF, 0);
+		close(fd);
+	}
+}
+
+#ifdef CONFIG_BLK_DEV_INITRD_UNTAR
+static int __init initrd_tmpfs_setup(char *str)
+{
+	initrd_fssize = simple_strtoul(str,NULL,10);
+	initrd_tmpfs_fssize_speced = 1;
+	tmpfs_root_fssize = initrd_fssize;
+	return 1;
+}
+__setup("initrd_tmpfs_fssize=", initrd_tmpfs_setup);
+#endif
+
+static int initrd_fd, root_fd;
+static int do_linuxrc(void * shell)
+{
+	static char *argv[] = { "linuxrc", NULL, };
+	extern char * envp_init[];
+
+	close(initrd_fd);close(root_fd);
+	close(0);close(1);close(2);
+	setsid();
+	(void) open("/dev/console",O_RDWR,0);
+	(void) dup(0);
+	(void) dup(0);
+	return execve(shell, argv, envp_init);
+}
+
+/* Deprecated. Purge at will. */
+static void __init initrd_pivot(void)
+{
+	int error;
+	int i, pid;
+
+	/* If there's no executable /linuxrc to change the ROOT_DEV, it can't pivot. */
+	if (ROOT_DEV == real_root_dev && sys_access("./linuxrc",1) != 0)
+		return;
+
+	create_dev("/dev/root_initrd", ROOT_DEV, NULL);
+
+	/* make location to pivot to. */
+	sys_mkdir("/initrd", 0700);
+	root_fd = open("/", 0, 0);
+	initrd_fd = open("/initrd", 0, 0);
+		
+	/* move initrd over / and chdir/chroot in initrd root */
+	sys_chdir("/root");
+	sys_mount(".", "/", NULL, MS_MOVE, NULL);
+	sys_chroot(".");
+	mount_devfs_fs ();
+
+	printk(KERN_NOTICE "INITRD: Executing /linuxrc.\n");
+	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
+	if (pid > 0) {
+		while (pid != waitpid(-1, &i, 0))
+			yield();
+	}
+	printk(KERN_NOTICE "INITRD: Exiting /linuxrc.\n");
+
+	/* linuxrc changed the root. move initrd to rootfs' /initrd */
+	if (ROOT_DEV != real_root_dev) {
+		sys_fchdir(initrd_fd);
+		sys_mount("/", ".", NULL, MS_MOVE, NULL);
+	}
+	
+	/* switch root and cwd back to / of rootfs */
+	sys_fchdir(root_fd);
+	sys_chroot(".");
+	close(initrd_fd);
+	close(root_fd);
+
+	/* linuxrc did not change the root. Move it back. Can't pivot */
+	if (ROOT_DEV == real_root_dev) {
+		sys_umount("/root/dev", 0);
+		create_dev("/dev/root", ROOT_DEV, NULL);
+		sys_chdir("/root");
+		return;
+	}
+	
+	sys_umount("/initrd/dev", 0);
+	
+	ROOT_DEV = real_root_dev;
+	create_dev("/dev/root", ROOT_DEV, NULL);
+	// dc: FIX ME do this or make it safe to reenter mount_root()?
+	// linuxrc pivoting *IS* deprecated after all...
+	mount_block_root("/dev/root", root_mountflags);
+
+	printk(KERN_NOTICE "INITRD: Moving initrd root to /initrd...");
+	error = sys_mount("/initrd", "/root/initrd", NULL, MS_MOVE, NULL);
+	if (!error) {
+		printk("done.\n");
+	} else {
+		printk("failed!\n");
+		sys_umount("/initrd", MNT_DETACH);
+		initrd_free_memory("/dev/initrd");
+	}
+}
+
+static int __init initrd_open(void)
+{
+	create_dev("/dev/ram0", MKDEV(RAMDISK_MAJOR, 0), NULL);
+	create_dev("/dev/initrd", MKDEV(RAMDISK_MAJOR, INITRD_MINOR), NULL);
+
+#ifdef CONFIG_BLK_DEV_FD
+	/* Try loading initrd from floppy. Wipe initrd image if present in memory. */
+	if (initrd_from_floppy) {
+		initrd_free_memory("/dev/initrd");
+		create_dev("/dev/fd0", MKDEV(FLOPPY_MAJOR, 0), NULL);
+		if (rd_prompt)
+			change_floppy("root floppy disk to be loaded into RAM disk");
+		return open("/dev/fd0", O_RDONLY, 0);
+	}
+#endif
+	return open("/dev/initrd", O_RDONLY, 0);
+}
+
+
+static void __init initrd_print_type(int initrd_type, int nblocks)
+{
+	printk(KERN_NOTICE "INITRD: ");
+	if (initrd_type & GZIPPED)		printk("Compressed ");
+	
+	if (initrd_type & TAR)			printk("Tar archive found at block %d.\n",rd_image_start);
+		
+	if (initrd_type & ROMFS)		printk("RomFS");
+	if (initrd_type & EXT2)			printk("EXT2");
+	if (initrd_type & MINIX)		printk("Minix");
+
+	if (initrd_type & IMAGE)		printk(" filesystem found at block %d. %d blocks.\n",rd_image_start, nblocks);
+}
+
+static int __init initrd_mount(void)
+{
+	extern int rd_size;
+	int initrd_type, nblocks;
+	int in_fd;
+
+	/* There is no image in memory AND we shouldn't try to load one from floppy. */
+	if (initrd_start == 0 && initrd_from_floppy == 0)
+		return 0;
+
+	/* Don't mount_initrd (noinitrd option)*/		
+	if (mount_initrd == 0)
+		return 0;	
+
+	real_root_dev = ROOT_DEV;
+
+	/* Open initrd */
+	in_fd = initrd_open();
+	if (in_fd < 0) {
+		printk(KERN_NOTICE "INITRD: Could not open initrd. (This is an error if you loaded one...)\n");
+		return 0;
+	}
+	
+	initrd_type = initrd_identify(in_fd, rd_image_start, &nblocks);
+
+	if (initrd_type <= UNKNOWN) {	/* 0, or -1 if the probe failed */
+		printk(KERN_ERR "INITRD: Could not find a valid image or archive starting at %d!\n",rd_image_start);
+		return 0;
+	}
+#ifndef CONFIG_BLK_DEV_INITRD_GUNZIP
+	if (initrd_type & GZIPPED) {
+		printk(KERN_ERR "INITRD: Kernel does not support compressed images!\n");
+		return 0;
+	}
+#endif
+#ifndef CONFIG_BLK_DEV_INITRD_UNTAR
+	if (initrd_type & TAR) {
+		printk(KERN_ERR "INITRD: Kernel does not support Tar archives!\n");
+		return 0;
+	}
+#endif
+	if ((initrd_type & IMAGE) && (nblocks > rd_size)) {
+		printk(KERN_ERR "INITRD: Image will not fit on ramdisk! (%d vs. %d)\n",nblocks,rd_size);
+		return 0;
+	}
+	
+	initrd_print_type(initrd_type, nblocks);
+
+	/* Mount initrd on rootfs' /root, chdir there, and extract archive if it's tmpfs */
+	if (initrd_type & TAR) {
+#ifndef CONFIG_BLK_DEV_INITRD_UNTAR
+		mount_tmpfs_root("/dev/root");
+	}
+#else
+		if (initrd_tmpfs_fssize_speced == 0)
+			initrd_fssize = rd_size;
+		ROOT_DEV = Root_TMPFS;
+		create_dev("/dev/root", ROOT_DEV, NULL);
+		mount_tmpfs_root("/dev/root");
+		initrd_untar(in_fd);	// dc: FIX ME chroot first. Get rid of mount_tmpfs_root().
+		close(in_fd);
+	}
+#endif	
+	if (initrd_type & IMAGE) {
+		initrd_fssize = rd_size;
+		if(rd_load_image(in_fd, nblocks) == 0)
+			return 0;
+		ROOT_DEV = Root_RAM0;
+		create_dev("/dev/root", ROOT_DEV, NULL);
+		mount_block_root("/dev/root", root_mountflags & ~MS_RDONLY);
+	}
+	
+	/* Make failsafe /dev and /dev/console on initrd.*/
+	sys_mkdir("./dev", 0700);
+	sys_mknod("./dev/console", S_IFCHR|0600, MKDEV(TTYAUX_MAJOR, 1));
+
+	/* Deprecated. Purge at will. */
+	initrd_pivot();
+	
+	return -1;
+}	
+
+
+#if defined(CONFIG_BLK_DEV_INITRD_GUNZIP) || defined(CONFIG_BLK_DEV_INITRD_UNTAR)
+
+/*
+ * gzip and untar declarations
+ */
+#define OF(args)  args
+
+#ifndef memzero
+#define memzero(s, n)     memset ((s), 0, (n))
+#endif
+
+typedef unsigned char  uch;
+typedef unsigned short ush;
+typedef unsigned long  ulg;
+
+#define INBUFSIZ 4096
+#define WSIZE 0x8000    /* window size--must be a power of two, and */
+			/*  at least 32K for zip's deflate method */
+
+static uch *inbuf;
+static unsigned insizeT; /* total bytes read from crd_infd so far */
+static unsigned insize;  /* valid bytes in inbuf */
+static unsigned inptr;   /* index of next byte to be processed in inbuf */
+static int exit_code;
+static long bytes_out;
+static long bytes_limit; /* if > 0, write no more than this many bytes to out_fd */
+static int crd_infd;
+
+
+/* ===========================================================================
+ * Fill the input buffer. This is called only when the buffer is empty
+ * and at least one byte is really needed.
+ */
+static int __init fill_inbuf(void)
+{
+	if ((bytes_out >= bytes_limit) && (bytes_limit > 0))
+		exit_code = 1;
+	
+	insize = read(crd_infd, inbuf, INBUFSIZ);
+	if (insize == 0) return -1;
+
+	insizeT += insize;
+
+	inptr = 1;
+
+	return inbuf[0];
+}
+#endif	//CONFIG_BLK_DEV_INITRD_GUNZIP || CONFIG_BLK_DEV_INITRD_UNTAR
+
+
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP
+
+static uch *window;
+static unsigned outcnt;  /* bytes in output buffer */
+static int crd_outfd;
+
+#define get_byte()  (inptr < insize ? inbuf[inptr++] : fill_inbuf())
+#define EXIT_CODE exit_code
+		
+/* Diagnostic functions (stubbed out) */
+#define Assert(cond,msg)
+#define Trace(x)
+#define Tracev(x)
+#define Tracevv(x)
+#define Tracec(c,x)
+#define Tracecv(c,x)
+
+#define STATIC static
+
+//static int  fill_inbuf(void);
+static void flush_window(void);
+static void *malloc(int size);
+static void free(void *where);
+static void error(char *m);
+static void gzip_mark(void **);
+static void gzip_release(void **);
+
+#include "../lib/inflate.c"
+
+static void __init *malloc(int size)
+{
+	return kmalloc(size, GFP_KERNEL);
+}
+
+static void __init free(void *where)
+{
+	kfree(where);
+}
+
+static void __init gzip_mark(void **ptr)
+{
+}
+
+static void __init gzip_release(void **ptr)
+{
+}
+
+
+/* ===========================================================================
+ * Write the output window window[0..outcnt-1] and update crc and bytes_out.
+ * (Used for the decompressed data only.)
+ */
+static void __init flush_window(void)
+{
+	ulg c = crc;		 /* temporary variable */
+	unsigned n;
+	uch *in, ch;
+
+	if ((((bytes_out + (ulg)outcnt) > bytes_limit) && (bytes_limit > 0)))
+		outcnt = bytes_limit - bytes_out;	// dc: FIX ME? We get lazy here and ignore the crc
+
+	// dc: FIX ME? Check for return -1, IE out of space
+	write(crd_outfd, window, outcnt);
+
+	in = window;
+	for (n = 0; n < outcnt; n++) {
+		ch = *in++;
+		c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
+	}
+	crc = c;
+	bytes_out += (ulg)outcnt;
+	outcnt = 0;
+}
+
+static void __init error(char *x)
+{
+	printk(KERN_ERR "%s", x);
+	exit_code = 1;
+}
+
+static int __init gunzip_load(int in_fd, int out_fd, int size)
+{
+	int result;
+
+	insize = 0;		/* valid bytes in inbuf */
+	inptr = 0;		/* index of next byte to be processed in inbuf */
+	outcnt = 0;		/* bytes in output buffer */
+	exit_code = 0;
+	bytes_out = 0;		
+	bytes_limit = size;     /* limit of bytes to write out. 0 == unlimited */
+
+	crd_infd = in_fd;
+	crd_outfd = out_fd;
+	inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
+	if (inbuf == 0) {
+		printk(KERN_ERR "INITRD: Couldn't allocate gzip buffer\n");
+		return -1;
+	}
+	window = kmalloc(WSIZE, GFP_KERNEL);
+	if (window == 0) {
+		printk(KERN_ERR "INITRD: Couldn't allocate gzip window\n");
+		kfree(inbuf);
+		return -1;
+	}
+	makecrc();
+	result = gunzip();
+	lseek(in_fd, -1 * (insize - inptr), 1);	// Place in_fd.fpos where gunzip ended, not window.
+	
+	kfree(inbuf);
+	kfree(window);
+	return result;
+}
+#endif	//CONFIG_BLK_DEV_INITRD_GUNZIP
+
+
+#ifdef CONFIG_BLK_DEV_INITRD_UNTAR
+
+#include "../lib/untar.c"
+
+static struct untar_context *untarPtr;
+
+static int __init untar_load(int in_fd, int size)
+{
+	int result = 0;
+
+	insize = 0;		/* valid bytes in inbuf */
+	exit_code = 0;
+	bytes_out = 0;		
+	bytes_limit = size;     /* limit of bytes to write out. 0 == unlimited */
+
+	crd_infd = in_fd;
+	inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
+	if (inbuf == 0) {
+		printk(KERN_ERR "INITRD: Couldn't allocate untar buffer\n");
+		return -1;
+	}
+	while (fill_inbuf() >= 0) {
+		int count;
+		count = untar_write(untarPtr, inbuf, insize);
+		bytes_out += insize;
+		if (count > 0) {	// End of archive
+			count -= TARBLOCKSIZE;
+			lseek(in_fd, -1 * count, 1);
+			bytes_out -= count;
+			break;
+		} else if (count < 0) {
+			printk(KERN_ERR "INITRD: corrupt tar archive!\n");
+			result = -1;
+		}
+	}
+	kfree(inbuf);
+	return result;
+}
+
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP
+// 'wrapper' so we can access untar_write() like a file descriptor
+static ssize_t __init untar_write_fd(struct file *fp, const char *buf, size_t count, loff_t *fpos)
+{
+	return untar_write(untarPtr, (char*)buf, count);
+}
+#endif
+
+static int __init initrd_untar(int in_fd)
+{
+	int err, i;
+	int type; // 0 unknown, 1 gzip(tar), 2 tar
+	const int size = 512;
+	unsigned char *buf;
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP	
+	int untar_fd;
+	struct file *untar_filp;
+#endif
+	untarPtr = kmalloc(sizeof(struct untar_context), GFP_KERNEL);
+	if (untarPtr == 0) {
+		printk(KERN_ERR "INITRD: Couldn't allocate untar buffer\n");
+		return -1;
+	}
+
+	untar_init(untarPtr);
+
+	buf = kmalloc(size, GFP_KERNEL);
+	if (buf == 0)
+		return -1;
+	
+	lseek(in_fd, rd_image_start * BLOCK_SIZE, 0); // Sanity
+
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP
+	// Open a 'dummy' fd and filp. Replace f_op.write with untar_write_fd()
+	// It's a bit tidier than defecating in flush_window() with a global var
+	untar_fd = open("/dev/null", O_RDWR, 0);	// Opening against '/dev/null' should be safe, no?
+	untar_filp = fget(untar_fd);
+	untar_filp->f_op->write = untar_write_fd;
+	fput(untar_filp);
+#endif
+	// Now uncompress and/or untar the initrd.tgz/tar archive(s).
+	// Continue for as long as we find magic, for multiple archives.
+	for(i = 0;;i++) {
+		memset(buf, 0xe5, size);
+		read(in_fd, buf, size);
+		lseek(in_fd, -1 * size, 1);	
+		
+		// dc: FIX ME? Call initrd_identify() here in the loop? Overkill?
+		
+		type = 0;
+		if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236)))
+			type = 1;	// gzip
+		if (strncmp(&buf[257],"ustar",5) == 0)
+			type = 2;	// tar
+		if (type == 0)
+			break;		// unknown/EOF
+		
+		if (type == 1) {
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP		
+			printk(KERN_NOTICE "INITRD: Extracting TGZ archive[%d]: ",i);
+			err = gunzip_load(in_fd,untar_fd,0);
+			if(err == 0)
+				printk("done. [%lu bytes]\n", bytes_out);
+#else
+			printk(KERN_ERR "INITRD: Kernel does not support compressed Tar archives.\n");
+			return 0;
+#endif		       
+		}  else {
+			printk(KERN_NOTICE "INITRD: Extracting Tar archive[%d]: ",i);
+			err = untar_load(in_fd,0);
+			if(err == 0)
+				printk("done. [%lu bytes]\n", bytes_out);
+			
+		}
+	}
+#ifdef CONFIG_BLK_DEV_INITRD_GUNZIP	
+	close(untar_fd);
+#endif
+	kfree(untarPtr);
+	kfree(buf);
+	return 0;
+}
+#endif	//CONFIG_BLK_DEV_INITRD_UNTAR
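
The magic test in the initrd_untar() loop above can be tried out in
userspace. What follows is only a minimal sketch, not part of the patch:
probe_archive() and the ARCHIVE_* names are made up for illustration. It
checks the same bytes as the loop (037 0213 or 037 0236 for gzip, "ustar"
at offset 257 for tar) and, like the loop, lets a tar match win over a
gzip match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the patch itself uses the bare integers 0, 1, 2. */
enum archive_type { ARCHIVE_UNKNOWN = 0, ARCHIVE_GZIP = 1, ARCHIVE_TAR = 2 };

#define USTAR_OFFSET 257	/* offset of the magic field in a tar header */
#define USTAR_LEN    5		/* "ustar" without the trailing byte */

/* Classify the first block of an image by its magic bytes. */
static enum archive_type probe_archive(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);

	/* tar: compare only 5 bytes so POSIX ("ustar\0") and GNU ("ustar ") both match */
	if (len >= USTAR_OFFSET + USTAR_LEN &&
	    memcmp(buf + USTAR_OFFSET, "ustar", USTAR_LEN) == 0)
		return ARCHIVE_TAR;

	/* gzip: 0x1f then 0x8b (octal 037 0213), or 0x9e (octal 0236) for old gzip */
	if (len >= 2 && buf[0] == 0x1f && (buf[1] == 0x8b || buf[1] == 0x9e))
		return ARCHIVE_GZIP;

	return ARCHIVE_UNKNOWN;
}
```

A caller would read the first 512 bytes of the image into a buffer, pass
it to probe_archive(), and stop when it gets ARCHIVE_UNKNOWN, which is
how the loop above detects the end of a run of concatenated archives.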
diff -uNr linux-2.5.45-virgin/lib/inflate.c linux-2.5.45-initrd_dyn/lib/inflate.c
--- linux-2.5.45-virgin/lib/inflate.c	2002-10-19 00:01:53.000000000 -0400
+++ linux-2.5.45-initrd_dyn/lib/inflate.c	2002-11-01 02:48:50.000000000 -0500
@@ -161,6 +161,10 @@
 #define wp outcnt
 #define flush_output(w) (wp=(w),flush_window())
 
+#ifndef EXIT_CODE
+#define EXIT_CODE 0
+#endif
+
 /* Tables for deflate from PKZIP's appnote.txt. */
 static const unsigned border[] = {    /* Order of the bit length code lengths */
         16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15};
@@ -980,7 +984,7 @@
     gzip_release(&ptr);
     if (hufts > h)
       h = hufts;
-  } while (!e);
+  } while (!e && !EXIT_CODE);
 
   /* Undo too much lookahead. The next read will be byte aligned so we
    * can discard unused bits in the last meaningful byte.
@@ -1146,7 +1150,10 @@
 	    }
 	    return -1;
     }
-	    
+    
+    if (EXIT_CODE)
+        return 0;
+	
     /* Get the crc and original length */
     /* crc32  (see algorithm.doc)
      * uncompressed input size modulo 2^32
diff -uNr linux-2.5.45-virgin/lib/untar.c linux-2.5.45-initrd_dyn/lib/untar.c
--- linux-2.5.45-virgin/lib/untar.c	1969-12-31 19:00:00.000000000 -0500
+++ linux-2.5.45-initrd_dyn/lib/untar.c	2002-11-01 04:21:34.000000000 -0500
@@ -0,0 +1,306 @@
+/*
+ * untar.c
+ * 
+ * Copyright 1998-2002 Dave Cinege <dcinege@psychosis.com>
+ * GPL2 - Copyright notice may not be altered.
+ * 
+ * This is the untar portion of the original Initrd Dynamic feature
+ *	 Some original portions by Dimitri Maziuk (very first untar.c)
+ *	 and Robert Kaiser (gunzip buffer)
+ * 
+ * This version of untar is very limited. It expects a fairly good archive, 
+ * as well as expansion to a clean root.
+ * It does not:
+ * 
+ * 	Look for a leading '/'
+ * 	Check before making links
+ * 	Look at the header checksum
+ * 	Check or create any leading directory structure
+ * 
+ * 1998-01-10
+ * First version of Initrd Archive. Extract tar.gz's to a dynamically
+ * created minix FS on /dev/ram0.
+ * 
+ * 2000-05-27
+ * Major code reorganization and rename to Initrd Dynamic.
+ * Unified for 2.0/2.2/2.4 kernel versions up to 2.4.18
+ *
+ * 2001-01-27
+ * Support extracting multiple tgz's.
+ *
+ * 2001-08-18
+ * Added support for using tmpfs on 2.4 kernels.
+ *
+ * 2002-10-27
+ * Final kernel merge for adoption into 2.4.20/2.5.45.
+ * Initrd routines rewritten. Support for both un/compressed tar archives.
+ * Seamless operation with archive magic testing. Purged minix support.
+ * Many untar enhancements.
+ *
+ */
+
+#include <linux/unistd.h>
+#include <linux/stat.h>
+#include <linux/fcntl.h>
+#include <linux/stddef.h>  
+#include <linux/utime.h>
+
+//#define DEBUG_UNTAR
+
+// These must be already present:
+
+//extern sys_mknod();
+//extern sys_mkdir();
+//extern sys_chdir();
+//extern sys_symlink();
+//extern sys_access();
+
+extern asmlinkage long sys_write(unsigned int fd, const char * buf,unsigned int count);
+extern asmlinkage long sys_chmod(const char * filename, mode_t mode);
+extern asmlinkage long sys_chown(const char * filename, uid_t user, gid_t group);
+extern asmlinkage long sys_lchown(const char * filename, uid_t user, gid_t group);
+extern asmlinkage long sys_link(const char * oldname, const char * newname);
+extern asmlinkage long sys_utime(char * filename, struct utimbuf * times);
+extern asmlinkage long sys_time(int * tloc);
+
+#define	TARBLOCKSIZE	512
+
+#define NAME_FIELD_SIZE   100
+#define PREFIX_FIELD_SIZE 155
+#define UNAME_FIELD_SIZE   32
+#define GNAME_FIELD_SIZE   32
+
+#define TMAGIC   "ustar"	// 5 chars and a null
+#define TMAGLEN  6
+#define TVERSION " \0"		// GNU tar: space and a null (POSIX uses "00", no null)
+#define TVERSLEN 2
+
+	/* POSIX Tar header */
+struct TarFileHeader {		// byte offset
+	char name[NAME_FIELD_SIZE];	//   0
+	char mode[8];			// 100
+	char uid[8];			// 108
+	char gid[8];			// 116
+	char size[12];			// 124
+	char mtime[12];			// 136
+	char chksum[8];			// 148
+	char typeflag;			// 156
+	char linkname[NAME_FIELD_SIZE];	// 157
+	char magic[TMAGLEN];		// 257
+	char version[TVERSLEN];		// 263
+	char uname[UNAME_FIELD_SIZE];	// 265
+	char gname[GNAME_FIELD_SIZE];	// 297
+	char devmajor[8];		// 329
+	char devminor[8];		// 337
+	char prefix[PREFIX_FIELD_SIZE];	// 345
+};
+
+enum untar_type {
+	AREGTYPE = 0,		// regular file
+	REGTYPE	 = '0',		// regular file
+	LNKTYPE  = '1',		// hard link
+	SYMTYPE  = '2',		// symbolic link
+	CHRTYPE  = '3',		// character special
+	BLKTYPE  = '4',		// block special
+	DIRTYPE  = '5',		// directory
+	FIFOTYPE = '6',		// FIFO special
+	CONTTYPE = '7'		// reserved
+};
+
+enum untar_status {
+	READING_HEADER,
+	READING_DATA,
+	SKIPPING_REST
+};
+
+typedef union TarInfo {
+	char buf[TARBLOCKSIZE];
+	struct TarFileHeader header;
+} TarInfo;
+
+struct untar_context {
+	enum untar_status	state;
+	unsigned long		fsize_remaining;
+	int			out_fd;
+	int			rotate;
+	TarInfo			tarInfo;	
+} *Untar;
+
+
+/*
+ * simple_strtoul() does not strip leading spaces (0x20) from the input
+ * string. The 'real' strtoul does. This wrapper works around this.
+ * NOTE: This really should be fixed in: lib/vsprintf.c 
+ *
+ */
+#define strtoul simple_strtoul_wrapper
+static unsigned long  __init  simple_strtoul_wrapper( const char *cp, char **endp, unsigned int base)
+{
+	while (*cp == ' ')
+		cp++;
+	return ( simple_strtoul(cp,endp,base) );
+}
+
+/* 
+ * Initialize/Reset the context structure.
+ * The external parent is responsible for definition and kmalloc of *Untar
+ *
+ */
+static void  __init untar_init(struct untar_context *Untar)
+{
+	memset(Untar,0,sizeof(struct untar_context));
+	Untar->state  = READING_HEADER;
+	Untar->rotate = 0;
+}
+
+#define Type	Untar->tarInfo.header.typeflag
+#define Name	Untar->tarInfo.header.name
+#define LName	Untar->tarInfo.header.linkname
+#define Mode	strtoul(Untar->tarInfo.header.mode,NULL,8)
+#define Uid	strtoul(Untar->tarInfo.header.uid,NULL,8)
+#define Gid	strtoul(Untar->tarInfo.header.gid,NULL,8)
+#define Size	strtoul(Untar->tarInfo.header.size,NULL,8)
+#define Dev	MKDEV(strtoul(Untar->tarInfo.header.devmajor,NULL,8),strtoul(Untar->tarInfo.header.devminor,NULL,8))
+#define Magic	Untar->tarInfo.header.magic
+
+static int __init untar_setmodes(struct untar_context *Untar)
+{
+	int err;
+	struct	utimbuf ut;
+
+	if(Type == LNKTYPE || Type == SYMTYPE)
+		return sys_lchown(Name, Uid, Gid);	
+	
+	err = sys_chown(Name, Uid, Gid);
+	err |= sys_chmod(Name, Mode);
+	
+	ut.actime=sys_time(NULL);
+	ut.modtime=strtoul(Untar->tarInfo.header.mtime,NULL,8);
+	err |= sys_utime(Untar->tarInfo.header.name,&ut);
+
+	return err;
+}
+
+static int __init untar_create(struct untar_context *Untar)
+{
+	int err = 0;
+	switch(Type) {
+		case AREGTYPE : case REGTYPE  :
+			Untar->out_fd = sys_open(Name,O_CREAT|O_WRONLY|O_TRUNC,Mode);
+			if(Untar->out_fd < 0) {	// sys_open returns a negative errno on failure
+				err = 1;
+				break;
+			}
+			Untar->state = READING_DATA;
+			Untar->fsize_remaining = Size;
+			break;
+		case LNKTYPE  :
+			err = sys_link(LName, Name);
+			break;
+		case SYMTYPE  :
+			err = sys_symlink(LName, Name);
+			break;
+		case CHRTYPE  :	case BLKTYPE  :	case FIFOTYPE :	
+			err = sys_mknod(Name,Mode,Dev);
+   	    		break;
+		case DIRTYPE  :
+			// Skip if dirname is "" "." "./" ".." "../". Yes, Gnu tar can do stupid shit like this.
+			if(Name[0] != 0 && strcmp(Name,".") && strcmp(Name,"./") &&
+			   strcmp(Name,"..") && strcmp(Name,"../")) {
+			   	err = sys_mkdir(Name,Mode);
+			}
+			break;
+		default:	//  corrupt tar archive
+			Untar->state = SKIPPING_REST;
+			return -1;
+	}
+	if (err == 0)
+		return untar_setmodes(Untar);
+	
+	return err;
+}
+
+/*
+ * untar_write() can be re-entered as needed by the external parent function.
+ * We keep status of the current state and Tar header in our context
+ * structure, and return when we need more data to continue with extraction.
+ * The parent is responsible for preparing root, chdir(), and chroot().
+ *
+ */
+static ssize_t __init untar_write(struct untar_context *Untar, char *buf, size_t count)
+{
+	int	i, scoop, err;
+	char	*to;
+#ifndef DEBUG_UNTAR
+#ifndef CONFIG_ARCH_S390
+	char rotator[4] = { '|' , '/' , '-' , '\\' };
+
+	printk("%c\b", rotator[Untar->rotate & 0x3]);
+	Untar->rotate++;
+#endif
+#endif
+	while (count) {	
+		err = 0;	
+		switch (Untar->state) {
+			case READING_HEADER:
+				if (count < TARBLOCKSIZE) {
+					count = 0;
+					break;
+				}
+				to = (char*)&Untar->tarInfo;
+				for (i = 0; i < TARBLOCKSIZE; i++) {
+					*to++ = *buf++;
+					--count;
+				}
+				// Tar usually pads the output to a multiple of its block size,
+				// appending zeroes if necessary. Here we skip those zeroes:
+				if (Name[0] == 0 && Magic[0] == 0) {
+					if(count < TARBLOCKSIZE)
+						count = 0;
+					break;
+				}				
+				// Check magic to see if it's a valid header.
+				// If not assume overrun of EOF, and return 'count' that was valid.
+				// Parent can reset buf position with this offset.
+				if (strncmp(Magic,TMAGIC,TMAGLEN - 1) != 0) {
+					return count;
+				}
+
+				err = untar_create(Untar);
+	
+				break;
+			case READING_DATA:
+				scoop = 0;
+				while (count > 0 && Untar->fsize_remaining > 0) {
+					scoop = Untar->fsize_remaining > TARBLOCKSIZE ? TARBLOCKSIZE : Untar->fsize_remaining;
+					if(sys_write(Untar->out_fd, buf, scoop) < scoop)
+						err = 1;
+					count -= scoop;
+					buf   += scoop;
+					Untar->fsize_remaining -= scoop;
+				}
+				if (Untar->fsize_remaining == 0) {	//skip to the next tar block
+					sys_close(Untar->out_fd);
+					scoop = count % TARBLOCKSIZE;
+					buf   += scoop;
+					count -= scoop;
+					Untar->state = READING_HEADER;
+				}
+				break;
+
+			case SKIPPING_REST:
+				count = 0;
+				break;
+		}
+		if (err) {
+#ifndef DEBUG_UNTAR		
+			printk("!");
+#else
+			printk("\nerr=%d, Error making %s", err, Name);
+#endif
+		}
+
+	
+	}
+	return 0;
+}
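
The header handling in untar.c rests on two small operations: parsing the
space-padded octal fields (what the strtoul wrapper is for) and checking
the "ustar" magic. Below is a minimal userspace sketch of both, not taken
from the patch. tar_octal() and tar_has_ustar_magic() are hypothetical
names, and bounding the parse by the field length is an assumption added
here; the patch relies on the field's own terminator instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAR_MAGIC_OFFSET 257	/* magic[] in the POSIX header layout */
#define TAR_MAGIC_CMPLEN 5	/* compare "ustar" only, not the byte after it */

/*
 * Parse an octal header field of at most `len` bytes.
 * Leading spaces are skipped (some tar writers pad with spaces);
 * parsing stops at the first byte that is not an octal digit, which
 * covers both the space and the NUL terminators seen in practice.
 */
static unsigned long tar_octal(const char *field, size_t len)
{
	unsigned long value = 0;
	size_t i = 0;

	assert(field != NULL);
	while (i < len && field[i] == ' ')
		i++;
	for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
		value = value * 8 + (unsigned long)(field[i] - '0');
	return value;
}

/* Return nonzero if the 512-byte block carries a ustar magic. */
static int tar_has_ustar_magic(const char *block)
{
	assert(block != NULL);
	return memcmp(block + TAR_MAGIC_OFFSET, "ustar", TAR_MAGIC_CMPLEN) == 0;
}
```

For a header block `hdr`, tar_octal(hdr + 100, 8) would give the mode and
tar_octal(hdr + 124, 12) the file size, using the byte offsets listed in
struct TarFileHeader. Comparing five bytes rather than six accepts both
the POSIX magic ("ustar" plus a null) and the GNU one ("ustar" plus a
space).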


* identifying the idling kernel and kernel hacking.
  2002-11-02 20:37     ` an idling kernel Anu
  2002-11-02 22:16       ` Jos Hulzink
@ 2002-11-03  0:43       ` Anu
  2002-11-04 19:16       ` an idling kernel Werner Almesberger
  2 siblings, 0 replies; 18+ messages in thread
From: Anu @ 2002-11-03  0:43 UTC (permalink / raw)
  To: LKML

Hello,
         I am looking for a way of "automatically" figuring out when a
kernel might be idle -- more at the level of the kernel code itself. After
a day's reading I have the following pieces of information:

.a. There is something called the run_queue, which holds the list of
    processes that can be run.
.b. When nothing is running, we have the swapper process running (process
    0), which is the ancestor of all processes.
.c. The scheduler checks this run queue every so often, so we can figure
    out whether there are no runnable processes at any given time.
.d. I am now trying to modify the kernel to do something interesting when
    only the idle process is running. No idea what, though. Linux 2.4.9
    (which is the version I am looking at) has a bunch of gotos, and I
    think I have identified the section under still_running_back: as the
    place to detect when the run_queue is empty. I am thinking of putting
    in some printk()s to make the kernel put out a message that says
    something like "i am idling.." every time the kernel is idling, and
    running ps uax at the same time to show that the kernel is indeed
    idling.

Are there any obvious disasters that you chaps see? (I'm not an OS person.)

-a



********************************************************************************

			      Think, Train, Be

*******************************************************************************




* Re: an idling kernel
  2002-11-02 20:37     ` an idling kernel Anu
  2002-11-02 22:16       ` Jos Hulzink
  2002-11-03  0:43       ` identifying the idling kernel and kernel hacking Anu
@ 2002-11-04 19:16       ` Werner Almesberger
  2 siblings, 0 replies; 18+ messages in thread
From: Werner Almesberger @ 2002-11-04 19:16 UTC (permalink / raw)
  To: Anu; +Cc: LKML

Anu wrote:
> 	Im ready to be beaten up for asking this question ( I am not sure
> which group to post to -- all this is new to me) but, I was wondering how
> one could figure out if the kernel was in idle mode (or idling).

There's more to it than just processes: if your kernel has runnable
tasklets or pending interrupts, it is not truly idle, even though
there may be no runnable processes.

In umlsim, I have some heuristics that seem to catch most cases, but
may be a bit too paranoid. Look at timer.c:wait_kernel (called from
idle) in http://www.almesberger.net/umlsim/umlsim-4.tar.gz

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

