re: Announce: dumpfs v0.01 - common RAS output API

public inbox for linux-kernel@vger.kernel.org

* re: Announce: dumpfs v0.01 - common RAS output API
@ 2004-07-22 15:42 Dan Kegel
  0 siblings, 0 replies; 60+ messages in thread
From: Dan Kegel @ 2004-07-22 15:42 UTC (permalink / raw)
  To: Linux Kernel Mailing List, kaos

Keith Owens <kaos () sgi ! com> wrote:
> Announcing dumpfs - a common API for all the RAS code that wants to
> save data during a kernel failure and to extract that RAS data on the
> next boot.  The documentation file is appended to this mail.
 > ...

I looked, but couldn't see any definition for RAS in your doc.
Could you add one?
The fs/Kconfig hunk might be a nice place to define it, since
naive users might see that text when configuring kernels.

http://www.kernelnewbies.org/glossary/#R does define it,
but it's so far down on
http://www.google.com/search?q=define%3Aras
that most people configuring a kernel might not be familiar with that sense.
- Dan

-- 
My technical stuff: http://kegel.com
My politics: see http://www.misleader.org for examples of why I'm for regime change


* Announce: dumpfs v0.01 - common RAS output API
@ 2004-07-22 16:19 Keith Owens
  2004-07-26  6:57 ` Andrew Morton
  0 siblings, 1 reply; 60+ messages in thread
From: Keith Owens @ 2004-07-22 16:19 UTC (permalink / raw)
  To: linux-kernel

Announcing dumpfs - a common API for all the RAS code that wants to
save data during a kernel failure and to extract that RAS data on the
next boot.  The documentation file is appended to this mail.

ftp://oss.sgi.com/projects/kdb/download/dumpfs - current version is
v0.01, patch against 2.6.8-rc2.

This is a work in progress, the code is not complete and is subject to
change without notice.

dumpfs-v0.01 handles mounting the dumpfs partitions, including reliable
sharing with swap partitions and clearing the dumpfs partitions.  I am
working on the code that reads and writes dumpfs data from kernel
space; it is incomplete and has not been tested yet.  After
dumpfs_kernel is working, dumpfs_user is trivial.  The code is a proof
of concept: some sections of the API (including polled I/O and data
compression) are not supported yet, and some of the code is ugly.

Why announce incomplete and untested code?  Mainly because RAS and
kernel dumping are being discussed at OLS this week.  Since I cannot be
at OLS, this is the next best thing.  Also the dumpfs API has
stabilized for the first cut, so it is time to get more discussion on
the API and to determine if it is worth continuing with the dumpfs
approach.  If dumpfs is discussed at OLS then I would appreciate any
feedback.

Questions for the other people who care about RAS (which rules out most
of the kernel developers) -

* Is using a common dump API the right thing to do?

  Obviously I think that this makes sense.  At the moment every bit of
  RAS code has its own dedicated I/O mechanism, not to mention its own
  user space tools to interface with the kernel, and to initialize,
  extract and clear its own data.

  dumpfs consolidates a lot of common code that is scattered over
  several RAS tools.  dumpfs removes the need for special RAS tools to
  extract dump data on reboot, instead standard user space commands
  will do the job.

* Is overloading mount the best approach?

  Making mount dumpfs share the partition with swap is ugly.  OTOH most
  of the existing code that dumpfs is intended to replace makes no
  attempt to verify its partition usage.  At least dumpfs tries to
  verify its partition data, ugly though the code is.

* Does the dumpfs API need to be extended or even replaced, either in
  kernel or in user space?

  One obvious extension is to make compression selective, so that some
  sections of the file can be compressed and others be in clear text.
  The lcrash header springs to mind.  Omitted for now since this
  version does not support compression yet.

* How do we get a clean API to do polling mode I/O to disk?

  One thing that is absolutely required for reliable RAS output is a
  polling mode method.  netdump is available for the network, we need
  the equivalent for disk I/O.  What is the best way to integrate
  polling mode I/O into the block device subsystem?

If the people who care about RAS think that a common RAS output API is
worthwhile then I will continue working on dumpfs.  Otherwise it will
be just another idea that did not get taken up, and each RAS tool will
continue to be developed and maintained in isolation.

==== 2.6.8-rc2/Documentation/filesystems/dumpfs.txt ====

dumpfs provides a common API for RAS components that need to dump kernel data
during a problem.  The dumped data is expected to be copied and cleared on the
next successful boot.

dumpfs consists of two layers, with completely different semantics.  These are
dumpfs (kernel only) and dumpfs_user (user space view of any saved dump data).

dumpfs uses one mount for each dump partition.  Each dumpfs partition can be
mounted with option share or noshare; the default is noshare.  The only
allowable user space operations on a dumpfs partition are mount and umount; user
space cannot directly access the dumpfs data.  Each dumpfs partition is mounted
with "mount -t dumpfs /dev/partition /mnt/dumpfs".  /mnt/dumpfs must be a
directory; it never contains anything useful but the mount semantics require a
directory here.

A shared dumpfs partition will normally coexist with a swap partition; the
dumpfs superblock is stored at an offset which leaves the swap signature alone.
A shared dump partition has no superblock on disk until the first dump file is
created.  Mounting a dumpfs partition with "-o clear" will completely zero the
dumpfs superblock, including the magic field.  This ensures that old dumpfs data
in a shared partition will not be used; its contents are unreliable because of
the data sharing.

When mounting a shared dumpfs partition, no check is made to see if the disk
contains a dumpfs superblock.  Mounting a dumpfs partition with -o share will
only share with a swap partition; it will not share with any other mounted
partition.

A non-shared dumpfs partition must have a superblock before being mounted.
mkfs.dumpfs and fsck.dumpfs (only used for non-shared partitions) are trivial.
Mounting dumpfs with "-o noshare,clear" will clear the metadata in the dumpfs
superblock, but preserve the magic field.

mkfs.dumpfs

#!/bin/sh
dd if=/dev/zero of="$1" bs=64k count=1
echo 'dum0' | dd of="$1" bs=64k seek=1 conv=sync

fsck.dumpfs

#!/bin/sh
true
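
The mkfs.dumpfs script above writes the string 'dum0' at the start of the
second 64 KiB block.  As a rough user-space sketch of what a check for that
magic could look like, assuming (from the script alone, not from the dumpfs
patch) that the magic sits at byte offset 65536; the names DUMPFS_SB_OFFSET,
DUMPFS_MAGIC and dumpfs_has_magic are made up for this illustration:

```c
#include <stdio.h>
#include <string.h>

/* Assumption: offset and magic are inferred from "bs=64k seek=1" and
 * "echo 'dum0'" in the mkfs.dumpfs script, not from the kernel code. */
#define DUMPFS_SB_OFFSET (64L * 1024)
#define DUMPFS_MAGIC     "dum0"

/* Hypothetical helper: returns 1 if the magic is present, 0 if other
 * data is found there, and -1 on a seek or read error (for example a
 * file that is too short). */
int dumpfs_has_magic(FILE *fp)
{
	char magic[4];

	if (fseek(fp, DUMPFS_SB_OFFSET, SEEK_SET) != 0)
		return -1;
	if (fread(magic, 1, sizeof(magic), fp) != sizeof(magic))
		return -1;
	return memcmp(magic, DUMPFS_MAGIC, sizeof(magic)) == 0;
}
```

A 64 KiB offset stays clear of the first page on common page sizes, which
would be consistent with the statement that the superblock leaves the swap
signature alone, but that link is a guess.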

Each dumpfs partition can be mounted with option poll or nopoll; the default is
poll.  Poll uses low level polled mode I/O direct to the partition, completely
bypassing the normal interrupt driven code.  This is done in an attempt to get
the data out to disk even when the kernel is so badly broken that interrupts are
not working.  Poll requires that the device driver for the dumpfs partition
supports polling mode I/O.  Nopoll uses the standard kernel I/O mechanisms, so
it is not guaranteed to work when the kernel is crashing.  Nopoll should only be
used when your device driver does not support polling mode I/O yet; you must
accept that dumpfs may hang waiting for the I/O to be serviced.

Another option when mounting a dumpfs partition is to specify the size of its
data buffer, in kibibytes.  This buffer is permanently allocated as long as the
dumpfs partition is mounted but is only used when writing RAS data via dumpfs.
The buffer size will be rounded up to a multiple of the kernel page size.  The
default is buffer=128.
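
The rounding rule for buffer= can be illustrated with a small sketch.  The
function name and signature are hypothetical and not taken from the dumpfs
patch; it only assumes that the option is given in KiB and that the page size
is a power of two:

```c
#include <stddef.h>

/* Hypothetical helper: convert a buffer= value in KiB to bytes and
 * round it up to a whole number of pages.  page_size must be a power
 * of two, which holds for the usual kernel page sizes. */
size_t dumpfs_buffer_bytes(size_t buffer_kib, size_t page_size)
{
	size_t bytes = buffer_kib * 1024;

	return (bytes + page_size - 1) & ~(page_size - 1);
}
```

With 4 KiB pages the default buffer=128 is already 32 whole pages (131072
bytes), while buffer=130 would be rounded up to 33 pages (135168 bytes).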

The user space view of the RAS data held in the dumpfs partitions is created by
"mount -t dumpfs_user none /mnt/dumpfs".  It logically merges and validates all
the dumpfs partitions that have been mounted and provides a user space view of
the files that have been written to dumpfs.  The only user space operations
supported on dumpfs_user are llseek, read, readdir, open (read only), close and
unlink.  Just enough to copy the files out of dumpfs_user and remove them.  User
space cannot write to dumpfs_user.

The kernel can write to files held in dumpfs partitions, to save RAS data over a
reboot.  Note that when kernel RAS components write to dumpfs they do _not_ use
the normal VFS layer, which may not be working during a failure.  Instead a RAS
component makes direct calls to the following dumpfs_kernel functions.

dumpfs_kernel_open("prefix", flags)

  Create and open for writing a file in dumpfs.  It returns a file descriptor
  within dumpfs.

  The dumpfs filename is constructed from "prefix-" followed by the value of
  xtime in the format CCYY-MM-DD-hh:mm:ss.n, where n starts at 0 and is
  incremented for each dumpfs file in the current boot.

  There is no requirement that a dumpfs_user mount point exist before the kernel
  can dump its data.  The first call to dumpfs_kernel_open will automatically
  create a kernel view that merges all the mounted dumpfs partitions.  The first
  call to dumpfs_kernel_open also writes the dumpfs superblocks to any shared
  partitions.

  Flags select compression, if any.

  dumpfs_kernel_open() is the simple interface.  It automatically stripes the
  data across all dumpfs partitions that are not currently being used.

  Most RAS code will open one dump file at a time, mainly because most users
  will only have one dumpfs partition.  The dumpfs code has a module_parm called
  dumpfs_max_open, with a default value of 1.
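
The file name format described for dumpfs_kernel_open() can be sketched in
user space as follows.  This is only an illustration: the function name is
made up, time_t stands in for the kernel's xtime, and formatting in UTC is an
assumption since the text does not say which time zone applies.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: build "prefix-CCYY-MM-DD-hh:mm:ss.n" in buf.
 * seq is the per-boot counter that starts at 0.  Returns what
 * snprintf() returns, or -1 if the time cannot be converted. */
int dumpfs_make_name(char *buf, size_t len, const char *prefix,
		     time_t now, unsigned int seq)
{
	struct tm *tm = gmtime(&now);	/* assumption: UTC */

	if (tm == NULL)
		return -1;
	return snprintf(buf, len, "%s-%04d-%02d-%02d-%02d:%02d:%02d.%u",
			prefix, tm->tm_year + 1900, tm->tm_mon + 1,
			tm->tm_mday, tm->tm_hour, tm->tm_min, tm->tm_sec,
			seq);
}
```

For example, the prefix "oops" at 2004-07-22 16:19:00 UTC with counter 0
would give "oops-2004-07-22-16:19:00.0".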

dumpfs_kernel_bdev_list()
dumpfs_kernel_open_choose("prefix", flags, bdev_list)

  Some platforms may need to have multiple output streams open in parallel.  For
  example a system with large amounts of memory and multiple disks may wish to
  assign different sections of memory to each cpu and to write to separate
  partitions.

  dumpfs_kernel_bdev_list() returns the list of usable dumpfs partitions.  If
  all partitions are in use then the list is empty.

  dumpfs_kernel_open_choose() opens a file using only the selected bdev entries.

  Systems that use concurrent parallel dumps should set module_parm
  dumpfs_max_open to a suitable value.

  Note: The following problems are inherently architecture and platform specific
  and are outside the scope of dumpfs.  That is not to say that we should not
  have an API for handling these problems on large systems, but it would be a
  separate API from dumpfs.

    Deciding which cpus to use for parallel dumping.
    Deciding which block devices each cpu should use.
    Getting the chosen cpus into the RAS code.
    Assigning the range of work to each cpu and each partition.
    Watching the dumping cpus for problems, recovering from those problems
      and reassigning the work to another cpu.
    Reconstructing the parallel dumps into a format for analysis.  dumpfs_user
      makes each dump file available to user space, but some code may be
      required to merge the separate files together.

dumpfs_kernel_close(fd)

  Sync the file's data to disk, close the file and update the dumpfs metadata.

dumpfs_kernel_write(fd, buffer, length)

  Write the buffer at the current dumpfs file location.  The data may or may not
  be written to disk immediately.  It returns the current location, including
  the data that was just written.

  For performance, the dumpfs data is striped over all the assigned partitions,
  in round robin.  The stripe unit is the minimum of the buffer= value across
  all the assigned partitions.
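
The striping rule can be sketched as two small functions.  The names are
hypothetical and the mapping from file offset to partition is one plausible
reading of "round robin"; the actual dumpfs code may lay the data out
differently.

```c
#include <stddef.h>

/* Stripe unit in KiB: the smallest buffer= value among the assigned
 * partitions.  n must be at least 1. */
size_t stripe_unit_kib(const size_t *buffer_kib, size_t n)
{
	size_t min = buffer_kib[0];
	size_t i;

	for (i = 1; i < n; i++)
		if (buffer_kib[i] < min)
			min = buffer_kib[i];
	return min;
}

/* Round robin: index of the partition that holds the byte at file
 * offset off, for n partitions and a stripe unit of unit_bytes. */
size_t stripe_partition(unsigned long long off, size_t unit_bytes, size_t n)
{
	return (size_t)((off / unit_bytes) % n);
}
```

With three partitions mounted with buffer=128, buffer=256 and buffer=64, the
stripe unit would be 64 KiB, so bytes 0..65535 go to the first partition, the
next 64 KiB to the second, and so on, wrapping around after the third.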

dumpfs_kernel_read(fd, buffer, length)

  Read the buffer from the current dumpfs file location.  It returns the current
  location, including the data that was just read.

dumpfs_kernel_llseek(fd, position)

  Set the current dumpfs file location.  It returns the previous location.  Only
  absolute seeking is supported.

dumpfs_kernel_sync(fd)

  Sync the file's data to disk and update the dumpfs metadata.

dumpfs_kernel_dirty_shared()

  Returns true if any shared partitions have been dirtied, in which case the
  kernel must be rebooted after all the RAS components have completed their
  work.

dumpfs_kernel_all_polled()

  Returns true if all dumpfs partitions can support polling mode I/O.  Otherwise
  the RAS code that calls dumpfs should enable interrupts, if at all possible.
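
The return value conventions of write, read and llseek are easy to mix up, so
here is a user-space mock that mimics them on an in-memory buffer.  Everything
in it is an assumption made for illustration: the mock_ names, the argument
types, the fixed capacity and the use of -1 for errors are not part of the
documented API, which gives no types or error values.

```c
#include <stddef.h>
#include <string.h>

#define MOCK_CAPACITY 4096

/* Stand-in for one open dumpfs file; zero-initialise before use. */
struct mock_dump {
	unsigned char data[MOCK_CAPACITY];
	size_t pos;		/* current file location */
};

/* Like dumpfs_kernel_write: store at the current location and return
 * the new location, including the data just written. */
long mock_write(struct mock_dump *d, const void *buf, size_t len)
{
	if (len > MOCK_CAPACITY - d->pos)
		return -1;
	memcpy(d->data + d->pos, buf, len);
	d->pos += len;
	return (long)d->pos;
}

/* Like dumpfs_kernel_read: copy from the current location and return
 * the new location, including the data just read. */
long mock_read(struct mock_dump *d, void *buf, size_t len)
{
	if (len > MOCK_CAPACITY - d->pos)
		return -1;
	memcpy(buf, d->data + d->pos, len);
	d->pos += len;
	return (long)d->pos;
}

/* Like dumpfs_kernel_llseek: absolute seek only, returns the previous
 * location rather than the new one. */
long mock_llseek(struct mock_dump *d, size_t position)
{
	size_t prev = d->pos;

	if (position > MOCK_CAPACITY)
		return -1;
	d->pos = position;
	return (long)prev;
}
```

Note the asymmetry: after writing 4 bytes to an empty file, a seek back to 0
returns 4 (the old location), and the following 4-byte read returns 4 again
(the new location).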

Sample /etc/fstab entries for dumpfs partitions.

  /dev/sda2  /mnt/dumpfs  dumpfs  defaults  0 0
  /dev/sdb2  /mnt/dumpfs  dumpfs  share     0 0
  /dev/sdc7  /mnt/dumpfs  dumpfs  nopoll    0 0

Sample code in /etc/rc.sysinit to save dump data from the previous boot.  If you
are sharing dumpfs with swap, these commands must be executed before mounting
swap.  Note that dumpfs does not require any special user space tools to poke
inside partitions to see if there is any useful data to save, everything is a
file.

  # mount all the dumpfs partitions
  mount -a -t dumpfs
  # merge all dumpfs into dumpfs_user on /mnt/dump
  mount -t dumpfs_user none /mnt/dump
  # copy the data out
  (cd /mnt/dump && find . -type f | while read -r f; do echo "saving $f"; mv "$f" /var/log/dump; done)
  # drop dumpfs_user
  umount /mnt/dump
  # clear all the dumpfs metadata
  umount -a -t dumpfs
  mount -a -t dumpfs -o clear
  umount -a -t dumpfs

rc.sysinit will later mount the swap partitions, then mount all the other
partition types.  That will remount the dumpfs partitions, ready for the next
kernel crash.


* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-22 16:19 Announce: dumpfs v0.01 - common RAS output API Keith Owens
@ 2004-07-26  6:57 ` Andrew Morton
  2004-07-28  1:53   ` Eric W. Biederman
  0 siblings, 1 reply; 60+ messages in thread
From: Andrew Morton @ 2004-07-26  6:57 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

Keith Owens <kaos@sgi.com> wrote:
>
>  * How do we get a clean API to do polling mode I/O to disk?

We hope to not have to.  The current plan is to use kexec: at boot time, do
a kexec preload of a small (16MB) kernel image.  When the main kernel
crashes or panics, jump to the kexec kernel.  The kexec kernel will hold a
new device driver for /dev/hmem through which applications running under
the kexec'ed kernel can access the crashed kernel's memory.

Write the contents of /dev/hmem to stable storage using whatever device
drivers are in the kexeced kernel, then reboot into a real kernel again.

That's all pretty simple to do, and the quality of the platform's crash
dump feature will depend only upon the quality of the platform's kexec
support.

People have bits and pieces of this already - I'd hope to see candidate
patches within a few weeks.  The main participants are rddunlap, suparna
and mbligh.


* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-26  6:57 ` Andrew Morton
@ 2004-07-28  1:53   ` Eric W. Biederman
  2004-07-28 10:54     ` Suparna Bhattacharya
  2004-07-28 16:03     ` Jesse Barnes
  0 siblings, 2 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28  1:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Keith Owens, linux-kernel, Suparna Bhattacharya, Martin J. Bligh,
	fastboot

Andrew Morton <akpm@osdl.org> writes:

> Keith Owens <kaos@sgi.com> wrote:
> >
> >  * How do we get a clean API to do polling mode I/O to disk?
> 
> We hope to not have to.  The current plan is to use kexec: at boot time, do
> a kexec preload of a small (16MB) kernel image.  When the main kernel
> crashes or panics, jump to the kexec kernel.  The kexec kernel will hold a
> new device driver for /dev/hmem through which applications running under
> the kexec'ed kernel can access the crashed kernel's memory.

Hmm.  I think this will require one of the kernels to run at a
non-default address in physical memory.

> Write the contents of /dev/hmem to stable storage using whatever device
> drivers are in the kexeced kernel, then reboot into a real kernel
> again.

And at this point I don't quite see why you would need /dev/hmem,
as opposed to just using /dev/mem.

Or will the crashing kernel save and compress the core dump to
somewhere in ram and the dump kernel read it out from there? 

> That's all pretty simple to do, and the quality of the platform's crash
> dump feature will depend only upon the quality of the platform's kexec
> support.

Which will largely depend on the quality of its device drivers...

> People have bits and pieces of this already - I'd hope to see candidate
> patches within a few weeks.  The main participants are rddunlap, suparna
> and mbligh.

I'm sorry I missed you then.  Unfortunately this is my busiest season at work
so I wasn't able to make it to OLS this year :(

Does anyone have a proof of concept implementation?  I have been able to find
a little bit of time for this kind of thing lately and have just done
the x86-64 port.  (You can all give me a hard time about taking a year
to get back to it :)  I am in the process of breaking everything up
into their individual change patches and doing a code review so I feel
comfortable with sending the code to Andrew.  So this would be a very
good time for me to look at any code for reporting a crash dump with
a kernel started with kexec.

Eric


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 10:54     ` Suparna Bhattacharya
@ 2004-07-28 10:46       ` Alan Cox
  2004-07-28 14:38         ` Martin J. Bligh
  2004-07-28 15:42         ` Eric W. Biederman
  2004-07-28 15:12       ` Eric W. Biederman
  1 sibling, 2 replies; 60+ messages in thread
From: Alan Cox @ 2004-07-28 10:46 UTC (permalink / raw)
  To: suparna
  Cc: Eric W. Biederman, Andrew Morton, fastboot,
	Linux Kernel Mailing List, Martin J. Bligh

On Mer, 2004-07-28 at 11:54, Suparna Bhattacharya wrote:
> On Tue, Jul 27, 2004 at 07:53:01PM -0600, Eric W. Biederman wrote:
> > Andrew Morton <akpm@osdl.org> writes:
> > 
> > > Keith Owens <kaos@sgi.com> wrote:
> > > >
> > > >  * How do we get a clean API to do polling mode I/O to disk?
> > > 
> > > We hope to not have to.  The current plan is to use kexec: at boot time, do

If you are prepared to say "dump device is IDE" the driver ends up
pretty tiny - maybe 4K for PIO mode.



* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28  1:53   ` Eric W. Biederman
@ 2004-07-28 10:54     ` Suparna Bhattacharya
  2004-07-28 10:46       ` [Fastboot] " Alan Cox
  2004-07-28 15:12       ` Eric W. Biederman
  2004-07-28 16:03     ` Jesse Barnes
  1 sibling, 2 replies; 60+ messages in thread
From: Suparna Bhattacharya @ 2004-07-28 10:54 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Martin J. Bligh,
	fastboot

On Tue, Jul 27, 2004 at 07:53:01PM -0600, Eric W. Biederman wrote:
> Andrew Morton <akpm@osdl.org> writes:
> 
> > Keith Owens <kaos@sgi.com> wrote:
> > >
> > >  * How do we get a clean API to do polling mode I/O to disk?
> > 
> > We hope to not have to.  The current plan is to use kexec: at boot time, do
> > a kexec preload of a small (16MB) kernel image.  When the main kernel
> > crashes or panics, jump to the kexec kernel.  The kexec kernel will hold a
> > new device driver for /dev/hmem through which applications running under
> > the kexec'ed kernel can access the crashed kernel's memory.
> 
> Hmm.  I think this will require one of the kernels to run at a
> non-default address in physical memory.
> 
> > Write the contents of /dev/hmem to stable storage using whatever device
> > drivers are in the kexeced kernel, then reboot into a real kernel
> > again.
> 
> And at this point I don't quite see why you would need /dev/hmem,
> as opposed to just using /dev/mem.

This differs a little from your earlier suggestion of requiring
a kernel to run from a non-default address. Martin suggested simply
reserving about 16MB of area in advance, so that just before kexecing
the new kernel with mem=16M, we save the first 16MB away into the
reserved space. /dev/hmem (oldmem ?) is a view into the old kernel's
memory, as opposed to /dev/mem.
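
A minimal sketch of the address translation this scheme implies for an
oldmem-style view.  The names BACKUP_SIZE, oldmem_source and backup_base are
hypothetical, and the sketch assumes the saved copy is one contiguous region;
it is not taken from any posted patch.

```c
#include <stdint.h>

/* The "about 16MB" that is copied aside before the kexec. */
#define BACKUP_SIZE (16ULL << 20)

/* Hypothetical helper: given a physical address as the crashed kernel
 * saw it, return the physical address the dump kernel has to read. */
uint64_t oldmem_source(uint64_t old_paddr, uint64_t backup_base)
{
	if (old_paddr < BACKUP_SIZE)
		return backup_base + old_paddr;	/* saved copy */
	return old_paddr;	/* never touched by the dump kernel */
}
```

The sketch ignores the reserved area itself, which held no data belonging to
the crashed kernel.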

> 
> Or will the crashing kernel save and compress the core dump to
> somewhere in ram and the dump kernel read it out from there? 
> 
> > That's all pretty simple to do, and the quality of the platform's crash
> > dump feature will depend only upon the quality of the platform's kexec
> > support.
> 
> Which will largely depend on the quality of its device drivers...
>  
> > People have bits and pieces of this already - I'd hope to see candidate
> > patches within a few weeks.  The main participants are rddunlap, suparna
> > and mbligh.
> 
> I'm sorry I missed you then.  Unfortunately this is my busiest season at work
> so I wasn't able to make it to OLS this year :(
> 
> Does anyone have a proof of concept implementation?  I have been able to find

Yes, Hari has a nice POC implementation - it might make sense for him to post
it right away for you to take a look. Basically, in addition to hmem (oldmem),
the upcoming kernel exports an ELF core view of the saved register and memory 
state of the previous kernel as /proc/vmcore.prev (remember your suggestion 
of using an ELF core file format for dump ?), so one can use cp or scp to 
save the core dump to disk. He has a quick demo, where he uses gdb (unmodified) 
to open the dump and show a stack trace of the dumping cpu.

Regards
Suparna

> a little bit of time for this kind of thing lately and have just done
> the x86-64 port.  (You can all give me a hard time about taking a year
> to get back to it :)  I am in the process of breaking everything up
> into their individual change patches and doing a code review so I feel
> comfortable with sending the code to Andrew.  So this would be a very
> good time for me to look at any code for reporting a crash dump with
> a kernel started with kexec.
> 
> Eric

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India



* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 14:38         ` Martin J. Bligh
@ 2004-07-28 14:06           ` Alan Cox
  2004-07-28 15:21             ` Martin J. Bligh
  2004-07-28 16:05             ` Eric W. Biederman
  0 siblings, 2 replies; 60+ messages in thread
From: Alan Cox @ 2004-07-28 14:06 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: suparna, Eric W. Biederman, Andrew Morton, fastboot,
	Linux Kernel Mailing List

On Mer, 2004-07-28 at 15:38, Martin J. Bligh wrote:
> After kexec, we shouldn't need such things, do we? Before it, Linus won't 
> take the patch, as he said he doesn't like systems in unstable states doing
> crashdumps to disk ...

And what does kexec do.. it accesses the disk. A SHA signed standalone
dumper is as safe as anything else if not safer.

Alan



* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 10:46       ` [Fastboot] " Alan Cox
@ 2004-07-28 14:38         ` Martin J. Bligh
  2004-07-28 14:06           ` Alan Cox
  2004-07-28 15:42         ` Eric W. Biederman
  1 sibling, 1 reply; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-28 14:38 UTC (permalink / raw)
  To: Alan Cox, suparna
  Cc: Eric W. Biederman, Andrew Morton, fastboot,
	Linux Kernel Mailing List

--Alan Cox <alan@lxorguk.ukuu.org.uk> wrote (on Wednesday, July 28, 2004 11:46:06 +0100):

> On Mer, 2004-07-28 at 11:54, Suparna Bhattacharya wrote:
>> On Tue, Jul 27, 2004 at 07:53:01PM -0600, Eric W. Biederman wrote:
>> > Andrew Morton <akpm@osdl.org> writes:
>> > 
>> > > Keith Owens <kaos@sgi.com> wrote:
>> > > > 
>> > > >  * How do we get a clean API to do polling mode I/O to disk?
>> > > 
>> > > We hope to not have to.  The current plan is to use kexec: at boot time, do
> 
> If you are prepared to say "dump device is IDE" the driver ends up
> pretty tiny - maybe 4K for PIO mode.

After kexec, we shouldn't need such things, do we? Before it, Linus won't 
take the patch, as he said he doesn't like systems in unstable states doing
crashdumps to disk ...

M.



* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 10:54     ` Suparna Bhattacharya
  2004-07-28 10:46       ` [Fastboot] " Alan Cox
@ 2004-07-28 15:12       ` Eric W. Biederman
  2004-07-28 15:23         ` Martin J. Bligh
  1 sibling, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 15:12 UTC (permalink / raw)
  To: suparna; +Cc: Andrew Morton, fastboot, linux-kernel, Martin J. Bligh

Suparna Bhattacharya <suparna@in.ibm.com> writes:

> This differs a little from your earlier suggestion of requiring
> a kernel to run from a non-default address. Martin suggested simply
> reserving about 16MB of area in advance, so that just before kexecing
> the new kernel with mem=16M, we save the first 16MB away into the
> reserved space. /dev/hmem (oldmem ?) is a view into the old kernel's
> memory, as opposed to /dev/mem.

OK, that is fairly simple and can be implemented easily, so it is
a good place to start.

A buffer to save old kernel state is needed.  At the very least
we need to save the old register state; copying over several megabytes
of data won't hurt the picture.

It has a little more exposure to on-going DMA transfers than running
from a reserved area of memory, as some of that memory may have been
setup as DMA buffers by the dying kernel. 

If it turns out that on-going DMA is a problem I see that as something
we can fix later on.

> > Does anyone have a proof of concept implementation?  I have been able to find
> 
> Yes, Hari has a nice POC implementation - it might make sense for him to post
> it right away for you to take a look. Basically, in addition to hmem (oldmem),
> the upcoming kernel exports an ELF core view of the saved register and memory 
> state of the previous kernel as /proc/vmcore.prev (remember your suggestion 
> of using an ELF core file format for dump ?), so one can use cp or scp to 
> save the core dump to disk. He has a quick demo, where he uses gdb (unmodified)
> to open the dump and show a stack trace of the dumping cpu.

Yes I would like to look at that.  

I am tempted to suggest the data buffer with the registers and memory
actually be in ELF format with virtual address representing where the
data was in memory, and the physical addresses reporting where the
data actually is in memory now.  Then we could just grab everything
with /dev/mem..

But I don't know how much of a pain this would be.
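
To make the idea concrete, here is a rough sketch of how one saved chunk
could be described by an ELF program header, with p_vaddr recording where the
data used to be and p_paddr where it sits now.  It uses the standard
Elf64_Phdr from <elf.h>; the function name and the choice to leave p_offset
at zero are assumptions for illustration only.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: describe one saved region as a PT_LOAD segment.
 * was_at is the address the data had in the crashed kernel, now_at is
 * where the copy currently lives in physical memory. */
Elf64_Phdr saved_chunk_phdr(uint64_t was_at, uint64_t now_at, uint64_t size)
{
	Elf64_Phdr ph;

	memset(&ph, 0, sizeof(ph));	/* p_offset stays 0 in this sketch */
	ph.p_type = PT_LOAD;
	ph.p_flags = PF_R;
	ph.p_vaddr = was_at;
	ph.p_paddr = now_at;
	ph.p_filesz = size;
	ph.p_memsz = size;
	return ph;
}
```

A reader that walks such headers could then fetch each region from p_paddr
through /dev/mem and present it at p_vaddr.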

Eric


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 14:06           ` Alan Cox
@ 2004-07-28 15:21             ` Martin J. Bligh
  2004-07-28 15:56               ` Eric W. Biederman
  2004-07-28 17:09               ` Andrea Arcangeli
  2004-07-28 16:05             ` Eric W. Biederman
  1 sibling, 2 replies; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-28 15:21 UTC (permalink / raw)
  To: Alan Cox
  Cc: suparna, Eric W. Biederman, Andrew Morton, fastboot,
	Linux Kernel Mailing List

--Alan Cox <alan@lxorguk.ukuu.org.uk> wrote (on Wednesday, July 28, 2004 15:06:27 +0100):

> On Mer, 2004-07-28 at 15:38, Martin J. Bligh wrote:
>> After kexec, we shouldn't need such things, do we? Before it, Linus won't 
>> take the patch, as he said he doesn't like systems in unstable states doing
>> crashdumps to disk ...
> 
> And what does kexec do.. it accesses the disk. A SHA signed standalone
> dumper is as safe as anything else if not safer.

But it's reading, not writing ... personally I'm happier with that bit ;-)

M.



* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 15:12       ` Eric W. Biederman
@ 2004-07-28 15:23         ` Martin J. Bligh
  2004-07-28 15:53           ` Eric W. Biederman
  0 siblings, 1 reply; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-28 15:23 UTC (permalink / raw)
  To: Eric W. Biederman, suparna; +Cc: Andrew Morton, fastboot, linux-kernel

>> > Does anyone have a proof of concept implementation?  I have been able to find
>> 
>> Yes, Hari has a nice POC implementation - it might make sense for him to post
>> it right away for you to take a look. Basically, in addition to hmem (oldmem),
>> the upcoming kernel exports an ELF core view of the saved register and memory 
>> state of the previous kernel as /proc/vmcore.prev (remember your suggestion 
>> of using an ELF core file format for dump ?), so one can use cp or scp to 
>> save the core dump to disk. He has a quick demo, where he uses gdb (unmodified)
>> to open the dump and show a stack trace of the dumping cpu.
> 
> Yes I would like to look at that.  
> 
> I am tempted to suggest the data buffer with the registers and memory
> actually be in ELF format with virtual address representing where the
> data was in memory, and the physical addresses reporting where the
> data actually is in memory now.  Then we could just grab everything
> with /dev/mem..
> 
> But I don't know how much of a pain this would be.

/dev/mem expects mem_map to be there, the size of which would easily
blow away the reserved section. /dev/oldmem (or whatever we call it)
is just a magic copy that does a kmap-like operation to get at pages
without a struct page.

M.



* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 10:46       ` [Fastboot] " Alan Cox
  2004-07-28 14:38         ` Martin J. Bligh
@ 2004-07-28 15:42         ` Eric W. Biederman
  1 sibling, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 15:42 UTC (permalink / raw)
  To: Alan Cox
  Cc: suparna, Andrew Morton, fastboot, Martin J. Bligh,
	Linux Kernel Mailing List

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Mer, 2004-07-28 at 11:54, Suparna Bhattacharya wrote:
> > On Tue, Jul 27, 2004 at 07:53:01PM -0600, Eric W. Biederman wrote:
> > > Andrew Morton <akpm@osdl.org> writes:
> > > 
> > > > Keith Owens <kaos@sgi.com> wrote:
> > > > >
> > > > >  * How do we get a clean API to do polling mode I/O to disk?
> > > > 
> > > > > We hope to not have to.  The current plan is to use kexec: at boot time, do
> 
> 
> If you are prepared to say "dump device is IDE" the driver ends up
> pretty tiny - maybe 4K for PIO mode.

That sounds about right. I have a read-only version of Etherboot that
pretty much does that. 

A large challenge is how do you get the IDE device into a sane state
before you use it to dump core.  How do you stop on-going DMA
transactions?  What about implementation errata etc.?  Do you need to
reprogram the drive to work in a PIO mode again?  I can't quite
remember, but I seem to recall that the interface to IDE drives was
fairly stateful.  All of that needs to be maintained in a separate dump
driver.

Plus there is the challenge that many high-end systems where people
want to employ this kind of thing have SCSI disks.  At least that
has been my impression.

Using kexec to switch to an independent program to do the dump
sounds sane.  If you have a simple system with just IDE you could
run a very minimal 4K IDE dumper. If you have a full sized system
you can use a kernel+initrd and do all kinds of interesting things.
Dump to disk.  Dump to network etc.  Stop and work interactively with
the user to examine the state of the machine when it crashed.

Basically you are only limited by how much of your system your 
monitor can put into a sane state.  

Plus except for a small stub the code with kernel+initrd is almost
totally user space and using the normal path for kernel drivers.
The kernel drivers that are used may need to be made a little more
robust but that is not a big deal.

What is most attractive is that using kexec to put the policy in user
space seems to be the Unix way.

Eric

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 15:23         ` Martin J. Bligh
@ 2004-07-28 15:53           ` Eric W. Biederman
  0 siblings, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 15:53 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: suparna, Andrew Morton, fastboot, linux-kernel

"Martin J. Bligh" <mbligh@aracnet.com> writes:

> /dev/mem expects mem_map to be there, the size of which would easily
> blow away the reserved section. /dev/oldmem (or whatever we call it)
> is just a magic copy that does a kmap-like operation to get at pages
> without a struct page.

Don't we already do that for the mmap case so we can access I/O devices?

But I agree if there is a dependence there having a /dev/rawmem
thing that does not need a page struct would make sense.  I just
don't think we actually need it...

Eric


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 15:21             ` Martin J. Bligh
@ 2004-07-28 15:56               ` Eric W. Biederman
  2004-07-28 17:09               ` Andrea Arcangeli
  1 sibling, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 15:56 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Alan Cox, suparna, Andrew Morton, Linux Kernel Mailing List,
	fastboot

"Martin J. Bligh" <mbligh@aracnet.com> writes:

> --Alan Cox <alan@lxorguk.ukuu.org.uk> wrote (on Wednesday, July 28, 2004
> 15:06:27 +0100):
> 
> 
> > On Mer, 2004-07-28 at 15:38, Martin J. Bligh wrote:
> >> After kexec, we shouldn't need such things, do we? Before it, Linus won't 
> >> take the patch, as he said he doesn't like systems in unstable states doing
> >> crashdumps to disk ...
> > 
> > And what does kexec do.. it accesses the disk. A SHA signed standalone
> > dumper is as safe as anything else if not safer.
> 
> But it's reading, not writing ... personally I'm happier with that bit ;-)

And it is only reading to preload the dumper in memory.  This happens
before the system crashes.

All that happens at crash dump time is we hand off control to the dumper.
kexec is just the mechanism to switch from the kernel to the dumper.

It is attractive to make the dumper based on the current linux kernel
but that is by no means a requirement.

Eric
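
The split described above (do the loading while the system is healthy,
and only hand off control at crash time) can be sketched in user-space C.
This is a toy model, not kexec code: crash_preload(), crash_handoff() and
demo_dumper() are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Entry point of a dumper that was loaded ahead of time (hypothetical type). */
typedef void (*dumper_entry_t)(void);

/* Set while the system is still healthy; NULL means nothing was preloaded. */
dumper_entry_t preloaded_dumper = NULL;

/* Counts how often the demo dumper ran, so the hand-off can be observed. */
int demo_dumper_runs = 0;

/* Stand-in for a real dumper; it only records that it was entered. */
void demo_dumper(void)
{
    demo_dumper_runs++;
}

/* Healthy path: all disk reading and loading would happen before this call. */
void crash_preload(dumper_entry_t entry)
{
    assert(entry != NULL);
    preloaded_dumper = entry;
}

/* Crash path: no I/O and no allocation, just transfer control if possible. */
int crash_handoff(void)
{
    if (preloaded_dumper == NULL)
        return -1;          /* nothing was preloaded, so no dump is possible */
    preloaded_dumper();     /* a real hand-off would never return */
    return 0;
}
```

The point of the shape is that crash_handoff() touches nothing except a
pointer that was set up earlier.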

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28  1:53   ` Eric W. Biederman
  2004-07-28 10:54     ` Suparna Bhattacharya
@ 2004-07-28 16:03     ` Jesse Barnes
  2004-07-28 18:00       ` Eric W. Biederman
  2004-07-28 19:23       ` Martin J. Bligh
  1 sibling, 2 replies; 60+ messages in thread
From: Jesse Barnes @ 2004-07-28 16:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	Martin J. Bligh, fastboot

On Tuesday, July 27, 2004 6:53 pm, Eric W. Biederman wrote:
> Hmm.  I think this will require one of the kernels to run at a
> non-default address in physical memory.

Right, and some platforms already support this, fortunately.

> Which will largely depend on the quality of its device drivers...

I think this could end up being a good thing.  It gives more people a stake in 
making sure that driver shutdown() routines work well.

Jesse

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 14:06           ` Alan Cox
  2004-07-28 15:21             ` Martin J. Bligh
@ 2004-07-28 16:05             ` Eric W. Biederman
  1 sibling, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 16:05 UTC (permalink / raw)
  To: Alan Cox
  Cc: Martin J. Bligh, suparna, Andrew Morton,
	Linux Kernel Mailing List, fastboot

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> A SHA signed standalone dumper is as safe as anything else if not safer.

Thanks.  Having a signature or a checksum on the dumper
stored in memory is one piece of this puzzle that I keep
forgetting.  Checking the signature before we trust the code
is important.

Eric
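
A minimal sketch of "check before we trust the code", assuming the
expected checksum was recorded when the dumper was preloaded.  FNV-1a is
used here only as a small stand-in for a real digest such as SHA; it is
not tamper resistant.  The names fnv1a32() and dumper_image_intact() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash: a simple checksum used as a stand-in for SHA. */
uint32_t fnv1a32(const uint8_t *data, size_t len)
{
    uint32_t hash = 2166136261u;        /* FNV offset basis */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;              /* FNV prime */
    }
    return hash;
}

/*
 * Returns nonzero if the in-memory dumper image still matches the
 * checksum recorded at preload time, zero if it has been modified.
 */
int dumper_image_intact(const uint8_t *image, size_t len, uint32_t expected)
{
    return fnv1a32(image, len) == expected;
}
```

A crash path would call dumper_image_intact() first and refuse to jump
into the dumper if it returns zero.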

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 15:21             ` Martin J. Bligh
  2004-07-28 15:56               ` Eric W. Biederman
@ 2004-07-28 17:09               ` Andrea Arcangeli
  1 sibling, 0 replies; 60+ messages in thread
From: Andrea Arcangeli @ 2004-07-28 17:09 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Alan Cox, suparna, Eric W. Biederman, Andrew Morton, fastboot,
	Linux Kernel Mailing List

On Wed, Jul 28, 2004 at 08:21:26AM -0700, Martin J. Bligh wrote:
> --Alan Cox <alan@lxorguk.ukuu.org.uk> wrote (on Wednesday, July 28, 2004 15:06:27 +0100):
> 
> > On Mer, 2004-07-28 at 15:38, Martin J. Bligh wrote:
> >> After kexec, we shouldn't need such things, do we? Before it, Linus won't 
> >> take the patch, as he said he doesn't like systems in unstable states doing
> >> crashdumps to disk ...
> > 
> > And what does kexec do.. it accesses the disk. A SHA signed standalone
> > dumper is as safe as anything else if not safer.
> 
> But it's reading, not writing ... personally I'm happier with that bit ;-)

I was using mcore a few years back and it didn't need to read anything
to launch the new kernel image with bootimg; the new kernel image was
stored in a safe place in memory, IIRC.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 16:03     ` Jesse Barnes
@ 2004-07-28 18:00       ` Eric W. Biederman
  2004-07-28 18:06         ` Jesse Barnes
  2004-07-28 18:21         ` Alan Cox
  2004-07-28 19:23       ` Martin J. Bligh
  1 sibling, 2 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 18:00 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	Martin J. Bligh, fastboot

Jesse Barnes <jbarnes@engr.sgi.com> writes:

> On Tuesday, July 27, 2004 6:53 pm, Eric W. Biederman wrote:
> > Hmm.  I think this will require one of the kernels to run at a
> > non-default address in physical memory.
> 
> Right, and some platforms already support this, fortunately.
> 
> > Which will largely depend on the quality of its device drivers...
> 
> I think this could end up being a good thing.  It gives more people a stake in 
> making sure that driver shutdown() routines work well.

Which actually is one of the items open for discussion currently.
For kexec on panic do we want to run the shutdown() routines?

Eric

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 18:00       ` Eric W. Biederman
@ 2004-07-28 18:06         ` Jesse Barnes
  2004-07-28 19:42           ` Martin J. Bligh
  2004-07-28 19:44           ` Andrew Morton
  2004-07-28 18:21         ` Alan Cox
  1 sibling, 2 replies; 60+ messages in thread
From: Jesse Barnes @ 2004-07-28 18:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	Martin J. Bligh, fastboot

On Wednesday, July 28, 2004 11:00 am, Eric W. Biederman wrote:
> > I think this could end up being a good thing.  It gives more people a
> > stake in making sure that driver shutdown() routines work well.
>
> Which actually is one of the items open for discussion currently.
> For kexec on panic do we want to run the shutdown() routines?

We'll have to do something about incoming dma traffic and other stuff that the 
devices might be doing.  Maybe an arch-specific callout to do some chipset 
stuff?

Jesse

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 18:00       ` Eric W. Biederman
  2004-07-28 18:06         ` Jesse Barnes
@ 2004-07-28 18:21         ` Alan Cox
  1 sibling, 0 replies; 60+ messages in thread
From: Alan Cox @ 2004-07-28 18:21 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jesse Barnes, Andrew Morton, fastboot, Martin J. Bligh,
	Suparna Bhattacharya, Linux Kernel Mailing List

On Mer, 2004-07-28 at 19:00, Eric W. Biederman wrote:
> Which actually is one of the items open for discussion currently.
> For kexec on panic do we want to run the shutdown() routines?

Probably, or you may not be able to recover some devices. In some cases,
simply turning off the master bit might be enough, however.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 16:03     ` Jesse Barnes
  2004-07-28 18:00       ` Eric W. Biederman
@ 2004-07-28 19:23       ` Martin J. Bligh
  2004-07-28 20:28         ` [Fastboot] " Eric W. Biederman
  1 sibling, 1 reply; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-28 19:23 UTC (permalink / raw)
  To: Jesse Barnes, Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	fastboot



--On Wednesday, July 28, 2004 09:03:37 -0700 Jesse Barnes <jbarnes@engr.sgi.com> wrote:

> On Tuesday, July 27, 2004 6:53 pm, Eric W. Biederman wrote:
>> Hmm.  I think this will require one of the kernels to run at a
>> non-default address in physical memory.
> 
> Right, and some platforms already support this, fortunately.
> 
>> Which will largely depend on the quality of its device drivers...
> 
> I think this could end up being a good thing.  It gives more people a stake in 
> making sure that driver shutdown() routines work well.

We discussed this at kernel summit a bit - it'd be safer to make the devices
clear down on boot up, rather than shutdown, if possible ... less work to
do on the unstable base.

Maybe we could shut down the devices on bringup, then bring it up again 
(no I'm not kidding ;-)) ... should reuse the code.

M.




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 18:06         ` Jesse Barnes
@ 2004-07-28 19:42           ` Martin J. Bligh
  2004-07-28 19:56             ` [Fastboot] " Alan Cox
  2004-07-28 19:44           ` Andrew Morton
  1 sibling, 1 reply; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-28 19:42 UTC (permalink / raw)
  To: Jesse Barnes, Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	fastboot

>> > I think this could end up being a good thing.  It gives more people a
>> > stake in making sure that driver shutdown() routines work well.
>> 
>> Which actually is one of the items open for discussion currently.
>> For kexec on panic do we want to run the shutdown() routines?
> 
> We'll have to do something about incoming dma traffic and other stuff that the 
> devices might be doing.  Maybe a arch specific callout to do some chipset 
> stuff?

I vote for sleeping for 5 seconds ;-) Should kill off most of it ...

M.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 18:06         ` Jesse Barnes
  2004-07-28 19:42           ` Martin J. Bligh
@ 2004-07-28 19:44           ` Andrew Morton
  2004-07-28 23:11             ` [Fastboot] " Eric W. Biederman
  1 sibling, 1 reply; 60+ messages in thread
From: Andrew Morton @ 2004-07-28 19:44 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: ebiederm, kaos, linux-kernel, suparna, mbligh, fastboot

Jesse Barnes <jbarnes@engr.sgi.com> wrote:
>
> On Wednesday, July 28, 2004 11:00 am, Eric W. Biederman wrote:
> > > I think this could end up being a good thing.  It gives more people a
> > > stake in making sure that driver shutdown() routines work well.
> >
> > Which actually is one of the items open for discussion currently.
> > For kexec on panic do we want to run the shutdown() routines?
> 
> We'll have to do something about incoming dma traffic and other stuff that the 
> devices might be doing.  Maybe a arch specific callout to do some chipset 
> stuff?
> 

Does ongoing DMA actually matter?  After all, the memory which is being
dma-ed into is pre-reserved and allocated for that purpose, and the dump
kernel won't be using it.

It would be polite to pause for a number of seconds to allow things to go
quiet, but apart from that I think all we need to ensure is that the
drivers in the dump kernel firmly whack the hardware before reinitialising
it?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 19:42           ` Martin J. Bligh
@ 2004-07-28 19:56             ` Alan Cox
  0 siblings, 0 replies; 60+ messages in thread
From: Alan Cox @ 2004-07-28 19:56 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Jesse Barnes, Eric W. Biederman, Andrew Morton,
	Suparna Bhattacharya, Linux Kernel Mailing List, fastboot

On Mer, 2004-07-28 at 20:42, Martin J. Bligh wrote:
> > We'll have to do something about incoming dma traffic and other stuff that the 
> > devices might be doing.  Maybe a arch specific callout to do some chipset 
> > stuff?
> 
> I vote for sleeping for 5 seconds ;-) Should kill off most of it ...

Wake up and smell the coffee.

- Bus masters that run forever
- Devices that need to flush before reset is asserted (eg IDE disk)

...


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 20:33           ` Andrew Morton
@ 2004-07-28 19:59             ` Alan Cox
  2004-07-28 22:42               ` Andrew Morton
  2004-07-28 23:17               ` Eric W. Biederman
  0 siblings, 2 replies; 60+ messages in thread
From: Alan Cox @ 2004-07-28 19:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, suparna, fastboot, mbligh,
	Linux Kernel Mailing List, jbarnes

On Mer, 2004-07-28 at 21:33, Andrew Morton wrote:
> We really don't want to be calling driver shutdown functions from a crashed
> kernel.

Then at the very least you need to disable bus mastering and have
specialist recovery functions for problematic devices. The bus
mastering one is essential otherwise bus masters will continue to
DMA random data into your new universe.

Other stuff like graphics cards and IDE may need care too.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 19:23       ` Martin J. Bligh
@ 2004-07-28 20:28         ` Eric W. Biederman
  2004-07-28 20:33           ` Andrew Morton
  0 siblings, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 20:28 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Jesse Barnes, Andrew Morton, Suparna Bhattacharya, linux-kernel,
	fastboot, Hariprasad Nellitheertha

"Martin J. Bligh" <mbligh@aracnet.com> writes:
> We discussed this at kernel summit a bit - it'd be safer to make the devices
> clear down on boot up, rather than shutdown, if possible ... less work to
> do on the unstable base.

Agreed, but I think for starters we should capture the low hanging fruit
by calling the shutdown method and then increasingly harden the
code by performing less in the kernel that panics and more in the
cleanup kernel.

That way we can concentrate first on the interfaces to the rest of the
kernel.  And then we can make the solution bullet proof.

> Maybe we could shut down the devices on bringup, then bring it up again 
> (no I'm not kidding ;-)) ... should reuse the code.

Might not be a bad idea.

Eric

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 20:28         ` [Fastboot] " Eric W. Biederman
@ 2004-07-28 20:33           ` Andrew Morton
  2004-07-28 19:59             ` Alan Cox
  0 siblings, 1 reply; 60+ messages in thread
From: Andrew Morton @ 2004-07-28 20:33 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: mbligh, jbarnes, suparna, linux-kernel, fastboot, hari

ebiederm@xmission.com (Eric W. Biederman) wrote:
>
> "Martin J. Bligh" <mbligh@aracnet.com> writes:
> > We discussed this at kernel summit a bit - it'd be safer to make the devices
> > clear down on boot up, rather than shutdown, if possible ... less work to
> > do on the unstable base.
> 
> Agreed, but I think for starters we should capture the low hanging fruit
> by calling the shutdown method and then increasingly harden the
> code by performing less in the kernel that panics and more in the
> cleanup kernel.

We really don't want to be calling driver shutdown functions from a crashed
kernel.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 19:59             ` Alan Cox
@ 2004-07-28 22:42               ` Andrew Morton
  2004-07-28 22:44                 ` Jesse Barnes
  2004-07-28 23:17               ` Eric W. Biederman
  1 sibling, 1 reply; 60+ messages in thread
From: Andrew Morton @ 2004-07-28 22:42 UTC (permalink / raw)
  To: Alan Cox; +Cc: ebiederm, suparna, fastboot, mbligh, linux-kernel, jbarnes

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> On Mer, 2004-07-28 at 21:33, Andrew Morton wrote:
> > We really don't want to be calling driver shutdown functions from a crashed
> > kernel.
> 
> Then at the very least you need to disable bus mastering and have
> specialist recovery functions for problematic devices. The bus
> mastering one is essential otherwise bus masters will continue to
> DMA random data into your new universe.

But they're welcome to do that: the memory for the DMA transfer has
already been allocated and our new universe will not be touching it.

What we need to do is to ensure that the new kexec-ed kernel appropriately
whacks the devices to stop any in-progress operations.  So it's the probe() and
open() routines which need to get the device into a sane state, not the shutdown
routines.

This way:

- We have fewer devices to take care of: we only care about those devices
  which are needed for a successful dump.

- We are poking at these devices in a known-good kernel, not from within
  a kernel which has wrecked itself.

- Any devices which are performing DMA to/from the old kernel's memory
  can just keep on doing that.  The new kernel doesn't care, unless it
  needs those devices for dumping.



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 22:42               ` Andrew Morton
@ 2004-07-28 22:44                 ` Jesse Barnes
  0 siblings, 0 replies; 60+ messages in thread
From: Jesse Barnes @ 2004-07-28 22:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alan Cox, ebiederm, suparna, fastboot, mbligh, linux-kernel

On Wednesday, July 28, 2004 3:42 pm, Andrew Morton wrote:
> But they're welcome to do that: the memory for the DMA transfer has
> already been allocated and our new universe will not be touching it.

Yeah, for the most part that should be ok.  We can be paranoid about 
misdirected DMA later...  (Some platforms will let you protect memory regions 
at the chipset level, so it seems like the new kernel should be so protected 
until it's actually jumped to by kexec, but prior to unprotecting it you'd 
want to make sure that a bad DMA from the broken kernel doesn't hose it.)

> What we need to do is to ensure that the new kexec-ed kernel appropriately
> whacks the devices to stop any in-progress operations.  So it's the probe()
> and open() routines which need to get the device into a sane state, not the
> shutdown routines.

That makes sense, and means that drivers may want to call a shutdown-like 
routine in their probe functions to make sure their device is in a known 
state before starting.  But all of this is very driver specific it seems.

Jesse
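
The "shutdown-like routine in probe" idea can be sketched with a toy
device model.  Nothing here is a real driver API: struct fake_dev,
fake_dev_quiesce() and fake_dev_probe() are hypothetical, and the flags
stand in for whatever state a real device might be left in by the
crashed kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of device state that a crashed kernel may leave behind. */
struct fake_dev {
    bool dma_active;    /* a transfer is still running */
    bool irq_enabled;   /* the device may still raise interrupts */
    bool initialized;   /* the driver considers the device usable */
};

/* Shutdown-like step: force the device into a known, idle state. */
void fake_dev_quiesce(struct fake_dev *dev)
{
    assert(dev != NULL);
    dev->dma_active = false;
    dev->irq_enabled = false;
    dev->initialized = false;
}

/* Probe that assumes nothing about the state it inherits. */
int fake_dev_probe(struct fake_dev *dev)
{
    assert(dev != NULL);
    fake_dev_quiesce(dev);      /* whack the hardware first */
    dev->irq_enabled = true;    /* then do the normal bring-up */
    dev->initialized = true;
    return 0;
}
```

With this ordering the probe result is the same whether the device was
freshly powered on or was left mid-transfer.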

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 23:11             ` [Fastboot] " Eric W. Biederman
@ 2004-07-28 22:53               ` Alan Cox
  2004-07-29  1:12                 ` Eric W. Biederman
  0 siblings, 1 reply; 60+ messages in thread
From: Alan Cox @ 2004-07-28 22:53 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, suparna, fastboot, Jesse Barnes,
	Linux Kernel Mailing List, mbligh

On Iau, 2004-07-29 at 00:11, Eric W. Biederman wrote:
> If we can ensure the addresses where the new kernel will run will never
> have DMA pointed at them I actually don't think so.  This is why last
> year I recommended building a kernel that runs at a non-default address
> and finding a way to simply preload it there.

We DMA into arbitrary allocated pages anywhere in the memory space, so
you never know where is safe other than areas preallocated during the
old kernel run.

Since you can just clear the master bit on each PCI device it isn't a big
deal to protect against (except for a couple of devices that forget
to honour it).
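
For reference, the master bit is bit 2 of the 16-bit PCI command
register at configuration-space offset 0x04.  The sketch below clears it
in a byte array standing in for a device's config space; the cfg_read16(),
cfg_write16() and disable_bus_master() helpers are hypothetical, and a
real kernel would go through its PCI config accessors instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND        0x04    /* offset of the 16-bit command register */
#define PCI_COMMAND_MASTER 0x0004  /* bit 2: bus master enable */

/* Read a little-endian 16-bit value from a simulated config space. */
uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
    assert(cfg != NULL);
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Write a little-endian 16-bit value to a simulated config space. */
void cfg_write16(uint8_t *cfg, size_t off, uint16_t val)
{
    assert(cfg != NULL);
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Stop the device from initiating DMA; all other command bits are kept. */
void disable_bus_master(uint8_t *cfg)
{
    uint16_t cmd = cfg_read16(cfg, PCI_COMMAND);

    cfg_write16(cfg, PCI_COMMAND, (uint16_t)(cmd & ~PCI_COMMAND_MASTER));
}
```

I/O and memory decoding (bits 0 and 1) are left alone, so the device can
still be talked to after its DMA has been stopped.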


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 23:17               ` Eric W. Biederman
@ 2004-07-28 22:55                 ` Alan Cox
  2004-07-29  0:22                   ` Andrew Morton
  2004-07-29  1:05                   ` Eric W. Biederman
  2004-07-28 23:44                 ` Andrew Morton
  1 sibling, 2 replies; 60+ messages in thread
From: Alan Cox @ 2004-07-28 22:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, suparna, fastboot, mbligh, jbarnes,
	Linux Kernel Mailing List

On Iau, 2004-07-29 at 00:17, Eric W. Biederman wrote:
> What is your concern with stopping DMA?
> - Not smashing the recovery routine.
> - Getting a corrupted core dump because of on-going DMA?

Completely random happenings occurring when they are trivial to avoid.
Given all the worries about SHA signed in kernel standalone objects I
find it farcical that the same people don't even care about ensuring
something isn't DMAing over their dump partition description.

Alan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 19:44           ` Andrew Morton
@ 2004-07-28 23:11             ` Eric W. Biederman
  2004-07-28 22:53               ` Alan Cox
  0 siblings, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 23:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jesse Barnes, suparna, fastboot, mbligh, linux-kernel

Andrew Morton <akpm@osdl.org> writes:

> Jesse Barnes <jbarnes@engr.sgi.com> wrote:
> >
> > On Wednesday, July 28, 2004 11:00 am, Eric W. Biederman wrote:
> > > > I think this could end up being a good thing.  It gives more people a
> > > > stake in making sure that driver shutdown() routines work well.
> > >
> > > Which actually is one of the items open for discussion currently.
> > > For kexec on panic do we want to run the shutdown() routines?
> > 
> > > We'll have to do something about incoming dma traffic and other stuff that the
> > > devices might be doing.  Maybe a arch specific callout to do some chipset 
> > > stuff?
> > 
> 
> Does ongoing DMA actually matter?  After all, the memory which is being
> dma-ed into is pre-reserved and allocated for that purpose, and the dump
> kernel won't be using it.

If we can ensure the addresses where the new kernel will run will never
have DMA pointed at them I actually don't think so.  This is why last
year I recommended building a kernel that runs at a non-default address
and finding a way to simply preload it there.

Eric

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 19:59             ` Alan Cox
  2004-07-28 22:42               ` Andrew Morton
@ 2004-07-28 23:17               ` Eric W. Biederman
  2004-07-28 22:55                 ` Alan Cox
  2004-07-28 23:44                 ` Andrew Morton
  1 sibling, 2 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-28 23:17 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, suparna, fastboot, mbligh, jbarnes,
	Linux Kernel Mailing List

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Mer, 2004-07-28 at 21:33, Andrew Morton wrote:
> > We really don't want to be calling driver shutdown functions from a crashed
> > kernel.
> 
> Then at the very least you need to disable bus mastering and have
> specialist recovery functions for problematic devices. The bus
> mastering one is essential otherwise bus masters will continue to
> DMA random data into your new universe.
> 
> Other stuff like graphics cards and IDE may need care too.

Alan, if we call anything, the shutdown methods really are the thing
to call, because they are exactly the specialty recovery functions for
problematic devices.

Of course, no matter what we do, this will not work 100% of the time,
because part of what we will be fighting is broken hardware.  However,
we should be able to get a mechanism that works most of the time.

What is your concern with stopping DMA?
- Not smashing the recovery routine.
- Getting a corrupted core dump because of on-going DMA?

Eric

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 23:17               ` Eric W. Biederman
  2004-07-28 22:55                 ` Alan Cox
@ 2004-07-28 23:44                 ` Andrew Morton
  2004-07-29  0:58                   ` Eric W. Biederman
                                     ` (2 more replies)
  1 sibling, 3 replies; 60+ messages in thread
From: Andrew Morton @ 2004-07-28 23:44 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: alan, suparna, fastboot, mbligh, jbarnes, linux-kernel

ebiederm@xmission.com (Eric W. Biederman) wrote:
>
> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> 
> > On Mer, 2004-07-28 at 21:33, Andrew Morton wrote:
> > > We really don't want to be calling driver shutdown functions from a crashed
> > > kernel.
> > 
> > Then at the very least you need to disable bus mastering and have
> > specialist recovery functions for problematic devices. The bus
> > mastering one is essential otherwise bus masters will continue to
> > DMA random data into your new universe.
> > 
> > Other stuff like graphics cards and IDE may need care too.
> 
> Alan if we call anything the shutdown methods really are the thing
> to call.  Because they are exactly the specialty recovery functions for
> problematic devices.
> 
> Of course, no matter what we do, this will not work 100% of the time because
> part of what we will be fighting is broken hardware.  However we should
> be able to get a mechanism that works most of the time.

Shutdown methods will typically call into the slab allocator and the page
allocator to free stuff, and they are pretty common sources of oopses. 
Often with locks held.  You run an excellent chance of deadlocking.

Possibly one could add

#ifdef CONFIG_WHATEVER
	if (unlikely(oops_in_progress))
		return;
#endif

to the relevant entry points.

The shutdown routines may also call into sysfs/kobject/procfs release entry
points, and they're even more popular oops sites.

We really want to get into the new kernel ASAP and clean stuff up from
in there.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 22:55                 ` Alan Cox
@ 2004-07-29  0:22                   ` Andrew Morton
  2004-07-29 13:57                     ` Alan Cox
  2004-07-29  1:05                   ` Eric W. Biederman
  1 sibling, 1 reply; 60+ messages in thread
From: Andrew Morton @ 2004-07-29  0:22 UTC (permalink / raw)
  To: Alan Cox; +Cc: ebiederm, suparna, fastboot, mbligh, jbarnes, linux-kernel

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> On Iau, 2004-07-29 at 00:17, Eric W. Biederman wrote:
> > What is your concern with stopping DMA?
> > - Not smashing the recovery routine.
> > - Getting a corrupted core dump because of on-going DMA?
> 
> Completely random happenings occurring when they are trivial to avoid.
> Given all the worries about SHA signed in kernel standalone objects I
> find it farcical that the same people don't even care about ensuring
> something isn't DMAing over their dump partition description.
> 

eh?  People do care.  The point here is that we should stop the DMA in the
dump kernel, not from within the broken kernel.

btw, if we simply insert a five-second-pause, what problems does that
leave?  Network Rx, which is OK.  Disk writes will have completed (?). 
What remains?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 23:44                 ` Andrew Morton
@ 2004-07-29  0:58                   ` Eric W. Biederman
  2004-07-29  1:09                     ` Andrew Morton
  2004-07-29 14:08                   ` Martin J. Bligh
  2004-07-29 17:12                   ` Matthias Urlichs
  2 siblings, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29  0:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: suparna, fastboot, mbligh, linux-kernel, alan, jbarnes

Andrew Morton <akpm@osdl.org> writes:

> Shutdown methods will typically call into the slab allocator and the page
> allocator to free stuff, and they are pretty common sources of oopses. 
> Often with locks held.  You run an excellent chance of deadlocking.

Hmm..  Last I looked shutdown methods typically don't exist at all.
The shutdown methods are explicitly separated from the remove methods
for exactly this reason.  It is a BUG for any shutdown method to
free memory.  Their only function is to shut down the hardware.

> Possibly one could add
> 
> #ifdef CONFIG_WHATEVER
> 	if (unlikely(oops_in_progress))
> 		return;
> #endif
> 
> to the relevant entry points.
> 
> The shutdown routines may also call into sysfs/kobject/procfs release entry
> points, and they're even more popular oops sites.

Again.  If a shutdown method does that it is a BUG.  Only the remove method
should do that.  

If I actually believed that shutdown methods existed, and did that 
I would be in favor of writing a patch to test for any accesses of
memory management or sysfs/kobject/procfs release stuff and BUG
if it happened.

> We really want to get into the new kernel ASAP and clean stuff up from
> in there.

I agree.  However the gymnastics for doing that have not been worked out.
The drivers cannot clean up stuff yet, nor do we have a good way to run
in memory where DMA transfers are not ongoing.

So for a first pass I think calling the shutdown methods makes sense.

For a second pass we need to use a relocatable kernel that can do everything itself.
And we should run it out of a reserved area of memory.

But the first pass is worth it (at least in the kexec tree) to sort out all
of the interface issues and catch the low hanging fruit.

Eric

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 22:55                 ` Alan Cox
  2004-07-29  0:22                   ` Andrew Morton
@ 2004-07-29  1:05                   ` Eric W. Biederman
  2004-07-29 14:12                     ` Martin J. Bligh
  1 sibling, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29  1:05 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, suparna, fastboot, mbligh,
	Linux Kernel Mailing List, jbarnes

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Iau, 2004-07-29 at 00:17, Eric W. Biederman wrote:
> > What is your concern with stopping DMA?
> > - Not smashing the recovery routine.
> > - Getting a corrupted core dump because of on-going DMA?
> 
> Completely random happenings occurring when they are trivial to avoid.
> Given all the worries about SHA signed in kernel standalone objects I
> find it farcical that the same people don't even care about ensuring
> something isn't DMAing over their dump partition description.

I was asking so I could give a better answer.

As far as I can tell the long term solution is to simply run the
dumper from memory that is reserved in the kernel that called panic().

At that point you might get a corrupt DUMP. But it is extremely
unlikely any DMA transactions will touch your reserved memory.
Only the most buggy drivers or hardware will be running
DMA to the invalid addresses.  At which point there is
very little you can do in any event.

Eric

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29  0:58                   ` Eric W. Biederman
@ 2004-07-29  1:09                     ` Andrew Morton
  2004-07-29  1:56                       ` Eric W. Biederman
  0 siblings, 1 reply; 60+ messages in thread
From: Andrew Morton @ 2004-07-29  1:09 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: suparna, fastboot, mbligh, linux-kernel, alan, jbarnes

ebiederm@xmission.com (Eric W. Biederman) wrote:
>
> Andrew Morton <akpm@osdl.org> writes:
> 
> > Shutdown methods will typically call into the slab allocator and the page
> > allocator to free stuff, and they are pretty common sources of oopses. 
> > Often with locks held.  You run an excellent chance of deadlocking.
> 
> Hmm..  Last I looked shutdown methods typically don't exist at all.
> The shutdown methods are explicitly separated from the remove methods
> for exactly this reason.  It is a BUG for any shutdown method to
> free memory.  Their only function is to shutdown the hardware. 

OK.  But some (most) of them will sleep, too.  And we shouldn't sleep in a
dead kernel.

> > We really want to get into the new kernel ASAP and clean stuff up from
> > in there.
> 
> I agree.  However the gymnastics for doing that have not been worked out.
> The drivers cannot clean up stuff yet, nor do we have a good way to run
> in memory where DMA transfers are not ongoing.

Don't we?  The 16M of memory was allocated up-front at kexec load time[*],
so nobody will be pointing DMA hardware at it.  And the dump kernel won't
be pointing DMA hardware at the crashed kernel's pages.

> So for a first pass I think calling the shutdown methods makes sense.

Well.  There aren't any.

> But the first pass is worth it (at least in the kexec tree) to sort out all
> of the interface issues and catch the low hanging fruit.

A significant proportion of kernel crashes happen from [soft]irq context,
from which we cannot call shutdown methods.  So we need to be able to bring
up the dump kernel without having run driver shutdown functions anyway.

[*] At least, I _assume_ the 16MB will be prereserved,
    physically-contiguous and wholly within ZONE_NORMAL.  Is this wrong?
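
The footnote's assumption can be written down as a simple range check. A
minimal sketch, assuming the usual i386 layout of the time (ZONE_DMA below
16MB, ZONE_NORMAL from 16MB up to an 896MB lowmem limit). The function name
and the constants are illustrative and are not taken from the kexec patches:

```c
#include <stdbool.h>
#include <stdint.h>

#define MB (1024ull * 1024ull)

/* Assumed i386 zone boundaries; a different memory split would change these. */
#define ZONE_NORMAL_START (16 * MB)   /* end of the ISA DMA zone */
#define ZONE_NORMAL_END   (896 * MB)  /* assumed default lowmem limit */

/*
 * Return true if the physically contiguous region [start, start + size)
 * lies wholly within ZONE_NORMAL. Written to avoid overflow in start + size.
 */
bool region_in_zone_normal(uint64_t start, uint64_t size)
{
    if (size == 0 || start < ZONE_NORMAL_START)
        return false;
    if (start > ZONE_NORMAL_END)
        return false;
    return size <= ZONE_NORMAL_END - start;
}
```

A 16MB reservation at 32MB passes this check; one that starts anywhere in
the low 16MB does not, which is the case Eric objects to later in the thread.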


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 22:53               ` Alan Cox
@ 2004-07-29  1:12                 ` Eric W. Biederman
  2004-07-29 14:00                   ` Alan Cox
  0 siblings, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29  1:12 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, suparna, fastboot, mbligh, Jesse Barnes,
	Linux Kernel Mailing List

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Iau, 2004-07-29 at 00:11, Eric W. Biederman wrote:
> > If we can ensure the addresses where the new kernel will run will never
> > have DMA pointed at them I actually don't think so.  This is why last
> > year I recommended building a kernel that runs at a non-default address
> > and finding a way to simply preload it there.
> 
> We DMA into arbitrary allocated pages anywhere in the memory space, so
> you never know where is safe other than areas preallocated during the
> old kernel run.

Alan I just reread what you said and it appears we are in violent agreement
about the facts.

Different methods but...

> Since you can just clear the master bit on each PCI device it isn't a big
> deal to protect against. (except a couple of devices that forget
> to honour it)

Or those devices that hang the machine when you clear it.
Or the ioapics which lose the ability to generate interrupts
when you clear the master bit, and with the i8254 timer behind
them you can't get your new kernel to boot.

Plus there are all of the non-pci devices.  

And there is the fact that the pci configuration access methods
are frequently BIOS calls.

So I do see just clearing the master bit on each PCI device to be
as dangerous as calling the shutdown methods.

Eric
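
For reference, "clearing the master bit" means clearing bit 2 (Bus Master
Enable) of the PCI command register at config space offset 0x04. A minimal
sketch of the bit manipulation only, operating on an in-memory copy of the
command word. `struct fake_pci_dev` and both function names are hypothetical,
and no real config space access (direct or via BIOS) is attempted, which is
the part the disagreement above is about:

```c
#include <stddef.h>
#include <stdint.h>

/* Bits of the PCI command register (config space offset 0x04). */
#define PCI_COMMAND_IO     0x0001u  /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY 0x0002u  /* respond to memory space accesses */
#define PCI_COMMAND_MASTER 0x0004u  /* device may initiate DMA as bus master */

/* Return the command word with bus mastering disabled; other bits unchanged. */
uint16_t pci_command_without_master(uint16_t command)
{
    return (uint16_t)(command & ~PCI_COMMAND_MASTER);
}

/* Hypothetical stand-in for a device: just a copy of its command register. */
struct fake_pci_dev {
    uint16_t command;
};

/* Clear the bus master bit on every device in the array. */
void quiesce_all(struct fake_pci_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].command = pci_command_without_master(devs[i].command);
}
```

I/O and memory decode are left enabled here, so the device still answers
register accesses; only its ability to start DMA is removed.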

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29  1:09                     ` Andrew Morton
@ 2004-07-29  1:56                       ` Eric W. Biederman
  2004-07-29 14:18                         ` Martin J. Bligh
  0 siblings, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29  1:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: suparna, fastboot, mbligh, jbarnes, alan, linux-kernel

Andrew Morton <akpm@osdl.org> writes:

> ebiederm@xmission.com (Eric W. Biederman) wrote:
> >
> > Andrew Morton <akpm@osdl.org> writes:
> 
> OK.  But some (most) of them will sleep, too.  And we shouldn't sleep in a
> dead kernel.

Probably not.  And that is legitimate...

> > I agree.  However the gymnastics for doing that have not been worked out.
> > The drivers cannot clean up stuff yet, nor do we have a good way to run
> > in memory where DMA transfers are not ongoing.
> 
> Don't we?  The 16M of memory was allocated up-front at kexec load time[*],
> so nobody will be pointing DMA hardware at it.  And the dump kernel won't
> be pointing DMA hardware at the crashed kernel's pages.

No but we will be running in the first 16M of memory.  The 16M that
is allocated is currently used to hold a copy of the low 16M.

> > So for a first pass I think calling the shutdown methods makes sense.
> 
> Well.  There aren't any.

Which makes them both safe and worthless.  On the normal kexec path
we will need to get them written though.

> > But the first pass is worth it (at least in the kexec tree) to sort out all
> > of the interface issues and catch the low hanging fruit.
> 
> A significant proportion of kernel crashes happen from [soft]irq context,
> from which we cannot call shutdown methods.  So we need to be able to bring
> up the dump kernel without having run driver shutdown functions anyway.

Well if calling shutdown is not really usable, then we had better
transition quickly beyond using it...

> [*] At least, I _assume_ the 16MB will be prereserved,
>     physically-contiguous and wholly within ZONE_NORMAL.  Is this wrong?

The problem is that we really won't be using it for running code out
of because of i386 kernel limitations.  Unless someone can tell
me why 0-16MB won't have DMA traffic in them.  Or how to run a kernel
at an address other than 1MB.

I suspect we can play with the initial page tables and how virtual
addresses map to physical addresses and fairly simply generate a
relocatable kernel.  I have not had a chance to investigate that
though.  Once we have that it will be trivial to run out of the
reserved 16M and many of the practical problems melt away.

Eric

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29  0:22                   ` Andrew Morton
@ 2004-07-29 13:57                     ` Alan Cox
  2004-07-29 18:17                       ` Andrew Morton
  2004-07-30 15:24                       ` Olivier Galibert
  0 siblings, 2 replies; 60+ messages in thread
From: Alan Cox @ 2004-07-29 13:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: ebiederm, suparna, fastboot, mbligh, jbarnes,
	Linux Kernel Mailing List

On Iau, 2004-07-29 at 01:22, Andrew Morton wrote:
> eh?  People do care.  The point here is that we should stop the DMA in the
> dump kernel, not from within the broken kernel.

And pray just how do you expect to prove that the dump kernel isn't
being overwritten *as* it is being loaded?

> btw, if we simply insert a five-second-pause, what problems does that
> leave?  Network Rx, which is OK.  Disk writes will have completed (?). 
> What remains?

Network RX is the obvious one since we've no idea where the DMA is
going in memory.


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29  1:12                 ` Eric W. Biederman
@ 2004-07-29 14:00                   ` Alan Cox
  2004-07-29 15:47                     ` Eric W. Biederman
  0 siblings, 1 reply; 60+ messages in thread
From: Alan Cox @ 2004-07-29 14:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, suparna, fastboot, mbligh, Jesse Barnes,
	Linux Kernel Mailing List

On Iau, 2004-07-29 at 02:12, Eric W. Biederman wrote:
> Or those devices that hang the machine when you clear it.

There are none. It's required by the PCI spec and used by BIOS vendors
during the boot sequence. So it's a *tested* approach.

> And there is the fact that the pci configuration access methods
> are frequently BIOS calls.

You will be running bios code on some systems every time you read
the cmos clock, every time you touch pci config space, every time
you hit a key, even in your new kernel boot up path - what's your
point?

> So I do see just clearing the master bit on each PCI device to be
> as dangerous as calling the shutdown methods.

Then we violently disagree


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 23:44                 ` Andrew Morton
  2004-07-29  0:58                   ` Eric W. Biederman
@ 2004-07-29 14:08                   ` Martin J. Bligh
  2004-07-29 15:52                     ` Eric W. Biederman
  2004-07-29 17:12                   ` Matthias Urlichs
  2 siblings, 1 reply; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-29 14:08 UTC (permalink / raw)
  To: Andrew Morton, Eric W. Biederman
  Cc: alan, suparna, fastboot, jbarnes, linux-kernel

> We really want to get into the new kernel ASAP and clean stuff up from
> in there.

As long as the "init" routines are run on every startup (not just kexec ones),
they should get plenty of testing (though not from bad card state).

I still think we could share code by running the shutdown routines from 
the *new* kernel  before trying to init the card if they're written in a 
robust way so as to allow it ... is that insane?

M.

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29  1:05                   ` Eric W. Biederman
@ 2004-07-29 14:12                     ` Martin J. Bligh
  0 siblings, 0 replies; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-29 14:12 UTC (permalink / raw)
  To: Eric W. Biederman, Alan Cox
  Cc: Andrew Morton, suparna, fastboot, Linux Kernel Mailing List,
	jbarnes

> At that point you might get a corrupt DUMP. But it is extremely
> unlikely any DMA transactions will touch your reserved memory.
> Only the most buggy drivers or hardware will be running
> DMA to the invalid addresses.  At which point there is
> very little you can do in any event.

Nothing we do is ever going to give us 100% perfect crashdumps from any
given situation. I think we need to just accept that and take care not
to destroy anything important (ie disk data), and try to get the best
info into the dump we can.

M


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29  1:56                       ` Eric W. Biederman
@ 2004-07-29 14:18                         ` Martin J. Bligh
  2004-07-29 16:01                           ` Eric W. Biederman
  2004-07-29 16:19                           ` Eric W. Biederman
  0 siblings, 2 replies; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-29 14:18 UTC (permalink / raw)
  To: Eric W. Biederman, Andrew Morton
  Cc: suparna, fastboot, jbarnes, alan, linux-kernel

> Well if calling shutdown is not really usable, then we had better
> transition quickly beyond using it...
>  
>> [*] At least, I _assume_ the 16MB will be prereserved,
>>     physically-contiguous and wholly within ZONE_NORMAL.  Is this wrong?
> 
> The problem is that we really won't be using it for running code out
> of because of i386 kernel limitations.  Unless someone can tell
> me why 0-16MB won't have DMA traffic in them.  Or how to run a kernel
> at an address other than 1MB.
> 
> I suspect we can play with the initial page tables and how virtual
> addresses map to physical addresses and fairly simply generate a
> relocatable kernel.  I have not had a chance to investigate that
> though.  Once we have that it will be trivial to run out of the
> reserved 16M and many of the practical problems melt away.

IIRC, what Adam did is to relocate the bottom 16MB of mem into the
reserved buffer and execute into the bottom 16MB. Yes, that probably does
leave some DMA issues that we should fix up as you suggest above, but I
think it's good enough for a first pass at the problem.

M.
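
The scheme described above, as I read it, is a save-then-overwrite of the
low region. A minimal sketch on ordinary byte buffers, assuming the dump
kernel image does not overlap the low region before the copy. The function
name and parameters are hypothetical and this is not Adam's code:

```c
#include <stddef.h>
#include <string.h>

/*
 * Preserve the crashed kernel's low memory in the reserved buffer, then
 * place the dump kernel at the low addresses it was linked to run from.
 * Returns 0 on success, -1 if the reserve cannot hold the low region or
 * the dump kernel does not fit in it.
 */
int swap_in_dump_kernel(unsigned char *low, size_t low_size,
                        unsigned char *reserve, size_t reserve_size,
                        const unsigned char *dump_kernel, size_t kernel_size)
{
    if (reserve_size < low_size || kernel_size > low_size)
        return -1;

    /* Old contents stay available to the dumper via the reserve. */
    memcpy(reserve, low, low_size);

    /* The new kernel now lives where in-flight DMA may still land. */
    memcpy(low, dump_kernel, kernel_size);
    return 0;
}
```

The second comment is the remaining DMA exposure mentioned above: the
saved copy is safe in the reserve, but the running dump kernel is not.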


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 14:00                   ` Alan Cox
@ 2004-07-29 15:47                     ` Eric W. Biederman
  0 siblings, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29 15:47 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, suparna, fastboot, mbligh,
	Linux Kernel Mailing List, Jesse Barnes

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Iau, 2004-07-29 at 02:12, Eric W. Biederman wrote:
> > Or those devices that hang the machine when you clear it.
> 
> There are none. It's required by the PCI spec and used by BIOS vendors
> during the boot sequence. So it's a *tested* approach.

Enabling is required.  Clearing is not.  The particular instance I was
thinking of was disabling memory access and leaving I/O enabled.  
 
> > And there is the fact that the pci configuration access methods
> > are frequently BIOS calls.
> 
> You will be running bios code on some systems every time you read
> the cmos clock, every time you touch pci config space, every time
> you hit a key, even in your new kernel boot up path - what's your
> point?

Only that in many instances BIOS code can do things we don't expect.
And when we start out with the machine in an unknown state the
risk is worse.

> > So I do see just clearing the master bit on each PCI device to be
> > as dangerous as calling the shutdown methods.
> 
> Then we violently disagree

yes.

Eric

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 14:08                   ` Martin J. Bligh
@ 2004-07-29 15:52                     ` Eric W. Biederman
  2004-07-29 16:13                       ` Martin J. Bligh
  0 siblings, 1 reply; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29 15:52 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Andrew Morton, suparna, fastboot, jbarnes, alan, linux-kernel

"Martin J. Bligh" <mbligh@aracnet.com> writes:

> > We really want to get into the new kernel ASAP and clean stuff up from
> > in there.
> 
> As long as the "init" routines are run on every startup (not just kexec ones),
> they should get plenty of testing (though not from bad card state).

And I know for a fact that many init routines won't initialize a
card in a bad state currently.  That is my most frequent failure in
the normal kexec case, when things are not in a 
 
> I still think we could share code by running the shutdown routines from 
> the *new* kernel  before trying to init the card if they're written in a 
> robust way so as to allow it ... is that insane?

As a rough feel yes that is sane.  Redundant but sane.  I would
like to hear what greg thinks of it though.

Eric

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 14:18                         ` Martin J. Bligh
@ 2004-07-29 16:01                           ` Eric W. Biederman
  2004-07-29 16:19                           ` Eric W. Biederman
  1 sibling, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29 16:01 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Andrew Morton, suparna, fastboot, jbarnes, alan, linux-kernel

"Martin J. Bligh" <mbligh@aracnet.com> writes:

> IIRC, what Adam did is to relocate the bottom 16MB of mem into the
> reserved buffer and execute into the bottom 16MB. Yes, that probably does
> leave some DMA issues that we should fix up as you suggest above, but I
> think it's good enough for a first pass at the problem.

Probably.  I have witnessed network RX causing memory corruption,
before the kexec code started downing the network interfaces on
the user space side. I suspect data capture from sound cards or
video capture cards would have the same issue.  

The way I have observed this in the past is to kexec memtest86,
on a machine with known good memory, and then attempt to ping it :)

What especially worries me about the low 16MB is that it is the
DMA zone for ISA devices.  Old sound cards in particular.  Most
of that is output but....   

Eric

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 15:52                     ` Eric W. Biederman
@ 2004-07-29 16:13                       ` Martin J. Bligh
  0 siblings, 0 replies; 60+ messages in thread
From: Martin J. Bligh @ 2004-07-29 16:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, suparna, fastboot, jbarnes, alan, linux-kernel

>> > We really want to get into the new kernel ASAP and clean stuff up from
>> > in there.
>> 
>> As long as the "init" routines are run on every startup (not just kexec ones),
>> they should get plenty of testing (though not from bad card state).
> 
> And I know for a fact that many init routines won't initialize a
> card in a bad state currently.  That is my most frequent failure in
> the normal kexec case, when things are not in a 

Oh, yes, I know that ... I'm just saying we should fix it ;-)

M.


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 14:18                         ` Martin J. Bligh
  2004-07-29 16:01                           ` Eric W. Biederman
@ 2004-07-29 16:19                           ` Eric W. Biederman
  1 sibling, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-29 16:19 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Andrew Morton, suparna, fastboot, jbarnes, alan, linux-kernel

"Martin J. Bligh" <mbligh@aracnet.com> writes:

> IIRC, what Adam did is to relocate the bottom 16MB of mem into the
> reserved buffer and execute into the bottom 16MB. Yes, that probably does
> leave some DMA issues that we should fix up as you suggest above, but I
> think it's good enough for a first pass at the problem.

I guess I don't have a problem with that as long as I don't have to
chase the bugs.  

If I do have to chase the bugs I would rather call the shutdown methods
and just say things don't _yet_ work in cases where it is not safe
to call them.

I guess I just like to be able to easily explain what does not work _yet_.

With running at the same addresses we have a rogue cpu problem as
well.  If we don't kill the cpu before the new kernel starts what happens
if it starts executing some random bit of the new code...

Eric

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 23:44                 ` Andrew Morton
  2004-07-29  0:58                   ` Eric W. Biederman
  2004-07-29 14:08                   ` Martin J. Bligh
@ 2004-07-29 17:12                   ` Matthias Urlichs
  2 siblings, 0 replies; 60+ messages in thread
From: Matthias Urlichs @ 2004-07-29 17:12 UTC (permalink / raw)
  To: linux-kernel

Hi, Andrew Morton wrote:

> We really want to get into the new kernel ASAP and clean stuff up from
> in there.

Besides: when I want a crash dump, I want to dump the system state as of
the moment it crashed and not as of $RANDOM_STUFF later, well-intentioned
or not.

Let's reset the devices to a known state from within that kernel. If it
can't do that for some reason, it should reboot the system through the
BIOS (which had better be able to do it), and we're no worse off than
before.

-- 
Matthias Urlichs

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 13:57                     ` Alan Cox
@ 2004-07-29 18:17                       ` Andrew Morton
  2004-07-29 21:20                         ` Alan Cox
  2004-07-31 13:52                         ` Matthias Urlichs
  2004-07-30 15:24                       ` Olivier Galibert
  1 sibling, 2 replies; 60+ messages in thread
From: Andrew Morton @ 2004-07-29 18:17 UTC (permalink / raw)
  To: Alan Cox; +Cc: ebiederm, suparna, fastboot, mbligh, jbarnes, linux-kernel

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> On Iau, 2004-07-29 at 01:22, Andrew Morton wrote:
> > eh?  People do care.  The point here is that we should stop the DMA in the
> > dump kernel, not from within the broken kernel.
> 
> And pray just how do you expect to prove that the dump kernel isn't
> being overwritten *as* it is being loaded?

It was preloaded.

Of course, there's an assumption here that the dead kernel doesn't scribble
on pages which were never available to its page allocator.  If DMA somehow
goes off and scribbles on the dump kernel we lose.

> > btw, if we simply insert a five-second-pause, what problems does that
> > leave?  Network Rx, which is OK.  Disk writes will have completed (?). 
> > What remains?
> 
> Network RX is the obvious one since we've no idea where the DMA is
> going in memory.

See above.  We assume that network RX DMA won't be scribbling in the 16MB
which was pre-reserved.  That's reasonable.  We _have_ to assume that.

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 18:17                       ` Andrew Morton
@ 2004-07-29 21:20                         ` Alan Cox
  2004-07-29 22:30                           ` Gerrit Huizenga
  2004-07-31 13:52                         ` Matthias Urlichs
  1 sibling, 1 reply; 60+ messages in thread
From: Alan Cox @ 2004-07-29 21:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: ebiederm, suparna, fastboot, mbligh, jbarnes,
	Linux Kernel Mailing List

On Iau, 2004-07-29 at 19:17, Andrew Morton wrote:
> Of course, there's an assumption here that the dead kernel doesn't scribble
> on pages which were never available to its page allocator.  If DMA somehow
> goes off and scribbles on the dump kernel we lose.

If the new kernel image starts with an SHA hash check including the
SHA hash check code that can be pretty robust as a sanity check.

> See above.  We assume that network RX DMA won't be scribbling in the 16MB
> which was pre-reserved.  That's reasonable.  We _have_ to assume that.

Ok


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 21:20                         ` Alan Cox
@ 2004-07-29 22:30                           ` Gerrit Huizenga
  2004-07-30  0:04                             ` Eric W. Biederman
  0 siblings, 1 reply; 60+ messages in thread
From: Gerrit Huizenga @ 2004-07-29 22:30 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, suparna, fastboot, ebiederm, mbligh, jbarnes,
	Linux Kernel Mailing List

On Thu, 29 Jul 2004 22:20:13 BST, Alan Cox wrote:
> 
> On Iau, 2004-07-29 at 19:17, Andrew Morton wrote:
> > Of course, there's an assumption here that the dead kernel doesn't scribble
> > on pages which were never available to its page allocator.  If DMA somehow
> > goes off and scribbles on the dump kernel we lose.
> 
> If the new kernel image starts with an SHA hash check including the
> SHA hash check code that can be pretty robust as a sanity check.
> 
> > See above.  We assume that network RX DMA won't be scribbling in the 16MB
> > which was pre-reserved.  That's reasonable.  We _have_ to assume that.
> 
> Ok

Okay, I may be confused a bit but I *thought* kexec was going to
load the thin, new kernel (e.g. read from disk operations, which is
better than write to disk operations from the sick kernel).

This concept of having it pre-loaded sounds interesting, protecting
it from being written on doesn't bother me much, but why *not* read
it from disk/filesystem and then use the SHA hash in the newly
loaded & exec'd kernel to make sure that what we loaded was sane?

That sounds simpler than changing the kernel load process around,
ensuring you have the new kexec'd kernel built and loaded, etc.
At least it sounds simpler and more in line with using kexec for
fastboot as well.

gerrit

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-30  0:04                             ` Eric W. Biederman
@ 2004-07-29 23:25                               ` Alan Cox
  2004-07-30  4:07                                 ` Eric W. Biederman
  2004-07-30 12:38                               ` Gerrit Huizenga
  1 sibling, 1 reply; 60+ messages in thread
From: Alan Cox @ 2004-07-29 23:25 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Gerrit Huizenga, Andrew Morton, suparna, fastboot, mbligh,
	jbarnes, Linux Kernel Mailing List

On Gwe, 2004-07-30 at 01:04, Eric W. Biederman wrote:
> The beauty of kexec is all of these fun things become user 
> problems from the point of the view of the sick kernel so
> it does not need to worry about them.
> 
> I will be happy to see a SHA patch for /sbin/kexec.  

crypto/sha1.c provides all the code you need.


* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 22:30                           ` Gerrit Huizenga
@ 2004-07-30  0:04                             ` Eric W. Biederman
  2004-07-29 23:25                               ` Alan Cox
  2004-07-30 12:38                               ` Gerrit Huizenga
  0 siblings, 2 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-30  0:04 UTC (permalink / raw)
  To: Gerrit Huizenga
  Cc: Alan Cox, Andrew Morton, suparna, fastboot, mbligh, jbarnes,
	Linux Kernel Mailing List

Gerrit Huizenga <gh@us.ibm.com> writes:

> On Thu, 29 Jul 2004 22:20:13 BST, Alan Cox wrote:
> > 
> > On Iau, 2004-07-29 at 19:17, Andrew Morton wrote:
> > > Of course, there's an assumption here that the dead kernel doesn't scribble
> > > on pages which were never available to its page allocator.  If DMA somehow
> > > goes off and scribbles on the dump kernel we lose.
> > 
> > If the new kernel image starts with an SHA hash check including the
> > SHA hash check code that can be pretty robust as a sanity check.
> > 
> > > See above.  We assume that network RX DMA won't be scribbling in the 16MB
> > > which was pre-reserved.  That's reasonable.  We _have_ to assume that.
> > 
> > Ok
> 
> Okay, I may be confused a bit but I *thought* kexec was going to
> load the thin, new kernel (e.g. read from disk operations, which is
> better than write to disk operations from the sick kernel).

/sbin/kexec will load it with sys_kexec_load, before the kernel becomes
sick.
 
> This concept of having it pre-loaded sounds interesting, protecting
> it from being written on doesn't bother me much, but why *not* read
> it from disk/filesystem and then use the SHA hash in the newly
> loaded & exec'd kernel to make sure that what we loaded was sane?

Exactly.  That is where the SHA hash and all of the features will
go in the new ``kernel''.  What we exec is an arbitrary
stand-alone program.  I suspect a SHA hash generator and checker
is something we can easily add as a wrapper. 

> That sounds simpler than changing the kernel load process around,
> ensuring you have the new kexec'd kernel built and loaded, etc.
> At least it sounds simpler and more in line with using kexec for
> fastboot as well.

The only process that is going to be changed around is where
we store the kernel before we transfer control to it, and when and
how that transfer of control happens.

The beauty of kexec is all of these fun things become user 
problems from the point of the view of the sick kernel so
it does not need to worry about them.

I will be happy to see a SHA patch for /sbin/kexec.  

Eric

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 23:25                               ` Alan Cox
@ 2004-07-30  4:07                                 ` Eric W. Biederman
  0 siblings, 0 replies; 60+ messages in thread
From: Eric W. Biederman @ 2004-07-30  4:07 UTC (permalink / raw)
  To: Alan Cox
  Cc: Gerrit Huizenga, Andrew Morton, suparna, fastboot, mbligh,
	jbarnes, Linux Kernel Mailing List

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Gwe, 2004-07-30 at 01:04, Eric W. Biederman wrote:
> > The beauty of kexec is all of these fun things become user 
> > problems from the point of the view of the sick kernel so
> > it does not need to worry about them.
> > 
> > I will be happy to see a SHA patch for /sbin/kexec.  
> 
> crypto/sha1.c provides all the code you need.

Yep that is the easy part, finding a sha1 implementation.  The
interesting part is the logic that hooks in the code and computes and
checks the hash.  Especially with the area the sha1 code checks
including the sha1 check code :)

Eric
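
The hook logic discussed here can be sketched without committing to a
particular hash. A minimal sketch, assuming the digest is recorded at load
time and kept outside the hashed region, while the region itself covers
both the checker code and the kernel. 64-bit FNV-1a is used purely as a
short stand-in for SHA-1 and is not cryptographically strong; the struct
and function names are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a. Stand-in for SHA-1 so the sketch stays short. */
uint64_t fnv1a64(const unsigned char *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ull;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ull;               /* FNV prime */
    }
    return h;
}

/* Hypothetical description of a preloaded image. */
struct preloaded_image {
    const unsigned char *region;  /* checker code + kernel, hashed together */
    size_t len;
    uint64_t expected;            /* digest from load time, outside region */
};

/* Recompute the digest over the whole region and compare with the record. */
bool image_intact(const struct preloaded_image *img)
{
    return fnv1a64(img->region, img->len) == img->expected;
}
```

Keeping `expected` outside the hashed bytes avoids the circularity of a
region that has to contain its own digest. Stray DMA into the checker code
itself can still defeat the check, so this is a sanity test, not a proof.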

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-30  0:04                             ` Eric W. Biederman
  2004-07-29 23:25                               ` Alan Cox
@ 2004-07-30 12:38                               ` Gerrit Huizenga
  1 sibling, 0 replies; 60+ messages in thread
From: Gerrit Huizenga @ 2004-07-30 12:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Andrew Morton, suparna, fastboot, mbligh, jbarnes,
	Linux Kernel Mailing List

On 29 Jul 2004 18:04:06 MDT, Eric W. Biederman wrote:
> Gerrit Huizenga <gh@us.ibm.com> writes:
> > 
> > Okay, I may be confused a bit but I *thought* kexec was going to
> > load the thin, new kernel (e.g. read from disk operations, which is
> > better than write to disk operations from the sick kernel).
> 
> /sbin/kexec will load it with sys_kexec_load, before the kernel becomes
> sick.
>  

...

> > That sounds simpler than changing the kernel load process around,
> > ensuring you have the new kexec'd kernel built and loaded, etc.
> > At least it sounds simpler and more in line with using kexec for
> > fastboot as well.
> 
> The only process that is going to be changed around is where
> we store the kernel before we transfer control to it, and when and
> how that transfer of control happens.

This is what I had missed - the separation of load and exec.

Thanks, Eric!

gerrit

* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 13:57                     ` Alan Cox
  2004-07-29 18:17                       ` Andrew Morton
@ 2004-07-30 15:24                       ` Olivier Galibert
  1 sibling, 0 replies; 60+ messages in thread
From: Olivier Galibert @ 2004-07-30 15:24 UTC (permalink / raw)
  To: Linux Kernel Mailing List

On Thu, Jul 29, 2004 at 02:57:09PM +0100, Alan Cox wrote:
> On Iau, 2004-07-29 at 01:22, Andrew Morton wrote:
> > btw, if we simply insert a five-second-pause, what problems does that
> > leave?  Network Rx, which is OK.  Disk writes will have completed (?). 
> > What remains?
> 
> Network RX is the obvious one since we've no idea where the DMA is
> going in memory.

Also audio and video capture on cyclic buffers can theoretically go on
forever sending irqs from time to time while they're at it.

  OG.



* Re: [Fastboot] Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-29 18:17                       ` Andrew Morton
  2004-07-29 21:20                         ` Alan Cox
@ 2004-07-31 13:52                         ` Matthias Urlichs
  1 sibling, 0 replies; 60+ messages in thread
From: Matthias Urlichs @ 2004-07-31 13:52 UTC (permalink / raw)
  To: linux-kernel

Hi, Andrew Morton wrote:

> See above.  We assume that network RX DMA won't be scribbling in the 16MB
> which was pre-reserved.  That's reasonable.  We _have_ to assume that.

If you wait a few seconds before verifying that the checksum of the
rescue kernel is still correct, then you should be able to be reasonably
sure that there won't be any corruption.

Nothing's 100% here, of course. But the chances that a delayed network DMA
causes the rescue kernel to write its dump data to an area it shouldn't
write to are small enough not to matter. IMHO.

-- 
Matthias Urlichs
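
[Editorial sketch, not part of the original message.] The "wait a few
seconds, then verify the checksum" idea can be illustrated in user space
as follows. This is only a model under stated assumptions: FNV-1a is used
as a small stand-in for the SHA-1 digest discussed elsewhere in the
thread, and the names region_checksum and verify_after_delay are
hypothetical. It shows the control flow, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* FNV-1a over a byte range; a stand-in for SHA-1, chosen for brevity. */
static uint64_t region_checksum(const unsigned char *region, size_t len)
{
    uint64_t hash = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= region[i];
        hash *= 0x100000001b3ULL;            /* FNV-1a 64-bit prime */
    }
    return hash;
}

/*
 * Sleep for settle_seconds so that any DMA still in flight has a chance
 * to complete, then recompute the checksum of the reserved region and
 * compare it with the value recorded when the rescue kernel was loaded.
 * Returns 1 if the region is unchanged, 0 if it was modified.
 */
static int verify_after_delay(const unsigned char *region, size_t len,
                              uint64_t expected, unsigned int settle_seconds)
{
    assert(region != NULL);
    if (settle_seconds > 0)
        sleep(settle_seconds);
    return region_checksum(region, len) == expected;
}
```

A caller would record region_checksum() at load time and call
verify_after_delay() on the failure path before transferring control.
As noted above, this narrows the window for corruption but cannot rule
out a device that keeps writing after the delay.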


Thread overview: 60+ messages
2004-07-22 16:19 Announce: dumpfs v0.01 - common RAS output API Keith Owens
2004-07-26  6:57 ` Andrew Morton
2004-07-28  1:53   ` Eric W. Biederman
2004-07-28 10:54     ` Suparna Bhattacharya
2004-07-28 10:46       ` [Fastboot] " Alan Cox
2004-07-28 14:38         ` Martin J. Bligh
2004-07-28 14:06           ` Alan Cox
2004-07-28 15:21             ` Martin J. Bligh
2004-07-28 15:56               ` Eric W. Biederman
2004-07-28 17:09               ` Andrea Arcangeli
2004-07-28 16:05             ` Eric W. Biederman
2004-07-28 15:42         ` Eric W. Biederman
2004-07-28 15:12       ` Eric W. Biederman
2004-07-28 15:23         ` Martin J. Bligh
2004-07-28 15:53           ` Eric W. Biederman
2004-07-28 16:03     ` Jesse Barnes
2004-07-28 18:00       ` Eric W. Biederman
2004-07-28 18:06         ` Jesse Barnes
2004-07-28 19:42           ` Martin J. Bligh
2004-07-28 19:56             ` [Fastboot] " Alan Cox
2004-07-28 19:44           ` Andrew Morton
2004-07-28 23:11             ` [Fastboot] " Eric W. Biederman
2004-07-28 22:53               ` Alan Cox
2004-07-29  1:12                 ` Eric W. Biederman
2004-07-29 14:00                   ` Alan Cox
2004-07-29 15:47                     ` Eric W. Biederman
2004-07-28 18:21         ` Alan Cox
2004-07-28 19:23       ` Martin J. Bligh
2004-07-28 20:28         ` [Fastboot] " Eric W. Biederman
2004-07-28 20:33           ` Andrew Morton
2004-07-28 19:59             ` Alan Cox
2004-07-28 22:42               ` Andrew Morton
2004-07-28 22:44                 ` Jesse Barnes
2004-07-28 23:17               ` Eric W. Biederman
2004-07-28 22:55                 ` Alan Cox
2004-07-29  0:22                   ` Andrew Morton
2004-07-29 13:57                     ` Alan Cox
2004-07-29 18:17                       ` Andrew Morton
2004-07-29 21:20                         ` Alan Cox
2004-07-29 22:30                           ` Gerrit Huizenga
2004-07-30  0:04                             ` Eric W. Biederman
2004-07-29 23:25                               ` Alan Cox
2004-07-30  4:07                                 ` Eric W. Biederman
2004-07-30 12:38                               ` Gerrit Huizenga
2004-07-31 13:52                         ` Matthias Urlichs
2004-07-30 15:24                       ` Olivier Galibert
2004-07-29  1:05                   ` Eric W. Biederman
2004-07-29 14:12                     ` Martin J. Bligh
2004-07-28 23:44                 ` Andrew Morton
2004-07-29  0:58                   ` Eric W. Biederman
2004-07-29  1:09                     ` Andrew Morton
2004-07-29  1:56                       ` Eric W. Biederman
2004-07-29 14:18                         ` Martin J. Bligh
2004-07-29 16:01                           ` Eric W. Biederman
2004-07-29 16:19                           ` Eric W. Biederman
2004-07-29 14:08                   ` Martin J. Bligh
2004-07-29 15:52                     ` Eric W. Biederman
2004-07-29 16:13                       ` Martin J. Bligh
2004-07-29 17:12                   ` Matthias Urlichs
  -- strict thread matches above, loose matches on Subject: below --
2004-07-22 15:42 Dan Kegel
