public inbox for linux-kernel@vger.kernel.org
* re: Announce: dumpfs v0.01 - common RAS output API
@ 2004-07-22 15:42 Dan Kegel
  0 siblings, 0 replies; 11+ messages in thread
From: Dan Kegel @ 2004-07-22 15:42 UTC (permalink / raw)
  To: Linux Kernel Mailing List, kaos

Keith Owens <kaos@sgi.com> wrote:
> Announcing dumpfs - a common API for all the RAS code that wants to
> save data during a kernel failure and to extract that RAS data on the
> next boot.  The documentation file is appended to this mail.
> ...

I looked, but couldn't see any definition for RAS in your doc.
Could you add one?
The fs/Kconfig hunk might be a nice place to define it, since
naive users might see that text when configuring kernels.

http://www.kernelnewbies.org/glossary/#R does define it,
but it's so far down on
http://www.google.com/search?q=define%3Aras
that most people configuring a kernel might not be familiar with that sense.
- Dan

-- 
My technical stuff: http://kegel.com
My politics: see http://www.misleader.org for examples of why I'm for regime change

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Announce: dumpfs v0.01 - common RAS output API
@ 2004-07-22 16:19 Keith Owens
  2004-07-26  6:57 ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Owens @ 2004-07-22 16:19 UTC (permalink / raw)
  To: linux-kernel

Announcing dumpfs - a common API for all the RAS code that wants to
save data during a kernel failure and to extract that RAS data on the
next boot.  The documentation file is appended to this mail.

ftp://oss.sgi.com/projects/kdb/download/dumpfs - current version is
v0.01, patch against 2.6.8-rc2.

This is a work in progress; the code is not complete and is subject to
change without notice.

dumpfs-v0.01 handles mounting the dumpfs partitions, including reliable
sharing with swap partitions and clearing the dumpfs partitions.  I am
working on the code that reads and writes dumpfs data from kernel
space; it is incomplete and has not been tested yet.  After
dumpfs_kernel is working, dumpfs_user is trivial.  The code is proof of
concept: some sections of the API (including polled I/O and data
compression) are not supported yet, and some of the code is ugly.

Why announce incomplete and untested code?  Mainly because RAS and
kernel dumping are being discussed at OLS this week.  Since I cannot be
at OLS, this is the next best thing.  Also the dumpfs API has
stabilized for the first cut, so it is time to get more discussion on
the API and to determine if it is worth continuing with the dumpfs
approach.  If dumpfs is discussed at OLS then I would appreciate any
feedback.

Questions for the other people who care about RAS (which rules out most
of the kernel developers) -

* Is using a common dump API the right thing to do?

  Obviously I think that this makes sense.  At the moment every bit of
  RAS code has its own dedicated I/O mechanism, not to mention its own
  user space tools to interface with the kernel, and to initialize,
  extract and clear its own data.

  dumpfs consolidates a lot of common code that is scattered over
  several RAS tools.  dumpfs removes the need for special RAS tools to
  extract dump data on reboot, instead standard user space commands
  will do the job.

* Is overloading mount the best approach?

  Making mount dumpfs share the partition with swap is ugly.  OTOH most
  of the existing code that dumpfs is intended to replace makes no
  attempt to verify its partition usage.  At least dumpfs tries to
  verify its partition data, ugly though the code is.

* Does the dumpfs API need to be extended or even replaced, either in
  kernel or in user space?

  One obvious extension is to make compression selective, so that some
  sections of the file can be compressed and others left in clear text.
  The lcrash header springs to mind.  Omitted for now since this
  version does not support compression yet.

* How do we get a clean API to do polling mode I/O to disk?

  One thing that is absolutely required for reliable RAS output is a
  polling mode method.  netdump is available for the network, we need
  the equivalent for disk I/O.  What is the best way to integrate
  polling mode I/O into the block device subsystem?

If the people who care about RAS think that a common RAS output API is
worthwhile then I will continue working on dumpfs.  Otherwise it will
be just another idea that did not get taken up, and each RAS tool will
continue to be developed and maintained in isolation.


==== 2.6.8-rc2/Documentation/filesystems/dumpfs.txt ====

dumpfs provides a common API for RAS components that need to dump kernel data
during a problem.  The dumped data is expected to be copied and cleared on the
next successful boot.

dumpfs consists of two layers, with completely different semantics.  These are
dumpfs (kernel only) and dumpfs_user (user space view of any saved dump data).

dumpfs uses one mount for each dump partition.  Each dumpfs partition can be
mounted with option share or noshare; the default is noshare.  The only
allowable user space operations on a dumpfs partition are mount and umount;
user space cannot directly access the dumpfs data.  Each dumpfs partition is
mounted with "mount -t dumpfs /dev/partition /mnt/dumpfs".  /mnt/dumpfs must
be a directory; it never contains anything useful but the mount semantics
require a directory here.

A shared dumpfs partition will normally coexist with a swap partition; the
dumpfs superblock is stored at an offset which leaves the swap signature alone.
A shared dump partition has no superblock on disk until the first dump file is
created.  Mounting a dumpfs partition with "-o clear" will completely zero the
dumpfs superblock, including the magic field.  This ensures that old dumpfs data
in a shared partition will not be used; its contents are unreliable because of
the data sharing.

When mounting a shared dumpfs partition, no check is made to see if the disk
contains a dumpfs superblock.  Mounting a dumpfs partition with -o share will
only share with a swap partition; it will not share with any other mounted
partition.

A non-shared dumpfs partition must have a superblock before being mounted.
mkfs.dumpfs and fsck.dumpfs (only used for non-shared partitions) are trivial.
Mounting dumpfs with "-o noshare,clear" will clear the metadata in the dumpfs
superblock, but preserve the magic field.

mkfs.dumpfs

#!/bin/sh
dd if=/dev/zero of="$1" bs=64k count=1
echo 'dum0' | dd of="$1" bs=64k seek=1 conv=sync

fsck.dumpfs

#!/bin/sh
true

Each dumpfs partition can be mounted with option poll or nopoll; the default is
poll.  Poll uses low level polled mode I/O direct to the partition, completely
bypassing the normal interrupt driven code.  This is done in an attempt to get
the data out to disk even when the kernel is so badly broken that interrupts are
not working.  Poll requires that the device driver for the dumpfs partition
supports polling mode I/O.  Nopoll uses the standard kernel I/O mechanisms, so
it is not guaranteed to work when the kernel is crashing.  Nopoll should only be
used when your device driver does not support polling mode I/O yet; you must
accept that dumpfs may hang waiting for the I/O to be serviced.

Another option when mounting a dumpfs partition is to specify the size of its
data buffer, in kibibytes.  This buffer is permanently allocated as long as the
dumpfs partition is mounted; it is only used when writing RAS data via dumpfs.
The buffer size will be rounded up to a multiple of the kernel page size.  The
default is buffer=128.


The user space view of the RAS data held in the dumpfs partitions is created by
"mount -t dumpfs_user none /mnt/dumpfs".  It logically merges and validates all
the dumpfs partitions that have been mounted and provides a user space view of
the files that have been written to dumpfs.  The only user space operations
supported on dumpfs_user are llseek, read, readdir, open (read only), close and
unlink.  Just enough to copy the files out of dumpfs_user and remove them.  User
space cannot write to dumpfs_user.

The kernel can write to files held in dumpfs partitions, to save RAS data over a
reboot.  Note that when kernel RAS components write to dumpfs they do _not_ use
the normal VFS layer, which may not be working during a failure.  Instead a RAS
component makes direct calls to the following dumpfs_kernel functions.

dumpfs_kernel_open("prefix", flags)

  Create and open for writing a file in dumpfs.  It returns a file descriptor
  within dumpfs.

  The dumpfs filename is constructed from "prefix-" followed by the value of
  xtime in the format CCYY-MM-DD-hh:mm:ss.n, where n starts at 0 and is
  incremented for each dumpfs file in the current boot.

  There is no requirement that a dumpfs_user mount point exist before the kernel
  can dump its data.  The first call to dumpfs_kernel_open will automatically
  create a kernel view that merges all the mounted dumpfs partitions.  The first
  call to dumpfs_kernel_open also writes the dumpfs superblocks to any shared
  partitions.

  Flags select compression, if any.

  dumpfs_kernel_open() is the simple interface.  It automatically stripes the
  data across all dumpfs partitions that are not currently being used.

  Most RAS code will open one dump file at a time, mainly because most users
  will only have one dumpfs partition.  The dumpfs code has a module_param called
  dumpfs_max_open, with a default value of 1.
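
  A sketch of the naming scheme, assuming the timestamp is rendered with
  strftime-style formatting; dumpfs_make_name() is an invented helper for
  illustration, not a function from the patch:

```c
#include <stdio.h>
#include <time.h>

/* Build a dumpfs filename: "prefix-CCYY-MM-DD-hh:mm:ss.n".  'when'
 * stands in for xtime and 'n' is the per-boot sequence number. */
void dumpfs_make_name(char *buf, size_t len,
		      const char *prefix, time_t when, int n)
{
	char stamp[32];
	struct tm tm;

	gmtime_r(&when, &tm);
	strftime(stamp, sizeof(stamp), "%Y-%m-%d-%H:%M:%S", &tm);
	snprintf(buf, len, "%s-%s.%d", prefix, stamp, n);
	/* e.g. "oops-2004-07-22-16:19:00.0" */
}
```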

dumpfs_kernel_bdev_list()
dumpfs_kernel_open_choose("prefix", flags, bdev_list)

  Some platforms may need to have multiple output streams open in parallel.  For
  example a system with large amounts of memory and multiple disks may wish to
  assign different sections of memory to each cpu and to write to separate
  partitions.

  dumpfs_kernel_bdev_list() returns the list of usable dumpfs partitions.  If
  all partitions are in use then the list is empty.

  dumpfs_kernel_open_choose() opens a file using only the selected bdev entries.

  Systems that use concurrent parallel dumps should set module_param
  dumpfs_max_open to a suitable value.

  Note: The following problems are inherently architecture and platform specific
  and are outside the scope of dumpfs.  That is not to say that we should not
  have an API for handling these problems on large systems, but it would be a
  separate API from dumpfs.

    Deciding which cpus to use for parallel dumping.
    Deciding which block devices each cpu should use.
    Getting the chosen cpus into the RAS code.
    Assigning the range of work to each cpu and each partition.
    Watching the dumping cpus for problems, recovering from those problems
      and reassigning the work to another cpu.
    Reconstructing the parallel dumps into a format for analysis.  dumpfs_user
      makes each dump file available to user space, but some code may be
      required to merge the separate files together.

dumpfs_kernel_close(fd)

  Sync the file's data to disk, close the file and update the dumpfs metadata.

dumpfs_kernel_write(fd, buffer, length)

  Write the buffer at the current dumpfs file location.  The data may or may not
  be written to disk immediately.  It returns the current location, including
  the data that was just written.

  For performance, the dumpfs data is striped over all the assigned partitions,
  in round robin.  The stripe unit is the minimum of the buffer= value across
  all the assigned partitions.
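
  The round-robin striping could map offsets to partitions along these
  lines (a sketch; dumpfs_stripe_map() is an invented name and the real
  patch may carve stripes differently):

```c
/* Map a file offset to (partition index, offset within that partition)
 * for round-robin striping.  stripe_unit is the minimum buffer= value
 * across the assigned partitions, nr_parts the number of partitions. */
void dumpfs_stripe_map(unsigned long offset,
		       unsigned long stripe_unit, int nr_parts,
		       int *part, unsigned long *part_off)
{
	unsigned long stripe = offset / stripe_unit;

	*part = stripe % nr_parts;
	*part_off = (stripe / nr_parts) * stripe_unit
		  + offset % stripe_unit;
}
```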

dumpfs_kernel_read(fd, buffer, length)

  Read the buffer from the current dumpfs file location.  It returns the current
  location, including the data that was just read.

dumpfs_kernel_llseek(fd, position)

  Set the current dumpfs file location.  It returns the previous location.  Only
  absolute seeking is supported.

dumpfs_kernel_sync(fd)

  Sync the file's data to disk and update the dumpfs metadata.

dumpfs_kernel_dirty_shared()

  Returns true if any shared partitions have been dirtied, in which case the
  kernel must be rebooted after all the RAS components have completed their
  work.

dumpfs_kernel_all_polled()

  Returns true if all dumpfs partitions can support polling mode I/O.  Otherwise
  the RAS code that calls dumpfs should enable interrupts, if at all possible.
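
Taken together, a RAS component's dump path looks roughly like the sketch
below.  The stubs are toy user space stand-ins so the example is
self-contained, and the return conventions are inferred from the
descriptions above; none of this code is from the patch itself.

```c
#include <string.h>

#define DUMPFS_COMPRESS_NONE 0	/* invented flag name */

/* Toy stand-ins for the dumpfs_kernel_* calls, writing into a memory
 * buffer instead of a dump partition. */
static char dump_data[4096];
static long dump_pos;

static int dumpfs_kernel_open(const char *prefix, int flags)
{
	(void)prefix; (void)flags;
	dump_pos = 0;
	return 1;		/* a file descriptor within dumpfs */
}

static long dumpfs_kernel_write(int fd, const void *buf, long len)
{
	(void)fd;
	memcpy(dump_data + dump_pos, buf, len);
	dump_pos += len;
	return dump_pos;	/* current location, per the API text */
}

static void dumpfs_kernel_close(int fd)
{
	(void)fd;		/* would sync data and update metadata */
}

/* What a RAS component's dump path might look like; returns the final
 * file location so the caller can sanity check it. */
long ras_dump_example(void)
{
	const char msg[] = "panic: example RAS record\n";
	int fd = dumpfs_kernel_open("oops", DUMPFS_COMPRESS_NONE);
	long pos = dumpfs_kernel_write(fd, msg, sizeof(msg) - 1);

	dumpfs_kernel_close(fd);
	return pos;
}
```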


Sample /etc/fstab entries for dumpfs partitions.

  /dev/sda2  /mnt/dumpfs  dumpfs  defaults  0 0
  /dev/sdb2  /mnt/dumpfs  dumpfs  share     0 0
  /dev/sdc7  /mnt/dumpfs  dumpfs  nopoll    0 0

Sample code in /etc/rc.sysinit to save dump data from the previous boot.  If you
are sharing dumpfs with swap, these commands must be executed before mounting
swap.  Note that dumpfs does not require any special user space tools to poke
inside partitions to see if there is any useful data to save; everything is a
file.

  # mount all the dumpfs partitions
  mount -a -t dumpfs
  # merge all dumpfs into dumpfs_user on /mnt/dump
  mount -t dumpfs_user none /mnt/dump
  # copy the data out
  (cd /mnt/dump; for f in `find . -type f`; do echo "saving $f"; mv "$f" /var/log/dump; done)
  # drop dumpfs_user
  umount /mnt/dump
  # clear all the dumpfs metadata
  umount -a -t dumpfs
  mount -a -t dumpfs -o clear
  umount -a -t dumpfs

rc.sysinit will later mount the swap partitions, then mount all the other
partition types.  That will remount the dumpfs partitions, ready for the next
kernel crash.



* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-22 16:19 Keith Owens
@ 2004-07-26  6:57 ` Andrew Morton
  2004-07-28  1:53   ` Eric W. Biederman
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2004-07-26  6:57 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

Keith Owens <kaos@sgi.com> wrote:
>
>  * How do we get a clean API to do polling mode I/O to disk?

We hope to not have to.  The current plan is to use kexec: at boot time, do
a kexec preload of a small (16MB) kernel image.  When the main kernel
crashes or panics, jump to the kexec kernel.  The kexec kernel will hold a
new device driver for /dev/hmem through which applications running under
the kexec'ed kernel can access the crashed kernel's memory.

Write the contents of /dev/hmem to stable storage using whatever device
drivers are in the kexec'ed kernel, then reboot into a real kernel again.

That's all pretty simple to do, and the quality of the platform's crash
dump feature will depend only upon the quality of the platform's kexec
support.

People have bits and pieces of this already - I'd hope to see candidate
patches within a few weeks.  The main participants are rddunlap, suparna
and mbligh.



* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-26  6:57 ` Andrew Morton
@ 2004-07-28  1:53   ` Eric W. Biederman
  2004-07-28 10:54     ` Suparna Bhattacharya
  2004-07-28 16:03     ` Jesse Barnes
  0 siblings, 2 replies; 11+ messages in thread
From: Eric W. Biederman @ 2004-07-28  1:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Keith Owens, linux-kernel, Suparna Bhattacharya, Martin J. Bligh,
	fastboot

Andrew Morton <akpm@osdl.org> writes:

> Keith Owens <kaos@sgi.com> wrote:
> >
> >  * How do we get a clean API to do polling mode I/O to disk?
> 
> We hope to not have to.  The current plan is to use kexec: at boot time, do
> a kexec preload of a small (16MB) kernel image.  When the main kernel
> crashes or panics, jump to the kexec kernel.  The kexec kernel will hold a
> new device driver for /dev/hmem through which applications running under
> the kexec'ed kernel can access the crashed kernel's memory.

Hmm.  I think this will require one of the kernels to run at a
non-default address in physical memory.

> Write the contents of /dev/hmem to stable storage using whatever device
> drivers are in the kexeced kernel, then reboot into a real kernel
> again.

And at this point I don't quite see why you would need /dev/hmem,
as opposed to just using /dev/mem.

Or will the crashing kernel save and compress the core dump to
somewhere in ram and the dump kernel read it out from there? 

> That's all pretty simple to do, and the quality of the platform's crash
> dump feature will depend only upon the quality of the platform's kexec
> support.

Which will largely depend on the quality of its device drivers...
 
> People have bits and pieces of this already - I'd hope to see candidate
> patches within a few weeks.  The main participants are rddunlap, suparna
> and mbligh.

I'm sorry I missed you then.  Unfortunately this is my busiest season at work
so I wasn't able to make it to OLS this year :(

Does anyone have a proof of concept implementation?  I have been able to find
a little bit of time for this kind of thing lately and have just done
the x86-64 port.  (You can all give me a hard time about taking a year
to get back to it :)  I am in the process of breaking everything up
into their individual change patches and doing a code review so I feel
comfortable with sending the code to Andrew.  So this would be a very
good time for me to look at any code for reporting a crash dump with
a kernel started with kexec.

Eric


* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28  1:53   ` Eric W. Biederman
@ 2004-07-28 10:54     ` Suparna Bhattacharya
  2004-07-28 16:03     ` Jesse Barnes
  1 sibling, 0 replies; 11+ messages in thread
From: Suparna Bhattacharya @ 2004-07-28 10:54 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Martin J. Bligh,
	fastboot

On Tue, Jul 27, 2004 at 07:53:01PM -0600, Eric W. Biederman wrote:
> Andrew Morton <akpm@osdl.org> writes:
> 
> > Keith Owens <kaos@sgi.com> wrote:
> > >
> > >  * How do we get a clean API to do polling mode I/O to disk?
> > 
> > We hope to not have to.  The current plan is to use kexec: at boot time, do
> > a kexec preload of a small (16MB) kernel image.  When the main kernel
> > crashes or panics, jump to the kexec kernel.  The kexec kernel will hold a
> > new device driver for /dev/hmem through which applications running under
> > the kexec'ed kernel can access the crashed kernel's memory.
> 
> Hmm.  I think this will require one of the kernels to run at a
> non-default address in physical memory.
> 
> > Write the contents of /dev/hmem to stable storage using whatever device
> > drivers are in the kexeced kernel, then reboot into a real kernel
> > again.
> 
> And at this point I don't quite see why you would need /dev/hmem,
> as opposed to just using /dev/mem.

This differs a little from your earlier suggestion of requiring
a kernel to run from a non-default address. Martin suggested simply
reserving about 16MB of area in advance, so that just before kexecing
the new kernel with mem=16M, we save the first 16MB away into the
reserved space. /dev/hmem (oldmem ?) is a view into the old kernel's
memory, as opposed to /dev/mem.

> 
> Or will the crashing kernel save and compress the core dump to
> somewhere in ram and the dump kernel read it out from there? 
> 
> > That's all pretty simple to do, and the quality of the platform's crash
> > dump feature will depend only upon the quality of the platform's kexec
> > support.
> 
> Which will largely depend on the quality of its device drivers...
>  
> > People have bits and pieces of this already - I'd hope to see candidate
> > patches within a few weeks.  The main participants are rddunlap, suparna
> > and mbligh.
> 
> I'm sorry I missed you then.  Unfortunately this is my busiest season at work
> so I wasn't able to make it to OLS this year :(
> 
> Does anyone have a proof of concept implementation?  I have been able to find

Yes, Hari has a nice POC implementation - it might make sense for him to post
it right away for you to take a look.  Basically, in addition to hmem (oldmem),
the upcoming kernel exports an ELF core view of the saved register and memory
state of the previous kernel as /proc/vmcore.prev (remember your suggestion
of using an ELF core file format for dump?), so one can use cp or scp to
save the core dump to disk.  He has a quick demo, where he uses gdb (unmodified)
to open the dump and show a stack trace of the dumping cpu.

Regards
Suparna

> a little bit of time for this kind of thing lately and have just done
> the x86-64 port.  (You can all give me a hard time about taking a year
> to get back to it :)  I am in the process of breaking everything up
> into their individual change patches and doing a code review so I feel
> comfortable with sending the code to Andrew.  So this would be a very
> good time for me to look at any code for reporting a crash dump with
> a kernel started with kexec.
> 
> Eric

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India



* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28  1:53   ` Eric W. Biederman
  2004-07-28 10:54     ` Suparna Bhattacharya
@ 2004-07-28 16:03     ` Jesse Barnes
  2004-07-28 18:00       ` Eric W. Biederman
  2004-07-28 19:23       ` Martin J. Bligh
  1 sibling, 2 replies; 11+ messages in thread
From: Jesse Barnes @ 2004-07-28 16:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	Martin J. Bligh, fastboot

On Tuesday, July 27, 2004 6:53 pm, Eric W. Biederman wrote:
> Hmm.  I think this will require one of the kernels to run at a
> non-default address in physical memory.

Right, and some platforms already support this, fortunately.

> Which will largely depend on the quality of its device drivers...

I think this could end up being a good thing.  It gives more people a stake in 
making sure that driver shutdown() routines work well.

Jesse


* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 16:03     ` Jesse Barnes
@ 2004-07-28 18:00       ` Eric W. Biederman
  2004-07-28 18:06         ` Jesse Barnes
  2004-07-28 19:23       ` Martin J. Bligh
  1 sibling, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2004-07-28 18:00 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	Martin J. Bligh, fastboot

Jesse Barnes <jbarnes@engr.sgi.com> writes:

> On Tuesday, July 27, 2004 6:53 pm, Eric W. Biederman wrote:
> > Hmm.  I think this will require one of the kernels to run at a
> > non-default address in physical memory.
> 
> Right, and some platforms already support this, fortunately.
> 
> > Which will largely depend on the quality of its device drivers...
> 
> I think this could end up being a good thing.  It gives more people a stake in 
> making sure that driver shutdown() routines work well.

Which actually is one of the items open for discussion currently.
For kexec on panic, do we want to run the shutdown() routines?

Eric


* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 18:00       ` Eric W. Biederman
@ 2004-07-28 18:06         ` Jesse Barnes
  2004-07-28 19:42           ` Martin J. Bligh
  2004-07-28 19:44           ` Andrew Morton
  0 siblings, 2 replies; 11+ messages in thread
From: Jesse Barnes @ 2004-07-28 18:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	Martin J. Bligh, fastboot

On Wednesday, July 28, 2004 11:00 am, Eric W. Biederman wrote:
> > I think this could end up being a good thing.  It gives more people a
> > stake in making sure that driver shutdown() routines work well.
>
> Which actually is one of the items open for discussion currently.
> For kexec on panic do we want to run the shutdown() routines?

We'll have to do something about incoming DMA traffic and other stuff that the
devices might be doing.  Maybe an arch-specific callout to do some chipset
stuff?

Jesse


* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 16:03     ` Jesse Barnes
  2004-07-28 18:00       ` Eric W. Biederman
@ 2004-07-28 19:23       ` Martin J. Bligh
  1 sibling, 0 replies; 11+ messages in thread
From: Martin J. Bligh @ 2004-07-28 19:23 UTC (permalink / raw)
  To: Jesse Barnes, Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	fastboot



--On Wednesday, July 28, 2004 09:03:37 -0700 Jesse Barnes <jbarnes@engr.sgi.com> wrote:

> On Tuesday, July 27, 2004 6:53 pm, Eric W. Biederman wrote:
>> Hmm.  I think this will require one of the kernels to run at a
>> non-default address in physical memory.
> 
> Right, and some platforms already support this, fortunately.
> 
>> Which will largely depend on the quality of its device drivers...
> 
> I think this could end up being a good thing.  It gives more people a stake in 
> making sure that driver shutdown() routines work well.

We discussed this at kernel summit a bit - it'd be safer to make the devices
clear down on boot up, rather than shutdown, if possible ... less work to
do on the unstable base.

Maybe we could shut down the devices on bringup, then bring them up again
(no I'm not kidding ;-)) ... that should reuse the code.

M.





* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 18:06         ` Jesse Barnes
@ 2004-07-28 19:42           ` Martin J. Bligh
  2004-07-28 19:44           ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Martin J. Bligh @ 2004-07-28 19:42 UTC (permalink / raw)
  To: Jesse Barnes, Eric W. Biederman
  Cc: Andrew Morton, Keith Owens, linux-kernel, Suparna Bhattacharya,
	fastboot

>> > I think this could end up being a good thing.  It gives more people a
>> > stake in making sure that driver shutdown() routines work well.
>> 
>> Which actually is one of the items open for discussion currently.
>> For kexec on panic do we want to run the shutdown() routines?
> 
> We'll have to do something about incoming DMA traffic and other stuff that the
> devices might be doing.  Maybe an arch-specific callout to do some chipset
> stuff?

I vote for sleeping for 5 seconds ;-) Should kill off most of it ...

M.



* Re: Announce: dumpfs v0.01 - common RAS output API
  2004-07-28 18:06         ` Jesse Barnes
  2004-07-28 19:42           ` Martin J. Bligh
@ 2004-07-28 19:44           ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2004-07-28 19:44 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: ebiederm, kaos, linux-kernel, suparna, mbligh, fastboot

Jesse Barnes <jbarnes@engr.sgi.com> wrote:
>
> On Wednesday, July 28, 2004 11:00 am, Eric W. Biederman wrote:
> > > I think this could end up being a good thing.  It gives more people a
> > > stake in making sure that driver shutdown() routines work well.
> >
> > Which actually is one of the items open for discussion currently.
> > For kexec on panic do we want to run the shutdown() routines?
> 
> We'll have to do something about incoming DMA traffic and other stuff that the
> devices might be doing.  Maybe an arch-specific callout to do some chipset
> stuff?
> 

Does ongoing DMA actually matter?  After all, the memory which is being
dma-ed into is pre-reserved and allocated for that purpose, and the dump
kernel won't be using it.

It would be polite to pause for a number of seconds to allow things to go
quiet, but apart from that I think all we need to ensure is that the
drivers in the dump kernel firmly whack the hardware before reinitialising
it?


