Re: Linux Kernel Dump Summit 2005


From: Maneesh Soni <maneesh@in.ibm.com>
To: OBATA Noboru <noboru.obata.ar@hitachi.com>
Cc: hyoshiok@miraclelinux.com, akpm@osdl.org, linux-kernel@vger.kernel.org
Subject: Re: Linux Kernel Dump Summit 2005
Date: Thu, 13 Oct 2005 11:19:40 +0530	[thread overview]
Message-ID: <20051013054940.GA15878@in.ibm.com> (raw)
In-Reply-To: <20051012.173043.41629248.noboru.obata.ar@hitachi.com>

On Wed, Oct 12, 2005 at 05:30:43PM +0900, OBATA Noboru wrote:
> On Tue, 11 Oct 2005, Hiro Yoshioka wrote:
> > 
> > The reasons are
> > 1) They have to maintain the dump tools and support their users.
> >    Many users are still using 2.4 kernels so merging kdump into 2.6
> >    kernel does not help them.
> > 2) Commercial Linux Distros (Red Hat/Suse/MIRACLE(Asianux)/Turbo etc) use
> >    LKCD/diskdump/netdump etc.
> >    Almost no users use a vanilla kernel so kdump does not have users yet.

As of now I can see Red Hat has put kexec/kdump in FC5 devel tree 
(rawhide), and hopefully it will be merged with FC5.

> Agreed.
> 
> I am testing (or tasting ;-) kdump myself, and find it really
> impressive and promising.  Thank you all who have worked on.
> 
> In terms of users, however, the majority of commercial users
> still use 2.4 kernels of commercial Linux distributions.  This
> is especially true for careful users who have large systems
> because switching to 2.6 kernels without regression is not an
> easy task.  So merging kdump into the mainline kernel does not
> directly mean that these users start using it now.
> 
> Rather, merging kdump has much meaning for commercial Linux
> distributors, who should be planning how and when to include
> kdump in their distros.
> 
> > > Is that a correct impression?  If so, what shortcoming(s) in kdump are
> > > causing people to be reluctant to use it?
> > 
> > I think the way to go is the kdump however it may take time.
> 
> Agreed.
> 
> I'd say commercial users are not reluctant to use kdump, but
> they are just waiting for kdump-ready distros.  So in turn, we
> still have some time left for improving kdump further before
> kdump-ready distros are shipped to users, and I would like to be
> involved in such improvement hereafter.
> 
> Thinking about the requirements in enterprise systems,
> challenges of kdump will be:
> 
>   - Reliability
>     + Hardware-related issues
> 
>   - Manageability
>     + Easy configuration
>     + Automated dump-capture and restart
>     + Time and space for capturing dump
>     + Handling two kernels
> 
>   - Flexibility
>     + Hook points before booting the 2nd kernel
> 
> My short impressions follow.  I understand that kdump/kexec
> developers are already discussing and working on some issues
> above, and I am grateful if someone tell me about the current
> status, or point me to the past lkml threads.

Many of the discussions are on the fastboot mailing list. As of now,
work is being done to port kdump to the x86_64 and ppc64 architectures
and to tackle the device initialization issues.

> 
> Reliability
> -----------
> 
> In terms of reliability, hardware-related issues, such as a
> device reinitialization problem, an ongoing DMA problem, and
> possibly a pending interrupts problem, must be carefully
> resolved.

As of now the idea is to tackle these issues on a per-driver basis,
as and when they are reported. It seems there may not be any generic
way to solve device initialization.
>
> Manageability
> -------------
> 
> As for manageability, it is nice if a user can easily setup
> kdump just by writing DEVICE=/dev/sdc6 to one's
> /etc/sysconfig/kdump and start the kdump service, for example.
> It is also desirable that an action taken after capturing a dump
> (halt, reboot, or poweroff) is configurable.  I believe these are
> userspace tasks.

These are user-space matters and mostly distro specific. There are,
though, some prototypes for automatically loading the second kernel
and automatically saving the captured dump using an initrd at
http://lse.sf.net/kdump/
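
To make the configuration idea above concrete, here is a minimal sketch
of how a user-space service might read such a file. It is only an
illustration under stated assumptions: DEVICE comes from the example
quoted above, while the ACTION key, its default, and the function name
parse_kdump_config are hypothetical and not an existing kdump interface.

```python
# Hypothetical sketch: parse a sysconfig-style kdump configuration.
# DEVICE comes from the example in this thread; ACTION and its values
# (halt, reboot, poweroff) are assumed names, not an existing interface.

VALID_ACTIONS = {"halt", "reboot", "poweroff"}


def parse_kdump_config(text):
    """Return a dict of KEY=value settings, ignoring blanks and comments."""
    settings = {"ACTION": "reboot"}  # assumed default action
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw!r}")
        settings[key.strip()] = value.strip().strip('"')
    if "DEVICE" not in settings:
        raise ValueError("DEVICE must be set")
    if settings["ACTION"] not in VALID_ACTIONS:
        raise ValueError(f"unknown ACTION: {settings['ACTION']!r}")
    return settings


example = """
# where to write the captured dump
DEVICE=/dev/sdc6
ACTION=poweroff
"""
config = parse_kdump_config(example)
print(config["DEVICE"], config["ACTION"])
```

Rejecting an unknown ACTION at service start, rather than at crash
time, keeps configuration mistakes away from the dump-capture path.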

> Time and space problem in capturing huge crash dump is raised
> already.  The partial dump and dump compress technology must be
> explored.

Agreed. Any collaboration in this area would be greatly appreciated.
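
As a rough illustration of how partial dumping and compression can
combine, the sketch below skips pages that are entirely zero and
compresses the rest with zlib. This is not how any existing dump tool
works; the zero-page filter is a simple stand-in for real filters (which
would need kernel metadata to identify, say, free pages), and PAGE_SIZE,
write_partial_dump, and read_page are names assumed for the example.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size for illustration


def write_partial_dump(memory):
    """Return a list of (page_index, compressed_bytes) for non-zero pages."""
    zero_page = bytes(PAGE_SIZE)
    records = []
    for offset in range(0, len(memory), PAGE_SIZE):
        page = memory[offset:offset + PAGE_SIZE]
        if page == zero_page[:len(page)]:
            continue  # partial dump: skip pages that are entirely zero
        records.append((offset // PAGE_SIZE, zlib.compress(page)))
    return records


def read_page(records, index, page_size=PAGE_SIZE):
    """Return the original contents of one page from the dump records."""
    for page_index, blob in records:
        if page_index == index:
            return zlib.decompress(blob)
    return bytes(page_size)  # omitted pages are treated as zero-filled


# Three pages: zero-filled, repetitive data, zero-filled.
data_page = b"kdump" * (PAGE_SIZE // 5) + b"\x00"
memory = bytes(PAGE_SIZE) + data_page + bytes(PAGE_SIZE)

records = write_partial_dump(memory)
stored = sum(len(blob) for _, blob in records)
print(len(records), "page(s) stored,", stored, "bytes for", len(memory))
```

Keeping the page index with each record lets a reader reconstruct the
original layout even though omitted pages take no space in the dump.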

> One of my worries is that the current kdump requires distinct
> two kernels (one for normal use, and one for capturing dumps) to
> work.  And I'm not fully convinced whether a use of two kernels
> is the only solution or not.  Well, I heard that this decision
> better solves the ongoing DMA problem (please correct me if
> other reasons are prominent), but from a pure management point
> of view handling one kernel is happier than two kernels.

I think some work was being done on a relocatable kernel, which would
allow the same kernel to run as both the regular kernel and the dump
capture kernel, though at a different physical start address.

> Flexibility
> -----------
> 
> To minimize the downtime, a crashed kernel would want to
> communicate with clustering software/firmware to help it detect
> the failure quickly.  This can be generalized by making
> appropriate hook points (or notifier lists) in kdump.
> 
Sorry, I do not follow what is being said here. I think the right thing
is to always minimize what a crashed kernel is supposed to do. So why,
and what, should a crashed kernel communicate to anyone?

> Perhaps these hooks can be used to try resetting devices when
> reinitialization of devices in the 2nd kernel tends to fail.


Thanks
Maneesh
-- 
Maneesh Soni
Linux Technology Center, 
IBM India Software Labs,
Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-25044990
