From: Vivek Goyal <vgoyal@redhat.com>
To: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Cc: linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
Randy Dunlap <rdunlap@xenotime.net>
Subject: Re: [PATCH 0/5] kdump: extract log buffer and registers from vmcore on NMI button pressing
Date: Fri, 4 Jun 2010 10:42:02 -0400
Message-ID: <20100604144202.GE4111@redhat.com>
In-Reply-To: <87iq60a3rh.wl%vmayatsk@redhat.com>
On Thu, Jun 03, 2010 at 11:01:38AM +0200, Vitaly Mayatskikh wrote:
> At Wed, 2 Jun 2010 11:16:11 -0400, Vivek Goyal wrote:
>
> > I am not sure what is the problem we are trying to solve here. If we are
> > unable to capture the dump because second kernel did not boot due to
> > some driver issue etc, above patch is not going to help either.
> >
> > If kernel has booted, then one should be able to capture the dump, filter
> > it and look at the log buffers and cpu registers.
> >
> > Most of the failures I have seen in capture kernel is that it was unable
> > to boot due to either device issues or failure in early boot. Once it has
> > crossed those hurdles, after that capturing the dump is easy part.
> >
> > How many times does it happen in second kernel that kernel is spinning in
> > a loop and NMI can still get you information out.
> >
> > So can you please give some more information about what kind of failures
> > while capturing the dump you are addressing by this patchset.
>
> Obviously, this change doesn't help if 2nd kernel is not able to
> boot. But there are other problems, which may prevent vmcore to be
> captured. For example, machine has RAM > HDD and it may save vmcore
> only over network. If network fails (e.g., due to bugs in NIC drivers
> or NFS, what is not so rare), and dump capture environment is
> non-interactive, or it doesn't have development tools like `crash',
> there's no chance even to guess what has happened.
Vitaly, in this case it sounds like the answer is to write a user space
utility that displays the log buffers of the previous kernel, pack it into
the initrd/initramfs, and run it if the network is down and the hard disk
does not have enough space to store the dump.
I vaguely remember that the dump filtering utility was doing something
similar.
>
> Other possibilities of failure may include broken RAID controller,
> HDD, RAM. NMI button in such situations is a last chance to see old
> log.
Again, can't we do that with the help of a user space utility packed into
the initrd?
IMHO, the NMI button does not sound like a very good option. At most we
could probably look into doing this through a sysrq option, but I am not
too keen on that either unless we have good examples. You mentioned that
one might not be able to log in, but I am wondering why one would not be
able to log in.
In the kdump initrd, we can create a default policy: if you can't capture
the dump, then try to save only the log buffers of the previous kernel. If
the disk is broken, then just dump the buffers on the console. This assumes
that the console is at least being logged or somebody is looking at it.
If not, one can always log in and run the utility to dump the buffers
again.
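To make the fallback order concrete, here is a rough sketch of such a
default policy. It is only an illustration, not code from any existing
kdump initrd: the function name, the Action values and all four inputs
(whether the dump target is reachable, its free space, and the sizes of
the vmcore and of the old log buffer) are assumptions that the initrd
script would have to probe for itself.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical outcomes, ordered from most to least complete.
    SAVE_FULL_DUMP = "save the full vmcore"
    SAVE_LOG_ONLY = "save only the previous kernel's log buffers"
    PRINT_LOG_TO_CONSOLE = "print the previous kernel's log buffers on the console"


def choose_action(target_reachable, free_bytes, dump_bytes, log_bytes):
    """Pick the most complete capture the dump target can still take.

    target_reachable: True if the disk or network target can be written to.
    free_bytes:       free space on that target.
    dump_bytes:       size of the (possibly filtered) vmcore.
    log_bytes:        size of the previous kernel's log buffers.
    """
    if target_reachable and free_bytes >= dump_bytes:
        return Action.SAVE_FULL_DUMP
    if target_reachable and free_bytes >= log_bytes:
        return Action.SAVE_LOG_ONLY
    # Disk broken, network down, or no room even for the log:
    # the console is the only place left.
    return Action.PRINT_LOG_TO_CONSOLE
```

The point of ordering the checks this way is that the console is only
used as a last resort, when nothing at all can be written to the target.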
The only corner case which is not covered is when one cannot log in to
the system and somebody plugged in a console later, or the console was
shared. I am not sure how common that case is.
Making the capture kernel print the previous kernel's buffers does not sound very
convincing to me, at this point. I will
Thanks
Vivek