From: Petr Tesarik <ptesarik@suse.cz>
To: Christopher Covington <cov@codeaurora.org>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>,
kexec@lists.infradead.org,
qemu devel list <qemu-devel@nongnu.org>,
Qiao Nuohan <qiaonuohan@cn.fujitsu.com>,
Dave Anderson <anderson@redhat.com>,
kumagai-atsushi@mxc.nes.nec.co.jp,
Laszlo Ersek <lersek@redhat.com>,
crash-utility@redhat.com
Subject: Re: [Qemu-devel] uniquely identifying KDUMP files that originate from QEMU
Date: Wed, 12 Nov 2014 15:36:38 +0100
Message-ID: <20141112153638.5a749b70@hananiah.suse.cz>
In-Reply-To: <54636096.6020500@codeaurora.org>

On Wed, 12 Nov 2014 08:28:54 -0500
Christopher Covington <cov@codeaurora.org> wrote:

> On 11/12/2014 08:26 AM, Petr Tesarik wrote:
> > On Wed, 12 Nov 2014 08:18:04 -0500
> > Christopher Covington <cov@codeaurora.org> wrote:
> >
> >> On 11/12/2014 03:05 AM, Petr Tesarik wrote:
> >>> On Tue, 11 Nov 2014 12:27:44 -0500
> >>> Christopher Covington <cov@codeaurora.org> wrote:
> >>>
> >>>> On 11/11/2014 06:22 AM, Laszlo Ersek wrote:
> >>>>> (Note: I'm not subscribed to either qemu-devel or the kexec list; please
> >>>>> keep me CC'd.)
> >>>>>
> >>>>> QEMU is able to dump the guest's memory in KDUMP format (kdump-zlib,
> >>>>> kdump-lzo, kdump-snappy) with the "dump-guest-memory" QMP command.
> >>>>>
> >>>>> The resultant vmcore is usually analyzed with the "crash" utility.
> >>>>>
> >>>>> The original tool producing such files is kdump. Unlike the procedure
> >>>>> performed by QEMU, kdump runs from *within* the guest (under a kexec'd
> >>>>> kdump kernel), and has more information about the original guest kernel
> >>>>> state (which is being dumped) than QEMU. To QEMU, the guest kernel state
> >>>>> is opaque.
> >>>>>
> >>>>> For this reason, the kdump preparation logic in QEMU hardcodes a number
> >>>>> of fields in the kdump header. The direct issue is the "phys_base"
> >>>>> field. Refer to dump.c, functions create_header32(), create_header64(),
> >>>>> and "include/sysemu/dump.h", macro PHYS_BASE (with the replacement text
> >>>>> "0").
> >>>>>
> >>>>> http://git.qemu.org/?p=qemu.git;a=blob;f=dump.c;h=9c7dad8f865af3b778589dd0847e450ba9a75b9d;hb=HEAD
> >>>>>
> >>>>> http://git.qemu.org/?p=qemu.git;a=blob;f=include/sysemu/dump.h;h=7e4ec5c7d96fb39c943d970d1683aa2dc171c933;hb=HEAD
> >>>>>
> >>>>> This works in most cases, because the guest Linux kernel indeed tends to
> >>>>> be loaded at guest-phys address 0. However, when the guest Linux kernel
> >>>>> is booted on top of OVMF (which has a somewhat unusual UEFI memory map),
> >>>>> then the guest Linux kernel is loaded at 16MB, thereby getting out of
> >>>>> sync with the phys_base=0 setting visible in the KDUMP header.
> >>>>>
> >>>>> This trips up the "crash" utility.
> >>>>>
> >>>>> Dave worked around the issue in "crash" for ELF format dumps -- "crash"
> >>>>> can identify QEMU as the originator of the vmcore by finding the QEMU
> >>>>> notes in the ELF vmcore. If those are present, then "crash" employs a
> >>>>> heuristic, probing for a phys_base up to 32MB, in 1MB steps.
> >>>>
> >>>> What advantages does KDUMP have over ELF?
> >>>
> >>> It's smaller (data is compressed), and it contains a header with some
> >>> useful information (e.g. the crashed kernel's version and release).
> >>
> >> What if the ELF dumper used SHF_COMPRESSED or could dump an ELF.xz?
> >
> > Not the same thing. With KDUMP, each page is compressed separately, so
> > if a utility like crash needs a page from the middle, it can find it
> > and unpack it immediately. If we had an ELF.xz, then the whole file
> > must be unpacked before it can be used. And unpacking a few terabytes
> > takes ... a while. ;-)
>
> Understood on the ELF.xz approach, but why couldn't each page (or maybe a
> configurable size) be a SHF_COMPRESSED section?

A machine with 64TB of RAM (already manufactured by SGI) has
17,179,869,184 pages, assuming 4KB pages. When the KDUMP (or, actually,
diskdump) format was invented, the program header count in an ELF file
was a 16-bit field, so a file could have fewer than 2^16 = 65,536
program headers. Since then, the ELF specification has been extended
(PN_XNUM), so the number of program headers can be stored in the
sh_info field of the first ELF section header, but that only increases
the number of possible program headers to 2^32 = 4,294,967,296, which
is still a quarter of what one header per page would need.
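
The arithmetic above can be checked with a short sketch. The 4KB page
size is an assumption (it is what makes the page count come out as
quoted), and min_chunk_pages() is a hypothetical helper used only for
illustration:

```python
# Back-of-the-envelope check of the limits discussed above.
PAGE_SIZE = 4096                 # assumption: 4 KiB pages
RAM_BYTES = 64 * 2**40           # 64 TiB of RAM

pages = RAM_BYTES // PAGE_SIZE   # one entry per page
classic_limit = 2**16            # upper bound from a 16-bit count field
extended_limit = 2**32           # upper bound from the 32-bit sh_info field


def min_chunk_pages(total_pages, max_entries):
    """Smallest chunk size (in pages) so total_pages fits in max_entries."""
    return -(-total_pages // max_entries)  # ceiling division


print(pages)                                   # number of 4 KiB pages
print(min_chunk_pages(pages, classic_limit))   # pages per chunk, old limit
print(min_chunk_pages(pages, extended_limit))  # pages per chunk, new limit
```

Under these assumptions, even the extended limit forces chunks of
several pages each, and the old limit forces chunks of about 1GB.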

Yes, we could divide memory into larger chunks than pages, but:

  1. you're probably the first one to have the idea, and
  2. this is easy if you save the complete RAM content, but not quite
     that easy if some pages should be filtered out (makedumpfile).
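
To illustrate why filtering is straightforward when each page is
compressed on its own, here is a toy sketch. The layout (a presence
bitmap, an offset table and a run of zlib blobs) is invented for
illustration and is not the actual diskdump/KDUMP on-disk format;
write_dump() and read_page() are hypothetical names:

```python
import zlib

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def write_dump(pages, keep):
    """Build a toy dump: presence bitmap, offset table, per-page zlib blobs.

    pages: list of PAGE_SIZE-byte objects, indexed by page frame number.
    keep:  predicate deciding which page frame numbers are stored.
    """
    bitmap = bytearray((len(pages) + 7) // 8)
    offsets = []        # (offset, length) of each stored page, in pfn order
    data = bytearray()
    for pfn, page in enumerate(pages):
        if not keep(pfn):
            continue    # filtered out: costs nothing but a zero bit
        bitmap[pfn // 8] |= 1 << (pfn % 8)
        blob = zlib.compress(page)
        offsets.append((len(data), len(blob)))
        data += blob
    return bytes(bitmap), offsets, bytes(data)


def read_page(dump, pfn):
    """Return one page without unpacking any other, or None if filtered."""
    bitmap, offsets, data = dump
    if not bitmap[pfn // 8] & (1 << (pfn % 8)):
        return None
    # Offset-table index = number of stored pages that precede this pfn.
    index = sum(bin(b).count("1") for b in bitmap[:pfn // 8])
    index += bin(bitmap[pfn // 8] & ((1 << (pfn % 8)) - 1)).count("1")
    off, length = offsets[index]
    return zlib.decompress(data[off:off + length])


pages = [bytes([i]) * PAGE_SIZE for i in range(16)]
dump = write_dump(pages, keep=lambda pfn: pfn % 3 != 0)  # drop every 3rd page
```

Reading a page touches only the bitmap, one table entry and one blob.
With multi-page chunks, dropping a single page from the middle of a
chunk would mean splitting or rewriting that chunk.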

There are a few other (minor) points, e.g.:

  * Each program header consumes 56 bytes in ELF64, while a single
    bit is sufficient in KDUMP compressed files to tell if the
    corresponding page is stored or not.
  * SHF_COMPRESSED currently supports only zlib compression, which is
    rather slow. KDUMP supports zlib, lzo and snappy.
  * Support for KDUMP files is already present in the crash utility,
    while I don't think there is any support for SHF_COMPRESSED
    sections.
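
For a rough sense of scale on the first point, the sketch below
compares the two kinds of bookkeeping for the 64TB machine mentioned
earlier. It assumes 4KB pages and counts only the presence bit, not
any other per-page metadata a dump format may keep:

```python
PAGE_SIZE = 4096                  # assumption: 4 KiB pages
PHDR_SIZE = 56                    # sizeof(Elf64_Phdr)
pages = 64 * 2**40 // PAGE_SIZE   # 64 TiB machine from the example above

phdr_bytes = pages * PHDR_SIZE    # one program header per page
bitmap_bytes = pages // 8         # one bit per page

print(phdr_bytes / 2**30, "GiB of program headers")
print(bitmap_bytes / 2**30, "GiB of bitmap")
print(phdr_bytes // bitmap_bytes, "times larger")
```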

In short, SHF_COMPRESSED looks like a viable alternative, but right now
KDUMP is the better choice in terms of features and interoperability.

Just my two cents,
Petr T