From: Dave Anderson <anderson@redhat.com>
To: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: lersek@redhat.com, kexec@lists.infradead.org, ptesarik@suse.cz,
crash-utility@redhat.com
Subject: Re: uniquely identifying KDUMP files that originate from QEMU
Date: Thu, 13 Nov 2014 10:21:57 -0500 (EST)
Message-ID: <336956801.8016660.1415892117619.JavaMail.zimbra@redhat.com>
In-Reply-To: <20141113.100857.53867696478245447.d.hatayama@jp.fujitsu.com>
----- Original Message -----
> From: Dave Anderson <anderson@redhat.com>
> Subject: Re: uniquely identifying KDUMP files that originate from QEMU
> Date: Wed, 12 Nov 2014 09:09:34 -0500
>
> >
> >
> > ----- Original Message -----
> >> From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> >> To: ptesarik@suse.cz
> >> Cc: lersek@redhat.com, kexec@lists.infradead.org
> >> Subject: Re: uniquely identifying KDUMP files that originate from QEMU
> >> Message-ID:
> >> <20141112.120838.303682123986142686.d.hatayama@jp.fujitsu.com>
> >> Content-Type: Text/Plain; charset=us-ascii
> >>
> >> From: Petr Tesarik <ptesarik@suse.cz>
> >> Subject: Re: uniquely identifying KDUMP files that originate from QEMU
> >> Date: Tue, 11 Nov 2014 13:09:13 +0100
> >>
> >> > On Tue, 11 Nov 2014 12:22:52 +0100
> >> > Laszlo Ersek <lersek@redhat.com> wrote:
> >> >
> >> >> (Note: I'm not subscribed to either qemu-devel or the kexec list;
> >> >> please keep me CC'd.)
> >> >>
> >> >> QEMU is able to dump the guest's memory in KDUMP format (kdump-zlib,
> >> >> kdump-lzo, kdump-snappy) with the "dump-guest-memory" QMP command.
> >> >>
> >> >> The resultant vmcore is usually analyzed with the "crash" utility.
> >> >>
> >> >> The original tool producing such files is kdump. Unlike the procedure
> >> >> performed by QEMU, kdump runs from *within* the guest (under a kexec'd
> >> >> kdump kernel), and has more information about the original guest kernel
> >> >> state (which is being dumped) than QEMU. To QEMU, the guest kernel
> >> >> state is opaque.
> >> >>
> >> >> For this reason, the kdump preparation logic in QEMU hardcodes a number
> >> >> of fields in the kdump header. The direct issue is the "phys_base"
> >> >> field. Refer to dump.c, functions create_header32(), create_header64(),
> >> >> and "include/sysemu/dump.h", macro PHYS_BASE (with the replacement text
> >> >> "0").
> >> >>
> >> >> http://git.qemu.org/?p=qemu.git;a=blob;f=dump.c;h=9c7dad8f865af3b778589dd0847e450ba9a75b9d;hb=HEAD
> >> >>
> >> >> http://git.qemu.org/?p=qemu.git;a=blob;f=include/sysemu/dump.h;h=7e4ec5c7d96fb39c943d970d1683aa2dc171c933;hb=HEAD
> >> >>
> >> >> This works in most cases, because the guest Linux kernel indeed tends
> >> >> to be loaded at guest-phys address 0. However, when the guest Linux
> >> >> kernel is booted on top of OVMF (which has a somewhat unusual UEFI
> >> >> memory map), then the guest Linux kernel is loaded at 16MB, thereby
> >> >> getting out of sync with the phys_base=0 setting visible in the KDUMP
> >> >> header.
> >> >>
> >> >> This trips up the "crash" utility.
> >> >>
> >> >> Dave worked around the issue in "crash" for ELF format dumps -- "crash"
> >> >> can identify QEMU as the originator of the vmcore by finding the QEMU
> >> >> notes in the ELF vmcore. If those are present, then "crash" employs a
> >> >> heuristic, probing for a phys_base up to 32MB, in 1MB steps.
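
For reference, the probing heuristic described above can be sketched
roughly as follows. This is a minimal illustration, not the code in
"crash": the names probe_phys_base(), phys_base_check_fn and
matches_expected() are hypothetical, the inclusive 32MB upper bound is
an assumption, and the real validity test (whatever "crash" does to
decide that a candidate phys_base is plausible) is stood in for by a
caller-supplied callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEGABYTE    (1024ULL * 1024ULL)
#define PROBE_STEP  (1 * MEGABYTE)   /* 1MB steps, as described in the thread */
#define PROBE_LIMIT (32 * MEGABYTE)  /* assumed to be an inclusive upper bound */

/* Hypothetical callback: returns true if the candidate looks usable. */
typedef bool (*phys_base_check_fn)(uint64_t candidate, void *ctx);

/*
 * Try each candidate phys_base from 0 to PROBE_LIMIT in PROBE_STEP
 * increments. On the first candidate the callback accepts, store it in
 * *out and return true. Return false if no candidate is accepted.
 */
bool probe_phys_base(phys_base_check_fn check, void *ctx, uint64_t *out)
{
    assert(check != NULL && out != NULL);

    for (uint64_t candidate = 0; candidate <= PROBE_LIMIT;
         candidate += PROBE_STEP) {
        if (check(candidate, ctx)) {
            *out = candidate;
            return true;
        }
    }
    return false;
}

/* Stand-in checker for demonstration only: accepts one known value. */
bool matches_expected(uint64_t candidate, void *ctx)
{
    return candidate == *(const uint64_t *)ctx;
}
```

With the stand-in checker primed to accept 16MB (the OVMF load address
mentioned above), the probe stops at the 17th candidate. A value beyond
32MB is never tried, so the probe reports failure in that case.
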
> >> >>
> >> >> Alas, the QEMU notes are not present in the KDUMP-format vmcores that
> >> >> QEMU produces (they cannot be),
> >> >
> >> > Why? Since KDUMP format version 4, the complete ELF notes can be stored
> >> > in the file (see offset_note, size_note fields in the sub-header).
> >> >
> >>
> >> Yes, the QEMU notes are present in the kdump-compressed format. But
> >> phys_base cannot be calculated from the qemu side alone. We cannot do
> >> more than the workaround the crash utility already employs. So, the
> >> phys_base value in the kdump sub-header is currently designed to be 0.
> >>
> >> Anyway, phys_base is kernel information. To make it available on the
> >> qemu side, we need a mechanism that gives qemu access to it.
> >>
> >> One ad-hoc but simple way is to put the phys_base value into the
> >> VMCOREINFO note information in the kernel.
> >>
> >> Although there is already a similar entry in VMCOREINFO, like
> >>
> >> arch/x86/kernel/
> >> ==
> >> void arch_crash_save_vmcoreinfo(void)
> >> {
> >>         VMCOREINFO_SYMBOL(phys_base);   <---- This
> >>         VMCOREINFO_SYMBOL(init_level4_pgt);
> >>
> >>         ...
> >> ==
> >>
> >> this is meaningless, because the saved value is the virtual address
> >> assigned to the phys_base symbol. To read the value of phys_base itself
> >> through that address, we would already need the phys_base value we are
> >> trying to obtain.
> >>
> >> So, if we instead change this to save the value of phys_base, not the
> >> address of the phys_base symbol, we can get phys_base from the
> >> VMCOREINFO.
> >>
> >> The VMCOREINFO consists simply of strings, so it is easy to search the
> >> vmcore for it, e.g. using strings and grep like this:
> >>
> >> $ strings vmcore-3.10.0-121.el7.x86_64 | grep -E ".*VMCOREINFO.*" -A 100
> >> VMCOREINFO
> >> OSRELEASE=3.10.0-121.el7.x86_64
> >> PAGESIZE=4096
> >> ...
> >> SYMBOL(phys_base)=ffffffff818e5010 <-- though this is the address of phys_base now...
> >> SYMBOL(init_level4_pgt)=ffffffff818de000
> >> SYMBOL(node_data)=ffffffff819f1cc0
> >> LENGTH(node_data)=1024
> >> CRASHTIME=1399460394
> >> ...
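
Since the VMCOREINFO is plain "KEY=value" text, pulling one entry out of
it takes very little code. The following is only a sketch under stated
assumptions: vmcoreinfo_lookup_hex() is a hypothetical helper, not an
existing function in crash or makedumpfile, and it assumes the
VMCOREINFO text has already been located and copied into a
NUL-terminated buffer. Note that with the current kernel the
SYMBOL(phys_base) entry yields the address of the symbol, not the
phys_base value itself, as pointed out above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Search newline-separated "KEY=value" text for an exact key and parse
 * its value as hexadecimal. Returns true and stores the result in *out
 * on success; returns false if the key is absent or the value does not
 * start with a hex digit.
 */
bool vmcoreinfo_lookup_hex(const char *text, const char *key, uint64_t *out)
{
    assert(text != NULL && key != NULL && out != NULL);

    size_t key_len = strlen(key);
    const char *line = text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t line_len = eol ? (size_t)(eol - line) : strlen(line);

        /* The key must be followed directly by '=' and at least one char. */
        if (line_len > key_len + 1 &&
            strncmp(line, key, key_len) == 0 &&
            line[key_len] == '=') {
            const char *value = line + key_len + 1;

            if (!isxdigit((unsigned char)value[0]))
                return false;
            /* strtoull stops at the newline, which is not a hex digit. */
            *out = (uint64_t)strtoull(value, NULL, 16);
            return true;
        }

        if (eol == NULL)
            break;
        line = eol + 1;
    }
    return false;
}
```

Called with the sample output above and the key "SYMBOL(phys_base)", the
helper would store 0xffffffff818e5010 in *out. If the kernel were
changed to export the value rather than the symbol address, the same
lookup would apply, with whatever key name that change chose.
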
> >>
> >> This should also be useful for getting the phys_base of the 2nd kernel,
> >> which is inherently a relocated kernel, from a vmcore generated using
> >> qemu dump.
> >>
> >> This is far from well-designed from qemu's point of view, but it would
> >> make getting phys_base manually easier than it is now.
> >>
> >> Obviously, the VMCOREINFO is available only if CONFIG_KEXEC is
> >> enabled. Other users cannot use this.
> >>
> >> --
> >> Thanks.
> >> HATAYAMA, Daisuke
> >
> > I agree that the actual value of phys_base should be included in the
> > vmcoreinfo.
> >
> > However, it won't help in this case because the vmcoreinfo data is not
> > copied into the compressed dumpfile header. The offset_vmcoreinfo and
> > size_vmcoreinfo fields are zero.
>
> Yes, so I said:
>
> >> This is far from well-designed from qemu's point of view, but it would
> >> make getting phys_base manually easier than it is now.
>
> This is just an ad-hoc way.
>
> >
> > Here's an example header dump of a QEMU-generated dumpfile:
> >
> > crash> help -n
> > makedumpfile header:
> > signature: "makedumpfile"
> > type: 1
> > version: 1
> > all_flat_data:
> > num_array: 18695
> > array: 7f484b760010
> > file_size: 0
> >
> > diskdump_data:
> > filename: vmcore.ovmf.rhel7.kdump-snappy
> > flags: c6
> > (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED|LZO_SUPPORTED|SNAPPY_SUPPORTED)
> > [FLAT]
> > dfd: 3
> > ofp: 3e441b1260
> > machine_type: 62 (EM_X86_64)
> >
> > header: 1a68fe0
> > signature: "KDUMP "
> > header_version: 6
> > utsname:
> > sysname:
> > nodename:
> > release:
> > version:
> > machine: x86_64
> > domainname:
> > timestamp:
> > tv_sec: 0
> > tv_usec: 0
> > status: 4 (DUMP_DH_COMPRESSED_SNAPPY)
> > block_size: 4096
> > sub_hdr_size: 1
> > bitmap_blocks: 76
> > max_mapnr: 1245184
> > total_ram_blocks: 0
> > device_blocks: 0
> > written_blocks: 0
> > current_cpu: 0
> > nr_cpus: 4
> > tasks[nr_cpus]: 0
> > 0
> > 0
> > 0
> >
> > sub_header: 0 (n/a)
> >
> > sub_header_kdump: 1a69ff0
> > phys_base: 0
> > dump_level: 1 (0x1) (DUMP_EXCLUDE_ZERO)
> > split: 0
> > start_pfn: (unused)
> > end_pfn: (unused)
> > offset_vmcoreinfo: 0 (0x0)
> > size_vmcoreinfo: 0 (0x0)
> > offset_note: 4200 (0x1068)
> > size_note: 3232 (0xca0)
> > num_prstatus_notes: 4
> > notes_buf: 1a6b000
> > notes[0]: 1a6b000
> > notes[1]: 1a6b164
> > notes[2]: 1a6b2c8
> > notes[3]: 1a6b42c
> > NT_PRSTATUS_offset: 1068
> > 11cc
> > 1330
> > 1494
> > offset_eraseinfo: 0 (0x0)
> > size_eraseinfo: 0 (0x0)
> > start_pfn_64: (unused)
> > end_pfn_64: (unused)
> > max_mapnr_64: 1245184 (0x130000)
> >
> > data_offset: 4e000
> > block_size: 4096
> > block_shift: 12
> > bitmap: 7f484b713010
> > bitmap_len: 311296
> > max_mapnr: 1245184 (0x130000)
> > dumpable_bitmap: 7f484b6c6010
> > byte: 0
> > bit: 0
> > compressed_page: 1a8c660
> > curbufptr: 1a7f650
> > ...
> >
> > Note that QEMU does add self-generated register dumps above, but the
> > special "QEMU" note that is added to ELF kdumps is not included.
> >
>
> Sorry, I didn't know this, and there's no reason not to add it.
>
> > Also note that the kernel version information is also left zero-filled.
> >
>
> This is what I intended. Retrieving data from the vmcore should be done in
> the crash utility or makedumpfile.
>
> > In any case, if either a QEMU note or a diskdump.data flag were added, I would
> > be more than happy.
> >
> > Dave
>
> The absence of the QEMU note is not what I intended. This is a
> regression against ELF. We must add it.

Not necessary -- as it turns out, the QEMU notes are located in the compressed
kdump notes section following the NT_PRSTATUS notes:

  http://lists.infradead.org/pipermail/kexec/2014-November/012974.html

It's just that the notes-gathering code in the crash utility was only
looking for and storing NT_PRSTATUS note information.
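
To illustrate what looking past the NT_PRSTATUS notes involves, here is
a minimal sketch of walking a notes buffer and locating a note by its
owner name. It is not the crash utility's code: struct note_hdr mirrors
the standard ELF note header layout (three 32-bit words, with name and
descriptor each padded to 4 bytes), find_note_by_name() is a
hypothetical helper, and the buffer is assumed to hold the size_note
bytes read from offset_note, in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the standard ELF note header. */
struct note_hdr {
    uint32_t n_namesz;  /* length of the name, including the trailing NUL */
    uint32_t n_descsz;  /* length of the descriptor */
    uint32_t n_type;    /* note type, e.g. NT_PRSTATUS */
};

/* Name and descriptor are each padded to a 4-byte boundary. */
#define NOTE_ALIGN(x) (((x) + 3u) & ~(size_t)3u)

/*
 * Walk every note in buf[0..len) and return true with the byte offset
 * of the first note whose name equals `name`. A truncated trailing note
 * ends the walk.
 */
bool find_note_by_name(const uint8_t *buf, size_t len, const char *name,
                       size_t *offset_out)
{
    assert(buf != NULL && name != NULL && offset_out != NULL);

    size_t want = strlen(name) + 1;
    size_t pos = 0;

    while (len - pos >= sizeof(struct note_hdr)) {
        struct note_hdr hdr;
        memcpy(&hdr, buf + pos, sizeof(hdr));  /* avoid unaligned access */

        size_t name_off = pos + sizeof(hdr);
        size_t desc_off = name_off + NOTE_ALIGN((size_t)hdr.n_namesz);
        size_t next = desc_off + NOTE_ALIGN((size_t)hdr.n_descsz);

        if (next > len || next <= pos)
            break;  /* truncated or corrupt note */

        if (hdr.n_namesz == want && memcmp(buf + name_off, name, want) == 0) {
            *offset_out = pos;
            return true;
        }
        pos = next;
    }
    return false;
}
```

Calling find_note_by_name(notes, size_note, "QEMU", &off) on such a
buffer would skip over the per-CPU NT_PRSTATUS notes and report where
the first "QEMU" note starts, which is enough to tell a QEMU-generated
dumpfile from one written by makedumpfile.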

Thanks,
  Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec