Date: Wed, 12 Nov 2014 15:41:48 -0500 (EST)
From: Dave Anderson
To: Laszlo Ersek
Cc: Petr Tesarik, tumanova@linux.vnet.ibm.com, kexec@lists.infradead.org,
    qiaonuohan@cn.fujitsu.com, qemu-devel@nongnu.org, HATAYAMA Daisuke,
    kumagai-atsushi@mxc.nes.nec.co.jp, crash-utility@redhat.com
Subject: Re: [Qemu-devel] uniquely identifying KDUMP files that originate from QEMU
Message-ID: <548583707.7386011.1415824908540.JavaMail.zimbra@redhat.com>
In-Reply-To: <5463C35C.2000103@redhat.com>
References: <5461F18C.2080400@redhat.com>
    <20141111130913.11eec0a3@hananiah.suse.cz>
    <20141112.120838.303682123986142686.d.hatayama@jp.fujitsu.com>
    <20141112090441.3ee42632@hananiah.suse.cz>
    <546373B8.70103@redhat.com>
    <20141112194325.246ff381@hananiah.suse.cz>
    <5463C35C.2000103@redhat.com>

----- Original Message -----
> adding back a few CC's because this discussion is useful
>
> On 11/12/14 19:43, Petr Tesarik wrote:
> > On Wed, 12 Nov 2014 15:50:32 +0100
> > Laszlo Ersek wrote:
> >
> >> On 11/12/14 09:04, Petr Tesarik wrote:
> >>> On Wed, 12 Nov 2014 12:08:38 +0900 (JST)
> >>> HATAYAMA Daisuke wrote:
> >>
> >>>> Anyway, phys_base is kernel information.
> >>>> To make it available for qemu
> >>>> side, there's need to prepare a mechanism for qemu to have any access
> >>>> to it.
> >>>
> >>> Yes. I wonder if you can have access without some sort of co-operation
> >>> from the guest kernel itself. I guess not.
> >>
> >> Propagating any kind of additional information from the guest kernel
> >> (which is unprivileged and potentially malicious) to the host-side qemu
> >> process (which is by definition more privileged, although still confined
> >> by various measures) is something we'd explicitly like to avoid.
> >>
> >> Think of it like this. I throw a physical box at you, running Linux,
> >> that has frozen in time. Can "crash" work with nothing else but the
> >> contents of the memory, and information about the CPUs?
> >
> > If only you could save the _complete_ state of the CPU... For example
> > the content of CR3 would be quite useful.
>
> (1) CR3 is already saved, in both the ELF and the kdump compressed formats.
>
> - ELF case:
>
>   qmp_dump_guest_memory() [dump.c]
>     create_vmcore()
>       dump_begin()
>         write_elf64_notes()
>
>           loop from 1 to #vcpu:
>             cpu_write_elf64_note() [qom/cpu.c]
>               x86_64_write_elf64_note() [target-i386/arch_dump.c]
>                 writes "CORE"
>
>           loop from 1 to #vcpu:
>             cpu_write_elf64_qemunote() [qom/cpu.c]
>               x86_cpu_write_elf64_qemunote() [target-i386/arch_dump.c]
>                 cpu_write_qemu_note()
>                   qemu_get_cpustate()
>                     s->cr[3] = env->cr[3]; <---------- here
>                   writes "QEMU"
>
> Hence, the information is part of the QEMU note.
>
> - kdump case:
>
>   qmp_dump_guest_memory() [dump.c]
>     create_kdump_vmcore()
>       write_dump_header()
>         create_header64()
>           write_elf64_notes()
>             [... same as above ...]
>
> The trick here is that the note-writer functions use a callback function
> for actually outputting the data.
> So while in the ELF case the stuff
> goes directly to a file, in the kdump case the notes are first saved in
> a memory buffer, and then later saved in the file at offset
> KdumpSubHeader64.offset_note. (... Which is then represented in the
> flattened file format of course.)
>
> So, the information is there in both cases.
>
>
> (2) Dave -- this just made me realize that the QEMU note is *already*
> there in the kdump file as well; pointed-to by
> KdumpSubHeader64.offset_note, for a length of KdumpSubHeader64.note_size.
>
> From your other email:
>
> > sub_header_kdump: 1c9cff0
> >   phys_base: 0
> >   dump_level: 1 (0x1) (DUMP_EXCLUDE_ZERO)
> >   split: 0
> >   start_pfn: (unused)
> >   end_pfn: (unused)
> >   offset_vmcoreinfo: 0 (0x0)
> >   size_vmcoreinfo: 0 (0x0)
> >   offset_note: 4200 (0x1068) <----------- here
> >   size_note: 3232 (0xca0) <-----------
> >   num_prstatus_notes: 4
> >   notes_buf: 1c9e000
> >   notes[0]: 1c9e000
> >   notes[1]: 1c9e164
> >   notes[2]: 1c9e2c8
> >   notes[3]: 1c9e42c
> >   NT_PRSTATUS_offset: 1068
> >                       11cc
> >                       1330
> >                       1494
> >   offset_eraseinfo: 0 (0x0)
> >   size_eraseinfo: 0 (0x0)
> >   start_pfn_64: (unused)
> >   end_pfn_64: (unused)
> >   max_mapnr_64: 1245184 (0x130000)
>
> Can you fetch that in "crash"? If you can, then there's nothing to do on
> the qemu side (and I'll have to apologize for spamming a bunch of lists :/).

Sure enough...

I was just playing with process_elf64_notes() to check/read the note name
strings, and noticed that I can certainly see them. But as you noted, only
the NT_PRSTATUS notes are stored in the "notes[]" array, so I was under the
impression that the QEMU notes were completely missing.

That being the case -- we're pretty much done! I'll put a patch in the next
upstream release of crash.

Thanks,
  Dave

>
> I think "crash" already iterates over all of the notes in the note
> buffer, but skips everything different from NT_PRSTATUS.
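
For illustration only, the note walk discussed above could look roughly
like the C sketch below. This is not code from crash or QEMU: find_note()
and struct note_hdr are hypothetical names, the header simply mirrors
Elf64_Nhdr, and the sketch assumes the buffer is in host byte order with
the name and the descriptor each padded to 4 bytes. The figures in the
quoted dump are consistent with that layout: the NT_PRSTATUS notes sit
0x164 (356) bytes apart, and four of those plus four 452-byte QEMU notes
(12-byte header, 8-byte padded name, 432-byte descriptor) add up to
size_note = 3232.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf64_Nhdr from <elf.h>; redefined here so that the
 * sketch does not depend on a Linux-specific header. */
struct note_hdr {
    uint32_t n_namesz;  /* length of the name, including the trailing NUL */
    uint32_t n_descsz;  /* length of the descriptor (the payload) */
    uint32_t n_type;    /* 1 (NT_PRSTATUS) for "CORE", 0 for "QEMU" */
};

/* Assumption: name and descriptor are each padded to a multiple of 4. */
static size_t pad4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: return a pointer to the descriptor of the
 * index-th (0-based) note called `name` inside buf[0..size), or NULL if
 * there is no such note or a note runs past the end of the buffer.
 * On success *descsz receives the descriptor length.
 */
const unsigned char *find_note(const unsigned char *buf, size_t size,
                               const char *name, unsigned index,
                               uint32_t *descsz)
{
    size_t want = strlen(name) + 1;   /* n_namesz counts the NUL too */
    size_t off = 0;

    while (size - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        memcpy(&h, buf + off, sizeof h);   /* buffer may be unaligned */

        size_t name_off = off + sizeof h;
        size_t desc_off = name_off + pad4(h.n_namesz);
        size_t next_off = desc_off + pad4(h.n_descsz);
        if (desc_off > size || next_off > size)
            return NULL;                   /* truncated or corrupt */

        if (h.n_namesz == want &&
            memcmp(buf + name_off, name, want) == 0) {
            if (index == 0) {
                *descsz = h.n_descsz;
                return buf + desc_off;
            }
            index--;
        }
        off = next_off;
    }
    return NULL;
}
```

With the size_note bytes read from offset_note in a buffer, a call such
as find_note(buf, size, "QEMU", cpu, &len) would then hand back the
per-CPU QEMU descriptor, while the same loop with "CORE" finds the
NT_PRSTATUS notes that are already being collected.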
>
>
> (3) Regarding the structure of the notes, we have to consider the
> placement of the notes and their internal structure. The placement is
> different between the ELF and the KDUMP file format. The internal
> structure of the notes is identical between the two file formats.
>
> For example, for a 4 VCPU guest, you end up with note names like
>
>   CORE
>   CORE
>   CORE
>   CORE
>   QEMU
>   QEMU
>   QEMU
>   QEMU
>
> All of these are Elf64_Nhdr structures. The CORE ones have type
> NT_PRSTATUS, and the QEMU ones have type 0.
>
> (3a) The placement in the ELF file is already handled by "crash". Each
> note "simply" gets its own ELF note segment/section.
>
> (3b) In the kdump file, the Elf64_Nhdr structures (8 pieces in total, in
> the above example -- 4x CORE, 4x QEMU) are concatenated in that order,
> and finally stored at "offset_note".
>
> (3c) Regarding the internal structure of the notes. The CORE ones are
> already known and handled. The QEMU notes have the following structure:
>
> > Elf64_Nhdr:
> >   n_namesz: 5 ("QEMU")
> >   n_descsz: 432
> >   n_type: 0 (?)
> > 000001b000000001 0000000000000000
>   |------||------| |--------------|
>   size    version  rax
>
> > 0000000000000000 0000000000000000
>   |--------------| |--------------|
>   rbx              rcx
>
> > 0000000000000000 0000000000000001
>   |--------------| |--------------|
>   rdx              rsi
>
> > ffffffff81dd5228 ffffffff81a01ec8
>   |--------------| |--------------|
>   rdi              rsp
>
> > ffffffff81a01ec8 0000000000000000
>   |--------------| |--------------|
>   rbp              r8
>
> > 0000000000000000 00000013911d5f29
>   |--------------| |--------------|
>   r9               r10
>
> > 0000000000000000 ffffffff81c00480
>   |--------------| |--------------|
>   r11              r12
>
> > 0000000000000000 ffffffffffffffff
>   |--------------| |--------------|
>   r13              r14
>
> > 000000000309f000 ffffffff810375ab
>   |--------------| |--------------|
>   r15              rip
>
> > 0000000000000246 ffffffff00000010
>   |--------------| |------||------|
>   rflags           cs/lim  cs/sel
>
> > 0000000000a09b00 0000000000000000
>   |------||------| |--------------|
>   cs/pad  cs/flags cs/base
>
> > ffffffff00000018 0000000000c09300
>   |------||------| |------||------|
>   ds/lim  ds/sel   ds/pad  ds/flags
>
> > 0000000000000000 ffffffff00000018
>   |--------------| |------||------|
>   ds/base          es/lim  es/sel
>
> > 0000000000c09300 0000000000000000
>   |------||------| |--------------|
>   es/pad  es/flags es/base
>
> > ffffffff00000000 0000000000000000
>   |------||------| |------||------|
>   fs/lim  fs/sel   fs/pad  fs/flags
>
> > 0000000000000000 ffffffff00000000
>   |--------------| |------||------|
>   fs/base          gs/lim  gs/sel
>
> > 0000000000000000 ffff880003200000
>   |------||------| |--------------|
>   gs/pad  gs/flags gs/base
>
> > ffffffff00000018 0000000000c09300
>   |------||------| |------||------|
>   ss/lim  ss/sel   ss/pad  ss/flags
>
> > 0000000000000000 ffffffff00000000
>   |--------------| |------||------|
>   ss/base          ldt...
>
> > 0000000000000000 0000000000000000
>   |------||------| |--------------|
>   ...ldt
>
> > 0000208700000040 0000000000008b00
>   |------||------| |------||------|
>   tr...
>
> > ffff880003213b40 0000007f00000000
>   |--------------| |------||------|
>   ...tr            gdt...
>
> > 0000000000000000 ffff880003204000
>   |------||------| |--------------|
>   ...gdt
>
> > 00000fff00000000 0000000000000000
>   |------||------| |------||------|
>   idt...
>
> > ffffffff81dd2000 000000008005003b
>   |--------------| |--------------|
>   ...idt           cr0
>
> > 0000000000000000 0000000001b2e000
>   |--------------| |--------------|
>   cr1              cr2
>
> > 0000000007b18000 00000000000006f0
>   |--------------| |--------------|
>   cr3              cr4
>
> From "target-i386/arch_dump.c":
>
> > struct QEMUCPUSegment {
> >     uint32_t selector;
> >     uint32_t limit;
> >     uint32_t flags;
> >     uint32_t pad;
> >     uint64_t base;
> > };
> >
> > typedef struct QEMUCPUSegment QEMUCPUSegment;
> >
> > struct QEMUCPUState {
> >     uint32_t version;
> >     uint32_t size;
> >     uint64_t rax, rbx, rcx, rdx, rsi, rdi, rsp, rbp;
> >     uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
> >     uint64_t rip, rflags;
> >     QEMUCPUSegment cs, ds, es, fs, gs, ss;
> >     QEMUCPUSegment ldt, tr, gdt, idt;
> >     uint64_t cr[5];
> > };
> >
> > typedef struct QEMUCPUState QEMUCPUState;
>
>
> Summary: I think the info is all there.
>
> Thanks
> Laszlo
>
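
As a hedged illustration of how a reader of the dump might pull CR3 out
of one QEMU note descriptor, here is a minimal C sketch built on the two
structures quoted above. qemu_note_get_cr3() is a hypothetical name, not
a function in crash or QEMU, and the sketch assumes the descriptor is in
host byte order (little-endian for an x86-64 guest read on an x86-64
host). The static assertion checks that the structure has the 432 bytes
seen in n_descsz and in the size field (0x1b0) of the hex dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layouts as quoted from target-i386/arch_dump.c. */
typedef struct QEMUCPUSegment {
    uint32_t selector;
    uint32_t limit;
    uint32_t flags;
    uint32_t pad;
    uint64_t base;
} QEMUCPUSegment;

typedef struct QEMUCPUState {
    uint32_t version;
    uint32_t size;
    uint64_t rax, rbx, rcx, rdx, rsi, rdi, rsp, rbp;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
    uint64_t rip, rflags;
    QEMUCPUSegment cs, ds, es, fs, gs, ss;
    QEMUCPUSegment ldt, tr, gdt, idt;
    uint64_t cr[5];
} QEMUCPUState;

/* 8 + 18*8 + 10*24 + 5*8 = 432 bytes, matching n_descsz above. */
_Static_assert(sizeof(QEMUCPUState) == 432, "unexpected struct padding");

/*
 * Hypothetical helper: copy the descriptor of one QEMU note into a
 * QEMUCPUState and report its CR3 value. Returns 0 on success, or -1 if
 * the descriptor (or the size it declares) is too short to hold the
 * structure.
 */
int qemu_note_get_cr3(const unsigned char *desc, size_t descsz,
                      uint64_t *cr3)
{
    QEMUCPUState s;

    if (descsz < sizeof s)
        return -1;
    memcpy(&s, desc, sizeof s);    /* descriptor may be unaligned */
    if (s.size < sizeof s)
        return -1;
    *cr3 = s.cr[3];
    return 0;
}
```

For the note dumped above, such a helper would report 0x7b18000, the
value labelled cr3 in the hex listing.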