qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [RFC] dump memory when host pci device is used by guest
@ 2011-11-16  8:27 Wen Congyang
  2011-11-16 16:29 ` Dave Anderson
  2011-11-29  5:41 ` [Qemu-devel] [RFC][PATCH] introduce a new monitor command 'dump' to dump guest's memory Wen Congyang
  0 siblings, 2 replies; 7+ messages in thread
From: Wen Congyang @ 2011-11-16  8:27 UTC
  To: qemu-devel, Jan Kiszka, Dave Anderson

[-- Attachment #1: Type: text/plain, Size: 8403 bytes --]

Hi, all

'virsh dump' does not work when a host PCI device is used by the guest. We have
discussed this issue here:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html

We have decided to introduce a new command, 'dump', to dump guest memory. The core
file can be in ELF format.

I created a kdump-elf vmcore, and found that it can be used by both crash and gdb:

# gdb /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux /work/core/vmcore
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux...done.
[New Thread 1691]
[New <main task>]
#0  sysrq_handle_crash (key=99, tty=0x0) at drivers/char/sysrq.c:130
130	drivers/char/sysrq.c: No such file or directory.
	in drivers/char/sysrq.c
(gdb) bt
#0  sysrq_handle_crash (key=99, tty=0x0) at drivers/char/sysrq.c:130
#1  0xffffffff8130d822 in __handle_sysrq (key=99, tty=0x0, check_mask=<value optimized out>) at drivers/char/sysrq.c:521
#2  0xffffffff8130d8de in write_sysrq_trigger (file=<value optimized out>, buf=<value optimized out>, count=2, ppos=<value optimized out>) at drivers/char/sysrq.c:599
#3  0xffffffff811cf31e in proc_reg_write (file=<value optimized out>, buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>, count=2, ppos=<value optimized out>)
    at fs/proc/inode.c:207
#4  0xffffffff8116c818 in vfs_write (file=0xffff88003c7bb740, buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>, count=<value optimized out>, pos=0xffff88003767ff48)
    at fs/read_write.c:347
#5  0xffffffff8116d251 in sys_write (fd=<value optimized out>, buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>, count=2) at fs/read_write.c:399
#6  0xffffffff81013172 in ?? () at arch/x86/kernel/entry_64.S:487
#7  0x0000000000000246 in ?? ()
#8  0x00000000ffffffff in ?? ()
#9  0x00007fdabafde700 in ?? ()
#10 0x000000000000000a in ?? ()
#11 0x0000000000000001 in ?? ()
#12 0x0000000000000002 in ?? ()
#13 0x0000000000000001 in ?? ()
#14 0x00000030f80d4230 in ?? ()
#15 0x0000000000000033 in ?? ()
#16 0x0000000000010206 in ?? ()
#17 0x00007fff8a126470 in ?? ()
#18 0x000000000000002b in ?? ()
#19 0xffff8800374f5000 in ?? ()
#20 0xffff88003c6f9000 in ?? ()
#21 0x0000000000000080 in ?? ()
#22 0xffff880037680080 in ?? ()
#23 0xffffffff00000014 in ?? ()
#24 0x0000000000000000 in ?? ()
(gdb) q
# crash -s /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux /work/core/vmcore 
crash> bt
PID: 1691   TASK: ffff88003711d520  CPU: 0   COMMAND: "bash"
 #0 [ffff88003767fae0] machine_kexec at ffffffff8103695b
 #1 [ffff88003767fb40] crash_kexec at ffffffff810b8f08
 #2 [ffff88003767fc10] oops_end at ffffffff814cbbd0
 #3 [ffff88003767fc40] no_context at ffffffff8104651b
 #4 [ffff88003767fc90] __bad_area_nosemaphore at ffffffff810467a5
 #5 [ffff88003767fce0] bad_area at ffffffff810468ce
 #6 [ffff88003767fd10] do_page_fault at ffffffff814cd740
 #7 [ffff88003767fd60] page_fault at ffffffff814caf45
    [exception RIP: sysrq_handle_crash+22]
    RIP: ffffffff8130d566  RSP: ffff88003767fe18  RFLAGS: 00010096
    RAX: 0000000000000010  RBX: 0000000000000063  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000063
    RBP: ffff88003767fe18   R8: 0000000000000000   R9: ffffffff815106c0
    R10: 0000000000000001  R11: 0000000000000000  R12: 0000000000000000
    R13: ffffffff8179e6c0  R14: 0000000000000286  R15: 0000000000000007
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88003767fe20] __handle_sysrq at ffffffff8130d822
 #9 [ffff88003767fe70] write_sysrq_trigger at ffffffff8130d8de
#10 [ffff88003767fea0] proc_reg_write at ffffffff811cf31e
#11 [ffff88003767fef0] vfs_write at ffffffff8116c818
#12 [ffff88003767ff30] sys_write at ffffffff8116d251
#13 [ffff88003767ff80] system_call_fastpath at ffffffff81013172
    RIP: 00000030f80d4230  RSP: 00007fff8a126470  RFLAGS: 00010206
    RAX: 0000000000000001  RBX: ffffffff81013172  RCX: 0000000000000400
    RDX: 0000000000000002  RSI: 00007fdabafea000  RDI: 0000000000000001
    RBP: 00007fdabafea000   R8: 000000000000000a   R9: 00007fdabafde700
    R10: 00000000ffffffff  R11: 0000000000000246  R12: 0000000000000002
    R13: 00000030f8379780  R14: 0000000000000002  R15: 00000030f8379780
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
crash> 

I wrote a sample (not finished). It only works on x86_64 (both host and guest).
I used it to create a core file:
# readelf -h /tmp/vm2.save 
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0
# readelf -l /tmp/vm2.save 

Elf file type is CORE (Core file)
Entry point 0x0
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000238 0x0000000000000000 0x0000000000000000
                 0x00000000000002c8 0x00000000000002c8         0
  LOAD           0x0000000000000500 0xffffffff81000000 0x0000000001000000
                 0x000000001f000000 0x000000001f000000         0
  LOAD           0x000000001f000500 0x0000000000000000 0x0000000000000000
                 0x0000000001000000 0x0000000001000000         0
  LOAD           0x0000000020000500 0x0000000000000000 0x0000000020000000
                 0x0000000000020000 0x0000000000020000         0
  LOAD           0x0000000020020500 0x0000000000000000 0x0000000020870000
                 0x0000000000010000 0x0000000000010000         0
  LOAD           0x0000000020030500 0x0000000000000000 0x0000000020850000
                 0x0000000000020000 0x0000000000020000         0
  LOAD           0x0000000020050500 0x0000000000000000 0x0000000020840000
                 0x0000000000010000 0x0000000000010000         0
  LOAD           0x0000000020060500 0x0000000000000000 0x0000000020040000
                 0x0000000000800000 0x0000000000800000         0
  LOAD           0x0000000020860500 0x0000000000000000 0x0000000020020000
                 0x0000000000020000 0x0000000000020000         0
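
As a cross-check on the layout above: the ELF header (64 bytes) plus 9 program
headers of 56 bytes each ends at offset 0x238, where the NOTE segment starts;
the NOTE segment (0x2c8 bytes) ends at 0x500, where the first LOAD segment
starts; and the eight LOAD segments add up to 0x20880000 bytes. The C sketch
below is not part of the sample patch. It assumes the file has been read into
memory on a host with the same byte order as the file, and the helper name
count_load_segments is made up for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative helper (hypothetical name, not in the patch): count the
 * PT_LOAD program headers of an in-memory ELF64 core image and sum their
 * file sizes. Returns -1 if the buffer does not look like an ELF64 core
 * file or if the program header table does not fit in the buffer.
 */
static int count_load_segments(const unsigned char *img, size_t len,
                               uint64_t *total_filesz)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int i, loads = 0;

    assert(img != NULL && total_filesz != NULL);
    *total_filesz = 0;

    if (len < sizeof(eh)) {
        return -1;
    }
    memcpy(&eh, img, sizeof(eh));

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_type != ET_CORE ||
        eh.e_phentsize != sizeof(ph)) {
        return -1;
    }

    /* The whole program header table must lie inside the buffer. */
    if (eh.e_phoff > len ||
        (uint64_t)eh.e_phnum * sizeof(ph) > len - eh.e_phoff) {
        return -1;
    }

    for (i = 0; i < eh.e_phnum; i++) {
        memcpy(&ph, img + eh.e_phoff + (size_t)i * sizeof(ph), sizeof(ph));
        if (ph.p_type == PT_LOAD) {
            loads++;
            *total_filesz += ph.p_filesz;
        }
    }

    return loads;
}
```

Going by the readelf output above, running this over /tmp/vm2.save would be
expected to report 8 LOAD segments and 0x20880000 bytes in total.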

I can use crash to analyze the file, but I cannot use gdb to analyze it.
# gdb /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux /tmp/vm2.save 
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux...done.
[New <main task>]
[New <main task>]
#0  0x8103be8b00000000 in ?? ()
(gdb) bt
#0  0x8103be8b00000000 in ?? ()
Cannot access memory at address 0x8170dec800000000
(gdb) q

My first and most important question is: is it worthwhile to continue this work?

The sample patch is attached.
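
For reference, the PT_NOTE segment in the sample is a sequence of NT_PRSTATUS
notes, one per vCPU. Each note is an Elf64_Nhdr followed by the name ("CORE"
plus its NUL, 5 bytes) and the descriptor, each padded to a 4-byte boundary.
The C sketch below shows that sizing rule in isolation. The helper names
(note_roundup, elf_note_size, elf_note_pack) are made up for illustration and
are not in the patch. Assuming a 336-byte prstatus_t (my understanding for
x86_64 glibc), one note is 12 + 8 + 336 = 356 bytes, and two of them give the
0x2c8 bytes that readelf reports for the NOTE segment.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOTE_ALIGN 4u

/* Round n up to the next multiple of 4, as ELF notes require. */
static size_t note_roundup(size_t n)
{
    return (n + NOTE_ALIGN - 1) / NOTE_ALIGN * NOTE_ALIGN;
}

/* Total on-disk size of one note: header, padded name, padded descriptor. */
static size_t elf_note_size(size_t namesz, size_t descsz)
{
    return note_roundup(sizeof(Elf64_Nhdr)) + note_roundup(namesz) +
           note_roundup(descsz);
}

/*
 * Pack one note into buf. Padding bytes are zeroed so that no
 * uninitialized memory ends up in the file. Returns the number of bytes
 * used, or 0 if buf is too small.
 */
static size_t elf_note_pack(unsigned char *buf, size_t buflen,
                            const char *name, uint32_t type,
                            const void *desc, size_t descsz)
{
    Elf64_Nhdr nh;
    size_t namesz, total, off;

    assert(buf != NULL && name != NULL && desc != NULL);

    namesz = strlen(name) + 1;          /* n_namesz includes the NUL */
    total = elf_note_size(namesz, descsz);
    if (buflen < total) {
        return 0;
    }

    memset(buf, 0, total);
    nh.n_namesz = (uint32_t)namesz;
    nh.n_descsz = (uint32_t)descsz;
    nh.n_type = type;
    memcpy(buf, &nh, sizeof(nh));

    off = note_roundup(sizeof(nh));
    memcpy(buf + off, name, namesz);
    off += note_roundup(namesz);
    memcpy(buf + off, desc, descsz);

    return total;
}
```

The point of packing into a zeroed buffer is that the fields of the descriptor
that the sample does not fill in yet are written as zeros instead of garbage.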

Thanks
Wen Congyang

[-- Attachment #2: 0001-dump-sample.patch --]
[-- Type: text/x-patch, Size: 13244 bytes --]

From bdb3daaeb2743a14df2cab364622e2f47ae25093 Mon Sep 17 00:00:00 2001
From: Wen Congyang <wency@cn.fujitsu.com>
Date: Wed, 16 Nov 2011 16:06:10 +0800
Subject: [PATCH] dump sample

---
 Makefile.target |    1 +
 dump.c          |  377 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 dump.h          |    1 +
 hmp-commands.hx |   16 +++
 monitor.c       |    3 +
 qmp-commands.hx |   24 ++++
 6 files changed, 422 insertions(+), 0 deletions(-)
 create mode 100644 dump.c
 create mode 100644 dump.h

diff --git a/Makefile.target b/Makefile.target
index a111521..95d48a5 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -84,6 +84,7 @@ libobj-y += cpu_init.o
 endif
 libobj-$(TARGET_SPARC) += int32_helper.o
 libobj-$(TARGET_SPARC64) += int64_helper.o
+libobj-y += dump.o
 
 libobj-y += disas.o
 libobj-$(CONFIG_TCI_DIS) += tci-dis.o
diff --git a/dump.c b/dump.c
new file mode 100644
index 0000000..bdec246
--- /dev/null
+++ b/dump.c
@@ -0,0 +1,377 @@
+/*
+ * QEMU live dump
+ *
+ * Copyright Fujitsu, Corp. 2011
+ *
+ * Authors:
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include <unistd.h>
+#include <elf.h>
+#include <sys/procfs.h>
+#include "cpu.h"
+#include "cpu-all.h"
+#include "targphys.h"
+#include "monitor.h"
+#include "kvm.h"
+#include "dump.h"
+#include "sysemu.h"
+
+#ifdef TARGET_X86_64
+typedef struct {
+    target_ulong r15,r14,r13,r12,rbp,rbx,r11,r10;
+    target_ulong r9,r8,rax,rcx,rdx,rsi,rdi,orig_rax;
+    target_ulong rip,cs,eflags;
+    target_ulong rsp,ss;
+    target_ulong fs_base, gs_base;
+    target_ulong ds,es,fs,gs;
+} x86_64_user_regs_struct;
+
+static int write_elf64_note(Monitor *mon, int fd, CPUState *env,
+                            target_phys_addr_t *offset)
+{
+    x86_64_user_regs_struct regs;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    regs.r15 = env->regs[15];
+    regs.r14 = env->regs[14];
+    regs.r13 = env->regs[13];
+    regs.r12 = env->regs[12];
+    regs.r11 = env->regs[11];
+    regs.r10 = env->regs[10];
+    regs.r9  = env->regs[9];
+    regs.r8  = env->regs[8];
+    regs.rbp = env->regs[R_EBP];
+    regs.rsp = env->regs[R_ESP];
+    regs.rdi = env->regs[R_EDI];
+    regs.rsi = env->regs[R_ESI];
+    regs.rdx = env->regs[R_EDX];
+    regs.rcx = env->regs[R_ECX];
+    regs.rbx = env->regs[R_EBX];
+    regs.rax = env->regs[R_EAX];
+    regs.rip = env->eip;
+    regs.eflags = env->eflags;
+
+    /* FIXME */
+    regs.orig_rax = 0;
+    regs.cs = 0;
+    regs.ss = 0;
+    regs.fs_base = 0;
+    regs.gs_base = 0;
+    regs.ds = 0;
+    regs.es = 0;
+    regs.fs = 0;
+    regs.gs = 0;
+
+    descsz = sizeof(prstatus_t) - sizeof(elf_gregset_t) +
+             sizeof(x86_64_user_regs_struct);
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz +3) / 4) * 4;
+    note = g_malloc0(note_size); /* zeroed: unset prstatus fields and padding */
+
+    note->n_namesz = name_size;
+    note->n_descsz = descsz;
+    note->n_type = NT_PRSTATUS;
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    buf += descsz - sizeof(x86_64_user_regs_struct) - sizeof(int);
+    memcpy(buf, &regs, sizeof(x86_64_user_regs_struct));
+
+    lseek(fd, *offset, SEEK_SET);
+    ret = write(fd, note, note_size);
+    g_free(note);
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write elf prstatus.\n");
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+#endif
+
+typedef struct {
+    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
+    unsigned short ds, __ds, es, __es;
+    unsigned short fs, __fs, gs, __gs;
+    uint32_t orig_eax, eip;
+    unsigned short cs, __cs;
+    uint32_t eflags, esp;
+    unsigned short ss, __ss;
+} x86_user_regs_struct;
+
+static int write_elf32_note(Monitor *mon, int fd, CPUState *env,
+                            target_phys_addr_t *offset)
+{
+    /* TODO */
+    return 0;
+}
+
+static target_ulong get_phys_base_addr(CPUState *env, target_ulong *base_vaddr)
+{
+    int i;
+    target_ulong kernel_base = -1;
+    target_ulong last, mask;
+
+    for (i = 30, last = -1; (kernel_base == -1) && (i >= 20); i--) {
+        mask = ~((1LL << i) - 1);
+        *base_vaddr = env->idt.base & mask;
+        if (*base_vaddr == last) {
+            continue;
+        }
+
+        kernel_base = cpu_get_phys_page_debug(env, *base_vaddr);
+        last = *base_vaddr;
+    }
+
+    return kernel_base;
+}
+
+static int write_elf_header(Monitor *mon, int fd, int phdr_num, bool lma)
+{
+    Elf64_Ehdr elf_header;
+    int ret;
+
+    memset(&elf_header, 0, sizeof(Elf64_Ehdr));
+    memcpy(&elf_header, ELFMAG, 4);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS64;
+    elf_header.e_ident[EI_DATA] = ELFDATA2LSB; /* FIXME */
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = ET_CORE;
+    elf_header.e_machine = lma ? EM_X86_64: EM_386;
+    elf_header.e_version = EV_CURRENT;
+    elf_header.e_ehsize = sizeof(elf_header);
+    elf_header.e_phoff = sizeof(Elf64_Ehdr);
+    elf_header.e_phentsize = sizeof(Elf64_Phdr);
+    elf_header.e_phnum = phdr_num;
+
+    lseek(fd, 0, SEEK_SET);
+    ret = write(fd, &elf_header, sizeof(elf_header));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf_load(Monitor *mon, int fd, RAMBlock *block, int phdr_index,
+                          target_phys_addr_t *offset, target_ulong base_vaddr)
+{
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = PT_LOAD;
+    phdr.p_offset = *offset;
+    phdr.p_paddr = block->offset;
+    phdr.p_filesz = block->length;
+    phdr.p_memsz = block->length;
+    phdr.p_vaddr = base_vaddr;
+
+    phdr_offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) * phdr_index;
+    lseek(fd, phdr_offset, SEEK_SET);
+    ret = write(fd, &phdr, sizeof(Elf64_Phdr));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    lseek(fd, *offset, SEEK_SET);
+    ret = write(fd, block->host, block->length);
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program segment.\n");
+        return -1;
+    }
+    *offset += block->length;
+
+    return 0;
+}
+
+static int write_elf_notes(Monitor *mon, int fd, int phdr_index,
+                           target_phys_addr_t *offset, bool lma)
+{
+    CPUState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+#ifdef TARGET_X86_64
+        if (lma) {
+            ret = write_elf64_note(mon, fd, env, offset);
+        } else {
+#endif
+            ret = write_elf32_note(mon, fd, env, offset);
+#ifdef TARGET_X86_64
+        }
+#endif
+
+        if (ret < 0) {
+            monitor_printf(mon, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = PT_NOTE;
+    phdr.p_offset = begin;
+    phdr.p_paddr = 0;
+    phdr.p_filesz = *offset - begin;
+    phdr.p_memsz = *offset - begin;
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf64_Ehdr);
+    lseek(fd, phdr_offset, SEEK_SET);
+    ret = write(fd, &phdr, sizeof(Elf64_Phdr));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int create_vmcore(Monitor *mon, int fd)
+{
+    CPUState *env;
+    target_ulong kernel_base = -1, base_vaddr;
+    target_phys_addr_t offset;
+    int phdr_num, phdr_index;
+    RAMBlock *block;
+    bool lma = false;
+    int ret;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        cpu_synchronize_state(env);
+    }
+
+#ifdef TARGET_X86_64
+    lma = !!(first_cpu->hflags & HF_LMA_MASK);
+    if (lma) {
+        kernel_base = get_phys_base_addr(first_cpu, &base_vaddr);
+        if (kernel_base == -1) {
+            monitor_printf(mon, "fd_dump: can not get phys_base\n");
+            return -1;
+        }
+    }
+#endif
+
+    phdr_num = 1; /* PT_NOTE */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (lma && kernel_base > block->offset &&
+            kernel_base < (block->offset + block->length)) {
+            phdr_num++;
+        }
+        phdr_num++;
+    }
+
+    ret = write_elf_header(mon, fd, phdr_num, lma);
+    if (ret < 0)
+        return -1;
+
+    phdr_index = 0;
+    offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) * phdr_num;
+
+    ret = write_elf_notes(mon, fd, phdr_index++, &offset, lma);
+    if (ret < 0)
+        return -1;
+
+#ifdef TARGET_X86_64
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (lma && kernel_base >= block->offset &&
+            kernel_base < (block->offset + block->length)) {
+            if (kernel_base > block->offset) {
+                RAMBlock temp_block;
+
+                temp_block.host = block->host + (kernel_base-block->offset);
+                temp_block.offset = kernel_base;
+                temp_block.length = block->length - (kernel_base-block->offset);
+                ret = write_elf_load(mon, fd, &temp_block, phdr_index++, &offset,
+                                     base_vaddr);
+                if (ret < 0)
+                    return -1;
+
+                temp_block.host = block->host;
+                temp_block.offset = block->offset;
+                temp_block.length = kernel_base-block->offset;
+                ret = write_elf_load(mon, fd, &temp_block, phdr_index++,
+                                     &offset, 0);
+                if (ret < 0)
+                    return -1;
+            } else {
+                ret = write_elf_load(mon, fd, block, phdr_index++, &offset,
+                                     base_vaddr);
+                if (ret < 0)
+                    return -1;
+            }
+            break;
+        }
+    }
+#endif
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (lma && kernel_base >= block->offset &&
+            kernel_base < (block->offset + block->length)) {
+            continue;
+        }
+
+        ret = write_elf_load(mon, fd, block, phdr_index++, &offset, 0);
+        if (ret < 0)
+            return -1;
+    }
+
+    return 0;
+}
+
+int do_dump(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+    const char *file = qdict_get_str(qdict, "file");
+    const char *p;
+    int fd = -1;
+
+#if !defined(WIN32)
+    if (strstart(file, "fd:", &p)) {
+        fd = monitor_get_fd(mon, p);
+        if (fd == -1) {
+            monitor_printf(mon, "fd_dump: invalid file descriptor"
+                           " identifier\n");
+            return -1;
+        }
+    }
+#endif
+
+    if (strstart(file, "file:", &p)) {
+        fd = open(p, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
+        if (fd < 0) {
+            monitor_printf(mon, "fd_dump: failed to open %s\n", p);
+            return -1;
+        }
+    }
+
+    if (fd == -1) {
+        monitor_printf(mon, "unknown dump protocol: %s\n", file);
+        return -1;
+    }
+
+    vm_stop(RUN_STATE_PAUSED);
+    if (create_vmcore(mon, fd) < 0)
+        return -1;
+
+    return 0;
+}
+
diff --git a/dump.h b/dump.h
new file mode 100644
index 0000000..b5d9eb4
--- /dev/null
+++ b/dump.h
@@ -0,0 +1 @@
+int do_dump(Monitor *mon, const QDict *qdict, QObject **ret_data);
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 089c1ac..ebbce8c 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -772,6 +772,22 @@ Migrate to @var{uri} (using -d to not wait for completion).
 ETEXI
 
     {
+        .name       = "dump",
+        .args_type  = "file:s",
+        .params     = "file",
+        .help       = "dump to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_dump,
+    },
+
+
+STEXI
+@item dump @var{file}
+@findex dump
+Dump to @var{file}.
+ETEXI
+
+    {
         .name       = "migrate_cancel",
         .args_type  = "",
         .params     = "",
diff --git a/monitor.c b/monitor.c
index 5ea35de..5df35e0 100644
--- a/monitor.c
+++ b/monitor.c
@@ -73,6 +73,9 @@
 #endif
 #include "hw/lm32_pic.h"
 
+/* for dump */
+#include "dump.h"
+
 //#define DEBUG
 //#define DEBUG_COMPLETION
 
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 97975a5..5cf21c5 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -485,6 +485,30 @@ Notes:
 EQMP
 
     {
+        .name       = "dump",
+        .args_type  = "file:s",
+        .params     = "file",
+        .help       = "dump to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_dump,
+    },
+
+SQMP
+dump
+-------
+
+Dump to file.
+
+Arguments: "file": dump destination, "fd:NAME" or "file:PATH" (json-string).
+
+Example:
+
+-> { "execute": "dump", "arguments": { "file": "fd:dump" } }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "migrate_cancel",
         .args_type  = "",
         .params     = "",
-- 
1.7.1



* Re: [Qemu-devel] [RFC] dump memory when host pci device is used by guest
  2011-11-16  8:27 [Qemu-devel] [RFC] dump memory when host pci device is used by guest Wen Congyang
@ 2011-11-16 16:29 ` Dave Anderson
  2011-11-18 12:46   ` Jan Kiszka
  2011-11-29  5:41 ` [Qemu-devel] [RFC][PATCH] introduce a new monitor command 'dump' to dump guest's memory Wen Congyang
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Anderson @ 2011-11-16 16:29 UTC
  To: Wen Congyang; +Cc: qemu-devel



----- Original Message -----
> Hi, all
> 
> 'virsh dump' does not work when a host PCI device is used by the guest. We have
> discussed this issue here:
> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
> 
> We have decided to introduce a new command, 'dump', to dump guest memory.
> The core file can be in ELF format.
> 
> I created a kdump-elf vmcore, and found that it can be used by both
> crash and gdb:
> 
> # gdb /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux
> /work/core/vmcore
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show
> copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from
> /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux...done.
> [New Thread 1691]
> [New <main task>]
> #0  sysrq_handle_crash (key=99, tty=0x0) at drivers/char/sysrq.c:130
> 130	drivers/char/sysrq.c: No such file or directory.
> 	in drivers/char/sysrq.c
> (gdb) bt
> #0  sysrq_handle_crash (key=99, tty=0x0) at drivers/char/sysrq.c:130
> #1  0xffffffff8130d822 in __handle_sysrq (key=99, tty=0x0,
> check_mask=<value optimized out>) at drivers/char/sysrq.c:521
> #2  0xffffffff8130d8de in write_sysrq_trigger (file=<value optimized
> out>, buf=<value optimized out>, count=2, ppos=<value optimized
> out>) at drivers/char/sysrq.c:599
> #3  0xffffffff811cf31e in proc_reg_write (file=<value optimized out>,
> buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>, count=2,
> ppos=<value optimized out>)
>     at fs/proc/inode.c:207
> #4  0xffffffff8116c818 in vfs_write (file=0xffff88003c7bb740,
> buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>,
> count=<value optimized out>, pos=0xffff88003767ff48)
>     at fs/read_write.c:347
> #5  0xffffffff8116d251 in sys_write (fd=<value optimized out>,
> buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>, count=2)
> at fs/read_write.c:399
> #6  0xffffffff81013172 in ?? () at arch/x86/kernel/entry_64.S:487
> #7  0x0000000000000246 in ?? ()
> #8  0x00000000ffffffff in ?? ()
> #9  0x00007fdabafde700 in ?? ()
> #10 0x000000000000000a in ?? ()
> #11 0x0000000000000001 in ?? ()
> #12 0x0000000000000002 in ?? ()
> #13 0x0000000000000001 in ?? ()
> #14 0x00000030f80d4230 in ?? ()
> #15 0x0000000000000033 in ?? ()
> #16 0x0000000000010206 in ?? ()
> #17 0x00007fff8a126470 in ?? ()
> #18 0x000000000000002b in ?? ()
> #19 0xffff8800374f5000 in ?? ()
> #20 0xffff88003c6f9000 in ?? ()
> #21 0x0000000000000080 in ?? ()
> #22 0xffff880037680080 in ?? ()
> #23 0xffffffff00000014 in ?? ()
> #24 0x0000000000000000 in ?? ()
> (gdb) q
> # crash -s /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux /work/core/vmcore
> crash> bt
> PID: 1691   TASK: ffff88003711d520  CPU: 0   COMMAND: "bash"
>  #0 [ffff88003767fae0] machine_kexec at ffffffff8103695b
>  #1 [ffff88003767fb40] crash_kexec at ffffffff810b8f08
>  #2 [ffff88003767fc10] oops_end at ffffffff814cbbd0
>  #3 [ffff88003767fc40] no_context at ffffffff8104651b
>  #4 [ffff88003767fc90] __bad_area_nosemaphore at ffffffff810467a5
>  #5 [ffff88003767fce0] bad_area at ffffffff810468ce
>  #6 [ffff88003767fd10] do_page_fault at ffffffff814cd740
>  #7 [ffff88003767fd60] page_fault at ffffffff814caf45
>     [exception RIP: sysrq_handle_crash+22]
>     RIP: ffffffff8130d566  RSP: ffff88003767fe18  RFLAGS: 00010096
>     RAX: 0000000000000010  RBX: 0000000000000063  RCX: 0000000000000000
>     RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000063
>     RBP: ffff88003767fe18   R8: 0000000000000000   R9: ffffffff815106c0
>     R10: 0000000000000001  R11: 0000000000000000  R12: 0000000000000000
>     R13: ffffffff8179e6c0  R14: 0000000000000286  R15: 0000000000000007
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #8 [ffff88003767fe20] __handle_sysrq at ffffffff8130d822
>  #9 [ffff88003767fe70] write_sysrq_trigger at ffffffff8130d8de
> #10 [ffff88003767fea0] proc_reg_write at ffffffff811cf31e
> #11 [ffff88003767fef0] vfs_write at ffffffff8116c818
> #12 [ffff88003767ff30] sys_write at ffffffff8116d251
> #13 [ffff88003767ff80] system_call_fastpath at ffffffff81013172
>     RIP: 00000030f80d4230  RSP: 00007fff8a126470  RFLAGS: 00010206
>     RAX: 0000000000000001  RBX: ffffffff81013172  RCX: 0000000000000400
>     RDX: 0000000000000002  RSI: 00007fdabafea000  RDI: 0000000000000001
>     RBP: 00007fdabafea000   R8: 000000000000000a   R9: 00007fdabafde700
>     R10: 00000000ffffffff  R11: 0000000000000246  R12: 0000000000000002
>     R13: 00000030f8379780  R14: 0000000000000002  R15: 00000030f8379780
>     ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
> crash>
> 
> I wrote a sample (not finished). It only works on x86_64 (both host and guest).
> I used it to create a core file:
> # readelf -h /tmp/vm2.save
> ELF Header:
>   Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
>   Class:                             ELF64
>   Data:                              2's complement, little endian
>   Version:                           1 (current)
>   OS/ABI:                            UNIX - System V
>   ABI Version:                       0
>   Type:                              CORE (Core file)
>   Machine:                           Advanced Micro Devices X86-64
>   Version:                           0x1
>   Entry point address:               0x0
>   Start of program headers:          64 (bytes into file)
>   Start of section headers:          0 (bytes into file)
>   Flags:                             0x0
>   Size of this header:               64 (bytes)
>   Size of program headers:           56 (bytes)
>   Number of program headers:         9
>   Size of section headers:           0 (bytes)
>   Number of section headers:         0
>   Section header string table index: 0
> # readelf -l /tmp/vm2.save
> 
> Elf file type is CORE (Core file)
> Entry point 0x0
> There are 9 program headers, starting at offset 64
> 
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   NOTE           0x0000000000000238 0x0000000000000000 0x0000000000000000
>                  0x00000000000002c8 0x00000000000002c8         0
>   LOAD           0x0000000000000500 0xffffffff81000000 0x0000000001000000
>                  0x000000001f000000 0x000000001f000000         0
>   LOAD           0x000000001f000500 0x0000000000000000 0x0000000000000000
>                  0x0000000001000000 0x0000000001000000         0
>   LOAD           0x0000000020000500 0x0000000000000000 0x0000000020000000
>                  0x0000000000020000 0x0000000000020000         0
>   LOAD           0x0000000020020500 0x0000000000000000 0x0000000020870000
>                  0x0000000000010000 0x0000000000010000         0
>   LOAD           0x0000000020030500 0x0000000000000000 0x0000000020850000
>                  0x0000000000020000 0x0000000000020000         0
>   LOAD           0x0000000020050500 0x0000000000000000 0x0000000020840000
>                  0x0000000000010000 0x0000000000010000         0
>   LOAD           0x0000000020060500 0x0000000000000000 0x0000000020040000
>                  0x0000000000800000 0x0000000000800000         0
>   LOAD           0x0000000020860500 0x0000000000000000 0x0000000020020000
>                  0x0000000000020000 0x0000000000020000         0
> 
> I can use crash to analyze the file, but I cannot use gdb to analyze it.
> # gdb /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux /tmp/vm2.save
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show
> copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from
> /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux...done.
> [New <main task>]
> [New <main task>]
> #0  0x8103be8b00000000 in ?? ()
> (gdb) bt
> #0  0x8103be8b00000000 in ?? ()
> Cannot access memory at address 0x8170dec800000000
> (gdb) q
> 
> My first and most important question is: is it worthwhile to continue
> this work?
> 
> The sample patch is attached.
> 
> Thanks
> Wen Congyang

From an enterprise/support point of view, the wholesale replacement
of the current use of the savevm dumpfile format by "virsh dump" with
this ELF style format would be a *huge* improvement.

Dave Anderson
 


* Re: [Qemu-devel] [RFC] dump memory when host pci device is used by guest
  2011-11-16 16:29 ` Dave Anderson
@ 2011-11-18 12:46   ` Jan Kiszka
  2011-11-21  8:06     ` Wen Congyang
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2011-11-18 12:46 UTC (permalink / raw)
  To: Wen Congyang; +Cc: Dave Anderson, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 9528 bytes --]

On 2011-11-16 14:29, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
>> Hi, all
>>
>> 'virsh dump' does not work when a host PCI device is used by the guest.
>> We have discussed this issue here:
>> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
>>
>> We have decided to introduce a new command, dump, to dump memory.
>> The core file's format can be ELF.
>>
>> I created a kdump-elf vmcore and found that it can be used by both
>> crash and gdb:
>>
>> # gdb /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux
>> /work/core/vmcore
>> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
>> Copyright (C) 2010 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later
>> <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show
>> copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from
>> /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux...done.
>> [New Thread 1691]
>> [New <main task>]
>> #0  sysrq_handle_crash (key=99, tty=0x0) at drivers/char/sysrq.c:130
>> 130	drivers/char/sysrq.c: No such file or directory.
>> 	in drivers/char/sysrq.c
>> (gdb) bt
>> #0  sysrq_handle_crash (key=99, tty=0x0) at drivers/char/sysrq.c:130
>> #1  0xffffffff8130d822 in __handle_sysrq (key=99, tty=0x0,
>> check_mask=<value optimized out>) at drivers/char/sysrq.c:521
>> #2  0xffffffff8130d8de in write_sysrq_trigger (file=<value optimized
>> out>, buf=<value optimized out>, count=2, ppos=<value optimized
>> out>) at drivers/char/sysrq.c:599
>> #3  0xffffffff811cf31e in proc_reg_write (file=<value optimized out>,
>> buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>, count=2,
>> ppos=<value optimized out>)
>>     at fs/proc/inode.c:207
>> #4  0xffffffff8116c818 in vfs_write (file=0xffff88003c7bb740,
>> buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>,
>> count=<value optimized out>, pos=0xffff88003767ff48)
>>     at fs/read_write.c:347
>> #5  0xffffffff8116d251 in sys_write (fd=<value optimized out>,
>> buf=0x7fdabafea000 <Address 0x7fdabafea000 out of bounds>, count=2)
>> at fs/read_write.c:399
>> #6  0xffffffff81013172 in ?? () at arch/x86/kernel/entry_64.S:487
>> #7  0x0000000000000246 in ?? ()
>> #8  0x00000000ffffffff in ?? ()
>> #9  0x00007fdabafde700 in ?? ()
>> #10 0x000000000000000a in ?? ()
>> #11 0x0000000000000001 in ?? ()
>> #12 0x0000000000000002 in ?? ()
>> #13 0x0000000000000001 in ?? ()
>> #14 0x00000030f80d4230 in ?? ()
>> #15 0x0000000000000033 in ?? ()
>> #16 0x0000000000010206 in ?? ()
>> #17 0x00007fff8a126470 in ?? ()
>> #18 0x000000000000002b in ?? ()
>> #19 0xffff8800374f5000 in ?? ()
>> #20 0xffff88003c6f9000 in ?? ()
>> #21 0x0000000000000080 in ?? ()
>> #22 0xffff880037680080 in ?? ()
>> #23 0xffffffff00000014 in ?? ()
>> #24 0x0000000000000000 in ?? ()
>> (gdb) q
>> # crash -s /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux /work/core/vmcore
>> crash> bt
>> PID: 1691   TASK: ffff88003711d520  CPU: 0   COMMAND: "bash"
>>  #0 [ffff88003767fae0] machine_kexec at ffffffff8103695b
>>  #1 [ffff88003767fb40] crash_kexec at ffffffff810b8f08
>>  #2 [ffff88003767fc10] oops_end at ffffffff814cbbd0
>>  #3 [ffff88003767fc40] no_context at ffffffff8104651b
>>  #4 [ffff88003767fc90] __bad_area_nosemaphore at ffffffff810467a5
>>  #5 [ffff88003767fce0] bad_area at ffffffff810468ce
>>  #6 [ffff88003767fd10] do_page_fault at ffffffff814cd740
>>  #7 [ffff88003767fd60] page_fault at ffffffff814caf45
>>     [exception RIP: sysrq_handle_crash+22]
>>     RIP: ffffffff8130d566  RSP: ffff88003767fe18  RFLAGS: 00010096
>>     RAX: 0000000000000010  RBX: 0000000000000063  RCX: 0000000000000000
>>     RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000063
>>     RBP: ffff88003767fe18   R8: 0000000000000000   R9: ffffffff815106c0
>>     R10: 0000000000000001  R11: 0000000000000000  R12: 0000000000000000
>>     R13: ffffffff8179e6c0  R14: 0000000000000286  R15: 0000000000000007
>>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>  #8 [ffff88003767fe20] __handle_sysrq at ffffffff8130d822
>>  #9 [ffff88003767fe70] write_sysrq_trigger at ffffffff8130d8de
>> #10 [ffff88003767fea0] proc_reg_write at ffffffff811cf31e
>> #11 [ffff88003767fef0] vfs_write at ffffffff8116c818
>> #12 [ffff88003767ff30] sys_write at ffffffff8116d251
>> #13 [ffff88003767ff80] system_call_fastpath at ffffffff81013172
>>     RIP: 00000030f80d4230  RSP: 00007fff8a126470  RFLAGS: 00010206
>>     RAX: 0000000000000001  RBX: ffffffff81013172  RCX: 0000000000000400
>>     RDX: 0000000000000002  RSI: 00007fdabafea000  RDI: 0000000000000001
>>     RBP: 00007fdabafea000   R8: 000000000000000a   R9: 00007fdabafde700
>>     R10: 00000000ffffffff  R11: 0000000000000246  R12: 0000000000000002
>>     R13: 00000030f8379780  R14: 0000000000000002  R15: 00000030f8379780
>>     ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
>> crash>
>>
>> I wrote a sample (not finished). It only works on x86_64 (both host and guest).
>> I used it to create a core file:
>> # readelf -h /tmp/vm2.save
>> ELF Header:
>>   Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
>>   Class:                             ELF64
>>   Data:                              2's complement, little endian
>>   Version:                           1 (current)
>>   OS/ABI:                            UNIX - System V
>>   ABI Version:                       0
>>   Type:                              CORE (Core file)
>>   Machine:                           Advanced Micro Devices X86-64
>>   Version:                           0x1
>>   Entry point address:               0x0
>>   Start of program headers:          64 (bytes into file)
>>   Start of section headers:          0 (bytes into file)
>>   Flags:                             0x0
>>   Size of this header:               64 (bytes)
>>   Size of program headers:           56 (bytes)
>>   Number of program headers:         9
>>   Size of section headers:           0 (bytes)
>>   Number of section headers:         0
>>   Section header string table index: 0
>> # readelf -l /tmp/vm2.save
>>
>> Elf file type is CORE (Core file)
>> Entry point 0x0
>> There are 9 program headers, starting at offset 64
>>
>> Program Headers:
>>   Type           Offset             VirtAddr           PhysAddr
>>                  FileSiz            MemSiz              Flags  Align
>>   NOTE           0x0000000000000238 0x0000000000000000 0x0000000000000000
>>                  0x00000000000002c8 0x00000000000002c8         0
>>   LOAD           0x0000000000000500 0xffffffff81000000 0x0000000001000000
>>                  0x000000001f000000 0x000000001f000000         0
>>   LOAD           0x000000001f000500 0x0000000000000000 0x0000000000000000
>>                  0x0000000001000000 0x0000000001000000         0
>>   LOAD           0x0000000020000500 0x0000000000000000 0x0000000020000000
>>                  0x0000000000020000 0x0000000000020000         0
>>   LOAD           0x0000000020020500 0x0000000000000000 0x0000000020870000
>>                  0x0000000000010000 0x0000000000010000         0
>>   LOAD           0x0000000020030500 0x0000000000000000 0x0000000020850000
>>                  0x0000000000020000 0x0000000000020000         0
>>   LOAD           0x0000000020050500 0x0000000000000000 0x0000000020840000
>>                  0x0000000000010000 0x0000000000010000         0
>>   LOAD           0x0000000020060500 0x0000000000000000 0x0000000020040000
>>                  0x0000000000800000 0x0000000000800000         0
>>   LOAD           0x0000000020860500 0x0000000000000000 0x0000000020020000
>>                  0x0000000000020000 0x0000000000020000         0
>>
>> I can use crash to analyze the file, but I cannot use gdb to analyze it.
>> # gdb /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux /tmp/vm2.save
>> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
>> Copyright (C) 2010 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later
>> <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show
>> copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from
>> /usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux...done.
>> [New <main task>]
>> [New <main task>]
>> #0  0x8103be8b00000000 in ?? ()
>> (gdb) bt
>> #0  0x8103be8b00000000 in ?? ()
>> Cannot access memory at address 0x8170dec800000000
>> (gdb) q
>>
>> My first and most important question is: is it necessary
>> to continue this work?
>>
>> The sample patch is attached.
>>
>> Thanks
>> Wen Congyang
> 
> From an enterprise/support point of view, the wholesale replacement
> of the current use of the savevm dumpfile format by "virsh dump" with
> this ELF style format would be a *huge* improvement. 

Yes, fully agree. Would be cool if that could actually work for both
crash and gdb. Looking forward!

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [RFC] dump memory when host pci device is used by guest
  2011-11-18 12:46   ` Jan Kiszka
@ 2011-11-21  8:06     ` Wen Congyang
  2011-11-26 10:27       ` Jan Kiszka
  0 siblings, 1 reply; 7+ messages in thread
From: Wen Congyang @ 2011-11-21  8:06 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Dave Anderson, qemu-devel

At 11/18/2011 08:46 PM, Jan Kiszka wrote:
> On 2011-11-16 14:29, Dave Anderson wrote:
>>
>>
>> ----- Original Message -----
>>> Hi, all
>>>
>>> 'virsh dump' does not work when a host PCI device is used by the guest.
>>> We have discussed this issue here:
>>> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
>>>
>>> We have decided to introduce a new command, dump, to dump memory.
>>> The core file's format can be ELF.
>>>
>>> I created a kdump-elf vmcore and found that it can be used by both
>>> crash and gdb:
>>>

>> From an enterprise/support point of view, the wholesale replacement
>> of the current use of the savevm dumpfile format by "virsh dump" with
>> this ELF style format would be a *huge* improvement. 
> 
> Yes, fully agree. Would be cool if that could actually work for both
> crash and gdb. Looking forward!

Because the memory size of an x86 machine can be greater than 4G, we should
create an ELF64 format core file even for a 32-bit OS.
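
To illustrate the limit (a rough sketch only, not part of the patch): the
Elf32_Phdr fields p_offset, p_paddr and p_filesz are all 32 bits wide, so a
segment that starts or ends above 4G cannot be described in ELF32. The helper
name fits_in_elf32() is hypothetical, and the Linux <elf.h> typedefs are
assumed.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: Linux <elf.h>, where the ELF32 program header fields are 32-bit. */
static_assert(sizeof(Elf32_Addr) == 4 && sizeof(Elf32_Off) == 4 &&
              sizeof(Elf32_Word) == 4, "ELF32 fields are expected to be 32-bit");

/*
 * Hypothetical helper: returns true if a memory segment can be described by
 * an Elf32_Phdr, i.e. its file offset, physical address and length all fit
 * in 32 bits and the segment does not extend past 4G.
 */
bool fits_in_elf32(uint64_t file_offset, uint64_t paddr, uint64_t length)
{
    const uint64_t four_gib = (uint64_t)1 << 32;

    if (file_offset > UINT32_MAX || paddr > UINT32_MAX ||
        length > UINT32_MAX) {
        return false;
    }
    /* All three values are below 2^32 here, so the sums cannot wrap. */
    return file_offset + length <= four_gib && paddr + length <= four_gib;
}
```

For example, fits_in_elf32(0x500, 0x1000000, 0x1f000000) holds, but any RAM
block at or above physical address 0x100000000 does not fit, which is why
ELF64 is needed even when the guest OS is 32-bit.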

I created a vmcore: the guest OS is 32-bit, and the vmcore is in ELF64 format.
I can use crash to analyze it, but gdb cannot do the same.

I also created a kdump-elf64 vmcore on a 32-bit machine, and gdb still cannot
analyze it.

Does gdb support ELF64 format core files on an x86 box?

Thanks
Wen Congyang

> 
> Jan
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [RFC] dump memory when host pci device is used by guest
  2011-11-21  8:06     ` Wen Congyang
@ 2011-11-26 10:27       ` Jan Kiszka
  2011-11-26 21:45         ` Sergio Durigan Junior
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2011-11-26 10:27 UTC (permalink / raw)
  To: Wen Congyang; +Cc: Dave Anderson, qemu-devel, Sergio Durigan Junior

[-- Attachment #1: Type: text/plain, Size: 1445 bytes --]

On 2011-11-21 06:06, Wen Congyang wrote:
> At 11/18/2011 08:46 PM, Jan Kiszka wrote:
>> On 2011-11-16 14:29, Dave Anderson wrote:
>>>
>>>
>>> ----- Original Message -----
>>>> Hi, all
>>>>
>>>> 'virsh dump' does not work when a host PCI device is used by the guest.
>>>> We have discussed this issue here:
>>>> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
>>>>
>>>> We have decided to introduce a new command, dump, to dump memory.
>>>> The core file's format can be ELF.
>>>>
>>>> I created a kdump-elf vmcore and found that it can be used by both
>>>> crash and gdb:
>>>>
> 
>>> From an enterprise/support point of view, the wholesale replacement
>>> of the current use of the savevm dumpfile format by "virsh dump" with
>>> this ELF style format would be a *huge* improvement. 
>>
>> Yes, fully agree. Would be cool if that could actually work for both
>> crash and gdb. Looking forward!
> 
> Because the memory size of an x86 machine can be greater than 4G, we should
> create an ELF64 format core file even for a 32-bit OS.
> 
> I created a vmcore: the guest OS is 32-bit, and the vmcore is in ELF64 format.
> I can use crash to analyze it, but gdb cannot do the same.
> 
> I also created a kdump-elf64 vmcore on a 32-bit machine, and gdb still cannot
> analyze it.
> 
> Does gdb support ELF64 format core files on an x86 box?
> 

Dunno, but I'm trying to pull in some interested gdb folks.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [RFC] dump memory when host pci device is used by guest
  2011-11-26 10:27       ` Jan Kiszka
@ 2011-11-26 21:45         ` Sergio Durigan Junior
  0 siblings, 0 replies; 7+ messages in thread
From: Sergio Durigan Junior @ 2011-11-26 21:45 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Dave Anderson, qemu-devel

Jan Kiszka <jan.kiszka@web.de> writes:

> On 2011-11-21 06:06, Wen Congyang wrote:
>> At 11/18/2011 08:46 PM, Jan Kiszka wrote:
>>> On 2011-11-16 14:29, Dave Anderson wrote:
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> Hi, all
>>>>>
>>>>> 'virsh dump' does not work when a host PCI device is used by the guest.
>>>>> We have discussed this issue here:
>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
>>>>>
>>>>> We have decided to introduce a new command, dump, to dump memory.
>>>>> The core file's format can be ELF.
>>>>>
>>>>> I created a kdump-elf vmcore and found that it can be used by both
>>>>> crash and gdb:
>>>>>
>> 
>>>> From an enterprise/support point of view, the wholesale replacement
>>>> of the current use of the savevm dumpfile format by "virsh dump" with
>>>> this ELF style format would be a *huge* improvement. 
>>>
>>> Yes, fully agree. Would be cool if that could actually work for both
>>> crash and gdb. Looking forward!
>> 
>> Because the memory size of an x86 machine can be greater than 4G, we should
>> create an ELF64 format core file even for a 32-bit OS.
>> 
>> I created a vmcore: the guest OS is 32-bit, and the vmcore is in ELF64 format.
>> I can use crash to analyze it, but gdb cannot do the same.
>> 
>> I also created a kdump-elf64 vmcore on a 32-bit machine, and gdb still cannot
>> analyze it.
>> 
>> Does gdb support ELF64 format core files on an x86 box?
>> 
>
> Dunno, but I'm trying to pull in some interested gdb folks.

Hello,

IIRC, GDB supports ELF64 corefiles on x86 boxes, but only when
configured with `--enable-64-bit-bfd'.  Otherwise, it won't be able to
properly understand the format.
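
To see which case applies to a given core file, the ELF class byte can be
checked before loading the file in GDB. The following is only a minimal
sketch, assuming the Linux <elf.h> constants; elf_ident_class() and
core_elf_class() are hypothetical helper names, not GDB or QEMU functions.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns ELFCLASS32 or ELFCLASS64 for a valid e_ident array, or
 * ELFCLASSNONE if the magic bytes or the class byte are not recognized.
 */
int elf_ident_class(const unsigned char ident[EI_NIDENT])
{
    assert(ident != NULL);

    if (memcmp(ident, ELFMAG, SELFMAG) != 0) {
        return ELFCLASSNONE;
    }
    if (ident[EI_CLASS] == ELFCLASS32 || ident[EI_CLASS] == ELFCLASS64) {
        return ident[EI_CLASS];
    }
    return ELFCLASSNONE;
}

/*
 * Reads e_ident from the file at path and returns its ELF class,
 * or -1 if the file cannot be opened or is shorter than EI_NIDENT bytes.
 */
int core_elf_class(const char *path)
{
    unsigned char ident[EI_NIDENT];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL) {
        return -1;
    }
    n = fread(ident, 1, sizeof(ident), f);
    fclose(f);
    if (n != sizeof(ident)) {
        return -1;
    }
    return elf_ident_class(ident);
}
```

If core_elf_class() reports ELFCLASS64 for a vmcore, then per the above the
GDB in use would need to have been built with 64-bit BFD support.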

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] [RFC][PATCH] introduce a new monitor command 'dump' to dump guest's memory
  2011-11-16  8:27 [Qemu-devel] [RFC] dump memory when host pci device is used by guest Wen Congyang
  2011-11-16 16:29 ` Dave Anderson
@ 2011-11-29  5:41 ` Wen Congyang
  1 sibling, 0 replies; 7+ messages in thread
From: Wen Congyang @ 2011-11-29  5:41 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson

'virsh dump' does not work when a host PCI device is used by the guest. We have
discussed this issue here:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html

We have decided to introduce a new command, dump, to dump memory. The core
file's format is ELF.

Note:
1. The guest must be x86 or x86_64. Other architectures are not supported.
2. gdb sometimes cannot convert a virtual address to a physical address, because
   the PT_LOAD headers in the core file do not contain enough information.
3. An old gdb may crash on these files. I use gdb-7.3.1, and it does not crash.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target |    8 +-
 dump.c          |  772 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 dump.h          |    6 +
 hmp-commands.hx |   16 ++
 monitor.c       |    3 +
 qmp-commands.hx |   24 ++
 6 files changed, 825 insertions(+), 4 deletions(-)
 create mode 100644 dump.c
 create mode 100644 dump.h
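
As an illustration of note 2 (a sketch only, not code from this patch): a
debugger that relies purely on the program headers can resolve a virtual
address only if some PT_LOAD entry covers it. The function name
vaddr_to_file_offset() is hypothetical, and the header fields are assumed to
already be in host byte order.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Looks up vaddr in the PT_LOAD entries of an ELF64 program header table.
 * On success stores the file offset of that address in *off and returns 0;
 * returns -1 if no PT_LOAD segment covers vaddr.
 */
int vaddr_to_file_offset(const Elf64_Phdr *phdrs, size_t n,
                         uint64_t vaddr, uint64_t *off)
{
    size_t i;

    assert(phdrs != NULL || n == 0);
    assert(off != NULL);

    for (i = 0; i < n; i++) {
        const Elf64_Phdr *p = &phdrs[i];

        if (p->p_type != PT_LOAD) {
            continue;
        }
        /* The subtraction is safe because vaddr >= p_vaddr is checked first. */
        if (vaddr >= p->p_vaddr && vaddr - p->p_vaddr < p->p_filesz) {
            *off = p->p_offset + (vaddr - p->p_vaddr);
            return 0;
        }
    }
    return -1;
}
```

With the headers from the readelf output earlier in this thread, an address
inside the kernel text mapping (VirtAddr 0xffffffff81000000) resolves, while a
direct-mapped address such as 0xffff88003767fe18 does not, because the
remaining PT_LOAD entries all have VirtAddr 0.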

diff --git a/Makefile.target b/Makefile.target
index a111521..a4f0e6d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -118,7 +118,7 @@ $(call set-vpath, $(SRC_PATH)/linux-user:$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR
 QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) -I$(SRC_PATH)/linux-user
 obj-y = main.o syscall.o strace.o mmap.o signal.o thunk.o \
       elfload.o linuxload.o uaccess.o gdbstub.o cpu-uname.o \
-      user-exec.o $(oslib-obj-y)
+      user-exec.o $(oslib-obj-y) dump.o
 
 obj-$(TARGET_HAS_BFLT) += flatload.o
 
@@ -156,7 +156,7 @@ LDFLAGS+=-Wl,-segaddr,__STD_PROG_ZONE,0x1000 -image_base 0x0e000000
 LIBS+=-lmx
 
 obj-y = main.o commpage.o machload.o mmap.o signal.o syscall.o thunk.o \
-        gdbstub.o user-exec.o
+        gdbstub.o user-exec.o dump.o
 
 obj-i386-y += ioport-user.o
 
@@ -178,7 +178,7 @@ $(call set-vpath, $(SRC_PATH)/bsd-user)
 QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ARCH)
 
 obj-y = main.o bsdload.o elfload.o mmap.o signal.o strace.o syscall.o \
-        gdbstub.o uaccess.o user-exec.o
+        gdbstub.o uaccess.o user-exec.o dump.o
 
 obj-i386-y += ioport-user.o
 
@@ -194,7 +194,7 @@ endif #CONFIG_BSD_USER
 # System emulator target
 ifdef CONFIG_SOFTMMU
 
-obj-y = arch_init.o cpus.o monitor.o machine.o gdbstub.o balloon.o ioport.o
+obj-y = arch_init.o cpus.o monitor.o machine.o gdbstub.o balloon.o ioport.o dump.o
 # virtio has to be here due to weird dependency between PCI and virtio-net.
 # need to fix this properly
 obj-$(CONFIG_NO_PCI) += pci-stub.o
diff --git a/dump.c b/dump.c
new file mode 100644
index 0000000..0117ac3
--- /dev/null
+++ b/dump.c
@@ -0,0 +1,772 @@
+/*
+ * QEMU live dump
+ *
+ * Copyright Fujitsu, Corp. 2011
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include <unistd.h>
+#include <elf.h>
+#include <sys/procfs.h>
+#include "cpu.h"
+#include "cpu-all.h"
+#include "targphys.h"
+#include "monitor.h"
+#include "kvm.h"
+#include "dump.h"
+#include "sysemu.h"
+#include "bswap.h"
+
+static inline int cpuid(CPUState *env)
+{
+#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
+    return env->host_tid;
+#else
+    return env->cpu_index + 1;
+#endif
+}
+
+#if defined(TARGET_I386)
+
+#ifdef TARGET_X86_64
+typedef struct {
+    target_ulong r15, r14, r13, r12, rbp, rbx, r11, r10;
+    target_ulong r9, r8, rax, rcx, rdx, rsi, rdi, orig_rax;
+    target_ulong rip, cs, eflags;
+    target_ulong rsp, ss;
+    target_ulong fs_base, gs_base;
+    target_ulong ds, es, fs, gs;
+} x86_64_user_regs_struct;
+
+static int x86_64_write_elf64_note(Monitor *mon, int fd, CPUState *env,
+                                   target_phys_addr_t *offset)
+{
+    x86_64_user_regs_struct regs;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int id = cpuid(env);
+    int ret;
+
+    regs.r15 = env->regs[15];
+    regs.r14 = env->regs[14];
+    regs.r13 = env->regs[13];
+    regs.r12 = env->regs[12];
+    regs.r11 = env->regs[11];
+    regs.r10 = env->regs[10];
+    regs.r9  = env->regs[9];
+    regs.r8  = env->regs[8];
+    regs.rbp = env->regs[R_EBP];
+    regs.rsp = env->regs[R_ESP];
+    regs.rdi = env->regs[R_EDI];
+    regs.rsi = env->regs[R_ESI];
+    regs.rdx = env->regs[R_EDX];
+    regs.rcx = env->regs[R_ECX];
+    regs.rbx = env->regs[R_EBX];
+    regs.rax = env->regs[R_EAX];
+    regs.rip = env->eip;
+    regs.eflags = env->eflags;
+
+    regs.orig_rax = 0; /* FIXME */
+    regs.cs = env->segs[R_CS].selector;
+    regs.ss = env->segs[R_SS].selector;
+    regs.fs_base = env->segs[R_FS].base;
+    regs.gs_base = env->segs[R_GS].base;
+    regs.ds = env->segs[R_DS].selector;
+    regs.es = env->segs[R_ES].selector;
+    regs.fs = env->segs[R_FS].selector;
+    regs.gs = env->segs[R_GS].selector;
+
+    descsz = 336; /* sizeof(prstatus_t) is 336 on x86_64 box */
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 32, &id, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_64_user_regs_struct)-sizeof(target_ulong);
+    memcpy(buf, &regs, sizeof(x86_64_user_regs_struct));
+
+    lseek(fd, *offset, SEEK_SET);
+    ret = write(fd, note, note_size);
+    g_free(note);
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write elf prstatus.\n");
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+#endif
+
+/* This function is copied from crash, gdb also needs this virtual address */
+static target_ulong get_phys_base_addr(CPUState *env, target_ulong *base_vaddr)
+{
+    int i;
+    target_ulong kernel_base = -1;
+    target_ulong last, mask;
+
+    for (i = 30, last = -1; (kernel_base == -1) && (i >= 20); i--) {
+        mask = ~((1LL << i) - 1);
+        *base_vaddr = env->idt.base & mask;
+        if (*base_vaddr == last) {
+            continue;
+        }
+
+        kernel_base = cpu_get_phys_page_debug(env, *base_vaddr);
+        last = *base_vaddr;
+    }
+
+    return kernel_base;
+}
+
+/* This function is like get_phys_base_addr(). */
+static target_ulong get_page_offset_addr(CPUState *env,
+                                         target_ulong *page_offset_vaddr)
+{
+    int i;
+    target_ulong kernel_base = -1;
+    target_ulong last, mask;
+
+    for (i = 30, last = -1; (kernel_base == -1) && (i >= 20); i--) {
+        mask = ~((1LL << i) - 1);
+        *page_offset_vaddr = env->gdt.base & mask;
+        if (*page_offset_vaddr == last) {
+            continue;
+        }
+
+        kernel_base = cpu_get_phys_page_debug(env, *page_offset_vaddr);
+        last = *page_offset_vaddr;
+    }
+
+    return kernel_base;
+}
+
+typedef struct {
+    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
+    unsigned short ds, __ds, es, __es;
+    unsigned short fs, __fs, gs, __gs;
+    uint32_t orig_eax, eip;
+    unsigned short cs, __cs;
+    uint32_t eflags, esp;
+    unsigned short ss, __ss;
+} x86_user_regs_struct;
+
+static int x86_write_elf64_note(Monitor *mon, int fd, CPUState *env,
+                                target_phys_addr_t *offset)
+{
+    x86_user_regs_struct regs;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int id = cpuid(env);
+    int ret;
+
+    regs.ebp = env->regs[R_EBP] & 0xffffffff;
+    regs.esp = env->regs[R_ESP] & 0xffffffff;
+    regs.edi = env->regs[R_EDI] & 0xffffffff;
+    regs.esi = env->regs[R_ESI] & 0xffffffff;
+    regs.edx = env->regs[R_EDX] & 0xffffffff;
+    regs.ecx = env->regs[R_ECX] & 0xffffffff;
+    regs.ebx = env->regs[R_EBX] & 0xffffffff;
+    regs.eax = env->regs[R_EAX] & 0xffffffff;
+    regs.eip = env->eip & 0xffffffff;
+    regs.eflags = env->eflags & 0xffffffff;
+
+    regs.cs = env->segs[R_CS].selector;
+    regs.__cs = 0;
+    regs.ss = env->segs[R_SS].selector;
+    regs.__ss = 0;
+    regs.ds = env->segs[R_DS].selector;
+    regs.__ds = 0;
+    regs.es = env->segs[R_ES].selector;
+    regs.__es = 0;
+    regs.fs = env->segs[R_FS].selector;
+    regs.__fs = 0;
+    regs.gs = env->segs[R_GS].selector;
+    regs.__gs = 0;
+
+    descsz = 144; /* sizeof(prstatus_t) is 144 on x86 box */
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 24, &id, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_user_regs_struct)-4;
+    memcpy(buf, &regs, sizeof(x86_user_regs_struct));
+
+    lseek(fd, *offset, SEEK_SET);
+    ret = write(fd, note, note_size);
+    g_free(note);
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write elf prstatus.\n");
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+
+static int x86_write_elf32_note(Monitor *mon, int fd, CPUState *env,
+                                target_phys_addr_t *offset)
+{
+    x86_user_regs_struct regs;
+    Elf32_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int id = cpuid(env);
+    int ret;
+
+    regs.ebp = env->regs[R_EBP] & 0xffffffff;
+    regs.esp = env->regs[R_ESP] & 0xffffffff;
+    regs.edi = env->regs[R_EDI] & 0xffffffff;
+    regs.esi = env->regs[R_ESI] & 0xffffffff;
+    regs.edx = env->regs[R_EDX] & 0xffffffff;
+    regs.ecx = env->regs[R_ECX] & 0xffffffff;
+    regs.ebx = env->regs[R_EBX] & 0xffffffff;
+    regs.eax = env->regs[R_EAX] & 0xffffffff;
+    regs.eip = env->eip & 0xffffffff;
+    regs.eflags = env->eflags & 0xffffffff;
+
+    regs.cs = env->segs[R_CS].selector;
+    regs.__cs = 0;
+    regs.ss = env->segs[R_SS].selector;
+    regs.__ss = 0;
+    regs.ds = env->segs[R_DS].selector;
+    regs.__ds = 0;
+    regs.es = env->segs[R_ES].selector;
+    regs.__es = 0;
+    regs.fs = env->segs[R_FS].selector;
+    regs.__fs = 0;
+    regs.gs = env->segs[R_GS].selector;
+    regs.__gs = 0;
+
+    descsz = 144; /* sizeof(prstatus_t) is 144 on x86 box */
+    note_size = ((sizeof(Elf32_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf32_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 24, &id, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_user_regs_struct)-4;
+    memcpy(buf, &regs, sizeof(x86_user_regs_struct));
+
+    lseek(fd, *offset, SEEK_SET);
+    ret = write(fd, note, note_size);
+    g_free(note);
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write elf prstatus.\n");
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+#endif
+
+static int write_elf64_header(Monitor *mon, int fd, int phdr_num, int machine,
+                              int endian)
+{
+    Elf64_Ehdr elf_header;
+    int ret;
+
+    memset(&elf_header, 0, sizeof(Elf64_Ehdr));
+    memcpy(&elf_header, ELFMAG, 4);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS64;
+    elf_header.e_ident[EI_DATA] = endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_to_le16(ET_CORE);
+    elf_header.e_machine = cpu_to_le16(machine);
+    elf_header.e_version = cpu_to_le32(EV_CURRENT);
+    elf_header.e_ehsize = cpu_to_le16(sizeof(elf_header));
+    elf_header.e_phoff = cpu_to_le64(sizeof(Elf64_Ehdr));
+    elf_header.e_phentsize = cpu_to_le16(sizeof(Elf64_Phdr));
+    elf_header.e_phnum = cpu_to_le16(phdr_num);
+
+    lseek(fd, 0, SEEK_SET);
+    ret = write(fd, &elf_header, sizeof(elf_header));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_header(Monitor *mon, int fd, int phdr_num, int machine,
+                              int endian)
+{
+    Elf32_Ehdr elf_header;
+    int ret;
+
+    memset(&elf_header, 0, sizeof(Elf32_Ehdr));
+    memcpy(&elf_header, ELFMAG, 4);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS32;
+    elf_header.e_ident[EI_DATA] = endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_to_le16(ET_CORE);
+    elf_header.e_machine = cpu_to_le16(machine);
+    elf_header.e_version = cpu_to_le32(EV_CURRENT);
+    elf_header.e_ehsize = cpu_to_le16(sizeof(elf_header));
+    elf_header.e_phoff = cpu_to_le32(sizeof(Elf32_Ehdr));
+    elf_header.e_phentsize = cpu_to_le16(sizeof(Elf32_Phdr));
+    elf_header.e_phnum = cpu_to_le16(phdr_num);
+
+    lseek(fd, 0, SEEK_SET);
+    ret = write(fd, &elf_header, sizeof(elf_header));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_load(Monitor *mon, int fd, RAMBlock *block,
+                            int phdr_index, target_phys_addr_t *offset,
+                            target_ulong base_vaddr)
+{
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_to_le32(PT_LOAD);
+    phdr.p_offset = cpu_to_le64(*offset);
+    phdr.p_paddr = cpu_to_le64(block->offset);
+    phdr.p_filesz = cpu_to_le64(block->length);
+    phdr.p_memsz = cpu_to_le64(block->length);
+    phdr.p_vaddr = cpu_to_le64(base_vaddr);
+
+    phdr_offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*phdr_index;
+    lseek(fd, phdr_offset, SEEK_SET);
+    ret = write(fd, &phdr, sizeof(Elf64_Phdr));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    lseek(fd, *offset, SEEK_SET);
+    ret = write(fd, block->host, block->length);
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program segment.\n");
+        return -1;
+    }
+    *offset += block->length;
+
+    return 0;
+}
+
+static int write_elf32_load(Monitor *mon, int fd, RAMBlock *block,
+                            int phdr_index, target_phys_addr_t *offset,
+                            target_ulong base_vaddr)
+{
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_to_le32(PT_LOAD);
+    phdr.p_offset = cpu_to_le32(*offset);
+    phdr.p_paddr = cpu_to_le32(block->offset);
+    phdr.p_filesz = cpu_to_le32(block->length);
+    phdr.p_memsz = cpu_to_le32(block->length);
+    phdr.p_vaddr = cpu_to_le32(base_vaddr);
+
+    phdr_offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*phdr_index;
+    lseek(fd, phdr_offset, SEEK_SET);
+    ret = write(fd, &phdr, sizeof(Elf32_Phdr));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    lseek(fd, *offset, SEEK_SET);
+    ret = write(fd, block->host, block->length);
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program segment.\n");
+        return -1;
+    }
+    *offset += block->length;
+
+    return 0;
+}
+
+static int write_elf_load(Monitor *mon, int fd, RAMBlock *block,
+                          int phdr_index, target_phys_addr_t *offset,
+                          target_ulong virtual_addr,
+                          target_phys_addr_t phys_addr,
+                          target_phys_addr_t length, int type)
+{
+    RAMBlock temp_block;
+    int ret;
+
+    temp_block.host = block->host + (phys_addr - block->offset);
+    temp_block.offset = phys_addr;
+    temp_block.length = length;
+    if (type == 1) {
+        ret = write_elf64_load(mon, fd, &temp_block, phdr_index,
+                               offset, virtual_addr);
+    } else {
+        ret = write_elf32_load(mon, fd, &temp_block, phdr_index,
+                               offset, virtual_addr);
+    }
+    return ret;
+}
+
+static int write_elf64_notes(Monitor *mon, int fd, int phdr_index,
+                             target_phys_addr_t *offset, bool lma)
+{
+    CPUState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+#if defined(TARGET_I386)
+#ifdef TARGET_X86_64
+        if (lma) {
+            ret = x86_64_write_elf64_note(mon, fd, env, offset);
+        } else {
+#endif
+            ret = x86_write_elf64_note(mon, fd, env, offset);
+#ifdef TARGET_X86_64
+        }
+#endif
+#else
+        ret = -1; /* Not supported */
+#endif
+
+        if (ret < 0) {
+            monitor_printf(mon, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_to_le32(PT_NOTE);
+    phdr.p_offset = cpu_to_le64(begin);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_to_le64(*offset - begin);
+    phdr.p_memsz = cpu_to_le64(*offset - begin);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf64_Ehdr);
+    lseek(fd, phdr_offset, SEEK_SET);
+    ret = write(fd, &phdr, sizeof(Elf64_Phdr));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_notes(Monitor *mon, int fd, int phdr_index,
+                             target_phys_addr_t *offset)
+{
+    CPUState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+#if defined(TARGET_I386)
+        ret = x86_write_elf32_note(mon, fd, env, offset);
+#else
+        ret = -1; /* Not supported */
+#endif
+
+        if (ret < 0) {
+            monitor_printf(mon, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_to_le32(PT_NOTE);
+    phdr.p_offset = cpu_to_le32(begin);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_to_le32(*offset - begin);
+    phdr.p_memsz = cpu_to_le32(*offset - begin);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf32_Ehdr);
+    lseek(fd, phdr_offset, SEEK_SET);
+    ret = write(fd, &phdr, sizeof(Elf32_Phdr));
+    if (ret < 0) {
+        monitor_printf(mon, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int create_vmcore(Monitor *mon, int fd)
+{
+    CPUState *env;
+    target_ulong kernel_base = -1, base_vaddr;
+    target_ulong page_offset = -1, page_offset_vaddr;
+    target_phys_addr_t offset;
+    int phdr_num, phdr_index;
+    RAMBlock *block;
+    bool lma = false;
+    int ret;
+    int type, machine, endian;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        cpu_synchronize_state(env);
+    }
+
+#if defined(TARGET_I386)
+
+#ifdef TARGET_X86_64
+    lma = !!(first_cpu->hflags & HF_LMA_MASK);
+#endif
+
+    kernel_base = get_phys_base_addr(first_cpu, &base_vaddr);
+    if (kernel_base == -1) {
+        monitor_printf(mon, "dump: can not get phys_base\n");
+        return -1;
+    }
+    page_offset = get_page_offset_addr(first_cpu, &page_offset_vaddr);
+    if (page_offset == -1) {
+        monitor_printf(mon, "dump: can not get page_offset\n");
+        return -1;
+    }
+#endif
+
+#if defined(TARGET_I386)
+    if (lma) {
+        machine = EM_X86_64;
+    } else {
+        machine = EM_386;
+    }
+    endian = ELFDATA2LSB;
+#else
+    monitor_printf(mon, "dump: unsupported target.\n");
+    return -1;
+#endif
+
+    if (sizeof(ram_addr_t) == 4) {
+        type = 0; /* use elf32 */
+#if defined(TARGET_I386)
+    } else if (!lma) {
+        type = 0; /* the guest os is not in IA-32e mode */
+#endif
+    } else {
+        type = 1; /* use elf64 */
+    }
+
+    phdr_num = 1; /* PT_NOTE */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+#if defined(TARGET_I386)
+        if (kernel_base > block->offset &&
+            kernel_base < (block->offset + block->length)) {
+            phdr_num++;
+        }
+        if (page_offset != kernel_base && page_offset > block->offset &&
+            page_offset < (block->offset + block->length)) {
+            phdr_num++;
+        }
+        if (!lma && (block->offset + block->length > UINT_MAX)) {
+            type = 1; /* The memory size is greater than 4G */
+        }
+#endif
+        phdr_num++;
+    }
+
+    if (type == 1) {
+        ret = write_elf64_header(mon, fd, phdr_num, machine, endian);
+    } else {
+        ret = write_elf32_header(mon, fd, phdr_num, machine, endian);
+    }
+    if (ret < 0) {
+        return -1;
+    }
+
+    phdr_index = 0;
+    if (type == 1) {
+        offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*phdr_num;
+        ret = write_elf64_notes(mon, fd, phdr_index++, &offset, lma);
+    } else {
+        offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*phdr_num;
+        ret = write_elf32_notes(mon, fd, phdr_index++, &offset);
+    }
+
+    if (ret < 0) {
+        return -1;
+    }
+
+#if defined(TARGET_I386)
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        bool kernel_base_in_block, page_offset_in_block;
+        target_phys_addr_t paddr, length;
+        target_ulong vaddr;
+
+        kernel_base_in_block = kernel_base >= block->offset &&
+                               kernel_base < (block->offset + block->length);
+        page_offset_in_block = page_offset >= block->offset &&
+                               page_offset < (block->offset + block->length);
+        if (!kernel_base_in_block && !page_offset_in_block) {
+            continue;
+        }
+
+        if (kernel_base_in_block) {
+            paddr = kernel_base;
+            vaddr = base_vaddr;
+            if (page_offset_in_block && page_offset > paddr) {
+                length = page_offset - paddr;
+            } else {
+                length = block->length - (paddr - block->offset);
+            }
+
+            ret = write_elf_load(mon, fd, block, phdr_index++, &offset,
+                                 vaddr, paddr, length, type);
+            if (ret < 0) {
+                return -1;
+            }
+        }
+
+        if (page_offset != kernel_base && page_offset_in_block) {
+            paddr = page_offset;
+            vaddr = page_offset_vaddr;
+            if (kernel_base_in_block && kernel_base > paddr) {
+                length = kernel_base - paddr;
+            } else {
+                length = block->length - (paddr - block->offset);
+            }
+
+            ret = write_elf_load(mon, fd, block, phdr_index++, &offset,
+                                 vaddr, paddr, length, type);
+            if (ret < 0) {
+                return -1;
+            }
+        }
+
+        if ((kernel_base_in_block && kernel_base == block->offset) ||
+            (page_offset_in_block && page_offset == block->offset)) {
+            continue;
+        }
+
+        if (kernel_base_in_block && page_offset_in_block) {
+            if (page_offset >= kernel_base) {
+                length = kernel_base - block->offset;
+            } else {
+                length = page_offset - block->offset;
+            }
+        } else if (kernel_base_in_block) {
+            length = kernel_base - block->offset;
+        } else if (page_offset_in_block) {
+            length = page_offset - block->offset;
+        }
+        paddr = block->offset;
+        vaddr = 0;
+        ret = write_elf_load(mon, fd, block, phdr_index++, &offset,
+                             vaddr, paddr, length, type);
+        if (ret < 0) {
+            return -1;
+        }
+    }
+#endif
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+#if defined(TARGET_I386)
+        if (kernel_base >= block->offset &&
+            kernel_base < (block->offset + block->length)) {
+            continue;
+        }
+        if (page_offset >= block->offset &&
+            page_offset < (block->offset + block->length)) {
+            continue;
+        }
+#endif
+
+        if (type == 1) {
+            ret = write_elf64_load(mon, fd, block, phdr_index++, &offset, 0);
+        } else {
+            ret = write_elf32_load(mon, fd, block, phdr_index++, &offset, 0);
+        }
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+int do_dump(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+    const char *file = qdict_get_str(qdict, "file");
+    const char *p;
+    int fd = -1, ret;
+
+#if !defined(WIN32)
+    if (strstart(file, "fd:", &p)) {
+        fd = monitor_get_fd(mon, p);
+        if (fd == -1) {
+            monitor_printf(mon, "dump: invalid file descriptor"
+                           " identifier\n");
+            return -1;
+        }
+    }
+#endif
+
+    if (strstart(file, "file:", &p)) {
+        fd = open(p, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0600);
+        if (fd < 0) {
+            monitor_printf(mon, "dump: failed to open %s\n", p);
+            return -1;
+        }
+    }
+
+    if (fd == -1) {
+        monitor_printf(mon, "unknown dump protocol: %s\n", file);
+        return -1;
+    }
+
+    vm_stop(RUN_STATE_PAUSED);
+    ret = create_vmcore(mon, fd);
+    /* do_dump() owns fd in both the fd: and file: cases */
+    close(fd);
+
+    return ret;
+}
diff --git a/dump.h b/dump.h
new file mode 100644
index 0000000..c91fa2c
--- /dev/null
+++ b/dump.h
@@ -0,0 +1,6 @@
+#ifndef DUMP_H
+#define DUMP_H
+
+int do_dump(Monitor *mon, const QDict *qdict, QObject **ret_data);
+
+#endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 089c1ac..ebbce8c 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -772,6 +772,22 @@ Migrate to @var{uri} (using -d to not wait for completion).
 ETEXI
 
     {
+        .name       = "dump",
+        .args_type  = "file:s",
+        .params     = "file",
+        .help       = "dump guest memory to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_dump,
+    },
+
+
+STEXI
+@item dump @var{file}
+@findex dump
+Dump the guest's memory to @var{file}, which must start with @code{fd:} or @code{file:}.
+ETEXI
+
+    {
         .name       = "migrate_cancel",
         .args_type  = "",
         .params     = "",
diff --git a/monitor.c b/monitor.c
index 1be222e..9f26c3d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -73,6 +73,9 @@
 #endif
 #include "hw/lm32_pic.h"
 
+/* for dump */
+#include "dump.h"
+
 //#define DEBUG
 //#define DEBUG_COMPLETION
 
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 97975a5..5cf21c5 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -485,6 +485,30 @@ Notes:
 EQMP
 
     {
+        .name       = "dump",
+        .args_type  = "file:s",
+        .params     = "file",
+        .help       = "dump guest memory to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_dump,
+    },
+
+SQMP
+dump
+----
+
+Dump guest memory to a file.
+
+Arguments: "file": dump destination, "fd:NAME" or "file:PATH" (json-string)
+
+Example:
+
+-> { "execute": "dump", "arguments": { "file": "fd:dump" } }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "migrate_cancel",
         .args_type  = "",
         .params     = "",
-- 
1.7.1

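The patch checks each write() only for a negative return, so a short write would leave a truncated segment without any error, and the lseek() results are not checked at all. Below is a minimal sketch of how the seek and write steps could be folded into one checked call. The names dump_open_output() and dump_write_at() are hypothetical and not part of the patch; the sketch uses plain POSIX calls and assumes the descriptor is seekable, which the lseek() calls in the patch already require.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: create or truncate the
 * output file. With O_CREAT, open() takes a third argument that gives
 * the permission bits of the new file.
 */
static int dump_open_output(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
}

/*
 * Hypothetical helper, not part of the patch: write exactly len bytes
 * from buf at file offset off. Returns 0 on success and -1 on error,
 * with errno describing the failure.
 */
static int dump_write_at(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);

        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted before any data was written */
            }
            return -1;
        }
        if (n == 0) {
            errno = EIO; /* no progress: fail instead of looping forever */
            return -1;
        }

        /* Short write: advance and retry with the remainder. */
        p += n;
        off += n;
        len -= (size_t)n;
    }

    return 0;
}
```

With such a helper, a call like dump_write_at(fd, &phdr, sizeof(phdr), phdr_offset) could stand in for each lseek()/write() pair. Because pwrite() leaves the file position untouched, the header table and the segment data can be written in any order.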

end of thread, other threads:[~2011-11-29  5:39 UTC | newest]

Thread overview: 7+ messages
2011-11-16  8:27 [Qemu-devel] [RFC] dump memory when host pci device is used by guest Wen Congyang
2011-11-16 16:29 ` Dave Anderson
2011-11-18 12:46   ` Jan Kiszka
2011-11-21  8:06     ` Wen Congyang
2011-11-26 10:27       ` Jan Kiszka
2011-11-26 21:45         ` Sergio Durigan Junior
2011-11-29  5:41 ` [Qemu-devel] [RFC][PATCH] introduce a new monitor command 'dump' to dump guest's memory Wen Congyang