Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest


From: Wen Congyang <wency@cn.fujitsu.com>
To: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: jan.kiszka@siemens.com, anderson@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
Date: Thu, 15 Dec 2011 16:57:22 +0800
Message-ID: <4EE9B672.1010509@cn.fujitsu.com>
In-Reply-To: <20111215.103006.193686161.d.hatayama@jp.fujitsu.com>

At 12/15/2011 09:30 AM, HATAYAMA Daisuke wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
> Date: Tue, 13 Dec 2011 17:20:24 +0800
> 
>> At 12/13/2011 02:01 PM, HATAYAMA Daisuke wrote:
>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>> Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
>>> Date: Tue, 13 Dec 2011 11:35:53 +0800
>>>
>>>> Hi, hatayama-san
>>>>
>>>> At 12/13/2011 11:12 AM, HATAYAMA Daisuke wrote:
>>>>> Hello Wen,
>>>>>
>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>> Subject: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
>>>>> Date: Fri, 09 Dec 2011 15:57:26 +0800
>>>>>
>>>>>> Hi, all
>>>>>>
>>>>>> 'virsh dump' does not work when a host PCI device is used by the guest.
>>>>>> We have discussed this issue here:
>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
>>>>>>
>>>>>> We have decided to introduce a new command, dump, to dump memory. The
>>>>>> core file's format can be ELF.
>>>>>>
>>>>>> Note:
>>>>>> 1. The guest should be x86 or x86_64. Other architectures are not supported.
>>>>>> 2. If you use an old gdb, gdb may crash. I use gdb-7.3.1, and it does not crash.
>>>>>> 3. If the OS is in the second kernel, gdb may not work well, and crash can
>>>>>>    work by specifying '--machdep phys_base=xxx' in the command line. The
>>>>>>    reason is that the second kernel will update the page table, and we can
>>>>>>    not get the page table for the first kernel.
>>>>>
>>>>> I guess the current implementation still breaks the vmalloc'ed area
>>>>> that needs page tables originally located in the first 640kB, right?
>>>>> To do this correctly, you need to identify the position of the
>>>>> backup region and get the data of the 1st kernel's page tables.
>>>>
>>>> I do not know anything about the vmalloc'ed area. Can you explain it
>>>> in more detail?
>>>>
>>>
>>> It's a memory area that is not straight-mapped. To read the area, it's
>>> necessary to look up the guest machine's page tables. If I understand
>>> correctly, your current implementation translates the vmalloc'ed area
>>> so that the generated vmcore is linearly mapped w.r.t. virtual address
>>> for gdb to work.
>>
>> Do you mean the page table for the vmalloc'ed area is stored in the first
>> 640KB, and it may be overwritten by the second kernel (this region has been
>> backed up)?
>>
> 
> This might be wrong. I've tried to confirm this locally but have not
> finished yet.
> 
> I confirmed that at least pglist_data can be within the first 640kB:
> 
> crash> log
> <cut>
> No NUMA configuration found
> Faking a node at 0000000000000000-000000007f800000
> Bootmem setup node 0 0000000000000000-000000007f800000
>   NODE_DATA [0000000000011000 - 0000000000044fff] <-- this

Only kernels built with CONFIG_NUMA have this. This config is only enabled
on RHEL x86_64. I do not have such an environment on hand now.

>   bootmap [0000000000045000 -  0000000000054eff] pages 10
> (7 early reservations) ==> bootmem [0000000000 - 007f800000]
>   #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
>   #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
> 
> And I once had a vmcore, created after entering the 2nd kernel, where
> I could not see module data using the mod sub-command. That was resolved
> by redirecting reads of those addresses to the corresponding backup region.
> 
> I guess because crash uses the page tables in memory, this affects paging
> badly.
> 
> I want to look into this more but I don't have such a vmcore now because
> I lost them accidentally. I tried to reproduce this several times
> yesterday but didn't succeed. The vmcore above is one of those attempts.
> 
>>>
>>> kdump saves the first 640kB physical memory into the backup region. I
>>> guess, for some vmcores created by the current implementation, gdb and
>>> crash cannot see the vmalloc'ed memory area that needs page tables
>>
>> Hmm, IIRC, crash does not use the CPU's page table. gdb uses the
>> information in PT_LOAD to read the memory area.
>>
> 
> I was confused about this. Your dump command uses the CPU's page table.
> 
> So on the qemu side you can get the page table over the whole physical
> address space, right? If so, the contents themselves are not broken, I think.
> 
>>> placed at the 640kB region, correctly. For example, try to use mod
>>> sub-command. Kernel modules are allocated on vmalloc'ed area.
>>>
>>> I have developed a very similar logic for sadump. Look at sadump.c in
>>> crash. The logic itself is very simple, but debugging information is
>>> necessary. Documentation/kdump/kdump.txt and the following paper
>>> explain the backup region mechanism very well, and the implementation
>>> around there remains the same now.
>>
>> Hmm, we can not use debugging information on the qemu side.
>>
> 
> How about re-reading them later in crash? Users want to see the 1st
> kernel rather than 2nd kernel.

An easy way to see the 1st kernel is to specify --machdep phys_base=xxx in
the command line.

Thanks
Wen Congyang
> 
> To do it, the dump format must be distinguishable by crash.
> Which function in crash reads vmcores created by this command?
> kcore, or netdump?
> 
> Thanks.
> HATAYAMA, Daisuke
> 
>
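
For reference, the backup-region redirection discussed in this thread
(reading addresses in the first 640kB from the copy that kdump saved,
instead of from the memory the 2nd kernel may have reused) could be
sketched roughly as below. This is only an illustration under stated
assumptions: struct backup_region, LOW_MEM_SIZE and remap_to_backup() are
hypothetical names, not code from qemu or crash, and how the backup
region's location is found (it needs debugging information, as noted
above) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the kdump backup region. The real values
 * would have to be recovered from the crashed kernel; this sketch just
 * takes them as given. */
struct backup_region {
    uint64_t src_start;  /* start of the saved low-memory range */
    uint64_t src_size;   /* size of the saved range */
    uint64_t dst_start;  /* physical address where the copy was placed */
};

/* Assumption: the saved range is the first 640kB of physical memory. */
#define LOW_MEM_SIZE (640ULL * 1024)

/* Return the physical address that holds the 1st kernel's data for paddr.
 * Addresses inside the saved range are redirected to the backup copy;
 * all other addresses are returned unchanged. */
static uint64_t remap_to_backup(const struct backup_region *br, uint64_t paddr)
{
    assert(br != NULL);
    if (paddr >= br->src_start && paddr - br->src_start < br->src_size)
        return br->dst_start + (paddr - br->src_start);
    return paddr;
}
```

With a backup region placed at, say, 0x2000000 (an arbitrary value chosen
for illustration), a read of the NODE_DATA address 0x11000 from the log
above would be redirected into the backup copy, while addresses at or
above 640kB would be read in place.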

