From: Wen Congyang <wency@cn.fujitsu.com>
To: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: jan.kiszka@siemens.com, anderson@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
Date: Thu, 15 Dec 2011 16:57:22 +0800 [thread overview]
Message-ID: <4EE9B672.1010509@cn.fujitsu.com> (raw)
In-Reply-To: <20111215.103006.193686161.d.hatayama@jp.fujitsu.com>
At 12/15/2011 09:30 AM, HATAYAMA Daisuke wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
> Date: Tue, 13 Dec 2011 17:20:24 +0800
>
>> At 12/13/2011 02:01 PM, HATAYAMA Daisuke wrote:
>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>> Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
>>> Date: Tue, 13 Dec 2011 11:35:53 +0800
>>>
>>>> Hi, hatayama-san
>>>>
>>>> At 12/13/2011 11:12 AM, HATAYAMA Daisuke wrote:
>>>>> Hello Wen,
>>>>>
>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>> Subject: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest
>>>>> Date: Fri, 09 Dec 2011 15:57:26 +0800
>>>>>
>>>>>> Hi, all
>>>>>>
>>>>>> 'virsh dump' cannot work when a host pci device is used by the guest. We
>>>>>> have discussed this issue here:
>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
>>>>>>
>>>>>> We have decided to introduce a new command, dump, to dump guest memory.
>>>>>> The core file's format can be ELF.
>>>>>>
>>>>>> Note:
>>>>>> 1. The guest should be x86 or x86_64; other architectures are not supported.
>>>>>> 2. If you use an old gdb, gdb may crash. I use gdb-7.3.1, and it does not crash.
>>>>>> 3. If the OS is in the second kernel, gdb may not work well, but crash can
>>>>>> work if you specify '--machdep phys_base=xxx' on the command line. The
>>>>>> reason is that the second kernel updates the page table, and we cannot
>>>>>> get the page table of the first kernel.
>>>>>
>>>>> I guess the current implementation still breaks vmalloc'ed areas whose
>>>>> page tables were originally located in the first 640kB, right? If you
>>>>> want to handle this correctly, you need to identify the position of the
>>>>> backup region and read the 1st kernel's page tables from there.
>>>>
>>>> I do not know anything about the vmalloc'ed area. Can you explain it in
>>>> more detail?
>>>>
>>>
>>> It's a memory area that is not straight-mapped. To read such an area, you
>>> need to look up the guest machine's page tables. If I understand
>>> correctly, your current implementation translates the vmalloc'ed area so
>>> that the generated vmcore is linearly mapped w.r.t. virtual addresses,
>>> which is what lets gdb work.
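For reference, a minimal sketch of the translation being discussed, assuming
a 4-level x86_64 page table and a hypothetical read_guest_phys() helper for
reading guest physical memory (the names are illustrative, not qemu's actual
API; 2MB/1GB large pages are ignored for brevity):

    #include <stdint.h>

    /* Hypothetical helper: read one 8-byte page-table entry from
     * guest physical memory. */
    extern uint64_t read_guest_phys(uint64_t paddr);

    #define PG_PRESENT 0x1ULL
    #define ADDR_MASK  0x000ffffffffff000ULL

    /* Walk the 4-level x86_64 page table (PML4 -> PDPT -> PD -> PT)
     * to translate a guest virtual address to a physical address.
     * Returns 0 if the address is not mapped. */
    static uint64_t virt_to_phys(uint64_t cr3, uint64_t vaddr)
    {
        static const int shift[] = { 39, 30, 21, 12 };
        uint64_t table = cr3 & ADDR_MASK;

        for (int level = 0; level < 4; level++) {
            uint64_t index = (vaddr >> shift[level]) & 0x1ff;
            uint64_t entry = read_guest_phys(table + index * 8);

            if (!(entry & PG_PRESENT))
                return 0;                   /* not mapped */
            table = entry & ADDR_MASK;
        }
        return table | (vaddr & 0xfff);     /* frame + page offset */
    }

The point is that the walk itself reads guest physical memory, so once the
2nd kernel has rewritten those tables, the translation no longer reflects
the 1st kernel's mappings.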
>>
>> Do you mean that the page table for the vmalloc'ed area is stored in the
>> first 640KB, and that it may be overwritten by the second kernel (this
>> region has been backed up)?
>>
>
> This might be wrong... I have tried to confirm it locally, but I have not
> finished yet.
>
> I did confirm that at least the pglist_data can lie within the first 640kB:
>
> crash> log
> <cut>
> No NUMA configuration found
> Faking a node at 0000000000000000-000000007f800000
> Bootmem setup node 0 0000000000000000-000000007f800000
> NODE_DATA [0000000000011000 - 0000000000044fff] <-- this
Only a kernel built with CONFIG_NUMA has this. That config option is only
enabled on RHEL x86_64, and I do not have such an environment on hand now.
> bootmap [0000000000045000 - 0000000000054eff] pages 10
> (7 early reservations) ==> bootmem [0000000000 - 007f800000]
> #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
>
> And I once had a vmcore, created after entering the 2nd kernel, in which I
> could not see module data with the mod sub-command; this was resolved by
> re-reading the address from the corresponding backup region.
>
> I guess that because crash uses the page tables in memory, this badly
> affects paging.
>
> I want to look into this more, but I do not have such a vmcore right now
> because I lost them accidentally... I tried several times yesterday to
> reproduce the problem but did not succeed. The log above is from one of
> those attempts.
>
>>>
>>> kdump saves the first 640kB of physical memory into the backup region. I
>>> guess that, for some vmcores created by the current implementation, gdb
>>> and crash cannot correctly read vmalloc'ed memory areas whose page tables
>>
>> Hmm, IIRC, crash does not use the CPU's page tables, and gdb uses the
>> information in the PT_LOAD headers to read memory areas.
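As a rough sketch, this is the kind of lookup a PT_LOAD-based reader performs
on an ELF64 vmcore (simplified; a real reader would also consult p_paddr for
physical reads):

    #include <elf.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Find the file offset holding a given virtual address by scanning
     * the program headers of an ELF64 core file. Returns -1 if no
     * PT_LOAD segment covers the address. */
    static long long vaddr_to_file_offset(const Elf64_Phdr *phdr, size_t n,
                                          uint64_t vaddr)
    {
        for (size_t i = 0; i < n; i++) {
            if (phdr[i].p_type == PT_LOAD &&
                vaddr >= phdr[i].p_vaddr &&
                vaddr <  phdr[i].p_vaddr + phdr[i].p_memsz) {
                return phdr[i].p_offset + (vaddr - phdr[i].p_vaddr);
            }
        }
        return -1;
    }

So gdb only sees whatever virtual-to-offset mapping the dump tool wrote into
the program headers; it never walks page tables itself.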
>>
>
> I was confused about this. Your dump command uses the CPU's page tables.
>
> So on the qemu side you can read the page tables across the whole physical
> address space, right? If so, the contents themselves are not broken, I think.
>
>>> are placed in that 640kB region. For example, try the mod sub-command:
>>> kernel modules are allocated in the vmalloc'ed area.
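For what it's worth, the redirection logic itself is small. A sketch, with
the backup region's location (which in reality must be recovered from
debugging information, e.g. the kexec backup fields; all names below are
illustrative):

    #include <stdint.h>

    /* Where the kdump kernel saved the 1st kernel's low memory.
     * These values must come from debugging information. */
    struct backup_region {
        uint64_t src_start;  /* typically 0 */
        uint64_t src_size;   /* typically 640 * 1024 */
        uint64_t dest;       /* physical address of the saved copy */
    };

    /* Redirect a physical read: addresses inside the backed-up low
     * memory must be fetched from the copy, because the 2nd kernel
     * has reused the original pages. */
    static uint64_t redirect_paddr(const struct backup_region *br,
                                   uint64_t paddr)
    {
        if (paddr >= br->src_start &&
            paddr <  br->src_start + br->src_size)
            return br->dest + (paddr - br->src_start);
        return paddr;
    }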
>>>
>>> I have developed very similar logic for sadump; look at sadump.c in
>>> crash. The logic itself is very simple, but debugging information is
>>> necessary. Documentation/kdump/kdump.txt and the following paper explain
>>> the backup region mechanism very well, and the implementation around
>>> there remains the same today.
>>
>> Hmm, we cannot use debugging information on the qemu side.
>>
>
> How about re-reading them later in crash? Users want to see the 1st
> kernel rather than the 2nd kernel.
An easy way to see the 1st kernel is to specify --machdep phys_base=xxx on
the command line.
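For example (the address below is only a placeholder; the real phys_base
depends on where the 2nd kernel was loaded):

    crash --machdep phys_base=0x200000 /path/to/vmlinux vmcore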
Thanks
Wen Congyang
>
> To do that, crash must be able to distinguish this dump format from
> others. Which function in crash reads vmcores created by this command:
> the kcore one, or the netdump one?
>
> Thanks.
> HATAYAMA, Daisuke
>
>
Thread overview: 16+ messages
2011-12-09 7:57 [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest Wen Congyang
2011-12-09 8:06 ` [Qemu-devel] [RFC][PATCH 1/5 v2] Add API to create memory mapping list Wen Congyang
2011-12-13 13:03 ` Jan Kiszka
2011-12-14 8:10 ` Wen Congyang
2011-12-09 8:07 ` [Qemu-devel] [RFC][PATCH 2/5 v2] Add API to check whether a physical address is I/O address Wen Congyang
2011-12-09 8:08 ` [Qemu-devel] [RFC][PATCH 3/5 v2] target-i386: implement cpu_get_memory_mapping() Wen Congyang
2011-12-09 8:09 ` [Qemu-devel] [RFC][PATCH 4/5 v2] Add API to get memory mapping Wen Congyang
2011-12-09 8:09 ` [Qemu-devel] [RFC][PATCH 5/5v2] introduce a new monitor command 'dump' to dump guest's memory Wen Congyang
2011-12-13 3:12 ` [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci device is used by guest HATAYAMA Daisuke
2011-12-13 3:35 ` Wen Congyang
2011-12-13 6:01 ` HATAYAMA Daisuke
2011-12-13 9:20 ` Wen Congyang
2011-12-15 1:30 ` HATAYAMA Daisuke
2011-12-15 8:57 ` Wen Congyang [this message]
2011-12-13 12:55 ` Jan Kiszka
2011-12-14 2:43 ` Wen Congyang