From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: "bhe@redhat.com" <bhe@redhat.com>,
"chaowang@redhat.com" <chaowang@redhat.com>,
"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"ebiederm@xmission.com" <ebiederm@xmission.com>,
"dyoung@redhat.com" <dyoung@redhat.com>,
Vivek Goyal <vgoyal@redhat.com>
Subject: Re: /proc/vmcore mmap() failure issue
Date: Thu, 21 Nov 2013 17:31:46 +0900
Message-ID: <528DC4F2.5040000@jp.fujitsu.com>
In-Reply-To: <0910DD04CBD6DE4193FCF86B9C00BE971C43E3@BPXM01GP.gisp.nec.co.jp>
(2013/11/21 14:00), Atsushi Kumagai wrote:
> Hello Vivek,
>
> On 2013/11/21 0:00:01, kexec <kexec-bounces@lists.infradead.org> wrote:
>>>>>> Is there any chance that you could look into fixing this. I
>>>>>> have no experience writing code for makedumpfile.
>>>>>
>>>>> I'll send a patch to fix this soon.
>>>>
>>>> Thanks Atsushi.
>>>>
>>>> Vivek
>>>
>>> Vivek, could you test this patch ?
>>>
>>> Thanks
>>> Atsushi Kumagai
>>>
>>>
>>> From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
>>> Date: Wed, 20 Nov 2013 10:05:03 +0900
>>> Subject: [PATCH] Disable mmap() for reading fractional pages.
>>>
>>> Since mmap() was introduced on /proc/vmcore, it fails for fractional
>>> pages which don't start or end at page boundary due to kernel issue.
>>> This patch disables mmap() temporarily for fractional pages to avoid
>>> this issue, so mmap() will be used only for aligned pages.
>>>
>>> Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
>>
>> Hi Atsushi,
>>
>> Even with this patch applied I see mmap() failure.
>>
>> mem_map (39)
>> mem_map : ffffea0004e00000
>> pfn_start : 138000
>> pfn_end : 140000
>> read /proc/vmcore with mmap()
>> Excluding unnecessary pages : [100.0 %] |STEP [Excluding
>> unnecessary pages] : 0.035925 seconds
>> Excluding unnecessary pages : [100.0 %] \STEP [Excluding
>> unnecessary pages] : 0.035774 seconds
>> Excluding unnecessary pages : [100.0 %] -STEP [Excluding
>> unnecessary pages] : 0.035229 seconds
>> Copying data : [ 40.9 %] -Can't map
>> [b98fd000-b9cfd000] with mmap()
>> read_from_vmcore: Can't read the dump memory(/proc/vmcore) with mmap().
>> readpage_elf: Can't read the dump memory(/proc/vmcore).
>> readmem: type_addr: 1, addr:bffba000, size:4096
>> read_pfn: Can't get the page data.
>> Resource temporarily unavailable
>> makedumpfile Failed.
>> kdump: saving vmcore failed
>>
>> Following is part of /proc/iomem on my system.
>>
>> 00100000-bffc283f : System RAM
>> 01000000-018c551d : Kernel code
>> 018c551e-01ef3f3f : Kernel data
>> 0204a000-02984fff : Kernel bss
>> 2e000000-35ffffff : Crash kernel
>> bffc2840-bfffffff : reserved
>>
>> This is a different system than what I used last time. So I am not sure if this is same error or something else. But one thing is clear that System RAM last page is partial and we should face mmap() failure.
>
> Thanks for your testing, I've found my mistake.
>
> My patch tries to disable mmap() when a partial page is found, but
> actually mmap() has already been called because update_mmap_range()
> calls mmap() for every 4MB region in advance.
> If we try to keep using mmap() as much as possible, update_mmap_range()
> has to check whether the target region of mmap() includes the partial
> pages before calling mmap(), but it's too tough as workaround.
>
> So I think the patch I sent is enough, the policy will be simpler as
> "Don't use mmap() for buggy kernels".
>
> [PATCH] Fall back to read() when mmap() fails.
> http://lists.infradead.org/pipermail/kexec/2013-November/010199.html
>
I don't think the logic becomes that complex. For example, if the input
vmcore format is ELF, then:
o in update_mmap_range():
- First, calculate the range of the corresponding PT_LOAD entry truncated
to PAGE_SIZE boundaries.
- Then clip the mmap() range to that truncated range, i.e., exclude
partial pages from the mmap() target range.
- Then determine the offsets of the partial pages; there are always at
most two of them, and their offsets can easily be calculated from the
original range of the corresponding PT_LOAD entry.
o in read_from_vmcore(), if a given offset belongs to either of the two
partial pages, take the read() path; otherwise, take the mmap() path.
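
The steps above could be sketched roughly as follows. This is not actual
makedumpfile code: struct load_range, struct mmap_plan, plan_mmap() and
must_use_read() are hypothetical names used only for illustration. The
sketch assumes half-open byte ranges [start, end), a power-of-two page
size, and a page-aligned requested mmap() window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_PARTIAL UINT64_MAX   /* marker: this end has no partial page */

/* Hypothetical: byte range [start, end) covered by one PT_LOAD entry. */
struct load_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical: what may be mmap()ed and which pages need read(). */
struct mmap_plan {
	uint64_t map_start;   /* page-aligned, inclusive */
	uint64_t map_end;     /* page-aligned, exclusive; == map_start if empty */
	uint64_t head_page;   /* start of the partial first page, or NO_PARTIAL */
	uint64_t tail_page;   /* start of the partial last page, or NO_PARTIAL */
};

static uint64_t page_floor(uint64_t x, uint64_t page_size)
{
	return x & ~(page_size - 1);
}

static uint64_t page_ceil(uint64_t x, uint64_t page_size)
{
	return (x + page_size - 1) & ~(page_size - 1);
}

/* update_mmap_range() side: clip the wanted window [want_start, want_end). */
struct mmap_plan plan_mmap(struct load_range load, uint64_t want_start,
			   uint64_t want_end, uint64_t page_size)
{
	struct mmap_plan p;
	uint64_t full_start, full_end;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* Step 1: PT_LOAD range truncated to whole pages. */
	full_start = page_ceil(load.start, page_size);
	full_end = page_floor(load.end, page_size);

	/* Step 2: clip the mmap() window to that truncated range. */
	p.map_start = want_start > full_start ? want_start : full_start;
	p.map_end = want_end < full_end ? want_end : full_end;
	if (p.map_end < p.map_start)
		p.map_end = p.map_start;   /* nothing left to map */

	/* Step 3: remember the (at most two) partial pages. */
	p.head_page = load.start != full_start ?
		page_floor(load.start, page_size) : NO_PARTIAL;
	p.tail_page = load.end != full_end ? full_end : NO_PARTIAL;
	return p;
}

/* read_from_vmcore() side: partial pages go to read(), the rest to mmap(). */
bool must_use_read(const struct mmap_plan *p, uint64_t offset,
		   uint64_t page_size)
{
	uint64_t page = page_floor(offset, page_size);

	return page == p->head_page || page == p->tail_page;
}
```

With the System RAM range quoted above (00100000-bffc283f, i.e. end =
0xbffc2840 as an exclusive bound) and 4096-byte pages, the only partial
page is the one starting at 0xbffc2000, so a window ending at 0xc0000000
would be clipped to end at 0xbffc2000.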
--
Thanks.
HATAYAMA, Daisuke