From: Dave Young <dyoung@redhat.com>
To: tim@edgecast.com
Cc: Tejun Heo <tj@kernel.org>, WANG Cong <xiyou.wangcong@gmail.com>,
kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: Crash during vmcore_init
Date: Fri, 18 Nov 2011 16:43:34 +0800
Message-ID: <4EC61AB6.4090808@redhat.com>
In-Reply-To: <1321548033.12208.12.camel@boudreau>
On 11/18/2011 12:40 AM, Tim Hartrick wrote:
>
> Dave, Tejun, Americo,
>
> Attached find three configs:
>
> Ubuntu 2.6.32-21-server - works
> Ubuntu 2.6.38-8-server - fails
> Ubuntu 3.1.1-030101-generic (stable) - fails
Thanks, Tim
>
> On Thu, 2011-11-17 at 15:21 +0800, Dave Young wrote:
>> On 11/17/2011 01:22 PM, Tim Hartrick wrote:
>>
>>> Tejun, Dave,
>>>
>>> I will be happy to answer any questions about our environment or test
>>> debug or other patches. Just tell me what you need.
>>
>>
>> Thank you. Can you share your kernel config?
>>
>>>
>>> tim
>>>
>>> On Nov 16, 2011 8:44 PM, "Dave Young" <dyoung@redhat.com> wrote:
>>>
>>> On 11/17/2011 12:34 PM, Tejun Heo wrote:
>>>
>>> > Hello,
>>> >
>>> > On Wed, Nov 16, 2011 at 7:30 PM, Dave Young <dyoung@redhat.com> wrote:
>>> >> This addr is converted to an invalid phys address,
>>> >
>>> > I'm a bit lost on the context here. Who's calling
>>> > per_cpu_ptr_to_phys()?
>>>
>>>
>>> It's drivers/base/cpu.c : show_crash_notes()
>>>
>>> >
>>> >> looking the code below:
>>> >> if (in_first_chunk) {
>>> >>         if (!is_vmalloc_addr(addr))
>>> >>                 return __pa(addr);
>>> >>         else
>>> >>                 return page_to_phys(vmalloc_to_page(addr));
>>> >> } else
>>> >>         return page_to_phys(pcpu_addr_to_page(addr));
>>> >>
>>> >> I don't understand per cpu allocation well, if addr is not in
>>> >> first chunk then it should be in vmalloc area?
>>> >
>>> > Yes, it is. First chunk can be embedded in the kernel linear address
>>> > space but from the second one, it's always set up from the top of the
>>> > vmalloc area with the same offset layout as the first chunk.
>>>
>>>
>>> in this case ffff880667c19ad0 fall out of vmalloc area and it's not in
>>> first chunk also.
Tejun,
With the config provided by Tim, I can reproduce this problem on a Dell
machine. I did some debugging and found that first_start <
first_end, so there's no chance to reach the check in
for_each_possible_cpu(cpu). Why are first_start/first_end wrong? Is
pcpu_unit_offsets[] not ordered? Any idea?
The hack below makes the bug go away, which confirms that the addr is
indeed in the first chunk.
diff --git a/mm/percpu.c b/mm/percpu.c
index bf80e55..8f6eb58 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -984,26 +984,14 @@ phys_addr_t per_cpu_ptr_to_phys(void *addr)
 {
 	void __percpu *base = __addr_to_pcpu_ptr(pcpu_base_addr);
 	bool in_first_chunk = false;
-	unsigned long first_start, first_end;
 	unsigned int cpu;
-	/*
-	 * The following test on first_start/end isn't strictly
-	 * necessary but will speed up lookups of addresses which
-	 * aren't in the first chunk.
-	 */
-	first_start = pcpu_chunk_addr(pcpu_first_chunk, pcpu_first_unit_cpu, 0);
-	first_end = pcpu_chunk_addr(pcpu_first_chunk, pcpu_last_unit_cpu,
-				    pcpu_unit_pages);
-	if ((unsigned long)addr >= first_start &&
-	    (unsigned long)addr < first_end) {
-		for_each_possible_cpu(cpu) {
-			void *start = per_cpu_ptr(base, cpu);
-
-			if (addr >= start && addr < start + pcpu_unit_size) {
-				in_first_chunk = true;
-				break;
-			}
+	for_each_possible_cpu(cpu) {
+		void *start = per_cpu_ptr(base, cpu);
+
+		if (addr >= start && addr < start + pcpu_unit_size) {
+			in_first_chunk = true;
+			break;
 		}
 	}
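
For illustration only, here is a small userspace model of the question
raised above: what happens to the first_start/first_end fast path if the
per-cpu unit offsets are not ordered by CPU number. This is not kernel
code. The names NR_UNITS, UNIT_SIZE, unit_offsets[], FIRST_UNIT_CPU and
LAST_UNIT_CPU are made up for the sketch, and the unordered layout is an
assumption (it is the hypothesis being asked about, not something the
debugging above has confirmed).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: four per-cpu units of 4 KiB each. */
#define NR_UNITS	4
#define UNIT_SIZE	((uintptr_t)0x1000)
#define FIRST_UNIT_CPU	0	/* stand-in for pcpu_first_unit_cpu */
#define LAST_UNIT_CPU	3	/* stand-in for pcpu_last_unit_cpu */

/* Offset of each CPU's unit from the chunk base, deliberately unordered. */
static const uintptr_t unit_offsets[NR_UNITS] = {
	0x1000, 0x0000, 0x3000, 0x2000
};

/* Models the removed fast path: bounds come from two designated CPUs. */
static bool in_first_last_range(uintptr_t base, uintptr_t addr)
{
	uintptr_t first_start = base + unit_offsets[FIRST_UNIT_CPU];
	uintptr_t first_end = base + unit_offsets[LAST_UNIT_CPU] + UNIT_SIZE;

	return addr >= first_start && addr < first_end;
}

/* Models the loop the hack keeps: test every unit individually. */
static bool in_any_unit(uintptr_t base, uintptr_t addr)
{
	for (size_t cpu = 0; cpu < NR_UNITS; cpu++) {
		uintptr_t start = base + unit_offsets[cpu];

		if (addr >= start && addr < start + UNIT_SIZE)
			return true;
	}
	return false;
}

/* An address inside CPU 1's unit is missed by the range test only. */
static void sketch_self_check(void)
{
	uintptr_t base = 0x100000;
	uintptr_t addr = base + 0x0800;

	assert(in_any_unit(base, addr));
	assert(!in_first_last_range(base, addr));
}
```

In this model first_start < first_end still holds, yet the range does
not cover the units of CPU 1 and CPU 2, so an address in either of them
would never reach the per-unit loop.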
>>>
>>> >
>>> >> Tejun, do you have any idea about this?
>>> >
>>> > Can you please tell me how to reproduce the problem? I'll try to find
>>> > out what's going on.
>>>
>>>
>>> make sure kernel support CRASH DUMP, then cat
>>> /sys/devices/system/cpu/cpu[x]/crash_notes
>>>
>>> Tim Hartrick <tim@edgecast.com> reported
>>> the problem when testing kdump.
>>> But I can not reproduce this. I think tim can help to test
>>>
>>> >
>>> > Thanks.
>>> >
>>>
>>>
>>>
>>> --
>>> Thanks
>>> Dave
>>>
>>
>>
>>
>
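
The reproduction step quoted above (reading crash_notes for a CPU) can
also be done from a small C helper instead of cat. This is only a
sketch: the function names crash_notes_path() and read_crash_notes() are
made up here, and it assumes a kernel built with crash dump support so
that the sysfs attribute exists. On an affected kernel this read is what
goes through show_crash_notes() and per_cpu_ptr_to_phys().

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs path for one CPU's crash_notes; 0 on success, -1 if buf is too small. */
static int crash_notes_path(unsigned int cpu, char *buf, size_t len)
{
	int n = snprintf(buf, len,
			 "/sys/devices/system/cpu/cpu%u/crash_notes", cpu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of the attribute into out; 0 on success, -1 on any failure. */
static int read_crash_notes(unsigned int cpu, char *out, size_t len)
{
	char path[64];
	FILE *f;

	if (len == 0 || crash_notes_path(cpu, path, sizeof(path)) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;	/* no crash dump support, or no such CPU */

	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	out[strcspn(out, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}
```

Calling read_crash_notes() for each possible CPU number and printing the
result should be equivalent to the cat loop in the quoted instructions.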
--
Thanks
Dave
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec