From: Dave Young <dyoung@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>,
kexec@lists.infradead.org, tim@edgecast.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] percpu: fix chunk range calculation
Date: Tue, 22 Nov 2011 11:00:28 +0800 [thread overview]
Message-ID: <4ECB104C.3040801@redhat.com> (raw)
In-Reply-To: <20111121170146.GD15314@google.com>
On 11/22/2011 01:01 AM, Tejun Heo wrote:
> Hello, Dave.
>
> On Mon, Nov 21, 2011 at 09:45:51AM +0800, Dave Young wrote:
>> On 11/19/2011 02:55 AM, Tejun Heo wrote:
>>
>>> Percpu allocator recorded the cpus which map to the first and last
>>> units in pcpu_first/last_unit_cpu respectively and used them to
>>> determine the address range of a chunk - e.g. it assumed that the
>>> first unit has the lowest address in a chunk while the last unit has
>>> the highest address.
>>>
>>> This simply isn't true. Groups in a chunk can have arbitrary positive
>>> or negative offsets from the previous one and there is no guarantee
>>> that the first unit occupies the lowest offset while the last one the
>>> highest.
>>>
>>> Fix it by actually comparing unit offsets to determine cpus occupying
>>> the lowest and highest offsets. Also, rename pcu_first/last_unit_cpu
>>> to pcpu_low/high_unit_cpu to avoid confusion.
>>>
>>> The chunk address range is used to flush cache on vmalloc area
>>> map/unmap and decide whether a given address is in the first chunk by
>>> per_cpu_ptr_to_phys() and the bug was discovered by invalid
>>> per_cpu_ptr_to_phys() translation for crash_note.
>>>
>>> Kudos to Dave Young for tracking down the problem.
>>
>> Tejun, thanks
>>
>> Now that if addr is not in first trunk it must be in vmalloc area, below
>> logic should be right:
>> if (in_first_chunk) {
>> if (!is_vmalloc_addr(addr))
>> return __pa(addr);
>> else
>> return page_to_phys(vmalloc_to_page(addr));
>> } else
>> if (!is_vmalloc_addr(addr)) /* not possible */
>> return __pa(addr);
>> else
>> return page_to_phys(pcpu_addr_to_page(addr));
>>
>> So how about just simply remove in first chunk checking to simplify the
>> code as followging:
>>
>> phys_addr_t per_cpu_ptr_to_phys(void *addr)
>> {
>> if (!is_vmalloc_addr(addr))
>> return __pa(addr);
>> else
>> return page_to_phys(pcpu_addr_to_page(addr));
>> }
>
> Yes, that's much simpler. Hmmm... I'm still slightly reluctant
> because the current code reflects better how percpu allocator actually
> works. It has special setup for the first chunk, which currently
> supports either embedding in linear address space or vmalloc mapping,
> and, from the second one, the backing allocator (currently either vm
> or km) provides translation.
>
> per_cpu_ptr_to_phys() has been pretty good at exposing wrong
> assumptions in address translation mostly because this is one of few
> points where the assumptions are visible in a verifiable way (I think
> this is the second bug it has discovered).
>
> This forces the verification that "if it isn't in one of the explicit
> first chunk areas, it MUST follow backing allocator mapping" which can
> discover both bugs in percpu allocator itself and
> per_cpu_ptr_to_phys() callers. e.g. with the proposed change, if the
> caller passes in usual kernel address on percpu-vm which is not in the
> first chunk, it will return __pa for the address even though that
> address can't possibly be a percpu address, but the current code will
> trip VIRTUAL_BUG_ON() in vmalloc_to_page().
>
> Maybe we can add comment there how it can be further simplified and
> why things look more complicated than necessary?
I wonder if it's good to add something like CONFIG_PER_CPU_DEBUG?
Anyway I have no strong opinion with this. I'm fine to either of adding
comment and put it into debug around.
Thank you for your fix.
--
Thanks
Dave
next prev parent reply other threads:[~2011-11-22 2:58 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1318376345.2050.20.camel@boudreau>
[not found] ` <j9r5m5$b18$1@dough.gmane.org>
[not found] ` <1321296650.2066.17.camel@boudreau>
2011-11-15 8:14 ` Crash during vmcore_init Dave Young
2011-11-15 13:47 ` Américo Wang
2011-11-15 13:50 ` Américo Wang
2011-11-15 22:32 ` Tim Hartrick
2011-11-16 2:22 ` Dave Young
2011-11-16 18:20 ` Tim Hartrick
2011-11-17 3:30 ` Dave Young
2011-11-17 4:34 ` Tejun Heo
2011-11-17 4:46 ` Dave Young
[not found] ` <CAMMEr5k_ynqg5-7Looar2DxXTGZcMqi5Lo+jtETn9awO_bsaGg@mail.gmail.com>
2011-11-17 7:21 ` Dave Young
2011-11-17 7:23 ` Tejun Heo
2011-11-17 7:42 ` Américo Wang
2011-11-17 16:40 ` Tim Hartrick
2011-11-18 8:43 ` Dave Young
2011-11-18 8:45 ` Dave Young
2011-11-18 18:55 ` [PATCH] percpu: fix chunk range calculation Tejun Heo
2011-11-21 1:45 ` Dave Young
2011-11-21 17:01 ` Tejun Heo
2011-11-22 3:00 ` Dave Young [this message]
2011-11-22 16:02 ` Tejun Heo
[not found] ` <CAMMEr5my3pt0HXWd1EdwAeZKAJ8JP04tM_WqvHdCNvW=Q3ifvg@mail.gmail.com>
2011-11-22 2:52 ` Dave Young
2011-11-21 21:10 ` Tejun Heo
2011-11-22 2:48 ` Dave Young
2011-11-22 16:19 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ECB104C.3040801@redhat.com \
--to=dyoung@redhat.com \
--cc=kexec@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tim@edgecast.com \
--cc=tj@kernel.org \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).