linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Young <dyoung@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>,
	kexec@lists.infradead.org, tim@edgecast.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] percpu: fix chunk range calculation
Date: Tue, 22 Nov 2011 11:00:28 +0800	[thread overview]
Message-ID: <4ECB104C.3040801@redhat.com> (raw)
In-Reply-To: <20111121170146.GD15314@google.com>

On 11/22/2011 01:01 AM, Tejun Heo wrote:

> Hello, Dave.
> 
> On Mon, Nov 21, 2011 at 09:45:51AM +0800, Dave Young wrote:
>> On 11/19/2011 02:55 AM, Tejun Heo wrote:
>>
>>> Percpu allocator recorded the cpus which map to the first and last
>>> units in pcpu_first/last_unit_cpu respectively and used them to
>>> determine the address range of a chunk - e.g. it assumed that the
>>> first unit has the lowest address in a chunk while the last unit has
>>> the highest address.
>>>
>>> This simply isn't true.  Groups in a chunk can have arbitrary positive
>>> or negative offsets from the previous one and there is no guarantee
>>> that the first unit occupies the lowest offset while the last one the
>>> highest.
>>>
>>> Fix it by actually comparing unit offsets to determine cpus occupying
>>> the lowest and highest offsets.  Also, rename pcu_first/last_unit_cpu
>>> to pcpu_low/high_unit_cpu to avoid confusion.
>>>
>>> The chunk address range is used to flush cache on vmalloc area
>>> map/unmap and decide whether a given address is in the first chunk by
>>> per_cpu_ptr_to_phys() and the bug was discovered by invalid
>>> per_cpu_ptr_to_phys() translation for crash_note.
>>>
>>> Kudos to Dave Young for tracking down the problem.
>>
>> Tejun, thanks
>>
>> Now that if addr is not in first trunk it must be in vmalloc area, below
>> logic should be right:
>> 	if (in_first_chunk) {
>> 		if (!is_vmalloc_addr(addr))
>> 			return __pa(addr);
>> 		else
>> 			return page_to_phys(vmalloc_to_page(addr));
>> 	} else
>> 		if (!is_vmalloc_addr(addr)) /* not possible */
>> 			return __pa(addr);
>> 		else
>> 			return page_to_phys(pcpu_addr_to_page(addr));
>>
>> So how about just simply remove in first chunk checking to simplify the
>> code as followging:
>>
>> phys_addr_t per_cpu_ptr_to_phys(void *addr)
>> {
>> 	if (!is_vmalloc_addr(addr))
>> 		return __pa(addr);
>> 	else
>> 		return page_to_phys(pcpu_addr_to_page(addr));
>> }
> 
> Yes, that's much simpler.  Hmmm... I'm still slightly reluctant
> because the current code reflects better how percpu allocator actually
> works.  It has special setup for the first chunk, which currently
> supports either embedding in linear address space or vmalloc mapping,
> and, from the second one, the backing allocator (currently either vm
> or km) provides translation.
> 
> per_cpu_ptr_to_phys() has been pretty good at exposing wrong
> assumptions in address translation mostly because this is one of few
> points where the assumptions are visible in a verifiable way (I think
> this is the second bug it has discovered).
> 
> This forces the verification that "if it isn't in one of the explicit
> first chunk areas, it MUST follow backing allocator mapping" which can
> discover both bugs in percpu allocator itself and
> per_cpu_ptr_to_phys() callers.  e.g. with the proposed change, if the
> caller passes in usual kernel address on percpu-vm which is not in the
> first chunk, it will return __pa for the address even though that
> address can't possibly be a percpu address, but the current code will
> trip VIRTUAL_BUG_ON() in vmalloc_to_page().
> 
> Maybe we can add comment there how it can be further simplified and
> why things look more complicated than necessary?


I wonder if it's good to add something like CONFIG_PER_CPU_DEBUG?
Anyway I have no strong opinion with this. I'm fine to either of adding
comment and put it into debug around.

Thank you for your fix.

-- 
Thanks
Dave

  reply	other threads:[~2011-11-22  2:58 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1318376345.2050.20.camel@boudreau>
     [not found] ` <j9r5m5$b18$1@dough.gmane.org>
     [not found]   ` <1321296650.2066.17.camel@boudreau>
2011-11-15  8:14     ` Crash during vmcore_init Dave Young
2011-11-15 13:47       ` Américo Wang
2011-11-15 13:50         ` Américo Wang
2011-11-15 22:32       ` Tim Hartrick
2011-11-16  2:22         ` Dave Young
2011-11-16 18:20           ` Tim Hartrick
2011-11-17  3:30             ` Dave Young
2011-11-17  4:34               ` Tejun Heo
2011-11-17  4:46                 ` Dave Young
     [not found]                   ` <CAMMEr5k_ynqg5-7Looar2DxXTGZcMqi5Lo+jtETn9awO_bsaGg@mail.gmail.com>
2011-11-17  7:21                     ` Dave Young
2011-11-17  7:23                       ` Tejun Heo
2011-11-17  7:42                         ` Américo Wang
2011-11-17 16:40                       ` Tim Hartrick
2011-11-18  8:43                         ` Dave Young
2011-11-18  8:45                           ` Dave Young
2011-11-18 18:55                             ` [PATCH] percpu: fix chunk range calculation Tejun Heo
2011-11-21  1:45                               ` Dave Young
2011-11-21 17:01                                 ` Tejun Heo
2011-11-22  3:00                                   ` Dave Young [this message]
2011-11-22 16:02                                     ` Tejun Heo
     [not found]                                 ` <CAMMEr5my3pt0HXWd1EdwAeZKAJ8JP04tM_WqvHdCNvW=Q3ifvg@mail.gmail.com>
2011-11-22  2:52                                   ` Dave Young
2011-11-21 21:10                               ` Tejun Heo
2011-11-22  2:48                                 ` Dave Young
2011-11-22 16:19                                   ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ECB104C.3040801@redhat.com \
    --to=dyoung@redhat.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tim@edgecast.com \
    --cc=tj@kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).