linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hush Bensen <hush.bensen@gmail.com>
To: Toshi Kani <toshi.kani@hp.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, x86@kernel.org, dave@sr71.net,
	kosaki.motohiro@gmail.com, isimatu.yasuaki@jp.fujitsu.com,
	tangchen@cn.fujitsu.com, vasilis.liaskovitis@profitbricks.com
Subject: Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
Date: Thu, 25 Jul 2013 08:56:14 +0800	[thread overview]
Message-ID: <51F077AE.7080307@gmail.com> (raw)
In-Reply-To: <1374681742.16322.180.camel@misato.fc.hp.com>

On 07/25/2013 12:02 AM, Toshi Kani wrote:
> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>
>>>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>>>> ranges are provided?
>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>> information before adding a given memory address.  However, such change
>>>>> would interfere its test use of "fake" hotplug, which is only the known
>>>>> use-case of this interface on x86.
>>>> Not crashing the kernel is not a novel concept even for test interfaces...
>>> Agreed.
>>>
>>>> Where does the possible crash come from - from using invalid RAM ranges,
>>>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>>>> the e820 maps, is marked RAM there, and is not already registered with the
>>>> kernel, or so?
>>> Yes, the crash comes from using invalid RAM ranges.  How to check if the
>>> RAM is present is different if the system supports hotplug or not.
>> Could you explain different methods to check the RAM is present if the
>> system supports hotplkug or not?
> e820 and UEFI memory descriptor tables are the boot-time interfaces.
> These interfaces are not required to reflect any run-time changes.
>
> ACPI memory device objects can be used at both boot-time and run-time,
> which reflect any run-time changes.  But they are optional to implement.
> They typically are not implemented unless the system supports hotplug.
>
>>>>> In order to verify if a given memory address is enabled at run-time (as
>>>>> opposed to boot-time), we need to check with ACPI memory device objects
>>>>> on x86.  However, system vendors tend to not implement memory device
>>>>> objects unless their systems support memory hotplug.  Dave Hansen is
>>>>> using this interface for his testing as a way to fake a hotplug event on
>>>>> a system that does not support memory hotplug.
>>>> All vendors implement e820 maps for the memory present at boot time.
>>> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
>>> new memory added.  Here is a quote from ACPI spec.
>>>
>>> ===
>>> 15.1 INT 15H, E820H - Query System Address Map
>>>    :
>>> The memory map conveyed by this interface is not required to reflect any
>>> changes in available physical memory that have occurred after the BIOS
>>> has initially passed control to the operating system. For example, if
>>> memory is added dynamically, this interface is not required to reflect
>>> the new system memory configuration.
>>> ===
>>>
>>> By definition, the "probe" interface is used for the kernel to recognize
>>> a new memory added at run-time.  So, it should check ACPI memory device
>>> objects (which represents run-time state) for the verification.  On x86,
>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>> the kernel to recognize the new physical memory properly.  Hence, users
>>> do not need this "probe" interface.
>>>
>>>> How is the testing done by Dave Hansen? If it's done by booting with less
>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>> hot-adding some of the missing RAM, then this could be made safe via the
>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>> registry), right?
>>> If we focus on this test scenario on a system that does not support
>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>> assume that the system has no change after boot.  IOW, it is unsafe to
>>> check with e820 if the system supports hotplug, but there is no use in
>>> this interface for testing if the system supports hotplug.  So, this may
>>> be a good idea.
>>>
>>> Dave, is this how you are testing?  Do you always specify a valid memory
>>> address for your testing?
>>>
>>>> How does the hotplug event based approach solve double adds? Relies on the
>>>> hardware not sending a hot-add event twice for the same memory area or for
>>>> an invalid memory area, or does it include fail-safes and double checks as
>>>> well to avoid double adds and adding invalid memory? If yes then that
>>>> could be utilized here as well.
>>> In high-level, here is how ACPI memory hotplug works:
>>>
>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
>>> hot-added.
>>> 2. The kernel is notified, and verifies if the new memory device object
>>> has not been attached by any handler yet.
>>> 3. The memory handler is called, and obtains a new memory range from the
>>> ACPI memory device object.
>>> 4. The memory handler calls add_memory() with the new address range.
>>>
>>> The above step 1-4 proceeds automatically within the kernel.  No user
>>> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
>>> and step 3 gets a valid address range from the firmware directly.  Step
>>> 4 is basically the same as the "probe" interface, but with all the
>>> verification up front, this step is safe.
>> This is hot-added part, could you also explain how ACPI memory hotplug
>> works for hot-remove?
> Sure.  Here is high-level.
>
> 1. ACPI sends a hotplug event to an ACPI memory device object that is
> requested to hot-remove.
> 2. The kernel is notified, and verifies if the memory device object is
> attached by a handler.
> 3. The memory handler is called (which is being attached), and obtains
> its memory range.
> 4. The memory handler calls remove_memory() with the address range.
> 5. The kernel calls eject method of the ACPI memory device object.

If hot remove the memory device by the hardware, or writing 1 to 
/sys/bus/acpi/devices/PNP0C80:XX/eject both will call eject method? 
What's the difference between these two methods? I guess the former will 
send SCI and the latter won't.

>
> Thanks,
> -Toshi
>
>


  parent reply	other threads:[~2013-07-25  0:56 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-19 17:47 [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default Toshi Kani
2013-07-19 19:30 ` KOSAKI Motohiro
2013-07-19 19:35   ` Toshi Kani
2013-07-22  8:37 ` Ingo Molnar
2013-07-22 17:12   ` Toshi Kani
2013-07-22 20:57     ` KOSAKI Motohiro
2013-07-22 21:04       ` Dave Hansen
2013-07-23  0:34       ` Toshi Kani
2013-07-23  8:01     ` Ingo Molnar
2013-07-23 20:45       ` Toshi Kani
2013-07-23 20:59         ` Dave Hansen
2013-07-23 21:34           ` Toshi Kani
2013-07-24  0:18         ` Hush Bensen
2013-07-24 16:02           ` Toshi Kani
2013-07-25  0:17             ` Hush Bensen
2013-07-25 15:47               ` Toshi Kani
2013-07-25  0:44             ` Hush Bensen
2013-07-25  0:56             ` Hush Bensen [this message]
2013-07-25  3:08               ` Yasuaki Ishimatsu
2013-07-25  3:34                 ` Hush Bensen
2013-07-25  4:55                   ` Yasuaki Ishimatsu
2013-07-24  4:20         ` Ingo Molnar
2013-07-24 16:58           ` Toshi Kani
2013-07-25 21:38             ` Ingo Molnar
2013-07-25 22:36               ` Toshi Kani
2013-07-23  0:24 ` Yasuaki Ishimatsu
2013-07-23  0:45   ` Toshi Kani
2013-07-23  7:46 ` [tip:x86/mm] " tip-bot for Toshi Kani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51F077AE.7080307@gmail.com \
    --to=hush.bensen@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave@sr71.net \
    --cc=isimatu.yasuaki@jp.fujitsu.com \
    --cc=kosaki.motohiro@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@kernel.org \
    --cc=tangchen@cn.fujitsu.com \
    --cc=toshi.kani@hp.com \
    --cc=vasilis.liaskovitis@profitbricks.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).