From: Hush Bensen <hush.bensen@gmail.com>
To: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Toshi Kani <toshi.kani@hp.com>, Ingo Molnar <mingo@kernel.org>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, x86@kernel.org, dave@sr71.net,
kosaki.motohiro@gmail.com, tangchen@cn.fujitsu.com,
vasilis.liaskovitis@profitbricks.com
Subject: Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
Date: Thu, 25 Jul 2013 11:34:49 +0800
Message-ID: <51F09CD9.5050802@gmail.com>
In-Reply-To: <51F0969C.8000001@jp.fujitsu.com>

On 07/25/2013 11:08 AM, Yasuaki Ishimatsu wrote:
> (2013/07/25 9:56), Hush Bensen wrote:
>> On 07/25/2013 12:02 AM, Toshi Kani wrote:
>>> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>>>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>>>
>>>>>>>> Could we please also fix it to never crash the kernel, even if
>>>>>>>> stupid ranges are provided?
>>>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>>>> information before adding a given memory address. However, such a
>>>>>>> change would interfere with its test use of "fake" hotplug, which is
>>>>>>> the only known use-case of this interface on x86.
>>>>>> Not crashing the kernel is not a novel concept even for test
>>>>>> interfaces...
>>>>> Agreed.
>>>>>
>>>>>> Where does the possible crash come from - from using invalid RAM
>>>>>> ranges, right? I.e. on x86 to fix the crash we need to check the RAM
>>>>>> is present in the e820 maps, is marked RAM there, and is not already
>>>>>> registered with the kernel, or so?
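
The three checks listed above (present in e820, marked RAM, not already
registered) can be sketched as a small user-space model. This is only an
illustration under stated assumptions, not kernel code: the function names,
the (start, end, type) tuple layout, and the sample map are hypothetical.
The only fact taken from outside the thread is that e820 type 1 means
usable RAM.

```python
from typing import List, Tuple

E820_RAM = 1  # e820 type value for usable RAM


def covered_by_ram(start: int, end: int,
                   e820: List[Tuple[int, int, int]]) -> bool:
    """True if [start, end) lies entirely inside e820 entries of type RAM."""
    ram = sorted((s, e) for s, e, t in e820 if t == E820_RAM)
    pos = start
    for s, e in ram:
        if s > pos:
            break      # hole between pos and the next RAM entry
        if e > pos:
            pos = e    # this entry extends the covered prefix
        if pos >= end:
            return True
    return pos >= end


def overlaps(start: int, end: int, ranges: List[Tuple[int, int]]) -> bool:
    """True if [start, end) intersects any already registered range."""
    return any(s < end and start < e for s, e in ranges)


def probe_is_safe(start: int, size: int,
                  e820: List[Tuple[int, int, int]],
                  registered: List[Tuple[int, int]]) -> bool:
    """Hypothetical pre-check before handing a range to add_memory()."""
    if size <= 0:
        return False
    end = start + size
    return covered_by_ram(start, end, e820) and not overlaps(start, end, registered)


# Hypothetical machine: 2 GiB of RAM in e820, booted with mem=1g.
e820_map = [(0x0, 0x9FC00, 1), (0x100000, 0x80000000, 1),
            (0x80000000, 0x90000000, 2)]
registered_ranges = [(0x0, 0x40000000)]
```

With this sample map, a 128 MiB range starting at 0x40000000 passes, while
a range that is already registered, or one that reaches into the type 2
(reserved) entry, is rejected.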
>>>>> Yes, the crash comes from using invalid RAM ranges. How to check if
>>>>> the RAM is present depends on whether the system supports hotplug.
>>>> Could you explain the different methods to check whether the RAM is
>>>> present, depending on whether the system supports hotplug or not?
>>> e820 and UEFI memory descriptor tables are the boot-time interfaces.
>>> These interfaces are not required to reflect any run-time changes.
>>>
>>> ACPI memory device objects can be used at both boot-time and run-time,
>>> and they reflect any run-time changes. But they are optional to
>>> implement, and typically are not implemented unless the system
>>> supports hotplug.
>>>
>>>>>>> In order to verify if a given memory address is enabled at run-time
>>>>>>> (as opposed to boot-time), we need to check with ACPI memory device
>>>>>>> objects on x86. However, system vendors tend to not implement
>>>>>>> memory device objects unless their systems support memory hotplug.
>>>>>>> Dave Hansen is using this interface for his testing as a way to
>>>>>>> fake a hotplug event on a system that does not support memory
>>>>>>> hotplug.
>>>>>> All vendors implement e820 maps for the memory present at boot time.
>>>>> Yes for boot time. At run-time, e820 is not guaranteed to reflect
>>>>> newly added memory. Here is a quote from the ACPI spec.
>>>>>
>>>>> ===
>>>>> 15.1 INT 15H, E820H - Query System Address Map
>>>>> :
>>>>> The memory map conveyed by this interface is not required to reflect
>>>>> any changes in available physical memory that have occurred after the
>>>>> BIOS has initially passed control to the operating system. For
>>>>> example, if memory is added dynamically, this interface is not
>>>>> required to reflect the new system memory configuration.
>>>>> ===
>>>>>
>>>>> By definition, the "probe" interface is used for the kernel to
>>>>> recognize new memory added at run-time. So, it should check ACPI
>>>>> memory device objects (which represent run-time state) for the
>>>>> verification. On x86, however, ACPI also sends a hotplug event to the
>>>>> kernel, which triggers the kernel to recognize the new physical
>>>>> memory properly. Hence, users do not need this "probe" interface.
>>>>>
>>>>>> How is the testing done by Dave Hansen? If it's done by booting with
>>>>>> less RAM than available (via say the mem=1g boot parameter), and
>>>>>> then hot-adding some of the missing RAM, then this could be made
>>>>>> safe via the e820 maps and by consulting the physical memory maps
>>>>>> (to avoid double registry), right?
>>>>> If we focus on this test scenario on a system that does not support
>>>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>>>> assume that the system has no change after boot. IOW, it is unsafe to
>>>>> check with e820 if the system supports hotplug, but there is no use
>>>>> for this interface in testing if the system supports hotplug. So,
>>>>> this may be a good idea.
>>>>>
>>>>> Dave, is this how you are testing? Do you always specify a valid
>>>>> memory address for your testing?
>>>>>
>>>>>> How does the hotplug event based approach solve double adds? Does it
>>>>>> rely on the hardware not sending a hot-add event twice for the same
>>>>>> memory area or for an invalid memory area, or does it include
>>>>>> fail-safes and double checks as well to avoid double adds and adding
>>>>>> invalid memory? If yes then that could be utilized here as well.
>>>>> At a high level, here is how ACPI memory hotplug works:
>>>>>
>>>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that
>>>>> is hot-added.
>>>>> 2. The kernel is notified, and verifies that the new memory device
>>>>> object has not been attached by any handler yet.
>>>>> 3. The memory handler is called, and obtains a new memory range from
>>>>> the ACPI memory device object.
>>>>> 4. The memory handler calls add_memory() with the new address range.
>>>>>
>>>>> Steps 1-4 above proceed automatically within the kernel. No user
>>>>> input (nor sysfs interface) is necessary. Step 2 prevents double adds
>>>>> and step 3 gets a valid address range from the firmware directly.
>>>>> Step 4 is basically the same as the "probe" interface, but with all
>>>>> the verification up front, this step is safe.
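
The hot-add steps quoted above can be modelled in a few lines to show why
step 2 prevents double adds and why step 3 keeps user input out of the
picture. This is a toy model under my own assumptions, not the kernel's
ACPI code: the MemoryDevice and Kernel classes and their fields are
hypothetical, and add_memory() here only records the range.

```python
class MemoryDevice:
    """Stand-in for an ACPI memory device object (hypothetical model)."""

    def __init__(self, name, start, size):
        self.name = name
        self.start = start            # range as reported by firmware
        self.size = size
        self.handler_attached = False


class Kernel:
    """Toy kernel that only records what was passed to add_memory()."""

    def __init__(self):
        self.added = []

    def add_memory(self, start, size):
        self.added.append((start, size))

    def on_hot_add_event(self, dev):
        # Step 2: refuse a device that a handler has already attached.
        if dev.handler_attached:
            return False
        dev.handler_attached = True
        # Step 3: the range comes from the device object, not from the user.
        start, size = dev.start, dev.size
        # Step 4: same operation as the "probe" interface, after the checks.
        self.add_memory(start, size)
        return True


kernel = Kernel()
dev = MemoryDevice("MEM0", 0x100000000, 0x8000000)
first = kernel.on_hot_add_event(dev)    # accepted
second = kernel.on_hot_add_event(dev)   # rejected as a double add
```

In this model the second event for the same device is rejected, so the
range reaches add_memory() only once.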
>>>> This is the hot-add part. Could you also explain how ACPI memory
>>>> hotplug works for hot-remove?
>>> Sure. Here is the high-level flow.
>>>
>>> 1. ACPI sends a hotplug event to an ACPI memory device object that is
>>> requested to be hot-removed.
>>> 2. The kernel is notified, and verifies that the memory device object is
>>> attached by a handler.
>>> 3. The memory handler is called (the one that is attached), and obtains
>>> its memory range.
>>> 4. The memory handler calls remove_memory() with the address range.
>>> 5. The kernel calls the eject method of the ACPI memory device object.
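
The ordering of the hot-remove steps quoted above can be sketched the same
way. Again this is only a model under my own assumptions: the dict layout
and the hot_remove() helper are hypothetical, and the log entries merely
stand in for remove_memory() and the eject method.

```python
def hot_remove(dev, log):
    """Model of steps 2-5 for one hot-remove request."""
    # Step 2: only a device that is attached by a handler can be removed.
    if not dev["attached"]:
        return False
    # Step 3: the attached handler supplies its own memory range.
    start, size = dev["start"], dev["size"]
    # Step 4: stand-in for remove_memory() on that range.
    log.append(("remove_memory", start, size))
    dev["attached"] = False
    # Step 5: stand-in for evaluating the device's eject method.
    log.append(("eject", dev["name"]))
    return True


device = {"name": "MEM0", "start": 0x100000000, "size": 0x8000000,
          "attached": True}
events = []
removed = hot_remove(device, events)
removed_again = hot_remove(device, events)   # no longer attached
```

The point of the model is the order: the memory range is removed before
the eject method is called, and a device that is not attached is left
alone.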
>>
>> If the memory device is hot-removed by the hardware, or 1 is written to
>> /sys/bus/acpi/devices/PNP0C80:XX/eject, will both call the eject method?
>
> Yes.
> Both operations will call the eject method.
>
>> What's the difference between these two methods? I guess the former
>> will send an SCI and the latter won't.
>
> The triggers are different. The former is triggered by an SCI, the
> latter by writing to sysfs.
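
For reference, the sysfs trigger amounts to writing "1" to the device's
eject file. A minimal sketch, assuming root privileges and a kernel that
exposes the file; request_eject() is a hypothetical helper name, and the
XX in the path from this thread stays a placeholder to be filled in for a
real device.

```python
from pathlib import Path


def request_eject(eject_file):
    """Write "1" to an eject file to request removal of that device."""
    Path(eject_file).write_text("1")


# On a real system the argument would be the device's eject file, e.g.
# /sys/bus/acpi/devices/PNP0C80:XX/eject with XX replaced accordingly.
```

No SCI is involved on this path; the write itself is the trigger.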
Thanks. Another question: what is the role of udev in memory hotplug?
>
> Thanks,
> Yasuaki Ishimatsu
>
>
>>
>>>
>>> Thanks,
>>> -Toshi
>>>
>>>
>>
>
>