From: Hush Bensen <hush.bensen@gmail.com>
To: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Toshi Kani <toshi.kani@hp.com>, Ingo Molnar <mingo@kernel.org>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, x86@kernel.org, dave@sr71.net,
	kosaki.motohiro@gmail.com, tangchen@cn.fujitsu.com,
	vasilis.liaskovitis@profitbricks.com
Subject: Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
Date: Thu, 25 Jul 2013 11:34:49 +0800
Message-ID: <51F09CD9.5050802@gmail.com>
In-Reply-To: <51F0969C.8000001@jp.fujitsu.com>

On 07/25/2013 11:08 AM, Yasuaki Ishimatsu wrote:
> (2013/07/25 9:56), Hush Bensen wrote:
>> On 07/25/2013 12:02 AM, Toshi Kani wrote:
>>> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>>>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>>>
>>>>>>>> Could we please also fix it to never crash the kernel, even if 
>>>>>>>> stupid
>>>>>>>> ranges are provided?
>>>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>>>> information before adding a given memory address. However, such a
>>>>>>> change would interfere with its test use of "fake" hotplug, which is
>>>>>>> the only known use-case of this interface on x86.
>>>>>> Not crashing the kernel is not a novel concept even for test 
>>>>>> interfaces...
>>>>> Agreed.
>>>>>
>>>>>> Where does the possible crash come from - from using invalid RAM 
>>>>>> ranges,
>>>>>> right? I.e. on x86 to fix the crash we need to check the RAM is 
>>>>>> present in
>>>>>> the e820 maps, is marked RAM there, and is not already registered 
>>>>>> with the
>>>>>> kernel, or so?
>>>>> Yes, the crash comes from using invalid RAM ranges.  How to check 
>>>>> if the
>>>>> RAM is present is different if the system supports hotplug or not.
>>>> Could you explain different methods to check the RAM is present if the
>>>> system supports hotplug or not?
>>> e820 and UEFI memory descriptor tables are the boot-time interfaces.
>>> These interfaces are not required to reflect any run-time changes.
>>>
>>> ACPI memory device objects can be used at both boot-time and run-time,
>>> which reflect any run-time changes.  But they are optional to 
>>> implement.
>>> They typically are not implemented unless the system supports hotplug.
>>>
>>>>>>> In order to verify if a given memory address is enabled at 
>>>>>>> run-time (as
>>>>>>> opposed to boot-time), we need to check with ACPI memory device 
>>>>>>> objects
>>>>>>> on x86.  However, system vendors tend to not implement memory 
>>>>>>> device
>>>>>>> objects unless their systems support memory hotplug. Dave Hansen is
>>>>>>> using this interface for his testing as a way to fake a hotplug 
>>>>>>> event on
>>>>>>> a system that does not support memory hotplug.
>>>>>> All vendors implement e820 maps for the memory present at boot time.
>>>>> Yes for boot time.  At run-time, e820 is not guaranteed to reflect
>>>>> newly added memory.  Here is a quote from the ACPI spec.
>>>>>
>>>>> ===
>>>>> 15.1 INT 15H, E820H - Query System Address Map
>>>>>    :
>>>>> The memory map conveyed by this interface is not required to 
>>>>> reflect any
>>>>> changes in available physical memory that have occurred after the 
>>>>> BIOS
>>>>> has initially passed control to the operating system. For example, if
>>>>> memory is added dynamically, this interface is not required to 
>>>>> reflect
>>>>> the new system memory configuration.
>>>>> ===
>>>>>
>>>>> By definition, the "probe" interface is used for the kernel to recognize
>>>>> new memory added at run-time.  So, it should check ACPI memory device
>>>>> objects (which represent run-time state) for the verification.  On x86,
>>>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>>>> the kernel to recognize the new physical memory properly. Hence, users
>>>>> do not need this "probe" interface.
>>>>>
>>>>>> How is the testing done by Dave Hansen? If it's done by booting 
>>>>>> with less
>>>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>>>> hot-adding some of the missing RAM, then this could be made safe 
>>>>>> via the
>>>>>> e820 maps and by consulting the physical memory maps (to avoid double
>>>>>> registry), right?
>>>>> If we focus on this test scenario on a system that does not support
>>>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>>>> assume that the system has no change after boot.  IOW, it is 
>>>>> unsafe to
>>>>> check with e820 if the system supports hotplug, but there is no 
>>>>> use in
>>>>> this interface for testing if the system supports hotplug.  So, 
>>>>> this may
>>>>> be a good idea.
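
The check proposed above (the range is reported as RAM by e820 and is not already registered with the kernel) can be sketched as a toy model. This is plain Python, not kernel code: `probe_is_safe`, the half-open range tuples, and the sample map are hypothetical names and values chosen for illustration, and the sketch assumes the no-hotplug test scenario in which the boot-time e820 map stays accurate.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) physical address range

GIB = 1 << 30
MIB = 1 << 20


def overlaps(a: Range, b: Range) -> bool:
    """True if the two half-open ranges share at least one address."""
    return a[0] < b[1] and b[0] < a[1]


def contained_in(inner: Range, outer: Range) -> bool:
    """True if inner lies entirely within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def probe_is_safe(req: Range, e820_ram: List[Range],
                  registered: List[Range]) -> bool:
    """Accept a probe request only if it is non-empty, fully inside one
    e820 RAM entry, and does not overlap memory the kernel already has."""
    if req[0] >= req[1]:
        return False
    if not any(contained_in(req, ram) for ram in e820_ram):
        return False
    return not any(overlaps(req, reg) for reg in registered)


# Hypothetical machine: 4 GiB of RAM in e820, booted with mem=1g.
e820_ram = [(0, 4 * GIB)]
registered = [(0, 1 * GIB)]
ok = probe_is_safe((1 * GIB, 1 * GIB + 128 * MIB), e820_ram, registered)
dup = probe_is_safe((0, 128 * MIB), e820_ram, registered)
```

Here `ok` covers RAM that was left out by mem=1g, while `dup` asks for a range that is already registered and is rejected as a double add.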
>>>>>
>>>>> Dave, is this how you are testing?  Do you always specify a valid 
>>>>> memory
>>>>> address for your testing?
>>>>>
>>>>>> How does the hotplug event based approach solve double adds? 
>>>>>> Relies on the
>>>>>> hardware not sending a hot-add event twice for the same memory 
>>>>>> area or for
>>>>>> an invalid memory area, or does it include fail-safes and double 
>>>>>> checks as
>>>>>> well to avoid double adds and adding invalid memory? If yes then 
>>>>>> that
>>>>>> could be utilized here as well.
>>>>> At a high level, here is how ACPI memory hotplug works:
>>>>>
>>>>> 1. ACPI sends a hotplug event to a new ACPI memory device object 
>>>>> that is
>>>>> hot-added.
>>>>> 2. The kernel is notified, and verifies if the new memory device 
>>>>> object
>>>>> has not been attached by any handler yet.
>>>>> 3. The memory handler is called, and obtains a new memory range 
>>>>> from the
>>>>> ACPI memory device object.
>>>>> 4. The memory handler calls add_memory() with the new address range.
>>>>>
>>>>> Steps 1-4 above proceed automatically within the kernel.  No user
>>>>> input (nor sysfs interface) is necessary.  Step 2 prevents double 
>>>>> adds
>>>>> and step 3 gets a valid address range from the firmware directly.  
>>>>> Step
>>>>> 4 is basically the same as the "probe" interface, but with all the
>>>>> verification up front, this step is safe.
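
The four hot-add steps above can be modeled roughly as follows. This is a toy Python sketch, not the kernel's ACPI code: the `MemoryDevice` and `Kernel` classes and the `notify_hot_add` method are hypothetical stand-ins, and only the name add_memory() comes from the description above.

```python
class MemoryDevice:
    """Hypothetical stand-in for an ACPI memory device object."""

    def __init__(self, name, start, size):
        self.name = name
        self.start = start      # range reported by the firmware
        self.size = size
        self.attached = False   # True once a handler has bound to it


class Kernel:
    """Toy model of the kernel side of ACPI memory hot-add."""

    def __init__(self):
        self.memory = []        # (start, size) ranges that were added

    def add_memory(self, start, size):
        # Step 4: register the new range.
        self.memory.append((start, size))

    def notify_hot_add(self, dev):
        # Step 1 is the event itself, modeled here as this call.
        # Step 2: refuse a device that a handler has already attached,
        # which is what prevents a double add.
        if dev.attached:
            return False
        dev.attached = True
        # Step 3: the range comes from the device object, never from
        # user input, so it is valid by construction in this model.
        self.add_memory(dev.start, dev.size)
        return True


kernel = Kernel()
dev = MemoryDevice("PNP0C80:00", 0x100000000, 0x8000000)
first = kernel.notify_hot_add(dev)    # accepted
second = kernel.notify_hot_add(dev)   # rejected as a double add
```

The point of the model is that the address range never passes through user input, which is the difference from the "probe" interface.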
>>>> This is the hot-add part. Could you also explain how ACPI memory hotplug
>>>> works for hot-remove?
>>> Sure.  Here it is at a high level.
>>>
>>> 1. ACPI sends a hotplug event to an ACPI memory device object that is
>>> requested to hot-remove.
>>> 2. The kernel is notified, and verifies if the memory device object is
>>> attached by a handler.
>>> 3. The memory handler is called (which is being attached), and obtains
>>> its memory range.
>>> 4. The memory handler calls remove_memory() with the address range.
>>> 5. The kernel calls eject method of the ACPI memory device object.
>>
>> If the memory device is hot-removed by the hardware, or 1 is written to
>> /sys/bus/acpi/devices/PNP0C80:XX/eject, will both call the eject method?
>
> Yes.
> Both operations will call eject method.
>
>> What's the difference between these two methods? I guess the former 
>> will send SCI and the latter won't.
>
> The triggers are different. The former is triggered by an SCI, the
> latter by writing to sysfs.
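
The hot-remove steps and the two triggers discussed above can be modeled as two entry points that converge on one removal path. This is a toy Python sketch, not kernel code: every class and function name here is hypothetical except remove_memory(), and the model assumes that removal always succeeds.

```python
class MemoryDevice:
    """Hypothetical stand-in for an attached ACPI memory device object."""

    def __init__(self, start, size):
        self.start = start
        self.size = size
        self.attached = True
        self.ejected = False

    def eject(self):
        # Step 5: stands in for the device's eject method.
        self.ejected = True


class Kernel:
    """Toy model of the kernel side of ACPI memory hot-remove."""

    def __init__(self, ranges):
        self.memory = set(ranges)
        self.triggers = []      # records which trigger started each request

    def remove_memory(self, start, size):
        # Step 4: drop the range from the kernel's view of memory.
        self.memory.discard((start, size))

    def hot_remove(self, dev, trigger):
        self.triggers.append(trigger)
        # Step 2: only a device attached to a handler can be removed.
        if not dev.attached:
            return False
        # Steps 3-4: the handler takes the range from the device object.
        self.remove_memory(dev.start, dev.size)
        dev.attached = False
        dev.eject()             # Step 5
        return True


def sci_event(kernel, dev):
    """Trigger 1: hardware-initiated removal, signaled by an SCI."""
    return kernel.hot_remove(dev, "sci")


def sysfs_eject_write(kernel, dev, value):
    """Trigger 2: a write to the eject file; only "1" requests removal."""
    if value.strip() != "1":
        return False
    return kernel.hot_remove(dev, "sysfs")


kernel = Kernel([(0x100000000, 0x8000000), (0x108000000, 0x8000000)])
dev_a = MemoryDevice(0x100000000, 0x8000000)
dev_b = MemoryDevice(0x108000000, 0x8000000)
by_sci = sci_event(kernel, dev_a)
by_sysfs = sysfs_eject_write(kernel, dev_b, "1\n")
```

Both entry points end in the same `hot_remove` call, so the eject method runs in either case; only the recorded trigger differs.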

Thanks. Another question: what is the role of udev in memory hotplug?

>
> Thanks,
> Yasuaki Ishimatsu
>
>
>>
>>>
>>> Thanks,
>>> -Toshi
>>>
>>>
>>
>
>

