From: David Hildenbrand <david@redhat.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: mhocko@suse.com, dave.hansen@intel.com, osalvador@suse.de,
akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 1/2] mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section()
Date: Mon, 3 Dec 2018 12:25:20 +0100
Message-ID: <e44018ff-b3d1-a1e2-3496-9554ff148fc4@redhat.com>
In-Reply-To: <20181130042815.t44nroyqcqa3tpgv@master>

On 30.11.18 05:28, Wei Yang wrote:
> On Thu, Nov 29, 2018 at 05:06:15PM +0100, David Hildenbrand wrote:
>> On 29.11.18 16:53, Wei Yang wrote:
>>> pgdat_resize_lock is used to protect pgdat's memory region information
>>> like: node_start_pfn, node_present_pages, etc. In
>>> sparse_add/remove_one_section(), however, pgdat_resize_lock is used to protect
>>> the initialization/release of one mem_section. This does not look proper.
>>>
>>> Based on the current implementation, even if this lock is removed, mem_section
>>> is still free from contention, because it is protected by the global
>>> mem_hotpulg_lock.
>>
>> s/mem_hotpulg_lock/mem_hotplug_lock/
>>
>>>
>>> Following is the current call trace of sparse_add/remove_one_section()
>>>
>>>     mem_hotplug_begin()
>>>     arch_add_memory()
>>>        add_pages()
>>>            __add_pages()
>>>                __add_section()
>>>                    sparse_add_one_section()
>>>     mem_hotplug_done()
>>>
>>>     mem_hotplug_begin()
>>>     arch_remove_memory()
>>>         __remove_pages()
>>>             __remove_section()
>>>                 sparse_remove_one_section()
>>>     mem_hotplug_done()
>>>
>>> The comment above the pgdat_resize_lock also mentions "Holding this will
>>> also guarantee that any pfn_valid() stays that way.", which is true with
>>> the current implementation and false after this patch. But the current
>>> implementation doesn't live up to this comment: no pfn walkers take the
>>> lock, so this looks like a relic from the past. This patch also removes
>>> this comment.
>>
>> Should we start to document which lock is expected to protect what?
>>
>> I suggest adding what you just found out to
>> Documentation/admin-guide/mm/memory-hotplug.rst "Locking Internals".
>> Maybe a new subsection for mem_hotplug_lock. And eventually also
>> pgdat_resize_lock.
>
> Well, I am not good at writing documentation. Below is my first attempt.
> Looking forward to your comments.
>
> BTW, if I send a new version with this, should I put it into a separate
> patch or merge it into the current one?
>
> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
> index 5c4432c96c4b..1548820a0762 100644
> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
BTW, it really should go into
Documentation/core-api/memory-hotplug.rst

Something went wrong while merging this in linux-next, so now we have
duplicate documentation, and the "Locking Internals" section in
Documentation/admin-guide/mm/memory-hotplug.rst has to go.
> @@ -396,6 +396,20 @@ Need more implementation yet....
> Locking Internals
> =================
>
> +There are three locks involved in memory-hotplug, two global locks and one
> +local lock:
> +
> +- device_hotplug_lock
> +- mem_hotplug_lock
> +- device_lock
> +
> +Currently, they are intertwined for a variety of reasons. The following
> +is divided into device_hotplug_lock and mem_hotplug_lock parts to describe
> +those tricky situations.
> +
> +device_hotplug_lock
> +---------------------
> +
> When adding/removing memory that uses memory block devices (i.e. ordinary RAM),
> the device_hotplug_lock should be held to:
>
> @@ -417,14 +431,21 @@ memory faster than expected:
> As the device is visible to user space before taking the device_lock(), this
> can result in a lock inversion.
>
> +mem_hotplug_lock
> +---------------------
> +
> onlining/offlining of memory should be done via device_online()/
> -device_offline() - to make sure it is properly synchronized to actions
> -via sysfs. Holding device_hotplug_lock is advised (to e.g. protect online_type)
> +device_offline() - to make sure it is properly synchronized to actions via
> +sysfs. Even though mem_hotplug_lock is used to protect the process, because of
> +the lock inversion described above, holding device_hotplug_lock is still
> +advised (to e.g. protect online_type)
>
> When adding/removing/onlining/offlining memory or adding/removing
> heterogeneous/device memory, we should always hold the mem_hotplug_lock in
> write mode to serialise memory hotplug (e.g. access to global/zone
> -variables).
> +variables). Currently, we take advantage of this to serialise sparsemem's
> +mem_section handling in sparse_add_one_section() and
> +sparse_remove_one_section().
>
> In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) in read
> mode allows for a quite efficient get_online_mems/put_online_mems
>
>>
>>
>> Thanks,
>>
>> David / dhildenb
>
--
Thanks,
David / dhildenb