From: Wei Yang <richardw.yang@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	Michal Hocko <mhocko@suse.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH v9 01/12] mm/sparsemem: Introduce struct mem_section_usage
Date: Wed, 19 Jun 2019 10:13:50 +0800
Message-ID: <20190619021350.GA11514@richard>
In-Reply-To: <CAPcyv4hw2W3=CkrUmWtvu3cAdo3GLRhG0=G_RO7xQBugNB2htA@mail.gmail.com>

On Tue, Jun 18, 2019 at 02:56:09PM -0700, Dan Williams wrote:
>On Sun, Jun 16, 2019 at 6:11 AM Wei Yang <richard.weiyang@gmail.com> wrote:
>>
>> On Wed, Jun 05, 2019 at 02:57:54PM -0700, Dan Williams wrote:
>> >Towards enabling memory hotplug to track partial population of a
>> >section, introduce 'struct mem_section_usage'.
>> >
>> >A pointer to a 'struct mem_section_usage' instance replaces the existing
>> >pointer to a 'pageblock_flags' bitmap. Effectively it adds one more
>> >'unsigned long' beyond the 'pageblock_flags' (usemap) allocation to
>> >house a new 'subsection_map' bitmap.  The new bitmap enables the memory
>> >hot{plug,remove} implementation to act on incremental sub-divisions of a
>> >section.
>> >
>> >The default SUBSECTION_SHIFT is chosen to keep the 'subsection_map' no
>> >larger than a single 'unsigned long' on the major architectures.
>> >Alternatively an architecture can define ARCH_SUBSECTION_SHIFT to
>> >override the default PMD_SHIFT. Note that PowerPC needs to use
>> >ARCH_SUBSECTION_SHIFT to work around PMD_SHIFT being a non-constant
>> >expression on PowerPC.
>> >
>> >The primary motivation for this functionality is to support platforms
>> >that mix "System RAM" and "Persistent Memory" within a single section,
>> >or multiple PMEM ranges with different mapping lifetimes within a single
>> >section. The section restriction for hotplug has caused an ongoing saga
>> >of hacks and bugs for devm_memremap_pages() users.
>> >
>> >Beyond the fixups to teach existing paths how to retrieve the 'usemap'
>> >from a section, and updates to usemap allocation path, there are no
>> >expected behavior changes.
>> >
>> >Cc: Michal Hocko <mhocko@suse.com>
>> >Cc: Vlastimil Babka <vbabka@suse.cz>
>> >Cc: Logan Gunthorpe <logang@deltatee.com>
>> >Cc: Oscar Salvador <osalvador@suse.de>
>> >Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
>> >Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> >Cc: Paul Mackerras <paulus@samba.org>
>> >Cc: Michael Ellerman <mpe@ellerman.id.au>
>> >Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> >---
>> > arch/powerpc/include/asm/sparsemem.h |    3 +
>> > include/linux/mmzone.h               |   48 +++++++++++++++++++-
>> > mm/memory_hotplug.c                  |   18 ++++----
>> > mm/page_alloc.c                      |    2 -
>> > mm/sparse.c                          |   81 +++++++++++++++++-----------------
>> > 5 files changed, 99 insertions(+), 53 deletions(-)
>> >
>> >diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
>> >index 3192d454a733..1aa3c9303bf8 100644
>> >--- a/arch/powerpc/include/asm/sparsemem.h
>> >+++ b/arch/powerpc/include/asm/sparsemem.h
>> >@@ -10,6 +10,9 @@
>> >  */
>> > #define SECTION_SIZE_BITS       24
>> >
>> >+/* Reflect the largest possible PMD-size as the subsection-size constant */
>> >+#define ARCH_SUBSECTION_SHIFT 24
>> >+
>> > #endif /* CONFIG_SPARSEMEM */
>> >
>> > #ifdef CONFIG_MEMORY_HOTPLUG
>> >diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> >index 427b79c39b3c..ac163f2f274f 100644
>> >--- a/include/linux/mmzone.h
>> >+++ b/include/linux/mmzone.h
>> >@@ -1161,6 +1161,44 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
>> > #define SECTION_ALIGN_UP(pfn) (((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
>> > #define SECTION_ALIGN_DOWN(pfn)       ((pfn) & PAGE_SECTION_MASK)
>> >
>> >+/*
>> >+ * SUBSECTION_SHIFT must be constant since it is used to declare
>> >+ * subsection_map and related bitmaps without triggering the generation
>> >+ * of variable-length arrays. The most natural size for a subsection is
>> >+ * a PMD-page. For architectures that do not have a constant PMD-size
>> >+ * ARCH_SUBSECTION_SHIFT can be set to a constant max size, or otherwise
>> >+ * fallback to 2MB.
>> >+ */
>> >+#if defined(ARCH_SUBSECTION_SHIFT)
>> >+#define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
>> >+#elif defined(PMD_SHIFT)
>> >+#define SUBSECTION_SHIFT (PMD_SHIFT)
>> >+#else
>> >+/*
>> >+ * Memory hotplug enabled platforms avoid this default because they
>> >+ * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
>> >+ * this is kept as a backstop to allow compilation on
>> >+ * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
>> >+ */
>> >+#define SUBSECTION_SHIFT 21
>> >+#endif
>> >+
>> >+#define PFN_SUBSECTION_SHIFT (SUBSECTION_SHIFT - PAGE_SHIFT)
>> >+#define PAGES_PER_SUBSECTION (1UL << PFN_SUBSECTION_SHIFT)
>> >+#define PAGE_SUBSECTION_MASK ((~(PAGES_PER_SUBSECTION-1)))
>>
>> One pair of brackets could be removed, IMHO.
>
>Sure.
>
>>
>> >+
>> >+#if SUBSECTION_SHIFT > SECTION_SIZE_BITS
>> >+#error Subsection size exceeds section size
>> >+#else
>> >+#define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))
>> >+#endif
>> >+
>> >+struct mem_section_usage {
>> >+      DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
>> >+      /* See declaration of similar field in struct zone */
>> >+      unsigned long pageblock_flags[0];
>> >+};
>> >+
>> > struct page;
>> > struct page_ext;
>> > struct mem_section {
>> >@@ -1178,8 +1216,7 @@ struct mem_section {
>> >        */
>> >       unsigned long section_mem_map;
>> >
>> >-      /* See declaration of similar field in struct zone */
>> >-      unsigned long *pageblock_flags;
>> >+      struct mem_section_usage *usage;
>> > #ifdef CONFIG_PAGE_EXTENSION
>> >       /*
>> >        * If SPARSEMEM, pgdat doesn't have page_ext pointer. We use
>> >@@ -1210,6 +1247,11 @@ extern struct mem_section **mem_section;
>> > extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
>> > #endif
>> >
>> >+static inline unsigned long *section_to_usemap(struct mem_section *ms)
>> >+{
>> >+      return ms->usage->pageblock_flags;
>>
>> Do we need to consider the case when ms->usage is NULL?
>
>No, this routine safely assumes it is always set.

Then everything looks good to me.

Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>

-- 
Wei Yang
Help you, Help me


