linux-mm.kvack.org archive mirror
From: Dan Williams <dan.j.williams@intel.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Linux MM <linux-mm@kvack.org>, Ira Weiny <ira.weiny@intel.com>,
	 linux-nvdimm <linux-nvdimm@lists.01.org>,
	Matthew Wilcox <willy@infradead.org>,
	 Jason Gunthorpe <jgg@ziepe.ca>, Jane Chu <jane.chu@oracle.com>,
	 Muchun Song <songmuchun@bytedance.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	 Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v1 04/11] mm/memremap: add ZONE_DEVICE support for compound pages
Date: Mon, 7 Jun 2021 13:17:33 -0700
Message-ID: <CAPcyv4j_PdzytEeabe95FrUiNVNobdJRvUE9M9j0krKQ1defBg@mail.gmail.com>
In-Reply-To: <8c922a58-c901-1ad9-5d19-1182bd6dea1e@oracle.com>

On Tue, May 18, 2021 at 10:28 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 5/5/21 11:36 PM, Joao Martins wrote:
> > On 5/5/21 11:20 PM, Dan Williams wrote:
> >> On Wed, May 5, 2021 at 12:50 PM Joao Martins <joao.m.martins@oracle.com> wrote:
> >>> On 5/5/21 7:44 PM, Dan Williams wrote:
> >>>> On Thu, Mar 25, 2021 at 4:10 PM Joao Martins <joao.m.martins@oracle.com> wrote:
> >>>>> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> >>>>> index b46f63dcaed3..bb28d82dda5e 100644
> >>>>> --- a/include/linux/memremap.h
> >>>>> +++ b/include/linux/memremap.h
> >>>>> @@ -114,6 +114,7 @@ struct dev_pagemap {
> >>>>>         struct completion done;
> >>>>>         enum memory_type type;
> >>>>>         unsigned int flags;
> >>>>> +       unsigned long align;
> >>>>
> >>>> I think this wants some kernel-doc above to indicate that non-zero
> >>>> means "use compound pages with tail-page dedup" and zero / PAGE_SIZE
> >>>> means "use non-compound base pages".
>
> [...]
>
> >>>> The non-zero value must be PAGE_SIZE, PMD_PAGE_SIZE or PUD_PAGE_SIZE.
> >>>> Hmm, maybe it should be an enum:
> >>>>
> >>>> enum devmap_geometry {
> >>>>     DEVMAP_PTE,
> >>>>     DEVMAP_PMD,
> >>>>     DEVMAP_PUD,
> >>>> };
> >>>>
> >>> I suppose a converter between devmap_geometry and page_size would be needed too? And maybe
> >>> the whole dax/nvdimm align values change meanwhile (as a followup improvement)?
> >>
> >> I think it is ok for dax/nvdimm to continue to maintain their align
> >> value because it should be ok to have 4MB align if the device really
> >> wanted. However, when it goes to map that alignment with
> >> memremap_pages() it can pick a mode. For example, it's already the
> >> case that dax->align == 1GB is mapped with DEVMAP_PTE today, so
> >> they're already separate concepts that can stay separate.
> >>
> > Gotcha.
>
> I am reconsidering part of the above. In general, yes, the devmap @align is a slightly
> different variation of the device @align, i.e. it describes how the metadata is laid out,
> **but** that holds regardless of what kind of page table entries we use for the vmemmap.
>
> By using DEVMAP_PTE/PMD/PUD we might run into three issues: 1) we duplicate what nvdimm/dax
> already validates in terms of allowed device @align values (i.e. PAGE_SIZE, PMD_SIZE and
> PUD_SIZE); 2) the geometry of the metadata is very much tied to the value we pick for @align
> at namespace provisioning -- not the "align" we might use at mmap(); perhaps that's what you
> referred to above? -- and 3) the value of geometry actually derives from the dax device
> @align, because we will need to create compound pages representing a page size of the
> @align value.
>
> Using your example above: you're saying that dax->align == 1G is mapped with DEVMAP_PTEs,
> but in reality the vmemmap is populated with PMD/PUD page table entries (depending on what
> archs decide to do at vmemmap_populate()) and uses base pages as its metadata regardless of
> the device @align. What we want to convey in @geometry is not page table sizes, but
> just the page size used for the vmemmap of the dax device.

Good point, the names "PTE, PMD, PUD" imply the hardware mapping size,
not the software compound page size.
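
To make the naming problem concrete, here is a rough userspace sketch of
the converter such an enum would need. The helper name
devmap_geometry_size() is hypothetical, and the shift values are assumed
x86-64 4K-page values defined locally so the sketch builds outside the
kernel. All the enum ends up carrying is a compound page size in bytes;
nothing in it says which page table level maps the vmemmap.

```c
#include <assert.h>

/* Assumption: x86-64 with 4K base pages; local names, not kernel macros. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SHIFT  21
#define SKETCH_PUD_SHIFT  30

enum devmap_geometry {
	DEVMAP_PTE,
	DEVMAP_PMD,
	DEVMAP_PUD,
};

/*
 * Hypothetical converter: map a geometry label to the compound page
 * size in bytes that the 'struct page' metadata would describe.
 */
static unsigned long devmap_geometry_size(enum devmap_geometry geo)
{
	switch (geo) {
	case DEVMAP_PTE:
		return 1UL << SKETCH_PAGE_SHIFT;
	case DEVMAP_PMD:
		return 1UL << SKETCH_PMD_SHIFT;
	case DEVMAP_PUD:
		return 1UL << SKETCH_PUD_SHIFT;
	}

	/* Not one of the three labels: trap in debug builds, else report 0. */
	assert(!"unknown devmap_geometry value");
	return 0;
}
```

Under those assumptions DEVMAP_PTE, DEVMAP_PMD and DEVMAP_PUD come out as
4K, 2M and 1G respectively, and no other size can be expressed.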

> Additionally, limiting its
> value might not be desirable... if tomorrow Linux for some arch supports dax/nvdimm
> devices with 4M align or 64K align, the value of @geometry will have to reflect the 4M to
> create compound pages of order 10 for the said vmemmap.
>
> I am going to wait until you finish reviewing the remaining four patches of this series,
> but maybe this is a simple misnomer (s/align/geometry/) that needs a comment, but without
> the DEVMAP_{PTE,PMD,PUD} enum part? Or perhaps its own struct with a value and enum, plus a
> setter/getter to audit its value? Thoughts?

I do see what you mean about the confusion DEVMAP_{PTE,PMD,PUD}
introduces, but I still think the device-dax align and the
organization of the 'struct page' metadata are distinct concepts. So
I'm happy with any color of the bikeshed as long as the two concepts are
distinct. How about calling it "compound_page_order"? Open to other
ideas...
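
For what it's worth, a field expressed as an order would also cover the
4M and 64K cases raised above. A minimal userspace sketch, assuming 4K
base pages; the helper names are made up for illustration and are not
existing kernel API:

```c
#include <assert.h>

/* Assumption: 4K base pages; local names, not the kernel's macros. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper: turn a compound page size in bytes into an order,
 * i.e. log2 of the number of base pages covered by one compound page.
 * The size must be a power of two and at least one base page.
 */
static unsigned int size_to_compound_page_order(unsigned long size)
{
	unsigned int order = 0;

	assert(size >= SKETCH_PAGE_SIZE);
	assert((size & (size - 1)) == 0);

	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;

	return order;
}

/* Hypothetical helper: number of base pages in a compound page of @order. */
static unsigned long compound_page_order_nr_pages(unsigned int order)
{
	return 1UL << order;
}
```

With 4K base pages this gives order 4 for 64K, order 9 for 2M, order 10
for 4M and order 18 for 1G, so the value is not limited to the three
sizes the enum could name.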

