From: Toshi Kani <toshi.kani@hpe.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "linux-nvdimm@lists.01.org" <linux-nvdimm@ml01.01.org>,
linux-mm <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 2/3] libnvdimm, pmem: adjust for section collisions with 'System RAM'
Date: Mon, 07 Mar 2016 10:56:53 -0700
Message-ID: <1457373413.15454.334.camel@hpe.com>
In-Reply-To: <CAA9_cmc9vjChKqs7P1NG9r66TGapw0cYHfcajWh_O+hk433MTg@mail.gmail.com>
On Fri, 2016-03-04 at 18:23 -0800, Dan Williams wrote:
> On Fri, Mar 4, 2016 at 6:48 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> > On Thu, 2016-03-03 at 13:53 -0800, Dan Williams wrote:
> > > On a platform where 'Persistent Memory' and 'System RAM' are mixed
> > > > within a given sparsemem section, trim the namespace and notify about the
> > > > sub-optimal alignment.
> > >
> > > Cc: Toshi Kani <toshi.kani@hpe.com>
> > > Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > ---
> > > drivers/nvdimm/namespace_devs.c | 7 ++
> > > drivers/nvdimm/pfn.h | 10 ++-
> > > drivers/nvdimm/pfn_devs.c | 5 ++
> > > > drivers/nvdimm/pmem.c | 125 ++++++++++++++++++++++++++++-----------
> > > 4 files changed, 111 insertions(+), 36 deletions(-)
> > >
> > > > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
> > > > index 8ebfcaae3f5a..463756ca2d4b 100644
> > > > --- a/drivers/nvdimm/namespace_devs.c
> > > > +++ b/drivers/nvdimm/namespace_devs.c
> > > > @@ -133,6 +133,7 @@ bool nd_is_uuid_unique(struct device *dev, u8 *uuid)
> > > bool pmem_should_map_pages(struct device *dev)
> > > {
> > > struct nd_region *nd_region = to_nd_region(dev->parent);
> > > + struct nd_namespace_io *nsio;
> > >
> > > if (!IS_ENABLED(CONFIG_ZONE_DEVICE))
> > > return false;
> > > @@ -143,6 +144,12 @@ bool pmem_should_map_pages(struct device *dev)
> > > if (is_nd_pfn(dev) || is_nd_btt(dev))
> > > return false;
> > >
> > > + nsio = to_nd_namespace_io(dev);
> > > > + if (region_intersects(nsio->res.start, resource_size(&nsio->res),
> > > > + IORESOURCE_SYSTEM_RAM,
> > > > + IORES_DESC_NONE) == REGION_MIXED)
> >
> > > Should this be != REGION_DISJOINT to be safe?
> >
> > Actually, it's ok. It doesn't need to be disjoint. The problem is
> mixing an mm-zone within a given section. If the region intersects
> system-ram then devm_memremap_pages() is a no-op and we can use the
> existing page allocation and linear mapping.

Oh, I see.

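
To spell out the distinction for anyone following along, here is a simplified
user-space model of the three outcomes. It is only a sketch, not the kernel's
region_intersects(): struct mem_range, classify_region() and the flat array
standing in for the resource tree are hypothetical names made up for
illustration. The point it shows is that only a range touching both
'System RAM' and something else is reported as mixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Result names modeled on the kernel's; this is a user-space sketch. */
enum { REGION_INTERSECTS, REGION_DISJOINT, REGION_MIXED };

/* Hypothetical stand-in for a top-level resource; 'end' is inclusive. */
struct mem_range {
	uint64_t start;
	uint64_t end;
	bool is_ram;
};

/* Classify [start, start + size) against a flat list of ranges. */
int classify_region(const struct mem_range *map, size_t n,
		    uint64_t start, uint64_t size)
{
	assert(size > 0);

	uint64_t end = start + size - 1;
	int ram = 0, other = 0;

	for (size_t i = 0; i < n; i++) {
		/* Skip ranges that do not overlap the query at all. */
		if (map[i].end < start || map[i].start > end)
			continue;
		if (map[i].is_ram)
			ram++;
		else
			other++;
	}

	if (ram && other)
		return REGION_MIXED;	/* RAM and non-RAM share the range */
	return ram ? REGION_INTERSECTS : REGION_DISJOINT;
}
```

With this model, a range that lies entirely in RAM comes back as
REGION_INTERSECTS, which is the case where the existing page allocation and
linear mapping can be reused.
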
> >
> > > + return false;
> > > +
> >
> > :
> >
> > > @@ -304,21 +311,56 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
> > > }
> > >
> > > memset(pfn_sb, 0, sizeof(*pfn_sb));
> > > - npfns = (pmem->size - SZ_8K) / SZ_4K;
> > > +
> > > + /*
> > > > + * Check if pmem collides with 'System RAM' when section aligned and
> > > + * trim it accordingly
> > > + */
> > > + nsio = to_nd_namespace_io(&ndns->dev);
> > > + start = PHYS_SECTION_ALIGN_DOWN(nsio->res.start);
> > > + size = resource_size(&nsio->res);
> > > + if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
> > > + IORES_DESC_NONE) == REGION_MIXED) {
> > > +
> > > + start = nsio->res.start;
> > > + start_pad = PHYS_SECTION_ALIGN_UP(start) - start;
> > > + }
> > > +
> > > + start = nsio->res.start;
> > > + size = PHYS_SECTION_ALIGN_UP(start + size) - start;
> > > + if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
> > > + IORES_DESC_NONE) == REGION_MIXED) {
> > > + size = resource_size(&nsio->res);
> > > > + end_trunc = start + size - PHYS_SECTION_ALIGN_DOWN(start + size);
> > > + }
> >
> > This check seems to assume that guest's regular memory layout does not
> > change. That is, if there is no collision at first, there won't be any
> > later. Is this a valid assumption?
>
> If platform firmware changes the physical alignment during the
> lifetime of the namespace there's not much we can do.

The physical alignment can be changed as long as it is large enough (see
below).

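
For reference, the trimming arithmetic in the quoted hunk can be sketched in
user space as follows. This is an illustration under stated assumptions, not
the patch itself: SKETCH_SECTION_SIZE assumes the 128MiB x86 section size,
struct pmem_trim and compute_trim() are hypothetical names, and the two
booleans stand in for the results of the two region_intersects() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 128MiB sparsemem sections, as on x86-64. */
#define SKETCH_SECTION_SIZE (UINT64_C(1) << 27)

uint64_t section_align_down(uint64_t addr)
{
	return addr & ~(SKETCH_SECTION_SIZE - 1);
}

uint64_t section_align_up(uint64_t addr)
{
	return section_align_down(addr + SKETCH_SECTION_SIZE - 1);
}

/* Hypothetical container for the two trim amounts. */
struct pmem_trim {
	uint64_t start_pad;	/* bytes skipped at the head */
	uint64_t end_trunc;	/* bytes dropped at the tail */
};

/*
 * head_collides / tail_collides model "the section-aligned head (or tail)
 * of the namespace is mixed with 'System RAM'".
 */
struct pmem_trim compute_trim(uint64_t start, uint64_t size,
			      bool head_collides, bool tail_collides)
{
	assert(size > 0);

	struct pmem_trim t = { 0, 0 };
	uint64_t end = start + size;

	if (head_collides)
		t.start_pad = section_align_up(start) - start;
	if (tail_collides)
		t.end_trunc = end - section_align_down(end);
	return t;
}
```

For example, a 256MiB namespace starting at 4GiB + 64MiB that collides at
both ends loses 64MiB at the head and 64MiB at the tail, while a namespace
that is already section aligned loses nothing.
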
> Another problem
> not addressed by this patch is firmware choosing to hot plug system
> ram into the same section as persistent memory.

Yes, and it does not have to be a hot-plug operation. The memory size may be
changed offline. A data image can be copied to different guests for instant
deployment, or may be migrated to a different guest.

> As far as I can see
> all we do is ask firmware implementations to respect Linux section
> boundaries and otherwise not change alignments.

In addition to the requirement that the pmem range alignment may not change,
the code also requires that a regular memory range does not change to
intersect with a pmem section later. This seems fragile to me since the guest
config may vary or change, as I mentioned above.

So, shouldn't the driver fail to attach when the range is not aligned to
the section size? Since we need to place a requirement on firmware anyway,
we can simply state that it must be aligned to 128MiB (at least) on x86.
Then, memory and pmem physical layouts can be changed as long as this
requirement is met.
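
A minimal sketch of the check I have in mind, written as user-space C so it
can be tried standalone. The helper name pmem_range_is_section_aligned() and
the X86_SECTION_SIZE_SKETCH constant are hypothetical and not taken from the
patch; the 128MiB value is the x86 section size assumed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 128MiB section size on x86. */
#define X86_SECTION_SIZE_SKETCH (UINT64_C(128) << 20)

/*
 * Return true when both ends of [start, start + size) fall on a section
 * boundary. A driver following this proposal would refuse to attach
 * when this returns false.
 */
bool pmem_range_is_section_aligned(uint64_t start, uint64_t size,
				   uint64_t section_size)
{
	/* The mask trick below only works for power-of-two sizes. */
	assert(section_size != 0 &&
	       (section_size & (section_size - 1)) == 0);

	uint64_t mask = section_size - 1;

	/* An aligned start plus an aligned size implies an aligned end. */
	return ((start | size) & mask) == 0;
}
```

With a check like this there is no need to look at where regular memory
currently sits, so the result does not change when the guest's memory layout
changes later.
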
Thanks,
-Toshi
Thread overview: 11+ messages
2016-03-03 21:53 [PATCH v2 0/3] libnvdimm, pfn: support section misaligned pmem Dan Williams
2016-03-03 21:53 ` [PATCH v2 1/3] libnvdimm, pmem: fix 'pfn' support for section-misaligned namespaces Dan Williams
2016-03-03 21:53 ` [PATCH v2 2/3] libnvdimm, pmem: adjust for section collisions with 'System RAM' Dan Williams
2016-03-05 2:48 ` Toshi Kani
2016-03-05 2:23 ` Dan Williams
2016-03-07 17:56 ` Toshi Kani [this message]
2016-03-07 17:18 ` Dan Williams
2016-03-07 18:58 ` Toshi Kani
2016-03-07 18:19 ` Dan Williams
2016-03-07 18:37 ` Dan Williams
2016-03-03 21:53 ` [PATCH v2 3/3] libnvdimm, pfn: 'resource'-address and 'size' attributes for pfn devices Dan Williams