From: Balbir Singh <bsingharora@gmail.com>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: thomas.lendacky@amd.com, mhocko@suse.com,
	linux-nvdimm@lists.01.org, tiwai@suse.de, zwisler@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	jglisse@redhat.com, fengguang.wu@intel.com,
	baiyaowei@cmss.chinamobile.com, ying.huang@intel.com,
	bhelgaas@google.com, akpm@linux-foundation.org, bp@suse.de
Subject: Re: [PATCH 0/5] [v4] Allow persistent memory to be used like normal RAM
Date: Mon, 28 Jan 2019 22:09:58 +1100
Message-ID: <20190128110958.GH26056@350D>
In-Reply-To: <20190124231441.37A4A305@viggo.jf.intel.com>

On Thu, Jan 24, 2019 at 03:14:41PM -0800, Dave Hansen wrote:
> v3 spurred a bunch of really good discussion.  Thanks to everybody
> that made comments and suggestions!
> 
> I would still love some Acks on this from the folks on cc, even if it
> is on just the patch touching your area.
> 
> Note: these are based on commit d2f33c19644 in:
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git libnvdimm-pending
> 
> Changes since v3:
>  * Move HMM-related resource warning instead of removing it
>  * Use __request_resource() directly instead of devm.
>  * Create a separate DAX_PMEM Kconfig option, complete with help text
>  * Update patch descriptions and cover letter to give a better
>    overview of use-cases and hardware where this might be useful.
> 
> Changes since v2:
>  * Updates to dev_dax_kmem_probe() in patch 5:
>    * Reject probes for devices with bad NUMA nodes.  Keeps slow
>      memory from being added to node 0.
>    * Use raw request_mem_region()
>    * Add comments about permanent reservation
>    * use dev_*() instead of printk's
>  * Add references to nvdimm documentation in descriptions
>  * Remove unneeded GPL export
>  * Add Kconfig prompt and help text
> 
> Changes since v1:
>  * Now based on git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git
>  * Use binding/unbinding from "dax bus" code
>  * Move over to a "dax bus" driver from being an nvdimm driver
> 
> --
> 
> Persistent memory is cool.  But, currently, you have to rewrite
> your applications to use it.  Wouldn't it be cool if you could
> just have it show up in your system like normal RAM and get to
> it like a slow blob of memory?  Well... have I got the patch
> series for you!
> 
> == Background / Use Cases ==
> 
> Persistent Memory (aka Non-Volatile DIMMs / NVDIMMS) themselves
> are described in detail in Documentation/nvdimm/nvdimm.txt.
> However, this documentation focuses on actually using them as
> storage.  This set is focused on using NVDIMMs as DRAM replacement.
> 
> This is intended for Intel-style NVDIMMs (aka Intel Optane DC
> persistent memory).  These DIMMs are physically persistent,
> more akin to flash than traditional RAM.  They are also expected to
> be more cost-effective than using RAM, which is why folks want this
> set in the first place.

Which variant of NVDIMM: F, P, or both?

> 
> This set is not intended for RAM-based NVDIMMs.  Those are not
> cost-effective vs. plain RAM, and using them here would simply
> be a waste.
> 

Sounds like NVDIMM (P)

> But, why would you bother with this approach?  Intel itself [1]
> has announced a hardware feature that does something very similar:
> "Memory Mode" which turns DRAM into a cache in front of persistent
> memory, which is then as a whole used as normal "RAM"?
> 
> Here are a few reasons:
> 1. The capacity of memory mode is the size of your persistent
>    memory that you dedicate.  DRAM capacity is "lost" because it
>    is used for cache.  With this, you get PMEM+DRAM capacity for
>    memory.
> 2. DRAM acts as a cache with memory mode, and caches can lead to
>    unpredictable latencies.  Since memory mode is all-or-nothing
>    (either all your DRAM is used as cache or none is), your entire
>    memory space is exposed to these unpredictable latencies.  This
>    solution lets you guarantee DRAM latencies if you need them.
> 3. The new "tier" of memory is exposed to software.  That means
>    that you can build tiered applications or infrastructure.  A
>    cloud provider could sell cheaper VMs that use more PMEM and
>    more expensive ones that use DRAM.  That's impossible with
>    memory mode.
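
To make reason 1 concrete, here is a small sketch of the capacity
arithmetic. The DRAM and pmem sizes are made-up example values, not
figures from this thread, and the function names are hypothetical.

```python
def memory_mode_capacity_gib(dram_gib, pmem_gib):
    """Memory mode: DRAM becomes a cache in front of pmem,
    so only the pmem size counts as usable capacity."""
    return pmem_gib


def hotplug_capacity_gib(dram_gib, pmem_gib):
    """This series: pmem is hot-added next to DRAM,
    so both sizes count as usable capacity."""
    return dram_gib + pmem_gib


# Hypothetical machine: 192 GiB of DRAM and 1536 GiB of pmem.
DRAM_GIB, PMEM_GIB = 192, 1536

print("memory mode:", memory_mode_capacity_gib(DRAM_GIB, PMEM_GIB), "GiB")
print("hot-added:  ", hotplug_capacity_gib(DRAM_GIB, PMEM_GIB), "GiB")
```
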
> 
> Don't take this as criticism of memory mode.  Memory mode is
> awesome, and doesn't strictly require *any* software changes (we
> have software changes proposed for optimizing it though).  It has
> tons of other advantages over *this* approach.  Basically, we
> believe that the approach in these patches is complementary to
> memory mode and that both can live side-by-side in harmony.
> 
> == Patch Set Overview ==
> 
> This series adds a new "driver" to which pmem devices can be
> attached.  Once attached, the memory "owned" by the device is
> hot-added to the kernel and managed like any other memory.  On
> systems with an HMAT (a new ACPI table), each socket (roughly)
> will have a separate NUMA node for its persistent memory so
> this newly-added memory can be selected by its unique NUMA
> node.


NUMA is a distance-based topology; does HMAT solve these problems?
How do we prevent pmem nodes from ending up as fallback nodes for
normal nodes? After an unexpected crash/failure, is there a scrubbing
mechanism, or do we rely on the allocator to do the right thing before
reallocating any memory? Will frequent zeroing hurt the lifetime of
NVDIMM/pmem?
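
To illustrate the fallback concern, here is a rough sketch of a purely
distance-ordered fallback list. It is a simplified model, not the
kernel's actual zonelist construction, which may weigh more than
distance. The distance values and the DRAM/pmem node split are
hypothetical; the row format matches what
/sys/devices/system/node/node<N>/distance reports (space-separated
integers), and the function names are made up for this example.

```python
def parse_distance_line(line):
    """Parse one node's distance row, e.g. the contents of
    /sys/devices/system/node/node0/distance."""
    return [int(tok) for tok in line.split()]


def fallback_order(node, distances):
    """Return the other nodes sorted by increasing distance from
    `node`, breaking ties by node id (simplified model)."""
    row = distances[node]
    others = [n for n in range(len(row)) if n != node]
    return sorted(others, key=lambda n: (row[n], n))


# Hypothetical 2-socket system: nodes 0 and 1 are DRAM,
# nodes 2 and 3 are the pmem nodes of socket 0 and socket 1.
DISTANCES = [
    parse_distance_line("10 21 17 28"),
    parse_distance_line("21 10 28 17"),
    parse_distance_line("17 28 10 28"),
    parse_distance_line("28 17 28 10"),
]
PMEM_NODES = {2, 3}

for node in (0, 1):
    labels = [
        f"{n}({'pmem' if n in PMEM_NODES else 'dram'})"
        for n in fallback_order(node, DISTANCES)
    ]
    print(f"node {node} falls back to: {' -> '.join(labels)}")
```

With these made-up distances, the local pmem node sorts ahead of the
remote DRAM node, which is the case I am asking about.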

Balbir Singh.