Linux kernel -stable discussions
 help / color / mirror / Atom feed
* [PATCH v2 0/5] mm: Fix pfn_to_online_page() with respect to ZONE_DEVICE
@ 2021-01-12  9:34 Dan Williams
  2021-01-12  9:34 ` [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page() Dan Williams
  2021-01-12  9:35 ` [PATCH v2 5/5] libnvdimm/namespace: Fix visibility of namespace resource attribute Dan Williams
  0 siblings, 2 replies; 6+ messages in thread
From: Dan Williams @ 2021-01-12  9:34 UTC (permalink / raw)
  To: linux-mm
  Cc: David Hildenbrand, Dave Jiang, Ira Weiny, stable, Naoya Horiguchi,
	Qian Cai, Michal Hocko, Oscar Salvador, Michal Hocko,
	Vishal Verma, Andrew Morton, linux-nvdimm, linux-kernel

Changes since v1 [1]:
- Clarify the failing condition in patch 3 (Michal)
- Clarify how subsection collisions manifest in shipping systems
  (Michal)
- Use zone_idx() (Michal)
- Move section_taint_zone_device() conditions to
  move_pfn_range_to_zone() (Michal)
- Fix pfn_to_online_page() to account for pfn_valid() vs
  pfn_section_valid() confusion (David)

[1]: http://lore.kernel.org/r/160990599013.2430134.11556277600719835946.stgit@dwillia2-desk3.amr.corp.intel.com

---

Michal reminds that the discussion about how to ensure pfn-walkers do
not get confused by ZONE_DEVICE pages never resolved. A pfn-walker that
uses pfn_to_online_page() may inadvertently translate a pfn as online
and in the page allocator, when it is offline managed by a ZONE_DEVICE
mapping (details in Patch 3: ("mm: Teach pfn_to_online_page() about
ZONE_DEVICE section collisions")).

The 2 proposals under consideration are teach pfn_to_online_page() to be
precise in the presence of mixed-zone sections, or teach the memory-add
code to drop the System RAM associated with ZONE_DEVICE collisions. In
order to not regress memory capacity by a few 10s to 100s of MiB the
approach taken in this set is to add precision to pfn_to_online_page().

In the course of validating pfn_to_online_page() a couple other fixes
fell out:

1/ soft_offline_page() fails to drop the reference taken in the
   madvise(..., MADV_SOFT_OFFLINE) case.

2/ The libnvdimm sysfs attribute visibility code was failing to publish
   the resource base for memmap=ss!nn defined namespaces. This is needed
   for the regression test for soft_offline_page().

---

Dan Williams (5):
      mm: Move pfn_to_online_page() out of line
      mm: Teach pfn_to_online_page() to consider subsection validity
      mm: Teach pfn_to_online_page() about ZONE_DEVICE section collisions
      mm: Fix page reference leak in soft_offline_page()
      libnvdimm/namespace: Fix visibility of namespace resource attribute


 drivers/nvdimm/namespace_devs.c |   10 +++---
 include/linux/memory_hotplug.h  |   17 +----------
 include/linux/mmzone.h          |   22 +++++++++-----
 mm/memory-failure.c             |   20 ++++++++++---
 mm/memory_hotplug.c             |   62 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 99 insertions(+), 32 deletions(-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page()
  2021-01-12  9:34 [PATCH v2 0/5] mm: Fix pfn_to_online_page() with respect to ZONE_DEVICE Dan Williams
@ 2021-01-12  9:34 ` Dan Williams
  2021-01-12  9:53   ` Oscar Salvador
  2021-01-12 10:16   ` David Hildenbrand
  2021-01-12  9:35 ` [PATCH v2 5/5] libnvdimm/namespace: Fix visibility of namespace resource attribute Dan Williams
  1 sibling, 2 replies; 6+ messages in thread
From: Dan Williams @ 2021-01-12  9:34 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Naoya Horiguchi, David Hildenbrand, Michal Hocko,
	Oscar Salvador, stable, vishal.l.verma, linux-nvdimm,
	linux-kernel

The conversion to move pfn_to_online_page() internal to
soft_offline_page() missed that the get_user_pages() reference needs to
be dropped when pfn_to_online_page() fails.

When soft_offline_page() is handed a pfn_valid() &&
!pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due to
a leaked reference.

Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 mm/memory-failure.c |   20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5a38e9eade94..78b173c7190c 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1885,6 +1885,12 @@ static int soft_offline_free_page(struct page *page)
 	return rc;
 }
 
+static void put_ref_page(struct page *page)
+{
+	if (page)
+		put_page(page);
+}
+
 /**
  * soft_offline_page - Soft offline a page.
  * @pfn: pfn to soft-offline
@@ -1910,20 +1916,26 @@ static int soft_offline_free_page(struct page *page)
 int soft_offline_page(unsigned long pfn, int flags)
 {
 	int ret;
-	struct page *page;
 	bool try_again = true;
+	struct page *page, *ref_page = NULL;
+
+	WARN_ON_ONCE(!pfn_valid(pfn) && (flags & MF_COUNT_INCREASED));
 
 	if (!pfn_valid(pfn))
 		return -ENXIO;
+	if (flags & MF_COUNT_INCREASED)
+		ref_page = pfn_to_page(pfn);
+
 	/* Only online pages can be soft-offlined (esp., not ZONE_DEVICE). */
 	page = pfn_to_online_page(pfn);
-	if (!page)
+	if (!page) {
+		put_ref_page(ref_page);
 		return -EIO;
+	}
 
 	if (PageHWPoison(page)) {
 		pr_info("%s: %#lx page already poisoned\n", __func__, pfn);
-		if (flags & MF_COUNT_INCREASED)
-			put_page(page);
+		put_ref_page(ref_page);
 		return 0;
 	}
 


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 5/5] libnvdimm/namespace: Fix visibility of namespace resource attribute
  2021-01-12  9:34 [PATCH v2 0/5] mm: Fix pfn_to_online_page() with respect to ZONE_DEVICE Dan Williams
  2021-01-12  9:34 ` [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page() Dan Williams
@ 2021-01-12  9:35 ` Dan Williams
  1 sibling, 0 replies; 6+ messages in thread
From: Dan Williams @ 2021-01-12  9:35 UTC (permalink / raw)
  To: linux-mm
  Cc: Vishal Verma, Dave Jiang, Ira Weiny, stable, linux-nvdimm,
	linux-kernel

Legacy pmem namespaces lost support for the "resource" attribute when
the code was cleaned up to put the permission visibility in the
declaration. Restore this by listing 'resource' in the default
attributes.

A new ndctl regression test for pfn_to_online_page() corner cases builds
on this fix.

Fixes: bfd2e9140656 ("libnvdimm: Simplify root read-only definition for the 'resource' attribute")
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/namespace_devs.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 6da67f4d641a..2403b71b601e 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -1635,11 +1635,11 @@ static umode_t namespace_visible(struct kobject *kobj,
 		return a->mode;
 	}
 
-	if (a == &dev_attr_nstype.attr || a == &dev_attr_size.attr
-			|| a == &dev_attr_holder.attr
-			|| a == &dev_attr_holder_class.attr
-			|| a == &dev_attr_force_raw.attr
-			|| a == &dev_attr_mode.attr)
+	/* base is_namespace_io() attributes */
+	if (a == &dev_attr_nstype.attr || a == &dev_attr_size.attr ||
+	    a == &dev_attr_holder.attr || a == &dev_attr_holder_class.attr ||
+	    a == &dev_attr_force_raw.attr || a == &dev_attr_mode.attr ||
+	    a == &dev_attr_resource.attr)
 		return a->mode;
 
 	return 0;


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page()
  2021-01-12  9:34 ` [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page() Dan Williams
@ 2021-01-12  9:53   ` Oscar Salvador
  2021-01-12 20:03     ` Dan Williams
  2021-01-12 10:16   ` David Hildenbrand
  1 sibling, 1 reply; 6+ messages in thread
From: Oscar Salvador @ 2021-01-12  9:53 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-mm, Andrew Morton, Naoya Horiguchi, David Hildenbrand,
	Michal Hocko, stable, vishal.l.verma, linux-nvdimm, linux-kernel

On Tue, Jan 12, 2021 at 01:34:58AM -0800, Dan Williams wrote:
> The conversion to move pfn_to_online_page() internal to
> soft_offline_page() missed that the get_user_pages() reference needs to
> be dropped when pfn_to_online_page() fails.

I would be more specific here wrt. get_user_pages (madvise).
soft_offline_page gets called from more places besides madvise_*.

> When soft_offline_page() is handed a pfn_valid() &&
> !pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due to
> a leaked reference.
> 
> Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

LGTM, thanks for catching this:

Reviewed-by: Oscar Salvador <osalvador@suse.de>

A nit below.

> ---
>  mm/memory-failure.c |   20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 5a38e9eade94..78b173c7190c 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1885,6 +1885,12 @@ static int soft_offline_free_page(struct page *page)
>  	return rc;
>  }
>  
> +static void put_ref_page(struct page *page)
> +{
> +	if (page)
> +		put_page(page);
> +}

I am not sure this warrants a function.
I would probably go with "if (ref_page).." in the two corresponding places,
but not feeling strong here.

> +
>  /**
>   * soft_offline_page - Soft offline a page.
>   * @pfn: pfn to soft-offline
> @@ -1910,20 +1916,26 @@ static int soft_offline_free_page(struct page *page)
>  int soft_offline_page(unsigned long pfn, int flags)
>  {
>  	int ret;
> -	struct page *page;
>  	bool try_again = true;
> +	struct page *page, *ref_page = NULL;
> +
> +	WARN_ON_ONCE(!pfn_valid(pfn) && (flags & MF_COUNT_INCREASED));

Did you see any scenario where this could happen? I understand that you are
adding this because we will leak a reference in case pfn is not valid anymore.

-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page()
  2021-01-12  9:34 ` [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page() Dan Williams
  2021-01-12  9:53   ` Oscar Salvador
@ 2021-01-12 10:16   ` David Hildenbrand
  1 sibling, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2021-01-12 10:16 UTC (permalink / raw)
  To: Dan Williams, linux-mm
  Cc: Andrew Morton, Naoya Horiguchi, Michal Hocko, Oscar Salvador,
	stable, vishal.l.verma, linux-nvdimm, linux-kernel

On 12.01.21 10:34, Dan Williams wrote:
> The conversion to move pfn_to_online_page() internal to
> soft_offline_page() missed that the get_user_pages() reference needs to
> be dropped when pfn_to_online_page() fails.
> 
> When soft_offline_page() is handed a pfn_valid() &&
> !pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due to
> a leaked reference.
> 
> Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  mm/memory-failure.c |   20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 5a38e9eade94..78b173c7190c 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1885,6 +1885,12 @@ static int soft_offline_free_page(struct page *page)
>  	return rc;
>  }
>  
> +static void put_ref_page(struct page *page)
> +{
> +	if (page)
> +		put_page(page);
> +}
> +
>  /**
>   * soft_offline_page - Soft offline a page.
>   * @pfn: pfn to soft-offline
> @@ -1910,20 +1916,26 @@ static int soft_offline_free_page(struct page *page)
>  int soft_offline_page(unsigned long pfn, int flags)
>  {
>  	int ret;
> -	struct page *page;
>  	bool try_again = true;
> +	struct page *page, *ref_page = NULL;
> +
> +	WARN_ON_ONCE(!pfn_valid(pfn) && (flags & MF_COUNT_INCREASED));
>  
>  	if (!pfn_valid(pfn))
>  		return -ENXIO;
> +	if (flags & MF_COUNT_INCREASED)
> +		ref_page = pfn_to_page(pfn);
> +
>  	/* Only online pages can be soft-offlined (esp., not ZONE_DEVICE). */
>  	page = pfn_to_online_page(pfn);
> -	if (!page)
> +	if (!page) {
> +		put_ref_page(ref_page);
>  		return -EIO;
> +	}
>  
>  	if (PageHWPoison(page)) {
>  		pr_info("%s: %#lx page already poisoned\n", __func__, pfn);
> -		if (flags & MF_COUNT_INCREASED)
> -			put_page(page);
> +		put_ref_page(ref_page);
>  		return 0;
>  	}

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page()
  2021-01-12  9:53   ` Oscar Salvador
@ 2021-01-12 20:03     ` Dan Williams
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Williams @ 2021-01-12 20:03 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Linux MM, Andrew Morton, Naoya Horiguchi, David Hildenbrand,
	Michal Hocko, stable, Vishal L Verma, linux-nvdimm,
	Linux Kernel Mailing List

On Tue, Jan 12, 2021 at 1:54 AM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Tue, Jan 12, 2021 at 01:34:58AM -0800, Dan Williams wrote:
> > The conversion to move pfn_to_online_page() internal to
> > soft_offline_page() missed that the get_user_pages() reference needs to
> > be dropped when pfn_to_online_page() fails.
>
> I would be more specific here wrt. get_user_pages (madvise).
> soft_offline_page gets called from more places besides madvise_*.

Sure.

>
> > When soft_offline_page() is handed a pfn_valid() &&
> > !pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due to
> > a leaked reference.
> >
> > Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> LGTM, thanks for catching this:
>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>
> A nit below.
>
> > ---
> >  mm/memory-failure.c |   20 ++++++++++++++++----
> >  1 file changed, 16 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 5a38e9eade94..78b173c7190c 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1885,6 +1885,12 @@ static int soft_offline_free_page(struct page *page)
> >       return rc;
> >  }
> >
> > +static void put_ref_page(struct page *page)
> > +{
> > +     if (page)
> > +             put_page(page);
> > +}
>
> I am not sure this warrants a function.
> I would probably go with "if (ref_page).." in the two corresponding places,
> but not feeling strong here.

I'll take another look, it felt cluttered...

>
> > +
> >  /**
> >   * soft_offline_page - Soft offline a page.
> >   * @pfn: pfn to soft-offline
> > @@ -1910,20 +1916,26 @@ static int soft_offline_free_page(struct page *page)
> >  int soft_offline_page(unsigned long pfn, int flags)
> >  {
> >       int ret;
> > -     struct page *page;
> >       bool try_again = true;
> > +     struct page *page, *ref_page = NULL;
> > +
> > +     WARN_ON_ONCE(!pfn_valid(pfn) && (flags & MF_COUNT_INCREASED));
>
> Did you see any scenario where this could happen? I understand that you are
> adding this because we will leak a reference in case pfn is not valid anymore.
>

I did not, more future proofing / documenting against refactoring that
fails to consider that case.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-01-12 21:47 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-01-12  9:34 [PATCH v2 0/5] mm: Fix pfn_to_online_page() with respect to ZONE_DEVICE Dan Williams
2021-01-12  9:34 ` [PATCH v2 4/5] mm: Fix page reference leak in soft_offline_page() Dan Williams
2021-01-12  9:53   ` Oscar Salvador
2021-01-12 20:03     ` Dan Williams
2021-01-12 10:16   ` David Hildenbrand
2021-01-12  9:35 ` [PATCH v2 5/5] libnvdimm/namespace: Fix visibility of namespace resource attribute Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox