stable.vger.kernel.org archive mirror
* [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery
@ 2023-06-15  1:52 Jane Chu
  2023-06-15  1:52 ` [5.15-stable PATCH 1/2] mm, hwpoison: try to recover from copy-on write faults Jane Chu
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jane Chu @ 2023-06-15  1:52 UTC (permalink / raw)
  To: stable
  Cc: tony.luck, dan.j.williams, naoya.horiguchi, linmiaohe, glider,
	jane.chu

I was able to reproduce a crash on a 5.15.y kernel during COW, when
the grandchild process attempts a write to a private page
inherited from the child process and the private page contains
an uncorrectable memory error. The way to reproduce is described
in Tony's patch, using his ras-tools/einj_mem_uc.
With the patch series applied, the panic no longer occurs on 5.15.y.

The backport encountered trivial conflicts due to missing
dependencies; details are provided in each patch.

Please let me know whether the backport is acceptable.

Tony Luck (2):
  mm, hwpoison: try to recover from copy-on write faults
  mm, hwpoison: when copy-on-write hits poison, take page offline

 include/linux/highmem.h | 24 ++++++++++++++++++++++++
 include/linux/mm.h      |  5 ++++-
 mm/memory.c             | 33 +++++++++++++++++++++++----------
 3 files changed, 51 insertions(+), 11 deletions(-)

-- 
2.18.4



* [5.15-stable PATCH 1/2] mm, hwpoison: try to recover from copy-on write faults
  2023-06-15  1:52 [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery Jane Chu
@ 2023-06-15  1:52 ` Jane Chu
  2023-06-15  1:52 ` [5.15-stable PATCH 2/2] mm, hwpoison: when copy-on-write hits poison, take page offline Jane Chu
  2023-06-19  8:28 ` [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery Greg KH
  2 siblings, 0 replies; 7+ messages in thread
From: Jane Chu @ 2023-06-15  1:52 UTC (permalink / raw)
  To: stable
  Cc: tony.luck, dan.j.williams, naoya.horiguchi, linmiaohe, glider,
	jane.chu

From: Tony Luck <tony.luck@intel.com>

commit a873dfe1032a132bf89f9e19a6ac44f5a0b78754 upstream.

Patch series "Copy-on-write poison recovery", v3.

Part 1 deals with the process that triggered the copy on write fault with
a store to a shared read-only page.  That process is sent a SIGBUS with
the usual machine check decoration to specify the virtual address of the
lost page, together with the scope.
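
For readers on the receiving end of that signal, the following is a
minimal userspace sketch of a SIGBUS handler that records the fields
described above. It is not part of the patch. The struct mce_report
type and the function names are hypothetical; si_code, si_addr and
si_addr_lsb are the standard siginfo_t fields, and BUS_MCEERR_AR and
BUS_MCEERR_AO are the Linux si_code values for machine check errors.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of the last SIGBUS seen; not part of the patch. */
struct mce_report {
	volatile sig_atomic_t seen;	/* set once the handler has run */
	int code;			/* si_code, e.g. BUS_MCEERR_AR */
	void *addr;			/* si_addr: faulting virtual address */
	int lsb;			/* si_addr_lsb: log2 of the affected range */
};

struct mce_report last_report;

static void sigbus_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	last_report.code = info->si_code;
	last_report.addr = info->si_addr;
	/* si_addr_lsb is only meaningful for the machine check codes. */
	if (info->si_code == BUS_MCEERR_AR || info->si_code == BUS_MCEERR_AO)
		last_report.lsb = info->si_addr_lsb;
	else
		last_report.lsb = -1;
	last_report.seen = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;	/* request the siginfo_t argument */
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

A real application would normally not just return from the handler
after a BUS_MCEERR_AR, since the faulting store would be retried; it
would discard or rebuild the lost page, or exit.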

Part 2 sets up to asynchronously take the page with the uncorrected error
offline to prevent additional machine check faults.  H/t to Miaohe Lin
<linmiaohe@huawei.com> and Shuai Xue <xueshuai@linux.alibaba.com> for
pointing me to the existing function to queue a call to memory_failure().

On x86 there is some duplicate reporting (because the error is
signalled by the memory controller as well as by the core that triggered
the machine check).  Console logs look like this:

This patch (of 2):

If the kernel is copying a page as the result of a copy-on-write
fault and runs into an uncorrectable error, Linux will crash because
it does not have recovery code for this case where poison is consumed
by the kernel.

It is easy to set up a test case. Just inject an error into a private
page, fork(2), and have the child process write to the page.

I wrapped that neatly into a test at:

  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/ras-tools.git

just enable ACPI error injection and run:

  # ./einj_mem_uc -f copy-on-write
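
The access pattern behind that test can be sketched in plain userspace
C as below. This is only an illustration of the fork-then-write
sequence that reaches the copy-on-write path; it does not inject any
error (that part needs ACPI EINJ and the ras-tools program above). The
function name cow_demo() is hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one private anonymous page, fork, and let the child store to it.
 * Returns the byte the parent sees afterwards, or -1 on any failure.
 */
int cow_demo(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int status, result;
	pid_t pid;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Parent touches the page so that it is populated before fork(). */
	memset(p, 'A', (size_t)page);

	pid = fork();
	if (pid < 0) {
		munmap(p, (size_t)page);
		return -1;
	}
	if (pid == 0) {
		/* Child: this store takes the write-protect (COW) fault. */
		p[0] = 'B';
		_exit(p[0] == 'B' ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) != pid ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		result = -1;
	else
		result = p[0];	/* parent still has its own copy */

	munmap(p, (size_t)page);
	return result;
}
```

The parent is expected to still read 'A', because the child's store
went to a private copy of the page.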

Add a new copy_mc_user_highpage() function that uses copy_mc_to_kernel()
on architectures where that is available (currently x86 and powerpc).
When an error is detected during the page copy, return VM_FAULT_HWPOISON
to the caller of wp_page_copy(). This propagates up the call stack. Both
x86 and powerpc have code in their fault handler to deal with this return
code by sending a SIGBUS to the application.
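
The error propagation described in the paragraph above can be modelled
in userspace roughly as follows. This is a sketch, not kernel code:
every model_* name, MODEL_PAGE_SIZE, MODEL_FAULT_HWPOISON and the
poison_at parameter are hypothetical stand-ins. The one property
carried over from the kernel is that copy_mc_to_kernel() returns the
number of bytes it could not copy, with 0 meaning success.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* asm-generic value, used only if libc lacks it */
#endif

#define MODEL_PAGE_SIZE		4096	/* hypothetical page size */
#define MODEL_FAULT_HWPOISON	0x10	/* stand-in for VM_FAULT_HWPOISON */

/*
 * Stand-in for copy_mc_to_kernel(): copies until the (simulated) poison
 * offset and returns the number of bytes NOT copied.
 */
static size_t model_copy_mc(void *dst, const void *src, size_t len,
			    size_t poison_at)
{
	size_t ok = poison_at < len ? poison_at : len;

	memcpy(dst, src, ok);
	return len - ok;
}

/* Stand-in for the patched cow_user_page(): 0 or -EHWPOISON. */
int model_cow_user_page(void *dst, const void *src, size_t poison_at)
{
	if (model_copy_mc(dst, src, MODEL_PAGE_SIZE, poison_at))
		return -EHWPOISON;
	return 0;
}

/*
 * Stand-in for the error check in wp_page_copy(): translate the errno
 * into a fault code. 0 here only means "no error bit set".
 */
unsigned int model_wp_page_copy(void *dst, const void *src, size_t poison_at)
{
	int ret = model_cow_user_page(dst, src, poison_at);

	return ret == -EHWPOISON ? MODEL_FAULT_HWPOISON : 0;
}
```

Passing poison_at >= MODEL_PAGE_SIZE simulates a clean page; any
smaller offset simulates poison being consumed partway through the
copy.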

Note that this patch avoids a system crash and signals the process that
triggered the copy-on-write action. It does not take any action for the
memory error that is still in the shared page. To handle that, a call to
memory_failure() is needed. But this cannot be done from wp_page_copy()
because it holds mmap_lock. Perhaps the architecture fault handlers
can deal with this loose end in a subsequent patch?

On Intel/x86 this loose end will often be handled automatically because
the memory controller provides an additional notification of the h/w
poison in memory, the handler for this will call memory_failure(). This
isn't a 100% solution. If there are multiple errors, not all may be
logged in this way.

Cc: <stable@vger.kernel.org>
[tony.luck@intel.com: add call to kmsan_unpoison_memory(), per Miaohe Lin]
  Link: https://lkml.kernel.org/r/20221031201029.102123-2-tony.luck@intel.com
Link: https://lkml.kernel.org/r/20221021200120.175753-1-tony.luck@intel.com
Link: https://lkml.kernel.org/r/20221021200120.175753-2-tony.luck@intel.com
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Conflicts:
	mm/memory.c
Due to missing commits
  c89357e27f20d ("mm: support GUP-triggered unsharing of anonymous pages")
  662ce1dc9caf4 ("delayacct: track delays from write-protect copy")
  b073d7f8aee4e ("mm: kmsan: maintain KMSAN metadata for page operations")
The impact of c89357e27f20d is a name change from cow_user_page() to
__wp_page_copy_user().
The impact of 662ce1dc9caf4 is the introduction of a new feature of
tracking write-protect copy in delayacct.
The impact of b073d7f8aee4e is the introduction of the KMSAN feature.
None of these commits establishes a meaningful dependency, hence resolve
by ignoring them.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 include/linux/highmem.h | 24 ++++++++++++++++++++++++
 mm/memory.c             | 31 +++++++++++++++++++++----------
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index b4c49f9cc379..87763f48c6c3 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -247,6 +247,30 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 
 #endif
 
+#ifdef copy_mc_to_kernel
+static inline int copy_mc_user_highpage(struct page *to, struct page *from,
+					unsigned long vaddr, struct vm_area_struct *vma)
+{
+	unsigned long ret;
+	char *vfrom, *vto;
+
+	vfrom = kmap_local_page(from);
+	vto = kmap_local_page(to);
+	ret = copy_mc_to_kernel(vto, vfrom, PAGE_SIZE);
+	kunmap_local(vto);
+	kunmap_local(vfrom);
+
+	return ret;
+}
+#else
+static inline int copy_mc_user_highpage(struct page *to, struct page *from,
+					unsigned long vaddr, struct vm_area_struct *vma)
+{
+	copy_user_highpage(to, from, vaddr, vma);
+	return 0;
+}
+#endif
+
 #ifndef __HAVE_ARCH_COPY_HIGHPAGE
 
 static inline void copy_highpage(struct page *to, struct page *from)
diff --git a/mm/memory.c b/mm/memory.c
index 8d71a82462dd..8dd43a6b6bd7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2753,10 +2753,16 @@ static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
 	return same;
 }
 
-static inline bool cow_user_page(struct page *dst, struct page *src,
-				 struct vm_fault *vmf)
+/*
+ * Return:
+ *	0:		copy succeeded
+ *	-EHWPOISON:	copy failed due to hwpoison in source page
+ *	-EAGAIN:	copy failed (some other reason)
+ */
+static inline int cow_user_page(struct page *dst, struct page *src,
+				      struct vm_fault *vmf)
 {
-	bool ret;
+	int ret;
 	void *kaddr;
 	void __user *uaddr;
 	bool locked = false;
@@ -2765,8 +2771,9 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	unsigned long addr = vmf->address;
 
 	if (likely(src)) {
-		copy_user_highpage(dst, src, addr, vma);
-		return true;
+		if (copy_mc_user_highpage(dst, src, addr, vma))
+			return -EHWPOISON;
+		return 0;
 	}
 
 	/*
@@ -2793,7 +2800,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 			 * and update local tlb only
 			 */
 			update_mmu_tlb(vma, addr, vmf->pte);
-			ret = false;
+			ret = -EAGAIN;
 			goto pte_unlock;
 		}
 
@@ -2818,7 +2825,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 			/* The PTE changed under us, update local tlb */
 			update_mmu_tlb(vma, addr, vmf->pte);
-			ret = false;
+			ret = -EAGAIN;
 			goto pte_unlock;
 		}
 
@@ -2837,7 +2844,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 		}
 	}
 
-	ret = true;
+	ret = 0;
 
 pte_unlock:
 	if (locked)
@@ -3003,6 +3010,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 	pte_t entry;
 	int page_copied = 0;
 	struct mmu_notifier_range range;
+	int ret;
 
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
@@ -3018,17 +3026,20 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		if (!new_page)
 			goto oom;
 
-		if (!cow_user_page(new_page, old_page, vmf)) {
+		ret = cow_user_page(new_page, old_page, vmf);
+		if (ret) {
 			/*
 			 * COW failed, if the fault was solved by other,
 			 * it's fine. If not, userspace would re-fault on
 			 * the same address and we will handle the fault
 			 * from the second attempt.
+			 * The -EHWPOISON case will not be retried.
 			 */
 			put_page(new_page);
 			if (old_page)
 				put_page(old_page);
-			return 0;
+
+			return ret == -EHWPOISON ? VM_FAULT_HWPOISON : 0;
 		}
 	}
 
-- 
2.18.4



* [5.15-stable PATCH 2/2] mm, hwpoison: when copy-on-write hits poison, take page offline
  2023-06-15  1:52 [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery Jane Chu
  2023-06-15  1:52 ` [5.15-stable PATCH 1/2] mm, hwpoison: try to recover from copy-on write faults Jane Chu
@ 2023-06-15  1:52 ` Jane Chu
  2023-06-19  8:28 ` [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery Greg KH
  2 siblings, 0 replies; 7+ messages in thread
From: Jane Chu @ 2023-06-15  1:52 UTC (permalink / raw)
  To: stable
  Cc: tony.luck, dan.j.williams, naoya.horiguchi, linmiaohe, glider,
	jane.chu

From: Tony Luck <tony.luck@intel.com>

commit d302c2398ba269e788a4f37ae57c07a7fcabaa42 upstream.

Cannot call memory_failure() directly from the fault handler because
mmap_lock (and others) are held.

It is important, but not urgent, to mark the source page as h/w poisoned
and unmap it from other tasks.

Use memory_failure_queue() to request a call to memory_failure() for the
page with the error.
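
The idea of recording the bad page now and handling it later, from a
context where no locks are held, can be sketched in userspace as a
small queue. This is an illustration only and does not reflect how
memory_failure_queue() is implemented in the kernel; all mf_model_*
names and the queue capacity are hypothetical.

```c
#include <stddef.h>

#define MF_MODEL_QUEUE_SIZE 16	/* hypothetical capacity */

struct mf_model_entry {
	unsigned long pfn;
	int flags;
};

struct mf_model_queue {
	struct mf_model_entry slot[MF_MODEL_QUEUE_SIZE];
	size_t head;	/* next entry to drain */
	size_t tail;	/* next free slot */
};

/* Example handler: remembers the last pfn it was asked to process. */
unsigned long mf_model_last_pfn;

void mf_model_record(unsigned long pfn, int flags)
{
	(void)flags;
	mf_model_last_pfn = pfn;
}

/*
 * Called from the "fault" path: only records the request and does no
 * heavy work. Returns 0 on success, -1 if the queue is full.
 */
int mf_model_enqueue(struct mf_model_queue *q, unsigned long pfn, int flags)
{
	if (q->tail - q->head == MF_MODEL_QUEUE_SIZE)
		return -1;
	q->slot[q->tail % MF_MODEL_QUEUE_SIZE] =
		(struct mf_model_entry){ pfn, flags };
	q->tail++;
	return 0;
}

/*
 * Called later, from a context that is free to take locks. Runs the
 * handler on every queued entry and returns how many were handled.
 */
size_t mf_model_drain(struct mf_model_queue *q,
		      void (*handle)(unsigned long pfn, int flags))
{
	size_t n = 0;

	while (q->head != q->tail) {
		struct mf_model_entry e = q->slot[q->head % MF_MODEL_QUEUE_SIZE];

		q->head++;
		handle(e.pfn, e.flags);
		n++;
	}
	return n;
}
```

A zero-initialized struct mf_model_queue is an empty queue; the
enqueue side stays cheap, which is the property that matters when it
is called with locks held.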

Also provide a stub version for CONFIG_MEMORY_FAILURE=n.

Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20221021200120.175753-3-tony.luck@intel.com
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Conflicts:
	include/linux/mm.h
Due to missing commits
  e591ef7d96d6e ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage")
  5033091de814a ("mm/hwpoison: introduce per-memory_block hwpoison counter")
The impact of e591ef7d96d6e is its introduction of an additional flag in
__get_huge_page_for_hwpoison() that serves as an indication a hwpoisoned
hugetlb page should have its migratable bit cleared.
The impact of 5033091de814a is contextual.
Resolve by ignoring both missing commits.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 include/linux/mm.h | 5 ++++-
 mm/memory.c        | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e4e1817bb3b8..a27a6b58d374 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3124,7 +3124,6 @@ enum mf_flags {
 	MF_SOFT_OFFLINE = 1 << 3,
 };
 extern int memory_failure(unsigned long pfn, int flags);
-extern void memory_failure_queue(unsigned long pfn, int flags);
 extern void memory_failure_queue_kick(int cpu);
 extern int unpoison_memory(unsigned long pfn);
 extern int sysctl_memory_failure_early_kill;
@@ -3133,8 +3132,12 @@ extern void shake_page(struct page *p);
 extern atomic_long_t num_poisoned_pages __read_mostly;
 extern int soft_offline_page(unsigned long pfn, int flags);
 #ifdef CONFIG_MEMORY_FAILURE
+extern void memory_failure_queue(unsigned long pfn, int flags);
 extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags);
 #else
+static inline void memory_failure_queue(unsigned long pfn, int flags)
+{
+}
 static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 {
 	return 0;
diff --git a/mm/memory.c b/mm/memory.c
index 8dd43a6b6bd7..1bb01b12db53 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2771,8 +2771,10 @@ static inline int cow_user_page(struct page *dst, struct page *src,
 	unsigned long addr = vmf->address;
 
 	if (likely(src)) {
-		if (copy_mc_user_highpage(dst, src, addr, vma))
+		if (copy_mc_user_highpage(dst, src, addr, vma)) {
+			memory_failure_queue(page_to_pfn(src), 0);
 			return -EHWPOISON;
+		}
 		return 0;
 	}
 
-- 
2.18.4



* Re: [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery
  2023-06-15  1:52 [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery Jane Chu
  2023-06-15  1:52 ` [5.15-stable PATCH 1/2] mm, hwpoison: try to recover from copy-on write faults Jane Chu
  2023-06-15  1:52 ` [5.15-stable PATCH 2/2] mm, hwpoison: when copy-on-write hits poison, take page offline Jane Chu
@ 2023-06-19  8:28 ` Greg KH
  2023-06-20 18:36   ` Jane Chu
  2 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2023-06-19  8:28 UTC (permalink / raw)
  To: Jane Chu
  Cc: stable, tony.luck, dan.j.williams, naoya.horiguchi, linmiaohe,
	glider

On Wed, Jun 14, 2023 at 07:52:53PM -0600, Jane Chu wrote:
> I was able to reproduce a crash on a 5.15.y kernel during COW, when
> the grandchild process attempts a write to a private page
> inherited from the child process and the private page contains
> an uncorrectable memory error. The way to reproduce is described
> in Tony's patch, using his ras-tools/einj_mem_uc.
> With the patch series applied, the panic no longer occurs on 5.15.y.

But you are skipping 6.1.y, which is not ok as it would cause
regressions when you upgrade.

I'll drop this from my review queue now, please provide working
backports for this and newer releases, and I'll be glad to take them.

thanks,

greg k-h


* Re: [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery
  2023-06-19  8:28 ` [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery Greg KH
@ 2023-06-20 18:36   ` Jane Chu
  2023-06-20 18:42     ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Jane Chu @ 2023-06-20 18:36 UTC (permalink / raw)
  To: Greg KH
  Cc: stable, tony.luck, dan.j.williams, naoya.horiguchi, linmiaohe,
	glider

Hi, Greg,

On 6/19/2023 1:28 AM, Greg KH wrote:
> On Wed, Jun 14, 2023 at 07:52:53PM -0600, Jane Chu wrote:
>> I was able to reproduce a crash on a 5.15.y kernel during COW, when
>> the grandchild process attempts a write to a private page
>> inherited from the child process and the private page contains
>> an uncorrectable memory error. The way to reproduce is described
>> in Tony's patch, using his ras-tools/einj_mem_uc.
>> With the patch series applied, the panic no longer occurs on 5.15.y.
> 
> But you are skipping 6.1.y, which is not ok as it would cause
> regressions when you upgrade.
> 
> I'll drop this from my review queue now, please provide working
> backports for this and newer releases, and I'll be glad to take them.
> 

Thanks for the guidance, will do.
To confirm, you're looking for backport to both 6.1.y and 5.15.y, and 
nothing else, correct?  Just curious, why 6.1.y in particular?

thanks!
-jane

> thanks,
> 
> greg k-h


* Re: [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery
  2023-06-20 18:36   ` Jane Chu
@ 2023-06-20 18:42     ` Greg KH
  2023-06-20 19:50       ` Jane Chu
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2023-06-20 18:42 UTC (permalink / raw)
  To: Jane Chu
  Cc: stable, tony.luck, dan.j.williams, naoya.horiguchi, linmiaohe,
	glider

On Tue, Jun 20, 2023 at 11:36:13AM -0700, Jane Chu wrote:
> Hi, Greg,
> 
> On 6/19/2023 1:28 AM, Greg KH wrote:
> > On Wed, Jun 14, 2023 at 07:52:53PM -0600, Jane Chu wrote:
> > > I was able to reproduce a crash on a 5.15.y kernel during COW, when
> > > the grandchild process attempts a write to a private page
> > > inherited from the child process and the private page contains
> > > an uncorrectable memory error. The way to reproduce is described
> > > in Tony's patch, using his ras-tools/einj_mem_uc.
> > > With the patch series applied, the panic no longer occurs on 5.15.y.
> > 
> > But you are skipping 6.1.y, which is not ok as it would cause
> > regressions when you upgrade.
> > 
> > I'll drop this from my review queue now, please provide working
> > backports for this and newer releases, and I'll be glad to take them.
> > 
> 
> Thanks for the guidance, will do.
> To confirm, you're looking for backport to both 6.1.y and 5.15.y, and
> nothing else, correct?  Just curious, why 6.1.y in particular?


If you don't think it needs to go to any kernels older than 5.15.y, that would
be fine.

And as for 6.1.y, look at the front page of www.kernel.org, it shows the
active kernel versions.  We can't apply a change only to an older kernel
tree, because if you upgrade to a newer one (e.g. from 5.15.y to 6.1.y),
you would have a regression, which we don't ever want.

thanks,

greg k-h


* Re: [5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery
  2023-06-20 18:42     ` Greg KH
@ 2023-06-20 19:50       ` Jane Chu
  0 siblings, 0 replies; 7+ messages in thread
From: Jane Chu @ 2023-06-20 19:50 UTC (permalink / raw)
  To: Greg KH
  Cc: stable, tony.luck, dan.j.williams, naoya.horiguchi, linmiaohe,
	glider

On 6/20/2023 11:42 AM, Greg KH wrote:
> On Tue, Jun 20, 2023 at 11:36:13AM -0700, Jane Chu wrote:
>> Hi, Greg,
>>
>> On 6/19/2023 1:28 AM, Greg KH wrote:
>>> On Wed, Jun 14, 2023 at 07:52:53PM -0600, Jane Chu wrote:
>>>> I was able to reproduce a crash on a 5.15.y kernel during COW, when
>>>> the grandchild process attempts a write to a private page
>>>> inherited from the child process and the private page contains
>>>> an uncorrectable memory error. The way to reproduce is described
>>>> in Tony's patch, using his ras-tools/einj_mem_uc.
>>>> With the patch series applied, the panic no longer occurs on 5.15.y.
>>>
>>> But you are skipping 6.1.y, which is not ok as it would cause
>>> regressions when you upgrade.
>>>
>>> I'll drop this from my review queue now, please provide working
>>> backports for this and newer releases, and I'll be glad to take them.
>>>
>>
>> Thanks for the guidance, will do.
>> To confirm, you're looking for backport to both 6.1.y and 5.15.y, and
>> nothing else, correct?  Just curious, why 6.1.y in particular?
> 
> 
> If you don't think it needs to go to any kernels older than 5.15.y, that would
> be fine.
> 
> And as for 6.1.y, look at the front page of www.kernel.org, it shows the
> active kernel versions.  We can't apply a change only to an older kernel
> tree, because if you upgrade to a newer one (e.g. from 5.15.y to 6.1.y),
> you would have a regression, which we don't ever want.
> 

Got it, thanks!

-jane

> thanks,
> 
> greg k-h
