* + mm-gup-check-page-posion-status-for-coredump-fix.patch added to -mm tree
@ 2021-03-20 0:22 akpm
2021-03-22 3:52 ` [PATCH v4] mm/gup: check page posion status for coredump Aili Yao
0 siblings, 1 reply; 3+ messages in thread
From: akpm @ 2021-03-20 0:22 UTC
To: akpm, david, mike.kravetz, mm-commits, naoya.horiguchi, osalvador,
willy, yaoaili
The patch titled
Subject: mm-gup-check-page-posion-status-for-coredump-fix
has been added to the -mm tree. Its filename is
mm-gup-check-page-posion-status-for-coredump-fix.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-gup-check-page-posion-status-for-coredump-fix.patch
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-gup-check-page-posion-status-for-coredump-fix.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-gup-check-page-posion-status-for-coredump-fix
s/0/false/
Cc: Aili Yao <yaoaili@kingsoft.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/internal.h~mm-gup-check-page-posion-status-for-coredump-fix
+++ a/mm/internal.h
@@ -115,7 +115,7 @@ static inline bool is_page_poisoned(stru
else if (PageHuge(page) && PageHWPoison(compound_head(page)))
return true;
}
- return 0;
+ return false;
}
extern unsigned long highest_memmap_pfn;
_
Patches currently in -mm which might be from akpm@linux-foundation.org are
mm.patch
mm-gup-check-page-posion-status-for-coredump-fix.patch
mm-memcontrol-switch-to-rstat-fix.patch
kasan-remove-redundant-config-option-fix.patch
mmmemory_hotplug-allocate-memmap-from-the-added-memory-range-fix.patch
linux-next-rejects.patch
kernel-forkc-export-kernel_thread-to-modules.patch
* [PATCH v4] mm/gup: check page posion status for coredump.
From: Aili Yao @ 2021-03-22 3:52 UTC
To: akpm, willy
Cc: david, mike.kravetz, mm-commits, naoya.horiguchi, osalvador,
yaoaili
Hi Andrew:
Thanks for merging the v3 patch into -mm, but there is still a modification suggested by
Matthew Wilcox that needs to be finished. I am not sure what the right process is, so I am
posting patch v4 here. If anything is wrong, please point it out.
Thanks!
When we do a coredump for a user process signal, the signal may be a SIGBUS
with code BUS_MCEERR_AR or BUS_MCEERR_AO, which means it resulted from an
ECC memory failure such as SRAR or SRAO. We expect the memory recovery
work to have finished correctly, in which case get_dump_page() will not
return the error page, because the process's pte has been set invalid by
memory_failure().
But memory_failure() may fail, and the process's related pte may not be
correctly set invalid. With the current code, we will return the poisoned
page and dump it, which leads to a system panic because the access happens
in kernel code.
So check the poison status in get_dump_page(), and if the page is poisoned,
return NULL.
There may be other scenarios where it is also better to check the poison
status and not panic, so make a wrapper for this check. Thanks to David
(<david@redhat.com>) for the suggestion.
Link: https://lkml.kernel.org/r/20210319104437.6f30e80d@alex-virtual-machine
Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Aili Yao <yaoaili@kingsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/gup.c | 4 ++++
mm/internal.h | 20 ++++++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/mm/gup.c b/mm/gup.c
index e4c224c..dcabe96 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1536,6 +1536,10 @@ struct page *get_dump_page(unsigned long addr)
FOLL_FORCE | FOLL_DUMP | FOLL_GET);
if (locked)
mmap_read_unlock(mm);
+
+ if (ret == 1 && is_page_poisoned(page))
+ return NULL;
+
return (ret == 1) ? page : NULL;
}
#endif /* CONFIG_ELF_CORE */
diff --git a/mm/internal.h b/mm/internal.h
index 25d2b2439..dcd2051 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -97,6 +97,26 @@ static inline void set_page_refcounted(struct page *page)
set_page_count(page, 1);
}
+/*
+ * When the kernel touches a user page, the page may already have been
+ * marked hwpoison while still being mapped in user space. If the kernel
+ * can guarantee data integrity and complete the operation without this
+ * page, it is better to check the poison status and avoid touching the
+ * page, so that we do not panic. A coredump for a fatal process signal is
+ * an example of this scenario. If the kernel cannot guarantee data
+ * integrity, it is better not to call this function and to let the
+ * kernel touch the poisoned page and panic.
+ */
+static inline bool is_page_poisoned(struct page *page)
+{
+ if (PageHWPoison(page))
+ return true;
+ else if (PageHuge(page) && PageHWPoison(compound_head(page)))
+ return true;
+
+ return false;
+}
+
extern unsigned long highest_memmap_pfn;
/*
--
1.8.3.1
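For readers who want to follow the logic outside the kernel tree, the check above can be modelled in user space. This is only a minimal sketch under stated assumptions: struct mock_page, mock_is_page_poisoned() and mock_dump_page_result() are hypothetical stand-ins invented for illustration, with plain fields in place of PageHWPoison(), PageHuge() and compound_head(). It is not kernel code and does not model reference counting or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct page. */
struct mock_page {
	bool hwpoison;           /* models PageHWPoison(page) */
	bool huge;               /* models PageHuge(page) */
	struct mock_page *head;  /* models compound_head(page); a head or
				  * order-0 page points to itself */
};

/* Mirrors the shape of is_page_poisoned() from the patch. */
static bool mock_is_page_poisoned(const struct mock_page *page)
{
	assert(page != NULL && page->head != NULL);

	if (page->hwpoison)
		return true;
	/* A tail page of a huge page is poisoned if its head page is. */
	if (page->huge && page->head->hwpoison)
		return true;

	return false;
}

/*
 * Mirrors the tail of get_dump_page(): ret is the number of pages the
 * lookup returned, and a poisoned page is reported as "no page".
 */
static struct mock_page *mock_dump_page_result(long ret, struct mock_page *page)
{
	if (ret == 1 && mock_is_page_poisoned(page))
		return NULL;

	return (ret == 1) ? page : NULL;
}
```

In this model a clean page is handed back unchanged, while a tail page whose head is marked poisoned yields NULL, so the caller treats it like a hole in the dump.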
* Re: [PATCH v4] mm/gup: check page posion status for coredump.
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-03-22 9:06 UTC
To: Aili Yao
Cc: akpm@linux-foundation.org, willy@infradead.org, david@redhat.com,
mike.kravetz@oracle.com, mm-commits@vger.kernel.org,
osalvador@suse.de
On Mon, Mar 22, 2021 at 11:52:33AM +0800, Aili Yao wrote:
> Hi Andrew:
>
> Thanks for merging the v3 patch into -mm, but there is still a modification suggested by
> Matthew Wilcox that needs to be finished. I am not sure what the right process is, so I am
> posting patch v4 here. If anything is wrong, please point it out.
>
> Thanks!
>
>
> When we do a coredump for a user process signal, the signal may be a SIGBUS
> with code BUS_MCEERR_AR or BUS_MCEERR_AO, which means it resulted from an
> ECC memory failure such as SRAR or SRAO. We expect the memory recovery
> work to have finished correctly, in which case get_dump_page() will not
> return the error page, because the process's pte has been set invalid by
> memory_failure().
>
> But memory_failure() may fail, and the process's related pte may not be
> correctly set invalid. With the current code, we will return the poisoned
> page and dump it, which leads to a system panic because the access happens
> in kernel code.
>
> So check the poison status in get_dump_page(), and if the page is poisoned,
> return NULL.
>
> There may be other scenarios where it is also better to check the poison
> status and not panic, so make a wrapper for this check. Thanks to David
> (<david@redhat.com>) for the suggestion.
>
> Link: https://lkml.kernel.org/r/20210319104437.6f30e80d@alex-virtual-machine
> Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Aili Yao <yaoaili@kingsoft.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Thank you.
This is a simple and clear fix, so I think it's worth ccing to -stable.
> ---
> mm/gup.c | 4 ++++
> mm/internal.h | 20 ++++++++++++++++++++
> 2 files changed, 24 insertions(+)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index e4c224c..dcabe96 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1536,6 +1536,10 @@ struct page *get_dump_page(unsigned long addr)
> FOLL_FORCE | FOLL_DUMP | FOLL_GET);
> if (locked)
> mmap_read_unlock(mm);
> +
> + if (ret == 1 && is_page_poisoned(page))
> + return NULL;
> +
> return (ret == 1) ? page : NULL;
> }
> #endif /* CONFIG_ELF_CORE */
> diff --git a/mm/internal.h b/mm/internal.h
> index 25d2b2439..dcd2051 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -97,6 +97,26 @@ static inline void set_page_refcounted(struct page *page)
> set_page_count(page, 1);
> }
>
> +/*
> + * When the kernel touches a user page, the page may already have been
> + * marked hwpoison while still being mapped in user space. If the kernel
> + * can guarantee data integrity and complete the operation without this
> + * page, it is better to check the poison status and avoid touching the
> + * page, so that we do not panic. A coredump for a fatal process signal is
> + * an example of this scenario. If the kernel cannot guarantee data
> + * integrity, it is better not to call this function and to let the
> + * kernel touch the poisoned page and panic.
> + */
> +static inline bool is_page_poisoned(struct page *page)
The word "poison" is overloaded even within the mm subsystem, so please use
"hwpoison" to be distinct. Also, please send the patch to linux-mm for review
instead of replying to this thread.
Thanks,
Naoya Horiguchi
> +{
> + if (PageHWPoison(page))
> + return true;
> + else if (PageHuge(page) && PageHWPoison(compound_head(page)))
> + return true;
> +
> + return false;
> +}
> +
> extern unsigned long highest_memmap_pfn;
>
> /*
> --
> 1.8.3.1
>