* [PATCH 0/1] hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size
@ 2017-10-16 22:39 Andrea Arcangeli
2017-10-16 22:39 ` [PATCH 1/1] userfaultfd: " Andrea Arcangeli
0 siblings, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2017-10-16 22:39 UTC (permalink / raw)
To: Andrew Morton, linux-mm
Cc: Mike Rapoport, Dr. David Alan Gilbert, Mike Kravetz
Hello,
The erratic mapping (as in page_mapped()) of hugetlbfs pages beyond
the end of the i_size, was found while testing some userfaultfd
backport. It can trigger a bugcheck as side effect.
Andrea Arcangeli (1):
userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of
i_size
mm/hugetlb.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/1] userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size
2017-10-16 22:39 [PATCH 0/1] hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size Andrea Arcangeli
@ 2017-10-16 22:39 ` Andrea Arcangeli
2017-10-16 23:14 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2017-10-16 22:39 UTC (permalink / raw)
To: Andrew Morton, linux-mm
Cc: Mike Rapoport, Dr. David Alan Gilbert, Mike Kravetz
kernel BUG at fs/hugetlbfs/inode.c:484!
RIP: 0010:[<ffffffff815f8520>] [<ffffffff815f8520>] remove_inode_hugepages+0x3d0/0x410
Call Trace:
[<ffffffff815f95b9>] hugetlbfs_setattr+0xd9/0x130
[<ffffffff81526312>] notify_change+0x292/0x410
[<ffffffff816cc6b6>] ? security_inode_need_killpriv+0x16/0x20
[<ffffffff81503c65>] do_truncate+0x65/0xa0
[<ffffffff81504035>] ? do_sys_ftruncate.constprop.3+0xe5/0x180
[<ffffffff8150406a>] do_sys_ftruncate.constprop.3+0x11a/0x180
[<ffffffff8150410e>] SyS_ftruncate+0xe/0x10
[<ffffffff81999f27>] tracesys+0xd9/0xde
This oops was caused by the lack of i_size check in
hugetlb_mcopy_atomic_pte. mmap() can still succeed beyond the end of
the i_size after vmtruncate zapped vmas in those ranges, but the
faults must not succeed, and that includes UFFDIO_COPY.
We could differentiate the retval to userland to represent a SIGBUS
like a page fault would do (vs SIGSEGV), but it doesn't seem very
useful and we'd need to pick a random retval as there's no meaningful
syscall retval that would differentiate from SIGSEGV and SIGBUS,
there's just -EFAULT.
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
mm/hugetlb.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 357161bd1519..0cbb7c37dc33 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4012,6 +4012,9 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
unsigned long src_addr,
struct page **pagep)
{
+ struct address_space *mapping;
+ pgoff_t idx;
+ unsigned long size;
int vm_shared = dst_vma->vm_flags & VM_SHARED;
struct hstate *h = hstate_vma(dst_vma);
pte_t _dst_pte;
@@ -4049,13 +4052,24 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
__SetPageUptodate(page);
__set_page_huge_active(page);
+ mapping = dst_vma->vm_file->f_mapping;
+ idx = vma_hugecache_offset(h, dst_vma, dst_addr);
+
/*
* If shared, add to page cache
*/
if (vm_shared) {
- struct address_space *mapping = dst_vma->vm_file->f_mapping;
- pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
+ size = i_size_read(mapping->host) >> huge_page_shift(h);
+ ret = -EFAULT;
+ if (idx >= size)
+ goto out_release_nounlock;
+ /*
+ * Serialization between remove_inode_hugepages() and
+ * huge_add_to_page_cache() below happens through the
+ * hugetlb_fault_mutex_table that here must be hold by
+ * the caller.
+ */
ret = huge_add_to_page_cache(page, mapping, idx);
if (ret)
goto out_release_nounlock;
@@ -4064,6 +4078,20 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
ptl = huge_pte_lockptr(h, dst_mm, dst_pte);
spin_lock(ptl);
+ /*
+ * Recheck the i_size after holding PT lock to make sure not
+ * to leave any page mapped (as page_mapped()) beyond the end
+ * of the i_size (remove_inode_hugepages() is strict about
+ * enforcing that). If we bail out here, we'll also leave a
+ * page in the radix tree in the vm_shared case beyond the end
+ * of the i_size, but remove_inode_hugepages() will take care
+ * of it as soon as we drop the hugetlb_fault_mutex_table.
+ */
+ size = i_size_read(mapping->host) >> huge_page_shift(h);
+ ret = -EFAULT;
+ if (idx >= size)
+ goto out_release_unlock;
+
ret = -EEXIST;
if (!huge_pte_none(huge_ptep_get(dst_pte)))
goto out_release_unlock;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size
2017-10-16 22:39 ` [PATCH 1/1] userfaultfd: " Andrea Arcangeli
@ 2017-10-16 23:14 ` Andrew Morton
2017-10-18 19:10 ` Andrea Arcangeli
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2017-10-16 23:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: linux-mm, Mike Rapoport, Dr. David Alan Gilbert, Mike Kravetz
On Tue, 17 Oct 2017 00:39:14 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote:
> kernel BUG at fs/hugetlbfs/inode.c:484!
> RIP: 0010:[<ffffffff815f8520>] [<ffffffff815f8520>] remove_inode_hugepages+0x3d0/0x410
> Call Trace:
> [<ffffffff815f95b9>] hugetlbfs_setattr+0xd9/0x130
> [<ffffffff81526312>] notify_change+0x292/0x410
> [<ffffffff816cc6b6>] ? security_inode_need_killpriv+0x16/0x20
> [<ffffffff81503c65>] do_truncate+0x65/0xa0
> [<ffffffff81504035>] ? do_sys_ftruncate.constprop.3+0xe5/0x180
> [<ffffffff8150406a>] do_sys_ftruncate.constprop.3+0x11a/0x180
> [<ffffffff8150410e>] SyS_ftruncate+0xe/0x10
> [<ffffffff81999f27>] tracesys+0xd9/0xde
>
> This oops was caused by the lack of i_size check in
> hugetlb_mcopy_atomic_pte. mmap() can still succeed beyond the end of
> the i_size after vmtruncate zapped vmas in those ranges, but the
> faults must not succeed, and that includes UFFDIO_COPY.
>
> We could differentiate the retval to userland to represent a SIGBUS
> like a page fault would do (vs SIGSEGV), but it doesn't seem very
> useful and we'd need to pick a random retval as there's no meaningful
> syscall retval that would differentiate from SIGSEGV and SIGBUS,
> there's just -EFAULT.
>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
No cc:stable? The patch applies to 4.13 textually...
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size
2017-10-16 23:14 ` Andrew Morton
@ 2017-10-18 19:10 ` Andrea Arcangeli
0 siblings, 0 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2017-10-18 19:10 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Mike Rapoport, Dr. David Alan Gilbert, Mike Kravetz
On Mon, Oct 16, 2017 at 04:14:43PM -0700, Andrew Morton wrote:
> On Tue, 17 Oct 2017 00:39:14 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote:
>
> > kernel BUG at fs/hugetlbfs/inode.c:484!
> > RIP: 0010:[<ffffffff815f8520>] [<ffffffff815f8520>] remove_inode_hugepages+0x3d0/0x410
> > Call Trace:
> > [<ffffffff815f95b9>] hugetlbfs_setattr+0xd9/0x130
> > [<ffffffff81526312>] notify_change+0x292/0x410
> > [<ffffffff816cc6b6>] ? security_inode_need_killpriv+0x16/0x20
> > [<ffffffff81503c65>] do_truncate+0x65/0xa0
> > [<ffffffff81504035>] ? do_sys_ftruncate.constprop.3+0xe5/0x180
> > [<ffffffff8150406a>] do_sys_ftruncate.constprop.3+0x11a/0x180
> > [<ffffffff8150410e>] SyS_ftruncate+0xe/0x10
> > [<ffffffff81999f27>] tracesys+0xd9/0xde
> >
> > This oops was caused by the lack of i_size check in
> > hugetlb_mcopy_atomic_pte. mmap() can still succeed beyond the end of
> > the i_size after vmtruncate zapped vmas in those ranges, but the
> > faults must not succeed, and that includes UFFDIO_COPY.
> >
> > We could differentiate the retval to userland to represent a SIGBUS
> > like a page fault would do (vs SIGSEGV), but it doesn't seem very
> > useful and we'd need to pick a random retval as there's no meaningful
> > syscall retval that would differentiate from SIGSEGV and SIGBUS,
> > there's just -EFAULT.
> >
> > Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
>
> No cc:stable? The patch applies to 4.13 textually...
Yes, I should have CC'ed stable.
Last time I CC'ed stable I didn't submit it in the right way, I should
have added cc:stable to the commit body, this time. I can resend with
cc:stable if you prefer.
Thanks!
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-10-18 19:10 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-16 22:39 [PATCH 0/1] hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size Andrea Arcangeli
2017-10-16 22:39 ` [PATCH 1/1] userfaultfd: " Andrea Arcangeli
2017-10-16 23:14 ` Andrew Morton
2017-10-18 19:10 ` Andrea Arcangeli
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).