From: Mike Kravetz <mike.kravetz@oracle.com>
To: Edward Adam Davis <eadavis@qq.com>
Cc: riel@surriel.com, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
llvm@lists.linux.dev, muchun.song@linux.dev, nathan@kernel.org,
ndesaulniers@google.com,
syzbot+6ada951e7c0f7bc8a71e@syzkaller.appspotmail.com,
syzkaller-bugs@googlegroups.com, trix@redhat.com
Subject: Re: [PATCH] mm/hugetlb: fix null ptr defer in hugetlb_vma_lock_write
Date: Thu, 2 Nov 2023 19:24:26 -0700 [thread overview]
Message-ID: <20231103022426.GA3531@monkey> (raw)
In-Reply-To: <tencent_164A247533766D667C3D873798968236C409@qq.com>
On 11/02/23 20:58, Edward Adam Davis wrote:
> When obtaining resv_map from vma, it is necessary to simultaneously determine
> the flag HPAGE_RESV_OWNER of vm_private_data.
> Only when they are met simultaneously, resv_map is valid.
Thanks for looking into this!
The check for HPAGE_RESV_OWNER does 'work'. However, I believe root
cause is this block of code in __unmap_hugepage_range().
/*
* If a reference page is supplied, it is because a specific
* page is being unmapped, not a range. Ensure the page we
* are about to unmap is the actual page of interest.
*/
if (ref_page) {
if (page != ref_page) {
spin_unlock(ptl);
continue;
}
/*
* Mark the VMA as having unmapped its page so that
* future faults in this VMA will fail rather than
* looking like data was lost
*/
set_vma_resv_flags(vma, HPAGE_RESV_UNMAPPED);
}
In the specific case causing the null-ptr-deref, the resv_map pointer
(vm_private_data) is NULL. So, set_vma_resv_flags() just sets the lower bit.
Because of this, __vma_private_lock returns true.
As mentioned, the check for HPAGE_RESV_OWNER in this patch 'works' because
only the HPAGE_RESV_UNMAPPED bit is set in vm_private_data.
I was thinking a more explicit check for this 'NULL pointer' with lower
bits set could be made in __vma_private_lock. Below is something I put
together. I just open coded the check for 'NULL pointer' instead of
moving a bunch of code to the header file. In addition, I changed the
#defines from HPAGE_* to HUGETLB_* to avoid any confusion as the header
file defines are in the global name space. I thought about also adding
an explicit check for the HPAGE_RESV_OWNER as done in this patch. This
would catch the case where the pointer is not NULL. I do not believe
that is possible in the code today, but might make the check more future
proof.
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 47d25a5e1933..7b472432708e 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1260,6 +1260,15 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr);
#define flush_hugetlb_tlb_range(vma, addr, end) flush_tlb_range(vma, addr, end)
#endif
+/*
+ * Flags for MAP_PRIVATE reservations. These are stored in the bottom
+ * bits of the reservation map pointer, which are always clear due to
+ * alignment.
+ */
+#define HUGETLB_RESV_OWNER (1UL << 0)
+#define HUGETLB_RESV_UNMAPPED (1UL << 1)
+#define HUGETLB_RESV_MASK (HUGETLB_RESV_OWNER | HUGETLB_RESV_UNMAPPED)
+
static inline bool __vma_shareable_lock(struct vm_area_struct *vma)
{
return (vma->vm_flags & VM_MAYSHARE) && vma->vm_private_data;
@@ -1267,7 +1276,9 @@ static inline bool __vma_shareable_lock(struct vm_area_struct *vma)
static inline bool __vma_private_lock(struct vm_area_struct *vma)
{
- return (!(vma->vm_flags & VM_MAYSHARE)) && vma->vm_private_data;
+ /* Careful - flags may be set in lower bits of pointer */
+ return (!(vma->vm_flags & VM_MAYSHARE)) &&
+ (unsigned long)vma->vm_private_data & ~HUGETLB_RESV_MASK;
}
/*
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1301ba7b2c9a..d2215f7647b1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1028,15 +1028,6 @@ __weak unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
return vma_kernel_pagesize(vma);
}
-/*
- * Flags for MAP_PRIVATE reservations. These are stored in the bottom
- * bits of the reservation map pointer, which are always clear due to
- * alignment.
- */
-#define HPAGE_RESV_OWNER (1UL << 0)
-#define HPAGE_RESV_UNMAPPED (1UL << 1)
-#define HPAGE_RESV_MASK (HPAGE_RESV_OWNER | HPAGE_RESV_UNMAPPED)
-
/*
* These helpers are used to track how many pages are reserved for
* faults in a MAP_PRIVATE mapping. Only the process that called mmap()
@@ -1162,7 +1153,7 @@ static struct resv_map *vma_resv_map(struct vm_area_struct *vma)
} else {
return (struct resv_map *)(get_vma_private_data(vma) &
- ~HPAGE_RESV_MASK);
+ ~HUGETLB_RESV_MASK);
}
}
@@ -1236,7 +1227,7 @@ void clear_vma_resv_huge_pages(struct vm_area_struct *vma)
*/
struct resv_map *reservations = vma_resv_map(vma);
- if (reservations && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+ if (reservations && is_vma_resv_set(vma, HUGETLB_RESV_OWNER)) {
resv_map_put_hugetlb_cgroup_uncharge_info(reservations);
kref_put(&reservations->refs, resv_map_release);
}
@@ -1282,7 +1273,7 @@ static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
* Only the process that called mmap() has reserves for
* private mappings.
*/
- if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+ if (is_vma_resv_set(vma, HUGETLB_RESV_OWNER)) {
/*
* Like the shared case above, a hole punch or truncate
* could have been performed on the private mapping.
@@ -2763,7 +2754,7 @@ static long __vma_reservation_common(struct hstate *h,
if (vma->vm_flags & VM_MAYSHARE || mode == VMA_DEL_RESV)
return ret;
/*
- * We know private mapping must have HPAGE_RESV_OWNER set.
+ * We know private mapping must have HUGETLB_RESV_OWNER set.
*
* In most cases, reserves always exist for private mappings.
* However, a file associated with mapping could have been
@@ -4833,7 +4824,7 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
struct resv_map *resv = vma_resv_map(vma);
/*
- * HPAGE_RESV_OWNER indicates a private mapping.
+ * HUGETLB_RESV_OWNER indicates a private mapping.
* This new VMA should share its siblings reservation map if present.
* The VMA will only ever have a valid reservation map pointer where
* it is being copied for another still existing VMA. As that VMA
@@ -4841,7 +4832,7 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
* after this open call completes. It is therefore safe to take a
* new reference here without additional locking.
*/
- if (resv && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+ if (resv && is_vma_resv_set(vma, HUGETLB_RESV_OWNER)) {
resv_map_dup_hugetlb_cgroup_uncharge_info(resv);
kref_get(&resv->refs);
}
@@ -4877,7 +4868,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
hugetlb_vma_lock_free(vma);
resv = vma_resv_map(vma);
- if (!resv || !is_vma_resv_set(vma, HPAGE_RESV_OWNER))
+ if (!resv || !is_vma_resv_set(vma, HUGETLB_RESV_OWNER))
return;
start = vma_hugecache_offset(h, vma, vma->vm_start);
@@ -5394,7 +5385,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
* future faults in this VMA will fail rather than
* looking like data was lost
*/
- set_vma_resv_flags(vma, HPAGE_RESV_UNMAPPED);
+ set_vma_resv_flags(vma, HUGETLB_RESV_UNMAPPED);
}
pte = huge_ptep_get_and_clear(mm, address, ptep);
@@ -5544,7 +5535,7 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
* could insert a zeroed page instead of the data existing
* from the time of fork. This would look like data corruption
*/
- if (!is_vma_resv_set(iter_vma, HPAGE_RESV_OWNER))
+ if (!is_vma_resv_set(iter_vma, HUGETLB_RESV_OWNER))
unmap_hugepage_range(iter_vma, address,
address + huge_page_size(h), page, 0);
}
@@ -5625,7 +5616,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
* at the time of fork() could consume its reserves on COW instead
* of the full address range.
*/
- if (is_vma_resv_set(vma, HPAGE_RESV_OWNER) &&
+ if (is_vma_resv_set(vma, HUGETLB_RESV_OWNER) &&
old_folio != pagecache_folio)
outside_reserve = 1;
@@ -5865,7 +5856,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
* COW/unsharing. Warn that such a situation has occurred as it may not
* be obvious.
*/
- if (is_vma_resv_set(vma, HPAGE_RESV_UNMAPPED)) {
+ if (is_vma_resv_set(vma, HUGETLB_RESV_UNMAPPED)) {
pr_warn_ratelimited("PID %d killed due to inadequate hugepage pool\n",
current->pid);
goto out;
@@ -6756,7 +6747,7 @@ bool hugetlb_reserve_pages(struct inode *inode,
chg = to - from;
set_vma_resv_map(vma, resv_map);
- set_vma_resv_flags(vma, HPAGE_RESV_OWNER);
+ set_vma_resv_flags(vma, HUGETLB_RESV_OWNER);
}
if (chg < 0)
@@ -6853,7 +6844,7 @@ bool hugetlb_reserve_pages(struct inode *inode,
*/
if (chg >= 0 && add < 0)
region_abort(resv_map, from, to, regions_needed);
- if (vma && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+ if (vma && is_vma_resv_set(vma, HUGETLB_RESV_OWNER)) {
kref_put(&resv_map->refs, resv_map_release);
set_vma_resv_map(vma, NULL);
}
next prev parent reply other threads:[~2023-11-03 2:24 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-29 9:27 [syzbot] [mm?] general protection fault in hugetlb_vma_lock_write syzbot
2023-10-31 18:38 ` Rik van Riel
2023-11-02 12:15 ` Aleksandr Nogikh
2023-11-01 6:36 ` [PATCH] mm/hugetlb: fix null ptr defer " Edward Adam Davis
2023-11-01 14:27 ` Rik van Riel
2023-11-02 12:58 ` Edward Adam Davis
2023-11-02 23:29 ` kernel test robot
2023-11-02 23:29 ` kernel test robot
2023-11-03 0:56 ` Rik van Riel
2023-11-03 1:26 ` kernel test robot
2023-11-03 2:24 ` Mike Kravetz [this message]
2023-11-03 2:28 ` Mike Kravetz
2023-11-03 2:37 ` Mike Kravetz
2023-11-03 3:15 ` Rik van Riel
2023-11-03 4:31 ` Mike Kravetz
2023-11-08 3:22 ` Mike Kravetz
2023-11-02 13:46 ` Yin, Fengwei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231103022426.GA3531@monkey \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=eadavis@qq.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=llvm@lists.linux.dev \
--cc=muchun.song@linux.dev \
--cc=nathan@kernel.org \
--cc=ndesaulniers@google.com \
--cc=riel@surriel.com \
--cc=syzbot+6ada951e7c0f7bc8a71e@syzkaller.appspotmail.com \
--cc=syzkaller-bugs@googlegroups.com \
--cc=trix@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.