* [PATCH] hugetlb: fix hugetlb cgroup refcounting during vma split
@ 2021-08-30 21:50 Mike Kravetz
2021-08-31 14:01 ` Guillaume Morin
0 siblings, 1 reply; 2+ messages in thread
From: Mike Kravetz @ 2021-08-30 21:50 UTC (permalink / raw)
To: linux-mm, linux-kernel, cgroups
Cc: Mina Almasry, David Rientjes, Greg Thelen, Sandipan Das,
Shakeel Butt, Shuah Khan, Andrew Morton, Mike Kravetz,
Guillaume Morin, stable
Guillaume Morin reported hitting the following WARNING, followed
by a GPF or NULL pointer dereference either in cgroups_destroy or in
the kill_css path:
percpu ref (css_release) <= 0 (-1) after switching to atomic
WARNING: CPU: 23 PID: 130 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x127/0x130
CPU: 23 PID: 130 Comm: ksoftirqd/23 Kdump: loaded Tainted: G O 5.10.60 #1
RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x127/0x130
Call Trace:
rcu_core+0x30f/0x530
rcu_core_si+0xe/0x10
__do_softirq+0x103/0x2a2
? sort_range+0x30/0x30
run_ksoftirqd+0x2b/0x40
smpboot_thread_fn+0x11a/0x170
kthread+0x10a/0x140
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x22/0x30
Upon further examination, it was discovered that the css structure
was associated with hugetlb reservations.
For private hugetlb mappings the vma points to a reserve map that
contains a pointer to the css. At mmap time, reservations are set up
and a reference to the css is taken. This reference is dropped in the
vma close operation, hugetlb_vm_op_close. However, if a vma is split,
no additional reference to the css is taken, yet hugetlb_vm_op_close is
called once for each of the two resulting vmas, causing an underflow.
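
The imbalance described above can be modeled in userspace. The following
is only a sketch under stated assumptions: the model_* types and functions
are hypothetical stand-ins for the css, the reserve map, and the vma, and
plain integers stand in for the kernel's percpu and kref counters. It
counts only the references taken on behalf of the mapping, so a balanced
sequence ends at 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_css { int refcnt; };
struct model_resv_map { int refs; struct model_css *css; };
struct model_vma { struct model_resv_map *resv; bool owner; };

/* mmap time: the reservation is set up and one css reference is taken. */
static void model_mmap(struct model_vma *vma, struct model_resv_map *resv,
		       struct model_css *css)
{
	css->refcnt++;
	resv->css = css;
	resv->refs = 1;
	vma->resv = resv;
	vma->owner = true;
}

/* Open before the fix: only the reserve map reference is taken. */
static void model_open_unpatched(struct model_vma *vma)
{
	if (vma->resv && vma->owner)
		vma->resv->refs++;
}

/* Close: drops one css reference and one reserve map reference. */
static void model_close(struct model_vma *vma)
{
	if (!vma->resv || !vma->owner)
		return;
	assert(vma->resv->refs > 0);
	vma->resv->css->refcnt--;
	vma->resv->refs--;
}

/* Split one vma into two, close both, and return the css count. */
static int model_split_then_close(void)
{
	struct model_css css = { .refcnt = 0 };
	struct model_resv_map resv = { 0 };
	struct model_vma a = { 0 };
	struct model_vma b;

	model_mmap(&a, &resv, &css);	/* css.refcnt == 1 */
	b = a;				/* split: second vma copies the first */
	model_open_unpatched(&b);	/* css.refcnt still 1 */
	model_close(&a);		/* css.refcnt == 0 */
	model_close(&b);		/* css.refcnt == -1 */
	return css.refcnt;
}
```

In this model the reserve map count stays balanced (1, 2, 1, 0) while the
css count ends one below where it started, since it is dropped twice but
taken only once.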
Fix by taking another reference in hugetlb_vm_op_open. Note that the
reference is only taken for the owner of the reserve map. In the more
common fork case, the pointer to the reserve map is cleared for
non-owning vmas.
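
A userspace sketch of the balanced scheme follows, again under stated
assumptions: the sketch_* names are hypothetical, plain integers replace
the kernel counters, and the fork case is modeled simply as a child vma
whose reserve map pointer is NULL. It is an illustration of the counting
only, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct sketch_css { int refcnt; };
struct sketch_resv { int refs; struct sketch_css *css; };
struct sketch_vma { struct sketch_resv *resv; bool owner; };

/* Open with the fix: the owner takes a css reference and a map reference. */
static void sketch_open(struct sketch_vma *vma)
{
	if (vma->resv && vma->owner) {
		if (vma->resv->css)
			vma->resv->css->refcnt++;
		vma->resv->refs++;
	}
}

/* Close: drops one css reference and one map reference. */
static void sketch_close(struct sketch_vma *vma)
{
	if (!vma->resv || !vma->owner)
		return;
	assert(vma->resv->refs > 0);
	if (vma->resv->css)
		vma->resv->css->refcnt--;
	vma->resv->refs--;
}

/* Split: the new vma is a copy of the owner, then open runs on it. */
static int sketch_split_case(void)
{
	struct sketch_css css = { .refcnt = 1 };	/* taken at mmap time */
	struct sketch_resv resv = { .refs = 1, .css = &css };
	struct sketch_vma a = { .resv = &resv, .owner = true };
	struct sketch_vma b = a;

	sketch_open(&b);	/* css.refcnt == 2 */
	sketch_close(&a);	/* css.refcnt == 1 */
	sketch_close(&b);	/* css.refcnt == 0 */
	return css.refcnt;
}

/* Fork: the child's reserve map pointer is cleared, so it never counts. */
static int sketch_fork_case(void)
{
	struct sketch_css css = { .refcnt = 1 };	/* taken at mmap time */
	struct sketch_resv resv = { .refs = 1, .css = &css };
	struct sketch_vma parent = { .resv = &resv, .owner = true };
	struct sketch_vma child = { .resv = NULL, .owner = false };

	sketch_open(&child);	/* no-op */
	sketch_close(&child);	/* no-op */
	sketch_close(&parent);	/* css.refcnt == 0 */
	return css.refcnt;
}
```

Both cases end with the css count at 0 in this model: every close that
drops a reference is paired with either the mmap-time reference or one
taken in open.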
Fixes: e9fe92ae0cd2 ("hugetlb_cgroup: add reservation accounting for private mappings")
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Suggested-by: Guillaume Morin <guillaume@morinfr.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
---
include/linux/hugetlb_cgroup.h | 12 ++++++++++++
mm/hugetlb.c | 4 +++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index 0b8d1fdda3a1..c137396129db 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -121,6 +121,13 @@ static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg)
 	css_put(&h_cg->css);
 }
 
+static inline void resv_map_dup_hugetlb_cgroup_uncharge_info(
+						struct resv_map *resv_map)
+{
+	if (resv_map->css)
+		css_get(resv_map->css);
+}
+
 extern int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
 					struct hugetlb_cgroup **ptr);
 extern int hugetlb_cgroup_charge_cgroup_rsvd(int idx, unsigned long nr_pages,
@@ -199,6 +206,11 @@ static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg)
 {
 }
 
+static inline void resv_map_dup_hugetlb_cgroup_uncharge_info(
+						struct resv_map *resv_map)
+{
+}
+
 static inline int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
 					       struct hugetlb_cgroup **ptr)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8ea35ba6699f..6c583ef079e3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4033,8 +4033,10 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 	 * after this open call completes. It is therefore safe to take a
 	 * new reference here without additional locking.
 	 */
-	if (resv && is_vma_resv_set(vma, HPAGE_RESV_OWNER))
+	if (resv && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+		resv_map_dup_hugetlb_cgroup_uncharge_info(resv);
 		kref_get(&resv->refs);
+	}
 }
 
 static void hugetlb_vm_op_close(struct vm_area_struct *vma)
--
2.31.1
* Re: [PATCH] hugetlb: fix hugetlb cgroup refcounting during vma split
2021-08-30 21:50 [PATCH] hugetlb: fix hugetlb cgroup refcounting during vma split Mike Kravetz
@ 2021-08-31 14:01 ` Guillaume Morin
0 siblings, 0 replies; 2+ messages in thread
From: Guillaume Morin @ 2021-08-31 14:01 UTC (permalink / raw)
To: Mike Kravetz
Cc: linux-mm, linux-kernel, cgroups, Mina Almasry, David Rientjes,
Greg Thelen, Sandipan Das, Shakeel Butt, Shuah Khan,
Andrew Morton, Guillaume Morin, stable
On 30 Aug 14:50, Mike Kravetz wrote:
> Guillaume Morin reported hitting the following WARNING, followed
> by a GPF or NULL pointer dereference either in cgroups_destroy or in
> the kill_css path:
>
> percpu ref (css_release) <= 0 (-1) after switching to atomic
> WARNING: CPU: 23 PID: 130 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x127/0x130
> CPU: 23 PID: 130 Comm: ksoftirqd/23 Kdump: loaded Tainted: G O 5.10.60 #1
> RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x127/0x130
> Call Trace:
> rcu_core+0x30f/0x530
> rcu_core_si+0xe/0x10
> __do_softirq+0x103/0x2a2
> ? sort_range+0x30/0x30
> run_ksoftirqd+0x2b/0x40
> smpboot_thread_fn+0x11a/0x170
> kthread+0x10a/0x140
> ? kthread_create_worker_on_cpu+0x70/0x70
> ret_from_fork+0x22/0x30
>
> Upon further examination, it was discovered that the css structure
> was associated with hugetlb reservations.
>
> For private hugetlb mappings the vma points to a reserve map that
> contains a pointer to the css. At mmap time, reservations are set up
> and a reference to the css is taken. This reference is dropped in the
> vma close operation, hugetlb_vm_op_close. However, if a vma is split,
> no additional reference to the css is taken, yet hugetlb_vm_op_close is
> called once for each of the two resulting vmas, causing an underflow.
>
> Fix by taking another reference in hugetlb_vm_op_open. Note that the
> reference is only taken for the owner of the reserve map. In the more
> common fork case, the pointer to the reserve map is cleared for
> non-owning vmas.
>
> Fixes: e9fe92ae0cd2 ("hugetlb_cgroup: add reservation accounting for
> private mappings")
> Reported-by: Guillaume Morin <guillaume@morinfr.org>
> Suggested-by: Guillaume Morin <guillaume@morinfr.org>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: <stable@vger.kernel.org>
I verified that the patch does fix the underflow. I appreciate the help!
Feel free to add:
Tested-by: Guillaume Morin <guillaume@morinfr.org>
--
Guillaume Morin <guillaume@morinfr.org>