* [PATCH 0/3] mm/hugetlb: Fix missing hugetlb_lock for memcg resv uncharge
@ 2024-04-17 21:18 Peter Xu
2024-04-17 21:18 ` [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list Peter Xu
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Peter Xu @ 2024-04-17 21:18 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: peterx, David Hildenbrand, Mina Almasry, Andrew Morton,
Muchun Song, David Rientjes
Should fix the recent syzbot report for:
https://lore.kernel.org/all/000000000000ee06de0616177560@google.com/
Patch 1 is a small fixup where I notice mm-unstable crashes with the new
hugetlb memcg accounting when testing the changes.
Patch 2 should be the fix to the reported issue.
Patch 3 is an oneliner to add an assertion similar to what found the issue
in patch 2.
Only smoke tested over a bunch of hugetlb unit tests. Reviews welcomed.
Thanks,
Peter Xu (3):
fixup! mm: always initialise folio->_deferred_list
mm/hugetlb: Fix missing hugetlb_lock for resv uncharge
mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge
mm/hugetlb.c | 5 ++++-
mm/hugetlb_cgroup.c | 2 +-
mm/memcontrol.c | 1 +
3 files changed, 6 insertions(+), 2 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list 2024-04-17 21:18 [PATCH 0/3] mm/hugetlb: Fix missing hugetlb_lock for memcg resv uncharge Peter Xu @ 2024-04-17 21:18 ` Peter Xu 2024-04-17 23:46 ` Matthew Wilcox 2024-04-17 21:18 ` [PATCH 2/3] mm/hugetlb: Fix missing hugetlb_lock for resv uncharge Peter Xu 2024-04-17 21:18 ` [PATCH 3/3] mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge Peter Xu 2 siblings, 1 reply; 9+ messages in thread From: Peter Xu @ 2024-04-17 21:18 UTC (permalink / raw) To: linux-mm, linux-kernel Cc: peterx, David Hildenbrand, Mina Almasry, Andrew Morton, Muchun Song, David Rientjes Current mm-unstable will hit this when running test_hugetlb_memcg. This fixes the crash for me. Signed-off-by: Peter Xu <peterx@redhat.com> --- mm/memcontrol.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 1840ba4c355d..7703ced535a3 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -7529,6 +7529,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug) VM_BUG_ON_FOLIO(folio_test_lru(folio), folio); VM_BUG_ON_FOLIO(folio_order(folio) > 1 && + !folio_test_hugetlb(folio) && !list_empty(&folio->_deferred_list), folio); /* -- 2.44.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list 2024-04-17 21:18 ` [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list Peter Xu @ 2024-04-17 23:46 ` Matthew Wilcox 2024-04-18 1:39 ` Peter Xu 0 siblings, 1 reply; 9+ messages in thread From: Matthew Wilcox @ 2024-04-17 23:46 UTC (permalink / raw) To: Peter Xu Cc: linux-mm, linux-kernel, David Hildenbrand, Mina Almasry, Andrew Morton, Muchun Song, David Rientjes On Wed, Apr 17, 2024 at 05:18:34PM -0400, Peter Xu wrote: > Current mm-unstable will hit this when running test_hugetlb_memcg. This > fixes the crash for me. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > mm/memcontrol.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 1840ba4c355d..7703ced535a3 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -7529,6 +7529,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug) > > VM_BUG_ON_FOLIO(folio_test_lru(folio), folio); > VM_BUG_ON_FOLIO(folio_order(folio) > 1 && > + !folio_test_hugetlb(folio) && > !list_empty(&folio->_deferred_list), folio); Hum. I thought we didn't get here for hugetlb folios. What stacktrace did you get? I'm basing it on comments like this: /* hugetlb has its own memcg */ if (folio_test_hugetlb(folio)) { if (lruvec) { unlock_page_lruvec_irqrestore(lruvec, flags); lruvec = NULL; } free_huge_folio(folio); continue; } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list 2024-04-17 23:46 ` Matthew Wilcox @ 2024-04-18 1:39 ` Peter Xu 0 siblings, 0 replies; 9+ messages in thread From: Peter Xu @ 2024-04-18 1:39 UTC (permalink / raw) To: Matthew Wilcox Cc: linux-mm, linux-kernel, David Hildenbrand, Mina Almasry, Andrew Morton, Muchun Song, David Rientjes On Thu, Apr 18, 2024 at 12:46:39AM +0100, Matthew Wilcox wrote: > On Wed, Apr 17, 2024 at 05:18:34PM -0400, Peter Xu wrote: > > Current mm-unstable will hit this when running test_hugetlb_memcg. This > > fixes the crash for me. [1] > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > mm/memcontrol.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index 1840ba4c355d..7703ced535a3 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -7529,6 +7529,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug) > > > > VM_BUG_ON_FOLIO(folio_test_lru(folio), folio); > > VM_BUG_ON_FOLIO(folio_order(folio) > 1 && > > + !folio_test_hugetlb(folio) && > > !list_empty(&folio->_deferred_list), folio); > > Hum. I thought we didn't get here for hugetlb folios. What > stacktrace did you get? A normal hugetlb free path iirc. You can try the test case, I mentioned the reproducer [1] above. It crashes constantly. > > I'm basing it on comments like this: > > /* hugetlb has its own memcg */ > if (folio_test_hugetlb(folio)) { > if (lruvec) { > unlock_page_lruvec_irqrestore(lruvec, flags); > lruvec = NULL; > } > free_huge_folio(folio); > continue; > } Hugetlb does have its own memcg but I guess now it's even more than that, see the patch merged months ago: https://lore.kernel.org/all/20231006184629.155543-4-nphamcs@gmail.com/ Especially: @@ -1902,6 +1902,7 @@ void free_huge_folio(struct folio *folio) pages_per_huge_page(h), folio); hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h), pages_per_huge_page(h), folio); + mem_cgroup_uncharge(folio); if (restore_reserve) h->resv_huge_pages++; The comment above looks a bit confusing to me, as memcg is not the only thing special for hugetlb pages. Explicitly mentioning it there made me feel like hugetlb can be freed like a normal compound page if memcg is not a problem, but looks like not yet? -- Peter Xu ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/3] mm/hugetlb: Fix missing hugetlb_lock for resv uncharge 2024-04-17 21:18 [PATCH 0/3] mm/hugetlb: Fix missing hugetlb_lock for memcg resv uncharge Peter Xu 2024-04-17 21:18 ` [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list Peter Xu @ 2024-04-17 21:18 ` Peter Xu 2024-04-19 15:16 ` Mina Almasry 2024-04-17 21:18 ` [PATCH 3/3] mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge Peter Xu 2 siblings, 1 reply; 9+ messages in thread From: Peter Xu @ 2024-04-17 21:18 UTC (permalink / raw) To: linux-mm, linux-kernel Cc: peterx, David Hildenbrand, Mina Almasry, Andrew Morton, Muchun Song, David Rientjes, syzbot+4b8077a5fccc61c385a1, linux-stable There is a recent report on UFFDIO_COPY over hugetlb: https://lore.kernel.org/all/000000000000ee06de0616177560@google.com/ 350: lockdep_assert_held(&hugetlb_lock); Should be an issue in hugetlb but triggered in an userfault context, where it goes into the unlikely path where two threads modifying the resv map together. Mike has a fix in that path for resv uncharge but it looks like the locking criteria was overlooked: hugetlb_cgroup_uncharge_folio_rsvd() will update the cgroup pointer, so it requires to be called with the lock held. Looks like a stable material, so have it copied. Reported-by: syzbot+4b8077a5fccc61c385a1@syzkaller.appspotmail.com Cc: Mina Almasry <almasrymina@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: linux-stable <stable@vger.kernel.org> Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting") Signed-off-by: Peter Xu <peterx@redhat.com> --- mm/hugetlb.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 26ab9dfc7d63..3158a55ce567 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3247,9 +3247,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, rsv_adjust = hugepage_subpool_put_pages(spool, 1); hugetlb_acct_memory(h, -rsv_adjust); - if (deferred_reserve) + if (deferred_reserve) { + spin_lock_irq(&hugetlb_lock); hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h), pages_per_huge_page(h), folio); + spin_unlock_irq(&hugetlb_lock); + } } if (!memcg_charge_ret) -- 2.44.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] mm/hugetlb: Fix missing hugetlb_lock for resv uncharge 2024-04-17 21:18 ` [PATCH 2/3] mm/hugetlb: Fix missing hugetlb_lock for resv uncharge Peter Xu @ 2024-04-19 15:16 ` Mina Almasry 0 siblings, 0 replies; 9+ messages in thread From: Mina Almasry @ 2024-04-19 15:16 UTC (permalink / raw) To: Peter Xu Cc: linux-mm, linux-kernel, David Hildenbrand, Andrew Morton, Muchun Song, David Rientjes, syzbot+4b8077a5fccc61c385a1, linux-stable On Wed, Apr 17, 2024 at 2:18 PM Peter Xu <peterx@redhat.com> wrote: > > There is a recent report on UFFDIO_COPY over hugetlb: > > https://lore.kernel.org/all/000000000000ee06de0616177560@google.com/ > > 350: lockdep_assert_held(&hugetlb_lock); > > Should be an issue in hugetlb but triggered in an userfault context, where > it goes into the unlikely path where two threads modifying the resv map > together. Mike has a fix in that path for resv uncharge but it looks like > the locking criteria was overlooked: hugetlb_cgroup_uncharge_folio_rsvd() > will update the cgroup pointer, so it requires to be called with the lock > held. > > Looks like a stable material, so have it copied. > > Reported-by: syzbot+4b8077a5fccc61c385a1@syzkaller.appspotmail.com > Cc: Mina Almasry <almasrymina@google.com> > Cc: David Hildenbrand <david@redhat.com> > Cc: linux-stable <stable@vger.kernel.org> > Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting") > Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mina Almasry <almasrymina@google.com> > --- > mm/hugetlb.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 26ab9dfc7d63..3158a55ce567 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -3247,9 +3247,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, > > rsv_adjust = hugepage_subpool_put_pages(spool, 1); > hugetlb_acct_memory(h, -rsv_adjust); > - if (deferred_reserve) > + if (deferred_reserve) { > + spin_lock_irq(&hugetlb_lock); > hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h), > pages_per_huge_page(h), folio); > + spin_unlock_irq(&hugetlb_lock); > + } > } > > if (!memcg_charge_ret) > -- > 2.44.0 > -- Thanks, Mina ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 3/3] mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge 2024-04-17 21:18 [PATCH 0/3] mm/hugetlb: Fix missing hugetlb_lock for memcg resv uncharge Peter Xu 2024-04-17 21:18 ` [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list Peter Xu 2024-04-17 21:18 ` [PATCH 2/3] mm/hugetlb: Fix missing hugetlb_lock for resv uncharge Peter Xu @ 2024-04-17 21:18 ` Peter Xu 2024-04-19 15:03 ` Mina Almasry 2 siblings, 1 reply; 9+ messages in thread From: Peter Xu @ 2024-04-17 21:18 UTC (permalink / raw) To: linux-mm, linux-kernel Cc: peterx, David Hildenbrand, Mina Almasry, Andrew Morton, Muchun Song, David Rientjes This is similar to __hugetlb_cgroup_uncharge_folio() where it relies on holding hugetlb_lock. Add the similar assertion like the other one, since it looks like such things may help some day. Signed-off-by: Peter Xu <peterx@redhat.com> --- mm/hugetlb_cgroup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c index aa4486bd3904..e20339a346b9 100644 --- a/mm/hugetlb_cgroup.c +++ b/mm/hugetlb_cgroup.c @@ -308,7 +308,7 @@ static void __hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages, { if (hugetlb_cgroup_disabled() || !h_cg) return; - + lockdep_assert_held(&hugetlb_lock); __set_hugetlb_cgroup(folio, h_cg, rsvd); if (!rsvd) { unsigned long usage = -- 2.44.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge 2024-04-17 21:18 ` [PATCH 3/3] mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge Peter Xu @ 2024-04-19 15:03 ` Mina Almasry 2024-04-19 15:21 ` Peter Xu 0 siblings, 1 reply; 9+ messages in thread From: Mina Almasry @ 2024-04-19 15:03 UTC (permalink / raw) To: Peter Xu Cc: linux-mm, linux-kernel, David Hildenbrand, Andrew Morton, Muchun Song, David Rientjes On Wed, Apr 17, 2024 at 2:18 PM Peter Xu <peterx@redhat.com> wrote: > > This is similar to __hugetlb_cgroup_uncharge_folio() where it relies on > holding hugetlb_lock. Add the similar assertion like the other one, since > it looks like such things may help some day. > > Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mina Almasry <almasrymina@google.com> > --- > mm/hugetlb_cgroup.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c > index aa4486bd3904..e20339a346b9 100644 > --- a/mm/hugetlb_cgroup.c > +++ b/mm/hugetlb_cgroup.c > @@ -308,7 +308,7 @@ static void __hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages, > { > if (hugetlb_cgroup_disabled() || !h_cg) > return; > - > + lockdep_assert_held(&hugetlb_lock); Maybe also remove the comment on the top of the function: /* Should be called with hugetlb_lock held */ Now that the function asserts, the comment seems redundant, but up to you. > __set_hugetlb_cgroup(folio, h_cg, rsvd); > if (!rsvd) { > unsigned long usage = > -- > 2.44.0 > -- Thanks, Mina ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge 2024-04-19 15:03 ` Mina Almasry @ 2024-04-19 15:21 ` Peter Xu 0 siblings, 0 replies; 9+ messages in thread From: Peter Xu @ 2024-04-19 15:21 UTC (permalink / raw) To: Mina Almasry Cc: linux-mm, linux-kernel, David Hildenbrand, Andrew Morton, Muchun Song, David Rientjes On Fri, Apr 19, 2024 at 08:03:08AM -0700, Mina Almasry wrote: > On Wed, Apr 17, 2024 at 2:18 PM Peter Xu <peterx@redhat.com> wrote: > > > > This is similar to __hugetlb_cgroup_uncharge_folio() where it relies on > > holding hugetlb_lock. Add the similar assertion like the other one, since > > it looks like such things may help some day. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > Reviewed-by: Mina Almasry <almasrymina@google.com> Thanks. > > > --- > > mm/hugetlb_cgroup.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c > > index aa4486bd3904..e20339a346b9 100644 > > --- a/mm/hugetlb_cgroup.c > > +++ b/mm/hugetlb_cgroup.c > > @@ -308,7 +308,7 @@ static void __hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages, > > { > > if (hugetlb_cgroup_disabled() || !h_cg) > > return; > > - > > + lockdep_assert_held(&hugetlb_lock); > > Maybe also remove the comment on the top of the function: > > /* Should be called with hugetlb_lock held */ > > Now that the function asserts, the comment seems redundant, but up to you. IMHO there's no harm to be verbose in this case, so I'll just keep it simple to be as-is. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-04-19 15:22 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-04-17 21:18 [PATCH 0/3] mm/hugetlb: Fix missing hugetlb_lock for memcg resv uncharge Peter Xu 2024-04-17 21:18 ` [PATCH 1/3] fixup! mm: always initialise folio->_deferred_list Peter Xu 2024-04-17 23:46 ` Matthew Wilcox 2024-04-18 1:39 ` Peter Xu 2024-04-17 21:18 ` [PATCH 2/3] mm/hugetlb: Fix missing hugetlb_lock for resv uncharge Peter Xu 2024-04-19 15:16 ` Mina Almasry 2024-04-17 21:18 ` [PATCH 3/3] mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge Peter Xu 2024-04-19 15:03 ` Mina Almasry 2024-04-19 15:21 ` Peter Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).