* Re: [PATCH 1/2] mm: Abstract THP allocation
@ 2024-09-01 5:02 kernel test robot
2024-09-02 9:14 ` Dev Jain
0 siblings, 1 reply; 3+ messages in thread
From: kernel test robot @ 2024-09-01 5:02 UTC (permalink / raw)
To: oe-kbuild; +Cc: lkp, Dan Carpenter
BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20240830084117.4079805-2-dev.jain@arm.com>
References: <20240830084117.4079805-2-dev.jain@arm.com>
TO: Dev Jain <dev.jain@arm.com>
TO: akpm@linux-foundation.org
TO: david@redhat.com
TO: willy@infradead.org
TO: kirill.shutemov@linux.intel.com
CC: ryan.roberts@arm.com
CC: anshuman.khandual@arm.com
CC: catalin.marinas@arm.com
CC: cl@gentwo.org
CC: vbabka@suse.cz
CC: mhocko@suse.com
CC: apopple@nvidia.com
CC: dave.hansen@linux.intel.com
CC: will@kernel.org
CC: baohua@kernel.org
CC: jack@suse.cz
CC: mark.rutland@arm.com
CC: hughd@google.com
CC: aneesh.kumar@kernel.org
CC: yang@os.amperecomputing.com
CC: peterx@redhat.com
CC: ioworker0@gmail.com
CC: jglisse@google.com
CC: linux-arm-kernel@lists.infradead.org
CC: linux-kernel@vger.kernel.org
CC: linux-mm@kvack.org
CC: Dev Jain <dev.jain@arm.com>
Hi Dev,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.11-rc5 next-20240830]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Dev-Jain/mm-Abstract-THP-allocation/20240830-164300
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20240830084117.4079805-2-dev.jain%40arm.com
patch subject: [PATCH 1/2] mm: Abstract THP allocation
:::::: branch date: 2 days ago
:::::: commit date: 2 days ago
config: x86_64-randconfig-161-20240901 (https://download.01.org/0day-ci/archive/20240901/202409011215.bhqLigEB-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>
| Closes: https://lore.kernel.org/r/202409011215.bhqLigEB-lkp@intel.com/
smatch warnings:
mm/huge_memory.c:1236 __do_huge_pmd_anonymous_page() warn: inconsistent returns 'vmf->ptl'.
vim +1236 mm/huge_memory.c
5ee350d3e6d11e Dev Jain 2024-08-30 1186
5ee350d3e6d11e Dev Jain 2024-08-30 1187 static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
5ee350d3e6d11e Dev Jain 2024-08-30 1188 {
5ee350d3e6d11e Dev Jain 2024-08-30 1189 struct vm_area_struct *vma = vmf->vma;
5ee350d3e6d11e Dev Jain 2024-08-30 1190 struct folio *folio = NULL;
5ee350d3e6d11e Dev Jain 2024-08-30 1191 pgtable_t pgtable;
5ee350d3e6d11e Dev Jain 2024-08-30 1192 unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
5ee350d3e6d11e Dev Jain 2024-08-30 1193 vm_fault_t ret = 0;
5ee350d3e6d11e Dev Jain 2024-08-30 1194 gfp_t gfp = vma_thp_gfp_mask(vma);
5ee350d3e6d11e Dev Jain 2024-08-30 1195
5ee350d3e6d11e Dev Jain 2024-08-30 1196 pgtable = pte_alloc_one(vma->vm_mm);
5ee350d3e6d11e Dev Jain 2024-08-30 1197 if (unlikely(!pgtable)) {
5ee350d3e6d11e Dev Jain 2024-08-30 1198 ret = VM_FAULT_OOM;
5ee350d3e6d11e Dev Jain 2024-08-30 1199 goto release;
5ee350d3e6d11e Dev Jain 2024-08-30 1200 }
5ee350d3e6d11e Dev Jain 2024-08-30 1201
5ee350d3e6d11e Dev Jain 2024-08-30 1202 ret = thp_fault_alloc(gfp, HPAGE_PMD_ORDER, vma, haddr, &folio,
5ee350d3e6d11e Dev Jain 2024-08-30 1203 vmf->address);
5ee350d3e6d11e Dev Jain 2024-08-30 1204 if (ret)
5ee350d3e6d11e Dev Jain 2024-08-30 1205 goto release;
71e3aac0724ffe Andrea Arcangeli 2011-01-13 1206
82b0f8c39a3869 Jan Kara 2016-12-14 1207 vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
5ee350d3e6d11e Dev Jain 2024-08-30 1208
82b0f8c39a3869 Jan Kara 2016-12-14 1209 if (unlikely(!pmd_none(*vmf->pmd))) {
6b31d5955cb29a Michal Hocko 2017-08-18 1210 goto unlock_release;
71e3aac0724ffe Andrea Arcangeli 2011-01-13 1211 } else {
6b31d5955cb29a Michal Hocko 2017-08-18 1212 ret = check_stable_address_space(vma->vm_mm);
6b31d5955cb29a Michal Hocko 2017-08-18 1213 if (ret)
6b31d5955cb29a Michal Hocko 2017-08-18 1214 goto unlock_release;
6b31d5955cb29a Michal Hocko 2017-08-18 1215
6b251fc96cf2cd Andrea Arcangeli 2015-09-04 1216 /* Deliver the page fault to userland */
6b251fc96cf2cd Andrea Arcangeli 2015-09-04 1217 if (userfaultfd_missing(vma)) {
82b0f8c39a3869 Jan Kara 2016-12-14 1218 spin_unlock(vmf->ptl);
cfe3236d32d07b Kefeng Wang 2023-03-02 1219 folio_put(folio);
bae473a423f65e Kirill A. Shutemov 2016-07-26 1220 pte_free(vma->vm_mm, pgtable);
8fd5eda4c7268b Miaohe Lin 2021-05-04 1221 ret = handle_userfault(vmf, VM_UFFD_MISSING);
8fd5eda4c7268b Miaohe Lin 2021-05-04 1222 VM_BUG_ON(ret & VM_FAULT_FALLBACK);
8fd5eda4c7268b Miaohe Lin 2021-05-04 1223 return ret;
6b251fc96cf2cd Andrea Arcangeli 2015-09-04 1224 }
5ee350d3e6d11e Dev Jain 2024-08-30 1225 map_pmd_thp(folio, vmf, vma, haddr, pgtable);
71e3aac0724ffe Andrea Arcangeli 2011-01-13 1226 }
71e3aac0724ffe Andrea Arcangeli 2011-01-13 1227
aa2e878efa7949 David Rientjes 2012-05-29 1228 return 0;
6b31d5955cb29a Michal Hocko 2017-08-18 1229 unlock_release:
6b31d5955cb29a Michal Hocko 2017-08-18 1230 spin_unlock(vmf->ptl);
6b31d5955cb29a Michal Hocko 2017-08-18 1231 release:
6b31d5955cb29a Michal Hocko 2017-08-18 1232 if (pgtable)
6b31d5955cb29a Michal Hocko 2017-08-18 1233 pte_free(vma->vm_mm, pgtable);
5ee350d3e6d11e Dev Jain 2024-08-30 1234 if (folio)
cfe3236d32d07b Kefeng Wang 2023-03-02 1235 folio_put(folio);
6b31d5955cb29a Michal Hocko 2017-08-18 @1236 return ret;
6b31d5955cb29a Michal Hocko 2017-08-18 1237
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH 1/2] mm: Abstract THP allocation 2024-09-01 5:02 [PATCH 1/2] mm: Abstract THP allocation kernel test robot @ 2024-09-02 9:14 ` Dev Jain 0 siblings, 0 replies; 3+ messages in thread From: Dev Jain @ 2024-09-02 9:14 UTC (permalink / raw) To: lkp; +Cc: error27, oe-kbuild, Dev Jain Please check with the following patch. In do_huge_zero_wp_pmd_locked(), unlocking should not happen, that function should execute only in locked state. So pull out the unlocking out of map_pmd_thp(). Signed-off-by: Dev Jain <dev.jain@arm.com> --- mm/huge_memory.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index e5b568e2bb34..58125fbcc532 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -987,7 +987,6 @@ static void __thp_fault_success_stats(struct vm_area_struct *vma, int order) static void map_pmd_thp(struct folio *folio, struct vm_fault *vmf, struct vm_area_struct *vma, unsigned long haddr, pgtable_t pgtable) - __releases(vmf->ptl) { pmd_t entry; @@ -1000,8 +999,6 @@ static void map_pmd_thp(struct folio *folio, struct vm_fault *vmf, update_mmu_cache_pmd(vma, vmf->address, vmf->pmd); add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR); mm_inc_nr_ptes(vma->vm_mm); - spin_unlock(vmf->ptl); - __thp_fault_success_stats(vma, HPAGE_PMD_ORDER); } static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf) @@ -1043,6 +1040,8 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf) return ret; } map_pmd_thp(folio, vmf, vma, haddr, pgtable); + spin_unlock(vmf->ptl); + __thp_fault_success_stats(vma, HPAGE_PMD_ORDER); } return 0; -- 2.30.2 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH 0/2] Do not shatter hugezeropage on wp-fault @ 2024-08-30 8:41 Dev Jain 2024-08-30 8:41 ` [PATCH 1/2] mm: Abstract THP allocation Dev Jain 0 siblings, 1 reply; 3+ messages in thread From: Dev Jain @ 2024-08-30 8:41 UTC (permalink / raw) To: akpm, david, willy, kirill.shutemov Cc: ryan.roberts, anshuman.khandual, catalin.marinas, cl, vbabka, mhocko, apopple, dave.hansen, will, baohua, jack, mark.rutland, hughd, aneesh.kumar, yang, peterx, ioworker0, jglisse, linux-arm-kernel, linux-kernel, linux-mm, Dev Jain It was observed at [1] and [2] that the current kernel behaviour of shattering a hugezeropage is inconsistent and suboptimal. For a VMA with a THP allowable order, when we write-fault on it, the kernel installs a PMD-mapped THP. On the other hand, if we first get a read fault, we get a PMD pointing to the hugezeropage; subsequent write will trigger a write-protection fault, shattering the hugezeropage into one writable page, and all the other PTEs write-protected. The conclusion being, as compared to the case of a single write-fault, applications have to suffer 512 extra page faults if they were to use the VMA as such, plus we get the overhead of khugepaged trying to replace that area with a THP anyway. Instead, replace the hugezeropage with a THP on wp-fault. [1]: https://lore.kernel.org/all/3743d7e1-0b79-4eaf-82d5-d1ca29fe347d@arm.com/ [2]: https://lore.kernel.org/all/1cfae0c0-96a2-4308-9c62-f7a640520242@arm.com/ Dev Jain (2): mm: Abstract THP allocation mm: Allocate THP on hugezeropage wp-fault include/linux/huge_mm.h | 7 ++ mm/huge_memory.c | 171 +++++++++++++++++++++++++++++----------- mm/memory.c | 5 +- 3 files changed, 136 insertions(+), 47 deletions(-) -- 2.30.2 ^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 1/2] mm: Abstract THP allocation 2024-08-30 8:41 [PATCH 0/2] Do not shatter hugezeropage on wp-fault Dev Jain @ 2024-08-30 8:41 ` Dev Jain 0 siblings, 0 replies; 3+ messages in thread From: Dev Jain @ 2024-08-30 8:41 UTC (permalink / raw) To: akpm, david, willy, kirill.shutemov Cc: ryan.roberts, anshuman.khandual, catalin.marinas, cl, vbabka, mhocko, apopple, dave.hansen, will, baohua, jack, mark.rutland, hughd, aneesh.kumar, yang, peterx, ioworker0, jglisse, linux-arm-kernel, linux-kernel, linux-mm, Dev Jain In preparation for the second patch, abstract away the THP allocation logic present in the create_huge_pmd() path, which corresponds to the faulting case when no page is present. There should be no functional change as a result of applying this patch. Signed-off-by: Dev Jain <dev.jain@arm.com> --- mm/huge_memory.c | 113 +++++++++++++++++++++++++++++------------------ 1 file changed, 69 insertions(+), 44 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 67c86a5d64a6..e5b568e2bb34 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -943,47 +943,92 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, } EXPORT_SYMBOL_GPL(thp_get_unmapped_area); -static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, - struct page *page, gfp_t gfp) +static vm_fault_t thp_fault_alloc(gfp_t gfp, int order, struct vm_area_struct *vma, + unsigned long haddr, struct folio **foliop, + unsigned long addr) { - struct vm_area_struct *vma = vmf->vma; - struct folio *folio = page_folio(page); - pgtable_t pgtable; - unsigned long haddr = vmf->address & HPAGE_PMD_MASK; - vm_fault_t ret = 0; + struct folio *folio = vma_alloc_folio(gfp, order, vma, haddr, true); - VM_BUG_ON_FOLIO(!folio_test_large(folio), folio); + *foliop = folio; + if (unlikely(!folio)) { + count_vm_event(THP_FAULT_FALLBACK); + count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK); + return VM_FAULT_FALLBACK; + } + VM_BUG_ON_FOLIO(!folio_test_large(folio), folio); if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) { folio_put(folio); count_vm_event(THP_FAULT_FALLBACK); count_vm_event(THP_FAULT_FALLBACK_CHARGE); - count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK); - count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE); + count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK); + count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE); return VM_FAULT_FALLBACK; } folio_throttle_swaprate(folio, gfp); - pgtable = pte_alloc_one(vma->vm_mm); - if (unlikely(!pgtable)) { - ret = VM_FAULT_OOM; - goto release; - } - - folio_zero_user(folio, vmf->address); + folio_zero_user(folio, addr); /* * The memory barrier inside __folio_mark_uptodate makes sure that * folio_zero_user writes become visible before the set_pmd_at() * write. */ __folio_mark_uptodate(folio); + return 0; +} + +static void __thp_fault_success_stats(struct vm_area_struct *vma, int order) +{ + count_vm_event(THP_FAULT_ALLOC); + count_mthp_stat(order, MTHP_STAT_ANON_FAULT_ALLOC); + count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC); +} + +static void map_pmd_thp(struct folio *folio, struct vm_fault *vmf, + struct vm_area_struct *vma, unsigned long haddr, + pgtable_t pgtable) + __releases(vmf->ptl) +{ + pmd_t entry; + + entry = mk_huge_pmd(&folio->page, vma->vm_page_prot); + entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); + folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE); + folio_add_lru_vma(folio, vma); + pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable); + set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry); + update_mmu_cache_pmd(vma, vmf->address, vmf->pmd); + add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR); + mm_inc_nr_ptes(vma->vm_mm); + spin_unlock(vmf->ptl); + __thp_fault_success_stats(vma, HPAGE_PMD_ORDER); +} + +static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf) +{ + struct vm_area_struct *vma = vmf->vma; + struct folio *folio = NULL; + pgtable_t pgtable; + unsigned long haddr = vmf->address & HPAGE_PMD_MASK; + vm_fault_t ret = 0; + gfp_t gfp = vma_thp_gfp_mask(vma); + + pgtable = pte_alloc_one(vma->vm_mm); + if (unlikely(!pgtable)) { + ret = VM_FAULT_OOM; + goto release; + } + + ret = thp_fault_alloc(gfp, HPAGE_PMD_ORDER, vma, haddr, &folio, + vmf->address); + if (ret) + goto release; vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); + if (unlikely(!pmd_none(*vmf->pmd))) { goto unlock_release; } else { - pmd_t entry; - ret = check_stable_address_space(vma->vm_mm); if (ret) goto unlock_release; @@ -997,20 +1042,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, VM_BUG_ON(ret & VM_FAULT_FALLBACK); return ret; } - - entry = mk_huge_pmd(page, vma->vm_page_prot); - entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); - folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE); - folio_add_lru_vma(folio, vma); - pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable); - set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry); - update_mmu_cache_pmd(vma, vmf->address, vmf->pmd); - add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR); - mm_inc_nr_ptes(vma->vm_mm); - spin_unlock(vmf->ptl); - count_vm_event(THP_FAULT_ALLOC); - count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC); - count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC); + map_pmd_thp(folio, vmf, vma, haddr, pgtable); } return 0; @@ -1019,7 +1051,8 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, release: if (pgtable) pte_free(vma->vm_mm, pgtable); - folio_put(folio); + if (folio) + folio_put(folio); return ret; } @@ -1077,8 +1110,6 @@ static void set_huge_zero_folio(pgtable_t pgtable, struct mm_struct *mm, vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; - gfp_t gfp; - struct folio *folio; unsigned long haddr = vmf->address & HPAGE_PMD_MASK; vm_fault_t ret; @@ -1129,14 +1160,8 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf) } return ret; } - gfp = vma_thp_gfp_mask(vma); - folio = vma_alloc_folio(gfp, HPAGE_PMD_ORDER, vma, haddr, true); - if (unlikely(!folio)) { - count_vm_event(THP_FAULT_FALLBACK); - count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK); - return VM_FAULT_FALLBACK; - } - return __do_huge_pmd_anonymous_page(vmf, &folio->page, gfp); + + return __do_huge_pmd_anonymous_page(vmf); } static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, -- 2.30.2 ^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-09-02 9:14 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-09-01 5:02 [PATCH 1/2] mm: Abstract THP allocation kernel test robot 2024-09-02 9:14 ` Dev Jain -- strict thread matches above, loose matches on Subject: below -- 2024-08-30 8:41 [PATCH 0/2] Do not shatter hugezeropage on wp-fault Dev Jain 2024-08-30 8:41 ` [PATCH 1/2] mm: Abstract THP allocation Dev Jain
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.