From: Lance Yang
Date: Thu, 2 Oct 2025 10:31:53 +0800
Subject: Re: [Patch v2] mm/huge_memory: add pmd folio to ds_queue in do_huge_zero_wp_pmd()
To: Wei Yang
Cc: akpm@linux-foundation.org, david@redhat.com, lorenzo.stoakes@oracle.com,
    ziy@nvidia.com, baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com,
    npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org,
    wangkefeng.wang@huawei.com, linux-mm@kvack.org, stable@vger.kernel.org
References: <20251002013825.20448-1-richard.weiyang@gmail.com> <20251002014604.d2ryohvtrdfn7mvf@master>
In-Reply-To: <20251002014604.d2ryohvtrdfn7mvf@master>
X-Mailing-List: stable@vger.kernel.org

On 2025/10/2 09:46, Wei Yang wrote:
> On Thu, Oct 02, 2025 at 01:38:25AM +0000, Wei Yang wrote:
>> We add the pmd folio to ds_queue on the first page fault in
>> __do_huge_pmd_anonymous_page(), so that we can split it in case of
>> memory pressure. The same should apply to a pmd folio installed
>> during a wp page fault.
>>
>> Commit 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault")
>> missed adding it to ds_queue, which means the system may not reclaim
>> enough memory under memory pressure even if the pmd folio is
>> underused.
>>
>> Move deferred_split_folio() into map_anon_folio_pmd() to make pmd
>> folio installation consistent.
>>
>
> Since we move deferred_split_folio() into map_anon_folio_pmd(), I am
> thinking about whether we can consolidate the process in
> collapse_huge_page().
>
> Use map_anon_folio_pmd() in collapse_huge_page(), but skip the
> statistics adjustment.

Yeah, that's a good idea :)

We could add a simple bool is_fault parameter to map_anon_folio_pmd()
to control the statistics. The fault paths would call it with true, and
the collapse paths could then call it with false.

Something like this:

```
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1b81680b4225..9924180a4a56 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1218,7 +1218,7 @@ static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
 }
 
 static void map_anon_folio_pmd(struct folio *folio, pmd_t *pmd,
-		struct vm_area_struct *vma, unsigned long haddr)
+		struct vm_area_struct *vma, unsigned long haddr, bool is_fault)
 {
 	pmd_t entry;
 
@@ -1228,10 +1228,15 @@ static void map_anon_folio_pmd(struct folio *folio, pmd_t *pmd,
 	folio_add_lru_vma(folio, vma);
 	set_pmd_at(vma->vm_mm, haddr, pmd, entry);
 	update_mmu_cache_pmd(vma, haddr, pmd);
-	add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
-	count_vm_event(THP_FAULT_ALLOC);
-	count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
-	count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
+
+	if (is_fault) {
+		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
+		count_vm_event(THP_FAULT_ALLOC);
+		count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
+		count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
+	}
+
+	deferred_split_folio(folio, false);
 }
 
 static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d0957648db19..2eddd5a60e48 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1227,17 +1227,10 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	__folio_mark_uptodate(folio);
 	pgtable = pmd_pgtable(_pmd);
 
-	_pmd = folio_mk_pmd(folio, vma->vm_page_prot);
-	_pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma);
-
 	spin_lock(pmd_ptl);
 	BUG_ON(!pmd_none(*pmd));
-	folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
-	folio_add_lru_vma(folio, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
-	set_pmd_at(mm, address, pmd, _pmd);
-	update_mmu_cache_pmd(vma, address, pmd);
-	deferred_split_folio(folio, false);
+	map_anon_folio_pmd(folio, pmd, vma, address, false);
 	spin_unlock(pmd_ptl);
 	folio = NULL;
```

Untested, though.

>
>> Fixes: 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault")
>> Signed-off-by: Wei Yang
>> Cc: David Hildenbrand
>> Cc: Lance Yang
>> Cc: Dev Jain
>> Cc:
>>
>