Message-ID: <341d1aed-13ad-41ee-ad30-487c5baec399@kernel.org>
Date: Mon, 1 Dec 2025 17:23:13 +0100
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] support batched checks of the references for large folios
To: Baolin Wang, akpm@linux-foundation.org, catalin.marinas@arm.com,
 will@kernel.org
Cc: lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, Liam.Howlett@oracle.com,
 vbabka@suse.cz, rppt@kernel.org, surenb@google.com, mhocko@suse.com,
 riel@surriel.com, harry.yoo@oracle.com, jannh@google.com, willy@infradead.org,
 baohua@kernel.org, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org
From: "David Hildenbrand (Red Hat)"

On 11/25/25 01:56, Baolin Wang wrote:
> Currently, folio_referenced_one() always checks the young flag for each PTE
> sequentially, which is inefficient for large folios. This inefficiency is
> especially noticeable when reclaiming clean file-backed large folios, where
> folio_referenced() is observed as a significant performance hotspot.
>
> Moreover, on Arm architecture, which supports contiguous PTEs, there is already
> an optimization to clear the young flags for PTEs within a contiguous range.
> However, this is not sufficient. We can extend this to perform batched operations
> for the entire large folio (which might exceed the contiguous range: CONT_PTE_SIZE).
>
> By supporting batched checking of the young flags and flushing TLB entries,
> I observed a 33% performance improvement in my file-backed folios reclaim tests.

Can you point at the benchmark or briefly explain what it does? What
exactly are we measuring that improves by 33%?

-- 
Cheers

David