From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F47ACA6F
	for <linux-kernel@vger.kernel.org>; Thu, 18 Dec 2025 09:32:03 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1766050323; cv=none; b=knw0ob2Zu1eqxsainc/mtEIP+2RXFvGhFLRUgphe+ABQFry9q31yB07PDDj8Mn+bELPxR7ocmJ7oSQqxefKJPhLnobt/bjztV8oW44GlOd//GYtKsJMJq4lM1nqF3kqy1AZGggPTHy/I1+a6j8hvKGl56h6wHdv4DPbsl8ZRAO8=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1766050323; c=relaxed/simple;
	bh=dvhG8sfqEYygPleLeUK3D7CNQ7o0b/Y0O54NJkeV+oE=;
	h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From:
	 In-Reply-To:Content-Type; b=NqsRJET3fXgqcUhspL5FOICRMyhPewVxKpsVlfjZdOMQtRKR21iFnX4PZLEw4tJc7Dlvg676zzsOFuFfZGGrLOrlQdpg6W1BV9/tn7ltycG40ljLDxSlL/KGhq2+Fw01uMYGdjNnc1v9D5ij4FKw+b+wxHB+DgftDdEM9tN8dnM=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=KtsUoT7A; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="KtsUoT7A"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id C4A5CC4CEFB;
	Thu, 18 Dec 2025 09:32:00 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1766050323;
	bh=dvhG8sfqEYygPleLeUK3D7CNQ7o0b/Y0O54NJkeV+oE=;
	h=Date:Subject:To:Cc:References:From:In-Reply-To:From;
	b=KtsUoT7AWc3bULPITga265ziXXiV/R3yhm5a8E4tusf/EZWNgwrka83/YJFz2fFyx
	 bSjuQ0m2+BCkTgJAOVUgZ79m3eAXplk8mJYIWJvadCIvdzUDy1ohT7ac2aF1u3RaNa
	 lS1lb/8oPb7IVFhy5XSDPLarKvZ2DYgMWPHvv0rn0Yd/UnunEkUo9GcVeGs41v+FrR
	 krAV9W+Yw95i62ezY+8FvhhEbPtx5kxw2mtirFZM7252gx+eeGNfn5Dxg010vqIjEG
	 hsjkPLeWVXut3SlsuGGp1b24KACgT8iwhV/5c9xNV9kXg8lCM9RdZxiekFwm12ouTz
	 VP+QVK4SHiVwQ==
Message-ID: <3c75d915-5d7f-4e80-975f-4479393e7139@kernel.org>
Date: Thu, 18 Dec 2025 10:31:58 +0100
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH 3/4] mm: khugepaged: move mm to list tail when
 MADV_COLD/MADV_FREE
To: Vernon Yang <vernon2gm@gmail.com>, akpm@linux-foundation.org,
 lorenzo.stoakes@oracle.com
Cc: ziy@nvidia.com, npache@redhat.com, baohua@kernel.org,
 lance.yang@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Vernon Yang <yanglincheng@kylinos.cn>
References: <20251215090419.174418-1-yanglincheng@kylinos.cn>
 <20251215090419.174418-4-yanglincheng@kylinos.cn>
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
Content-Language: en-US
In-Reply-To: <20251215090419.174418-4-yanglincheng@kylinos.cn>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 12/15/25 10:04, Vernon Yang wrote:
> For example, create three tasks: hot1 -> cold -> hot2. After all three
> tasks are created, each allocates 128 MB of memory. The hot1/hot2 tasks
> continuously access their 128 MB of memory, while the cold task only
> accesses its memory briefly and then calls madvise(MADV_COLD). However,
> khugepaged still prioritizes scanning the cold task and only scans the
> hot2 task after completing the scan of the cold task.
> 
> So if the user has explicitly informed us via MADV_COLD/FREE that this
> memory is cold or will be freed, it is appropriate for khugepaged to
> scan it only at the latest possible moment, thereby avoiding unnecessary
> scan and collapse operations and reducing wasted CPU time.
> 
> Here are the performance test results:
> (Throughput: higher is better; all other metrics: lower is better)
> 
> Testing on x86_64 machine:
> 
> | task hot2           | without patch | with patch    |  delta  |
> |---------------------|---------------|---------------|---------|
> | total accesses time |  3.14 sec     |  2.92 sec     | -7.01%  |
> | cycles per access   |  4.91         |  2.07         | -57.84% |
> | Throughput          |  104.38 M/sec |  112.12 M/sec | +7.42%  |
> | dTLB-load-misses    |  288966432    |  1292908      | -99.55% |
> 
> Testing on qemu-system-x86_64 -enable-kvm:
> 
> | task hot2           | without patch | with patch    |  delta  |
> |---------------------|---------------|---------------|---------|
> | total accesses time |  3.35 sec     |  2.96 sec     | -11.64% |
> | cycles per access   |  7.23         |  2.12         | -70.68% |
> | Throughput          |  97.88 M/sec  |  110.76 M/sec | +13.16% |
> | dTLB-load-misses    |  237406497    |  3189194      | -98.66% |

Again, I also don't like that, because you make assumptions about a full 
process based on some part of its address space.

E.g., if a library issues a MADV_COLD on some part of the memory the 
library manages, why should the remaining part of the process suffer as 
well?
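
To make that scenario concrete, here is a minimal user-space sketch. It is 
not taken from the patch: lib_mark_cache_cold(), demo_partial_cold() and 
REGION_SIZE are hypothetical names chosen for illustration. One region 
stands in for the application's hot data, the other for a cache owned by a 
library, and only the latter receives the MADV_COLD hint.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLD
#define MADV_COLD 20	/* Linux >= 5.4; fallback for older libc headers */
#endif

#define REGION_SIZE (4UL << 20)	/* 4 MiB per region, arbitrary for the demo */

/* Hypothetical library helper: hints that only the library's cache is cold. */
int lib_mark_cache_cold(void *cache, size_t len)
{
	return madvise(cache, len, MADV_COLD);
}

/*
 * Hypothetical demo. Returns 1 if both regions still hold their data after
 * the hint, 0 if not, and -1 if a mapping could not be created.
 */
int demo_partial_cold(void)
{
	const int prot = PROT_READ | PROT_WRITE;
	const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	unsigned char *hot = mmap(NULL, REGION_SIZE, prot, flags, -1, 0);
	unsigned char *cache = mmap(NULL, REGION_SIZE, prot, flags, -1, 0);
	int ok;

	if (hot == MAP_FAILED || cache == MAP_FAILED) {
		if (hot != MAP_FAILED)
			munmap(hot, REGION_SIZE);
		if (cache != MAP_FAILED)
			munmap(cache, REGION_SIZE);
		return -1;
	}

	memset(hot, 0xaa, REGION_SIZE);		/* application data, stays hot */
	memset(cache, 0x55, REGION_SIZE);	/* library-managed cache */

	/*
	 * The hint covers the cache only. It may fail (e.g. EINVAL on kernels
	 * without MADV_COLD); the result is ignored here because the hint is
	 * advisory and must not change the contents either way.
	 */
	(void)lib_mark_cache_cold(cache, REGION_SIZE);

	ok = hot[0] == 0xaa && hot[REGION_SIZE - 1] == 0xaa &&
	     cache[0] == 0x55 && cache[REGION_SIZE - 1] == 0x55;

	munmap(hot, REGION_SIZE);
	munmap(cache, REGION_SIZE);
	return ok;
}
```

The hint above says nothing about the "hot" region, yet with the proposed 
change the whole mm would be moved to the tail of the khugepaged scan list.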

This seems to be a heuristic focused on some specific workloads, no?

-- 
Cheers

David