From: Lance Yang
To: dev.jain@arm.com
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
	anshuman.khandual@arm.com, axelrasmussen@google.com, baohua@kernel.org,
	baolin.wang@linux.alibaba.com, bhe@redhat.com, chrisl@kernel.org,
	david@kernel.org, harry.yoo@oracle.com, hughd@google.com,
	jannh@google.com, kas@kernel.org, kasong@tencent.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
	mhocko@suse.com, nphamcs@gmail.com, pfalcato@suse.de, riel@surriel.com,
	rppt@kernel.org, ryan.roberts@arm.com, shikemeng@huaweicloud.com,
	surenb@google.com, vbabka@kernel.org, weixugc@google.com,
	willy@infradead.org, youngjun.park@lge.com, yuanchu@google.com,
	yuzhao@google.com, ziy@nvidia.com, Lance Yang
Subject: Re: [PATCH 0/9] mm/rmap: Optimize anonymous large folio unmapping
Date: Tue, 10 Mar 2026 20:59:40 +0800
Message-ID: <20260310125940.39707-1-lance.yang@linux.dev>
In-Reply-To: <20260310073013.4069309-1-dev.jain@arm.com>
References: <20260310073013.4069309-1-dev.jain@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 10, 2026 at 01:00:04PM +0530, Dev Jain wrote:
>Speed up unmapping of anonymous large folios by clearing the ptes, and
>setting swap ptes, in one go.
>
>The following benchmark (stolen from Barry at [1]) is used to measure the
>time taken to swapout 256M worth of memory backed by 64K large folios:
>
> #define _GNU_SOURCE
> #include
> #include
> #include
> #include
> #include
> #include
> #include
>
> #define SIZE_MB 256
> #define SIZE_BYTES (SIZE_MB * 1024 * 1024)
>
> int main() {
>     void *addr = mmap(NULL, SIZE_BYTES, PROT_READ | PROT_WRITE,
>                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>     if (addr == MAP_FAILED) {
>         perror("mmap failed");
>         return 1;
>     }
>
>     memset(addr, 0, SIZE_BYTES);
>
>     struct timespec start, end;
>     clock_gettime(CLOCK_MONOTONIC, &start);
>
>     if (madvise(addr, SIZE_BYTES, MADV_PAGEOUT) != 0) {
>         perror("madvise(MADV_PAGEOUT) failed");
>         munmap(addr, SIZE_BYTES);
>         return 1;
>     }
>
>     clock_gettime(CLOCK_MONOTONIC, &end);
>
>     long duration_ns = (end.tv_sec - start.tv_sec) * 1e9 +
>                        (end.tv_nsec - start.tv_nsec);
>     printf("madvise(MADV_PAGEOUT) took %ld ns (%.3f ms)\n",
>            duration_ns, duration_ns / 1e6);
>
>     munmap(addr, SIZE_BYTES);
>     return 0;
> }
>
>On arm64, only showing one of the middle values in the distribution:
>
>without patch:
>madvise(MADV_PAGEOUT) took 52192959 ns (52.193 ms)
>
>with patch:
>madvise(MADV_PAGEOUT) took 26676625 ns (26.677 ms)

Good numbers!

Just tested on x86 KVM with THP=never, no performance regression observed.

Cheers,
Lance