From: SeongJae Park
To: SeongJae Park
Cc: damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC IDEA] ACMA: Access/Contiguity-aware Memory Auto-scaling
Date: Wed, 13 Mar 2024 18:51:37 -0700
Message-Id: <20240314015137.62598-1-sj@kernel.org>
In-Reply-To: <20231112195114.61474-1-sj@kernel.org>

On Sun, 12 Nov 2023 19:51:14 +0000 SeongJae Park wrote:

> Hello,
>
>
> I'd like to share an idea for making systems automatically scale up/down
> memory in an access/contiguity-aware way.
> It is designed for the memory efficiency of memory-oversubscribed virtual
> machine systems that are based on free page reporting-like collaboration,
> but it might also be useful for memory/power efficiency and memory
> contiguity of general systems.  There is no implementation at the moment,
> but I'd like to hear any comments or concerns about the idea first, if
> anyone has any.  I will also share this in the future plans part of the
> upcoming kernel summit DAMON talk[1].
>
[...]
>
> ACMA: Access/Contiguity-aware Memory Auto-scaling
> =================================================
>
> We therefore propose a new kernel feature for the requirements, namely
> Access/Contiguity-aware Memory Auto-scaling (ACMA).
>
> Definitions
> -----------
>
> ACMA defines a metric called DAMON-detected working set.  This is the set
> of memory regions to which DAMON has detected accesses within a
> user-specifiable time interval, say, one minute.
>
> ACMA also defines a new operation called stealing.  It receives a
> contiguous memory region as its input, and allocates the pages of the
> region.  If some pages in the region are not free, it migrates those out.
> Hence it could be thought of as a variant, or a wrapper, of memory
> offlining or alloc_contig_range().  If the allocation is successful, it
> further reports the region to the host as safe to use.  ACMA manages the
> stealing status of each memory block.  If all pages of a memory block are
> stolen, it further hot-unplugs the block.
>
> It further defines a new operation called stolen pages returning.  The
> action receives a memory size as input.  If there are
> not-yet-hot-unplugged stolen pages of that size, it frees those pages.  If
> there are no such stolen pages but there is a hot-unplugged stolen memory
> block, it hot-plugs the block again, choosing blocks closer to the
> not-hot-unplugged blocks first.  Then the guest users can allocate the
> returned pages and access them.
> When they access them, the host will be notified via a page fault and
> assign/map a host-physical page for that.
>
> Workflow
> --------
>
> With these definitions, ACMA behaves based on system status as follows.
>
> Phase 0. It periodically monitors the DAMON-based working set size and
> free memory size of the system.
>
> Phase 1. If the free memory to the working set size ratio is more than a
> threshold (high), say, 2:1 (200%), ACMA steals report-granularity
> contiguous non-working set pages in the last not-yet-hot-unplugged memory
> block, colder pages first.  The ratio will decrease.
>
> Phase 2. If the free memory to the working set size ratio becomes less
> than a threshold (normal), say, 1:1 (100%), ACMA stops stealing and starts
> reclaiming non-working set pages, colder pages first.  The ratio will
> increase.  The reclamation is continued until the ratio becomes higher
> than the normal threshold.
>
> Phase 3. If the non-working set reclamation is not increasing the ratio
> and it becomes less than yet another threshold (low), say, 1:2 (50%), ACMA
> starts returning stolen pages until the free memory to the working set
> ratio becomes higher than the low threshold.

So, the idea is to keep free memory at only a specific proportion of the
working set size.  However, the free memory to working set size ratio is
not easy to understand, since it changes very dynamically depending on the
access pattern.  Hence, it is not easy to imagine how this will work and
what results the system will get without a visualization or a detailed
example scenario.  This would be even more challenging for users.  The
three thresholds may also be hard to tune optimally, especially when the
characteristics of the workload are dynamic.

Since we have user/self-feedback-driven auto-tuning, I believe we could
make this simpler.  Specifically, ACMA could ask the user to set the
minimum and maximum memory of the system to guarantee, and an acceptable
level of memory pressure.
Then, it could do its best to make the system memory efficient while
keeping those three conditions.  The detailed mechanism will of course be
more complicated than this simple statement, but I believe this simple
statement lets users understand what the result of using ACMA is.

I will share a more detailed specification of the updated idea as another
"RFC IDEA" mail soon.


Thanks,
SJ

[...]
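
For readers who want to see why the three thresholds are hard to reason
about, below is a minimal user-space sketch of the quoted three-threshold
workflow.  This is only an illustration, not an implementation; ACMA has
none at the moment.  The names (Action, next_action, HIGH/NORMAL/LOW) are
hypothetical, the threshold values are the examples from the quoted mail,
and two simplifications are assumed: falling below the low threshold is
treated as enough to start returning stolen pages (the "reclamation is not
increasing the ratio" check is omitted), and stealing that has started
continues until the ratio drops below the normal threshold.

```python
from enum import Enum


class Action(Enum):
    IDLE = "idle"
    STEAL = "steal"
    RECLAIM = "reclaim"
    RETURN = "return stolen pages"


# Example thresholds from the quoted mail, as free:working-set ratios.
HIGH, NORMAL, LOW = 2.0, 1.0, 0.5


def next_action(free_bytes, working_set_bytes, prev=Action.IDLE):
    """Pick the next action from the free memory to working set ratio."""
    if working_set_bytes <= 0:
        # Assumption: with no detected working set, treat the ratio as
        # unbounded.
        ratio = float("inf")
    else:
        ratio = free_bytes / working_set_bytes

    if ratio < LOW:
        # Phase 3 (simplified): give stolen pages back to the guest.
        return Action.RETURN
    if ratio < NORMAL:
        # Phase 2: stop stealing, reclaim cold non-working set pages.
        return Action.RECLAIM
    if ratio > HIGH:
        # Phase 1: steal cold contiguous pages and report them.
        return Action.STEAL
    # Between normal and high: keep stealing if already doing so.
    return Action.STEAL if prev is Action.STEAL else Action.IDLE


if __name__ == "__main__":
    # Free memory fixed at 200 MiB while the working set grows.
    action = Action.IDLE
    for wss_mib in (50, 150, 250, 500):
        action = next_action(200, wss_mib, action)
        print(wss_mib, action.value)
```

Note that in the usage example the amount of free memory never changes,
yet the chosen action moves through all phases only because the working
set size changed.  That is the dynamic behavior that makes the ratio-based
thresholds difficult to tune.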