From mboxrd@z Thu Jan 1 00:00:00 1970
From: SeongJae Park
To: akpm@linux-foundation.org
Cc: SeongJae Park, Jonathan.Cameron@Huawei.com, acme@kernel.org,
 alexander.shishkin@linux.intel.com, amit@kernel.org,
 benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net,
 david@redhat.com, dwmw@amazon.com, elver@google.com, fan.du@intel.com,
 foersleo@amazon.de, greg@kroah.com, gthelen@google.com,
 guoju.fgj@alibaba-inc.com, jgowans@amazon.com, mgorman@suse.de,
 mheyne@amazon.de, minchan@kernel.org, mingo@redhat.com,
 namhyung@kernel.org, peterz@infradead.org, riel@surriel.com,
 rientjes@google.com, rostedt@goodmis.org, rppt@kernel.org,
 shakeelb@google.com, shuah@kernel.org, sieberf@amazon.com,
 sj38.park@gmail.com, snu@zelle79.org, vbabka@suse.cz,
 vdavydov.dev@gmail.com, zgf574564920@gmail.com, linux-damon@amazon.com,
 linux-mm@kvack.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v32 10/13] Documentation: Add documents for DAMON
Date: Mon, 28 Jun 2021 13:33:52 +0000
Message-Id: <20210628133355.18576-11-sj38.park@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210628133355.18576-1-sj38.park@gmail.com>
References: <20210628133355.18576-1-sj38.park@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender:
 owner-linux-mm@kvack.org

From: SeongJae Park

This commit adds documents for DAMON under
`Documentation/admin-guide/mm/damon/` and `Documentation/vm/damon/`.

Signed-off-by: SeongJae Park
Reviewed-by: Fernand Sieber
Reviewed-by: Markus Boehme
---
 Documentation/admin-guide/mm/damon/index.rst |  15 ++
 Documentation/admin-guide/mm/damon/start.rst | 114 +++++++++++++
 Documentation/admin-guide/mm/damon/usage.rst | 112 +++++++++++++
 Documentation/admin-guide/mm/index.rst       |   1 +
 Documentation/vm/damon/api.rst               |  20 +++
 Documentation/vm/damon/design.rst            | 166 +++++++++++++++++++
 Documentation/vm/damon/faq.rst               |  51 ++++++
 Documentation/vm/damon/index.rst             |  30 ++++
 Documentation/vm/index.rst                   |   1 +
 9 files changed, 510 insertions(+)
 create mode 100644 Documentation/admin-guide/mm/damon/index.rst
 create mode 100644 Documentation/admin-guide/mm/damon/start.rst
 create mode 100644 Documentation/admin-guide/mm/damon/usage.rst
 create mode 100644 Documentation/vm/damon/api.rst
 create mode 100644 Documentation/vm/damon/design.rst
 create mode 100644 Documentation/vm/damon/faq.rst
 create mode 100644 Documentation/vm/damon/index.rst

diff --git a/Documentation/admin-guide/mm/damon/index.rst b/Documentation/admin-guide/mm/damon/index.rst
new file mode 100644
index 000000000000..8c5dde3a5754
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/index.rst
@@ -0,0 +1,15 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================
+Monitoring Data Accesses
+========================
+
+:doc:`DAMON ` allows light-weight data access monitoring.
+Using DAMON, users can analyze the memory access patterns of their systems and
+optimize them.
+
+.. toctree::
+   :maxdepth: 2
+
+   start
+   usage
diff --git a/Documentation/admin-guide/mm/damon/start.rst b/Documentation/admin-guide/mm/damon/start.rst
new file mode 100644
index 000000000000..d5eb89a8fc38
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/start.rst
@@ -0,0 +1,114 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Getting Started
+===============
+
+This document briefly describes how you can use DAMON by demonstrating its
+default user space tool.  Please note that this document describes only a part
+of its features for brevity.  Please refer to :doc:`usage` for more details.
+
+
+TL; DR
+======
+
+Follow the commands below to monitor and visualize the memory access pattern of
+your workload. ::
+
+    # # build the kernel with CONFIG_DAMON_*=y, install it, and reboot
+    # mount -t debugfs none /sys/kernel/debug/
+    # git clone https://github.com/awslabs/damo
+    # ./damo/damo record $(pidof )
+    # ./damo/damo report heat --plot_ascii
+
+The final command draws the access heatmap of ````.  The heatmap
+shows which memory region (x-axis) is accessed when (y-axis) and how frequently
+(number; the higher the more accesses have been observed). ::
+
+    111111111111111111111111111111111111111111111111111111110000
+    111121111111111111111111111111211111111111111111111111110000
+    000000000000000000000000000000000000000000000000001555552000
+    000000000000000000000000000000000000000000000222223555552000
+    000000000000000000000000000000000000000011111677775000000000
+    000000000000000000000000000000000000000488888000000000000000
+    000000000000000000000000000000000177888400000000000000000000
+    000000000000000000000000000046666522222100000000000000000000
+    000000000000000000000014444344444300000000000000000000000000
+    000000000000000002222245555510000000000000000000000000000000
+    # access_frequency:  0  1  2  3  4  5  6  7  8  9
+    # x-axis: space (140286319947776-140286426374096: 101.496 MiB)
+    # y-axis: time (605442256436361-605479951866441: 37.695430s)
+    # resolution: 60x10 (1.692 MiB and 3.770s for each character)
+
+
+Prerequisites
+=============
+
+Kernel
+------
+
+You should first ensure your system is running on a kernel built with
+``CONFIG_DAMON_*=y``.
+
+
+User Space Tool
+---------------
+
+For the demonstration, we will use the default user space tool for DAMON,
+called DAMON Operator (DAMO).  It is available at
+https://github.com/awslabs/damo.  The examples below assume that ``damo`` is on
+your ``$PATH``.  It's not mandatory, though.
+
+Because DAMO uses the debugfs interface of DAMON (refer to :doc:`usage` for
+details), you should ensure debugfs is mounted.  Mount it manually as
+below::
+
+    # mount -t debugfs none /sys/kernel/debug/
+
+or append the following line to your ``/etc/fstab`` file so that your system
+can automatically mount debugfs upon booting::
+
+    debugfs /sys/kernel/debug debugfs defaults 0 0
+
+
+Recording Data Access Patterns
+==============================
+
+The commands below record the memory access patterns of a program and save the
+monitoring results to a file. ::
+
+    $ git clone https://github.com/sjp38/masim
+    $ cd masim; make; ./masim ./configs/zigzag.cfg &
+    $ sudo damo record -o damon.data $(pidof masim)
+
+The first two lines of the commands download an artificial memory access
+generator program and run it in the background.  The generator will repeatedly
+access two 100 MiB sized memory regions one by one.  You can substitute this
+with your real workload.  The last line asks ``damo`` to record the access
+pattern in the ``damon.data`` file.
+
+
+Visualizing Recorded Patterns
+=============================
+
+The following three commands visualize the recorded access patterns and save
+the results as separate image files. ::
+
+    $ damo report heats --heatmap access_pattern_heatmap.png
+    $ damo report wss --range 0 101 1 --plot wss_dist.png
+    $ damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
+
+- ``access_pattern_heatmap.png`` will visualize the data access pattern in a
+  heatmap, showing which memory region (y-axis) got accessed when (x-axis)
+  and how frequently (color).
+- ``wss_dist.png`` will show the distribution of the working set size.
+- ``wss_chron_change.png`` will show how the working set size has
+  chronologically changed.
+
+You can view the visualizations of this example workload at [1]_.
+Visualizations of other realistic workloads are available at [2]_ [3]_ [4]_.
+
+.. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns
+.. [2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html
+.. [3] https://damonitor.github.io/test/result/visual/latest/rec.wss_sz.png.html
+.. [4] https://damonitor.github.io/test/result/visual/latest/rec.wss_time.png.html
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
new file mode 100644
index 000000000000..a72cda374aba
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -0,0 +1,112 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Detailed Usages
+===============
+
+DAMON provides the three interfaces below for different users.
+
+- *DAMON user space tool.*
+  This is for privileged people such as system administrators who want a
+  just-working human-friendly interface.  Using this, users can use DAMON’s
+  major features in a human-friendly way.  It may not be highly tuned for
+  special cases, though.  It supports only virtual address space monitoring.
+- *debugfs interface.*
+  This is for privileged user space programmers who want more optimized use of
+  DAMON.  Using this, users can use DAMON’s major features by reading
+  from and writing to special debugfs files.  Therefore, you can write and use
+  your personalized DAMON debugfs wrapper programs that read/write the
+  debugfs files on your behalf.  The DAMON user space tool is also a reference
+  implementation of such programs.  It supports only virtual address space
+  monitoring.
+- *Kernel Space Programming Interface.*
+  This is for kernel space programmers.  Using this, users can utilize every
+  feature of DAMON most flexibly and efficiently by writing kernel space
+  DAMON application programs.  You can even extend DAMON for various
+  address spaces.
+
+You could also write your own user space tool using the debugfs
+interface.  A reference implementation is available at
+https://github.com/awslabs/damo.  If you are a kernel programmer, you could
+refer to :doc:`/vm/damon/api` for the kernel space programming interface.
+For this reason, this document describes only the debugfs interface.
+
+debugfs Interface
+=================
+
+DAMON exports three files, ``attrs``, ``target_ids``, and ``monitor_on`` under
+its debugfs directory, ``/damon/``.
+
+
+Attributes
+----------
+
+Users can get and set the ``sampling interval``, ``aggregation interval``,
+``regions update interval``, and min/max number of monitoring target regions by
+reading from and writing to the ``attrs`` file.  For details of the monitoring
+attributes, please refer to :doc:`/vm/damon/design`.  Intervals are written in
+microseconds.  For example, the commands below set those values to 5 ms,
+100 ms, 1,000 ms, 10, and 1000, and then check them again::
+
+    # cd /damon
+    # echo 5000 100000 1000000 10 1000 > attrs
+    # cat attrs
+    5000 100000 1000000 10 1000
+
+
+Target IDs
+----------
+
+Some types of address spaces support multiple monitoring targets.  For example,
+the virtual memory address spaces monitoring can have multiple processes as the
+monitoring targets.  Users can set the targets by writing relevant id values of
+the targets to, and get the ids of the current targets by reading from the
+``target_ids`` file.  In case of the virtual address spaces monitoring, the
+values should be pids of the monitoring target processes.  For example, the
+commands below set processes having pids 42 and 4242 as the monitoring targets
+and check them again::
+
+    # cd /damon
+    # echo 42 4242 > target_ids
+    # cat target_ids
+    42 4242
+
+Note that setting the target ids doesn't start the monitoring.
+
+
+Turning On/Off
+--------------
+
+Setting the files as described above has no effect unless you explicitly
+start the monitoring.  You can start, stop, and check the current status of the
+monitoring by writing to and reading from the ``monitor_on`` file.  Writing
+``on`` to the file starts the monitoring of the targets with the attributes.
+Writing ``off`` to the file stops the monitoring.
+DAMON also stops if every target process is terminated.  The example commands
+below turn DAMON on and off, and then check its status::
+
+    # cd /damon
+    # echo on > monitor_on
+    # echo off > monitor_on
+    # cat monitor_on
+    off
+
+Please note that you cannot write to the above-mentioned debugfs files while
+the monitoring is turned on.  If you write to the files while DAMON is running,
+an error code such as ``-EBUSY`` will be returned.
+
+
+Tracepoint for Monitoring Results
+=================================
+
+DAMON provides the monitoring results via a tracepoint,
+``damon:damon_aggregated``.  While the monitoring is turned on, you could
+record the tracepoint events and show results using tracepoint supporting tools
+like ``perf``.  For example::
+
+    # echo on > monitor_on
+    # perf record -e damon:damon_aggregated &
+    # sleep 5
+    # kill $(pidof perf)
+    # echo off > monitor_on
+    # perf script
diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index 4b14d8b50e9e..cbd19d5e625f 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -27,6 +27,7 @@ the Linux memory management.
 
    concepts
    cma_debugfs
+   damon/index
    hugetlbpage
    idle_page_tracking
    ksm
diff --git a/Documentation/vm/damon/api.rst b/Documentation/vm/damon/api.rst
new file mode 100644
index 000000000000..08f34df45523
--- /dev/null
+++ b/Documentation/vm/damon/api.rst
@@ -0,0 +1,20 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+API Reference
+=============
+
+Kernel space programs can use every feature of DAMON using the APIs below.  All
+you need to do is include ``damon.h``, which is located in ``include/linux/``
+of the source tree.
+
+Structures
+==========
+
+.. kernel-doc:: include/linux/damon.h
+
+
+Functions
+=========
+
+.. kernel-doc:: mm/damon/core.c
diff --git a/Documentation/vm/damon/design.rst b/Documentation/vm/damon/design.rst
new file mode 100644
index 000000000000..b05159c295f4
--- /dev/null
+++ b/Documentation/vm/damon/design.rst
@@ -0,0 +1,166 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======
+Design
+======
+
+Configurable Layers
+===================
+
+DAMON provides data access monitoring functionality while making the accuracy
+and the overhead controllable.  The fundamental access monitoring requires
+primitives that depend on and are optimized for the target address space.  On
+the other hand, the accuracy and overhead tradeoff mechanism, which is the core
+of DAMON, is in the pure logic space.  DAMON separates the two parts in
+different layers and defines their interface so that various low level
+primitive implementations can be configured with the core logic.
+
+Due to this separated design and the configurable interface, users can extend
+DAMON for any address space by configuring the core logic with appropriate low
+level primitive implementations.  If an appropriate one is not provided, users
+can implement the primitives on their own.
+
+For example, physical memory, virtual memory, swap space, those for specific
+processes, NUMA nodes, files, and backing memory devices would be supportable.
+Also, if some architectures or devices support special optimized access check
+primitives, those will be easily configurable.
+
+
+Reference Implementations of Address Space Specific Primitives
+==============================================================
+
+The low level primitives for the fundamental access monitoring are defined in
+two parts:
+
+1. Identification of the monitoring target address range for the address space.
+2. Access check of specific address range in the target space.
+
+DAMON currently provides the implementation of the primitives for only the
+virtual address spaces.  The two subsections below describe how it works.
+
+
+VMA-based Target Address Range Construction
+-------------------------------------------
+
+Only small parts in the super-huge virtual address space of the processes are
+mapped to the physical memory and accessed.  Thus, tracking the unmapped
+address regions is just wasteful.  However, because DAMON can deal with some
+level of noise using the adaptive regions adjustment mechanism, tracking every
+mapping is not strictly required and could even incur a high overhead in some
+cases.  That said, excessively huge unmapped areas inside the monitoring target
+should be removed so that the adaptive mechanism does not spend time on them.
+
+For this reason, this implementation converts the complex mappings to three
+distinct regions that cover every mapped area of the address space.  The two
+gaps between the three regions are the two biggest unmapped areas in the given
+address space.  The two biggest unmapped areas would be the gap between the
+heap and the uppermost mmap()-ed region, and the gap between the lowermost
+mmap()-ed region and the stack in most cases.  Because these gaps are
+exceptionally huge in usual address spaces, excluding these will be sufficient
+to make a reasonable trade-off.  The layout below shows this in detail::
+
+
+
+
+    (small mmap()-ed regions and munmap()-ed regions)
+
+
+
+
+
+PTE Accessed-bit Based Access Check
+-----------------------------------
+
+The implementation for the virtual address space uses PTE Accessed-bit for
+basic access checks.  It finds the relevant PTE Accessed bit from the address
+by walking the page table for the target task of the address.  In this way, it
+finds and clears the bit for the next sampling target address and checks
+whether the bit is set again after one sampling period.  This could disturb
+other kernel subsystems using the Accessed bits, namely Idle page tracking and
+the reclaim logic.  To avoid such disturbances, DAMON makes itself mutually
+exclusive with Idle page tracking and uses ``PG_idle`` and ``PG_young`` page
+flags to solve the conflict with the reclaim logic, as Idle page tracking does.
+
+
+Address Space Independent Core Mechanisms
+=========================================
+
+The four sections below describe each of the DAMON core mechanisms and the five
+monitoring attributes, ``sampling interval``, ``aggregation interval``,
+``regions update interval``, ``minimum number of regions``, and ``maximum
+number of regions``.
+
+
+Access Frequency Monitoring
+---------------------------
+
+The output of DAMON says how frequently each page is accessed during a given
+duration.  The resolution of the access frequency is controlled by setting
+``sampling interval`` and ``aggregation interval``.  In detail, DAMON checks
+access to each page per ``sampling interval`` and aggregates the results.  In
+other words, it counts the number of the accesses to each page.  After each
+``aggregation interval`` passes, DAMON calls callback functions that were
+previously registered by users so that users can read the aggregated results,
+and then clears the results.  This can be described in the pseudo-code below::
+
+    while monitoring_on:
+        for page in monitoring_target:
+            if accessed(page):
+                nr_accesses[page] += 1
+        if time() % aggregation_interval == 0:
+            for callback in user_registered_callbacks:
+                callback(monitoring_target, nr_accesses)
+            for page in monitoring_target:
+                nr_accesses[page] = 0
+        sleep(sampling_interval)
+
+The monitoring overhead of this mechanism will arbitrarily increase as the
+size of the target workload grows.
+
+
+Region Based Sampling
+---------------------
+
+To avoid the unbounded increase of the overhead, DAMON groups adjacent pages
+that are assumed to have the same access frequencies into a region.  As long as
+the assumption (pages in a region have the same access frequencies) is kept,
+only one page in the region is required to be checked.  Thus, for each
+``sampling interval``, DAMON randomly picks one page in each region, waits for
+one ``sampling interval``, checks whether the page was accessed meanwhile, and
+increases the access frequency of the region if so.  Therefore, the monitoring
+overhead is controllable by setting the number of regions.  DAMON allows users
+to set the minimum and the maximum number of regions for the trade-off.
+
+This scheme, however, cannot preserve the quality of the output if the
+assumption is not guaranteed.
+
+
+Adaptive Regions Adjustment
+---------------------------
+
+Even if the initial monitoring target regions are somehow well constructed to
+fulfill the assumption (pages in the same region have similar access
+frequencies), the data access pattern can change dynamically.  This will
+result in low monitoring quality.  To keep the assumption as much as possible,
+DAMON adaptively merges and splits each region based on its access frequency.
+
+For each ``aggregation interval``, it compares the access frequencies of
+adjacent regions and merges them if the frequency difference is small.  Then,
+after it reports and clears the aggregated access frequency of each region, it
+splits each region into two or three regions if the total number of regions
+will not exceed the user-specified maximum number of regions after the split.
+
+In this way, DAMON provides its best-effort quality and minimal overhead while
+keeping the bounds users set for their trade-off.
+
+
+Dynamic Target Space Updates Handling
+-------------------------------------
+
+The monitoring target address range could be dynamically changed.
+For example, virtual memory could be dynamically mapped and unmapped, and
+physical memory could be hot-plugged.
+
+As the changes could be quite frequent in some cases, DAMON checks the dynamic
+memory mapping changes and applies them to the abstracted target area only once
+per user-specified time interval (``regions update interval``).
diff --git a/Documentation/vm/damon/faq.rst b/Documentation/vm/damon/faq.rst
new file mode 100644
index 000000000000..cb3d8b585a8b
--- /dev/null
+++ b/Documentation/vm/damon/faq.rst
@@ -0,0 +1,51 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+Frequently Asked Questions
+==========================
+
+Why a new subsystem, instead of extending perf or other user space tools?
+=========================================================================
+
+First, because it needs to be as lightweight as possible so that it can be
+used online, any unnecessary overhead such as kernel - user space context
+switching cost should be avoided.  Second, DAMON aims to be used by other
+programs including the kernel.  Therefore, having a dependency on specific
+tools like perf is not desirable.  These are the two biggest reasons why DAMON
+is implemented in the kernel space.
+
+
+Can 'idle pages tracking' or 'perf mem' substitute DAMON?
+=========================================================
+
+Idle page tracking is a low level primitive for access check of the physical
+address space.  'perf mem' is similar, though it can use sampling to minimize
+the overhead.  On the other hand, DAMON is a higher-level framework for the
+monitoring of various address spaces.  It is focused on memory management
+optimization and provides sophisticated accuracy/overhead handling mechanisms.
+Therefore, 'idle pages tracking' and 'perf mem' could provide a subset of
+DAMON's output, but cannot substitute for DAMON.
+
+
+Does DAMON support virtual memory only?
+=======================================
+
+No.  The core of DAMON is address space independent.  The address space
+specific low level primitive parts, including monitoring target region
+construction and actual access checks, can be implemented and configured on the
+DAMON core by the users.  In this way, DAMON users can monitor any address
+space with any access check technique.
+
+Nonetheless, DAMON provides vma tracking and PTE Accessed bit check based
+implementations of the address space dependent functions for the virtual memory
+by default, for reference and convenient use.  In the near future, we will
+provide those for the physical memory address space.
+
+
+Can I simply monitor page granularity?
+======================================
+
+Yes.  You can do so by setting the ``min_nr_regions`` attribute higher than the
+working set size divided by the page size.  Because the monitoring target
+region size is forced to be ``>=page size``, the region split will have no
+effect.
diff --git a/Documentation/vm/damon/index.rst b/Documentation/vm/damon/index.rst
new file mode 100644
index 000000000000..a2858baf3bf1
--- /dev/null
+++ b/Documentation/vm/damon/index.rst
@@ -0,0 +1,30 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+DAMON: Data Access MONitor
+==========================
+
+DAMON is a data access monitoring framework subsystem for the Linux kernel.
+The core mechanisms of DAMON (refer to :doc:`design` for details) make it
+
+ - *accurate* (the monitoring output is useful enough for DRAM level memory
+   management; it might not be appropriate for CPU cache levels, though),
+ - *light-weight* (the monitoring overhead is low enough to be applied online),
+   and
+ - *scalable* (the upper-bound of the overhead is in constant range regardless
+   of the size of target workloads).
+
+Using this framework, therefore, the kernel's memory management mechanisms can
+make advanced decisions.  Experimental memory management optimization works
+that incurred high data access monitoring overhead could be implemented again.
+In user space, meanwhile, users who have some special workloads can write
+personalized applications for better understanding and optimizations of their
+workloads and systems.
+
+.. toctree::
+   :maxdepth: 2
+
+   faq
+   design
+   api
+   plans
diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst
index eff5fbd492d0..b51f0d8992f8 100644
--- a/Documentation/vm/index.rst
+++ b/Documentation/vm/index.rst
@@ -32,6 +32,7 @@ descriptions of data structures and algorithms.
    arch_pgtable_helpers
    balance
    cleancache
+   damon/index
    free_page_reporting
    frontswap
    highmem
-- 
2.17.1
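
A note on the pseudo-code in design.rst: combined with the region based
sampling described in the same file, the loop can be sketched as runnable
Python.  This is only a minimal simulation under stated assumptions, not
DAMON's implementation.  The function name monitor(), its parameters, and the
region layout are hypothetical, and the real access check (clearing a PTE
Accessed bit and re-reading it one sampling interval later) is replaced by a
plain callback.

```python
import random


def monitor(regions, accessed, nr_samples, samples_per_aggregation, rng=None):
    """Simulate region based sampling with periodic aggregation.

    regions: list of (start_page, end_page) half-open ranges (hypothetical).
    accessed: callable(page) -> bool; stands in for the real access check.
    Returns one snapshot (list of per-region counts) per aggregation interval.
    """
    rng = rng or random.Random(0)
    nr_accesses = [0] * len(regions)
    snapshots = []
    for sample in range(1, nr_samples + 1):
        for i, (start, end) in enumerate(regions):
            # Check only one randomly picked page per region.
            page = rng.randrange(start, end)
            if accessed(page):
                nr_accesses[i] += 1
        if sample % samples_per_aggregation == 0:
            # Report the aggregated counts, then clear them.
            snapshots.append(list(nr_accesses))
            nr_accesses = [0] * len(regions)
    return snapshots


if __name__ == "__main__":
    # Pages 0..99 are always accessed, pages 100..199 never.
    print(monitor([(0, 100), (100, 200)], lambda page: page < 100, 40, 20))
```

Because each region here is uniformly hot or cold, the result does not depend
on which page is picked, and the script prints [[20, 0], [20, 0]].  The cost
per sampling interval grows with the number of regions rather than with the
number of pages, which is the trade-off the design document describes.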