From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 10 May 2021 21:20:08 -0700 (PDT)
From: David Rientjes
To: Andrew Morton
cc: chukaiping, mcgrof@kernel.org, keescook@chromium.org,
    yzaikin@google.com, vbabka@suse.cz, nigupta@nvidia.com,
    bhe@redhat.com, khalid.aziz@oracle.com, iamjoonsoo.kim@lge.com,
    mateusznosek0@gmail.com, sh_def@163.com,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, Mel Gorman
Subject: Re: [PATCH v4] mm/compaction: let proactive compaction order configurable
In-Reply-To: <20210509171748.8dbc70ceccc5cc1ae61fe41c@linux-foundation.org>
References: <1619576901-9531-1-git-send-email-chukaiping@baidu.com>
    <20210509171748.8dbc70ceccc5cc1ae61fe41c@linux-foundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

On Sun, 9 May 2021, Andrew Morton wrote:

> > Currently the proactive compaction order is fixed to
> > COMPACTION_HPAGE_ORDER(9), it's OK in most machines with lots of
> > normal 4KB memory, but it's too high for the machines with small
> > normal memory, for example the machines with most memory configured
> > as 1GB hugetlbfs huge pages. In these machines the max order of
> > free pages is often below 9, and it's always below 9 even with hard
> > compaction.
> > This will lead to proactive compaction be triggered very
> > frequently. In these machines we only care about order of 3 or 4.
> > This patch export the oder to proc and let it configurable
> > by user, and the default value is still COMPACTION_HPAGE_ORDER.
>
> It would be great to do this automatically? It's quite simple to see
> when memory is being handed out to hugetlbfs - so can we tune
> proactive_compaction_order in response to this? That would be far
> better than adding a manual tunable.
>
> But from having read Khalid's comments, that does sound quite involved.
> Is there some partial solution that we can come up with that will get
> most people out of trouble?
>
> That being said, this patch is super-super-simple so perhaps we should
> just merge it just to get one person (and hopefully a few more) out of
> trouble. But on the other hand, once we add a /proc tunable we must
> maintain that tunable for ever (or at least a very long time) even if
> the internal implementations change a lot.
>

As mentioned in v3 of the patch, I'm not sure why this belongs in the
kernel at all.

I understand that the system is largely consumed by 1GB gigantic pages
and that a small percentage of memory is left for native pages. Thus,
fragmentation readily occurs and can affect large order allocations
even at the levels of order-3 or order-4.

So it seems like the ideal solution would be to monitor the
fragmentation index at the order you care about (the same order you
would use for this new tunable) and root userspace would manually
trigger compaction when necessary.

When this was brought up, it was commented that explicitly triggered
compaction is too expensive to do all in one iteration. That's fair
enough, but shouldn't that be an improvement on explicitly triggered
compaction through sysfs to provide a shorter term (or weaker form) of
compaction rather than build additional policy decisions into the
kernel?

If done this way, there would be a clear separation between mechanism
and policy and the kernel would not need to carry these sysctls to tune
very niche areas.
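
For illustration only, the userspace policy described above might look
roughly like the sketch below. It assumes debugfs is mounted so that
/sys/kernel/debug/extfrag/extfrag_index is readable, and that the
caller is root so it can write to /proc/sys/vm/compact_memory. The
function names, the 0.5 threshold, and the 30 second polling interval
are hypothetical choices, not anything proposed in this thread.

```python
import re
import time

# Assumed locations: extfrag_index needs debugfs mounted; both need root.
EXTFRAG_INDEX = "/sys/kernel/debug/extfrag/extfrag_index"
COMPACT_MEMORY = "/proc/sys/vm/compact_memory"

# One line per zone: "Node 0, zone   Normal -1.000 -1.000 0.812 ..."
LINE_RE = re.compile(r"Node\s+(\d+),\s+zone\s+(\S+)\s+(.*)")


def parse_extfrag_index(text):
    """Map (node, zone) to the list of per-order fragmentation indexes."""
    indexes = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            node, zone, values = match.groups()
            indexes[(int(node), zone)] = [float(v) for v in values.split()]
    return indexes


def zones_needing_compaction(indexes, order, threshold):
    """Return zones whose index at `order` exceeds the (hypothetical) threshold.

    -1.000 means an allocation at that order would succeed; values
    approaching 1 mean a failure would be due to fragmentation.
    """
    return [key for key, values in indexes.items()
            if order < len(values) and values[order] > threshold]


def trigger_compaction(path=COMPACT_MEMORY):
    """Ask the kernel to compact all zones (full, explicit compaction)."""
    with open(path, "w") as f:
        f.write("1\n")


def monitor(order=3, threshold=0.5, interval=30.0):
    """Policy loop: poll the index and compact only when it is needed."""
    while True:
        with open(EXTFRAG_INDEX) as f:
            indexes = parse_extfrag_index(f.read())
        if zones_needing_compaction(indexes, order, threshold):
            trigger_compaction()
        time.sleep(interval)
```

Running monitor(order=3) as root would keep the decision about when to
compact in userspace. Note that writing to compact_memory triggers a
full compaction pass, which is the cost concern raised above; a weaker
or shorter-term trigger would need a new kernel mechanism and is not
shown here.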