Message-ID: <62848b11.1c69fb81.6ce50.2091@mx.google.com>
Date: Wed, 18 May 2022 05:58:39 +0000
From: CGEL
To: Michal Hocko
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, willy@infradead.org,
 shy828301@gmail.com, roman.gushchin@linux.dev, shakeelb@google.com,
 linmiaohe@huawei.com, william.kucharski@oracle.com, peterx@redhat.com,
 hughd@google.com, vbabka@suse.cz, songmuchun@bytedance.com,
 surenb@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 cgroups@vger.kernel.org, Yang Yang
Subject: Re: [PATCH] mm/memcg: support control THP behaviour in cgroup
References: <20220505033814.103256-1-xu.xin16@zte.com.cn>
 <6275d3e7.1c69fb81.1d62.4504@mx.google.com>
 <6278fa75.1c69fb81.9c598.f794@mx.google.com>
 <6279c354.1c69fb81.7f6c1.15e0@mx.google.com>
 <627a5214.1c69fb81.1b7fb.47be@mx.google.com>

On Tue, May 10, 2022 at 03:36:34PM +0200, Michal Hocko wrote:
> On Tue 10-05-22 11:52:51, CGEL wrote:
> > On Tue, May 10, 2022 at 12:00:04PM +0200, Michal Hocko wrote:
> > > On Tue 10-05-22 01:43:38, CGEL wrote:
> > > > On Mon, May 09, 2022 at 01:48:39PM +0200, Michal Hocko wrote:
> > > > > On Mon 09-05-22 11:26:43, CGEL wrote:
> > > > > > On Mon, May 09, 2022 at 12:00:28PM +0200, Michal Hocko wrote:
> > > > > > > On Sat 07-05-22 02:05:25, CGEL wrote:
> > > > > > > [...]
> > > > > > > > If there are many containers to run on one host, and some of them have high
> > > > > > > > performance requirements, administrator could turn on thp for them:
> > > > > > > > # docker run -it --thp-enabled=always
> > > > > > > > Then all the processes in those containers will always use thp.
> > > > > > > > While other containers turn off thp by:
> > > > > > > > # docker run -it --thp-enabled=never
> > > > > > >
> > > > > > > I do not know. The THP config space is already too confusing and complex
> > > > > > > and this just adds on top. E.g. is the behavior of the knob
> > > > > > > hierarchical?
> > > > > > > What is the policy if parent memcg says madivise while
> > > > > > > child says always? How does the per-application configuration aligns
> > > > > > > with all that (e.g. memcg policy madivise but application says never via
> > > > > > > prctl while still uses some madvised - e.g. via library).
> > > > > > >
> > > > > >
> > > > > > The cgroup THP behavior is align to host and totally independent just likes
> > > > > > /sys/fs/cgroup/memory.swappiness. That means if one cgroup config 'always'
> > > > > > for thp, it has no matter with host or other cgroup. This make it simple for
> > > > > > user to understand or control.
> > > > >
> > > > > All controls in cgroup v2 should be hierarchical. This is really
> > > > > required for a proper delegation semantic.
> > > > >
> > > >
> > > > Could we align to the semantic of /sys/fs/cgroup/memory.swappiness?
> > > > Some distributions like Ubuntu is still using cgroup v1.
> > >
> > > cgroup v1 interface is mostly frozen. All new features are added to the
> > > v2 interface.
> > >
> >
> > So what about we add this interface to cgroup v2?
>
> Can you come up with a sane hierarchical behavior?
>
> [...]
> > > > For micro-service architecture, the application in one container is not a
> > > > set of loosely tight processes, it's aim at provide one certain service,
> > > > so different containers means different service, and different service
> > > > has different QoS demand.
> > >
> > > OK, if they are tightly coupled you could apply the same THP policy by
> > > an existing prctl interface. Why is that not feasible. As you are noting
> > > below...
> > >
> > > > 5.containers usually managed by compose software, which treats container as
> > > > base management unit;
> > >
> > > ..so the compose software can easily start up the workload by using prctl
> > > to disable THP for whatever workloads it is not suitable for.
> >
> > prctl(PR_SET_THP_DISABLE..) can not be elegance to support the semantic we
> > need.
> > If only some containers needs THP, other containers and host do not need
> > THP. We must set host THP to always first, and call prctl() to close THP for
> > host tasks and other containers one by one,
>
> It might not be the most elegant solution but it should work.
> Maintaining user interfaces for ever has some cost and the THP
> configuration space is quite large already. So I would rather not add
> more complication in unless that is absolutely necessary.
>

By the way, should prctl() also support a PR_SET_THP_ALWAYS option, in the
same way that PR_TASK_PERF_EVENTS_DISABLE and PR_TASK_PERF_EVENTS_ENABLE
form a pair? That would make it simpler to let certain processes use THP
while others do not.

> > in this process some tasks that start before we call prctl() may
> > already use THP with no need.
>
> As long as all those processes have a common ancestor I do not see how
> that would be possible.
>
> --
> Michal Hocko
> SUSE Labs
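
For reference, the launcher approach discussed above (a common ancestor
sets the flag before it starts the workload) could be sketched in userspace
roughly as follows. This is only an illustration under stated assumptions,
not code from the patch: the helper names set_thp_disable(),
get_thp_disable() and launch_without_thp() are made up for the example, and
the two constants are copied from <linux/prctl.h>. It relies on the
documented behaviour that the THP-disable flag is inherited across fork()
and preserved across execve(). It can only opt a process tree out of THP;
PR_SET_THP_ALWAYS is just a proposal in this thread and is not used here.

```python
import ctypes
import os

# Values from <linux/prctl.h>; PR_SET_THP_DISABLE exists since Linux 3.15.
PR_SET_THP_DISABLE = 41
PR_GET_THP_DISABLE = 42

# Handle to the C library already loaded into this process (Linux only).
_libc = ctypes.CDLL(None, use_errno=True)


def _prctl(option, arg2=0):
    """Call prctl(option, arg2, 0, 0, 0); raise OSError on failure."""
    # The kernel rejects non-zero unused arguments, so pass them explicitly
    # as unsigned long zeros.
    rc = _libc.prctl(
        ctypes.c_int(option),
        ctypes.c_ulong(arg2),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if rc < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return rc


def set_thp_disable(disable):
    """Set (True) or clear (False) the THP-disable flag of this process."""
    _prctl(PR_SET_THP_DISABLE, 1 if disable else 0)


def get_thp_disable():
    """Return True if the THP-disable flag is set for this process."""
    return _prctl(PR_GET_THP_DISABLE) == 1


def launch_without_thp(argv):
    """Fork, disable THP in the child, exec argv, and return its exit code.

    The flag is set before execve(), so the workload and everything it
    forks later start with THP already disabled.
    """
    pid = os.fork()
    if pid == 0:
        try:
            set_thp_disable(True)
            os.execvp(argv[0], argv)
        finally:
            # Reached only if prctl() or execvp() failed.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

A compose tool would call launch_without_thp() once per workload that
should not use THP, for example launch_without_thp(["/bin/sh", "-c",
"exit 0"]). Because the call happens in the common ancestor before exec,
no task in that tree gets a chance to use THP first.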