From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References:
 <20200702152222.2630760-1-shakeelb@google.com>
 <20200703063548.GM18446@dhcp22.suse.cz>
 <20200703155021.GB114903@carbon.dhcp.thefacebook.com>
In-Reply-To: <20200703155021.GB114903@carbon.dhcp.thefacebook.com>
From: Shakeel Butt
Date: Fri, 3 Jul 2020 09:27:19 -0700
Subject: Re: [RFC PROPOSAL] memcg: per-memcg user space reclaim interface
To: Roman Gushchin
Cc: Michal Hocko, Johannes Weiner, Yang Shi, David Rientjes, Greg Thelen,
 Andrew Morton, Linux MM, LKML, Cgroups
Content-Type: text/plain; charset="UTF-8"

On Fri, Jul 3, 2020 at 8:50 AM Roman Gushchin wrote:
>
> On Fri, Jul 03, 2020 at 07:23:14AM -0700, Shakeel Butt wrote:
> > On Thu, Jul 2, 2020 at 11:35 PM Michal Hocko wrote:
> > >
> > > On Thu 02-07-20 08:22:22, Shakeel Butt wrote:
> > > [...]
> > > > Interface options:
> > > > ------------------
> > > >
> > > > 1) memcg interface e.g. 'echo 10M > memory.reclaim'
> > > >
> > > > + simple
> > > > + can be extended to target specific type of memory (anon, file, kmem).
> > > > - most probably restricted to cgroup v2.
> > > >
> > > > 2) fadvise(PAGEOUT) on cgroup_dir_fd
> > > >
> > > > + more general and applicable to other FSes (actually we are using
> > > >   something similar for tmpfs).
> > > > + can be extended in future to just age the LRUs instead of reclaim or
> > > >   some new use cases.
> > >
> > > Could you explain why memory.high as an interface to trigger pro-active
> > > memory reclaim is not sufficient. Also memory.low limit to protect
> > > latency sensitve workloads?
>
> I initially liked the proposal, but after some thoughts I've realized
> that I don't know a good use case where memory.high is less useful.
> Shakeel, what's the typical use case you thinking of?
> Who and how will use the new interface?
>
> >
> > Yes, we can use memory.high to trigger [proactive] reclaim in a memcg
> > but note that it can also introduce stalls in the application running
> > in that memcg. Let's suppose the memory.current of a memcg is 100MiB
> > and we want to reclaim 20MiB from it, we can set the memory.high to
> > 80MiB but any allocation attempt from the application running in that
> > memcg can get stalled/throttled. I want the functionality of the
> > reclaim without potential stalls.
>
> But reclaiming some pagecache/swapping out anon pages can always
> generate some stalls caused by pagefaults, no?
>

Thanks for looking into the proposal. Let me answer both of your
questions together. I have added the two use-cases, but let me explain
proactive reclaim a bit more, as we actually use it in production.

We have defined tolerable refault rates for applications based on
their type (latency sensitive or not). Proactive reclaim is triggered
for each application based on its current refault rate and usage. If
the current refault rate exceeds the tolerable refault rate, we stop
or slow down proactive reclaim.

For the second question: yes, each individual refault can induce a
stall as well, but we have more control over that stall than over
stalls due to reclaim. For us, almost all the reclaimable memory is
anon and we use compression-based swap, so the cost of each refault is
fixed, at a couple of microseconds.

I think the next question is what about refaults from disk or from
another source with highly variable cost. Usually latency-sensitive
applications remove such uncertainty by mlocking the pages backed by
such backends (e.g. mlocking the executable), or at least that is the
case for us.

Thanks,
Shakeel
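
For illustration, the refault-driven policy described in the reply could be sketched in user space roughly as below. This is only a sketch under stated assumptions, not the controller used in production: the function names, the 1% step (`step_divisor=100`), and the linear slowdown as the refault rate approaches the tolerable rate are all hypothetical choices. `memory.reclaim` is the file proposed in this thread, so `request_reclaim()` assumes that interface exists.

```python
import os

MIB = 1024 * 1024


def refault_rate(prev_count, cur_count, interval_s):
    """Refaults per second between two samples of a monotonic counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return max(cur_count - prev_count, 0) / interval_s


def next_reclaim_bytes(usage_bytes, rate, tolerable_rate, step_divisor=100):
    """Bytes to request this round; 0 means stop proactive reclaim.

    Assumption: reclaim at most usage/step_divisor per round and scale it
    down linearly as the refault rate approaches the tolerable rate.
    """
    if tolerable_rate <= 0 or rate >= tolerable_rate:
        return 0  # refault rate too high: stop reclaiming
    base = usage_bytes // step_divisor
    headroom = (tolerable_rate - rate) / tolerable_rate
    return int(base * headroom)  # slow down as headroom shrinks


def request_reclaim(cgroup_dir, nbytes):
    """Write the request to the proposed memory.reclaim file (assumed)."""
    with open(os.path.join(cgroup_dir, "memory.reclaim"), "w") as f:
        f.write(str(nbytes))
```

A caller would sample a refault counter for the memcg (for example a `workingset_refault` value from `memory.stat`, depending on kernel version), compute the rate over the sampling interval, and pass a non-zero result of `next_reclaim_bytes()` to `request_reclaim()`. Unlike lowering memory.high, nothing here throttles the application's own allocations.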