Date: Thu, 21 May 2020 13:23:27 +0100
From: Chris Down
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Tejun Heo, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
 kernel-team@fb.com
Subject: Re: [PATCH] mm, memcg: reclaim more aggressively before high allocator throttling
Message-ID: <20200521122327.GB990580@chrisdown.name>
References: <20200520143712.GA749486@chrisdown.name>
 <20200520160756.GE6462@dhcp22.suse.cz>
 <20200520202650.GB558281@chrisdown.name>
 <20200521071929.GH6462@dhcp22.suse.cz>
 <20200521112711.GA990580@chrisdown.name>
 <20200521120455.GM6462@dhcp22.suse.cz>
In-Reply-To: <20200521120455.GM6462@dhcp22.suse.cz>

(I'll leave the dirty throttling discussion to Johannes, because I'm not so
familiar with that code or its history.)

Michal Hocko writes:
>> > The main problem I see with that approach is that the loop could easily
>> > lead to reclaim unfairness when a heavy producer which doesn't leave the
>> > kernel (e.g. a large read/write call) can keep a different task doing
>> > all the reclaim work. The loop is effectively unbound when there is a
>> > reclaim progress and so the return to the userspace is by no means
>> > proportional to the requested memory/charge.
>>
>> It's not unbound when there is reclaim progress, it stops when we are within
>> the memory.high throttling grace period. Right after reclaim, we check if
>> penalty_jiffies is less than 10ms, and abort any further reclaim or
>> allocator throttling:
>
>Just imagine that you have parallel producers increasing the high limit
>excess while somebody reclaims those. Sure in practice the loop will be
>bounded but the reclaimer might perform much more work on behalf of
>other tasks.

A cgroup is a unit and breaking it down into "reclaim fairness" for individual
tasks like this seems suspect to me.
For example, if one task in a cgroup is leaking unreclaimable memory like crazy,
everyone in that cgroup is going to be penalised by allocator throttling as a
result, even if they aren't "responsible" for that reclaim.

So the options here are as follows when a cgroup is over memory.high and a
single reclaim isn't enough:

1. Decline further reclaim. Instead, throttle for up to 2 seconds.
2. Keep on reclaiming. Only throttle if we can't get back under memory.high.

The outcome of your suggestion to decline further reclaim is case #1, which is
significantly more practically "unfair" to that task. Throttling is extremely
disruptive to tasks and should be a last resort when we've exhausted all other
practical options. It shouldn't be something you get just because you didn't
try to reclaim hard enough.
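
To make the difference between the two options concrete, here is a rough
userspace model in C. It is only a sketch and not the kernel code: the struct,
the function names, the retry bound and the "1 ms of penalty per page over
high" formula are all assumptions made up for illustration. Only the 10ms
grace period and the 2 second throttling cap come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the discussion above. */
#define GRACE_PERIOD_MS   10    /* penalties below this are ignored */
#define MAX_THROTTLE_MS   2000  /* cap on a single throttling sleep */
/* Hypothetical bound on the retry loop, chosen only for this model. */
#define MAX_RECLAIM_TRIES 16

/* Hypothetical stand-in for a cgroup; not a kernel structure. */
struct cgroup_model {
	long usage;            /* pages currently charged */
	long high;             /* memory.high, in pages */
	long reclaim_per_pass; /* pages a single reclaim pass can free */
};

/*
 * Toy penalty: 1 ms per page over high, capped at MAX_THROTTLE_MS.
 * The kernel's actual penalty calculation is not reproduced here.
 */
static long penalty_ms(const struct cgroup_model *cg)
{
	long over = cg->usage - cg->high;

	if (over <= 0)
		return 0;
	return over > MAX_THROTTLE_MS ? MAX_THROTTLE_MS : over;
}

/* One reclaim pass: frees at most reclaim_per_pass pages of the excess. */
static long reclaim_once(struct cgroup_model *cg)
{
	long over = cg->usage - cg->high;
	long freed;

	if (over <= 0)
		return 0;
	freed = cg->reclaim_per_pass < over ? cg->reclaim_per_pass : over;
	cg->usage -= freed;
	return freed;
}

/*
 * Returns how long (in ms) the allocating task would be throttled.
 * keep_reclaiming == false models option 1 (one reclaim pass, then throttle);
 * keep_reclaiming == true models option 2 (retry while progress is made).
 */
static long handle_over_high(struct cgroup_model *cg, bool keep_reclaiming)
{
	int tries = keep_reclaiming ? MAX_RECLAIM_TRIES : 1;

	assert(cg->high >= 0 && cg->reclaim_per_pass >= 0);

	while (tries-- > 0) {
		long freed = reclaim_once(cg);

		/* Within the grace period: no more reclaim, no throttling. */
		if (penalty_ms(cg) < GRACE_PERIOD_MS)
			return 0;
		/* No progress (e.g. unreclaimable memory): stop retrying. */
		if (freed == 0)
			break;
	}
	return penalty_ms(cg);
}
```

In this model, a cgroup with usage=1500, high=1000 and reclaim_per_pass=100 is
throttled for 400 ms under option 1 but not at all under option 2, because the
extra passes bring it back to memory.high. When nothing can be reclaimed
(reclaim_per_pass=0), option 2 still ends in throttling, so it remains the
last resort rather than being removed.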