Date: Thu, 21 May 2020 13:57:59 +0100
From: Chris Down
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Tejun Heo, linux-mm@kvack.org,
    cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm, memcg: reclaim more aggressively before high allocator throttling
Message-ID: <20200521125759.GD990580@chrisdown.name>
References: <20200520143712.GA749486@chrisdown.name>
    <20200520160756.GE6462@dhcp22.suse.cz>
    <20200520202650.GB558281@chrisdown.name>
    <20200521071929.GH6462@dhcp22.suse.cz>
    <20200521112711.GA990580@chrisdown.name>
    <20200521120455.GM6462@dhcp22.suse.cz>
    <20200521122327.GB990580@chrisdown.name>
    <20200521123742.GO6462@dhcp22.suse.cz>
In-Reply-To: <20200521123742.GO6462@dhcp22.suse.cz>

Michal Hocko writes:
>> A cgroup is a unit and breaking it down into "reclaim fairness" for
>> individual tasks like this seems suspect to me. For example, if one task in
>> a cgroup is leaking unreclaimable memory like crazy, everyone in that cgroup
>> is going to be penalised by allocator throttling as a result, even if they
>> aren't "responsible" for that reclaim.
>
>You are right, but that doesn't mean that it is desirable that some
>tasks would be throttled unexpectedly too long because of the other's activity.

Are you really talking about throttling, or reclaim? If throttling, tasks are
already throttled proportionally to how much this allocation is contributing to
the overage in calculate_high_delay.

If you're talking about reclaim, trying to reason about whether the overage is
the result of some other task in this cgroup or the task that's allocating right
now is something that we already know doesn't work well (eg. global OOM).

>> So the options here are as follows when a cgroup is over memory.high and a
>> single reclaim isn't enough:
>>
>> 1. Decline further reclaim.
>>    Instead, throttle for up to 2 seconds.
>> 2. Keep on reclaiming. Only throttle if we can't get back under memory.high.
>>
>> The outcome of your suggestion to decline further reclaim is case #1, which
>> is significantly more practically "unfair" to that task. Throttling is
>> extremely disruptive to tasks and should be a last resort when we've
>> exhausted all other practical options. It shouldn't be something you get
>> just because you didn't try to reclaim hard enough.
>
>I believe I have asked in other email in this thread. Could you explain
>why enforcing the requested target (memcg_nr_pages_over_high) is
>insufficient for the problem you are dealing with? Because that would
>make sense for large targets to me while it would keep relatively
>reasonable semantic of the throttling - aka proportional to the memory
>demand rather than the excess.

memcg_nr_pages_over_high is related to the charge size. As such, if you're way
over memory.high as a result of transient reclaim failures, but the majority of
your charges are small, it's going to be hard to make meaningful progress:

1. Most nr_pages will be MEMCG_CHARGE_BATCH, which is not enough to help;
2. Large allocations will only get a single reclaim attempt to succeed.

As such, in many cases we're either doomed to successfully reclaim a paltry
amount of pages, or fail to reclaim a lot of pages. Asking try_to_free_pages()
to deal with those huge allocations is generally not reasonable, regardless of
the specifics of why it doesn't work in this case.
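
To put rough numbers on the two cases above, here is a toy model. It is not
kernel code and makes no claim about what the patch does: reclaim_pass(),
overage_after_charges(), overage_after_retries(), MAX_RETRIES, the value used
for MEMCG_CHARGE_BATCH and the fixed success rate are all assumptions made up
for illustration.

```python
# Toy model only: none of this is kernel code, and every number is made up.
MEMCG_CHARGE_BATCH = 32  # pages per batched charge (illustrative value)
MAX_RETRIES = 16         # hypothetical cap on reclaim passes


def reclaim_pass(target, success_rate):
    """One reclaim pass: frees a fixed fraction of the requested target."""
    return int(target * success_rate)


def overage_after_charges(overage, nr_pages, nr_charges, success_rate):
    """Each charge adds nr_pages, then gets one pass targeting nr_pages."""
    for _ in range(nr_charges):
        overage += nr_pages
        overage -= min(overage, reclaim_pass(nr_pages, success_rate))
    return overage


def overage_after_retries(overage, target, success_rate, max_retries=MAX_RETRIES):
    """Repeat passes until under the limit, out of retries, or no progress."""
    for _ in range(max_retries):
        if overage <= 0:
            break
        reclaimed = min(overage, reclaim_pass(target, success_rate))
        if reclaimed == 0:
            break
        overage -= reclaimed
    return overage


if __name__ == "__main__":
    # Case 1: 100 batch-sized charges while already 4096 pages over.
    print(overage_after_charges(4096, MEMCG_CHARGE_BATCH, 100, 1.0))
    # Case 2: one 4096-page charge, single attempt, 25% of the target freed.
    print(overage_after_charges(0, 4096, 1, 0.25))
    # Same leftover overage, but the pass is allowed to repeat.
    print(overage_after_retries(3072, 4096, 0.25))
```

With these made-up inputs, the batch-sized charges never shrink the overage
even when every pass fully succeeds (it stays at 4096 pages), the single large
attempt leaves 3072 pages over, and repeating the same pass clears that
remainder in three more passes.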