Date: Wed, 18 Mar 2020 10:55:14 +0100
From: Michal Hocko
To: Robert Kolchmeyer
Cc: David Rientjes, Andrew Morton, Vlastimil Babka, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ami Fischman
Subject: Re: [patch] mm, oom: make a last minute check to prevent unnecessary memcg oom kills
Message-ID: <20200318095514.GF21362@dhcp22.suse.cz>
References: <20200310221938.GF8447@dhcp22.suse.cz>

On Tue 17-03-20 11:25:52, Robert Kolchmeyer wrote:
> On Tue, Mar 10, 2020 at 3:54 PM David Rientjes wrote:
> >
> > Robert, could you elaborate on the user-visible effects of this issue that
> > caused it to initially get reported?
> >
>
> Ami (now cc'ed) knows more, but here is my understanding. The use case
> involves a Docker container running multiple processes. The container
> has a memory limit set. The container contains two long-lived,
> important processes p1 and p2, and some arbitrary, dynamic number of
> usually ephemeral processes p3,...,pn. These processes are structured
> in a hierarchy that looks like p1->p2->[p3,...,pn]; p1 is a parent of
> p2, and p2 is the parent for all of the ephemeral processes p3,...,pn.
>
> Since p1 and p2 are long-lived and important, the user does not want
> p1 and p2 to be oom-killed. However, p3,...,pn are expected to use a
> lot of memory, and it's ok for those processes to be oom-killed.
>
> If the user sets oom_score_adj on p1 and p2 to make them very unlikely
> to be oom-killed, p3,...,pn will inherit the oom_score_adj value,
> which is bad.
> Additionally, setting oom_score_adj on p3,...,pn is
> tricky, since processes in the Docker container (specifically p1 and
> p2) don't have permissions to set oom_score_adj on p3,...,pn. The
> ephemeral nature of p3,...,pn also makes setting oom_score_adj on them
> tricky after they launch.

Thanks for the clarification.

> So, the user hopes that when one of p3,...,pn triggers an oom
> condition in the Docker container, the oom killer will almost always
> kill processes from p3,...,pn (and not kill p1 or p2, which are both
> important and unlikely to trigger an oom condition). The issue of more
> processes being killed than are strictly necessary is resulting in p1
> or p2 being killed much more frequently when one of p3,...,pn triggers
> an oom condition, and p1 or p2 being killed is very disruptive for the
> user (my understanding is that p1 or p2 going down with high frequency
> results in significant unhealthiness in the user's service).

Do you have any logs showing this condition? I am interested because,
from your description, it seems like p1/p2 usually shouldn't be the ones
that trigger the oom, right? That suggests that it should mostly be
p3,...,pn that are in the kernel triggering the oom, and therefore they
shouldn't vanish.

-- 
Michal Hocko
SUSE Labs