From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shakeel Butt
Date: Thu, 15 Jul 2021 11:22:05 -0700
Subject: Re: [PATCH] memcg: charge semaphores and sem_undo objects
To: Matthew Wilcox
Cc: Yutian Yang, Michal Hocko, Johannes Weiner, Vladimir Davydov, Cgroups, Linux MM, shenwenbo@zju.edu.cn
References: <1626333284-1404-1-git-send-email-nglaive@gmail.com>

On Thu, Jul 15, 2021 at 10:50 AM Matthew Wilcox wrote:
>
> On Thu, Jul 15, 2021 at 03:14:44AM -0400, Yutian Yang wrote:
> > This patch adds accounting flags to semaphores and sem_undo allocation
> > sites so that kernel could correctly charge these objects.
> >
> > A malicious user could take up more than 63GB unaccounted memory under
> > default sysctl settings by exploiting the unaccounted objects. She could
> > allocate up to 32,000 unaccounted semaphore sets with up to 32,000
> > unaccounted semaphore objects in each set. She could further allocate one
> > sem_undo unaccounted object for each semaphore set.
>
> Do we really have to account every object that's allocated on behalf of
> userspace? ie how seriously do we take this kind of thing? Are memcgs
> supposed to be a hard limit, or are they just a rough accounting thing?

Memcgs are used to provide isolation between different workloads
running on the system, not just a rough accounting estimate. So, if
there is an unbounded allocation which can be triggered by userspace,
then it should be accounted.

>
> There could be a very large stream of patches turning GFP_KERNEL into
> GFP_KERNEL_ACCOUNT. For example, file locks (fs/locks.c) are only
> allocated with GFP_KERNEL and you can allocate one lock per byte of a
> file. I'm sure there are hundreds more places where we do similar things.

We used to do opt-out kmem memcg accounting but switched to opt-in with
a9bb7e620efdf ("memcg: only account kmem allocations marked as
__GFP_ACCOUNT"), on the grounds that the allocations which should not be
charged outnumber the allocations which should be charged. Personally I
would prefer we go back to opt-out accounting, especially after we have
switched to reparenting the kmem charges and shared kmem caches.
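
To make the opt-in versus opt-out distinction concrete, here is a small
userspace toy model. It is only a sketch, not kernel code. The flag
values, the ModelGfp and Policy names and the NOACCOUNT marker are
hypothetical and exist only for this illustration. The one thing taken
from the kernel is that GFP_KERNEL_ACCOUNT is GFP_KERNEL with
__GFP_ACCOUNT set. The object count uses the limits quoted in the patch
description (32,000 sets with 32,000 semaphores each).

```python
from enum import Enum, IntFlag


class ModelGfp(IntFlag):
    """Hypothetical flag bits for this toy model; NOT the kernel's values."""
    KERNEL = 0x1
    ACCOUNT = 0x2    # stands in for __GFP_ACCOUNT (opt-in marker)
    NOACCOUNT = 0x4  # hypothetical opt-out marker, assumed for illustration


# Same composition as the kernel's GFP_KERNEL_ACCOUNT definition.
MODEL_GFP_KERNEL_ACCOUNT = ModelGfp.KERNEL | ModelGfp.ACCOUNT


class Policy(Enum):
    OPT_IN = "opt-in"    # charge only call sites that ask for it
    OPT_OUT = "opt-out"  # charge everything unless a call site declines


def is_charged(policy: Policy, flags: ModelGfp) -> bool:
    """Return True if an allocation with these flags is charged to the memcg."""
    if policy is Policy.OPT_IN:
        return bool(flags & ModelGfp.ACCOUNT)
    return not (flags & ModelGfp.NOACCOUNT)


def uncharged_objects(policy: Policy, flags: ModelGfp, nr_objects: int) -> int:
    """Count the objects from one call site that escape the memcg charge."""
    return 0 if is_charged(policy, flags) else nr_objects


# Limits quoted in the patch description: 32,000 sets x 32,000 semaphores.
NR_SEMS = 32_000 * 32_000

if __name__ == "__main__":
    call_sites = (
        ("GFP_KERNEL", ModelGfp.KERNEL),
        ("GFP_KERNEL_ACCOUNT", MODEL_GFP_KERNEL_ACCOUNT),
    )
    for policy in Policy:
        for name, flags in call_sites:
            escaped = uncharged_objects(policy, flags, NR_SEMS)
            print(f"{policy.value:8} {name:20} uncharged objects: {escaped}")
```

In this model, an unmarked GFP_KERNEL call site leaves every semaphore
object uncharged under opt-in, which is why each such site needs its own
patch. Under opt-out the same unmarked site is charged by default, and
only the sites that must not be charged need to be marked.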