Date: Wed, 5 Oct 2022 07:42:55 -1000
From: Tejun Heo
To: Yosry Ahmed
Cc: Zefan Li, Johannes Weiner, Michal Hocko, Shakeel Butt, Roman Gushchin, Michal Koutný, Andrew Morton, Linux-MM, Cgroups, Greg Thelen
Subject: Re: [RFC] memcg rstat flushing optimization

Hello,

On Wed, Oct 05, 2022 at 10:20:54AM -0700, Yosry Ahmed wrote:
> > How long were the stalls? Given that rstats are usually flushed by its
>
> I think 10 seconds while interrupts are disabled is what we need for a
> hard lockup, right?

Oh man, that's a long while. I'd really like to learn more about the
numbers. How many cgroups are being flushed across how many CPUs?

> IIUC you mean that the caller of cgroup_rstat_flush() can call a
> different variant that only flushes a part of the rstat tree then
> returns, and the caller makes several calls interleaved by re-enabling
> irq, right? Because the flushing code seems to already do this
> internally if the non irqsafe version is used.

I was thinking more of that being done inside the flush function.

> I think this might be tricky. In this case the path that caused the
> lockup was memcg_check_events()->mem_cgroup_threshold()->__mem_cgroup_threshold()->mem_cgroup_usage()->mem_cgroup_flush_stats().
> Interrupts are disabled by callers of memcg_check_events(), but the
> rstat flush call is made much deeper in the call stack. Whoever is
> disabling interrupts doesn't have access to pause/resume flushing.

Hmm.... yeah I guess it's worthwhile to experiment with selective flushing
for specific paths. That said, we'd still need to address the whole flush
taking long too.
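
To make "done inside the flush function" concrete, here is a rough userspace
sketch, not kernel code. Everything in it (struct sim_rstat, sim_flush(),
the lock stand-ins, NR_CPUS, the batch parameter) is hypothetical and only
illustrates the shape of the idea: the flush walks the CPUs itself and opens
an interrupt window between batches, so callers don't need a pause/resume
interface.

```c
#include <assert.h>

#define NR_CPUS 8 /* hypothetical CPU count for this simulation */

/* Hypothetical stand-in for per-CPU stat deltas awaiting a flush. */
struct sim_rstat {
	long pending[NR_CPUS]; /* per-CPU deltas not yet folded in */
	long total;            /* flushed, globally visible value */
	int irq_windows;       /* times the lock was dropped mid-flush */
};

/* Stand-ins for taking and dropping an irq-disabling lock. */
static void sim_lock_irq(struct sim_rstat *s) { (void)s; }
static void sim_unlock_irq(struct sim_rstat *s) { (void)s; }

/*
 * Fold every CPU's pending delta into the total. After each `batch`
 * CPUs (but not after the last one) drop and retake the lock, which
 * in a real kernel would give pending interrupts a chance to run.
 */
static void sim_flush(struct sim_rstat *s, int batch)
{
	assert(batch > 0);

	sim_lock_irq(s);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		s->total += s->pending[cpu];
		s->pending[cpu] = 0;

		if ((cpu + 1) % batch == 0 && cpu + 1 < NR_CPUS) {
			sim_unlock_irq(s);
			s->irq_windows++;
			sim_lock_irq(s);
		}
	}
	sim_unlock_irq(s);
}
```

With eight CPUs and a batch of two, sim_flush() opens three windows and still
produces the full total. This only helps callers that can tolerate interrupts
being re-enabled mid-flush, which is exactly what the memcg_check_events()
path above cannot do.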

> There are also other code paths that used to use
> cgroup_rstat_flush_irqsafe() directly before mem_cgroup_flush_stats()
> was introduced like mem_cgroup_wb_stats() [1].
>
> This is why I suggested a selective flushing variant of
> cgroup_rstat_flush_irqsafe(), so that flushers that need irq disabled
> have the ability to only flush a subset of the stats to avoid long
> stalls if possible.

I have nothing against selective flushing but it's not a free thing to do
both in terms of complexity and runtime overhead, so let's get some numbers
on how much time is spent where.

Thanks.

--
tejun