Date: Fri, 18 Aug 2023 18:19:13 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc: Yu Zhao, Nhat Pham, akpm@linux-foundation.org, kernel-team@meta.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, stable@vger.kernel.org
Subject: Re: [PATCH v2] workingset: ensure memcg is valid for recency check
Message-ID: <20230818221913.GA144640@cmpxchg.org>
References: <20230818134906.GA138967@cmpxchg.org> <20230818173544.GA142196@cmpxchg.org> <20230818183538.GA142974@cmpxchg.org>
X-Mailing-List: stable@vger.kernel.org

On Fri, Aug 18, 2023 at 11:44:45AM -0700, Yosry Ahmed wrote:
> On Fri, Aug 18, 2023 at 11:35 AM Johannes Weiner wrote:
> >
> > On Fri, Aug 18, 2023 at 10:45:56AM -0700, Yosry Ahmed wrote:
> > > On Fri, Aug 18, 2023 at 10:35 AM Johannes Weiner wrote:
> > > > On Fri, Aug 18, 2023 at 07:56:37AM -0700, Yosry Ahmed wrote:
> > > > > If this happens it seems possible for this to happen:
> > > > >
> > > > > cpu #1                  cpu#2
> > > > >                         css_put()
> > > > >                         /* css_free_rwork_fn is queued */
> > > > > rcu_read_lock()
> > > > > mem_cgroup_from_id()
> > > > >                         mem_cgroup_id_remove()
> > > > > /* access memcg */
> > > >
> > > > I don't quite see how that'd possible. IDR uses rcu_assign_pointer()
> > > > during deletion, which inserts the necessary barriering. My
> > > > understanding is that this should always be safe:
> > > >
> > > >   rcu_read_lock()                 (writer serialization, in this case ref count == 0)
> > > >   foo = idr_find(x)               idr_remove(x)
> > > >   if (foo)                        kfree_rcu(foo)
> > > >     LOAD(foo->bar)
> > > >   rcu_read_unlock()
> > >
> > > How does a barrier inside IDR removal protect against the memcg being
> > > freed here though?
> > >
> > > If css_put() is executed out-of-order before mem_cgroup_id_remove(),
> > > the memcg can be freed even before mem_cgroup_id_remove() is called,
> > > right?
> >
> > css_put() can start earlier, but it's not allowed to reorder the rcu
> > callback that frees past the rcu_assign_pointer() in idr_remove().
> >
> > This is what RCU and its access primitives guarantees. It ensures that
> > after "unpublishing" the pointer, all concurrent RCU-protected
> > accesses to the object have finished, and the memory can be freed.
>
> I am not sure I understand, this is the scenario I mean:
>
> cpu#1                  cpu#2                     cpu#3
> css_put()
> /* schedule free */
>                        rcu_read_lock()
> idr_remove()
>                        mem_cgroup_from_id()
>
>                                                  /* free memcg */
>                        /* use memcg */

idr_remove() cannot be re-ordered after scheduling the free. Think
about it, this is the common rcu-freeing pattern:

	rcu_assign_pointer(p, NULL);
	call_rcu(rh, free_pointee);

on the write side, and:

	rcu_read_lock();
	pointee = rcu_dereference(p);
	if (pointee)
		do_stuff(pointee);
	rcu_read_unlock();

on the read side.

In our case, the rcu_assign_pointer() is in idr_remove(). And the
rcu_dereference() is in mem_cgroup_from_id() -> idr_find() ->
radix_tree_lookup() -> radix_tree_descend().

So if we find the memcg in the idr under rcu lock, the cgroup rcu work
is guaranteed to not run until the lock is dropped. If we don't find
it, it may or may not have already run.
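
The two outcomes above can be walked through with a small userspace
model. This is only a sketch, not kernel code: it is single-threaded,
every toy_* name and both scenario functions are made up for
illustration, and the "grace period" is modelled conservatively as "no
reader is inside a read-side section", whereas real RCU only waits for
readers that started before the callback was queued. The "free" just
sets a flag so the model can inspect the object afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the memcg; "freed" replaces a real free. */
struct obj {
	bool freed;
	int bar;
};

/* Hypothetical toy state: one published slot, one queued callback. */
struct toy_rcu {
	struct obj *published;    /* stands in for the idr slot */
	int readers;              /* readers inside the read-side section */
	struct obj *pending_free; /* object queued by the toy call_rcu() */
};

static void toy_read_lock(struct toy_rcu *r)
{
	r->readers++;
}

static void toy_read_unlock(struct toy_rcu *r)
{
	assert(r->readers > 0);
	r->readers--;
}

/* Read side: stands in for idr_find() and its rcu_dereference(). */
static struct obj *toy_lookup(struct toy_rcu *r)
{
	return r->published;
}

/* Write side: unpublish first, and only then queue the free. */
static void toy_remove_and_queue_free(struct toy_rcu *r)
{
	struct obj *o = r->published;

	r->published = NULL;  /* models rcu_assign_pointer(p, NULL) */
	r->pending_free = o;  /* models call_rcu(rh, free_pointee) */
}

/* Toy grace period: the queued free runs only with no readers active. */
static bool toy_try_run_callbacks(struct toy_rcu *r)
{
	if (r->readers > 0 || !r->pending_free)
		return false;
	r->pending_free->freed = true;
	r->pending_free = NULL;
	return true;
}

/* Reader finds the object, then the writer removes it concurrently. */
static bool scenario_found_before_remove(void)
{
	struct obj o = { .freed = false, .bar = 42 };
	struct toy_rcu r = { .published = &o, .readers = 0, .pending_free = NULL };

	toy_read_lock(&r);
	struct obj *p = toy_lookup(&r);
	toy_remove_and_queue_free(&r);
	bool ran_early = toy_try_run_callbacks(&r); /* refused: reader active */
	bool live_at_use = p != NULL && !p->freed && p->bar == 42;
	toy_read_unlock(&r);
	bool ran_late = toy_try_run_callbacks(&r);  /* allowed after unlock */

	return !ran_early && live_at_use && ran_late && o.freed;
}

/* Reader looks up after removal: it sees NULL and never touches the object. */
static bool scenario_lookup_after_remove(void)
{
	struct obj o = { .freed = false, .bar = 42 };
	struct toy_rcu r = { .published = &o, .readers = 0, .pending_free = NULL };

	toy_remove_and_queue_free(&r);
	toy_read_lock(&r);
	struct obj *p = toy_lookup(&r);
	toy_read_unlock(&r);

	return p == NULL;
}
```

In the first scenario the lookup succeeds, so the queued free is held
back until the reader unlocks. In the second the lookup returns NULL,
and whether the free has already run no longer matters to the reader.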