From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8508CEE4992 for ; Fri, 18 Aug 2023 22:26:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id CC089280066; Fri, 18 Aug 2023 18:26:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C7041940012; Fri, 18 Aug 2023 18:26:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B37E9280066; Fri, 18 Aug 2023 18:26:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id A4C15940012 for ; Fri, 18 Aug 2023 18:26:44 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 6591D1CA363 for ; Fri, 18 Aug 2023 22:26:44 +0000 (UTC) X-FDA: 81138661128.18.C8A0AA7 Received: from mail-ej1-f52.google.com (mail-ej1-f52.google.com [209.85.218.52]) by imf07.hostedemail.com (Postfix) with ESMTP id 861894000F for ; Fri, 18 Aug 2023 22:26:42 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b="uHeLi/hW"; spf=pass (imf07.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.52 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1692397602; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nGG2YTEbV1sS/xdnpOKbesAlwjX46+7FJ2zNOOjpqUM=; b=QiS2/cxOgyUnDOkDVYrOUWZyq4YrrLgdwpzajoOVAor7YkUZe0nN8bAiqkHrVIOkcRbvcY RJQ/mCCHxUpKWqdF8DZUWH9oc4WDtOOz6JaIH2Lmxe+dBttb2846BMSPLv+PTN1q1PzIcn 9rBZfyNE+UI/mjrTKeq1F6Tme4v2rw0= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1692397602; a=rsa-sha256; cv=none; b=bQvRasMLzgRBty71P0kX+KxW4j2Pbs41Due3uBRyl6dhDCjy0IdsNXREsyW2JsGvQaFvtZ sfz3/QUDgauHmS4SojjuHntjM59u5BYmBVncDo7CTgi/viobohPoq/yA1JCtGZ8MpAljVN ENRgfr962MQu6BYqgwYW3DC9iCGj1W8= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b="uHeLi/hW"; spf=pass (imf07.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.52 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-ej1-f52.google.com with SMTP id a640c23a62f3a-9936b3d0286so184679366b.0 for ; Fri, 18 Aug 2023 15:26:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1692397601; x=1693002401; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=nGG2YTEbV1sS/xdnpOKbesAlwjX46+7FJ2zNOOjpqUM=; b=uHeLi/hW9WyEjgHsuAsFaMG+fs1T6pxWFAs8ivxHPYXaKihZmoJItXIkQbmzHWFMAT ciBTV/rjKmjPXl3S8ZDaaNy+IoYzYyU0Ofh3zgBd5QfxaoaBDLphbTFKqWSvN5frFj1z 2Z5O/jZHU2wImK6A/EdD2+LI4h2jZmnBykwyOiSp4BvCJUFR2+imyamroV5IC1acJFDY KZKarI45j000VcQdXoU95NvxrCEwjZ4eFIn10dd6B/yYLUUZDR+L9N8liLj+T9W6qAPv P5Lqw/mwR0siOlAcybKiYLm6vdZvjpg+aATYvkwMWXNuzdh4nAlePlkxT4p+giRUdVT8 D6Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692397601; x=1693002401; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nGG2YTEbV1sS/xdnpOKbesAlwjX46+7FJ2zNOOjpqUM=; b=OH9dXSR9pO0YvDdZ+NEcYQY7Y0qgWCH9ph6OJ8Cm6oMeoaT5SJWThfwjFKs5GCDyQN Ewmce8MgL+N6hIjI0H/mCJVSYT/GU8z0ASTMaVkbDdXyrEQdYflNBJxDZKx1Bpo4XiOg kMcMCSQDyFFwK9Q4jssV4FoDM+UjjP81TT8vmQERHHlOJVWq5WOumJZRRWIXPVmIBM/c ImsBkG4g9O8ERlHrb+wx0GIFsDoaTBDXmP3hndoFTDUpNf61PvOXyvZMHKn83O3UzpMZ lY8ZwmM0OcyMCsZu5UiOmQtFrZzIrHOLaH6hmwCkErli22ZsiNV8iDWmvDFD4zH1rUTo kzYg== X-Gm-Message-State: AOJu0YxB5aFOLe+AWRgJ4pginvth4ihFelUfYba/8oalnabimazKfBqQ Fo12n8c5680lxHOVlWuhgNShUAD1QQ1MpCxTbegFsw== X-Google-Smtp-Source: AGHT+IFzee4khZqjKncgnJQtEE6TR6TELVpbhgbz22fzVOgHMWG/YGc9nISJ46q9FbU+nurB1+Pz85C483HGoDuNb+s= X-Received: by 2002:a17:907:7891:b0:99b:5445:10d5 with SMTP id ku17-20020a170907789100b0099b544510d5mr342524ejc.51.1692397600920; Fri, 18 Aug 2023 15:26:40 -0700 (PDT) MIME-Version: 1.0 References: <20230818134906.GA138967@cmpxchg.org> <20230818173544.GA142196@cmpxchg.org> <20230818183538.GA142974@cmpxchg.org> <20230818221913.GA144640@cmpxchg.org> In-Reply-To: <20230818221913.GA144640@cmpxchg.org> From: Yosry Ahmed Date: Fri, 18 Aug 2023 15:26:04 -0700 Message-ID: Subject: Re: [PATCH v2] workingset: ensure memcg is valid for recency check To: Johannes Weiner Cc: Yu Zhao , Nhat Pham , akpm@linux-foundation.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, stable@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: ckkb7sszpp83pbs4dgok1mmximp9sy9i X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 861894000F X-Rspam-User: X-HE-Tag: 1692397602-57808 X-HE-Meta: U2FsdGVkX1+QAFoUBroRO5EZHpCX2fJbarK0Ma0NzpQJq65GwWREHlVUKMMHw1agXt7AbdCroVx4jmKITfCZYwl7Te4F5I7i9qxUFYWnvPBf/SvhmkaTqd9QWgxl6BeYz0pCecFcl29BW0yE4bmT4KHiduJ87KgXUQvlF93lxSoaVBADc6OLX4Y/YSHvguX2HrYNtJ0S/sMFKP1iZhopRr4OYMQvVkjd/y8XclgteoQzqAA/Z+ZtkHwnQIaNml1DAFapOiH+cM5Gba59T3oBEsS6pwlBzKFqpgdGThddjc8dsc5OwUBc1xLjdL07cIOLs8Qru15QVkRp6HlDQzWlMFiJ0LxjzhRVznTIK0MeCQJMoUf50d9Gu2cfy7UGy17dhzuovfReOMJjZZUE3UAIyOxD8lFLS7+Q61APdPlQH+UD7EV7/05IYrG7LC795f5MfiSXqxErnWUppV1ITPyreiMwfZPRNJkKS04tNDdlfBidqPhLdLRIiiSyOmeBHiHyY/pHrn24AL5WPmV5r78+mHYNcGOAPUz/EnOKH3qzzUab/9r6hyg4BQwYLPCKyW15UQmoH5yiC3wB0PVtDAidoXhHaRyFJrPCPjkuD/5S+fCk6MRZdgagLBF8hWBJt9swLs2YtOZ7wRWgcWxIvbjbCRQSqqKwvhWIUH2oXWWEQhy459wMmZI+fzKVPNfC9wNbUwJs5WicnB0Q/ZhMoA1aEBu4xH6ejHO4gYikECi7C44zoaae+7Fpzbcs2D4D+o6Li2pcYnorwtnaEnV6gnexojTldaoLP6makwPiXCYtweuQcHcXAkT/HveOPUOLD52NNl7uCK6cEB+8D3UIxS5JXp7y+78diftaBgwXZMOTAIVoOrwRJyBbxlEz3x9H3LD+GSiOXkvAWu/aYvOXzYrJbXTrjbRcXZL43wmAbSpG2yuZ61XDoOliBdZkv3L6Eumv6p66HVXk8Xi/IDdtlRg HmNWbcQC +Sja9AtFIRn1LyBhLhQ9b/pP7teTmlUXY0bDseHhfLOSl4d9VclYqnfHUgxm6qw+I3HISXwfdRzNuCGmcHiR/HX7pTUsxuTTHG4gUfuohVmtWcoYWoZqLR8qppe7jLvCMyVfDkCmEVAeyjEjxUfQ2kYXK8IGBl/DRsJlBLDdH40zh7PLED+hvM+naHqYa571c/pDg X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Aug 18, 2023 at 3:19=E2=80=AFPM Johannes Weiner wrote: > > On Fri, Aug 18, 2023 at 11:44:45AM -0700, Yosry Ahmed wrote: > > On Fri, Aug 18, 2023 at 11:35=E2=80=AFAM Johannes Weiner wrote: > > > > > > On Fri, Aug 18, 2023 at 10:45:56AM -0700, Yosry Ahmed wrote: > > > > On Fri, Aug 18, 2023 at 10:35=E2=80=AFAM Johannes Weiner wrote: > > > > > On Fri, Aug 18, 2023 at 07:56:37AM -0700, Yosry Ahmed wrote: > > > > > > If this happens it seems possible for this to happen: > > > > > > > > > > > > cpu #1 cpu#2 > > > > > > css_put() > > > > > > /* css_free_rwork_= fn is queued */ > > > > > > rcu_read_lock() > > > > > > mem_cgroup_from_id() > > > > > > mem_cgroup_id_remo= ve() > > > > > > /* access memcg */ > > > > > > > > > > I don't quite see how that'd possible. IDR uses rcu_assign_pointe= r() > > > > > during deletion, which inserts the necessary barriering. My > > > > > understanding is that this should always be safe: > > > > > > > > > > rcu_read_lock() (writer serialization, in this = case ref count =3D=3D 0) > > > > > foo =3D idr_find(x) idr_remove(x) > > > > > if (foo) kfree_rcu(foo) > > > > > LOAD(foo->bar) > > > > > rcu_read_unlock() > > > > > > > > How does a barrier inside IDR removal protect against the memcg bei= ng > > > > freed here though? > > > > > > > > If css_put() is executed out-of-order before mem_cgroup_id_remove()= , > > > > the memcg can be freed even before mem_cgroup_id_remove() is called= , > > > > right? > > > > > > css_put() can start earlier, but it's not allowed to reorder the rcu > > > callback that frees past the rcu_assign_pointer() in idr_remove(). > > > > > > This is what RCU and its access primitives guarantees. It ensures tha= t > > > after "unpublishing" the pointer, all concurrent RCU-protected > > > accesses to the object have finished, and the memory can be freed. > > > > I am not sure I understand, this is the scenario I mean: > > > > cpu#1 cpu#2 cpu#3 > > css_put() > > /* schedule free */ > > rcu_read_lock() > > idr_remove() > > mem_cgroup_from_id() > > > > /* free memcg */ > > /* use memcg */ > > idr_remove() cannot be re-ordered after scheduling the free. Think > about it, this is the common rcu-freeing pattern: > > rcu_assign_pointer(p, NULL); > call_rcu(rh, free_pointee); > > on the write side, and: > > rcu_read_lock(); > pointee =3D rcu_dereference(p); > if (pointee) > do_stuff(pointee); > rcu_read_unlock(); > > on the read side. > > In our case, the rcu_assign_pointer() is in idr_remove(). And the > rcu_dereference() is in mem_cgroup_from_id() -> idr_find() -> > radix_tree_lookup() -> radix_tree_descend(). > > So if we find the memcg in the idr under rcu lock, the cgroup rcu work > is guaranteed to not run until the lock is dropped. If we don't find > it, it may or may not have already run. Yeah I missed the implicit barrier there, thanks for bearing with me. I think Shakeel might have found the actual problem here (see his response).