From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5AEEEE4998 for ; Fri, 18 Aug 2023 22:00:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 220F2940078; Fri, 18 Aug 2023 18:00:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 18222940012; Fri, 18 Aug 2023 18:00:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 04968940078; Fri, 18 Aug 2023 18:00:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id E2900940012 for ; Fri, 18 Aug 2023 18:00:36 -0400 (EDT) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id B7B671411D1 for ; Fri, 18 Aug 2023 22:00:36 +0000 (UTC) X-FDA: 81138595272.13.A3976CE Received: from mail-ej1-f46.google.com (mail-ej1-f46.google.com [209.85.218.46]) by imf10.hostedemail.com (Postfix) with ESMTP id E1156C0005 for ; Fri, 18 Aug 2023 22:00:34 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=OlYQYHj2; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf10.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.46 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1692396035; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wV15IMnKh+DcB05r7g4z9M9VBYTIic9ANjBGLdi5r5Q=; b=qwAvpio7MhfqwDD06fYFBfMuprt81DW7v2UT6vqTANpFdORz96LtAUyj3xQGVsHqTACxbw ZZRlgAhBTB/jbJ3AqtcaP/0zwfb1o1fyL7GOOip7CwZlbfKnAqbOBfDltogVRBNveCseDd Kc0Ar63EJGVvZVx6dwkNDDBf5Cu3av4= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=OlYQYHj2; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf10.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.46 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1692396035; a=rsa-sha256; cv=none; b=vbTGagCdTfjzTDK56QE7CKCldkUgjHTpeqtyqpZvnyCvvmONqrm3PCGUcqoXcSXLrZuis8 TYm9QJAEy3GHhOP23+zW2dOPHlmc5F8YjzaSBJzH7+j8c03YR6ivE7zLmCvMoU9Bv45plL FMv1SObfAzfxKrvMXT1lFce1u0KWtu4= Received: by mail-ej1-f46.google.com with SMTP id a640c23a62f3a-99bccc9ec02so180369466b.2 for ; Fri, 18 Aug 2023 15:00:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1692396033; x=1693000833; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=wV15IMnKh+DcB05r7g4z9M9VBYTIic9ANjBGLdi5r5Q=; b=OlYQYHj2Jk8sLqHfxCgAXnzGNUohKU1TrA4G6naHyEFabxv7DTkvQY3xhfqY2o6Qdv RuOcP6eWhHkLmbzDxZaOegQ0nrYrZ0VJ92yTL+eCEiJ8AzFCWwrSDWx7+sfm0vsvmaUh ADibU9J07YKqxU5CtbYdqXfq2FmZNfF1g9CXTwrxzdBndudoaGmnn/e8oQpoEN+vw80W IBSgEnkWVgSeepCIVuTnMufqEzbqnnNpIGcNC5OSfq1Yclg0lCxICauqwkWn8FZgvdwS 3G+iD5+0s6cTKeJQCP9k/XJLgaqf6x4VvDVD183WmplUh2/t31BBHW6snCnYG+E5/MfX 3kiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692396033; x=1693000833; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wV15IMnKh+DcB05r7g4z9M9VBYTIic9ANjBGLdi5r5Q=; b=C/J7DREQzMV/uGCPb4ffznRcssd0MApkybfBRwds1X4lKGfRAGzI0OxXVj3chHNlrI 7UAP2GTatLDOWYsmUxFrXOALArdsfNseAZ1PmkCCBX1afpEyu1eduG5fpQA+9r0yW5P0 hUVQe+prPy2vMfGg1Gy9MVqKOdGu0vb4jcPa5KHy9yD37kF8x3MssQqTjguvafcCltWj uxJOkwZoSuJGeKLeKR+n5DFubvQw4HP2lAnXzLI0fzESNLCebEvbLI8Ik/3Gy2G/Vozx pUCGAr0ZfauygxULBuc2QVXNzH1UGI/sTiTxmRdiFsMq4TdHZDKOi4NkDyXtvt+EbYXH t1Mw== X-Gm-Message-State: AOJu0YxPHsmbHdlkX7+W4LYj3ZM/f5sr75MMuzFyIB4CtbklC0Zzrjti OEcawkqC8KlDatL3wiCrMrHTI9pTuPV5s6NXhfYjeYXixe8LCtgZixdsUw== X-Google-Smtp-Source: AGHT+IEhwSxjg6Bw/Lb5ooIorg4674rp14RUtKuw80CbiNpRDbPWTLi+nf6Boy0zgXwGPbXrXuvbL36MaxfzLUtAUfc= X-Received: by 2002:a17:906:cc4c:b0:99b:65fa:fc2b with SMTP id mm12-20020a170906cc4c00b0099b65fafc2bmr292975ejb.35.1692396033253; Fri, 18 Aug 2023 15:00:33 -0700 (PDT) MIME-Version: 1.0 References: <20230818134906.GA138967@cmpxchg.org> <20230818173544.GA142196@cmpxchg.org> <20230818183538.GA142974@cmpxchg.org> <20230818213502.nsur4qbs7t7nxg54@google.com> In-Reply-To: From: Yosry Ahmed Date: Fri, 18 Aug 2023 14:59:56 -0700 Message-ID: Subject: Re: [PATCH v2] workingset: ensure memcg is valid for recency check To: Yu Zhao Cc: Shakeel Butt , Johannes Weiner , Nhat Pham , akpm@linux-foundation.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, stable@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Stat-Signature: rgad6so5fd9mt77fysxe5pizqm1xwb3b X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: E1156C0005 X-HE-Tag: 1692396034-821951 X-HE-Meta: U2FsdGVkX1+xmyO5XV66vnZ2AaKaHSx4KC0faWWz+KEbEcy7EAQP34TKPFNW4OtOw+DQqMqvRI1NWX8NV4pJZ/TKk9uD9/8nYspDcaRMBBpHN6mlgbDMi76tDfsqKZqGmQGhI9WyTqtEoz1+lVdvPJzkr8EbQof5RxTPdP1hz/1M2TT/urUepcbFXFoihFKyDX8qM7TrUkZ79VuSvx+BwkYfXa0Nz7UpvPTQRE2aRi9YClchwdv04SXd+rdHB/1Ko4RDtuGNmB5KnymQout9tGEUEScu6/IcRaAwG01GPt81OmvLuPEZAEUFVHF6gwvWf3Ot7+q+KrMmcStSt2FB4uMv62f7auGnKRzqOavEtPuKkI63/Aoj3V738XHf6z+oyFVgNYcHpCyXLKEV/cj9Oue4IyibMGgvZmakeOZdd96ftUyCKMcpSprHQb6mtSOcgJ5VrR0+aTlFIENdFYURdiIdkAEX8ZI5fAQQOUNXM4Ahs+iFHzm3F+JTqYJ+ZSqOxl1VLRCVazNIIrdbXr+nShaS8HfVrkErkrYWum3226mnRuwKMNiYulAvtXl9r3ozzylJ18+tTQU8KipxsMNjqQf1+wxrr3sZqEvBHtP4PlzfMH015jxC2/SbpMMQZXnqRvc6f5W5Ouii1Tv6x9SNnBciSO8aCF8hC6jU+jdODciV7KteySXIyD4uGFFH5C/rnJajgmwjw1pVqFSl1V4IIUaFJ4SvK/01purT6zQ8gGO/tsG1k8WVvObBZ0Y/XIfRqnUmWgnM+E1uMCkUNCffqxyQSDA5WiLz96ILcJ4d8uM1ZGFOUyIvrhlUJT6kkADSTlEItQrjCOvyYt2VMrMbX1ucWWlkAgHjwmcncTXFE5JLC/k7D1qG3m7gAYAv9VsLqRJn6vj0a8tWqteT+B9XBIJHKsQUhIHhxqm+Tf3Ja//OAS4W+tXz93sGH4ayGDk7kpPVjhJpf/Hb4B0oQ4O WAtpE5dG u9DF6FXGRmIeO3DO1I+KabNrx3KlvY58BGQKoIqI33jBAnjyGv7lK+EhvqrqKm71yKA9J/mx3OVKzcmumgbanGlYGsq6EiYhZAJukrkryJ+Z5gwwpAADHEuaxg04LYMHViz9fGFMUhacoR0mIS443iOHbx8fFCG5Mc7Ow0oy5IYjke7T/qMKq+QCQ06oT+6h8H3DLnHeOqd07az1JTE17PwFk7DegGcS0sDCfKG6+gMXwLMeTAiOhReBvGbZhlCK34lvbCQJaSPO4M/WA+OAYKtt7IA0gGWAha467viy4QFWTgwyOx7QDtCYpGQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Aug 18, 2023 at 2:52=E2=80=AFPM Yu Zhao wrote: > > On Fri, Aug 18, 2023 at 3:35=E2=80=AFPM Shakeel Butt wrote: > > > > On Fri, Aug 18, 2023 at 11:44:45AM -0700, Yosry Ahmed wrote: > > > On Fri, Aug 18, 2023 at 11:35=E2=80=AFAM Johannes Weiner wrote: > > > > > > > > On Fri, Aug 18, 2023 at 10:45:56AM -0700, Yosry Ahmed wrote: > > > > > On Fri, Aug 18, 2023 at 10:35=E2=80=AFAM Johannes Weiner wrote: > > > > > > On Fri, Aug 18, 2023 at 07:56:37AM -0700, Yosry Ahmed wrote: > > > > > > > If this happens it seems possible for this to happen: > > > > > > > > > > > > > > cpu #1 cpu#2 > > > > > > > css_put() > > > > > > > /* css_free_rwor= k_fn is queued */ > > > > > > > rcu_read_lock() > > > > > > > mem_cgroup_from_id() > > > > > > > mem_cgroup_id_re= move() > > > > > > > /* access memcg */ > > > > > > > > > > > > I don't quite see how that'd possible. IDR uses rcu_assign_poin= ter() > > > > > > during deletion, which inserts the necessary barriering. My > > > > > > understanding is that this should always be safe: > > > > > > > > > > > > rcu_read_lock() (writer serialization, in thi= s case ref count =3D=3D 0) > > > > > > foo =3D idr_find(x) idr_remove(x) > > > > > > if (foo) kfree_rcu(foo) > > > > > > LOAD(foo->bar) > > > > > > rcu_read_unlock() > > > > > > > > > > How does a barrier inside IDR removal protect against the memcg b= eing > > > > > freed here though? > > > > > > > > > > If css_put() is executed out-of-order before mem_cgroup_id_remove= (), > > > > > the memcg can be freed even before mem_cgroup_id_remove() is call= ed, > > > > > right? > > > > > > > > css_put() can start earlier, but it's not allowed to reorder the rc= u > > > > callback that frees past the rcu_assign_pointer() in idr_remove(). > > > > > > > > This is what RCU and its access primitives guarantees. It ensures t= hat > > > > after "unpublishing" the pointer, all concurrent RCU-protected > > > > accesses to the object have finished, and the memory can be freed. > > > > > > I am not sure I understand, this is the scenario I mean: > > > > > > cpu#1 cpu#2 cpu#3 > > > css_put() > > > /* schedule free */ > > > rcu_read_lock() > > > idr_remove() > > > mem_cgroup_from_id() > > > > > > /* free memcg */ > > > /* use memcg */ > > > > > > If I understand correctly you are saying that the scheduled free > > > callback cannot run before idr_remove() due to the barrier in there, > > > but it can run after the rcu_read_lock() in cpu #2 because it was > > > scheduled before that RCU critical section started, right? > > > > Isn't there a simpler explanation. The memcg whose id is stored in the > > shadow entry has been freed and there is an ongoing new memcg allocatio= n > > which by chance has acquired the same id and has not yet initialized > > completely. More specifically the new memcg creation is between > > css_alloc() and init_and_link_css() and there is a refault for the > > shadow entry holding that id. That's actually very plausible. > > I think so, and this fix would just crash at tryget() instead when > hitting the problem. It seems like mem_cgroup_from_id() completely disregards memcg->id.ref. In the case that Shakeel mentioned, memcg->id.ref should have a count of 0. Perhaps we should update it to respect the id refcount? Maybe try to acquire a ref first before looking up the idr?