From: Suren Baghdasaryan
Date: Thu, 10 Jul 2025 00:03:06 -0700
Subject: Re: [PATCH v6 7/8] fs/proc/task_mmu: read proc/pid/maps under per-vma lock
To: "Liam R. Howlett", Suren Baghdasaryan, Vlastimil Babka, Lorenzo Stoakes,
 akpm@linux-foundation.org, david@redhat.com, peterx@redhat.com,
 jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org,
 shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org,
 josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net,
 willy@infradead.org, osalvador@suse.de, andrii@kernel.org,
 ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com,
 kaleshsingh@google.com, aha310510@gmail.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org
References: <20250704060727.724817-1-surenb@google.com>
 <20250704060727.724817-8-surenb@google.com>
 <3b3521f6-30c8-419e-9615-9228f539251e@suse.cz>

On Wed, Jul 9, 2025 at 10:47 AM Suren Baghdasaryan wrote:
>
> On Wed, Jul 9, 2025 at 4:12 PM Liam R. Howlett wrote:
> >
> > * Suren Baghdasaryan [250709 11:06]:
> > > On Wed, Jul 9, 2025 at 3:03 PM Vlastimil Babka wrote:
> > > >
> > > > On 7/9/25 16:43, Suren Baghdasaryan wrote:
> > > > > On Wed, Jul 9, 2025 at 1:57 AM Vlastimil Babka wrote:
> > > > >>
> > > > >> On 7/8/25 01:10, Suren Baghdasaryan wrote:
> > > > >> >>> +     rcu_read_unlock();
> > > > >> >>> +     vma = lock_vma_under_mmap_lock(mm, iter, address);
> > > > >> >>> +     rcu_read_lock();
> > > > >> >> OK I guess we hold the RCU lock the whole time as we traverse except when
> > > > >> >> we lock under mmap lock.
> > > > >> > Correct.
> > > > >>
> > > > >> I wonder if it's really necessary? Can't it be done just inside
> > > > >> lock_next_vma()? It would also avoid the unlock/lock dance quoted above.
> > > > >>
> > > > >> Even if we later manage to extend this approach to smaps and employ rcu
> > > > >> locking to traverse the page tables, I'd think it's best to separate and
> > > > >> fine-grain the rcu lock usage for vma iterator and page tables, if only to
> > > > >> avoid too long time under the lock.
> > > > >
> > > > > I thought we would need to be in the same rcu read section while
> > > > > traversing the maple tree using vma_next() but now looking at it,
> > > > > maybe we can indeed enter only while finding and locking the next
> > > > > vma...
> > > > > Liam, would that work?
> > > > > I see struct ma_state containing a node field.
> > > > > Can it be freed from under us if we find a vma, exit rcu read section
> > > > > then re-enter rcu and use the same iterator to find the next vma?
> > > >
> > > > If the rcu protection needs to be contiguous, and patch 8 avoids the issue by
> > > > always doing vma_iter_init() after rcu_read_lock() (but does it really avoid
> > > > the issue or is it why we see the syzbot reports?) then I guess in the code
> > > > quoted above we also need a vma_iter_init() after the rcu_read_lock(),
> > > > because although the iterator was used briefly under mmap_lock protection,
> > > > that was then unlocked and there can be a race before the rcu_read_lock().
> > >
> > > Quite true. So, let's wait for Liam's confirmation and based on his
> > > answer I'll change the patch by either reducing the rcu read section
> > > or adding the missing vma_iter_init() after we switch to mmap_lock.
> >
> > You need to either be under rcu or mmap lock to ensure the node in the
> > maple state hasn't been freed (and potentially, reallocated).
> >
> > So in this case, in the higher level, we can hold the rcu read lock for
> > a series of walks and avoid re-walking the tree then the performance
> > would be better.
>
> Got it. Thanks for confirming!
>
> >
> > When we return to userspace, then we should drop the rcu read lock and
> > will need to vma_iter_set()/vma_iter_invalidate() on return. I thought
> > this was being done (through vma_iter_init()), but syzbot seems to
> > indicate a path that was missed?
>
> We do that in m_start()/m_stop() by calling
> lock_vma_range()/unlock_vma_range() but I think I have two problems
> here:
> 1. As Vlastimil mentioned I do not reset the iterator when falling
> back to mmap_lock and exiting and then re-entering rcu read section;
> 2. I do not reset the iterator after exiting rcu read section in
> m_stop() and re-entering it in m_start(), so the later call to
> lock_next_vma() might be using an iterator with a node that was freed
> (and possibly reallocated).
>
> >
> > This is the same thing that needed to be done previously with the mmap
> > lock, but now under the rcu lock.
> >
> > I'm not sure how to mitigate the issue with the page table, maybe we
> > guess on the number of vmas that we were doing for 4k blocks of output
> > and just drop/reacquire then. Probably a problem for another day
> > anyways.
> >
> > Also, I think you can also change the vma_iter_init() to vma_iter_set(),
> > which is slightly less code under the hood. Vlastimil asked about this
> > and it's probably a better choice.
>
> Ack.
> I'll update my series with these fixes and all comments I received so
> far, will run the reproducers to confirm no issues and repost them
> later today.

I have the patchset ready but would like to test it some more. Will
post it tomorrow.

> Thanks,
> Suren.
>
> >
> > Thanks,
> > Liam
> >
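
For reference, both problems listed above come down to the rule Liam
stated: the node cached in the iterator can only be trusted while rcu or
mmap_lock has been held without a gap. Below is a minimal userspace
sketch of that rule. It is not kernel code and not the posted patch. All
toy_* names are hypothetical, the generation counter is only an assumed
stand-in for a node being freed or reallocated, and toy_iter_reset()
merely plays the role that vma_iter_set() would play in the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */

#define TOY_NR_VMAS 3

struct toy_vma {
	unsigned long start, end;
};

struct toy_tree {
	struct toy_vma vmas[TOY_NR_VMAS];
	unsigned long generation;	/* bumped whenever nodes may be freed/reused */
};

struct toy_iter {
	struct toy_tree *tree;
	unsigned long addr;		/* address to resume from */
	const struct toy_vma *node;	/* cached node, valid in one read section only */
	unsigned long node_gen;		/* tree generation when node was cached */
};

/* Stand-in for vma_iter_set(): keep the address, forget the cached node. */
static void toy_iter_reset(struct toy_iter *it, unsigned long addr)
{
	it->addr = addr;
	it->node = NULL;
}

/* True if a cached node exists and the tree changed since it was cached. */
static bool toy_iter_stale(const struct toy_iter *it)
{
	return it->node && it->node_gen != it->tree->generation;
}

/* Return the next vma ending above it->addr; re-walk if nothing is cached. */
static const struct toy_vma *toy_iter_next(struct toy_iter *it)
{
	size_t i = 0;

	if (it->node)	/* fast path: continue right after the cached node */
		i = (size_t)(it->node - it->tree->vmas) + 1;

	for (; i < TOY_NR_VMAS; i++) {
		const struct toy_vma *vma = &it->tree->vmas[i];

		if (vma->end > it->addr) {
			it->node = vma;
			it->node_gen = it->tree->generation;
			it->addr = vma->end;
			return vma;
		}
	}
	return NULL;
}

/* Walk all vmas across two "read sections", resetting in between. */
int toy_demo(void)
{
	struct toy_tree tree = {
		.vmas = { { 0x1000, 0x2000 }, { 0x3000, 0x4000 }, { 0x5000, 0x6000 } },
	};
	struct toy_iter it = { .tree = &tree };
	int seen = 0;

	/* First read section (think m_start() ... m_stop()). */
	if (toy_iter_next(&it))
		seen++;

	/* Read section ends; a writer may now free or reuse nodes. */
	tree.generation++;
	assert(toy_iter_stale(&it));	/* touching it.node here would be the bug */

	/* Second read section: reset first, then resume by address. */
	toy_iter_reset(&it, it.addr);
	assert(!toy_iter_stale(&it));
	while (toy_iter_next(&it))
		seen++;

	return seen;
}
```

Calling toy_demo() from a main() visits each toy vma once. The point of
the sketch is only that the reset costs one re-walk by address per read
section, while walks inside a single section keep using the cached node.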