From: Suren Baghdasaryan <surenb@google.com>
Date: Tue, 27 Jun 2023 09:23:26 -0700
Subject: Re: [PATCH v3 7/8] mm: drop VMA lock before waiting for migration
To: Peter Xu
Cc: akpm@linux-foundation.org, willy@infradead.org, hannes@cmpxchg.org,
 mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com,
 jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, hdanton@sina.com,
 apopple@nvidia.com, ying.huang@intel.com, david@redhat.com,
 yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, pasha.tatashin@soleen.com,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com
References: <20230627042321.1763765-1-surenb@google.com>
 <20230627042321.1763765-8-surenb@google.com>
On Tue, Jun 27, 2023 at 8:49 AM Peter Xu wrote:
>
> On Mon, Jun 26, 2023 at 09:23:20PM -0700, Suren Baghdasaryan wrote:
> > migration_entry_wait does not need the VMA lock, therefore it can be
> > dropped before waiting.
>
> Hmm, I'm not sure..
>
> Note that we're still dereferencing *vmf->pmd when waiting, while *pmd is
> on the page table and IIUC is only guaranteed if the vma is still there.
> Without either the mmap or vma lock, I don't see what makes sure the
> pgtable is always there.  E.g. IIUC a race can happen where unmap() runs
> right after vma_end_read() below but before pmdp_get_lockless() (inside
> migration_entry_wait()); then pmdp_get_lockless() can read some random
> things if the pgtable is freed.

That sounds correct. I thought the ptl would keep the pmd stable, but
there is a window between vma_end_read() and spin_lock(ptl) in which it
can be freed from under us. I think it would work if we did
vma_end_read() after spin_lock(ptl), but that requires some code
refactoring. I'll probably drop this optimization from the patchset for
now to keep things simple and will get back to it later.

>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >  mm/memory.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 5caaa4c66ea2..bdf46fdc58d6 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3715,8 +3715,18 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >  	entry = pte_to_swp_entry(vmf->orig_pte);
> >  	if (unlikely(non_swap_entry(entry))) {
> >  		if (is_migration_entry(entry)) {
> > -			migration_entry_wait(vma->vm_mm, vmf->pmd,
> > -					     vmf->address);
> > +			/* Save mm in case VMA lock is dropped */
> > +			struct mm_struct *mm = vma->vm_mm;
> > +
> > +			if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
> > +				/*
> > +				 * No need to hold VMA lock for migration.
> > +				 * WARNING: vma can't be used after this!
> > +				 */
> > +				vma_end_read(vma);
> > +				ret |= VM_FAULT_COMPLETED;
> > +			}
> > +			migration_entry_wait(mm, vmf->pmd, vmf->address);
> >  		} else if (is_device_exclusive_entry(entry)) {
> >  			vmf->page = pfn_swap_entry_to_page(entry);
> >  			ret = remove_device_exclusive_entry(vmf);
> > --
> > 2.41.0.178.g377b9f9a00-goog
> >
>
> --
> Peter Xu
>
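For illustration, here is a rough sketch of the reordering discussed
above: take the pte lock first, so the page table cannot be freed, and
only then drop the VMA lock. This is hypothetical and not the posted
patch; it assumes a migration_entry_wait_on_locked()-style helper
(whose exact signature has varied across kernel versions) that releases
the pte lock itself before sleeping.

/*
 * Hypothetical sketch of the vma_end_read()-after-spin_lock(ptl)
 * reordering suggested in the thread above -- not the posted patch.
 */
static vm_fault_t wait_for_migration_entry(struct vm_fault *vmf)
{
	struct vm_area_struct *vma = vmf->vma;
	struct mm_struct *mm = vma->vm_mm;
	vm_fault_t ret = 0;
	spinlock_t *ptl;
	pte_t *ptep;

	/* Map and lock the pte; the pte lock pins the page table. */
	ptep = pte_offset_map_lock(mm, vmf->pmd, vmf->address, &ptl);
	if (!pte_same(*ptep, vmf->orig_pte)) {
		/* The entry changed under us; let the fault be retried. */
		pte_unmap_unlock(ptep, ptl);
		return 0;
	}

	/*
	 * With the pte lock held, the page table stays alive even if
	 * unmap() frees the VMA right after vma_end_read(), which closes
	 * the race described above.
	 */
	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
		vma_end_read(vma);	/* vma can't be used after this! */
		ret |= VM_FAULT_COMPLETED;
	}

	/* Assumed to drop ptep/ptl before sleeping on the entry. */
	migration_entry_wait_on_locked(pte_to_swp_entry(vmf->orig_pte),
				       ptep, ptl);
	return ret;
}

The point of the ordering is exactly what Suren describes: vma_end_read()
happens only while the pte lock is already held, so there is no window in
which the page table can be freed from under the waiter.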