From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8ECD4ECAAD3 for ; Thu, 1 Sep 2022 17:39:15 +0000 (UTC) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4MJSw16m7sz3f3t for ; Fri, 2 Sep 2022 03:39:13 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=OULzLK//; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--surenb.bounces.google.com (client-ip=2607:f8b0:4864:20::b49; helo=mail-yb1-xb49.google.com; envelope-from=3zu0qywykdkmvxuhqejrrjoh.frpolqxassf-ghyolvwv.rcodev.ruj@flex--surenb.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=OULzLK//; dkim-atps=neutral Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4MJSqs1wTkz3015 for ; Fri, 2 Sep 2022 03:35:36 +1000 (AEST) Received: by mail-yb1-xb49.google.com with SMTP id v8-20020a258488000000b00695847496a4so4934328ybk.19 for ; Thu, 01 Sep 2022 10:35:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=OULzLK//lObTRMGAKryDz/AAItdyiK5yqhZqIIxu+sL1Dieokt2jidmnz3C7/IkvFZ m9XlR/LP/dWBo2UxnXyYn7qdVCzxgLbSFD2CG9EOYY1gruRvre30gd8M27oZrFHurKvO KqLi6QjNq7f2EcLrXhJZo+HJMeLkP9RVRxlBnBeo9kXpz7L5PtGPdJWGZy8ptDnDibsD BU++Zg9dqZVio0ctwQpkNxICzBLOlMkeBzfVJBfO5ajkBVunOWr6F+myD0p5P9YWiss6 hO75UDowCTjrDDdzDEi22CFFlp+R1gx4mTcBiehvp3OGDbyQbHz6/QfMRzJJk1Rc6B1v GPRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=lSGUJxH5WjaKgRJmIVff9l1niV8Sle0ERTjKLFKfm3AjYvhUuXxk6aBgwxUOxLrq1d +2OLeK/ZBbBr8qKjFajO6w+zdx5bJVkeKgWq/upYgit1Z9VAA/K75rP834vJcTC4Rf29 PaPV07NVEKjwUKVkV+j569mW7Y90t3oKy2GeUwZxYZkx3z3lbaD2YJDVsyvSoHq9DpPX 3R5Q/dWT67+MnaSlVBFeB6WHvSUB4hypEWorh5oUFdSH9uQ73oRquGX6Yevn1w97g9qz HhfAw16ZumlFYqwwE4oKtoKySSiPChHmRYKthjYJMwkQO5y6YBSlrmF8byDwXTaT9XxU GRLA== X-Gm-Message-State: ACgBeo302HflEqUPHaLjLaXDxGcj4JKMqYuMdnuW4AJ9Un94z8/1W2jy 47FPJBx4CMIJU/kndyMReNnpu3lDXtISRStVdGrOK+GkFItg03Cg4XzD5Bb7SpyHQqPvaBxFIvT Mx0YrhXgkAGVdMOgXcTxcTw== X-Google-Smtp-Source: AA6agR57KkNIgUFHWPdFiQWyoY7oyjG1ZvVgnOFqYQ9kJCo94JBJEo43qz2wa5ev7XB45jgLE1aOSNe2TR4= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:1bfc:e7ee:6530:4449]) (user=surenb job=sendgmr) by 2002:a0d:ff86:0:b0:341:5844:5527 with SMTP id p128-20020a0dff86000000b0034158445527mr14770867ywf.504.1662053734514; Thu, 01 Sep 2022 10:35:34 -0700 (PDT) Date: Thu, 1 Sep 2022 10:34:53 -0700 In-Reply-To: <20220901173516.702122-1-surenb@google.com> Mime-Version: 1.0 References: <20220901173516.702122-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.789.g6183377224-goog Message-ID: <20220901173516.702122-6-surenb@google.com> Subject: [RFC PATCH RESEND 05/28] mm: add per-VMA lock and helper functions to control it From: Suren Baghdasaryan To: akpm@linux-foundation.org Content-Type: text/plain; charset="UTF-8" X-ccpol: medium X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: michel@lespinasse.org, joelaf@google.com, songliubraving@fb.com, mhocko@suse.com, david@redhat.com, peterz@infradead.org, bigeasy@linutronix.de, peterx@redhat.com, dhowells@redhat.com, linux-mm@kvack.org, jglisse@google.com, dave@stgolabs.net, minchan@google.com, x86@kernel.org, hughd@google.com, willy@infradead.org, laurent.dufour@fr.ibm.com, mgorman@suse.de, rientjes@google.com, axelrasmussen@google.com, kernel-team@android.com, paulmck@kernel.org, liam.howlett@oracle.com, luto@kernel.org, ldufour@linux.ibm.com, surenb@google.com, vbabka@suse.cz, linux-arm-kernel@lists.infradead.org, kent.overstreet@linux.dev, linux-kernel@vger.kernel.org, hannes@cmpxchg.org, linuxppc-dev@lists.ozlabs.org Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" Introduce a per-VMA rw_semaphore to be used during page fault handling instead of mmap_lock. Because there are cases when multiple VMAs need to be exclusively locked during VMA tree modifications, instead of the usual lock/unlock patter we mark a VMA as locked by taking per-VMA lock exclusively and setting vma->lock_seq to the current mm->lock_seq. When mmap_write_lock holder is done with all modifications and drops mmap_lock, it will increment mm->lock_seq, effectively unlocking all VMAs marked as locked. Signed-off-by: Suren Baghdasaryan --- include/linux/mm.h | 78 +++++++++++++++++++++++++++++++++++++++ include/linux/mm_types.h | 7 ++++ include/linux/mmap_lock.h | 13 +++++++ kernel/fork.c | 4 ++ mm/init-mm.c | 3 ++ 5 files changed, 105 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7d322a979455..476bf936c5f0 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -611,6 +611,83 @@ struct vm_operations_struct { unsigned long addr); }; +#ifdef CONFIG_PER_VMA_LOCK +static inline void vma_init_lock(struct vm_area_struct *vma) +{ + init_rwsem(&vma->lock); + vma->vm_lock_seq = -1; +} + +static inline void vma_mark_locked(struct vm_area_struct *vma) +{ + int mm_lock_seq; + + mmap_assert_write_locked(vma->vm_mm); + + /* + * current task is holding mmap_write_lock, both vma->vm_lock_seq and + * mm->mm_lock_seq can't be concurrently modified. + */ + mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq); + if (vma->vm_lock_seq == mm_lock_seq) + return; + + down_write(&vma->lock); + vma->vm_lock_seq = mm_lock_seq; + up_write(&vma->lock); +} + +static inline bool vma_read_trylock(struct vm_area_struct *vma) +{ + if (unlikely(down_read_trylock(&vma->lock) == 0)) + return false; + + /* + * Overflow might produce false locked result but it's not critical. + * False unlocked result is critical but is impossible because we + * modify and check vma->vm_lock_seq under vma->lock protection and + * mm->mm_lock_seq modification invalidates all existing locks. + */ + if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq)) { + up_read(&vma->lock); + return false; + } + return true; +} + +static inline void vma_read_unlock(struct vm_area_struct *vma) +{ + up_read(&vma->lock); +} + +static inline void vma_assert_locked(struct vm_area_struct *vma) +{ + lockdep_assert_held(&vma->lock); + VM_BUG_ON_VMA(!rwsem_is_locked(&vma->lock), vma); +} + +static inline void vma_assert_write_locked(struct vm_area_struct *vma, int pos) +{ + mmap_assert_write_locked(vma->vm_mm); + /* + * current task is holding mmap_write_lock, both vma->vm_lock_seq and + * mm->mm_lock_seq can't be concurrently modified. + */ + VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma); +} + +#else /* CONFIG_PER_VMA_LOCK */ + +static inline void vma_init_lock(struct vm_area_struct *vma) {} +static inline void vma_mark_locked(struct vm_area_struct *vma) {} +static inline bool vma_read_trylock(struct vm_area_struct *vma) + { return false; } +static inline void vma_read_unlock(struct vm_area_struct *vma) {} +static inline void vma_assert_locked(struct vm_area_struct *vma) {} +static inline void vma_assert_write_locked(struct vm_area_struct *vma, int pos) {} + +#endif /* CONFIG_PER_VMA_LOCK */ + static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) { static const struct vm_operations_struct dummy_vm_ops = {}; @@ -619,6 +696,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) vma->vm_mm = mm; vma->vm_ops = &dummy_vm_ops; INIT_LIST_HEAD(&vma->anon_vma_chain); + vma_init_lock(vma); } static inline void vma_set_anonymous(struct vm_area_struct *vma) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index bed25ef7c994..6a03f59c1e78 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -486,6 +486,10 @@ struct vm_area_struct { struct mempolicy *vm_policy; /* NUMA policy for the VMA */ #endif struct vm_userfaultfd_ctx vm_userfaultfd_ctx; +#ifdef CONFIG_PER_VMA_LOCK + struct rw_semaphore lock; + int vm_lock_seq; +#endif } __randomize_layout; struct kioctx_table; @@ -567,6 +571,9 @@ struct mm_struct { * init_mm.mmlist, and are protected * by mmlist_lock */ +#ifdef CONFIG_PER_VMA_LOCK + int mm_lock_seq; +#endif unsigned long hiwater_rss; /* High-watermark of RSS usage */ diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index e49ba91bb1f0..a391ae226564 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -72,6 +72,17 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm) VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm); } +#ifdef CONFIG_PER_VMA_LOCK +static inline void vma_mark_unlocked_all(struct mm_struct *mm) +{ + mmap_assert_write_locked(mm); + /* No races during update due to exclusive mmap_lock being held */ + WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1); +} +#else +static inline void vma_mark_unlocked_all(struct mm_struct *mm) {} +#endif + static inline void mmap_init_lock(struct mm_struct *mm) { init_rwsem(&mm->mmap_lock); @@ -114,12 +125,14 @@ static inline bool mmap_write_trylock(struct mm_struct *mm) static inline void mmap_write_unlock(struct mm_struct *mm) { __mmap_lock_trace_released(mm, true); + vma_mark_unlocked_all(mm); up_write(&mm->mmap_lock); } static inline void mmap_write_downgrade(struct mm_struct *mm) { __mmap_lock_trace_acquire_returned(mm, false, true); + vma_mark_unlocked_all(mm); downgrade_write(&mm->mmap_lock); } diff --git a/kernel/fork.c b/kernel/fork.c index 614872438393..bfab31ecd11e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -475,6 +475,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig) */ *new = data_race(*orig); INIT_LIST_HEAD(&new->anon_vma_chain); + vma_init_lock(new); new->vm_next = new->vm_prev = NULL; dup_anon_vma_name(orig, new); } @@ -1130,6 +1131,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, seqcount_init(&mm->write_protect_seq); mmap_init_lock(mm); INIT_LIST_HEAD(&mm->mmlist); +#ifdef CONFIG_PER_VMA_LOCK + WRITE_ONCE(mm->mm_lock_seq, 0); +#endif mm_pgtables_bytes_init(mm); mm->map_count = 0; mm->locked_vm = 0; diff --git a/mm/init-mm.c b/mm/init-mm.c index fbe7844d0912..8399f90d631c 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -37,6 +37,9 @@ struct mm_struct init_mm = { .page_table_lock = __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock), .arg_lock = __SPIN_LOCK_UNLOCKED(init_mm.arg_lock), .mmlist = LIST_HEAD_INIT(init_mm.mmlist), +#ifdef CONFIG_PER_VMA_LOCK + .mm_lock_seq = 0, +#endif .user_ns = &init_user_ns, .cpu_bitmap = CPU_BITS_NONE, #ifdef CONFIG_IOMMU_SVA -- 2.37.2.789.g6183377224-goog From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 698CCECAAD5 for ; Thu, 1 Sep 2022 18:10:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:Message-ID: References:Mime-Version:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=ZEj4bjQHoR0jrfLEAwN2z8SDBZpl5pAbwShk3AlSPnY=; b=q+mOGtF6FY1Xn1ONrusR/OmkhV yXqesTP8xHpcul3TITkHBsT7yUGbn0fXP0vFhFFUFLtuaEZgC69J04D+aXzERR60D/cS3/vg5537A tm5KSAv5XWjiDui3egEYy9XGFc+EDGUj5hFVNENfg3DJQvvIOf8ka5nayAfv3IMWcA4EyhsrnBCxs +EKQv5QyAlS090/zmghnHrtgQV0sDdZS/XAxUlkKS8VE7YeTAa+6SmEHX5A5KyLVKTPJ7WSo0NNik HAP371CchLSQtGfww05WIz6TbxGnkYmTekLGEBdN8Sjrlgp7nNAnf7lzYf9shu76CsDoaDhd6WFaa mA1VCRvw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oTodK-00Dp2y-Hi; Thu, 01 Sep 2022 18:09:44 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oToXX-00Dm5L-T6 for linux-arm-kernel@bombadil.infradead.org; Thu, 01 Sep 2022 18:03:44 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:Cc:To:From:Subject: Message-ID:References:Mime-Version:In-Reply-To:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=Tt8kXCMfZWo9Bn9LYB2mHpMbvP zHvU8JHVLuIitZGu7oBJqC1OBFejQ5BBJ3vkyHwSkDs0DvP7NFt23pmRCbZ4IYIaPX9irSiJ0jmSQ hv4a0VWBzVff1jh005m6NBWhHp3t+OecOGaTtziYSYhWPWsj4P0g3FyF8uodr5kPykhZXs3ZWtxsn cdxWOD+1fcsMOOHaoOI8ginzbuPPRIwLRSJ3y9K7rKpL8tE+y3eAXuPzS/xbhAtxvk+5C/B/8fYxR G5w2CRxTfn2MIHXhoYTmj+OCiSIttYwQ7q+V5PSdichYA0mLDUVPxGCeYX5+zZqDjdpSANVuwr6m1 ujcG0x5g==; Received: from mail-yb1-xb4a.google.com ([2607:f8b0:4864:20::b4a]) by desiato.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oTo6K-008RcV-NK for linux-arm-kernel@lists.infradead.org; Thu, 01 Sep 2022 17:35:38 +0000 Received: by mail-yb1-xb4a.google.com with SMTP id x27-20020a25ac9b000000b0069140cfbbd9so4902017ybi.8 for ; Thu, 01 Sep 2022 10:35:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=OULzLK//lObTRMGAKryDz/AAItdyiK5yqhZqIIxu+sL1Dieokt2jidmnz3C7/IkvFZ m9XlR/LP/dWBo2UxnXyYn7qdVCzxgLbSFD2CG9EOYY1gruRvre30gd8M27oZrFHurKvO KqLi6QjNq7f2EcLrXhJZo+HJMeLkP9RVRxlBnBeo9kXpz7L5PtGPdJWGZy8ptDnDibsD BU++Zg9dqZVio0ctwQpkNxICzBLOlMkeBzfVJBfO5ajkBVunOWr6F+myD0p5P9YWiss6 hO75UDowCTjrDDdzDEi22CFFlp+R1gx4mTcBiehvp3OGDbyQbHz6/QfMRzJJk1Rc6B1v GPRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=N9Q7CLmqd9p+6fjG8oUQRFCLoflWlbtIZo7AQODixPoq5vr3tX7elotYaZ5rLUzc3/ TSo7qICJT+7AY1lIkOYVWX8+Aa+ZRKfY9q7gHrdmOFaLy/T44HxKlgTmEXrPHk352uuY nbmEgwjsVm6POIG5Jndzzb19R4D4FZPq7S1Zx3EafXAA+ZmtLvED0kG4az4p47jx+3Qr guC7myxZgytSOSN8SbhCYpRmQ6UuB0uOqSxpPeyoZ5CsZgOabjrJN/qEAZFDeNfa5EpP rc0yc9TraR+nCo+TL4qBI3lTB//tJbF70+YLJLXGlFxZ25sWDz11WC47gN/f2yta1sFo rdKw== X-Gm-Message-State: ACgBeo1v5QPLPlUWopPLMg4LYBcwMuuGo32vIAFUe4YrNu/A1chbVN/p kUxP3W7j5xLCi6qbR3ITMLTcNDS894T1i16QtYhNTe+P5bYCMn4TTWwwrESYKKAIpCt+snGfeiB IOvUb9QE8r4XMT2369CL0MFfwuRSeUkU= X-Google-Smtp-Source: AA6agR57KkNIgUFHWPdFiQWyoY7oyjG1ZvVgnOFqYQ9kJCo94JBJEo43qz2wa5ev7XB45jgLE1aOSNe2TR4= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:1bfc:e7ee:6530:4449]) (user=surenb job=sendgmr) by 2002:a0d:ff86:0:b0:341:5844:5527 with SMTP id p128-20020a0dff86000000b0034158445527mr14770867ywf.504.1662053734514; Thu, 01 Sep 2022 10:35:34 -0700 (PDT) Date: Thu, 1 Sep 2022 10:34:53 -0700 In-Reply-To: <20220901173516.702122-1-surenb@google.com> Mime-Version: 1.0 References: <20220901173516.702122-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.789.g6183377224-goog Message-ID: <20220901173516.702122-6-surenb@google.com> Subject: [RFC PATCH RESEND 05/28] mm: add per-VMA lock and helper functions to control it From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org X-ccpol: medium X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220901_183536_820348_BE8F712E X-CRM114-Status: GOOD ( 19.58 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Introduce a per-VMA rw_semaphore to be used during page fault handling instead of mmap_lock. Because there are cases when multiple VMAs need to be exclusively locked during VMA tree modifications, instead of the usual lock/unlock patter we mark a VMA as locked by taking per-VMA lock exclusively and setting vma->lock_seq to the current mm->lock_seq. When mmap_write_lock holder is done with all modifications and drops mmap_lock, it will increment mm->lock_seq, effectively unlocking all VMAs marked as locked. Signed-off-by: Suren Baghdasaryan --- include/linux/mm.h | 78 +++++++++++++++++++++++++++++++++++++++ include/linux/mm_types.h | 7 ++++ include/linux/mmap_lock.h | 13 +++++++ kernel/fork.c | 4 ++ mm/init-mm.c | 3 ++ 5 files changed, 105 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7d322a979455..476bf936c5f0 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -611,6 +611,83 @@ struct vm_operations_struct { unsigned long addr); }; +#ifdef CONFIG_PER_VMA_LOCK +static inline void vma_init_lock(struct vm_area_struct *vma) +{ + init_rwsem(&vma->lock); + vma->vm_lock_seq = -1; +} + +static inline void vma_mark_locked(struct vm_area_struct *vma) +{ + int mm_lock_seq; + + mmap_assert_write_locked(vma->vm_mm); + + /* + * current task is holding mmap_write_lock, both vma->vm_lock_seq and + * mm->mm_lock_seq can't be concurrently modified. + */ + mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq); + if (vma->vm_lock_seq == mm_lock_seq) + return; + + down_write(&vma->lock); + vma->vm_lock_seq = mm_lock_seq; + up_write(&vma->lock); +} + +static inline bool vma_read_trylock(struct vm_area_struct *vma) +{ + if (unlikely(down_read_trylock(&vma->lock) == 0)) + return false; + + /* + * Overflow might produce false locked result but it's not critical. + * False unlocked result is critical but is impossible because we + * modify and check vma->vm_lock_seq under vma->lock protection and + * mm->mm_lock_seq modification invalidates all existing locks. + */ + if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq)) { + up_read(&vma->lock); + return false; + } + return true; +} + +static inline void vma_read_unlock(struct vm_area_struct *vma) +{ + up_read(&vma->lock); +} + +static inline void vma_assert_locked(struct vm_area_struct *vma) +{ + lockdep_assert_held(&vma->lock); + VM_BUG_ON_VMA(!rwsem_is_locked(&vma->lock), vma); +} + +static inline void vma_assert_write_locked(struct vm_area_struct *vma, int pos) +{ + mmap_assert_write_locked(vma->vm_mm); + /* + * current task is holding mmap_write_lock, both vma->vm_lock_seq and + * mm->mm_lock_seq can't be concurrently modified. + */ + VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma); +} + +#else /* CONFIG_PER_VMA_LOCK */ + +static inline void vma_init_lock(struct vm_area_struct *vma) {} +static inline void vma_mark_locked(struct vm_area_struct *vma) {} +static inline bool vma_read_trylock(struct vm_area_struct *vma) + { return false; } +static inline void vma_read_unlock(struct vm_area_struct *vma) {} +static inline void vma_assert_locked(struct vm_area_struct *vma) {} +static inline void vma_assert_write_locked(struct vm_area_struct *vma, int pos) {} + +#endif /* CONFIG_PER_VMA_LOCK */ + static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) { static const struct vm_operations_struct dummy_vm_ops = {}; @@ -619,6 +696,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) vma->vm_mm = mm; vma->vm_ops = &dummy_vm_ops; INIT_LIST_HEAD(&vma->anon_vma_chain); + vma_init_lock(vma); } static inline void vma_set_anonymous(struct vm_area_struct *vma) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index bed25ef7c994..6a03f59c1e78 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -486,6 +486,10 @@ struct vm_area_struct { struct mempolicy *vm_policy; /* NUMA policy for the VMA */ #endif struct vm_userfaultfd_ctx vm_userfaultfd_ctx; +#ifdef CONFIG_PER_VMA_LOCK + struct rw_semaphore lock; + int vm_lock_seq; +#endif } __randomize_layout; struct kioctx_table; @@ -567,6 +571,9 @@ struct mm_struct { * init_mm.mmlist, and are protected * by mmlist_lock */ +#ifdef CONFIG_PER_VMA_LOCK + int mm_lock_seq; +#endif unsigned long hiwater_rss; /* High-watermark of RSS usage */ diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index e49ba91bb1f0..a391ae226564 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -72,6 +72,17 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm) VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm); } +#ifdef CONFIG_PER_VMA_LOCK +static inline void vma_mark_unlocked_all(struct mm_struct *mm) +{ + mmap_assert_write_locked(mm); + /* No races during update due to exclusive mmap_lock being held */ + WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1); +} +#else +static inline void vma_mark_unlocked_all(struct mm_struct *mm) {} +#endif + static inline void mmap_init_lock(struct mm_struct *mm) { init_rwsem(&mm->mmap_lock); @@ -114,12 +125,14 @@ static inline bool mmap_write_trylock(struct mm_struct *mm) static inline void mmap_write_unlock(struct mm_struct *mm) { __mmap_lock_trace_released(mm, true); + vma_mark_unlocked_all(mm); up_write(&mm->mmap_lock); } static inline void mmap_write_downgrade(struct mm_struct *mm) { __mmap_lock_trace_acquire_returned(mm, false, true); + vma_mark_unlocked_all(mm); downgrade_write(&mm->mmap_lock); } diff --git a/kernel/fork.c b/kernel/fork.c index 614872438393..bfab31ecd11e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -475,6 +475,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig) */ *new = data_race(*orig); INIT_LIST_HEAD(&new->anon_vma_chain); + vma_init_lock(new); new->vm_next = new->vm_prev = NULL; dup_anon_vma_name(orig, new); } @@ -1130,6 +1131,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, seqcount_init(&mm->write_protect_seq); mmap_init_lock(mm); INIT_LIST_HEAD(&mm->mmlist); +#ifdef CONFIG_PER_VMA_LOCK + WRITE_ONCE(mm->mm_lock_seq, 0); +#endif mm_pgtables_bytes_init(mm); mm->map_count = 0; mm->locked_vm = 0; diff --git a/mm/init-mm.c b/mm/init-mm.c index fbe7844d0912..8399f90d631c 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -37,6 +37,9 @@ struct mm_struct init_mm = { .page_table_lock = __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock), .arg_lock = __SPIN_LOCK_UNLOCKED(init_mm.arg_lock), .mmlist = LIST_HEAD_INIT(init_mm.mmlist), +#ifdef CONFIG_PER_VMA_LOCK + .mm_lock_seq = 0, +#endif .user_ns = &init_user_ns, .cpu_bitmap = CPU_BITS_NONE, #ifdef CONFIG_IOMMU_SVA -- 2.37.2.789.g6183377224-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C4D9ECAAD1 for ; Thu, 1 Sep 2022 17:35:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DEB5A8001B; Thu, 1 Sep 2022 13:35:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D9C898000D; Thu, 1 Sep 2022 13:35:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C88D68001B; Thu, 1 Sep 2022 13:35:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id B97318000D for ; Thu, 1 Sep 2022 13:35:35 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 8BC9EAB75E for ; Thu, 1 Sep 2022 17:35:35 +0000 (UTC) X-FDA: 79864218630.14.976367C Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf23.hostedemail.com (Postfix) with ESMTP id 3EEF4140059 for ; Thu, 1 Sep 2022 17:35:35 +0000 (UTC) Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-33ef3e5faeeso228801227b3.0 for ; Thu, 01 Sep 2022 10:35:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=OULzLK//lObTRMGAKryDz/AAItdyiK5yqhZqIIxu+sL1Dieokt2jidmnz3C7/IkvFZ m9XlR/LP/dWBo2UxnXyYn7qdVCzxgLbSFD2CG9EOYY1gruRvre30gd8M27oZrFHurKvO KqLi6QjNq7f2EcLrXhJZo+HJMeLkP9RVRxlBnBeo9kXpz7L5PtGPdJWGZy8ptDnDibsD BU++Zg9dqZVio0ctwQpkNxICzBLOlMkeBzfVJBfO5ajkBVunOWr6F+myD0p5P9YWiss6 hO75UDowCTjrDDdzDEi22CFFlp+R1gx4mTcBiehvp3OGDbyQbHz6/QfMRzJJk1Rc6B1v GPRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=uATto1ia5wqlpHmEI5rTYo3Raz4yZm5NH3GaeRMiVem2gwLTgGkUzDSLPIDHuOohNk HfbbsSwblpMavkJ6koYH48jC+sWEcBHLo4gqDJjT4XjvWHKQamHV0Hk801DiFxCdzEyT QCIu1iTQim0VM4ZC+EjjELh44Y5JPb4AHUDnj048RYAfvZhOI0lod+R9UdXkXRCmV4ua /qmcX9IWYKNtxsgqeLNZqI9LVhYtxgdkki+2Gtz3+tP6XJ7m69shqrpgCxfXYhcA7+RH ClBxCbk+eA2bcebcdJeZfgGZ8Z62YM7FyJ33kJOOQUKRIfYsVFroIFaRocLb3gfZ8Hfb 6/Rw== X-Gm-Message-State: ACgBeo38Vni8E0Hj9x7/0w+dMJIjnV0zirfQkiGRgygi+QPycbtDIsy/ Mq03tbQ+YyHAEjBCfShBgAa1Z6VQcpzwEc22bq5Q3XJ2unVq29ZKZEr3Jk/j4xZDZXjxpxVYIWW LHydlEEo= X-Google-Smtp-Source: AA6agR57KkNIgUFHWPdFiQWyoY7oyjG1ZvVgnOFqYQ9kJCo94JBJEo43qz2wa5ev7XB45jgLE1aOSNe2TR4= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:1bfc:e7ee:6530:4449]) (user=surenb job=sendgmr) by 2002:a0d:ff86:0:b0:341:5844:5527 with SMTP id p128-20020a0dff86000000b0034158445527mr14770867ywf.504.1662053734514; Thu, 01 Sep 2022 10:35:34 -0700 (PDT) Date: Thu, 1 Sep 2022 10:34:53 -0700 In-Reply-To: <20220901173516.702122-1-surenb@google.com> Mime-Version: 1.0 References: <20220901173516.702122-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.789.g6183377224-goog Message-ID: <20220901173516.702122-6-surenb@google.com> Subject: [RFC PATCH RESEND 05/28] mm: add per-VMA lock and helper functions to control it From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" X-ccpol: medium ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1662053735; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=eIMt0K4Dq/+E4lj9SQ/fnMIDQQYvNY/8MPu6+maRPr0=; b=K+sUrYK19S7fOJa36TFxUjJI35DhbbYC1IJJ/k6EdxyFrYw6qtqNaOvYsy09JJ1c9jWItk OEoSm8M0RrIyHrFccDitWebRgkHp4ei0gkiDyETSugCRz6ZVLNzhoel6QdZbmohaPDDpp4 elM+U/zPgTZDhiJHctSVuZZbZDerFlk= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="OULzLK//"; spf=pass (imf23.hostedemail.com: domain of 3Zu0QYwYKCKMVXUHQEJRRJOH.FRPOLQXa-PPNYDFN.RUJ@flex--surenb.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3Zu0QYwYKCKMVXUHQEJRRJOH.FRPOLQXa-PPNYDFN.RUJ@flex--surenb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1662053735; a=rsa-sha256; cv=none; b=yrSiW071zBOBRsV6RT8A59g2ePH6POm9QMI+AlsFQqrTUdgckhQlTNGc6UkRgFxpxJlO9Q wWONZUzfbfR+8ZNpRHN+eQuA0wq8auR3j3kXMIXd+RI/x8cmE7J9v7Mjbz1LcPd+XoHByq G/u8wCODLY6WeJ1Rmnf9Ydq3dbyNZT4= Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="OULzLK//"; spf=pass (imf23.hostedemail.com: domain of 3Zu0QYwYKCKMVXUHQEJRRJOH.FRPOLQXa-PPNYDFN.RUJ@flex--surenb.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3Zu0QYwYKCKMVXUHQEJRRJOH.FRPOLQXa-PPNYDFN.RUJ@flex--surenb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com X-Rspam-User: X-Stat-Signature: 5s7f9hthahnttkcy65d8dejziwdos6i7 X-Rspamd-Queue-Id: 3EEF4140059 X-Rspamd-Server: rspam05 X-HE-Tag: 1662053735-748439 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Introduce a per-VMA rw_semaphore to be used during page fault handling instead of mmap_lock. Because there are cases when multiple VMAs need to be exclusively locked during VMA tree modifications, instead of the usual lock/unlock patter we mark a VMA as locked by taking per-VMA lock exclusively and setting vma->lock_seq to the current mm->lock_seq. When mmap_write_lock holder is done with all modifications and drops mmap_lock, it will increment mm->lock_seq, effectively unlocking all VMAs marked as locked. Signed-off-by: Suren Baghdasaryan --- include/linux/mm.h | 78 +++++++++++++++++++++++++++++++++++++++ include/linux/mm_types.h | 7 ++++ include/linux/mmap_lock.h | 13 +++++++ kernel/fork.c | 4 ++ mm/init-mm.c | 3 ++ 5 files changed, 105 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7d322a979455..476bf936c5f0 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -611,6 +611,83 @@ struct vm_operations_struct { unsigned long addr); }; +#ifdef CONFIG_PER_VMA_LOCK +static inline void vma_init_lock(struct vm_area_struct *vma) +{ + init_rwsem(&vma->lock); + vma->vm_lock_seq = -1; +} + +static inline void vma_mark_locked(struct vm_area_struct *vma) +{ + int mm_lock_seq; + + mmap_assert_write_locked(vma->vm_mm); + + /* + * current task is holding mmap_write_lock, both vma->vm_lock_seq and + * mm->mm_lock_seq can't be concurrently modified. + */ + mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq); + if (vma->vm_lock_seq == mm_lock_seq) + return; + + down_write(&vma->lock); + vma->vm_lock_seq = mm_lock_seq; + up_write(&vma->lock); +} + +static inline bool vma_read_trylock(struct vm_area_struct *vma) +{ + if (unlikely(down_read_trylock(&vma->lock) == 0)) + return false; + + /* + * Overflow might produce false locked result but it's not critical. + * False unlocked result is critical but is impossible because we + * modify and check vma->vm_lock_seq under vma->lock protection and + * mm->mm_lock_seq modification invalidates all existing locks. + */ + if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq)) { + up_read(&vma->lock); + return false; + } + return true; +} + +static inline void vma_read_unlock(struct vm_area_struct *vma) +{ + up_read(&vma->lock); +} + +static inline void vma_assert_locked(struct vm_area_struct *vma) +{ + lockdep_assert_held(&vma->lock); + VM_BUG_ON_VMA(!rwsem_is_locked(&vma->lock), vma); +} + +static inline void vma_assert_write_locked(struct vm_area_struct *vma, int pos) +{ + mmap_assert_write_locked(vma->vm_mm); + /* + * current task is holding mmap_write_lock, both vma->vm_lock_seq and + * mm->mm_lock_seq can't be concurrently modified. + */ + VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma); +} + +#else /* CONFIG_PER_VMA_LOCK */ + +static inline void vma_init_lock(struct vm_area_struct *vma) {} +static inline void vma_mark_locked(struct vm_area_struct *vma) {} +static inline bool vma_read_trylock(struct vm_area_struct *vma) + { return false; } +static inline void vma_read_unlock(struct vm_area_struct *vma) {} +static inline void vma_assert_locked(struct vm_area_struct *vma) {} +static inline void vma_assert_write_locked(struct vm_area_struct *vma, int pos) {} + +#endif /* CONFIG_PER_VMA_LOCK */ + static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) { static const struct vm_operations_struct dummy_vm_ops = {}; @@ -619,6 +696,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) vma->vm_mm = mm; vma->vm_ops = &dummy_vm_ops; INIT_LIST_HEAD(&vma->anon_vma_chain); + vma_init_lock(vma); } static inline void vma_set_anonymous(struct vm_area_struct *vma) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index bed25ef7c994..6a03f59c1e78 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -486,6 +486,10 @@ struct vm_area_struct { struct mempolicy *vm_policy; /* NUMA policy for the VMA */ #endif struct vm_userfaultfd_ctx vm_userfaultfd_ctx; +#ifdef CONFIG_PER_VMA_LOCK + struct rw_semaphore lock; + int vm_lock_seq; +#endif } __randomize_layout; struct kioctx_table; @@ -567,6 +571,9 @@ struct mm_struct { * init_mm.mmlist, and are protected * by mmlist_lock */ +#ifdef CONFIG_PER_VMA_LOCK + int mm_lock_seq; +#endif unsigned long hiwater_rss; /* High-watermark of RSS usage */ diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index e49ba91bb1f0..a391ae226564 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -72,6 +72,17 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm) VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm); } +#ifdef CONFIG_PER_VMA_LOCK +static inline void vma_mark_unlocked_all(struct mm_struct *mm) +{ + mmap_assert_write_locked(mm); + /* No races during update due to exclusive mmap_lock being held */ + WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1); +} +#else +static inline void vma_mark_unlocked_all(struct mm_struct *mm) {} +#endif + static inline void mmap_init_lock(struct mm_struct *mm) { init_rwsem(&mm->mmap_lock); @@ -114,12 +125,14 @@ static inline bool mmap_write_trylock(struct mm_struct *mm) static inline void mmap_write_unlock(struct mm_struct *mm) { __mmap_lock_trace_released(mm, true); + vma_mark_unlocked_all(mm); up_write(&mm->mmap_lock); } static inline void mmap_write_downgrade(struct mm_struct *mm) { __mmap_lock_trace_acquire_returned(mm, false, true); + vma_mark_unlocked_all(mm); downgrade_write(&mm->mmap_lock); } diff --git a/kernel/fork.c b/kernel/fork.c index 614872438393..bfab31ecd11e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -475,6 +475,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig) */ *new = data_race(*orig); INIT_LIST_HEAD(&new->anon_vma_chain); + vma_init_lock(new); new->vm_next = new->vm_prev = NULL; dup_anon_vma_name(orig, new); } @@ -1130,6 +1131,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, seqcount_init(&mm->write_protect_seq); mmap_init_lock(mm); INIT_LIST_HEAD(&mm->mmlist); +#ifdef CONFIG_PER_VMA_LOCK + WRITE_ONCE(mm->mm_lock_seq, 0); +#endif mm_pgtables_bytes_init(mm); mm->map_count = 0; mm->locked_vm = 0; diff --git a/mm/init-mm.c b/mm/init-mm.c index fbe7844d0912..8399f90d631c 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -37,6 +37,9 @@ struct mm_struct init_mm = { .page_table_lock = __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock), .arg_lock = __SPIN_LOCK_UNLOCKED(init_mm.arg_lock), .mmlist = LIST_HEAD_INIT(init_mm.mmlist), +#ifdef CONFIG_PER_VMA_LOCK + .mm_lock_seq = 0, +#endif .user_ns = &init_user_ns, .cpu_bitmap = CPU_BITS_NONE, #ifdef CONFIG_IOMMU_SVA -- 2.37.2.789.g6183377224-goog