From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Barry Song, "Liam R. Howlett",
 Lorenzo Stoakes, David Hildenbrand, Vlastimil Babka, Jann Horn,
 Suren Baghdasaryan, Lokesh Gidra, Tangquan Zheng
Subject: [PATCH RFC] mm: use per_vma lock for MADV_DONTNEED
Date: Tue, 27 May 2025 16:41:45 +1200
Message-Id: <20250527044145.13153-1-21cnbao@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Barry Song

Certain madvise operations, especially MADV_DONTNEED, occur far more
frequently than other madvise options, particularly in native and Java
heaps for dynamic memory management.

Currently, the mmap_lock is always held during these operations, even
when unnecessary. This causes lock contention and can lead to severe
priority inversion, where low-priority threads (such as Android's
HeapTaskDaemon) hold the lock and block higher-priority threads.

This patch enables the use of per-VMA locks when the advised range lies
entirely within a single VMA, avoiding the need for full VMA traversal.
In practice, userspace heaps rarely issue MADV_DONTNEED across multiple
VMAs.

Tangquan's testing shows that over 99.5% of memory reclaimed by Android
benefits from this per-VMA lock optimization. After extended runtime,
217,735 madvise calls from HeapTaskDaemon used the per-VMA path, while
only 1,231 fell back to mmap_lock.

To simplify handling, the implementation falls back to the standard
mmap_lock if userfaultfd is enabled on the VMA, avoiding the complexity
of userfaultfd_remove().

Cc: "Liam R. Howlett"
Cc: Lorenzo Stoakes
Cc: David Hildenbrand
Cc: Vlastimil Babka
Cc: Jann Horn
Cc: Suren Baghdasaryan
Cc: Lokesh Gidra
Cc: Tangquan Zheng
Signed-off-by: Barry Song
---
 mm/madvise.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/mm/madvise.c b/mm/madvise.c
index 8433ac9b27e0..da016a1d0434 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1817,6 +1817,39 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh
 
 	if (madvise_should_skip(start, len_in, behavior, &error))
 		return error;
+
+	/*
+	 * MADV_DONTNEED is commonly used with userspace heaps and most often
+	 * affects a single VMA. In these cases, we can use per-VMA locks to
+	 * reduce contention on the mmap_lock.
+	 */
+	if (behavior == MADV_DONTNEED || behavior == MADV_DONTNEED_LOCKED) {
+		struct vm_area_struct *prev, *vma;
+		unsigned long untagged_start, end;
+
+		untagged_start = untagged_addr(start);
+		end = untagged_start + len_in;
+		vma = lock_vma_under_rcu(mm, untagged_start);
+		if (!vma)
+			goto lock;
+		if (end > vma->vm_end || userfaultfd_armed(vma)) {
+			vma_end_read(vma);
+			goto lock;
+		}
+		if (unlikely(!can_modify_vma_madv(vma, behavior))) {
+			error = -EPERM;
+			vma_end_read(vma);
+			goto out;
+		}
+		madvise_init_tlb(&madv_behavior, mm);
+		error = madvise_dontneed_free(vma, &prev, untagged_start,
+				end, &madv_behavior);
+		madvise_finish_tlb(&madv_behavior);
+		vma_end_read(vma);
+		goto out;
+	}
+
+lock:
 	error = madvise_lock(mm, behavior);
 	if (error)
 		return error;
@@ -1825,6 +1858,7 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh
 	madvise_finish_tlb(&madv_behavior);
 	madvise_unlock(mm, behavior);
 
+out:
 	return error;
 }
 
-- 
2.39.3 (Apple Git-146)
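
For reference, the userspace pattern this targets can be sketched as below.
This is not part of the patch, only a minimal illustration assuming Linux
with glibc. The function name dontneed_single_vma() and its parameters are
hypothetical. It creates one private anonymous mapping, dirties it, and
drops a sub-range with MADV_DONTNEED, so the advised range lies inside
whatever single VMA backs the mapping, which is the case described above
as eligible for the per-VMA lock path. Which lock the kernel actually takes
is not observable from this code.

```c
#define _DEFAULT_SOURCE		/* exposes MAP_ANONYMOUS and madvise() on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Map `pages` private anonymous pages in one mmap() call, fill them with
 * 0xAB, then drop `drop_count` pages starting at page `drop_first` using
 * MADV_DONTNEED. Returns 0 if the dropped range reads back as zeroes and
 * the remaining pages keep their contents, -1 on bad arguments or failure.
 */
int dontneed_single_vma(size_t pages, size_t drop_first, size_t drop_count)
{
	long psz = sysconf(_SC_PAGESIZE);
	size_t len, off, drop_start, drop_end;
	unsigned char *buf;
	int ret = -1;

	/* The dropped range must lie inside the mapping. */
	if (psz <= 0 || pages == 0 || drop_first > pages ||
	    drop_count > pages - drop_first)
		return -1;

	len = pages * (size_t)psz;
	drop_start = drop_first * (size_t)psz;
	drop_end = drop_start + drop_count * (size_t)psz;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Touch every page so there is something to discard. */
	memset(buf, 0xAB, len);

	/* Page-aligned start and length, entirely within the mapping. */
	if (madvise(buf + drop_start, drop_end - drop_start, MADV_DONTNEED))
		goto out;

	/* Dropped private anonymous pages are zero-filled on next access. */
	for (off = 0; off < len; off++) {
		unsigned char want =
			(off >= drop_start && off < drop_end) ? 0 : 0xAB;

		if (buf[off] != want)
			goto out;
	}
	ret = 0;
out:
	munmap(buf, len);
	return ret;
}
```

A heap allocator would typically issue a call of this shape on a free run of
pages inside its own arena, e.g. dontneed_single_vma(8, 2, 4) drops pages 2
through 5 of an eight-page mapping.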