From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD617EB64D9 for ; Mon, 10 Jul 2023 10:52:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6F7E36B0080; Mon, 10 Jul 2023 06:52:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6A7CA6B0081; Mon, 10 Jul 2023 06:52:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5701C6B0082; Mon, 10 Jul 2023 06:52:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 492146B0080 for ; Mon, 10 Jul 2023 06:52:37 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 106B240338 for ; Mon, 10 Jul 2023 10:52:37 +0000 (UTC) X-FDA: 80995388754.17.E9AB9DE Received: from mail-oi1-f172.google.com (mail-oi1-f172.google.com [209.85.167.172]) by imf24.hostedemail.com (Postfix) with ESMTP id 5B67B180017 for ; Mon, 10 Jul 2023 10:52:34 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=QgNE2wco; dmarc=pass (policy=quarantine) header.from=bytedance.com; spf=pass (imf24.hostedemail.com: domain of zhangpeng.00@bytedance.com designates 209.85.167.172 as permitted sender) smtp.mailfrom=zhangpeng.00@bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1688986355; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1qWr+V3gYZrsNg0XRx3MW3g+BVG6UNjjUpqVndfuvns=; b=qx5/bVHZTNsomKJLDc9mLMxxP5D5FXGvfEqI5JyOLk2Amsu+rnAr9K8xJRfu60nT7DsEFC VbJQOlkiL665ErNhm88949w8881wZqABM1A+YoAqRKdZyI4ZSdmQU1dpkht8fx47lqJVDO /CPd24kKletI92OimvFHz56orQfSRrY= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=QgNE2wco; dmarc=pass (policy=quarantine) header.from=bytedance.com; spf=pass (imf24.hostedemail.com: domain of zhangpeng.00@bytedance.com designates 209.85.167.172 as permitted sender) smtp.mailfrom=zhangpeng.00@bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1688986355; a=rsa-sha256; cv=none; b=HLEH3iOtQ8VWYrVovv4tulVY5qPCQ3kwy4vkmwJOVMTZE1uuIzPsqny7F17f30uJ6RcloB MorPMMKYz5z9mtljzC5NMNBJrpxJXapQBcNqwjV3P8S7EIk5WJWSKlV4RzA2Z01h2k2D01 119FAsw03eMIbz9WsfwaV/nA/8EeSI8= Received: by mail-oi1-f172.google.com with SMTP id 5614622812f47-38e04d1b2b4so3312996b6e.3 for ; Mon, 10 Jul 2023 03:52:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1688986353; x=1691578353; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=1qWr+V3gYZrsNg0XRx3MW3g+BVG6UNjjUpqVndfuvns=; b=QgNE2wco+U0ham6nOIpQGx2HHG7POV14PtvoYUO9gc75CiqvOv8695t58wgZJW7WBv mMPDnpEOZ8USpDasHFbpQDc+vevRi0W8mXTa5AeOvt7GxA16VFO77yHX6IwoPfKCFBI4 t7DiF1xn2V3GQtf1ATFAmgaJEkto4JVHETPw27/mRnMLh0STzicLHdxebmkTlpRPpCEU fRHsu79dTwDZgqgZR7JC2XYgmajfjqvz1Tc/LwD0gjfumyOb0sZMgUsSDH7vpgf1R5j7 p21EFrHYmYIBgog0nYpjUU8hXFgGb5r5a2xCtINjlMyZ1X7Pcbdszu5gqLcPz2vvKm9H +fWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688986353; x=1691578353; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=1qWr+V3gYZrsNg0XRx3MW3g+BVG6UNjjUpqVndfuvns=; b=KKZsA2jqA6xMpbHYCgGIa6J6UWYiQ7vzqtAPSPSdV4AR6QeLQCTf3ti+EpA8IXNbnS OOAq3T2Om8+OolNBUL76tD++Dt7IBR+fDB5sI21Ml+ADBxod474LdBCUitttJr4XrGwr jp/29jHm+nIZL8xsrNytiLIhHaTeUJACyqTif2hA2C5DpUVDbZd11VAMHVjuE5WyhBf5 bdvnfPeiYGpceWtgBQliZ17Vinyz981oFiKdzFv1KPADwb79lEp6lWp8TDS1FjG8knmi AIyWHOVmQ1jPlIjM2igmVivQMWqo7mIwiOo+dashbc+rzQGfxtthcKH1jRIFFZ38cbu/ MqXg== X-Gm-Message-State: ABy/qLYZ4vQUkLKLXQFKS/TxMqOA1bDH2P9vQFCYvYv5T3I4jDOCDEIv JHJt2WW6NABwwxLNCnA77T5d2Q== X-Google-Smtp-Source: APBJJlH1cWx4gMVcMxl6FnFBdryk+fIRvyMddabfNZ6Oiv3Gf66K+tZmCS8MES8xlyzAsEEBq6cK8w== X-Received: by 2002:a05:6808:3d3:b0:3a3:d4e9:7d18 with SMTP id o19-20020a05680803d300b003a3d4e97d18mr10935100oie.2.1688986353095; Mon, 10 Jul 2023 03:52:33 -0700 (PDT) Received: from [10.90.35.114] ([203.208.167.147]) by smtp.gmail.com with ESMTPSA id z2-20020a637e02000000b00553ad4ae5e5sm7065863pgc.22.2023.07.10.03.52.29 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 10 Jul 2023 03:52:32 -0700 (PDT) Message-ID: <96a737e5-5545-04ed-e533-e322ffe20fce@bytedance.com> Date: Mon, 10 Jul 2023 18:52:27 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 Subject: Re: [PATCH] mm: kfence: allocate kfence_metadata at runtime To: Marco Elver Cc: glider@google.com, dvyukov@google.com, akpm@linux-foundation.org, kasan-dev@googlegroups.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, muchun.song@linux.dev, Peng Zhang References: <20230710032714.26200-1-zhangpeng.00@bytedance.com> From: Peng Zhang In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 5B67B180017 X-Stat-Signature: h8f7473d3fqyk6ctybiika3zkbhtyn17 X-Rspam-User: X-HE-Tag: 1688986354-346559 X-HE-Meta: U2FsdGVkX1+REAEOILaArV49mbx/3XunaxUIqK7oWztQ9zsx1iVwyRdGFAnoFwSYeoxiD6+DaEOle/Px/XKT8EWuZ6556Z2WmUMkwVLIbG0Rb33iyeRaVZyq6wyQ621VZbjfaR4Uvl9igFK8A/OKL7mav1E1H3Pv4TwkINggUUCBVoHvQGpwph8gGJpU5CMNaIicL+9AloZUKRubO6wSm6PAbfEZ4CnSEtEyfx8ith/8We5n5niQP51oh1K1j89ZTeoUDsGiwezS5kAL9QlLr95Gl3coScBdeCnDcJhnxZ1WlG9IY5kCW4n4UIMs2O32ZF0lfx9ZNmEB9s+agGTu94lysdAJ7i2jJXoUL2vx/9id3gg3s7xsRpE4hNdg9S+nnOGQrcsQOTNrc5AsZp9HHF7k6MbFW+QQUEYg5NISxuda4dwlEpp3WzUi9aUiSclNC91tFcXArRGazbf77WlVYUtGuSzPcIYcgpDJSUAT3d4yToZ/SaPL5AmcBzg+ZxyupHWPoexraNiEfyWHAgfOnIfDtTivnkfCALe5p4wp+ip9gGs0n7kdnTWkDz5yGN+6YR+hzJycjOs4REzL9QRIPTs98FuT2BPFBQkWoEVXfaX080IJK+45XnK6O8t3BrktugXOWfnV1JZz5Mb/LX4b3cdB95bG97AeEfH54b455SH0TIlIP03YJ31iuiKTj5yClTVV2wpXYDvkKLTBQrymIwdBlEWZFblsm91YOocnPXjeT23yX+G9tZoLN52oCHsA5SuWXF5sILGuNmXXNZnhrYxKGvxC3JAwp7LaxzawDql13qlVqBJvtDGaS+9sIsiXta7r4w1UmOVPPyMvLEAV+QOowwKITu4Q18dhbYYq5FQgz6ZDQMAn2tlgvBCwfvjOUw4SW+3Gy4vvsG3LySTvc2ttjZ75QqJrNow6HgaK3AuZ3IUlGS6TsD/pzN/qpr6O4sqo9fEywulu8HQQiny eP0pLHXx PcdAEO2zOEKTFWZIadSDAnkq1scnqPdS+DYNDbr2+Nhpa0B6QogyJMAGdly2RnsjsUiqdInHpyexrTwg6Iiq/GOVUsjjzW+ZtXViLl4iIUK+tp1yy413z2Hsms9B4V0lKHSDpswLCWVseiz+zD4RQbn2Guo9aimqs1vowB2zZbcbiYDSnZTk/tmZo56+Qzx/CVeIuECUmCmgbKi4t70nsECO9aU1+wOFf3DEwXALEBsmziGBWx3FdL4LnT1zyNxYlL9i/EFAP8W6PocXpTboA0nTbPXOSt6eqGItbjvjjyKQf2NdKE2EkCcmHwBV93O18n7X65ogr2TYlexEeqbBHzPG3faApQd+SNhDask8co+6PSe6D1Vat9TVkWmw1Kp2dXXalkKqqsSJL5C31Zl6AxTQxcFLBdLMxJDvhJidT2FiG7UJ+L2w/Xy/dJN+zT1Bhu49Nii2MRvrIsCWi1jx0stsrkwVImq31pP3uMDhU3OGWyzC8e3j1pi124jbY4wABLuKow762Zkm61Ru+VLtmr7L9XQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: 在 2023/7/10 18:21, Marco Elver 写道: > On Mon, Jul 10, 2023 at 12:19PM +0200, Marco Elver wrote: >> On Mon, 10 Jul 2023 at 05:27, 'Peng Zhang' via kasan-dev >> wrote: >>> >>> kfence_metadata is currently a static array. For the purpose of >>> allocating scalable __kfence_pool, we first change it to runtime >>> allocation of metadata. Since the size of an object of kfence_metadata >>> is 1160 bytes, we can save at least 72 pages (with default 256 objects) >>> without enabling kfence. >>> >>> Below is the numbers obtained in qemu (with default 256 objects). >>> before: Memory: 8134692K/8388080K available (3668K bss) >>> after: Memory: 8136740K/8388080K available (1620K bss) >>> More than expected, it saves 2MB memory. >>> >>> Signed-off-by: Peng Zhang >> >> Seems like a reasonable optimization, but see comments below. >> >> Also with this patch applied on top of v6.5-rc1, KFENCE just doesn't >> init at all anymore (early init). Please fix. > > Forgot to attach .config -- attached config. > > All I see is: > > [ 0.303465] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies. > [ 0.304783] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8 > [ 0.316800] NR_IRQS: 4352, nr_irqs: 488, preallocated irqs: 16 > [ 0.318140] rcu: srcu_init: Setting srcu_struct sizes based on contention. > [ 0.320001] kfence: kfence_init failed > [ 0.326880] Console: colour VGA+ 80x25 > [ 0.327585] printk: console [ttyS0] enabled > [ 0.327585] printk: console [ttyS0] enabled > > around KFENCE initialization. Thanks for your review and testing, I'll take a look at the issues later. > >>> --- >>> mm/kfence/core.c | 102 ++++++++++++++++++++++++++++++++------------- >>> mm/kfence/kfence.h | 5 ++- >>> 2 files changed, 78 insertions(+), 29 deletions(-) >>> >>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c >>> index dad3c0eb70a0..b9fec1c46e3d 100644 >>> --- a/mm/kfence/core.c >>> +++ b/mm/kfence/core.c >>> @@ -116,7 +116,7 @@ EXPORT_SYMBOL(__kfence_pool); /* Export for test modules. */ >>> * backing pages (in __kfence_pool). >>> */ >>> static_assert(CONFIG_KFENCE_NUM_OBJECTS > 0); >>> -struct kfence_metadata kfence_metadata[CONFIG_KFENCE_NUM_OBJECTS]; >>> +struct kfence_metadata *kfence_metadata; >>> >>> /* Freelist with available objects. */ >>> static struct list_head kfence_freelist = LIST_HEAD_INIT(kfence_freelist); >>> @@ -643,13 +643,56 @@ static unsigned long kfence_init_pool(void) >>> return addr; >>> } >>> >>> +static int kfence_alloc_metadata(void) >>> +{ >>> + unsigned long nr_pages = KFENCE_METADATA_SIZE / PAGE_SIZE; >>> + >>> +#ifdef CONFIG_CONTIG_ALLOC >>> + struct page *pages; >>> + >>> + pages = alloc_contig_pages(nr_pages, GFP_KERNEL, first_online_node, >>> + NULL); >>> + if (pages) >>> + kfence_metadata = page_to_virt(pages); >>> +#else >>> + if (nr_pages > MAX_ORDER_NR_PAGES) { >>> + pr_warn("KFENCE_NUM_OBJECTS too large for buddy allocator\n"); >> >> Does this mean that KFENCE won't work at all if we can't allocate the >> metadata? I.e. it won't work either in early nor late init modes? >> >> I know we already have this limitation for _late init_ of the KFENCE pool. >> >> So I have one major question: when doing _early init_, what is the >> maximum size of the KFENCE pool (#objects) with this change? >> >>> + return -EINVAL; >>> + } >>> + kfence_metadata = alloc_pages_exact(KFENCE_METADATA_SIZE, >>> + GFP_KERNEL); >>> +#endif >>> + >>> + if (!kfence_metadata) >>> + return -ENOMEM; >>> + >>> + memset(kfence_metadata, 0, KFENCE_METADATA_SIZE); >> >> memzero_explicit, or pass __GFP_ZERO to alloc_pages? >> >>> + return 0; >>> +} >>> + >>> +static void kfence_free_metadata(void) >>> +{ >>> + if (WARN_ON(!kfence_metadata)) >>> + return; >>> +#ifdef CONFIG_CONTIG_ALLOC >>> + free_contig_range(page_to_pfn(virt_to_page((void *)kfence_metadata)), >>> + KFENCE_METADATA_SIZE / PAGE_SIZE); >>> +#else >>> + free_pages_exact((void *)kfence_metadata, KFENCE_METADATA_SIZE); >>> +#endif >>> + kfence_metadata = NULL; >>> +} >>> + >>> static bool __init kfence_init_pool_early(void) >>> { >>> - unsigned long addr; >>> + unsigned long addr = (unsigned long)__kfence_pool; >>> >>> if (!__kfence_pool) >>> return false; >>> >>> + if (!kfence_alloc_metadata()) >>> + goto free_pool; >>> + >>> addr = kfence_init_pool(); >>> >>> if (!addr) { >>> @@ -663,6 +706,7 @@ static bool __init kfence_init_pool_early(void) >>> return true; >>> } >>> >>> + kfence_free_metadata(); >>> /* >>> * Only release unprotected pages, and do not try to go back and change >>> * page attributes due to risk of failing to do so as well. If changing >>> @@ -670,31 +714,12 @@ static bool __init kfence_init_pool_early(void) >>> * fails for the first page, and therefore expect addr==__kfence_pool in >>> * most failure cases. >>> */ >>> +free_pool: >>> memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool)); >>> __kfence_pool = NULL; >>> return false; >>> } >>> >>> -static bool kfence_init_pool_late(void) >>> -{ >>> - unsigned long addr, free_size; >>> - >>> - addr = kfence_init_pool(); >>> - >>> - if (!addr) >>> - return true; >>> - >>> - /* Same as above. */ >>> - free_size = KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool); >>> -#ifdef CONFIG_CONTIG_ALLOC >>> - free_contig_range(page_to_pfn(virt_to_page((void *)addr)), free_size / PAGE_SIZE); >>> -#else >>> - free_pages_exact((void *)addr, free_size); >>> -#endif >>> - __kfence_pool = NULL; >>> - return false; >>> -} >>> - >>> /* === DebugFS Interface ==================================================== */ >>> >>> static int stats_show(struct seq_file *seq, void *v) >>> @@ -896,6 +921,10 @@ void __init kfence_init(void) >>> static int kfence_init_late(void) >>> { >>> const unsigned long nr_pages = KFENCE_POOL_SIZE / PAGE_SIZE; >>> + unsigned long addr = (unsigned long)__kfence_pool; >>> + unsigned long free_size = KFENCE_POOL_SIZE; >>> + int ret; >>> + >>> #ifdef CONFIG_CONTIG_ALLOC >>> struct page *pages; >>> >>> @@ -913,15 +942,29 @@ static int kfence_init_late(void) >>> return -ENOMEM; >>> #endif >>> >>> - if (!kfence_init_pool_late()) { >>> - pr_err("%s failed\n", __func__); >>> - return -EBUSY; >>> + ret = kfence_alloc_metadata(); >>> + if (!ret) >>> + goto free_pool; >>> + >>> + addr = kfence_init_pool(); >>> + if (!addr) { >>> + kfence_init_enable(); >>> + kfence_debugfs_init(); >>> + return 0; >>> } >>> >>> - kfence_init_enable(); >>> - kfence_debugfs_init(); >>> + pr_err("%s failed\n", __func__); >>> + kfence_free_metadata(); >>> + free_size = KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool); >>> + ret = -EBUSY; >>> >>> - return 0; >>> +free_pool: >>> +#ifdef CONFIG_CONTIG_ALLOC >>> + free_contig_range(page_to_pfn(virt_to_page((void *)addr)), free_size / PAGE_SIZE); >>> +#else >>> + free_pages_exact((void *)addr, free_size); >>> +#endif >> >> You moved this from kfence_init_pool_late - that did "__kfence_pool = >> NULL" which is missing now.