Message-ID: <9ce8a3a3-8305-31a4-a097-3719861c234e@redhat.com>
Date: Thu, 6 Oct 2022 18:12:33 +0200
Subject: Re: KASAN-related VMAP allocation errors in debug kernels with many logical CPUS
To: Uladzislau Rezki
Cc: Alexander Potapenko, Andrey Konovalov, "linux-mm@kvack.org", Andrey Ryabinin, Dmitry Vyukov, Vincenzo Frascino, kasan-dev@googlegroups.com
References: <8aaaeec8-14a1-cdc4-4c77-4878f4979f3e@redhat.com>
From: David Hildenbrand
Organization: Red Hat

On 06.10.22 17:35, Uladzislau Rezki wrote:
>> Hi,
>>
>> we're currently hitting a weird vmap issue in debug kernels with KASAN enabled
>> on fairly large VMs. I reproduced it on v5.19 (did not get the chance to
>> try 6.0 yet because I don't have access to the machine right now, but
>> I suspect it persists).
>>
>> It seems to trigger when udev probes a massive amount of devices in parallel
>> while the system is booting up. Once the system booted, I no longer see any
>> such issues.
>>
>>
>> [ 165.818200] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.836622] vmap allocation for size 315392 failed: use vmalloc= to increase size
>> [ 165.837461] vmap allocation for size 315392 failed: use vmalloc= to increase size
>> [ 165.840573] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.841059] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.841428] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.841819] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.842123] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.843359] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.844894] vmap allocation for size 2498560 failed: use vmalloc= to increase size
>> [ 165.847028] CPU: 253 PID: 4995 Comm: systemd-udevd Not tainted 5.19.0 #2
>> [ 165.935689] Hardware name: Lenovo ThinkSystem SR950 -[7X12ABC1WW]-/-[7X12ABC1WW]-, BIOS -[PSE130O-1.81]- 05/20/2020
>> [ 165.947343] Call Trace:
>> [ 165.950075]
>> [ 165.952425] dump_stack_lvl+0x57/0x81
>> [ 165.956532] warn_alloc.cold+0x95/0x18a
>> [ 165.960836] ? zone_watermark_ok_safe+0x240/0x240
>> [ 165.966100] ? slab_free_freelist_hook+0x11d/0x1d0
>> [ 165.971461] ? __get_vm_area_node+0x2af/0x360
>> [ 165.976341] ? __get_vm_area_node+0x2af/0x360
>> [ 165.981219] __vmalloc_node_range+0x291/0x560
>> [ 165.986087] ? __mutex_unlock_slowpath+0x161/0x5e0
>> [ 165.991447] ? move_module+0x4c/0x630
>> [ 165.995547] ? vfree_atomic+0xa0/0xa0
>> [ 165.999647] ? move_module+0x4c/0x630
>> [ 166.003741] module_alloc+0xe7/0x170
>> [ 166.007747] ? move_module+0x4c/0x630
>> [ 166.011840] move_module+0x4c/0x630
>> [ 166.015751] layout_and_allocate+0x32c/0x560
>> [ 166.020519] load_module+0x8e0/0x25c0
>>
> Can it be that we do not have enough "module section" size? I mean the
> section size, which is MODULES_END - MODULES_VADDR is rather small so
> some modules are not loaded due to no space.
>
> CONFIG_RANDOMIZE_BASE also creates some offset overhead if enabled on
> your box. But it looks it is rather negligible.

Right, I suspected both points -- but was fairly confused why the number
of CPUs would matter.

What would make sense is that, if we're tight on module vmap space, the
race I think could happen (purging only once and then failing) could
become relevant.

>
> Maybe try to increase the module-section size to see if it solves the
> problem.

What would be the easiest way to do that?

Thanks!

-- 
Thanks,

David / dhildenb
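
For a rough feel of how much room "MODULES_END - MODULES_VADDR" leaves, here
is a minimal back-of-the-envelope sketch. It assumes x86-64 with 4 KiB pages
and the constants as they appear, to the best of my knowledge, in the v5.19
headers (__START_KERNEL_map, KERNEL_IMAGE_SIZE depending on
CONFIG_RANDOMIZE_BASE, MODULES_END). The helper names are made up for
illustration, and the estimate ignores guard pages, alignment, fragmentation
and any KASAN bookkeeping, so treat it as an upper bound only.

```python
# Sketch only: constants are assumed from the x86-64 headers around v5.19.
START_KERNEL_MAP = 0xFFFF_FFFF_8000_0000  # __START_KERNEL_map (assumed)
MODULES_END = 0xFFFF_FFFF_FF00_0000       # MODULES_END (assumed)
PAGE_SIZE = 4096                          # 4 KiB pages (assumed)
MIB = 1 << 20


def modules_len(randomize_base: bool) -> int:
    """Size in bytes of the module area, MODULES_END - MODULES_VADDR."""
    # KERNEL_IMAGE_SIZE is assumed to be 1 GiB with CONFIG_RANDOMIZE_BASE
    # and 512 MiB without it.
    kernel_image_size = (1024 if randomize_base else 512) * MIB
    modules_vaddr = START_KERNEL_MAP + kernel_image_size
    return MODULES_END - modules_vaddr


def size_in_pages(size: int) -> int:
    """Number of pages needed for an allocation of `size` bytes."""
    return -(-size // PAGE_SIZE)  # ceiling division


if __name__ == "__main__":
    # The two allocation sizes reported as failing in the log above.
    failed_sizes = (2498560, 315392)
    for kaslr in (False, True):
        area = modules_len(kaslr)
        print(f"CONFIG_RANDOMIZE_BASE={kaslr}: module area {area // MIB} MiB")
        for size in failed_sizes:
            # Naive upper bound: how many such allocations fit back to back.
            print(f"  size {size} ({size_in_pages(size)} pages): "
                  f"at most {area // size} fit")
```

If these constants are right, the area is on the order of 1 GiB either way,
so a few hundred allocations of the larger size would already fill it.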