linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	kasan-dev@googlegroups.com
Subject: Re: KASAN-related VMAP allocation errors in debug kernels with many logical CPUS
Date: Thu, 6 Oct 2022 18:12:33 +0200
Message-ID: <9ce8a3a3-8305-31a4-a097-3719861c234e@redhat.com>
In-Reply-To: <Yz711WzMS+lG7Zlw@pc636>

On 06.10.22 17:35, Uladzislau Rezki wrote:
>> Hi,
>>
>> we're currently hitting a weird vmap issue in debug kernels with KASAN enabled
>> on fairly large VMs. I reproduced it on v5.19 (did not get the chance to
>> try 6.0 yet because I don't have access to the machine right now, but
>> I suspect it persists).
>>
>> It seems to trigger when udev probes a massive number of devices in parallel
>> while the system is booting up. Once the system has booted, I no longer see
>> any such issues.
>>
>>
>> [  165.818200] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.836622] vmap allocation for size 315392 failed: use vmalloc=<size> to increase size
>> [  165.837461] vmap allocation for size 315392 failed: use vmalloc=<size> to increase size
>> [  165.840573] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.841059] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.841428] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.841819] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.842123] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.843359] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.844894] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [  165.847028] CPU: 253 PID: 4995 Comm: systemd-udevd Not tainted 5.19.0 #2
>> [  165.935689] Hardware name: Lenovo ThinkSystem SR950 -[7X12ABC1WW]-/-[7X12ABC1WW]-, BIOS -[PSE130O-1.81]- 05/20/2020
>> [  165.947343] Call Trace:
>> [  165.950075]  <TASK>
>> [  165.952425]  dump_stack_lvl+0x57/0x81
>> [  165.956532]  warn_alloc.cold+0x95/0x18a
>> [  165.960836]  ? zone_watermark_ok_safe+0x240/0x240
>> [  165.966100]  ? slab_free_freelist_hook+0x11d/0x1d0
>> [  165.971461]  ? __get_vm_area_node+0x2af/0x360
>> [  165.976341]  ? __get_vm_area_node+0x2af/0x360
>> [  165.981219]  __vmalloc_node_range+0x291/0x560
>> [  165.986087]  ? __mutex_unlock_slowpath+0x161/0x5e0
>> [  165.991447]  ? move_module+0x4c/0x630
>> [  165.995547]  ? vfree_atomic+0xa0/0xa0
>> [  165.999647]  ? move_module+0x4c/0x630
>> [  166.003741]  module_alloc+0xe7/0x170
>> [  166.007747]  ? move_module+0x4c/0x630
>> [  166.011840]  move_module+0x4c/0x630
>> [  166.015751]  layout_and_allocate+0x32c/0x560
>> [  166.020519]  load_module+0x8e0/0x25c0
>>
> Could it be that the "module section" is too small? I mean the section
> size, MODULES_END - MODULES_VADDR, is rather small, so some modules may
> fail to load for lack of space.
>
> CONFIG_RANDOMIZE_BASE also creates some offset overhead if enabled on
> your box. But it looks like it is rather negligible.

Right, I suspected both points -- but was fairly confused about why the
number of CPUs would matter.
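
For reference, the size of the module section can be estimated with a
back-of-the-envelope calculation. The sketch below is only that: it assumes
the x86-64 constants as I understand them from the v5.19-era headers
(__START_KERNEL_map, KERNEL_IMAGE_SIZE of 512 MiB or 1 GiB depending on
CONFIG_RANDOMIZE_BASE, and MODULES_END without
CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP). The helper names are made up for
illustration, and guard pages, alignment and fragmentation are ignored, so
the allocation count is an upper bound at best.

```python
# Rough estimate of the x86-64 module mapping area (MODULES_END - MODULES_VADDR).
# ASSUMPTION: constants mirror the v5.19-era x86-64 headers; verify against
# your own tree before relying on them.

MIB = 1 << 20

START_KERNEL_MAP = 0xFFFFFFFF80000000  # __START_KERNEL_map (assumed)
MODULES_END = 0xFFFFFFFFFF000000       # assumed, no DEBUG_KMAP_LOCAL_FORCE_MAP


def kernel_image_size(randomize_base: bool) -> int:
    """KERNEL_IMAGE_SIZE: 1 GiB with CONFIG_RANDOMIZE_BASE, else 512 MiB."""
    return (1024 if randomize_base else 512) * MIB


def module_area_size(randomize_base: bool) -> int:
    """MODULES_END - MODULES_VADDR, in bytes."""
    modules_vaddr = START_KERNEL_MAP + kernel_image_size(randomize_base)
    return MODULES_END - modules_vaddr


def max_allocations(area: int, alloc_size: int) -> int:
    """Upper bound on same-sized allocations that fit, ignoring guard pages."""
    return area // alloc_size


if __name__ == "__main__":
    failing_size = 2498560  # size taken from the log in this thread
    for kaslr in (False, True):
        area = module_area_size(kaslr)
        print(f"RANDOMIZE_BASE={kaslr}: {area // MIB} MiB, "
              f"at most {max_allocations(area, failing_size)} "
              f"allocations of {failing_size} bytes")
```

This only gives the address-space budget; it does not by itself explain why
the number of CPUs would matter.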

What would make sense is that, if we're tight on module vmap space, the
race I suspect can happen (purging only once and then failing) could
become relevant.
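
One possible reading of that race, as a toy model: an allocator that, on
failure, purges lazily freed space exactly once, retries once, and then
gives up. If another CPU grabs the purged space between the purge and the
retry, the first caller fails even though space was briefly available. The
class and function names below are hypothetical and this is not the kernel
code; the interleaving is forced through a callback to keep the example
deterministic.

```python
# Toy model of "purge once, retry once, then fail". NOT kernel code:
# ToyVmapSpace and alloc_purge_once are hypothetical names for illustration.
from typing import Callable, Optional


class ToyVmapSpace:
    """Address space tracked as two counters: free bytes and lazily freed bytes."""

    def __init__(self, total: int) -> None:
        self.free = total
        self.lazy = 0  # released by users, but not reusable until purged

    def alloc(self, size: int) -> bool:
        if size > self.free:
            return False
        self.free -= size
        return True

    def free_lazy(self, size: int) -> None:
        self.lazy += size

    def purge(self) -> None:
        self.free += self.lazy
        self.lazy = 0


def alloc_purge_once(space: ToyVmapSpace, size: int,
                     other_cpu: Optional[Callable[[], object]] = None) -> bool:
    """Try to allocate; on failure purge once, retry once, then give up."""
    if space.alloc(size):
        return True
    space.purge()
    if other_cpu is not None:
        other_cpu()  # models another CPU running between purge and retry
    return space.alloc(size)


def tight_space() -> ToyVmapSpace:
    """A space where everything is in use and then lazily freed."""
    space = ToyVmapSpace(total=4)
    space.alloc(4)
    space.free_lazy(4)
    return space


if __name__ == "__main__":
    quiet = tight_space()
    print("no contention:", alloc_purge_once(quiet, 4))

    contended = tight_space()
    print("with contention:",
          alloc_purge_once(contended, 4, lambda: contended.alloc(4)))
```

In this model, more CPUs allocating in parallel simply means more chances
for the callback case to occur; whether that matches what happens on the
real system is an open question here.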

> 
> Maybe try to increase the module-section size to see if it solves the
> problem.

What would be the easiest way to do that?

Thanks!

-- 
Thanks,

David / dhildenb
