From: David Hildenbrand <david@redhat.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Alexander Potapenko <glider@google.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Andrey Ryabinin <ryabinin.a.a@gmail.com>,
Dmitry Vyukov <dvyukov@google.com>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
kasan-dev@googlegroups.com
Subject: Re: KASAN-related VMAP allocation errors in debug kernels with many logical CPUS
Date: Tue, 11 Oct 2022 21:52:40 +0200 [thread overview]
Message-ID: <478c93f5-3f06-e426-9266-2c043c3658da@redhat.com> (raw)
In-Reply-To: <Y0QNt5zAvrJwfFk2@pc636>
On 10.10.22 14:19, Uladzislau Rezki wrote:
> On Mon, Oct 10, 2022 at 08:56:55AM +0200, David Hildenbrand wrote:
>>>>> Maybe try to increase the module-section size to see if it solves the
>>>>> problem.
>>>>
>>>> What would be the easiest way to do that?
>>>>
>>> Sorry for late answer. I was trying to reproduce it on my box. What i
>>> did was trying to load all modules in my system with KASAN_INLINE option:
>>>
>>
>> Thanks!
>>
>>> <snip>
>>> #!/bin/bash
>>>
>>> # Exclude test_vmalloc.ko
>>> MODULES_LIST=(`find /lib/modules/$(uname -r) -type f \
>>> \( -iname "*.ko" -not -iname "test_vmalloc*" \) | awk -F"/" '{print $NF}' | sed 's/.ko//'`)
>>>
>>> function moduleExist(){
>>> MODULE="$1"
>>> if lsmod | grep "$MODULE" &> /dev/null ; then
>>> return 0
>>> else
>>> return 1
>>> fi
>>> }
>>>
>>> i=0
>>>
>>> for module_name in ${MODULES_LIST[@]}; do
>>> sudo modprobe $module_name
>>>
>>> if moduleExist ${module_name}; then
>>> ((i=i+1))
>>> echo "Successfully loaded $module_name counter $i"
>>> fi
>>> done
>>> <snip>
>>>
>>> as you wrote it looks like it is not easy to reproduce. So i do not see
>>> any vmap related errors.
>>
>> Yeah, it's quite mystery and only seems to trigger on these systems with a
>> lot of CPUs.
>>
>>>
>>> Returning back to the question. I think you could increase the MODULES_END
>>> address and shift the FIXADDR_START little forward. See the dump_pagetables.c
>>> But it might be they are pretty compact and located in the end. So i am not
>>> sure if there is a room there.
>>
>> That's what I was afraid of :)
>>
>>>
>>> Second. It would be good to understand if vmap only fails on allocating for a
>>> module:
>>>
>>> <snip>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index dd6cdb201195..53026fdda224 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -1614,6 +1614,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>>> va->va_end = addr + size;
>>> va->vm = NULL;
>>> + trace_printk("-> alloc %lu size, align: %lu, vstart: %lu, vend: %lu\n", size, align, vstart, vend);
>>> +
>>> spin_lock(&vmap_area_lock);
>>> <snip>
>>
>> I'll try grabbing a suitable system again and add some more debugging
>> output. Might take a while, unfortunately.
>>
> Yes that makes sense. Especially to understand if it fails on the MODULES_VADDR
> - MODULES_END range or somewhere else. According to your trace output it looks
> like that but it would be good to confirm it by adding some traces.
>
> BTW, vmap code is lack of good trace events. Probably it is worth to add
> some basic ones.
Was lucky to grab that system again. Compiled a custom 6.0 kernel, whereby I printk all vmap allocation errors, including the range similarly to what you suggested above (but printk only on the failure path).
So these are the failing allocations:
# dmesg | grep " -> alloc"
[ 168.862511] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.863020] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.863841] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.864562] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.864646] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.865688] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.865718] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.866098] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.866551] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.866752] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.867147] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.867210] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.867312] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.867650] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.867767] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.867815] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.867815] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.868059] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.868463] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.868822] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.868919] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.869843] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.869854] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.870174] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.870611] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.870806] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.870982] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 168.879000] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.449101] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.449834] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.450667] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.451539] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.452326] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.453239] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.454052] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.454697] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.454811] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.455575] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.455754] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.461450] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.805223] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.805507] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.929577] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.930389] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.931244] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.932035] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.932796] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.933592] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.934470] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.935344] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 169.970641] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.191600] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.191875] -> alloc 40960 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.241901] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.242708] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.243465] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.244211] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.245060] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.245868] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.246433] -> alloc 40960 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.246657] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.247451] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.248226] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.248902] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.249704] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.250497] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.251244] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.252076] -> alloc 319488 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.587168] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 170.598995] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 171.865721] -> alloc 2506752 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
[ 172.138557] -> alloc 917504 size, align: 4096, vstart: 18446744072639352832, vend: 18446744073692774400
Really looks like only module vmap space. ~ 1 GiB of vmap module space ...
I did try:
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index dd6cdb201195..199154a2228a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -72,6 +72,8 @@ early_param("nohugevmalloc", set_nohugevmalloc);
static const bool vmap_allow_huge = false;
#endif /* CONFIG_HAVE_ARCH_HUGE_VMALLOC */
+static atomic_long_t vmap_lazy_nr = ATOMIC_LONG_INIT(0);
+
bool is_vmalloc_addr(const void *x)
{
unsigned long addr = (unsigned long)kasan_reset_tag(x);
@@ -1574,7 +1576,6 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
struct vmap_area *va;
unsigned long freed;
unsigned long addr;
- int purged = 0;
int ret;
BUG_ON(!size);
@@ -1631,23 +1632,22 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
return va;
overflow:
- if (!purged) {
+ if (atomic_long_read(&vmap_lazy_nr)) {
purge_vmap_area_lazy();
- purged = 1;
goto retry;
}
freed = 0;
blocking_notifier_call_chain(&vmap_notify_list, 0, &freed);
- if (freed > 0) {
- purged = 0;
+ if (freed > 0)
goto retry;
- }
- if (!(gfp_mask & __GFP_NOWARN) && printk_ratelimit())
+ if (!(gfp_mask & __GFP_NOWARN)) {
pr_warn("vmap allocation for size %lu failed: use vmalloc=<size> to increase size\n",
size);
+ printk("-> alloc %lu size, align: %lu, vstart: %lu, vend: %lu\n", size, align, vstart, vend);
+ }
kmem_cache_free(vmap_area_cachep, va);
return ERR_PTR(-EBUSY);
@@ -1690,8 +1690,6 @@ static unsigned long lazy_max_pages(void)
return log * (32UL * 1024 * 1024 / PAGE_SIZE);
}
-static atomic_long_t vmap_lazy_nr = ATOMIC_LONG_INIT(0);
-
But that didn't help at all. That system is crazy:
# lspci | wc -l
1117
What I find interesting is that we have these recurring allocations of similar sizes failing.
I wonder if user space is capable of loading the same kernel module concurrently to
trigger a massive amount of allocations, and module loading code only figures out
later that it has already been loaded and backs off.
My best guess would be that module loading is serialized completely, but for some reason,
something seems to go wrong with a lot of concurrency ...
--
Thanks,
David / dhildenb
next prev parent reply other threads:[~2022-10-11 19:52 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-06 13:46 KASAN-related VMAP allocation errors in debug kernels with many logical CPUS David Hildenbrand
2022-10-06 15:35 ` Uladzislau Rezki
2022-10-06 16:12 ` David Hildenbrand
2022-10-07 15:34 ` Uladzislau Rezki
2022-10-10 6:56 ` David Hildenbrand
2022-10-10 12:19 ` Uladzislau Rezki
2022-10-11 19:52 ` David Hildenbrand [this message]
2022-10-12 16:36 ` Uladzislau Rezki
2022-10-13 16:21 ` David Hildenbrand
2022-10-15 9:23 ` Uladzislau Rezki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=478c93f5-3f06-e426-9266-2c043c3658da@redhat.com \
--to=david@redhat.com \
--cc=andreyknvl@gmail.com \
--cc=dvyukov@google.com \
--cc=glider@google.com \
--cc=kasan-dev@googlegroups.com \
--cc=linux-mm@kvack.org \
--cc=ryabinin.a.a@gmail.com \
--cc=urezki@gmail.com \
--cc=vincenzo.frascino@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).