All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Gomez <da.gomez@kernel.org>
To: Petr Pavlu <petr.pavlu@suse.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Sami Tolvanen <samitolvanen@google.com>,
	Daniel Gomez <da.gomez@samsung.com>
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] module: Fix memory deallocation on error path in move_module()
Date: Mon, 30 Jun 2025 14:09:18 +0200	[thread overview]
Message-ID: <bcfc588e-71de-4715-bcdb-754afaba3b50@kernel.org> (raw)
In-Reply-To: <20250618122730.51324-2-petr.pavlu@suse.com>

On 18/06/2025 14.26, Petr Pavlu wrote:
> The function move_module() uses the variable t to track how many memory
> types it has allocated and consequently how many should be freed if an
> error occurs.
> 
> The variable is initially set to 0 and is updated when a call to
> module_memory_alloc() fails. However, move_module() can fail for other
> reasons as well, in which case t remains set to 0 and no memory is freed.
> 
> Fix the problem by initializing t to MOD_MEM_NUM_TYPES. Additionally, make
> the deallocation loop more robust by not relying on the mod_mem_type_t enum
> having a signed integer as its underlying type.

This is a follow up based on v1 discussion [1] regarding a way to reproduce
the leak.

[1]
https://lore.kernel.org/linux-modules/ae967353-71fa-4438-a84b-8f7e2815f485@kernel.org/

I tried using eBPF and moderr [2][3] to find a way to reproduce this. I run into
some issues but I applied some workarounds. If this is useful we can check at
how to solve this properly.

[2] moderr RFC:
https://lore.kernel.org/linux-modules/20250122-modules-error-injection-v1-0-910590a04fd5@samsung.com/

[3] updated moderr branch:
https://git.kernel.org/pub/scm/linux/kernel/git/da.gomez/linux.git/?h=b4%2Fmodules-error-injection

The logs below show an A/B test memory consumption comparison when a module
is loaded and leaked (A) because of the issue described in the commit log
vs when is correctly loaded/unloaded (B). Besides observing a huge leak,
kernel logs (dmesg) show the message "execmem: unable to allocate memory".
Note that we can't use kmemleak as we skip it with kmemleak_not_leak() in
module_memory_alloc(), except for is_rox memory.

Memory lost:
A test: 1220444 kB
B test: 892 kB

Thoughts?


Workflow to reproduce the issue (before applying this patchset):

1. Run moderr to return error in codetag_needs_module_section()

sudo moderr \
  -modname=brd
  --modfunc=module_codetag_needs_module_section \
  --error=1

Note: We can also force codetag_alloc_module_section() with:

sudo moderr \
--modname=brd \
--modfunc=module_codetag_alloc_module_section \
--error=-22
 
2. Run a loop to probe a module with:

while true; do modprobe brd; done

3. Monitor memory before/after and check delta:

Note: this output it's a wrapper around the above loop.

[*] Dropping caches...
[*] Wait...
[*] Starting memory: 6758564 kB
[*] Loop 200s...
[*] Wait...
[*] Dropping caches...
[*] Wait...
[*] Ending memory: 5538120 kB
[*] Memory lost: 1220444 kB
--- meminfo_before.txt  2025-06-30 10:29:49.880661145 +0000
+++ meminfo_after.txt   2025-06-30 10:33:20.091501671 +0000
@@ -1,38 +1,38 @@
 MemTotal:        8133036 kB
-MemFree:         6858820 kB
-MemAvailable:    6758652 kB
-Buffers:            2576 kB
-Cached:            73164 kB
+MemFree:         5620596 kB
+MemAvailable:    5538204 kB
+Buffers:            2552 kB
+Cached:           107256 kB
 SwapCached:            0 kB
-Active:            25448 kB
-Inactive:          95272 kB
-Active(anon):        176 kB
-Inactive(anon):    45684 kB
-Active(file):      25272 kB
-Inactive(file):    49588 kB
+Active:            67688 kB
+Inactive:          95284 kB
+Active(anon):        184 kB
+Inactive(anon):    54028 kB
+Active(file):      67504 kB
+Inactive(file):    41256 kB
 Unevictable:           0 kB
 Mlocked:               0 kB
 SwapTotal:             0 kB
 SwapFree:              0 kB
-Dirty:               468 kB
+Dirty:              2804 kB
 Writeback:             0 kB
-AnonPages:         45192 kB
-Mapped:            72028 kB
-Shmem:               916 kB
-KReclaimable:       5816 kB
-Slab:              42336 kB
-SReclaimable:       5816 kB
-SUnreclaim:        36520 kB
-KernelStack:        1984 kB
-PageTables:         2652 kB
+AnonPages:         53556 kB
+Mapped:           105784 kB
+Shmem:               936 kB
+KReclaimable:       7468 kB
+Slab:             176080 kB
+SReclaimable:       7468 kB
+SUnreclaim:       168612 kB
+KernelStack:        2016 kB
+PageTables:         5472 kB
 SecPageTables:         0 kB
 NFS_Unstable:          0 kB
 Bounce:                0 kB
 WritebackTmp:          0 kB
 CommitLimit:     3542228 kB
-Committed_AS:      89960 kB
+Committed_AS:      92956 kB
 VmallocTotal:   34359738367 kB
-VmallocUsed:        5112 kB
+VmallocUsed:     1041252 kB
 VmallocChunk:          0 kB
 Percpu:             1712 kB
 AnonHugePages:      4096 kB
@@ -49,6 +49,6 @@
 HugePages_Surp:        0
 Hugepagesize:       2048 kB
 Hugetlb:         1048576 kB
-DirectMap4k:       30548 kB
-DirectMap2M:     3115008 kB
-DirectMap1G:     7340032 kB
+DirectMap4k:      730964 kB
+DirectMap2M:     3463168 kB
+DirectMap1G:     6291456 kB

4. Check kernel logs

[   30.099667] tee (618): drop_caches: 3
[  221.695667] execmem: unable to allocate memory
{...}
[  226.701772] execmem_vmalloc: 1110 callbacks suppressed
{...}
[  231.705976] execmem_vmalloc: 1099 callbacks suppressed
{...}
[  231.742226] execmem: unable to allocate memory
[  231.747107] execmem: unable to allocate memory
[  240.109107] tee (92935): drop_caches: 3


Now, I repeated the test without error injection:

Loop:
while true; do modprobe brd && rmmod brd; done


[*] Dropping caches...
[*] Wait...
[*] Starting memory: 6747960 kB
Loop 100s...
[*] Wait...
[*] Dropping caches...
[*] Wait...
[*] Ending memory: 6747068 kB
[*] Memory lost: 892 kB
--- meminfo_before.txt  2025-06-30 11:21:44.922597662 +0000
+++ meminfo_after.txt   2025-06-30 11:23:35.023419139 +0000
@@ -1,38 +1,38 @@
 MemTotal:        8133036 kB
-MemFree:         6853796 kB
-MemAvailable:    6747960 kB
-Buffers:            2920 kB
-Cached:            61976 kB
+MemFree:         6852956 kB
+MemAvailable:    6747244 kB
+Buffers:            2540 kB
+Cached:            62124 kB
 SwapCached:            0 kB
-Active:            25968 kB
-Inactive:          83616 kB
-Active(anon):        176 kB
-Inactive(anon):    45744 kB
-Active(file):      25792 kB
-Inactive(file):    37872 kB
+Active:            26744 kB
+Inactive:          83120 kB
+Active(anon):        172 kB
+Inactive(anon):    46048 kB
+Active(file):      26572 kB
+Inactive(file):    37072 kB
 Unevictable:           0 kB
 Mlocked:               0 kB
 SwapTotal:             0 kB
 SwapFree:              0 kB
-Dirty:               996 kB
+Dirty:               452 kB
 Writeback:             0 kB
-AnonPages:         45116 kB
-Mapped:            54632 kB
-Shmem:               928 kB
-KReclaimable:       5680 kB
-Slab:              42392 kB
-SReclaimable:       5680 kB
-SUnreclaim:        36712 kB
-KernelStack:        1888 kB
-PageTables:         2468 kB
+AnonPages:         45372 kB
+Mapped:            59120 kB
+Shmem:               992 kB
+KReclaimable:       5940 kB
+Slab:              43056 kB
+SReclaimable:       5940 kB
+SUnreclaim:        37116 kB
+KernelStack:        1904 kB
+PageTables:         2460 kB
 SecPageTables:         0 kB
 NFS_Unstable:          0 kB
 Bounce:                0 kB
 WritebackTmp:          0 kB
 CommitLimit:     3542228 kB
-Committed_AS:      81484 kB
+Committed_AS:      81680 kB
 VmallocTotal:   34359738367 kB
-VmallocUsed:        5056 kB
+VmallocUsed:        7112 kB
 VmallocChunk:          0 kB
 Percpu:             1632 kB
 AnonHugePages:      2048 kB
@@ -49,6 +49,6 @@
 HugePages_Surp:        0
 Hugepagesize:       2048 kB
 Hugetlb:         1048576 kB
-DirectMap4k:       28500 kB
-DirectMap2M:     3117056 kB
-DirectMap1G:     7340032 kB
+DirectMap4k:      274260 kB
+DirectMap2M:     3919872 kB
+DirectMap1G:     6291456 kB

  parent reply	other threads:[~2025-06-30 12:09 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-18 12:26 [PATCH v2 0/2] module: Fix memory deallocation on error path in move_module() Petr Pavlu
2025-06-18 12:26 ` [PATCH v2 1/2] " Petr Pavlu
2025-06-18 14:02   ` Daniel Gomez
2025-06-30 12:09   ` Daniel Gomez [this message]
2025-06-18 12:26 ` [PATCH v2 2/2] module: Avoid unnecessary return value initialization " Petr Pavlu
2025-06-18 14:02   ` Daniel Gomez
2025-06-30 13:57 ` [PATCH v2 0/2] module: Fix memory deallocation on error path " Daniel Gomez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bcfc588e-71de-4715-bcdb-754afaba3b50@kernel.org \
    --to=da.gomez@kernel.org \
    --cc=da.gomez@samsung.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=petr.pavlu@suse.com \
    --cc=samitolvanen@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.