* [PATCH] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled
@ 2024-12-10 4:15 Hao Ge
2024-12-10 6:53 ` [PATCH v2] " Hao Ge
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-10 4:15 UTC (permalink / raw)
To: surenb, kent.overstreet, akpm
Cc: linux-mm, linux-kernel, hao.ge, Hao Ge, Ben Greear
From: Hao Ge <gehao@kylinos.cn>
When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC is not, we may
encounter a panic during system boot.

This happens because no pages have been allocated, and no mappings
created, for the shadow memory that corresponds to the module_tags
region, as is done for execmem_vmalloc.

The difference is that module_tags pages are allocated on demand, so
the shadow memory for them must be allocated on demand as well, while
still respecting MODULE_ALIGN.

Here is the panic log:

[ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
[ 18.350016] #PF: supervisor read access in kernel mode
[ 18.350459] #PF: error_code(0x0000) - not-present page
[ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
[ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
[ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
[ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
[ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
[ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
[ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
[ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
[ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
[ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
[ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
[ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
[ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.362020] PKRU: 55555554
[ 18.362261] Call Trace:
[ 18.362481] <TASK>
[ 18.362671] ? __die+0x23/0x70
[ 18.362964] ? page_fault_oops+0xc2/0x160
[ 18.363318] ? exc_page_fault+0xad/0xc0
[ 18.363680] ? asm_exc_page_fault+0x26/0x30
[ 18.364056] ? move_module+0x3cc/0x8a0
[ 18.364398] ? kasan_check_range+0xba/0x1b0
[ 18.364755] __asan_memcpy+0x3c/0x60
[ 18.365074] move_module+0x3cc/0x8a0
[ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
[ 18.365841] ? early_mod_check+0x3dc/0x510
[ 18.366195] load_module+0x72/0x1850
[ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
[ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
[ 18.367262] init_module_from_file+0xd1/0x130
[ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
[ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
[ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
[ 18.368938] idempotent_init_module+0x22c/0x790
[ 18.369332] ? simple_getattr+0x6f/0x120
[ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
[ 18.370110] ? fdget+0x58/0x3a0
[ 18.370393] ? security_capable+0x64/0xf0
[ 18.370745] __x64_sys_finit_module+0xc2/0x140
[ 18.371136] do_syscall_64+0x7d/0x160
[ 18.371459] ? fdget_pos+0x1c8/0x4c0
[ 18.371784] ? ksys_read+0xfd/0x1d0
[ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.372525] ? do_syscall_64+0x89/0x160
[ 18.372860] ? do_syscall_64+0x89/0x160
[ 18.373194] ? do_syscall_64+0x89/0x160
[ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.373952] ? do_syscall_64+0x89/0x160
[ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.374701] ? do_syscall_64+0x89/0x160
[ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
[ 18.375416] ? clear_bhb_loop+0x25/0x80
[ 18.375748] ? clear_bhb_loop+0x25/0x80
[ 18.376119] ? clear_bhb_loop+0x25/0x80
[ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
commit 233e89322cbe ("alloc_tag: fix module allocation
tags populated area calculation") is currently in the
mm-hotfixes-unstable branch, so this patch is
developed based on the mm-hotfixes-unstable branch.
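
As a cross-check of the panic log above, the faulting address can be derived from the module address in RDI. This is a minimal userspace sketch, assuming the generic KASAN mapping shadow = (addr >> 3) + offset and the x86-64 default offset 0xdffffc0000000000 (neither value is stated in this thread). `mem_to_shadow` is an illustrative helper, not the kernel function.

```c
#include <stdint.h>

/* Assumed values: x86-64 generic KASAN, 1 shadow byte per 8 data bytes. */
#define EX_SHADOW_SCALE_SHIFT 3
#define EX_SHADOW_OFFSET UINT64_C(0xdffffc0000000000)

/* Illustrative userspace model of the address-to-shadow mapping. */
uint64_t mem_to_shadow(uint64_t addr)
{
	/* Unsigned arithmetic wraps modulo 2^64, which the mapping relies on. */
	return (addr >> EX_SHADOW_SCALE_SHIFT) + EX_SHADOW_OFFSET;
}
```

Under these assumptions, mem_to_shadow(0xffffffffc0490000) gives 0xfffffbfff8092000, which is the CR2 value in the log: the fault is a read of the unmapped shadow for the module_tags address being copied to.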
---
lib/alloc_tag.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index f942408b53ef..88c1fc512ae0 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -422,6 +422,9 @@ static int vm_module_tags_populate(void)
return -ENOMEM;
}
vm_module_tags->nr_pages += nr;
+
+ if ((phys_end & (MODULE_ALIGN - 1)) == 0)
+ kasan_alloc_module_shadow((void *)phys_end, nr << PAGE_SHIFT, GFP_KERNEL);
}
/*
--
2.25.1
* [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled
2024-12-10 4:15 [PATCH] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled Hao Ge
@ 2024-12-10 6:53 ` Hao Ge
2024-12-10 17:55 ` Suren Baghdasaryan
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-10 6:53 UTC (permalink / raw)
To: surenb, kent.overstreet, akpm
Cc: linux-mm, linux-kernel, hao.ge, greearb, Hao Ge
From: Hao Ge <gehao@kylinos.cn>
When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC is not, we may
encounter a panic during system boot.

This happens because no pages have been allocated, and no mappings
created, for the shadow memory that corresponds to the module_tags
region, as is done for execmem_vmalloc.

The difference is that module_tags pages are allocated on demand, so
the shadow memory for them must be allocated on demand as well, while
still respecting MODULE_ALIGN.

Here is the panic log:

[ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
[ 18.350016] #PF: supervisor read access in kernel mode
[ 18.350459] #PF: error_code(0x0000) - not-present page
[ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
[ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
[ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
[ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
[ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
[ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
[ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
[ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
[ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
[ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
[ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
[ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
[ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.362020] PKRU: 55555554
[ 18.362261] Call Trace:
[ 18.362481] <TASK>
[ 18.362671] ? __die+0x23/0x70
[ 18.362964] ? page_fault_oops+0xc2/0x160
[ 18.363318] ? exc_page_fault+0xad/0xc0
[ 18.363680] ? asm_exc_page_fault+0x26/0x30
[ 18.364056] ? move_module+0x3cc/0x8a0
[ 18.364398] ? kasan_check_range+0xba/0x1b0
[ 18.364755] __asan_memcpy+0x3c/0x60
[ 18.365074] move_module+0x3cc/0x8a0
[ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
[ 18.365841] ? early_mod_check+0x3dc/0x510
[ 18.366195] load_module+0x72/0x1850
[ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
[ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
[ 18.367262] init_module_from_file+0xd1/0x130
[ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
[ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
[ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
[ 18.368938] idempotent_init_module+0x22c/0x790
[ 18.369332] ? simple_getattr+0x6f/0x120
[ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
[ 18.370110] ? fdget+0x58/0x3a0
[ 18.370393] ? security_capable+0x64/0xf0
[ 18.370745] __x64_sys_finit_module+0xc2/0x140
[ 18.371136] do_syscall_64+0x7d/0x160
[ 18.371459] ? fdget_pos+0x1c8/0x4c0
[ 18.371784] ? ksys_read+0xfd/0x1d0
[ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.372525] ? do_syscall_64+0x89/0x160
[ 18.372860] ? do_syscall_64+0x89/0x160
[ 18.373194] ? do_syscall_64+0x89/0x160
[ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.373952] ? do_syscall_64+0x89/0x160
[ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.374701] ? do_syscall_64+0x89/0x160
[ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
[ 18.375416] ? clear_bhb_loop+0x25/0x80
[ 18.375748] ? clear_bhb_loop+0x25/0x80
[ 18.376119] ? clear_bhb_loop+0x25/0x80
[ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v2: Add comments to make the code easier to understand.
    Round nr << PAGE_SHIFT up to MODULE_ALIGN. kasan_alloc_module_shadow
    already handles this internally, but doing it explicitly makes the
    code more readable.

commit 233e89322cbe ("alloc_tag: fix module allocation
tags populated area calculation") is currently in the
mm-hotfixes-unstable branch, so this patch is
developed based on the mm-hotfixes-unstable branch.
---
lib/alloc_tag.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index f942408b53ef..bd3ee57ea13f 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -10,6 +10,7 @@
#include <linux/seq_buf.h>
#include <linux/seq_file.h>
#include <linux/vmalloc.h>
+#include <linux/math.h>
#define ALLOCINFO_FILE_NAME "allocinfo"
#define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
@@ -422,6 +423,17 @@ static int vm_module_tags_populate(void)
return -ENOMEM;
}
vm_module_tags->nr_pages += nr;
+
+ /*
+ * Kasan allocates 1 byte of shadow for every 8 bytes of data.
+ * When kasan_alloc_module_shadow allocates shadow memory,
+ * it does so in units of pages.
+ * Therefore, here we need to align to MODULE_ALIGN.
+ */
+ if ((phys_end & (MODULE_ALIGN - 1)) == 0)
+ kasan_alloc_module_shadow((void *)phys_end,
+ round_up(nr << PAGE_SHIFT, MODULE_ALIGN),
+ GFP_KERNEL);
}
/*
--
2.25.1
* Re: [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled
2024-12-10 6:53 ` [PATCH v2] " Hao Ge
@ 2024-12-10 17:55 ` Suren Baghdasaryan
2024-12-10 18:45 ` Hao Ge
2024-12-10 18:56 ` [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled Ben Greear
0 siblings, 2 replies; 19+ messages in thread
From: Suren Baghdasaryan @ 2024-12-10 17:55 UTC (permalink / raw)
To: Hao Ge; +Cc: kent.overstreet, akpm, linux-mm, linux-kernel, greearb, Hao Ge
On Mon, Dec 9, 2024 at 10:53 PM Hao Ge <hao.ge@linux.dev> wrote:
>
> From: Hao Ge <gehao@kylinos.cn>
>
> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
> is not enabled, we may encounter a panic during system boot.
>
> Because we haven't allocated pages and created mappings
> for the shadow memory corresponding to module_tags region,
> similar to how it is done for execmem_vmalloc.
>
> The difference is that our module_tags are allocated on demand,
> so similarly,we also need to allocate shadow memory regions on demand.
> However, we still need to adhere to the MODULE_ALIGN principle.
>
> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
> Reported-by: Ben Greear <greearb@candelatech.com>
> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> ---
> v2: Add comments to facilitate understanding of the code.
> Add align nr << PAGE_SHIFT to MODULE_ALIGN,even though kasan_alloc_module_shadow
> already handles this internally,but to make the code more readable and user-friendly
>
> commit 233e89322cbe ("alloc_tag: fix module allocation
> tags populated area calculation") is currently in the
> mm-hotfixes-unstable branch, so this patch is
> developed based on the mm-hotfixes-unstable branch.
> ---
> lib/alloc_tag.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index f942408b53ef..bd3ee57ea13f 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -10,6 +10,7 @@
> #include <linux/seq_buf.h>
> #include <linux/seq_file.h>
> #include <linux/vmalloc.h>
> +#include <linux/math.h>
>
> #define ALLOCINFO_FILE_NAME "allocinfo"
> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
> @@ -422,6 +423,17 @@ static int vm_module_tags_populate(void)
> return -ENOMEM;
> }
> vm_module_tags->nr_pages += nr;
> +
> + /*
> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
> + * When kasan_alloc_module_shadow allocates shadow memory,
> + * it does so in units of pages.
> + * Therefore, here we need to align to MODULE_ALIGN.
> + */
> + if ((phys_end & (MODULE_ALIGN - 1)) == 0)

phys_end is calculated as:

unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
                         (vm_module_tags->nr_pages << PAGE_SHIFT);

and therefore is always PAGE_SIZE-aligned. PAGE_SIZE is always a
multiple of MODULE_ALIGN, therefore phys_end is always
MODULE_ALIGN-aligned and the above condition is not needed.

> + kasan_alloc_module_shadow((void *)phys_end,
> + round_up(nr << PAGE_SHIFT, MODULE_ALIGN),

Here again, (nr << PAGE_SHIFT) is PAGE_SIZE-aligned and PAGE_SIZE is a
multiple of MODULE_ALIGN, therefore (nr << PAGE_SHIFT) is always a
multiple of MODULE_ALIGN and there is no need for round_up().

IOW, I think this patch should simply add one line:

  vm_module_tags->nr_pages += nr;
+ kasan_alloc_module_shadow((void *)phys_end, nr << PAGE_SHIFT, GFP_KERNEL);

Am I missing something?

> + GFP_KERNEL);
> }
>
> /*
> --
> 2.25.1
>
* Re: [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled
2024-12-10 17:55 ` Suren Baghdasaryan
@ 2024-12-10 18:45 ` Hao Ge
2024-12-10 19:20 ` Suren Baghdasaryan
2024-12-10 18:56 ` [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled Ben Greear
1 sibling, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-10 18:45 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: kent.overstreet, akpm, linux-mm, linux-kernel, greearb, Hao Ge
Hi Suren
Thanks for your review.
On 12/11/24 01:55, Suren Baghdasaryan wrote:
> On Mon, Dec 9, 2024 at 10:53 PM Hao Ge <hao.ge@linux.dev> wrote:
>> From: Hao Ge <gehao@kylinos.cn>
>>
>> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
>> is not enabled, we may encounter a panic during system boot.
>>
>> Because we haven't allocated pages and created mappings
>> for the shadow memory corresponding to module_tags region,
>> similar to how it is done for execmem_vmalloc.
>>
>> The difference is that our module_tags are allocated on demand,
>> so similarly,we also need to allocate shadow memory regions on demand.
>> However, we still need to adhere to the MODULE_ALIGN principle.
>>
>> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
>> Reported-by: Ben Greear <greearb@candelatech.com>
>> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>> ---
>> v2: Add comments to facilitate understanding of the code.
>> Add align nr << PAGE_SHIFT to MODULE_ALIGN,even though kasan_alloc_module_shadow
>> already handles this internally,but to make the code more readable and user-friendly
>>
>> commit 233e89322cbe ("alloc_tag: fix module allocation
>> tags populated area calculation") is currently in the
>> mm-hotfixes-unstable branch, so this patch is
>> developed based on the mm-hotfixes-unstable branch.
>> ---
>> lib/alloc_tag.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
>> index f942408b53ef..bd3ee57ea13f 100644
>> --- a/lib/alloc_tag.c
>> +++ b/lib/alloc_tag.c
>> @@ -10,6 +10,7 @@
>> #include <linux/seq_buf.h>
>> #include <linux/seq_file.h>
>> #include <linux/vmalloc.h>
>> +#include <linux/math.h>
>>
>> #define ALLOCINFO_FILE_NAME "allocinfo"
>> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
>> @@ -422,6 +423,17 @@ static int vm_module_tags_populate(void)
>> return -ENOMEM;
>> }
>> vm_module_tags->nr_pages += nr;
>> +
>> + /*
>> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
>> + * When kasan_alloc_module_shadow allocates shadow memory,
>> + * it does so in units of pages.
>> + * Therefore, here we need to align to MODULE_ALIGN.
>> + */
>> + if ((phys_end & (MODULE_ALIGN - 1)) == 0)
> phys_end is calculated as:
>
> unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
>                          (vm_module_tags->nr_pages << PAGE_SHIFT);
>
> and therefore is always PAGE_SIZE-aligned. PAGE_SIZE is always a
> multiple of MODULE_ALIGN, therefore phys_end is always

When CONFIG_KASAN_VMALLOC is not enabled, MODULE_ALIGN is defined as:

#define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)

https://elixir.bootlin.com/linux/v6.13-rc2/source/include/linux/execmem.h#L11

and on x86 KASAN_SHADOW_SCALE_SHIFT is set to 3:

https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/include/asm/kasan.h#L7

As mentioned in my comment, KASAN allocates 1 byte of shadow for every
8 bytes of data. So one shadow page allocated through
kasan_alloc_module_shadow corresponds to eight pages in our system, and
we need MODULE_ALIGN to keep the shadow allocations for modules
properly aligned.
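
The arithmetic behind this can be sketched in userspace C. The constants below assume 4 KiB pages and a shadow scale shift of 3, as on x86; the `EX_` names and both helpers are illustrative stand-ins, not kernel macros or functions.

```c
#include <stdbool.h>

#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE (1UL << EX_PAGE_SHIFT)
#define EX_SHADOW_SCALE_SHIFT 3
/* Mirrors MODULE_ALIGN when CONFIG_KASAN_VMALLOC is disabled. */
#define EX_MODULE_ALIGN (EX_PAGE_SIZE << EX_SHADOW_SCALE_SHIFT)

/* Bytes of address space whose shadow fits in one shadow page. */
unsigned long bytes_per_shadow_page(void)
{
	return EX_PAGE_SIZE << EX_SHADOW_SCALE_SHIFT;
}

/* True when addr sits on a MODULE_ALIGN boundary, as the patch checks. */
bool is_module_aligned(unsigned long addr)
{
	return (addr & (EX_MODULE_ALIGN - 1)) == 0;
}
```

Under these assumptions EX_MODULE_ALIGN is 0x8000 (eight pages), so a page-aligned address such as 0x1000 is not MODULE_ALIGN-aligned, while 0x8000 is.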

Now let's look at kasan_alloc_module_shadow again. Assuming phys_end is
0 for the sake of this example, a single shadow page (with 4 KiB pages)
represents the address range [0, 0x7FFF].

So it is incorrect to call kasan_alloc_module_shadow every time a page
is allocated, as it can trigger warnings in the system:

https://elixir.bootlin.com/linux/v6.13-rc2/source/mm/kasan/shadow.c#L599
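
To illustrate why the call is gated on alignment, here is a small userspace model, not kernel code. It assumes 4 KiB pages, a scale shift of 3, a region start that is itself MODULE_ALIGN-aligned, and that the warning linked above fires when the shadow start address is not page-aligned. `count_shadow_calls`, `struct shadow_stats` and the `EX_` constants are hypothetical names.

```c
#include <stdbool.h>

#define EX_PAGE_SIZE 4096UL
#define EX_SHADOW_SCALE_SHIFT 3
#define EX_MODULE_ALIGN (EX_PAGE_SIZE << EX_SHADOW_SCALE_SHIFT)

struct shadow_stats {
	unsigned int calls;      /* shadow allocations attempted */
	unsigned int misaligned; /* calls whose shadow start is not page-aligned */
};

/*
 * Populate nr_pages one page at a time starting at base, which must be
 * EX_MODULE_ALIGN-aligned. If gate is true, request shadow only when the
 * current end of the populated area is EX_MODULE_ALIGN-aligned.
 */
struct shadow_stats count_shadow_calls(unsigned long base,
				       unsigned int nr_pages, bool gate)
{
	struct shadow_stats st = { 0, 0 };
	unsigned long phys_end = base;
	unsigned int i;

	for (i = 0; i < nr_pages; i++, phys_end += EX_PAGE_SIZE) {
		if (gate && (phys_end & (EX_MODULE_ALIGN - 1)) != 0)
			continue;
		st.calls++;
		/* Shadow offset relative to base, whose shadow is page-aligned. */
		if (((phys_end - base) >> EX_SHADOW_SCALE_SHIFT) % EX_PAGE_SIZE)
			st.misaligned++;
	}
	return st;
}
```

For 16 pages, the ungated variant makes 16 calls, 14 of them with a misaligned shadow start, while the gated variant makes 2 calls, one per shadow page, none misaligned.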
Thanks
Best Regards Hao
> MODULE_ALIGN-aligned and the above condition is not needed.
>
>> + kasan_alloc_module_shadow((void *)phys_end,
>> + round_up(nr << PAGE_SHIFT, MODULE_ALIGN),
> Here again, (nr << PAGE_SHIFT) is PAGE_SIZE-aligned and PAGE_SIZE is a
> multiple of MODULE_ALIGN, therefore (nr << PAGE_SHIFT) is always
> multiple of MODULE_ALIGN and there is no need for round_up().
>
> IOW, I think this patch should simply add one line:
>
> vm_module_tags->nr_pages += nr;
> + kasan_alloc_module_shadow((void *)phys_end, nr << PAGE_SHIFT, GFP_KERNEL);
>
> Am I missing something?
>
>> + GFP_KERNEL);
>> }
>>
>> /*
>> --
>> 2.25.1
>>
* Re: [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled
2024-12-10 17:55 ` Suren Baghdasaryan
2024-12-10 18:45 ` Hao Ge
@ 2024-12-10 18:56 ` Ben Greear
1 sibling, 0 replies; 19+ messages in thread
From: Ben Greear @ 2024-12-10 18:56 UTC (permalink / raw)
To: Suren Baghdasaryan, Hao Ge
Cc: kent.overstreet, akpm, linux-mm, linux-kernel, Hao Ge
On 12/10/24 09:55, Suren Baghdasaryan wrote:
> On Mon, Dec 9, 2024 at 10:53 PM Hao Ge <hao.ge@linux.dev> wrote:
>>
>> From: Hao Ge <gehao@kylinos.cn>
>>
>> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
>> is not enabled, we may encounter a panic during system boot.
>>
>> Because we haven't allocated pages and created mappings
>> for the shadow memory corresponding to module_tags region,
>> similar to how it is done for execmem_vmalloc.
>>
>> The difference is that our module_tags are allocated on demand,
>> so similarly,we also need to allocate shadow memory regions on demand.
>> However, we still need to adhere to the MODULE_ALIGN principle.
Hello,
I applied this patch:
https://lore.kernel.org/all/20241130001423.1114965-1-surenb@google.com/raw
as well as the v2 patch in this email thread, on top of today's Linus
kernel (plus local patches that should be unrelated to this particular
issue), and now my system boots with my current mix of kernel debugging
options enabled.
Thanks for the fix!
--Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIG_KASAN_VMALLOC disabled
2024-12-10 18:45 ` Hao Ge
@ 2024-12-10 19:20 ` Suren Baghdasaryan
2024-12-10 19:36 ` Hao Ge
0 siblings, 1 reply; 19+ messages in thread
From: Suren Baghdasaryan @ 2024-12-10 19:20 UTC (permalink / raw)
To: Hao Ge; +Cc: kent.overstreet, akpm, linux-mm, linux-kernel, greearb, Hao Ge
On Tue, Dec 10, 2024 at 10:46 AM Hao Ge <hao.ge@linux.dev> wrote:
>
> Hi Suren
>
>
> Thanks for your review.
>
>
> On 12/11/24 01:55, Suren Baghdasaryan wrote:
> > On Mon, Dec 9, 2024 at 10:53 PM Hao Ge <hao.ge@linux.dev> wrote:
> >> From: Hao Ge <gehao@kylinos.cn>
> >>
> >> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
> >> is not enabled, we may encounter a panic during system boot.
> >>
> >> Because we haven't allocated pages and created mappings
> >> for the shadow memory corresponding to module_tags region,
> >> similar to how it is done for execmem_vmalloc.
> >>
> >> The difference is that our module_tags are allocated on demand,
> >> so similarly,we also need to allocate shadow memory regions on demand.
> >> However, we still need to adhere to the MODULE_ALIGN principle.
> >>
> >> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
> >> Reported-by: Ben Greear <greearb@candelatech.com>
> >> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
> >> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> >> ---
> >> v2: Add comments to facilitate understanding of the code.
> >> Add align nr << PAGE_SHIFT to MODULE_ALIGN,even though kasan_alloc_module_shadow
> >> already handles this internally,but to make the code more readable and user-friendly
> >>
> >> commit 233e89322cbe ("alloc_tag: fix module allocation
> >> tags populated area calculation") is currently in the
> >> mm-hotfixes-unstable branch, so this patch is
> >> developed based on the mm-hotfixes-unstable branch.
> >> ---
> >> lib/alloc_tag.c | 12 ++++++++++++
> >> 1 file changed, 12 insertions(+)
> >>
> >> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> >> index f942408b53ef..bd3ee57ea13f 100644
> >> --- a/lib/alloc_tag.c
> >> +++ b/lib/alloc_tag.c
> >> @@ -10,6 +10,7 @@
> >> #include <linux/seq_buf.h>
> >> #include <linux/seq_file.h>
> >> #include <linux/vmalloc.h>
> >> +#include <linux/math.h>
> >>
> >> #define ALLOCINFO_FILE_NAME "allocinfo"
> >> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
> >> @@ -422,6 +423,17 @@ static int vm_module_tags_populate(void)
> >> return -ENOMEM;
> >> }
> >> vm_module_tags->nr_pages += nr;
> >> +
> >> + /*
> >> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
> >> + * When kasan_alloc_module_shadow allocates shadow memory,
> >> + * it does so in units of pages.
> >> + * Therefore, here we need to align to MODULE_ALIGN.
> >> + */
> >> + if ((phys_end & (MODULE_ALIGN - 1)) == 0)
> > phys_end is calculated as:
> >
> > unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
> > (vm_module_tags->nr_pages
> > << PAGE_SHIFT);
> >
> > and therefore is always PAGE_SIZE-aligned. PAGE_SIZE is always a
> > multiple of MODULE_ALIGN, therefore phys_end is always
>
> When CONFIG_KASAN_VMALLOC is not enabled
>
> #define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
Ah, sorry, I misread this as (PAGE_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
and assumed PAGE_SIZE is always a multiple of MODULE_ALIGN. Now it makes
more sense. However, I'm still not sure about this condition:
if ((phys_end & (MODULE_ALIGN - 1)) == 0)
What if phys_end is not MODULE_ALIGN-aligned? We will be skipping
kasan_alloc_module_shadow().
For example, say module_tags.start_addr == 0x1018 (4096+24), original
phys_end will be 0x1000 (4096) and say we allocated one page (nr ==
1), tags area is [0x1000-0x2000]. phys_end is not MODULE_ALIGN-aligned
and we will skip kasan_alloc_module_shadow(). IIUC, this is already
incorrect.
Now, say the next time we allocate 8 pages. phys_end this time is
0x2000 and the new tags area spans [0x1000-0xA000], we skip
kasan_alloc_module_shadow() again. Next time we allocate pages,
phys_end is 0xA000 and it again is not MODULE_ALIGN-aligned, we skip
again. You see my point?
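
To make the arithmetic above concrete, here is a small userspace sketch
(not kernel code). It assumes the x86 values PAGE_SHIFT == 12 and
KASAN_SHADOW_SCALE_SHIFT == 3, so MODULE_ALIGN is 0x8000; the helper
names are made up for illustration. It only evaluates the condition from
the patch for a list of phys_end values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed x86 values with CONFIG_KASAN_VMALLOC disabled. */
#define PAGE_SHIFT               12
#define PAGE_SIZE                (1UL << PAGE_SHIFT)
#define KASAN_SHADOW_SCALE_SHIFT 3
#define MODULE_ALIGN             (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT) /* 0x8000 */

/* The condition from the patch: true only if phys_end is MODULE_ALIGN-aligned. */
static bool shadow_would_be_allocated(unsigned long phys_end)
{
	return (phys_end & (MODULE_ALIGN - 1)) == 0;
}

/* Hypothetical helper: how many of the given phys_end values pass the check. */
static size_t count_shadow_allocations(const unsigned long *phys_ends, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (shadow_would_be_allocated(phys_ends[i]))
			hits++;
	return hits;
}
```

For the phys_end values 0x1000, 0x2000 and 0xA000 from the example,
count_shadow_allocations() returns 0, so kasan_alloc_module_shadow()
would never be called in that sequence.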
>
> https://elixir.bootlin.com/linux/v6.13-rc2/source/include/linux/execmem.h#L11
>
> and On x86, KASAN_SHADOW_SCALE_SHIFT is set to 3
>
> https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/include/asm/kasan.h#L7
>
> As mentioned in my comment, Kasan allocates 1 byte of shadow for every 8
> bytes of data
>
> So, when you allocate a shadow page through kasan_alloc_module_shadow,
> it corresponds to eight physical pages in our system.
>
> So, we need MODULE_ALIGN to ensure proper alignment when allocating
> shadow memory for modules using KASAN.
>
> Let's take a look at the kasan_alloc_module_shadow function again
>
> As I mentioned earlier,Kasan allocates 1 byte of shadow for every 8
> bytes of data.
>
> Assuming phys_end is set to 0 for the sake of this example, if you
> allocate a single shadow page,
>
> the corresponding address range it can represent would be [0, 0x7FFF].
>
> So, it is incorrect to call kasan_alloc_module_shadow every time a page
> is allocated, as it can trigger warnings in the system.
>
> https://elixir.bootlin.com/linux/v6.13-rc2/source/mm/kasan/shadow.c#L599
>
> Thanks
>
> Best Regards Hao
>
> > MODULE_ALIGN-aligned and the above condition is not needed.
> >
> >> + kasan_alloc_module_shadow((void *)phys_end,
> >> + round_up(nr << PAGE_SHIFT, MODULE_ALIGN),
> > Here again, (nr << PAGE_SHIFT) is PAGE_SIZE-aligned and PAGE_SIZE is a
> > multiple of MODULE_ALIGN, therefore (nr << PAGE_SHIFT) is always
> > multiple of MODULE_ALIGN and there is no need for round_up().
> >
> > IOW, I think this patch should simply add one line:
> >
> > vm_module_tags->nr_pages += nr;
> > + kasan_alloc_module_shadow((void *)phys_end, nr <<
> > PAGE_SHIFT, GFP_KERNEL);
> >
> > Am I missing something?
> >
>
>
> >> + GFP_KERNEL);
> >> }
> >>
> >> /*
> >> --
> >> 2.25.1
> >>
* Re: [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIS_KASAN_VMALLOC disabled
2024-12-10 19:20 ` Suren Baghdasaryan
@ 2024-12-10 19:36 ` Hao Ge
2024-12-10 20:04 ` Suren Baghdasaryan
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-10 19:36 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: kent.overstreet, akpm, linux-mm, linux-kernel, greearb, Hao Ge
Hi Suren
On 12/11/24 03:20, Suren Baghdasaryan wrote:
> On Tue, Dec 10, 2024 at 10:46 AM Hao Ge <hao.ge@linux.dev> wrote:
>> Hi Suren
>>
>>
>> Thanks for your review.
>>
>>
>> On 12/11/24 01:55, Suren Baghdasaryan wrote:
>>> On Mon, Dec 9, 2024 at 10:53 PM Hao Ge <hao.ge@linux.dev> wrote:
>>>> From: Hao Ge <gehao@kylinos.cn>
>>>>
>>>> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
>>>> is not enabled, we may encounter a panic during system boot.
>>>>
>>>> Because we haven't allocated pages and created mappings
>>>> for the shadow memory corresponding to module_tags region,
>>>> similar to how it is done for execmem_vmalloc.
>>>>
>>>> The difference is that our module_tags are allocated on demand,
>>>> so similarly,we also need to allocate shadow memory regions on demand.
>>>> However, we still need to adhere to the MODULE_ALIGN principle.
>>>>
>>>> Here is the log for panic:
>>>>
>>>> [ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
>>>> [ 18.350016] #PF: supervisor read access in kernel mode
>>>> [ 18.350459] #PF: error_code(0x0000) - not-present page
>>>> [ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
>>>> [ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
>>>> [ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
>>>> [ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
>>>> [ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
>>>> [ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
>>>> [ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
>>>> [ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
>>>> [ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
>>>> [ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
>>>> [ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
>>>> [ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
>>>> [ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
>>>> [ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
>>>> [ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> [ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>> [ 18.362020] PKRU: 55555554
>>>> [ 18.362261] Call Trace:
>>>> [ 18.362481] <TASK>
>>>> [ 18.362671] ? __die+0x23/0x70
>>>> [ 18.362964] ? page_fault_oops+0xc2/0x160
>>>> [ 18.363318] ? exc_page_fault+0xad/0xc0
>>>> [ 18.363680] ? asm_exc_page_fault+0x26/0x30
>>>> [ 18.364056] ? move_module+0x3cc/0x8a0
>>>> [ 18.364398] ? kasan_check_range+0xba/0x1b0
>>>> [ 18.364755] __asan_memcpy+0x3c/0x60
>>>> [ 18.365074] move_module+0x3cc/0x8a0
>>>> [ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
>>>> [ 18.365841] ? early_mod_check+0x3dc/0x510
>>>> [ 18.366195] load_module+0x72/0x1850
>>>> [ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
>>>> [ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
>>>> [ 18.367262] init_module_from_file+0xd1/0x130
>>>> [ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
>>>> [ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
>>>> [ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
>>>> [ 18.368938] idempotent_init_module+0x22c/0x790
>>>> [ 18.369332] ? simple_getattr+0x6f/0x120
>>>> [ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
>>>> [ 18.370110] ? fdget+0x58/0x3a0
>>>> [ 18.370393] ? security_capable+0x64/0xf0
>>>> [ 18.370745] __x64_sys_finit_module+0xc2/0x140
>>>> [ 18.371136] do_syscall_64+0x7d/0x160
>>>> [ 18.371459] ? fdget_pos+0x1c8/0x4c0
>>>> [ 18.371784] ? ksys_read+0xfd/0x1d0
>>>> [ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
>>>> [ 18.372525] ? do_syscall_64+0x89/0x160
>>>> [ 18.372860] ? do_syscall_64+0x89/0x160
>>>> [ 18.373194] ? do_syscall_64+0x89/0x160
>>>> [ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
>>>> [ 18.373952] ? do_syscall_64+0x89/0x160
>>>> [ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
>>>> [ 18.374701] ? do_syscall_64+0x89/0x160
>>>> [ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
>>>> [ 18.375416] ? clear_bhb_loop+0x25/0x80
>>>> [ 18.375748] ? clear_bhb_loop+0x25/0x80
>>>> [ 18.376119] ? clear_bhb_loop+0x25/0x80
>>>> [ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>>
>>>> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
>>>> Reported-by: Ben Greear <greearb@candelatech.com>
>>>> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>>> ---
>>>> v2: Add comments to facilitate understanding of the code.
>>>> Add align nr << PAGE_SHIFT to MODULE_ALIGN,even though kasan_alloc_module_shadow
>>>> already handles this internally,but to make the code more readable and user-friendly
>>>>
>>>> commit 233e89322cbe ("alloc_tag: fix module allocation
>>>> tags populated area calculation") is currently in the
>>>> mm-hotfixes-unstable branch, so this patch is
>>>> developed based on the mm-hotfixes-unstable branch.
>>>> ---
>>>> lib/alloc_tag.c | 12 ++++++++++++
>>>> 1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
>>>> index f942408b53ef..bd3ee57ea13f 100644
>>>> --- a/lib/alloc_tag.c
>>>> +++ b/lib/alloc_tag.c
>>>> @@ -10,6 +10,7 @@
>>>> #include <linux/seq_buf.h>
>>>> #include <linux/seq_file.h>
>>>> #include <linux/vmalloc.h>
>>>> +#include <linux/math.h>
>>>>
>>>> #define ALLOCINFO_FILE_NAME "allocinfo"
>>>> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
>>>> @@ -422,6 +423,17 @@ static int vm_module_tags_populate(void)
>>>> return -ENOMEM;
>>>> }
>>>> vm_module_tags->nr_pages += nr;
>>>> +
>>>> + /*
>>>> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
>>>> + * When kasan_alloc_module_shadow allocates shadow memory,
>>>> + * it does so in units of pages.
>>>> + * Therefore, here we need to align to MODULE_ALIGN.
>>>> + */
>>>> + if ((phys_end & (MODULE_ALIGN - 1)) == 0)
>>> phys_end is calculated as:
>>>
>>> unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
>>> (vm_module_tags->nr_pages
>>> << PAGE_SHIFT);
>>>
>>> and therefore is always PAGE_SIZE-aligned. PAGE_SIZE is always a
>>> multiple of MODULE_ALIGN, therefore phys_end is always
>> When CONFIG_KASAN_VMALLOC is not enabled
>>
>> #define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
> Ah, sorry, I misread this as (PAGE_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
> and assumed PAGE_SIZE is always a multiple of MODULE_ALIGN. Now it makes
> more sense. However, I'm still not sure about this condition:
>
> if ((phys_end & (MODULE_ALIGN - 1)) == 0)
>
> What if phys_end is not MODULE_ALIGN-aligned? We will be skipping
> kasan_alloc_module_shadow().
In theory, this scenario cannot occur.
Please refer to the following:
https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/mm/init.c#L1072
The allocations made there all comply with MODULE_ALIGN.
> For example, say module_tags.start_addr == 0x1018 (4096+24), original
> phys_end will be 0x1000 (4096) and say we allocated one page (nr ==
> 1), tags area is [0x1000-0x2000]. phys_end is not MODULE_ALIGN-aligned
> and we will skip kasan_alloc_module_shadow(). IIUC, this is already
> incorrect.
> Now, say the next time we allocate 8 pages. phys_end this time is
> 0x2000 and the new tags area spans [0x1000-0xA000], we skip
> kasan_alloc_module_shadow() again. Next time we allocate pages,
> phys_end is 0xA000 and it again is not MODULE_ALIGN-aligned, we skip
> again. You see my point?
>
>> https://elixir.bootlin.com/linux/v6.13-rc2/source/include/linux/execmem.h#L11
>>
>> and On x86, KASAN_SHADOW_SCALE_SHIFT is set to 3
>>
>> https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/include/asm/kasan.h#L7
>>
>> As mentioned in my comment, Kasan allocates 1 byte of shadow for every 8
>> bytes of data
>>
>> So, when you allocate a shadow page through kasan_alloc_module_shadow,
>> it corresponds to eight physical pages in our system.
>>
>> So, we need MODULE_ALIGN to ensure proper alignment when allocating
>> shadow memory for modules using KASAN.
>>
>> Let's take a look at the kasan_alloc_module_shadow function again
>>
>> As I mentioned earlier,Kasan allocates 1 byte of shadow for every 8
>> bytes of data.
>>
>> Assuming phys_end is set to 0 for the sake of this example, if you
>> allocate a single shadow page,
>>
>> the corresponding address range it can represent would be [0, 0x7FFF].
>>
>> So, it is incorrect to call kasan_alloc_module_shadow every time a page
>> is allocated, as it can trigger warnings in the system.
>>
>> https://elixir.bootlin.com/linux/v6.13-rc2/source/mm/kasan/shadow.c#L599
>>
>> Thanks
>>
>> Best Regards Hao
>>
>>> MODULE_ALIGN-aligned and the above condition is not needed.
>>>
>>>> + kasan_alloc_module_shadow((void *)phys_end,
>>>> + round_up(nr << PAGE_SHIFT, MODULE_ALIGN),
>>> Here again, (nr << PAGE_SHIFT) is PAGE_SIZE-aligned and PAGE_SIZE is a
>>> multiple of MODULE_ALIGN, therefore (nr << PAGE_SHIFT) is always
>>> multiple of MODULE_ALIGN and there is no need for round_up().
>>>
>>> IOW, I think this patch should simply add one line:
>>>
>>> vm_module_tags->nr_pages += nr;
>>> + kasan_alloc_module_shadow((void *)phys_end, nr <<
>>> PAGE_SHIFT, GFP_KERNEL);
>>>
>>> Am I missing something?
>>>
>>
>>>> + GFP_KERNEL);
>>>> }
>>>>
>>>> /*
>>>> --
>>>> 2.25.1
>>>>
* Re: [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIS_KASAN_VMALLOC disabled
2024-12-10 19:36 ` Hao Ge
@ 2024-12-10 20:04 ` Suren Baghdasaryan
2024-12-11 1:10 ` Hao Ge
0 siblings, 1 reply; 19+ messages in thread
From: Suren Baghdasaryan @ 2024-12-10 20:04 UTC (permalink / raw)
To: Hao Ge; +Cc: kent.overstreet, akpm, linux-mm, linux-kernel, greearb, Hao Ge
On Tue, Dec 10, 2024 at 11:37 AM Hao Ge <hao.ge@linux.dev> wrote:
>
> Hi Suren
>
>
> On 12/11/24 03:20, Suren Baghdasaryan wrote:
> > On Tue, Dec 10, 2024 at 10:46 AM Hao Ge <hao.ge@linux.dev> wrote:
> >> Hi Suren
> >>
> >>
> >> Thanks for your review.
> >>
> >>
> >> On 12/11/24 01:55, Suren Baghdasaryan wrote:
> >>> On Mon, Dec 9, 2024 at 10:53 PM Hao Ge <hao.ge@linux.dev> wrote:
> >>>> From: Hao Ge <gehao@kylinos.cn>
> >>>>
> >>>> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
> >>>> is not enabled, we may encounter a panic during system boot.
> >>>>
> >>>> Because we haven't allocated pages and created mappings
> >>>> for the shadow memory corresponding to module_tags region,
> >>>> similar to how it is done for execmem_vmalloc.
> >>>>
> >>>> The difference is that our module_tags are allocated on demand,
> >>>> so similarly,we also need to allocate shadow memory regions on demand.
> >>>> However, we still need to adhere to the MODULE_ALIGN principle.
> >>>>
> >>>> Here is the log for panic:
> >>>>
> >>>> [ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
> >>>> [ 18.350016] #PF: supervisor read access in kernel mode
> >>>> [ 18.350459] #PF: error_code(0x0000) - not-present page
> >>>> [ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
> >>>> [ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
> >>>> [ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
> >>>> [ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >>>> [ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
> >>>> [ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
> >>>> [ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
> >>>> [ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
> >>>> [ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
> >>>> [ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
> >>>> [ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
> >>>> [ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
> >>>> [ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
> >>>> [ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> [ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
> >>>> [ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>> [ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >>>> [ 18.362020] PKRU: 55555554
> >>>> [ 18.362261] Call Trace:
> >>>> [ 18.362481] <TASK>
> >>>> [ 18.362671] ? __die+0x23/0x70
> >>>> [ 18.362964] ? page_fault_oops+0xc2/0x160
> >>>> [ 18.363318] ? exc_page_fault+0xad/0xc0
> >>>> [ 18.363680] ? asm_exc_page_fault+0x26/0x30
> >>>> [ 18.364056] ? move_module+0x3cc/0x8a0
> >>>> [ 18.364398] ? kasan_check_range+0xba/0x1b0
> >>>> [ 18.364755] __asan_memcpy+0x3c/0x60
> >>>> [ 18.365074] move_module+0x3cc/0x8a0
> >>>> [ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
> >>>> [ 18.365841] ? early_mod_check+0x3dc/0x510
> >>>> [ 18.366195] load_module+0x72/0x1850
> >>>> [ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
> >>>> [ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
> >>>> [ 18.367262] init_module_from_file+0xd1/0x130
> >>>> [ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
> >>>> [ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
> >>>> [ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
> >>>> [ 18.368938] idempotent_init_module+0x22c/0x790
> >>>> [ 18.369332] ? simple_getattr+0x6f/0x120
> >>>> [ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
> >>>> [ 18.370110] ? fdget+0x58/0x3a0
> >>>> [ 18.370393] ? security_capable+0x64/0xf0
> >>>> [ 18.370745] __x64_sys_finit_module+0xc2/0x140
> >>>> [ 18.371136] do_syscall_64+0x7d/0x160
> >>>> [ 18.371459] ? fdget_pos+0x1c8/0x4c0
> >>>> [ 18.371784] ? ksys_read+0xfd/0x1d0
> >>>> [ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
> >>>> [ 18.372525] ? do_syscall_64+0x89/0x160
> >>>> [ 18.372860] ? do_syscall_64+0x89/0x160
> >>>> [ 18.373194] ? do_syscall_64+0x89/0x160
> >>>> [ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
> >>>> [ 18.373952] ? do_syscall_64+0x89/0x160
> >>>> [ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
> >>>> [ 18.374701] ? do_syscall_64+0x89/0x160
> >>>> [ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
> >>>> [ 18.375416] ? clear_bhb_loop+0x25/0x80
> >>>> [ 18.375748] ? clear_bhb_loop+0x25/0x80
> >>>> [ 18.376119] ? clear_bhb_loop+0x25/0x80
> >>>> [ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >>>>
> >>>> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
> >>>> Reported-by: Ben Greear <greearb@candelatech.com>
> >>>> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
> >>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> >>>> ---
> >>>> v2: Add comments to facilitate understanding of the code.
> >>>> Add align nr << PAGE_SHIFT to MODULE_ALIGN,even though kasan_alloc_module_shadow
> >>>> already handles this internally,but to make the code more readable and user-friendly
> >>>>
> >>>> commit 233e89322cbe ("alloc_tag: fix module allocation
> >>>> tags populated area calculation") is currently in the
> >>>> mm-hotfixes-unstable branch, so this patch is
> >>>> developed based on the mm-hotfixes-unstable branch.
> >>>> ---
> >>>> lib/alloc_tag.c | 12 ++++++++++++
> >>>> 1 file changed, 12 insertions(+)
> >>>>
> >>>> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> >>>> index f942408b53ef..bd3ee57ea13f 100644
> >>>> --- a/lib/alloc_tag.c
> >>>> +++ b/lib/alloc_tag.c
> >>>> @@ -10,6 +10,7 @@
> >>>> #include <linux/seq_buf.h>
> >>>> #include <linux/seq_file.h>
> >>>> #include <linux/vmalloc.h>
> >>>> +#include <linux/math.h>
> >>>>
> >>>> #define ALLOCINFO_FILE_NAME "allocinfo"
> >>>> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
> >>>> @@ -422,6 +423,17 @@ static int vm_module_tags_populate(void)
> >>>> return -ENOMEM;
> >>>> }
> >>>> vm_module_tags->nr_pages += nr;
> >>>> +
> >>>> + /*
> >>>> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
> >>>> + * When kasan_alloc_module_shadow allocates shadow memory,
> >>>> + * it does so in units of pages.
> >>>> + * Therefore, here we need to align to MODULE_ALIGN.
> >>>> + */
> >>>> + if ((phys_end & (MODULE_ALIGN - 1)) == 0)
> >>> phys_end is calculated as:
> >>>
> >>> unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
> >>> (vm_module_tags->nr_pages
> >>> << PAGE_SHIFT);
> >>>
> >>> and therefore is always PAGE_SIZE-aligned. PAGE_SIZE is always a
> >>> multiple of MODULE_ALIGN, therefore phys_end is always
> >> When CONFIG_KASAN_VMALLOC is not enabled
> >>
> >> #define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
> > Ah, sorry, I misread this as (PAGE_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
> > and assumed PAGE_SIZE is always a multiple of MODULE_ALIGN. Now it makes
> > more sense. However, I'm still not sure about this condition:
> >
> > if ((phys_end & (MODULE_ALIGN - 1)) == 0)
> >
> > What if phys_end is not MODULE_ALIGN-aligned? We will be skipping
> > kasan_alloc_module_shadow().
>
>
> Theoretically, this scenario does not exist.
>
>
> Please refer to the following:
>
> https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/mm/init.c#L1072
>
> They would all comply with MODULE_ALIGN.
Well, not all. The original execmem_vmap() called from
alloc_mod_tags_mem() will indeed return a MODULE_ALIGN-aligned address,
therefore the original phys_end is MODULE_ALIGN-aligned. But as
phys_end grows it can become misaligned. Let's modify my example:
module_tags.start_addr = 0x8000; (returned by execmem_vmap())
// we need to allocate 1 page (nr = 1)
phys_end = 0x8000; // MODULE_ALIGN'ed, so we allocate a shadow page
// tags covered area is [0x8000-0x9000]
// our shadow memory represents the area [0x8000-0x10000]
// now we allocate 8 more pages (nr = 8)
phys_end = 0x9000; // not MODULE_ALIGN'ed, we skip allocating shadow pages
// tags covered area is [0x8000-0x11000]
// but our shadow memory still represents the area [0x8000-0x10000],
// so [0x10000-0x11000] is left without shadow
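
The same sequence can be replayed in a small userspace model (a sketch
only, not kernel code). It assumes PAGE_SHIFT == 12 and
KASAN_SHADOW_SCALE_SHIFT == 3, and it only tracks the range that
kasan_alloc_module_shadow() is asked to cover under the v2 condition;
struct tags_sim and the sim_*() helpers are hypothetical names:

```c
#include <assert.h>

/* Assumed x86 values with CONFIG_KASAN_VMALLOC disabled. */
#define PAGE_SHIFT               12
#define PAGE_SIZE                (1UL << PAGE_SHIFT)
#define KASAN_SHADOW_SCALE_SHIFT 3
#define MODULE_ALIGN             (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT) /* 0x8000 */
#define ROUND_UP(x, a)           ((((x) + (a) - 1) / (a)) * (a))

struct tags_sim {
	unsigned long start;      /* module_tags.start_addr, MODULE_ALIGN-aligned */
	unsigned long nr_pages;   /* pages populated so far */
	unsigned long shadow_end; /* end of the data range that has shadow */
};

/* Models one vm_module_tags_populate() step with the v2 alignment check. */
static void sim_populate(struct tags_sim *s, unsigned long nr)
{
	unsigned long phys_end = s->start + (s->nr_pages << PAGE_SHIFT);

	s->nr_pages += nr;
	/* Shadow is requested only when the old phys_end is MODULE_ALIGN-aligned. */
	if ((phys_end & (MODULE_ALIGN - 1)) == 0)
		s->shadow_end = phys_end + ROUND_UP(nr << PAGE_SHIFT, MODULE_ALIGN);
}

/* End of the populated tags area. */
static unsigned long sim_tags_end(const struct tags_sim *s)
{
	return s->start + (s->nr_pages << PAGE_SHIFT);
}

/* Bytes at the end of the tags area that have no shadow behind them. */
static unsigned long sim_uncovered(const struct tags_sim *s)
{
	unsigned long end = sim_tags_end(s);

	return end > s->shadow_end ? end - s->shadow_end : 0;
}
```

Starting from start = shadow_end = 0x8000, sim_populate(&s, 1) followed
by sim_populate(&s, 8) leaves the tags area ending at 0x11000 while
shadow_end stays at 0x10000, so sim_uncovered() reports one page
(0x1000 bytes) without shadow.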
>
>
> > For example, say module_tags.start_addr == 0x1018 (4096+24), original
> > phys_end will be 0x1000 (4096) and say we allocated one page (nr ==
> > 1), tags area is [0x1000-0x2000]. phys_end is not MODULE_ALIGN-aligned
> > and we will skip kasan_alloc_module_shadow(). IIUC, this is already
> > incorrect.
> > Now, say the next time we allocate 8 pages. phys_end this time is
> > 0x2000 and the new tags area spans [0x1000-0xA000], we skip
> > kasan_alloc_module_shadow() again. Next time we allocate pages,
> > phys_end is 0xA000 and it again is not MODULE_ALIGN-aligned, we skip
> > again. You see my point?
> >
> >> https://elixir.bootlin.com/linux/v6.13-rc2/source/include/linux/execmem.h#L11
> >>
> >> and On x86, KASAN_SHADOW_SCALE_SHIFT is set to 3
> >>
> >> https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/include/asm/kasan.h#L7
> >>
> >> As mentioned in my comment, Kasan allocates 1 byte of shadow for every 8
> >> bytes of data
> >>
> >> So, when you allocate a shadow page through kasan_alloc_module_shadow,
> >> it corresponds to eight physical pages in our system.
> >>
> >> So, we need MODULE_ALIGN to ensure proper alignment when allocating
> >> shadow memory for modules using KASAN.
> >>
> >> Let's take a look at the kasan_alloc_module_shadow function again
> >>
> >> As I mentioned earlier,Kasan allocates 1 byte of shadow for every 8
> >> bytes of data.
> >>
> >> Assuming phys_end is set to 0 for the sake of this example, if you
> >> allocate a single shadow page,
> >>
> >> the corresponding address range it can represent would be [0, 0x7FFF].
> >>
> >> So, it is incorrect to call kasan_alloc_module_shadow every time a page
> >> is allocated, as it can trigger warnings in the system.
> >>
> >> https://elixir.bootlin.com/linux/v6.13-rc2/source/mm/kasan/shadow.c#L599
> >>
> >> Thanks
> >>
> >> Best Regards Hao
> >>
> >>> MODULE_ALIGN-aligned and the above condition is not needed.
> >>>
> >>>> + kasan_alloc_module_shadow((void *)phys_end,
> >>>> + round_up(nr << PAGE_SHIFT, MODULE_ALIGN),
> >>> Here again, (nr << PAGE_SHIFT) is PAGE_SIZE-aligned and PAGE_SIZE is a
> >>> multiple of MODULE_ALIGN, therefore (nr << PAGE_SHIFT) is always
> >>> multiple of MODULE_ALIGN and there is no need for round_up().
> >>>
> >>> IOW, I think this patch should simply add one line:
> >>>
> >>> vm_module_tags->nr_pages += nr;
> >>> + kasan_alloc_module_shadow((void *)phys_end, nr <<
> >>> PAGE_SHIFT, GFP_KERNEL);
> >>>
> >>> Am I missing something?
> >>>
> >>
> >>>> + GFP_KERNEL);
> >>>> }
> >>>>
> >>>> /*
> >>>> --
> >>>> 2.25.1
> >>>>
* Re: [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIS_KASAN_VMALLOC disabled
2024-12-10 20:04 ` Suren Baghdasaryan
@ 2024-12-11 1:10 ` Hao Ge
2024-12-11 2:57 ` [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled Hao Ge
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-11 1:10 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: kent.overstreet, akpm, linux-mm, linux-kernel, greearb, Hao Ge
Hi Suren
On 12/11/24 04:04, Suren Baghdasaryan wrote:
> On Tue, Dec 10, 2024 at 11:37 AM Hao Ge <hao.ge@linux.dev> wrote:
>> Hi Suren
>>
>>
>> On 12/11/24 03:20, Suren Baghdasaryan wrote:
>>> On Tue, Dec 10, 2024 at 10:46 AM Hao Ge <hao.ge@linux.dev> wrote:
>>>> Hi Suren
>>>>
>>>>
>>>> Thanks for your review.
>>>>
>>>>
>>>> On 12/11/24 01:55, Suren Baghdasaryan wrote:
>>>>> On Mon, Dec 9, 2024 at 10:53 PM Hao Ge <hao.ge@linux.dev> wrote:
>>>>>> From: Hao Ge <gehao@kylinos.cn>
>>>>>>
>>>>>> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
>>>>>> is not enabled, we may encounter a panic during system boot.
>>>>>>
>>>>>> Because we haven't allocated pages and created mappings
>>>>>> for the shadow memory corresponding to module_tags region,
>>>>>> similar to how it is done for execmem_vmalloc.
>>>>>>
>>>>>> The difference is that our module_tags are allocated on demand,
>>>>>> so similarly,we also need to allocate shadow memory regions on demand.
>>>>>> However, we still need to adhere to the MODULE_ALIGN principle.
>>>>>>
>>>>>> Here is the log for panic:
>>>>>>
>>>>>> [ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
>>>>>> [ 18.350016] #PF: supervisor read access in kernel mode
>>>>>> [ 18.350459] #PF: error_code(0x0000) - not-present page
>>>>>> [ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
>>>>>> [ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
>>>>>> [ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
>>>>>> [ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
>>>>>> [ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
>>>>>> [ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
>>>>>> [ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
>>>>>> [ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
>>>>>> [ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
>>>>>> [ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
>>>>>> [ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
>>>>>> [ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
>>>>>> [ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
>>>>>> [ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
>>>>>> [ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>> [ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>>>> [ 18.362020] PKRU: 55555554
>>>>>> [ 18.362261] Call Trace:
>>>>>> [ 18.362481] <TASK>
>>>>>> [ 18.362671] ? __die+0x23/0x70
>>>>>> [ 18.362964] ? page_fault_oops+0xc2/0x160
>>>>>> [ 18.363318] ? exc_page_fault+0xad/0xc0
>>>>>> [ 18.363680] ? asm_exc_page_fault+0x26/0x30
>>>>>> [ 18.364056] ? move_module+0x3cc/0x8a0
>>>>>> [ 18.364398] ? kasan_check_range+0xba/0x1b0
>>>>>> [ 18.364755] __asan_memcpy+0x3c/0x60
>>>>>> [ 18.365074] move_module+0x3cc/0x8a0
>>>>>> [ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
>>>>>> [ 18.365841] ? early_mod_check+0x3dc/0x510
>>>>>> [ 18.366195] load_module+0x72/0x1850
>>>>>> [ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
>>>>>> [ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
>>>>>> [ 18.367262] init_module_from_file+0xd1/0x130
>>>>>> [ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
>>>>>> [ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
>>>>>> [ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
>>>>>> [ 18.368938] idempotent_init_module+0x22c/0x790
>>>>>> [ 18.369332] ? simple_getattr+0x6f/0x120
>>>>>> [ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
>>>>>> [ 18.370110] ? fdget+0x58/0x3a0
>>>>>> [ 18.370393] ? security_capable+0x64/0xf0
>>>>>> [ 18.370745] __x64_sys_finit_module+0xc2/0x140
>>>>>> [ 18.371136] do_syscall_64+0x7d/0x160
>>>>>> [ 18.371459] ? fdget_pos+0x1c8/0x4c0
>>>>>> [ 18.371784] ? ksys_read+0xfd/0x1d0
>>>>>> [ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
>>>>>> [ 18.372525] ? do_syscall_64+0x89/0x160
>>>>>> [ 18.372860] ? do_syscall_64+0x89/0x160
>>>>>> [ 18.373194] ? do_syscall_64+0x89/0x160
>>>>>> [ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
>>>>>> [ 18.373952] ? do_syscall_64+0x89/0x160
>>>>>> [ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
>>>>>> [ 18.374701] ? do_syscall_64+0x89/0x160
>>>>>> [ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
>>>>>> [ 18.375416] ? clear_bhb_loop+0x25/0x80
>>>>>> [ 18.375748] ? clear_bhb_loop+0x25/0x80
>>>>>> [ 18.376119] ? clear_bhb_loop+0x25/0x80
>>>>>> [ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>>>>
>>>>>> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
>>>>>> Reported-by: Ben Greear <greearb@candelatech.com>
>>>>>> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
>>>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>>>>> ---
>>>>>> v2: Add comments to facilitate understanding of the code.
>>>>>> Align nr << PAGE_SHIFT to MODULE_ALIGN, even though kasan_alloc_module_shadow
>>>>>> already handles this internally, to make the code more readable and user-friendly
>>>>>>
>>>>>> commit 233e89322cbe ("alloc_tag: fix module allocation
>>>>>> tags populated area calculation") is currently in the
>>>>>> mm-hotfixes-unstable branch, so this patch is
>>>>>> developed based on the mm-hotfixes-unstable branch.
>>>>>> ---
>>>>>> lib/alloc_tag.c | 12 ++++++++++++
>>>>>> 1 file changed, 12 insertions(+)
>>>>>>
>>>>>> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
>>>>>> index f942408b53ef..bd3ee57ea13f 100644
>>>>>> --- a/lib/alloc_tag.c
>>>>>> +++ b/lib/alloc_tag.c
>>>>>> @@ -10,6 +10,7 @@
>>>>>> #include <linux/seq_buf.h>
>>>>>> #include <linux/seq_file.h>
>>>>>> #include <linux/vmalloc.h>
>>>>>> +#include <linux/math.h>
>>>>>>
>>>>>> #define ALLOCINFO_FILE_NAME "allocinfo"
>>>>>> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
>>>>>> @@ -422,6 +423,17 @@ static int vm_module_tags_populate(void)
>>>>>> return -ENOMEM;
>>>>>> }
>>>>>> vm_module_tags->nr_pages += nr;
>>>>>> +
>>>>>> + /*
>>>>>> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
>>>>>> + * When kasan_alloc_module_shadow allocates shadow memory,
>>>>>> + * it does so in units of pages.
>>>>>> + * Therefore, here we need to align to MODULE_ALIGN.
>>>>>> + */
>>>>>> + if ((phys_end & (MODULE_ALIGN - 1)) == 0)
>>>>> phys_end is calculated as:
>>>>>
>>>>> unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
>>>>> (vm_module_tags->nr_pages
>>>>> << PAGE_SHIFT);
>>>>>
>>>>> and therefore is always PAGE_SIZE-aligned. PAGE_SIZE is always a
>>>>> multiple of MODULE_ALIGN, therefore phys_end is always
>>>> When CONFIG_KASAN_VMALLOC is not enabled
>>>>
>>>> #define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
>>> Ah, sorry, I misread this as (PAGE_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
>>> and assumed PAGE_SIZE is always a multiple of MODULE_ALIGN. Now it makes
>>> more sense. However, I'm still not sure about this condition:
>>>
>>> if ((phys_end & (MODULE_ALIGN - 1)) == 0)
>>>
>>> What if phys_end is not MODULE_ALIGN-aligned? We will be skipping
>>> kasan_alloc_module_shadow().
>>
>> Theoretically, this scenario does not exist.
>>
>>
>> Please refer to the following:
>>
>> https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/mm/init.c#L1072
>>
>> They would all comply with MODULE_ALIGN.
> Well, not all. The original execmem_vmap() called from
> alloc_mod_tags_mem() will indeed return MODULE_ALIGN-aligned address,
> therefore the original phys_end is MODULE_ALIGN-aligned. But as
> phys_end grows it can become misaligned. Let's modify my example:
>
> module_tags.start_addr = 0x8000; (returned by execmem_vmap())
> // we need to allocate 1 page (nr = 1)
> phys_end = 0x8000; // MODULE_ALIGN'ed, so we allocate a shadow page
> // tags covered area is [0x8000-0x9000]
> // our shadow memory represents the area [0x8000-0x10000]
>
> // now we allocate 8 more pages (nr = 8)
> phys_end = 0x9000; // not MODULE_ALIGN'ed, we skip allocating shadow pages
> // tags covered area is [0x8000-0x11000]
> // but our shadow memory still represents the area [0x8000-0x10000]
I think I need a cup of coffee at this late hour – I completely forgot
about that logic!
Thank you so much for the detailed explanation and for pointing it out.
I'll address this issue in the next version.
>
>
>>
>>> For example, say module_tags.start_addr == 0x1018 (4096+24), original
>>> phys_end will be 0x1000 (4096) and say we allocated one page (nr ==
>>> 1), tags area is [0x1000-0x2000]. phys_end is not MODULE_ALIGN-aligned
>>> and we will skip kasan_alloc_module_shadow(). IIUC, this is already
>>> incorrect.
>>> Now, say the next time we allocate 8 pages. phys_end this time is
>>> 0x2000 and the new tags area spans [0x1000-0xA000], we skip
>>> kasan_alloc_module_shadow() again. Next time we allocate pages,
>>> phys_end is 0xA000 and it again is not MODULE_ALIGN-aligned, we skip
>>> again. You see my point?
>>>
>>>> https://elixir.bootlin.com/linux/v6.13-rc2/source/include/linux/execmem.h#L11
>>>>
>>>> and on x86, KASAN_SHADOW_SCALE_SHIFT is set to 3
>>>>
>>>> https://elixir.bootlin.com/linux/v6.13-rc2/source/arch/x86/include/asm/kasan.h#L7
>>>>
>>>> As mentioned in my comment, Kasan allocates 1 byte of shadow for every 8
>>>> bytes of data
>>>>
>>>> So, when you allocate a shadow page through kasan_alloc_module_shadow,
>>>> it corresponds to eight physical pages in our system.
>>>>
>>>> So, we need MODULE_ALIGN to ensure proper alignment when allocating
>>>> shadow memory for modules using KASAN.
>>>>
>>>> Let's take a look at the kasan_alloc_module_shadow function again
>>>>
>>>> As I mentioned earlier, KASAN allocates 1 byte of shadow for every 8
>>>> bytes of data.
>>>>
>>>> Assuming phys_end is set to 0 for the sake of this example, if you
>>>> allocate a single shadow page,
>>>>
>>>> the corresponding address range it can represent would be [0, 0x7FFF].
>>>>
>>>> So, it is incorrect to call kasan_alloc_module_shadow every time a page
>>>> is allocated, as it can trigger warnings in the system.
>>>>
>>>> https://elixir.bootlin.com/linux/v6.13-rc2/source/mm/kasan/shadow.c#L599
>>>>
>>>> Thanks
>>>>
>>>> Best Regards Hao
>>>>
>>>>> MODULE_ALIGN-aligned and the above condition is not needed.
>>>>>
>>>>>> + kasan_alloc_module_shadow((void *)phys_end,
>>>>>> + round_up(nr << PAGE_SHIFT, MODULE_ALIGN),
>>>>> Here again, (nr << PAGE_SHIFT) is PAGE_SIZE-aligned and PAGE_SIZE is a
>>>>> multiple of MODULE_ALIGN, therefore (nr << PAGE_SHIFT) is always
>>>>> multiple of MODULE_ALIGN and there is no need for round_up().
>>>>>
>>>>> IOW, I think this patch should simply add one line:
>>>>>
>>>>> vm_module_tags->nr_pages += nr;
>>>>> + kasan_alloc_module_shadow((void *)phys_end, nr << PAGE_SHIFT, GFP_KERNEL);
>>>>>
>>>>> Am I missing something?
>>>>>
>>>>>> + GFP_KERNEL);
>>>>>> }
>>>>>>
>>>>>> /*
>>>>>> --
>>>>>> 2.25.1
>>>>>>
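The misalignment scenario raised in this review can be replayed in user space. What follows is a minimal sketch, not kernel code: it assumes 4 KiB pages and a KASAN shadow scale shift of 3 (the x86 value mentioned in the thread), and the names struct region, v2_populate() and fully_shadowed() are hypothetical. It models only the v2 condition and records how far shadow coverage extends.

```c
#include <stdbool.h>

/* Assumed values: 4 KiB pages, x86 KASAN shadow scale shift. */
#define SK_PAGE_SHIFT   12
#define SK_PAGE_SIZE    (1UL << SK_PAGE_SHIFT)
#define SK_SCALE_SHIFT  3
#define SK_MODULE_ALIGN (SK_PAGE_SIZE << SK_SCALE_SHIFT)

/* Hypothetical model of the tags area and its shadow coverage. */
struct region {
    unsigned long start;      /* page-aligned start of the tags area */
    unsigned long nr_pages;   /* pages mapped so far */
    unsigned long shadow_end; /* data addresses below this have shadow */
};

static unsigned long round_up_ul(unsigned long x, unsigned long a)
{
    return (x + a - 1) & ~(a - 1);
}

/* Models the v2 logic: allocate shadow only when phys_end is aligned. */
static void v2_populate(struct region *r, unsigned long nr)
{
    unsigned long phys_end = r->start + (r->nr_pages << SK_PAGE_SHIFT);

    r->nr_pages += nr;
    if ((phys_end & (SK_MODULE_ALIGN - 1)) == 0)
        r->shadow_end = phys_end +
                        round_up_ul(nr << SK_PAGE_SHIFT, SK_MODULE_ALIGN);
}

/* True when every mapped tags page has shadow memory behind it. */
static bool fully_shadowed(const struct region *r)
{
    return r->shadow_end >= r->start + (r->nr_pages << SK_PAGE_SHIFT);
}
```

Replaying the example from the review (start 0x8000, one page, then eight more) leaves shadow_end at 0x10000 while the tags area ends at 0x11000, so the last page has no shadow.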
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-11 1:10 ` Hao Ge
@ 2024-12-11 2:57 ` Hao Ge
2024-12-11 16:32 ` Suren Baghdasaryan
2024-12-11 17:18 ` [PATCH v3] " kernel test robot
0 siblings, 2 replies; 19+ messages in thread
From: Hao Ge @ 2024-12-11 2:57 UTC (permalink / raw)
To: surenb, kent.overstreet, akpm
Cc: linux-mm, linux-kernel, hao.ge, Hao Ge, Ben Greear
From: Hao Ge <gehao@kylinos.cn>
When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
is not, we may encounter a panic during system boot.
This happens because we have not allocated pages and created mappings
for the shadow memory corresponding to the module_tags region,
as is done for execmem_vmalloc.
The difference is that our module_tags are allocated on demand,
so we also need to allocate the shadow memory regions on demand,
while still adhering to MODULE_ALIGN alignment.
Here is the panic log:
[ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
[ 18.350016] #PF: supervisor read access in kernel mode
[ 18.350459] #PF: error_code(0x0000) - not-present page
[ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
[ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
[ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
[ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
[ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
[ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
[ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
[ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
[ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
[ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
[ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
[ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
[ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.362020] PKRU: 55555554
[ 18.362261] Call Trace:
[ 18.362481] <TASK>
[ 18.362671] ? __die+0x23/0x70
[ 18.362964] ? page_fault_oops+0xc2/0x160
[ 18.363318] ? exc_page_fault+0xad/0xc0
[ 18.363680] ? asm_exc_page_fault+0x26/0x30
[ 18.364056] ? move_module+0x3cc/0x8a0
[ 18.364398] ? kasan_check_range+0xba/0x1b0
[ 18.364755] __asan_memcpy+0x3c/0x60
[ 18.365074] move_module+0x3cc/0x8a0
[ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
[ 18.365841] ? early_mod_check+0x3dc/0x510
[ 18.366195] load_module+0x72/0x1850
[ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
[ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
[ 18.367262] init_module_from_file+0xd1/0x130
[ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
[ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
[ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
[ 18.368938] idempotent_init_module+0x22c/0x790
[ 18.369332] ? simple_getattr+0x6f/0x120
[ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
[ 18.370110] ? fdget+0x58/0x3a0
[ 18.370393] ? security_capable+0x64/0xf0
[ 18.370745] __x64_sys_finit_module+0xc2/0x140
[ 18.371136] do_syscall_64+0x7d/0x160
[ 18.371459] ? fdget_pos+0x1c8/0x4c0
[ 18.371784] ? ksys_read+0xfd/0x1d0
[ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.372525] ? do_syscall_64+0x89/0x160
[ 18.372860] ? do_syscall_64+0x89/0x160
[ 18.373194] ? do_syscall_64+0x89/0x160
[ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.373952] ? do_syscall_64+0x89/0x160
[ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.374701] ? do_syscall_64+0x89/0x160
[ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
[ 18.375416] ? clear_bhb_loop+0x25/0x80
[ 18.375748] ? clear_bhb_loop+0x25/0x80
[ 18.376119] ? clear_bhb_loop+0x25/0x80
[ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v3: Adjust the title because the previous one was a bit unclear.
Suren pointed out that our condition for deciding whether
to allocate shadow memory was incorrect. We now treat
every 8 pages as one index (idx) and decide based
on this idx whether to allocate shadow memory.
v2: Add comments to facilitate understanding of the code.
Align nr << PAGE_SHIFT to MODULE_ALIGN, even though kasan_alloc_module_shadow
already handles this internally, to make the code more readable and user-friendly
commit 233e89322cbe ("alloc_tag: fix module allocation
tags populated area calculation") is currently in the
mm-hotfixes-unstable branch, so this patch is
developed based on the mm-hotfixes-unstable branch.
---
lib/alloc_tag.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index f942408b53ef..8bf04756887d 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -10,6 +10,7 @@
#include <linux/seq_buf.h>
#include <linux/seq_file.h>
#include <linux/vmalloc.h>
+#include <linux/math.h>
#define ALLOCINFO_FILE_NAME "allocinfo"
#define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
@@ -404,6 +405,9 @@ static int vm_module_tags_populate(void)
unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
(vm_module_tags->nr_pages << PAGE_SHIFT);
unsigned long new_end = module_tags.start_addr + module_tags.size;
+ unsigned long phys_idx = (vm_module_tags->nr_pages +
+ (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
+ unsigned long new_idx = 0;
if (phys_end < new_end) {
struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
@@ -421,7 +425,26 @@ static int vm_module_tags_populate(void)
__free_page(next_page[i]);
return -ENOMEM;
}
+
vm_module_tags->nr_pages += nr;
+
+ new_idx = (vm_module_tags->nr_pages +
+ (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
+
+ /*
+ * Kasan allocates 1 byte of shadow for every 8 bytes of data.
+ * When kasan_alloc_module_shadow allocates shadow memory,
+ * its unit of allocation is a page.
+ * Therefore, here we need to align to MODULE_ALIGN.
+ *
+ * For every KASAN_SHADOW_SCALE_SHIFT, a shadow page is allocated.
+ * So, we determine whether to allocate based on whether the
+ * number of pages falls within the scope of the same KASAN_SHADOW_SCALE_SHIFT.
+ */
+ if (phys_idx != new_idx)
+ kasan_alloc_module_shadow((void *)round_up(phys_end, MODULE_ALIGN),
+ (new_idx - phys_idx) * MODULE_ALIGN,
+ GFP_KERNEL);
}
/*
--
2.25.1
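The faulting address in the panic log can be related to the module address held in RDI. The sketch below assumes generic KASAN's mapping, shadow = (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET, and the x86_64 default offset 0xdffffc0000000000. The offset is an assumption: it is not stated anywhere in this thread. The function name sk_mem_to_shadow() is hypothetical.

```c
#include <stdint.h>

#define SK_SHADOW_SCALE_SHIFT 3
/* Assumed x86_64 default offset; not taken from this thread. */
#define SK_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Shadow byte address for a kernel address under generic KASAN. */
static uint64_t sk_mem_to_shadow(uint64_t addr)
{
    return (addr >> SK_SHADOW_SCALE_SHIFT) + SK_SHADOW_OFFSET;
}
```

With RDI = ffffffffc0490000 from the log, this yields fffffbfff8092000, the address reported in CR2, which is consistent with kasan_check_range() reading shadow memory that was never mapped.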
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-11 2:57 ` [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled Hao Ge
@ 2024-12-11 16:32 ` Suren Baghdasaryan
2024-12-12 1:07 ` Hao Ge
2024-12-11 17:18 ` [PATCH v3] " kernel test robot
1 sibling, 1 reply; 19+ messages in thread
From: Suren Baghdasaryan @ 2024-12-11 16:32 UTC (permalink / raw)
To: Hao Ge; +Cc: kent.overstreet, akpm, linux-mm, linux-kernel, Hao Ge, Ben Greear
On Tue, Dec 10, 2024 at 6:58 PM Hao Ge <hao.ge@linux.dev> wrote:
>
> From: Hao Ge <gehao@kylinos.cn>
>
> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
> is not, we may encounter a panic during system boot.
>
> This happens because we have not allocated pages and created mappings
> for the shadow memory corresponding to the module_tags region,
> as is done for execmem_vmalloc.
>
> The difference is that our module_tags are allocated on demand,
> so we also need to allocate the shadow memory regions on demand,
> while still adhering to MODULE_ALIGN alignment.
>
> Here is the panic log:
>
> [ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
> [ 18.350016] #PF: supervisor read access in kernel mode
> [ 18.350459] #PF: error_code(0x0000) - not-present page
> [ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
> [ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
> [ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> [ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
> [ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
> [ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
> [ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
> [ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
> [ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
> [ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
> [ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
> [ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
> [ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
> [ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 18.362020] PKRU: 55555554
> [ 18.362261] Call Trace:
> [ 18.362481] <TASK>
> [ 18.362671] ? __die+0x23/0x70
> [ 18.362964] ? page_fault_oops+0xc2/0x160
> [ 18.363318] ? exc_page_fault+0xad/0xc0
> [ 18.363680] ? asm_exc_page_fault+0x26/0x30
> [ 18.364056] ? move_module+0x3cc/0x8a0
> [ 18.364398] ? kasan_check_range+0xba/0x1b0
> [ 18.364755] __asan_memcpy+0x3c/0x60
> [ 18.365074] move_module+0x3cc/0x8a0
> [ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
> [ 18.365841] ? early_mod_check+0x3dc/0x510
> [ 18.366195] load_module+0x72/0x1850
> [ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
> [ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
> [ 18.367262] init_module_from_file+0xd1/0x130
> [ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
> [ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
> [ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
> [ 18.368938] idempotent_init_module+0x22c/0x790
> [ 18.369332] ? simple_getattr+0x6f/0x120
> [ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
> [ 18.370110] ? fdget+0x58/0x3a0
> [ 18.370393] ? security_capable+0x64/0xf0
> [ 18.370745] __x64_sys_finit_module+0xc2/0x140
> [ 18.371136] do_syscall_64+0x7d/0x160
> [ 18.371459] ? fdget_pos+0x1c8/0x4c0
> [ 18.371784] ? ksys_read+0xfd/0x1d0
> [ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.372525] ? do_syscall_64+0x89/0x160
> [ 18.372860] ? do_syscall_64+0x89/0x160
> [ 18.373194] ? do_syscall_64+0x89/0x160
> [ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.373952] ? do_syscall_64+0x89/0x160
> [ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.374701] ? do_syscall_64+0x89/0x160
> [ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
> [ 18.375416] ? clear_bhb_loop+0x25/0x80
> [ 18.375748] ? clear_bhb_loop+0x25/0x80
> [ 18.376119] ? clear_bhb_loop+0x25/0x80
> [ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
> Reported-by: Ben Greear <greearb@candelatech.com>
> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> ---
> v3: Adjust the title because the previous one was a bit unclear.
> Suren pointed out that our condition for deciding whether
> to allocate shadow memory was incorrect. We now treat
> every 8 pages as one index (idx) and decide based
> on this idx whether to allocate shadow memory.
>
> v2: Add comments to facilitate understanding of the code.
> Align nr << PAGE_SHIFT to MODULE_ALIGN, even though kasan_alloc_module_shadow
> already handles this internally, to make the code more readable and user-friendly
>
> commit 233e89322cbe ("alloc_tag: fix module allocation
> tags populated area calculation") is currently in the
> mm-hotfixes-unstable branch, so this patch is
> developed based on the mm-hotfixes-unstable branch.
> ---
> lib/alloc_tag.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index f942408b53ef..8bf04756887d 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -10,6 +10,7 @@
> #include <linux/seq_buf.h>
> #include <linux/seq_file.h>
> #include <linux/vmalloc.h>
> +#include <linux/math.h>
>
> #define ALLOCINFO_FILE_NAME "allocinfo"
> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
> @@ -404,6 +405,9 @@ static int vm_module_tags_populate(void)
> unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
> (vm_module_tags->nr_pages << PAGE_SHIFT);
> unsigned long new_end = module_tags.start_addr + module_tags.size;
> + unsigned long phys_idx = (vm_module_tags->nr_pages +
> + (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
> + unsigned long new_idx = 0;
>
> if (phys_end < new_end) {
> struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
> @@ -421,7 +425,26 @@ static int vm_module_tags_populate(void)
> __free_page(next_page[i]);
> return -ENOMEM;
> }
> +
> vm_module_tags->nr_pages += nr;
> +
> + new_idx = (vm_module_tags->nr_pages +
> + (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
> +
> + /*
> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
> + * When kasan_alloc_module_shadow allocates shadow memory,
> + * its unit of allocation is a page.
> + * Therefore, here we need to align to MODULE_ALIGN.
> + *
> + * For every KASAN_SHADOW_SCALE_SHIFT, a shadow page is allocated.
> + * So, we determine whether to allocate based on whether the
> + * number of pages falls within the scope of the same KASAN_SHADOW_SCALE_SHIFT.
> + */
> + if (phys_idx != new_idx)
> + kasan_alloc_module_shadow((void *)round_up(phys_end, MODULE_ALIGN),
> + (new_idx - phys_idx) * MODULE_ALIGN,
> + GFP_KERNEL);
> }
This seems overly complicated. I was thinking something like this would work:
static int vm_module_tags_populate(void)
{
unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
(vm_module_tags->nr_pages << PAGE_SHIFT);
unsigned long new_end = module_tags.start_addr + module_tags.size;
if (phys_end < new_end) {
struct page **next_page = vm_module_tags->pages +
vm_module_tags->nr_pages;
+ unsigned long old_shadow_end = ALIGN(phys_end, MODULE_ALIGN);
+ unsigned long new_shadow_end = ALIGN(new_end, MODULE_ALIGN);
unsigned long more_pages;
unsigned long nr;
more_pages = ALIGN(new_end - phys_end, PAGE_SIZE) >> PAGE_SHIFT;
nr = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_NOWARN,
NUMA_NO_NODE,
more_pages, next_page);
if (nr < more_pages ||
vmap_pages_range(phys_end, phys_end + (nr << PAGE_SHIFT), PAGE_KERNEL,
next_page, PAGE_SHIFT) < 0) {
/* Clean up and error out */
for (int i = 0; i < nr; i++)
__free_page(next_page[i]);
return -ENOMEM;
}
vm_module_tags->nr_pages += nr;
+ if (old_shadow_end < new_shadow_end)
+ kasan_alloc_module_shadow((void *)old_shadow_end,
+ new_shadow_end - old_shadow_end,
+ GFP_KERNEL);
}
WDYT?
>
> /*
> --
> 2.25.1
>
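The range-based idea proposed in this reply can be checked in user space. This is a sketch under stated assumptions, not the kernel implementation: 4 KiB pages, a shadow scale shift of 3, and hypothetical names (struct shadow_range, shadow_to_allocate()). Instead of testing whether phys_end is aligned, it aligns both the old and new ends up to the module alignment and reports the gap that needs new shadow.

```c
/* Assumed values: 4 KiB pages, shadow scale shift of 3. */
#define SK_PAGE_SHIFT   12
#define SK_PAGE_SIZE    (1UL << SK_PAGE_SHIFT)
#define SK_MODULE_ALIGN (SK_PAGE_SIZE << 3)
#define SK_ALIGN(x, a)  (((x) + (a) - 1) & ~((a) - 1))

/* Half-open data range [start, end) needing new shadow; empty if equal. */
struct shadow_range {
    unsigned long start;
    unsigned long end;
};

static struct shadow_range shadow_to_allocate(unsigned long phys_end,
                                              unsigned long new_end)
{
    unsigned long old_shadow_end = SK_ALIGN(phys_end, SK_MODULE_ALIGN);
    unsigned long new_shadow_end = SK_ALIGN(new_end, SK_MODULE_ALIGN);
    struct shadow_range r = { old_shadow_end, old_shadow_end };

    /* Only grow coverage when the aligned end actually moved. */
    if (old_shadow_end < new_shadow_end)
        r.end = new_shadow_end;
    return r;
}
```

For the earlier example, growing from 0x9000 to 0x11000 returns [0x10000, 0x18000), so the page that the alignment-only check left uncovered now gets shadow. Growing within an already covered unit returns an empty range.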
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-11 2:57 ` [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled Hao Ge
2024-12-11 16:32 ` Suren Baghdasaryan
@ 2024-12-11 17:18 ` kernel test robot
2024-12-12 2:23 ` Hao Ge
1 sibling, 1 reply; 19+ messages in thread
From: kernel test robot @ 2024-12-11 17:18 UTC (permalink / raw)
To: Hao Ge, surenb, kent.overstreet, akpm
Cc: oe-kbuild-all, linux-mm, linux-kernel, hao.ge, Hao Ge, Ben Greear
Hi Hao,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Hao-Ge/mm-alloc_tag-Fix-panic-when-CONFIG_KASAN-enabled-and-CONFIG_KASAN_VMALLOC-not-enabled/20241211-110206
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20241211025755.56173-1-hao.ge%40linux.dev
patch subject: [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
config: i386-buildonly-randconfig-005-20241211 (https://download.01.org/0day-ci/archive/20241212/202412120143.l3g6vx8b-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241212/202412120143.l3g6vx8b-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412120143.l3g6vx8b-lkp@intel.com/
All errors (new ones prefixed by >>):
lib/alloc_tag.c: In function 'vm_module_tags_populate':
>> lib/alloc_tag.c:409:40: error: 'KASAN_SHADOW_SCALE_SHIFT' undeclared (first use in this function)
409 | (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
| ^~~~~~~~~~~~~~~~~~~~~~~~
lib/alloc_tag.c:409:40: note: each undeclared identifier is reported only once for each function it appears in
vim +/KASAN_SHADOW_SCALE_SHIFT +409 lib/alloc_tag.c
402
403 static int vm_module_tags_populate(void)
404 {
405 unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
406 (vm_module_tags->nr_pages << PAGE_SHIFT);
407 unsigned long new_end = module_tags.start_addr + module_tags.size;
408 unsigned long phys_idx = (vm_module_tags->nr_pages +
> 409 (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
410 unsigned long new_idx = 0;
411
412 if (phys_end < new_end) {
413 struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
414 unsigned long more_pages;
415 unsigned long nr;
416
417 more_pages = ALIGN(new_end - phys_end, PAGE_SIZE) >> PAGE_SHIFT;
418 nr = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_NOWARN,
419 NUMA_NO_NODE, more_pages, next_page);
420 if (nr < more_pages ||
421 vmap_pages_range(phys_end, phys_end + (nr << PAGE_SHIFT), PAGE_KERNEL,
422 next_page, PAGE_SHIFT) < 0) {
423 /* Clean up and error out */
424 for (int i = 0; i < nr; i++)
425 __free_page(next_page[i]);
426 return -ENOMEM;
427 }
428
429 vm_module_tags->nr_pages += nr;
430
431 new_idx = (vm_module_tags->nr_pages +
432 (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
433
434 /*
435 * Kasan allocates 1 byte of shadow for every 8 bytes of data.
436 * When kasan_alloc_module_shadow allocates shadow memory,
437 * its unit of allocation is a page.
438 * Therefore, here we need to align to MODULE_ALIGN.
439 *
440 * For every KASAN_SHADOW_SCALE_SHIFT, a shadow page is allocated.
441 * So, we determine whether to allocate based on whether the
442 * number of pages falls within the scope of the same KASAN_SHADOW_SCALE_SHIFT.
443 */
444 if (phys_idx != new_idx)
445 kasan_alloc_module_shadow((void *)round_up(phys_end, MODULE_ALIGN),
446 (new_idx - phys_idx) * MODULE_ALIGN,
447 GFP_KERNEL);
448 }
449
450 /*
451 * Mark the pages as accessible, now that they are mapped.
452 * With hardware tag-based KASAN, marking is skipped for
453 * non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc().
454 */
455 kasan_unpoison_vmalloc((void *)module_tags.start_addr,
456 new_end - module_tags.start_addr,
457 KASAN_VMALLOC_PROT_NORMAL);
458
459 return 0;
460 }
461
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
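One way to avoid this class of build failure is to write the arithmetic in terms of a macro that exists in every configuration. The sketch below is a user-space illustration, not the fix adopted in this thread. It assumes that MODULE_ALIGN falls back to PAGE_SIZE when the KASAN shadow shift is unavailable; the thread only quotes the KASAN-without-VMALLOC definition, so the fallback is an assumption. SK_CONFIG_KASAN_NO_VMALLOC and sk_units_for_pages() are hypothetical names.

```c
#define SK_PAGE_SIZE 4096UL

/* Hypothetical stand-in for the kernel configuration switch. */
#ifdef SK_CONFIG_KASAN_NO_VMALLOC
#define SK_SHADOW_SCALE_SHIFT 3
#define SK_MODULE_ALIGN (SK_PAGE_SIZE << SK_SHADOW_SCALE_SHIFT)
#else
/* The shift is deliberately left undefined, as in the failing config. */
#define SK_MODULE_ALIGN SK_PAGE_SIZE
#endif

/* Alignment units needed to cover nr_pages; uses only SK_MODULE_ALIGN. */
static unsigned long sk_units_for_pages(unsigned long nr_pages)
{
    unsigned long bytes = nr_pages * SK_PAGE_SIZE;

    return (bytes + SK_MODULE_ALIGN - 1) / SK_MODULE_ALIGN;
}
```

Because the function never names the shift directly, it compiles whether or not the switch is defined.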
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-11 16:32 ` Suren Baghdasaryan
@ 2024-12-12 1:07 ` Hao Ge
2024-12-12 1:37 ` [PATCH v4] " Hao Ge
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-12 1:07 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: kent.overstreet, akpm, linux-mm, linux-kernel, Hao Ge, Ben Greear
Hi Suren
On 12/12/24 00:32, Suren Baghdasaryan wrote:
> On Tue, Dec 10, 2024 at 6:58 PM Hao Ge <hao.ge@linux.dev> wrote:
>> From: Hao Ge <gehao@kylinos.cn>
>>
>> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
>> is not, we may encounter a panic during system boot.
>>
>> This happens because we have not allocated pages and created mappings
>> for the shadow memory corresponding to the module_tags region,
>> as is done for execmem_vmalloc.
>>
>> The difference is that our module_tags are allocated on demand,
>> so we also need to allocate the shadow memory regions on demand,
>> while still adhering to MODULE_ALIGN alignment.
>>
>> Here is the panic log:
>>
>> [ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
>> [ 18.350016] #PF: supervisor read access in kernel mode
>> [ 18.350459] #PF: error_code(0x0000) - not-present page
>> [ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
>> [ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
>> [ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
>> [ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
>> [ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
>> [ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
>> [ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
>> [ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
>> [ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
>> [ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
>> [ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
>> [ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
>> [ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
>> [ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
>> [ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 18.362020] PKRU: 55555554
>> [ 18.362261] Call Trace:
>> [ 18.362481] <TASK>
>> [ 18.362671] ? __die+0x23/0x70
>> [ 18.362964] ? page_fault_oops+0xc2/0x160
>> [ 18.363318] ? exc_page_fault+0xad/0xc0
>> [ 18.363680] ? asm_exc_page_fault+0x26/0x30
>> [ 18.364056] ? move_module+0x3cc/0x8a0
>> [ 18.364398] ? kasan_check_range+0xba/0x1b0
>> [ 18.364755] __asan_memcpy+0x3c/0x60
>> [ 18.365074] move_module+0x3cc/0x8a0
>> [ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
>> [ 18.365841] ? early_mod_check+0x3dc/0x510
>> [ 18.366195] load_module+0x72/0x1850
>> [ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
>> [ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
>> [ 18.367262] init_module_from_file+0xd1/0x130
>> [ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
>> [ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
>> [ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
>> [ 18.368938] idempotent_init_module+0x22c/0x790
>> [ 18.369332] ? simple_getattr+0x6f/0x120
>> [ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
>> [ 18.370110] ? fdget+0x58/0x3a0
>> [ 18.370393] ? security_capable+0x64/0xf0
>> [ 18.370745] __x64_sys_finit_module+0xc2/0x140
>> [ 18.371136] do_syscall_64+0x7d/0x160
>> [ 18.371459] ? fdget_pos+0x1c8/0x4c0
>> [ 18.371784] ? ksys_read+0xfd/0x1d0
>> [ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
>> [ 18.372525] ? do_syscall_64+0x89/0x160
>> [ 18.372860] ? do_syscall_64+0x89/0x160
>> [ 18.373194] ? do_syscall_64+0x89/0x160
>> [ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
>> [ 18.373952] ? do_syscall_64+0x89/0x160
>> [ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
>> [ 18.374701] ? do_syscall_64+0x89/0x160
>> [ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
>> [ 18.375416] ? clear_bhb_loop+0x25/0x80
>> [ 18.375748] ? clear_bhb_loop+0x25/0x80
>> [ 18.376119] ? clear_bhb_loop+0x25/0x80
>> [ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
>> Reported-by: Ben Greear <greearb@candelatech.com>
>> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>> ---
>> v3: Adjust the title because the previous one was a bit unclear.
>> Suren pointed out that our condition for deciding whether
>> to allocate shadow memory was unreasonable. We now use every
>> 8 pages as an index (idx) and decide based on this idx
>> whether to allocate shadow memory.
>>
>> v2: Add comments to make the code easier to understand.
>> Align nr << PAGE_SHIFT to MODULE_ALIGN. kasan_alloc_module_shadow
>> already handles this internally, but doing it explicitly makes the code more readable.
>>
>> commit 233e89322cbe ("alloc_tag: fix module allocation
>> tags populated area calculation") is currently in the
>> mm-hotfixes-unstable branch, so this patch is
>> developed based on the mm-hotfixes-unstable branch.
>> ---
>> lib/alloc_tag.c | 23 +++++++++++++++++++++++
>> 1 file changed, 23 insertions(+)
>>
>> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
>> index f942408b53ef..8bf04756887d 100644
>> --- a/lib/alloc_tag.c
>> +++ b/lib/alloc_tag.c
>> @@ -10,6 +10,7 @@
>> #include <linux/seq_buf.h>
>> #include <linux/seq_file.h>
>> #include <linux/vmalloc.h>
>> +#include <linux/math.h>
>>
>> #define ALLOCINFO_FILE_NAME "allocinfo"
>> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
>> @@ -404,6 +405,9 @@ static int vm_module_tags_populate(void)
>> unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
>> (vm_module_tags->nr_pages << PAGE_SHIFT);
>> unsigned long new_end = module_tags.start_addr + module_tags.size;
>> + unsigned long phys_idx = (vm_module_tags->nr_pages +
>> + (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
>> + unsigned long new_idx = 0;
>>
>> if (phys_end < new_end) {
>> struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
>> @@ -421,7 +425,26 @@ static int vm_module_tags_populate(void)
>> __free_page(next_page[i]);
>> return -ENOMEM;
>> }
>> +
>> vm_module_tags->nr_pages += nr;
>> +
>> + new_idx = (vm_module_tags->nr_pages +
>> + (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
>> +
>> + /*
>> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
>> + * When kasan_alloc_module_shadow allocates shadow memory,
>> + * its unit of allocation is a page.
>> + * Therefore, here we need to align to MODULE_ALIGN.
>> + *
>> + * For every KASAN_SHADOW_SCALE_SHIFT, a shadow page is allocated.
>> + * So, we determine whether to allocate based on whether the
>> + * number of pages falls within the scope of the same KASAN_SHADOW_SCALE_SHIFT.
>> + */
>> + if (phys_idx != new_idx)
>> + kasan_alloc_module_shadow((void *)round_up(phys_end, MODULE_ALIGN),
>> + (new_idx - phys_idx) * MODULE_ALIGN,
>> + GFP_KERNEL);
>> }
> This seems overly complicated. I was thinking something like this would work:
>
> static int vm_module_tags_populate(void)
> {
> unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
> (vm_module_tags->nr_pages << PAGE_SHIFT);
> unsigned long new_end = module_tags.start_addr + module_tags.size;
>
> if (phys_end < new_end) {
> struct page **next_page = vm_module_tags->pages +
> vm_module_tags->nr_pages;
> + unsigned long old_shadow_end = ALIGN(phys_end, MODULE_ALIGN);
> + unsigned long new_shadow_end = ALIGN(new_end, MODULE_ALIGN);
> unsigned long more_pages;
> unsigned long nr;
>
> more_pages = ALIGN(new_end - phys_end, PAGE_SIZE) >> PAGE_SHIFT;
> nr = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_NOWARN,
> NUMA_NO_NODE,
> more_pages, next_page);
> if (nr < more_pages ||
> vmap_pages_range(phys_end, phys_end + (nr <<
> PAGE_SHIFT), PAGE_KERNEL,
> next_page, PAGE_SHIFT) < 0) {
> /* Clean up and error out */
> for (int i = 0; i < nr; i++)
> __free_page(next_page[i]);
> return -ENOMEM;
> }
> vm_module_tags->nr_pages += nr;
> + if (old_shadow_end < new_shadow_end)
> + kasan_alloc_module_shadow((void *)old_shadow_end,
> new_shadow_end - old_shadow_end,
> + GFP_KERNEL);
> }
>
> WDYT?
Yes, it's much simpler this way.
I'll verify it for accuracy. If there are no issues, I'll send out the v4
version and add your "Suggested-by".
Thanks
Best Regards
Hao
>
>> /*
>> --
>> 2.25.1
>>
* [PATCH v4] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-12 1:07 ` Hao Ge
@ 2024-12-12 1:37 ` Hao Ge
2024-12-12 6:48 ` Suren Baghdasaryan
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-12 1:37 UTC (permalink / raw)
To: surenb, kent.overstreet, akpm
Cc: linux-mm, linux-kernel, hao.ge, Hao Ge, Ben Greear
From: Hao Ge <gehao@kylinos.cn>
When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
is not enabled, we may encounter a panic during system boot,
because we haven't allocated pages and created mappings
for the shadow memory corresponding to the module_tags region,
similar to how it is done for execmem_vmalloc.
The difference is that our module_tags are allocated on demand,
so we also need to allocate shadow memory regions on demand.
However, we still need to adhere to the MODULE_ALIGN principle.
Here is the panic log:
[ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
[ 18.350016] #PF: supervisor read access in kernel mode
[ 18.350459] #PF: error_code(0x0000) - not-present page
[ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
[ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
[ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
[ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
[ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
[ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
[ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
[ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
[ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
[ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
[ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
[ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
[ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.362020] PKRU: 55555554
[ 18.362261] Call Trace:
[ 18.362481] <TASK>
[ 18.362671] ? __die+0x23/0x70
[ 18.362964] ? page_fault_oops+0xc2/0x160
[ 18.363318] ? exc_page_fault+0xad/0xc0
[ 18.363680] ? asm_exc_page_fault+0x26/0x30
[ 18.364056] ? move_module+0x3cc/0x8a0
[ 18.364398] ? kasan_check_range+0xba/0x1b0
[ 18.364755] __asan_memcpy+0x3c/0x60
[ 18.365074] move_module+0x3cc/0x8a0
[ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
[ 18.365841] ? early_mod_check+0x3dc/0x510
[ 18.366195] load_module+0x72/0x1850
[ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
[ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
[ 18.367262] init_module_from_file+0xd1/0x130
[ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
[ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
[ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
[ 18.368938] idempotent_init_module+0x22c/0x790
[ 18.369332] ? simple_getattr+0x6f/0x120
[ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
[ 18.370110] ? fdget+0x58/0x3a0
[ 18.370393] ? security_capable+0x64/0xf0
[ 18.370745] __x64_sys_finit_module+0xc2/0x140
[ 18.371136] do_syscall_64+0x7d/0x160
[ 18.371459] ? fdget_pos+0x1c8/0x4c0
[ 18.371784] ? ksys_read+0xfd/0x1d0
[ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.372525] ? do_syscall_64+0x89/0x160
[ 18.372860] ? do_syscall_64+0x89/0x160
[ 18.373194] ? do_syscall_64+0x89/0x160
[ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.373952] ? do_syscall_64+0x89/0x160
[ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.374701] ? do_syscall_64+0x89/0x160
[ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
[ 18.375416] ? clear_bhb_loop+0x25/0x80
[ 18.375748] ? clear_bhb_loop+0x25/0x80
[ 18.376119] ? clear_bhb_loop+0x25/0x80
[ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v4: Simplify the code as Suren suggested.
Update the code comments to match the new code.
Add Suggested-by: Suren Baghdasaryan <surenb@google.com>
v3: Adjust the title because the previous one was a bit unclear.
Suren pointed out that our condition for deciding whether
to allocate shadow memory was unreasonable. We now use every
8 pages as an index (idx) and decide based on this idx
whether to allocate shadow memory.
v2: Add comments to make the code easier to understand.
Align nr << PAGE_SHIFT to MODULE_ALIGN. kasan_alloc_module_shadow
already handles this internally, but doing it explicitly makes the code more readable.
commit 233e89322cbe ("alloc_tag: fix module allocation
tags populated area calculation") is currently in the
mm-hotfixes-unstable branch, so this patch is
developed based on the mm-hotfixes-unstable branch.
---
lib/alloc_tag.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index f942408b53ef..c5bdfa297a35 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -407,6 +407,8 @@ static int vm_module_tags_populate(void)
if (phys_end < new_end) {
struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
+ unsigned long old_shadow_end = ALIGN(phys_end, MODULE_ALIGN);
+ unsigned long new_shadow_end = ALIGN(new_end, MODULE_ALIGN);
unsigned long more_pages;
unsigned long nr;
@@ -421,7 +423,19 @@ static int vm_module_tags_populate(void)
__free_page(next_page[i]);
return -ENOMEM;
}
+
vm_module_tags->nr_pages += nr;
+
+ /*
+ * Kasan allocates 1 byte of shadow for every 8 bytes of data.
+ * When kasan_alloc_module_shadow allocates shadow memory,
+ * its unit of allocation is a page.
+ * Therefore, here we need to align to MODULE_ALIGN.
+ */
+ if (old_shadow_end < new_shadow_end)
+ kasan_alloc_module_shadow((void *)old_shadow_end,
+ new_shadow_end - old_shadow_end,
+ GFP_KERNEL);
}
/*
--
2.25.1
* Re: [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-11 17:18 ` [PATCH v3] " kernel test robot
@ 2024-12-12 2:23 ` Hao Ge
0 siblings, 0 replies; 19+ messages in thread
From: Hao Ge @ 2024-12-12 2:23 UTC (permalink / raw)
To: kernel test robot, surenb, kent.overstreet, akpm
Cc: oe-kbuild-all, linux-mm, linux-kernel, Hao Ge, Ben Greear
Hi
Thanks for your report.
This version has been superseded; a new v4 has been posted.
On 12/12/24 01:18, kernel test robot wrote:
> Hi Hao,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on akpm-mm/mm-everything]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Hao-Ge/mm-alloc_tag-Fix-panic-when-CONFIG_KASAN-enabled-and-CONFIG_KASAN_VMALLOC-not-enabled/20241211-110206
> base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link: https://lore.kernel.org/r/20241211025755.56173-1-hao.ge%40linux.dev
> patch subject: [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
> config: i386-buildonly-randconfig-005-20241211 (https://download.01.org/0day-ci/archive/20241212/202412120143.l3g6vx8b-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241212/202412120143.l3g6vx8b-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202412120143.l3g6vx8b-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
> lib/alloc_tag.c: In function 'vm_module_tags_populate':
>>> lib/alloc_tag.c:409:40: error: 'KASAN_SHADOW_SCALE_SHIFT' undeclared (first use in this function)
> 409 | (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
> | ^~~~~~~~~~~~~~~~~~~~~~~~
> lib/alloc_tag.c:409:40: note: each undeclared identifier is reported only once for each function it appears in
>
>
> vim +/KASAN_SHADOW_SCALE_SHIFT +409 lib/alloc_tag.c
>
> 402
> 403 static int vm_module_tags_populate(void)
> 404 {
> 405 unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
> 406 (vm_module_tags->nr_pages << PAGE_SHIFT);
> 407 unsigned long new_end = module_tags.start_addr + module_tags.size;
> 408 unsigned long phys_idx = (vm_module_tags->nr_pages +
> > 409 (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
> 410 unsigned long new_idx = 0;
> 411
> 412 if (phys_end < new_end) {
> 413 struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
> 414 unsigned long more_pages;
> 415 unsigned long nr;
> 416
> 417 more_pages = ALIGN(new_end - phys_end, PAGE_SIZE) >> PAGE_SHIFT;
> 418 nr = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_NOWARN,
> 419 NUMA_NO_NODE, more_pages, next_page);
> 420 if (nr < more_pages ||
> 421 vmap_pages_range(phys_end, phys_end + (nr << PAGE_SHIFT), PAGE_KERNEL,
> 422 next_page, PAGE_SHIFT) < 0) {
> 423 /* Clean up and error out */
> 424 for (int i = 0; i < nr; i++)
> 425 __free_page(next_page[i]);
> 426 return -ENOMEM;
> 427 }
> 428
> 429 vm_module_tags->nr_pages += nr;
> 430
> 431 new_idx = (vm_module_tags->nr_pages +
> 432 (2 << KASAN_SHADOW_SCALE_SHIFT) - 1) >> KASAN_SHADOW_SCALE_SHIFT;
> 433
> 434 /*
> 435 * Kasan allocates 1 byte of shadow for every 8 bytes of data.
> 436 * When kasan_alloc_module_shadow allocates shadow memory,
> 437 * its unit of allocation is a page.
> 438 * Therefore, here we need to align to MODULE_ALIGN.
> 439 *
> 440 * For every KASAN_SHADOW_SCALE_SHIFT, a shadow page is allocated.
> 441 * So, we determine whether to allocate based on whether the
> 442 * number of pages falls within the scope of the same KASAN_SHADOW_SCALE_SHIFT.
> 443 */
> 444 if (phys_idx != new_idx)
> 445 kasan_alloc_module_shadow((void *)round_up(phys_end, MODULE_ALIGN),
> 446 (new_idx - phys_idx) * MODULE_ALIGN,
> 447 GFP_KERNEL);
> 448 }
> 449
> 450 /*
> 451 * Mark the pages as accessible, now that they are mapped.
> 452 * With hardware tag-based KASAN, marking is skipped for
> 453 * non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc().
> 454 */
> 455 kasan_unpoison_vmalloc((void *)module_tags.start_addr,
> 456 new_end - module_tags.start_addr,
> 457 KASAN_VMALLOC_PROT_NORMAL);
> 458
> 459 return 0;
> 460 }
> 461
>
* Re: [PATCH v4] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-12 1:37 ` [PATCH v4] " Hao Ge
@ 2024-12-12 6:48 ` Suren Baghdasaryan
2024-12-12 7:03 ` [PATCH v5] " Hao Ge
0 siblings, 1 reply; 19+ messages in thread
From: Suren Baghdasaryan @ 2024-12-12 6:48 UTC (permalink / raw)
To: Hao Ge; +Cc: kent.overstreet, akpm, linux-mm, linux-kernel, Hao Ge, Ben Greear
On Wed, Dec 11, 2024 at 5:38 PM Hao Ge <hao.ge@linux.dev> wrote:
>
> From: Hao Ge <gehao@kylinos.cn>
>
> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
> is not enabled, we may encounter a panic during system boot.
>
> Because we haven't allocated pages and created mappings
> for the shadow memory corresponding to module_tags region,
> similar to how it is done for execmem_vmalloc.
>
> The difference is that our module_tags are allocated on demand,
> so similarly, we also need to allocate shadow memory regions on demand.
> However, we still need to adhere to the MODULE_ALIGN principle.
nit: the above wording is a bit unclear. Instead of module_tags I
would call it "memory for module allocation tags". So, I would change
the above paragraph to:
The memory for module allocation tags is allocated on demand,
therefore we need to allocate shadow memory on demand as well in
MODULE_ALIGN blocks.
>
> Here is the log for panic:
>
> [ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
> [ 18.350016] #PF: supervisor read access in kernel mode
> [ 18.350459] #PF: error_code(0x0000) - not-present page
> [ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
> [ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
> [ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> [ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
> [ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
> [ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
> [ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
> [ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
> [ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
> [ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
> [ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
> [ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
> [ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
> [ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 18.362020] PKRU: 55555554
> [ 18.362261] Call Trace:
> [ 18.362481] <TASK>
> [ 18.362671] ? __die+0x23/0x70
> [ 18.362964] ? page_fault_oops+0xc2/0x160
> [ 18.363318] ? exc_page_fault+0xad/0xc0
> [ 18.363680] ? asm_exc_page_fault+0x26/0x30
> [ 18.364056] ? move_module+0x3cc/0x8a0
> [ 18.364398] ? kasan_check_range+0xba/0x1b0
> [ 18.364755] __asan_memcpy+0x3c/0x60
> [ 18.365074] move_module+0x3cc/0x8a0
> [ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
> [ 18.365841] ? early_mod_check+0x3dc/0x510
> [ 18.366195] load_module+0x72/0x1850
> [ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
> [ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
> [ 18.367262] init_module_from_file+0xd1/0x130
> [ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
> [ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
> [ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
> [ 18.368938] idempotent_init_module+0x22c/0x790
> [ 18.369332] ? simple_getattr+0x6f/0x120
> [ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
> [ 18.370110] ? fdget+0x58/0x3a0
> [ 18.370393] ? security_capable+0x64/0xf0
> [ 18.370745] __x64_sys_finit_module+0xc2/0x140
> [ 18.371136] do_syscall_64+0x7d/0x160
> [ 18.371459] ? fdget_pos+0x1c8/0x4c0
> [ 18.371784] ? ksys_read+0xfd/0x1d0
> [ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.372525] ? do_syscall_64+0x89/0x160
> [ 18.372860] ? do_syscall_64+0x89/0x160
> [ 18.373194] ? do_syscall_64+0x89/0x160
> [ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.373952] ? do_syscall_64+0x89/0x160
> [ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.374701] ? do_syscall_64+0x89/0x160
> [ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
> [ 18.375416] ? clear_bhb_loop+0x25/0x80
> [ 18.375748] ? clear_bhb_loop+0x25/0x80
> [ 18.376119] ? clear_bhb_loop+0x25/0x80
> [ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
> Reported-by: Ben Greear <greearb@candelatech.com>
> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
> Suggested-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
Other than that nit, LGTM.
Acked-by: Suren Baghdasaryan <surenb@google.com>
> ---
> v4: Simplify the code as Suren suggested.
> Update the code comments to match the new code.
> Add Suggested-by: Suren Baghdasaryan <surenb@google.com>
>
> v3: Adjust the title because the previous one was a bit unclear.
> Suren pointed out that our condition for deciding whether
> to allocate shadow memory was unreasonable. We now use every
> 8 pages as an index (idx) and decide based on this idx
> whether to allocate shadow memory.
>
> v2: Add comments to make the code easier to understand.
> Align nr << PAGE_SHIFT to MODULE_ALIGN. kasan_alloc_module_shadow
> already handles this internally, but doing it explicitly makes the code more readable.
>
> commit 233e89322cbe ("alloc_tag: fix module allocation
> tags populated area calculation") is currently in the
> mm-hotfixes-unstable branch, so this patch is
> developed based on the mm-hotfixes-unstable branch.
> ---
> lib/alloc_tag.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index f942408b53ef..c5bdfa297a35 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -407,6 +407,8 @@ static int vm_module_tags_populate(void)
>
> if (phys_end < new_end) {
> struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
> + unsigned long old_shadow_end = ALIGN(phys_end, MODULE_ALIGN);
> + unsigned long new_shadow_end = ALIGN(new_end, MODULE_ALIGN);
> unsigned long more_pages;
> unsigned long nr;
>
> @@ -421,7 +423,19 @@ static int vm_module_tags_populate(void)
> __free_page(next_page[i]);
> return -ENOMEM;
> }
> +
> vm_module_tags->nr_pages += nr;
> +
> + /*
> + * Kasan allocates 1 byte of shadow for every 8 bytes of data.
> + * When kasan_alloc_module_shadow allocates shadow memory,
> + * its unit of allocation is a page.
> + * Therefore, here we need to align to MODULE_ALIGN.
> + */
> + if (old_shadow_end < new_shadow_end)
> + kasan_alloc_module_shadow((void *)old_shadow_end,
> + new_shadow_end - old_shadow_end,
> + GFP_KERNEL);
> }
>
> /*
> --
> 2.25.1
>
* [PATCH v5] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-12 6:48 ` Suren Baghdasaryan
@ 2024-12-12 7:03 ` Hao Ge
2024-12-12 7:21 ` [PATCH v6] " Hao Ge
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-12 7:03 UTC (permalink / raw)
To: surenb, kent.overstreet, akpm
Cc: linux-mm, linux-kernel, hao.ge, Hao Ge, Ben Greear
From: Hao Ge <gehao@kylinos.cn>
When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
is not enabled, we may encounter a panic during system boot,
because we haven't allocated pages and created mappings
for the shadow memory corresponding to the module_tags region,
similar to how it is done for execmem_vmalloc.
The memory for module allocation tags is allocated on demand,
therefore we need to allocate shadow memory on demand as well in
MODULE_ALIGN blocks.
Here is the panic log:
[ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
[ 18.350016] #PF: supervisor read access in kernel mode
[ 18.350459] #PF: error_code(0x0000) - not-present page
[ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
[ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
[ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
[ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
[ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
[ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
[ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
[ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
[ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
[ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
[ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
[ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
[ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.362020] PKRU: 55555554
[ 18.362261] Call Trace:
[ 18.362481] <TASK>
[ 18.362671] ? __die+0x23/0x70
[ 18.362964] ? page_fault_oops+0xc2/0x160
[ 18.363318] ? exc_page_fault+0xad/0xc0
[ 18.363680] ? asm_exc_page_fault+0x26/0x30
[ 18.364056] ? move_module+0x3cc/0x8a0
[ 18.364398] ? kasan_check_range+0xba/0x1b0
[ 18.364755] __asan_memcpy+0x3c/0x60
[ 18.365074] move_module+0x3cc/0x8a0
[ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
[ 18.365841] ? early_mod_check+0x3dc/0x510
[ 18.366195] load_module+0x72/0x1850
[ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
[ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
[ 18.367262] init_module_from_file+0xd1/0x130
[ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
[ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
[ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
[ 18.368938] idempotent_init_module+0x22c/0x790
[ 18.369332] ? simple_getattr+0x6f/0x120
[ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
[ 18.370110] ? fdget+0x58/0x3a0
[ 18.370393] ? security_capable+0x64/0xf0
[ 18.370745] __x64_sys_finit_module+0xc2/0x140
[ 18.371136] do_syscall_64+0x7d/0x160
[ 18.371459] ? fdget_pos+0x1c8/0x4c0
[ 18.371784] ? ksys_read+0xfd/0x1d0
[ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.372525] ? do_syscall_64+0x89/0x160
[ 18.372860] ? do_syscall_64+0x89/0x160
[ 18.373194] ? do_syscall_64+0x89/0x160
[ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.373952] ? do_syscall_64+0x89/0x160
[ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.374701] ? do_syscall_64+0x89/0x160
[ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
[ 18.375416] ? clear_bhb_loop+0x25/0x80
[ 18.375748] ? clear_bhb_loop+0x25/0x80
[ 18.376119] ? clear_bhb_loop+0x25/0x80
[ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v5: Modify the commit message based on Suren's suggestions
Add Acked-by: Suren Baghdasaryan <surenb@google.com>
v4: Simplify the code as Suren suggested.
Update the code comments to match the new code.
Add Suggested-by: Suren Baghdasaryan <surenb@google.com>
v3: Adjust the title because the previous one was a bit unclear.
Suren pointed out that our condition for deciding whether
to allocate shadow memory was unreasonable. We now use every
8 pages as an index (idx) and decide based on this idx
whether to allocate shadow memory.
v2: Add comments to make the code easier to understand.
Align nr << PAGE_SHIFT to MODULE_ALIGN. kasan_alloc_module_shadow
already handles this internally, but doing it explicitly makes the code more readable.
commit 233e89322cbe ("alloc_tag: fix module allocation
tags populated area calculation") is currently in the
mm-hotfixes-unstable branch, so this patch is
developed based on the mm-hotfixes-unstable branch.
---
lib/alloc_tag.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index f942408b53ef..c5bdfa297a35 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -407,6 +407,8 @@ static int vm_module_tags_populate(void)
if (phys_end < new_end) {
struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
+ unsigned long old_shadow_end = ALIGN(phys_end, MODULE_ALIGN);
+ unsigned long new_shadow_end = ALIGN(new_end, MODULE_ALIGN);
unsigned long more_pages;
unsigned long nr;
@@ -421,7 +423,19 @@ static int vm_module_tags_populate(void)
__free_page(next_page[i]);
return -ENOMEM;
}
+
vm_module_tags->nr_pages += nr;
+
+ /*
+ * Kasan allocates 1 byte of shadow for every 8 bytes of data.
+ * When kasan_alloc_module_shadow allocates shadow memory,
+ * its unit of allocation is a page.
+ * Therefore, here we need to align to MODULE_ALIGN.
+ */
+ if (old_shadow_end < new_shadow_end)
+ kasan_alloc_module_shadow((void *)old_shadow_end,
+ new_shadow_end - old_shadow_end,
+ GFP_KERNEL);
}
/*
--
2.25.1
* [PATCH v6] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-12 7:03 ` [PATCH v5] " Hao Ge
@ 2024-12-12 7:21 ` Hao Ge
2024-12-12 14:07 ` Adrian Huang12
0 siblings, 1 reply; 19+ messages in thread
From: Hao Ge @ 2024-12-12 7:21 UTC (permalink / raw)
To: surenb, kent.overstreet, akpm
Cc: linux-mm, linux-kernel, hao.ge, Hao Ge, Ben Greear
From: Hao Ge <gehao@kylinos.cn>
When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC
is not enabled, we may encounter a panic during system boot,
because we haven't allocated pages and created mappings
for the shadow memory corresponding to the module allocation tags
region, similar to how it is done for execmem_vmalloc.
The memory for module allocation tags is allocated on demand,
therefore we need to allocate shadow memory on demand as well in
MODULE_ALIGN blocks.
Here is the panic log:
[ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
[ 18.350016] #PF: supervisor read access in kernel mode
[ 18.350459] #PF: error_code(0x0000) - not-present page
[ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD 102495067 PTE 0
[ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
[ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
[ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
[ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
[ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
[ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI: ffffffffc0490000
[ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09: fffffbfff809201d
[ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12: ffffffffc0490000
[ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15: 000000000000002c
[ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000) knlGS:0000000000000000
[ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4: 0000000000771ef0
[ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.362020] PKRU: 55555554
[ 18.362261] Call Trace:
[ 18.362481] <TASK>
[ 18.362671] ? __die+0x23/0x70
[ 18.362964] ? page_fault_oops+0xc2/0x160
[ 18.363318] ? exc_page_fault+0xad/0xc0
[ 18.363680] ? asm_exc_page_fault+0x26/0x30
[ 18.364056] ? move_module+0x3cc/0x8a0
[ 18.364398] ? kasan_check_range+0xba/0x1b0
[ 18.364755] __asan_memcpy+0x3c/0x60
[ 18.365074] move_module+0x3cc/0x8a0
[ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
[ 18.365841] ? early_mod_check+0x3dc/0x510
[ 18.366195] load_module+0x72/0x1850
[ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
[ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
[ 18.367262] init_module_from_file+0xd1/0x130
[ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
[ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
[ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
[ 18.368938] idempotent_init_module+0x22c/0x790
[ 18.369332] ? simple_getattr+0x6f/0x120
[ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
[ 18.370110] ? fdget+0x58/0x3a0
[ 18.370393] ? security_capable+0x64/0xf0
[ 18.370745] __x64_sys_finit_module+0xc2/0x140
[ 18.371136] do_syscall_64+0x7d/0x160
[ 18.371459] ? fdget_pos+0x1c8/0x4c0
[ 18.371784] ? ksys_read+0xfd/0x1d0
[ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.372525] ? do_syscall_64+0x89/0x160
[ 18.372860] ? do_syscall_64+0x89/0x160
[ 18.373194] ? do_syscall_64+0x89/0x160
[ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.373952] ? do_syscall_64+0x89/0x160
[ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
[ 18.374701] ? do_syscall_64+0x89/0x160
[ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
[ 18.375416] ? clear_bhb_loop+0x25/0x80
[ 18.375748] ? clear_bhb_loop+0x25/0x80
[ 18.376119] ? clear_bhb_loop+0x25/0x80
[ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
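The faulting address in CR2 can be cross-checked against the destination pointer in RDI. The sketch below is not part of the patch. It assumes the x86-64 generic KASAN mapping (a shadow scale shift of 3 and a shadow offset of 0xdffffc0000000000); the helper names mem_to_shadow and check_against_oops are illustrative userspace stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN constants; other architectures differ. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Address of the shadow byte tracking the 8-byte granule that holds addr. */
uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Compare against the oops: RDI is the memcpy destination, CR2 its shadow. */
void check_against_oops(void)
{
	assert(mem_to_shadow(0xffffffffc0490000ULL) == 0xfffffbfff8092000ULL);
}
```

Under these assumptions the shadow of RDI (ffffffffc0490000) is exactly the address reported in CR2, which suggests the fault is a read of unmapped shadow memory for the module allocation tags region.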
Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area calculation")
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v6: Replace a leftover "module_tags" in the commit message
with "module allocation tags", which the previous version missed.
v5: Modify the commit message based on Suren's suggestions.
Add Acked-by: Suren Baghdasaryan <surenb@google.com>
v4: Simplify the code based on Suren's suggestion.
Update the code comments to match the modified code.
Add Suggested-by: Suren Baghdasaryan <surenb@google.com>
v3: Adjust the title because the previous one was a bit unclear.
Suren pointed out that our condition for determining whether
to allocate shadow memory was unreasonable. We have adjusted our method
to use every 8 pages as an index (idx), and we decide based
on this idx whether to allocate shadow memory.
v2: Add comments to facilitate understanding of the code.
Align nr << PAGE_SHIFT to MODULE_ALIGN, even though kasan_alloc_module_shadow
already handles this internally, to make the code more readable and user-friendly.

Commit 233e89322cbe ("alloc_tag: fix module allocation
tags populated area calculation") is currently in the
mm-hotfixes-unstable branch, so this patch is
developed based on the mm-hotfixes-unstable branch.
---
lib/alloc_tag.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index f942408b53ef..c5bdfa297a35 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -407,6 +407,8 @@ static int vm_module_tags_populate(void)

 	if (phys_end < new_end) {
 		struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
+		unsigned long old_shadow_end = ALIGN(phys_end, MODULE_ALIGN);
+		unsigned long new_shadow_end = ALIGN(new_end, MODULE_ALIGN);
 		unsigned long more_pages;
 		unsigned long nr;

@@ -421,7 +423,19 @@ static int vm_module_tags_populate(void)
 				__free_page(next_page[i]);
 			return -ENOMEM;
 		}
+
 		vm_module_tags->nr_pages += nr;
+
+		/*
+		 * Kasan allocates 1 byte of shadow for every 8 bytes of data.
+		 * When kasan_alloc_module_shadow allocates shadow memory,
+		 * its unit of allocation is a page.
+		 * Therefore, here we need to align to MODULE_ALIGN.
+		 */
+		if (old_shadow_end < new_shadow_end)
+			kasan_alloc_module_shadow((void *)old_shadow_end,
+						  new_shadow_end - old_shadow_end,
+						  GFP_KERNEL);
 	}

 	/*
--
2.25.1
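
The alignment logic in the hunk above can be tried out in userspace. This is a minimal sketch, not kernel code: it assumes 4 KiB pages and a KASAN shadow scale shift of 3, so that MODULE_ALIGN is 8 pages (32 KiB) and one shadow page covers exactly one MODULE_ALIGN block. ALIGN_UP and shadow_range_to_populate are hypothetical names introduced here for illustration.

```c
#include <assert.h>

/* Assumed values: 4 KiB pages, generic KASAN with 8 bytes per shadow byte. */
#define PAGE_SIZE 4096UL
#define KASAN_SHADOW_SCALE_SHIFT 3
/* Span of module address space covered by one page of shadow memory. */
#define MODULE_ALIGN (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/*
 * Given the old and new end of the populated tags area, return how many
 * bytes of address space still need shadow memory and store the start of
 * that range in *start. Returns 0 (leaving *start untouched) when the
 * shadow page that is already mapped covers the growth.
 */
unsigned long shadow_range_to_populate(unsigned long phys_end,
				       unsigned long new_end,
				       unsigned long *start)
{
	unsigned long old_shadow_end = ALIGN_UP(phys_end, MODULE_ALIGN);
	unsigned long new_shadow_end = ALIGN_UP(new_end, MODULE_ALIGN);

	assert(phys_end <= new_end);
	if (old_shadow_end >= new_shadow_end)
		return 0;

	*start = old_shadow_end;
	return new_shadow_end - old_shadow_end;
}
```

For example, growing the populated end from a MODULE_ALIGN boundary by a single page asks for one full MODULE_ALIGN block of shadow coverage, while the next seven single-page growths ask for nothing, because they stay inside the block whose shadow page is already mapped.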
* RE: [PATCH v6] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled
2024-12-12 7:21 ` [PATCH v6] " Hao Ge
@ 2024-12-12 14:07 ` Adrian Huang12
0 siblings, 0 replies; 19+ messages in thread
From: Adrian Huang12 @ 2024-12-12 14:07 UTC (permalink / raw)
To: Hao Ge, surenb@google.com, kent.overstreet@linux.dev,
akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Hao Ge,
Ben Greear
> -----Original Message-----
> From: owner-linux-mm@kvack.org <owner-linux-mm@kvack.org> On Behalf
> Of Hao Ge
> Sent: Thursday, December 12, 2024 3:21 PM
> To: surenb@google.com; kent.overstreet@linux.dev;
> akpm@linux-foundation.org
> Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; hao.ge@linux.dev;
> Hao Ge <gehao@kylinos.cn>; Ben Greear <greearb@candelatech.com>
> Subject: [External] [PATCH v6] mm/alloc_tag: Fix panic when CONFIG_KASAN
> enabled and CONFIG_KASAN_VMALLOC not enabled
>
> From: Hao Ge <gehao@kylinos.cn>
>
> When CONFIG_KASAN is enabled but CONFIG_KASAN_VMALLOC is not
> enabled, we may encounter a panic during system boot.
>
> Because we haven't allocated pages and created mappings for the shadow
> memory corresponding to module allocation tags region,similar to how it is
> done for execmem_vmalloc.
>
> The memory for module allocation tags is allocated on demand, therefore we
> need to allocate shadow memory on demand as well in MODULE_ALIGN
> blocks.
>
> Here is the log for panic:
>
> [ 18.349421] BUG: unable to handle page fault for address: fffffbfff8092000
> [ 18.350016] #PF: supervisor read access in kernel mode
> [ 18.350459] #PF: error_code(0x0000) - not-present page
> [ 18.350904] PGD 20fe52067 P4D 219dc8067 PUD 219dc4067 PMD
> 102495067 PTE 0
> [ 18.351484] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [ 18.351961] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0-rc1+ #3
> [ 18.352533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> [ 18.353494] RIP: 0010:kasan_check_range+0xba/0x1b0
> [ 18.353931] Code: 8d 5a 07 4c 0f 49 da 49 c1 fb 03 45 85 db 0f 84 dd 00 00
> 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 c2 0f 84 c1 00 00 00 <48> 83 38
> 00 74 ed 48 8d 50 08 eb 0d 48 83 c0 01 48 39 d0 0f 84 90
> [ 18.355484] RSP: 0018:ff11000101877958 EFLAGS: 00010206
> [ 18.355937] RAX: fffffbfff8092000 RBX: fffffbfff809201e RCX: ffffffff82a7ceac
> [ 18.356542] RDX: fffffbfff8092018 RSI: 00000000000000f0 RDI:
> ffffffffc0490000
> [ 18.357153] RBP: fffffbfff8092000 R08: 0000000000000001 R09:
> fffffbfff809201d
> [ 18.357756] R10: ffffffffc04900ef R11: 0000000000000003 R12:
> ffffffffc0490000
> [ 18.358365] R13: ff11000101877b48 R14: ffffffffc0490000 R15:
> 000000000000002c
> [ 18.358968] FS: 00007f9bd13c5940(0000) GS:ff110001eb480000(0000)
> knlGS:0000000000000000
> [ 18.359648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 18.360178] CR2: fffffbfff8092000 CR3: 0000000109214004 CR4:
> 0000000000771ef0
> [ 18.360790] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 18.361404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 18.362020] PKRU: 55555554
> [ 18.362261] Call Trace:
> [ 18.362481] <TASK>
> [ 18.362671] ? __die+0x23/0x70
> [ 18.362964] ? page_fault_oops+0xc2/0x160
> [ 18.363318] ? exc_page_fault+0xad/0xc0
> [ 18.363680] ? asm_exc_page_fault+0x26/0x30
> [ 18.364056] ? move_module+0x3cc/0x8a0
> [ 18.364398] ? kasan_check_range+0xba/0x1b0
> [ 18.364755] __asan_memcpy+0x3c/0x60
> [ 18.365074] move_module+0x3cc/0x8a0
> [ 18.365386] layout_and_allocate.constprop.0+0x3d5/0x720
> [ 18.365841] ? early_mod_check+0x3dc/0x510
> [ 18.366195] load_module+0x72/0x1850
> [ 18.366509] ? __pfx_kernel_read_file+0x10/0x10
> [ 18.366918] ? vm_mmap_pgoff+0x21c/0x2d0
> [ 18.367262] init_module_from_file+0xd1/0x130
> [ 18.367638] ? __pfx_init_module_from_file+0x10/0x10
> [ 18.368073] ? __pfx__raw_spin_lock+0x10/0x10
> [ 18.368456] ? __pfx_cred_has_capability.isra.0+0x10/0x10
> [ 18.368938] idempotent_init_module+0x22c/0x790
> [ 18.369332] ? simple_getattr+0x6f/0x120
> [ 18.369676] ? __pfx_idempotent_init_module+0x10/0x10
> [ 18.370110] ? fdget+0x58/0x3a0
> [ 18.370393] ? security_capable+0x64/0xf0
> [ 18.370745] __x64_sys_finit_module+0xc2/0x140
> [ 18.371136] do_syscall_64+0x7d/0x160
> [ 18.371459] ? fdget_pos+0x1c8/0x4c0
> [ 18.371784] ? ksys_read+0xfd/0x1d0
> [ 18.372106] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.372525] ? do_syscall_64+0x89/0x160
> [ 18.372860] ? do_syscall_64+0x89/0x160
> [ 18.373194] ? do_syscall_64+0x89/0x160
> [ 18.373527] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.373952] ? do_syscall_64+0x89/0x160
> [ 18.374283] ? syscall_exit_to_user_mode+0x10/0x1f0
> [ 18.374701] ? do_syscall_64+0x89/0x160
> [ 18.375037] ? do_user_addr_fault+0x4a8/0xa40
> [ 18.375416] ? clear_bhb_loop+0x25/0x80
> [ 18.375748] ? clear_bhb_loop+0x25/0x80
> [ 18.376119] ? clear_bhb_loop+0x25/0x80
> [ 18.376450] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Fixes: 233e89322cbe ("alloc_tag: fix module allocation tags populated area
> calculation")
> Reported-by: Ben Greear <greearb@candelatech.com>
> Closes: https://lore.kernel.org/all/1ba0cc57-e2ed-caa2-1241-aa5615bee01f@candelatech.com/
> Suggested-by: Suren Baghdasaryan <surenb@google.com>
> Acked-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
Thanks for the fix.
I encountered this issue recently and confirmed that this patch fixes it.
Feel free to add my Tested-by.
Tested-by: Adrian Huang <ahuang12@lenovo.com>
-- Adrian
Thread overview: 19+ messages
2024-12-10 4:15 [PATCH] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIS_KASAN_VMALLOC disabled Hao Ge
2024-12-10 6:53 ` [PATCH v2] " Hao Ge
2024-12-10 17:55 ` Suren Baghdasaryan
2024-12-10 18:45 ` Hao Ge
2024-12-10 19:20 ` Suren Baghdasaryan
2024-12-10 19:36 ` Hao Ge
2024-12-10 20:04 ` Suren Baghdasaryan
2024-12-11 1:10 ` Hao Ge
2024-12-11 2:57 ` [PATCH v3] mm/alloc_tag: Fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled Hao Ge
2024-12-11 16:32 ` Suren Baghdasaryan
2024-12-12 1:07 ` Hao Ge
2024-12-12 1:37 ` [PATCH v4] " Hao Ge
2024-12-12 6:48 ` Suren Baghdasaryan
2024-12-12 7:03 ` [PATCH v5] " Hao Ge
2024-12-12 7:21 ` [PATCH v6] " Hao Ge
2024-12-12 14:07 ` Adrian Huang12
2024-12-11 17:18 ` [PATCH v3] " kernel test robot
2024-12-12 2:23 ` Hao Ge
2024-12-10 18:56 ` [PATCH v2] mm/alloc_tag: Add kasan_alloc_module_shadow when CONFIS_KASAN_VMALLOC disabled Ben Greear