[BUG] Fault during memory acceptance for TDX VMs with certain memory sizes

* [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
@ 2026-02-12 16:29 Moritz Sanft
  2026-02-12 23:31 ` Dave Hansen
  0 siblings, 1 reply; 9+ messages in thread
From: Moritz Sanft @ 2026-02-12 16:29 UTC (permalink / raw)
  To: ardb; +Cc: linux-efi, linux-kernel

Dear Ard Biesheuvel,

We are running into a kernel panic when starting TDX VMs with certain 
memory sizes in QEMU. The issue reproduces on mainline as well as on 
the current 6.18 stable and 6.12 LTS kernels.

Based on our current (trial-and-error-based) knowledge, the issue only 
occurs on TDX VMs with memory sizes >64GB, where the memory size in MiB 
is not a multiple of 1024. For instance, the QEMU argument `-m 67G` 
works, while `-m 67000M` results in the crash cited below. The 
configurations we've tested so far are as follows:

- `-m 66690M` results in the crash.
- `-m 66900M` results in the crash.
- `-m 67000M` results in the crash.
- `-m 67G` does not crash.
- `-m 68608M` (67 * 1024) does not crash.
- `-m 6960M` does not crash.
- `-m 33960M` does not crash.

Note that QEMU interprets `67G` in terms of powers of 2, i.e., a 
multiple of 1024, as can be seen in [1].

The crash happens during memory acceptance in _find_next_bit (although, 
I presume, accept_memory is the more revealing thing here):

```
[    0.111989] BUG: unable to handle page fault for address: 
ff1100007d625008
[    0.114651] #PF: supervisor read access in kernel mode
[    0.116645] #PF: error_code(0x0000) - not-present page
[    0.118539] PGD 40801067 P4D 40802067 PUD 10db7ff067 PMD 10db7fe067 PTE 0
[    0.121108] Oops: Oops: 0000 [#1] SMP NOPTI
[    0.122729] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.19.0 
#1-NixOS NONE
[    0.125651] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
unknown 02/02/2022
[    0.128650] RIP: 0010:_find_next_bit+0x1e/0x80
[    0.130217] Code: c1 c3 c3 66 0f 1f 84 00 00 00 00 00 49 89 f8 48 89 
f0 48 39 f2 73 5a 48 89 d6 48 c7 c7 ff ff ff ff 89 d1 48 c1 ee 06 48 d3 
e7 <49> 23 3c f0 75 42 48 8d 56 01 49 8d 4c f0 08 48 c1 e2 06 eb 2d 66
[    0.137510] RSP: 0000:ffffffff91603d28 EFLAGS: 00010087
[    0.139634] RAX: 0000000000007edc RBX: 0000000000000000 RCX: 
0000000000007edb
[    0.142417] RDX: 0000000000007edb RSI: 00000000000001fb RDI: 
fffffffff8000000
[    0.145263] RBP: ffffffff91603da8 R08: ff1100007d624030 R09: 
ffffffff91c2d090
[    0.148058] R10: 0000000000000083 R11: ffffffff916beb78 R12: 
00000000ffffffff
[    0.150849] R13: ff1100007d624030 R14: 0000000000200000 R15: 
ff1100007d624018
[    0.153702] FS:  0000000000000000(0000) GS:0000000000000000(0000) 
knlGS:0000000000000000
[    0.156962] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.159239] CR2: ff1100007d625008 CR3: 000000003fe1a000 CR4: 
00000000000210f0
[    0.162088] Call Trace:
[    0.163075]  <TASK>
[    0.163882]  ? accept_memory+0x184/0x250
[    0.165448]  ? memblock_alloc_range_nid+0x16b/0x180
[    0.167308]  ? memblock_alloc_internal+0x38/0x70
[    0.168997]  ? __memblock_alloc_or_panic+0x8c/0xe0
[    0.170835]  ? copy_device_tree+0x1e/0x40
[    0.172368]  ? unflatten_device_tree+0x5e/0x90
[    0.174074]  ? x86_flattree_get_config+0x9b/0xe0
[    0.175865]  ? setup_arch+0x948/0xa50
[    0.177264]  ? start_kernel+0x48/0x6f0
[    0.178768]  ? x86_64_start_reservations+0x20/0x20
[    0.180610]  ? x86_64_start_kernel+0xcd/0xd0
[    0.182265]  ? common_startup_64+0x13e/0x141
[    0.183921]  </TASK>
[    0.184767] Modules linked in:
[    0.185971] CR2: ff1100007d625008
[    0.187308] ---[ end trace 0000000000000000 ]---
[    0.189071] RIP: 0010:_find_next_bit+0x1e/0x80
[    0.190789] Code: c1 c3 c3 66 0f 1f 84 00 00 00 00 00 49 89 f8 48 89 
f0 48 39 f2 73 5a 48 89 d6 48 c7 c7 ff ff ff ff 89 d1 48 c1 ee 06 48 d3 
e7 <49> 23 3c f0 75 42 48 8d 56 01 49 8d 4c f0 08 48 c1 e2 06 eb 2d 66
[    0.198025] RSP: 0000:ffffffff91603d28 EFLAGS: 00010087
[    0.200114] RAX: 0000000000007edc RBX: 0000000000000000 RCX: 
0000000000007edb
[    0.202992] RDX: 0000000000007edb RSI: 00000000000001fb RDI: 
fffffffff8000000
[    0.205779] RBP: ffffffff91603da8 R08: ff1100007d624030 R09: 
ffffffff91c2d090
[    0.208578] R10: 0000000000000083 R11: ffffffff916beb78 R12: 
00000000ffffffff
[    0.211444] R13: ff1100007d624030 R14: 0000000000200000 R15: 
ff1100007d624018
[    0.214223] FS:  0000000000000000(0000) GS:0000000000000000(0000) 
knlGS:0000000000000000
[    0.217326] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.219588] CR2: ff1100007d625008 CR3: 000000003fe1a000 CR4: 
00000000000210f0
[    0.222420] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.226193] ---[ end Kernel panic - not syncing: Attempted to kill 
the idle task! ]---
```

Please find a full log at [2].

The kernel configuration used can be found at [3].

The following QEMU command can be used to reproduce the issue:

```
qemu-system-x86_64 \
   -machine q35,accel=kvm,kernel_irqchip=split,confidential-guest-support=tdx \
   -cpu host,pmu=off \
   -m 67000M \
   -object '{"qom-type":"tdx-guest","id":"tdx"}' \
   -display none \
   -vga none \
   -nodefaults \
   --no-reboot \
   -kernel "/path/to/kernel-binary" \
   -append "earlyprintk=ttyS0 console=ttyS0" \
   -serial stdio \
   -bios "/path/to/OVMF.fd" \
   -smp 1
```

We have verified this on QEMU v10.1.2 and OVMF version 202511. The 
former is needed to create TDX VMs without further patches, whereas the 
OVMF version should not matter much for reproducing this bug.

I assume this report might be better directed at the maintainers of the 
TDX subsystem, given the hardware requirements for reproduction. 
However, I decided to report it to you first, as it concerns the EFI 
subsystem at least to some extent. Please let me know if you have a 
recommendation on whom to forward this to.

Best Regards,
Moritz Sanft

[1]: <https://gitlab.com/qemu-project/qemu/-/blob/v10.1.2/util/cutils.c?ref_type=tags#L368-371>
[2]: <https://gist.github.com/msanft/45d576466cc5483aef40946213790fcb>
[3]: <https://gist.github.com/msanft/0be5ee612c51df3fd307e449b1f04461>



* Re: [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
  2026-02-12 16:29 [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes Moritz Sanft
@ 2026-02-12 23:31 ` Dave Hansen
  2026-02-13  8:34   ` Moritz Sanft
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Hansen @ 2026-02-12 23:31 UTC (permalink / raw)
  To: Moritz Sanft, ardb, Edgecombe, Rick P, Kiryl Shutsemau,
	Weiny, Ira, Wunner, Lukas
  Cc: linux-efi, linux-kernel

On 2/12/26 08:29, Moritz Sanft wrote:
> Based on our current (trial-and-error-based) knowledge, the issue only
> occurs on TDX VMs with memory sizes >64GB, where the memory size is not
> aligned to a multiple of 1024. For instance, the QEMU argument `-m 67G`
> works, while `-m 67000M` results in the crash cited below. The
> configurations we've tested so far are as follows:

I don't see any outrageous bugs in the code. I'm going to take a guess
though: the 'unit_size' and the bitmap size don't match or aren't
consistent.

I'd guess that _something_ is unaligned and you're running off the end
of the bitmap or the *mapping* for the bitmap. Any chance you can throw
a bunch of printk()'s in the kernel and see what all the fields in here are:

struct efi_unaccepted_memory {
        u32 version;
        u32 unit_size;
        u64 phys_base;
        u64 size;
        unsigned long bitmap[];
};

Along with the address of bitmap[] and all the calls to: bitmap_clear()?

That should shed some light on it.

Any other TDX folks that want to try and reproduce this and do the same
would also be much appreciated!


* Re: [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
  2026-02-12 23:31 ` Dave Hansen
@ 2026-02-13  8:34   ` Moritz Sanft
  2026-02-13 11:56     ` Kiryl Shutsemau
  0 siblings, 1 reply; 9+ messages in thread
From: Moritz Sanft @ 2026-02-13  8:34 UTC (permalink / raw)
  To: Dave Hansen, ardb, Edgecombe, Rick P, Kiryl Shutsemau, Weiny, Ira,
	Wunner, Lukas
  Cc: linux-efi, linux-kernel

> Any chance you can throw
> a bunch of printk()'s in the kernel and see what all the fields in here are:
> 
> struct efi_unaccepted_memory {
>         u32 version;
>         u32 unit_size;
>         u64 phys_base;
>         u64 size;
>         unsigned long bitmap[];
> };
> 
> Along with the address of bitmap[] and all the calls to: bitmap_clear()?

Thanks for the guidance. I've added this logging via the patch in [1], 
which produced the following output:

```
[    0.033292] accept_memory(start=0x0000000000099000 size=0x6000)
[    0.037860]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.041469] Using GB pages for direct mapping
[    0.043090] accept_memory(start=0x00000010db600000 size=0x200000)
[    0.045311]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.058123]   bitmap_clear(bitmap=ff1100007d624030, start=32475, len=1)
[    0.060921] accept_memory(start=0x00000010db7ff000 size=0x1000)
[    0.063142]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.066865] accept_memory(start=0x00000010db7fe000 size=0x1000)
[    0.069096]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.073705] accept_memory(start=0x00000010db7fd000 size=0x1000)
[    0.075908]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
// unrelated logs omitted here
[    0.134988] accept_memory(start=0x00000010db7fcf40 size=0x83)
[    0.137152]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.140828] BUG: unable to handle page fault for address: 
ff1100007d625008
```

Find a full log attached in [2].

Please let me know if we need to gather any further logs - we're happy 
to do so.

Best Regards,
Moritz Sanft

[1]: https://gist.github.com/msanft/13709e1ec9976a1b4b2723b98163a04b
[2]: https://gist.github.com/msanft/d102475bb28baa4b7958ed35e001e962


* Re: [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
  2026-02-13  8:34   ` Moritz Sanft
@ 2026-02-13 11:56     ` Kiryl Shutsemau
  2026-02-13 12:33       ` Moritz Sanft
  0 siblings, 1 reply; 9+ messages in thread
From: Kiryl Shutsemau @ 2026-02-13 11:56 UTC (permalink / raw)
  To: Moritz Sanft
  Cc: Dave Hansen, ardb, Edgecombe, Rick P, Weiny, Ira, Wunner, Lukas,
	linux-efi, linux-kernel

On Fri, Feb 13, 2026 at 09:34:46AM +0100, Moritz Sanft wrote:
> > Any chance you can throw
> > a bunch of printk()'s in the kernel and see what all the fields in here are:
> > 
> > struct efi_unaccepted_memory {
> >         u32 version;
> >         u32 unit_size;
> >         u64 phys_base;
> >         u64 size;
> >         unsigned long bitmap[];
> > };
> > 
> > Along with the address of bitmap[] and all the calls to: bitmap_clear()?
> 
> Thanks for the guidance. I've added this logging via the patch in [1], which
> produced the following output:
> 
> ```
> [    0.033292] accept_memory(start=0x0000000000099000 size=0x6000)
> [    0.037860]   unaccepted: version=1 unit_size=2097152
> phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
> [    0.041469] Using GB pages for direct mapping
> [    0.043090] accept_memory(start=0x00000010db600000 size=0x200000)
> [    0.045311]   unaccepted: version=1 unit_size=2097152
> phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
> [    0.058123]   bitmap_clear(bitmap=ff1100007d624030, start=32475, len=1)
> [    0.060921] accept_memory(start=0x00000010db7ff000 size=0x1000)
> [    0.063142]   unaccepted: version=1 unit_size=2097152
> phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
> [    0.066865] accept_memory(start=0x00000010db7fe000 size=0x1000)
> [    0.069096]   unaccepted: version=1 unit_size=2097152
> phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
> [    0.073705] accept_memory(start=0x00000010db7fd000 size=0x1000)
> [    0.075908]   unaccepted: version=1 unit_size=2097152
> phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
> // unrelated logs omitted here
> [    0.134988] accept_memory(start=0x00000010db7fcf40 size=0x83)
> [    0.137152]   unaccepted: version=1 unit_size=2097152
> phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
> [    0.140828] BUG: unable to handle page fault for address:
> ff1100007d625008
> ```
> 
> Find a full log attached in [2].
> 
> Please let me know if we need to gather any further logs - we're happy to do
> so.

Could you check if this patch makes a difference:

diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
index c2c067eff634..f2a00cd429f2 100644
--- a/drivers/firmware/efi/unaccepted_memory.c
+++ b/drivers/firmware/efi/unaccepted_memory.c
@@ -35,7 +35,7 @@ void accept_memory(phys_addr_t start, unsigned long size)
 	struct efi_unaccepted_memory *unaccepted;
 	unsigned long range_start, range_end;
 	struct accept_range range, *entry;
-	phys_addr_t end = start + size;
+	phys_addr_t end = start + PAGE_ALIGN(size);
 	unsigned long flags;
 	u64 unit_size;
 
-- 
  Kiryl Shutsemau / Kirill A. Shutemov


* Re: [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
  2026-02-13 11:56     ` Kiryl Shutsemau
@ 2026-02-13 12:33       ` Moritz Sanft
  2026-02-13 14:24         ` Kiryl Shutsemau
  0 siblings, 1 reply; 9+ messages in thread
From: Moritz Sanft @ 2026-02-13 12:33 UTC (permalink / raw)
  To: Kiryl Shutsemau
  Cc: Dave Hansen, ardb, Edgecombe, Rick P, Weiny, Ira, Wunner, Lukas,
	linux-efi, linux-kernel

> Could you check it this patch makes a difference:
> 
> diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
> index c2c067eff634..f2a00cd429f2 100644
> --- a/drivers/firmware/efi/unaccepted_memory.c
> +++ b/drivers/firmware/efi/unaccepted_memory.c
> @@ -35,7 +35,7 @@ void accept_memory(phys_addr_t start, unsigned long size)
>  	struct efi_unaccepted_memory *unaccepted;
>  	unsigned long range_start, range_end;
>  	struct accept_range range, *entry;
> -	phys_addr_t end = start + size;
> +	phys_addr_t end = start + PAGE_ALIGN(size);
>  	unsigned long flags;
>  	u64 unit_size;

Thanks, I tried this on the `-m 67000M` VM and the crash still occurs. I 
extended the previously-added logging to also log the values for `start 
+ size` and `start + PAGE_ALIGN(size)`. Please find the full patch 
including the logging and your change in [1].

The produced logs are as follows:

```
[    0.046472] accept_memory(start=0x00000010db600000 size=0x200000)
[    0.048747]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.052557]   (start + size)=0x00000010db800000 (start + 
PAGE_ALIGN(size))=0x00000010db800000
[    0.065217]   bitmap_clear(bitmap=ff1100007d624030, start=32475, len=1)
[    0.067928] accept_memory(start=0x00000010db7ff000 size=0x1000)
[    0.070167]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.073917]   (start + size)=0x00000010db800000 (start + 
PAGE_ALIGN(size))=0x00000010db800000
[    0.077150] accept_memory(start=0x00000010db7fe000 size=0x1000)
[    0.079365]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.083080]   (start + size)=0x00000010db7ff000 (start + 
PAGE_ALIGN(size))=0x00000010db7ff000
[    0.087123] accept_memory(start=0x00000010db7fd000 size=0x1000)
[    0.089362]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.093239]   (start + size)=0x00000010db7fe000 (start + 
PAGE_ALIGN(size))=0x00000010db7fe000
// unrelated
[    0.150522] APIC: Switched APIC routing to: cluster x2apic
[    0.152595] accept_memory(start=0x00000010db7fcf40 size=0x83)
[    0.154745]   unaccepted: version=1 unit_size=2097152 
phys_base=0x100000000 size=0xfdc bitmap=ff1100007d624030
[    0.158479]   (start + size)=0x00000010db7fcfc3 (start + 
PAGE_ALIGN(size))=0x00000010db7fdf40
[    0.161713] BUG: unable to handle page fault for address: 
ff1100007d625008
```

[1]: https://gist.github.com/msanft/d6d7e32a65708f5bd36233649e4facee




* Re: [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
  2026-02-13 12:33       ` Moritz Sanft
@ 2026-02-13 14:24         ` Kiryl Shutsemau
  2026-02-13 14:52           ` Moritz Sanft
                             ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Kiryl Shutsemau @ 2026-02-13 14:24 UTC (permalink / raw)
  To: Moritz Sanft
  Cc: Dave Hansen, ardb, Edgecombe, Rick P, Weiny, Ira, Wunner, Lukas,
	linux-efi, linux-kernel

On Fri, Feb 13, 2026 at 01:33:56PM +0100, Moritz Sanft wrote:
> > Could you check it this patch makes a difference:
> > 
> > diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
> > index c2c067eff634..f2a00cd429f2 100644
> > --- a/drivers/firmware/efi/unaccepted_memory.c
> > +++ b/drivers/firmware/efi/unaccepted_memory.c
> > @@ -35,7 +35,7 @@ void accept_memory(phys_addr_t start, unsigned long size)
> >  	struct efi_unaccepted_memory *unaccepted;
> >  	unsigned long range_start, range_end;
> >  	struct accept_range range, *entry;
> > -	phys_addr_t end = start + size;
> > +	phys_addr_t end = start + PAGE_ALIGN(size);
> >  	unsigned long flags;
> >  	u64 unit_size;
> 
> Thanks, I tried this on the `-m 67000M` VM and the crash still occurs. I
> extended the previously-added logging to also log the values for `start +
> size` and `start + PAGE_ALIGN(size)`. Please find the full patch including
> the logging and your change in [1].

What about the patch below? It seems we under-reserve memory for the
table if it is unaligned.

I still think that we need to align start/size/end to PAGE_SIZE in
accept_memory()/range_contains_unaccepted_memory() before doing anything
else. Otherwise the (end % unit_size) check is broken. But it seems to be
unrelated to the problem you see.

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 111e87a618e5..56e9d73412fa 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -692,13 +692,13 @@ static __init int match_config_table(const efi_guid_t *guid,
 
 static __init void reserve_unaccepted(struct efi_unaccepted_memory *unaccepted)
 {
-	phys_addr_t start, size;
+	phys_addr_t start, end;
 
 	start = PAGE_ALIGN_DOWN(efi.unaccepted);
-	size = PAGE_ALIGN(sizeof(*unaccepted) + unaccepted->size);
+	end = PAGE_ALIGN(efi.unaccepted + sizeof(*unaccepted) + unaccepted->size);
 
-	memblock_add(start, size);
-	memblock_reserve(start, size);
+	memblock_add(start, end - start);
+	memblock_reserve(start, end - start);
 }
 
 int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
-- 
  Kiryl Shutsemau / Kirill A. Shutemov


* Re: [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
  2026-02-13 14:24         ` Kiryl Shutsemau
@ 2026-02-13 14:52           ` Moritz Sanft
  2026-02-13 14:52           ` Moritz Sanft
  2026-02-13 16:53           ` Verma, Vishal L
  2 siblings, 0 replies; 9+ messages in thread
From: Moritz Sanft @ 2026-02-13 14:52 UTC (permalink / raw)
  To: Kiryl Shutsemau
  Cc: Dave Hansen, ardb, Edgecombe, Rick P, Weiny, Ira, Wunner, Lukas,
	linux-efi, linux-kernel

> What about the patch below. It seems we under-reserve memory for the
> table if it is unaligned.
> 
> I still think that we need align start/size/end to the PAGE_SIZE in
> accept_memory()/range_contains_unaccepted_memory() before doing anything
> else. Otherwise (end % unit_size) check is broken. But it seems to be
> unrelated to the problem you see.
> 
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 111e87a618e5..56e9d73412fa 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -692,13 +692,13 @@ static __init int match_config_table(const efi_guid_t *guid,
>  
>  static __init void reserve_unaccepted(struct efi_unaccepted_memory *unaccepted)
>  {
> -	phys_addr_t start, size;
> +	phys_addr_t start, end;
>  
>  	start = PAGE_ALIGN_DOWN(efi.unaccepted);
> -	size = PAGE_ALIGN(sizeof(*unaccepted) + unaccepted->size);
> +	end = PAGE_ALIGN(efi.unaccepted + sizeof(*unaccepted) + unaccepted->size);
>  
> -	memblock_add(start, size);
> -	memblock_reserve(start, size);
> +	memblock_add(start, end - start);
> +	memblock_reserve(start, end - start);
>  }
>  
>  int __init efi_config_parse_tables(const efi_config_table_t *config_tables,

Thanks, this patch seems to fix the problem causing the panic. The VM 
boots as expected with this.

Please let me know if any more information is required.

Best Regards,
Moritz Sanft


* Re: [BUG] Fault during memory acceptance for TDX VMs with certain memory sizes
  2026-02-13 14:24         ` Kiryl Shutsemau
  2026-02-13 14:52           ` Moritz Sanft
  2026-02-13 14:52           ` Moritz Sanft
@ 2026-02-13 16:53           ` Verma, Vishal L
  2 siblings, 0 replies; 9+ messages in thread
From: Verma, Vishal L @ 2026-02-13 16:53 UTC (permalink / raw)
  To: ms@edgeless.systems, kas@kernel.org
  Cc: linux-efi@vger.kernel.org, Wunner, Lukas, Hansen, Dave,
	ardb@kernel.org, Edgecombe, Rick P, Weiny, Ira,
	linux-kernel@vger.kernel.org

On Fri, 2026-02-13 at 14:24 +0000, Kiryl Shutsemau wrote:
> 
> I still think that we need align start/size/end to the PAGE_SIZE in
> accept_memory()/range_contains_unaccepted_memory() before doing anything
> else. Otherwise (end % unit_size) check is broken. But it seems to be
> unrelated to the problem you see.
> 
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 111e87a618e5..56e9d73412fa 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -692,13 +692,13 @@ static __init int match_config_table(const efi_guid_t *guid,
>  
>  static __init void reserve_unaccepted(struct efi_unaccepted_memory *unaccepted)
>  {
> -	phys_addr_t start, size;
> +	phys_addr_t start, end;
>  
>  	start = PAGE_ALIGN_DOWN(efi.unaccepted);
> -	size = PAGE_ALIGN(sizeof(*unaccepted) + unaccepted->size);
> +	end = PAGE_ALIGN(efi.unaccepted + sizeof(*unaccepted) + unaccepted->size);
>  
> -	memblock_add(start, size);
> -	memblock_reserve(start, size);
> +	memblock_add(start, end - start);
> +	memblock_reserve(start, end - start);
>  }
>  
>  int __init efi_config_parse_tables(const efi_config_table_t *config_tables,

I was able to reproduce the original BUG on a TDX system, and after
some LLM-assisted debugging, this similar patch seems to fix it:

---

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 55452e61af31d..9f66f0f535420 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -695,7 +695,8 @@ static __init void reserve_unaccepted(struct efi_unaccepted_memory *unaccepted)
        phys_addr_t start, size;
 
        start = PAGE_ALIGN_DOWN(efi.unaccepted);
-       size = PAGE_ALIGN(sizeof(*unaccepted) + unaccepted->size);
+       size = PAGE_ALIGN(sizeof(*unaccepted) + unaccepted->size +
+                         offset_in_page(efi.unaccepted));
 
        memblock_add(start, size);
        memblock_reserve(start, size);


---

The hypothesis is that the original size calculation does not account
for the table's offset within its starting page. The EFI pool allocator
performs sub-page allocation, so efi.unaccepted may not be page
aligned.

