* [PATCH 1/2] gfp_types: Introduce a new GFP_ATOMIC_RT gfp flag
From: Waiman Long @ 2026-05-20 20:46 UTC (permalink / raw)
To: Marc Zyngier, Thomas Gleixner, Sebastian Andrzej Siewior,
Clark Williams, Steven Rostedt, Andrew Morton, David Hildenbrand,
Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko
Cc: linux-arm-kernel, linux-kernel, linux-mm, linux-rt-devel,
Waiman Long
The GFP_ATOMIC flag is meant for atomic contexts where the caller cannot
sleep and needs the allocation to succeed. Under PREEMPT_RT, however, it
cannot be used in contexts where preemption or interrupts are really
disabled, such as under raw_spin_lock_irqsave() or a plain preempt_disable().
With the advent of the ALLOC_TRYLOCK allocation flag in the v7.1
kernel, it is possible to allocate memory in such contexts by using
spin_trylock() to acquire the spinlocks in the memory allocation path. This
does increase the chance that an allocation fails when there are
concurrent memory allocation requests, so users must be able to
handle such memory allocation failures gracefully.
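The trade-off described above can be illustrated with a small user-space sketch. It is not kernel code: `pool_lock`, `pool_trylock()`, `pool_unlock()` and `pool_alloc_trylock()` are hypothetical names, and a C11 `atomic_flag` stands in for the spinlocks taken in the page allocator. The point is only that a trylock-based path never waits, and therefore can return NULL while memory is still available.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lock protecting an allocator free list. */
static atomic_flag pool_lock = ATOMIC_FLAG_INIT;

/* Try to take the lock once; never spin or sleep waiting for it. */
static bool pool_trylock(void)
{
	/* test_and_set returns the previous state: false means we got it. */
	return !atomic_flag_test_and_set_explicit(&pool_lock,
						  memory_order_acquire);
}

static void pool_unlock(void)
{
	atomic_flag_clear_explicit(&pool_lock, memory_order_release);
}

/*
 * Hypothetical trylock-only allocation. Returns NULL when the lock is
 * busy, even though the underlying allocator may have free memory.
 */
static void *pool_alloc_trylock(size_t size)
{
	void *p;

	if (!pool_trylock())
		return NULL;	/* contended: fail instead of waiting */

	p = calloc(1, size);
	pool_unlock();
	return p;
}
```

A caller of such a function has to treat NULL as "try again later or degrade", not as proof that memory is exhausted.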
The ALLOC_TRYLOCK flag is enabled only if neither ___GFP_DIRECT_RECLAIM
nor ___GFP_KSWAPD_RECLAIM is set.
Introduce a new GFP_ATOMIC_RT gfp flag for those PREEMPT_RT
atomic contexts. The new flag falls back to GFP_ATOMIC on
non-PREEMPT_RT kernels. On a PREEMPT_RT kernel, GFP_ATOMIC can continue
to be used in contexts where preemption and interrupts are not actually
disabled, such as under spin_lock_irqsave().
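To make the flag arithmetic concrete, here is a stand-alone model of the two definitions. The bit values and the `MODEL_` names are invented for illustration and do not match the real ones in gfp_types.h. The sketch also assumes that GFP_ATOMIC is `__GFP_HIGH | __GFP_KSWAPD_RECLAIM`, as I understand current mainline, and the PREEMPT_RT choice is passed as a parameter rather than taken from the kernel configuration.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real ones live in gfp_types.h. */
#define MODEL_GFP_HIGH			0x1u
#define MODEL_GFP_DIRECT_RECLAIM	0x2u
#define MODEL_GFP_KSWAPD_RECLAIM	0x4u

/* Assumed composition of GFP_ATOMIC: high priority plus kswapd wakeup. */
#define MODEL_GFP_ATOMIC	(MODEL_GFP_HIGH | MODEL_GFP_KSWAPD_RECLAIM)

/* Mirrors the #ifdef CONFIG_PREEMPT_RT block added by this patch. */
static unsigned int model_gfp_atomic_rt(bool preempt_rt)
{
	return preempt_rt ? MODEL_GFP_HIGH : MODEL_GFP_ATOMIC;
}

/* Trylock mode is eligible only when no reclaim flag is set. */
static bool model_trylock_eligible(unsigned int gfp)
{
	return !(gfp & (MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM));
}
```

In this model the PREEMPT_RT variant carries no reclaim bit and so qualifies for the trylock path, while the non-RT variant is identical to GFP_ATOMIC and does not.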
Signed-off-by: Waiman Long <longman@redhat.com>
---
include/linux/gfp_types.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index cd4972a7c97c..ac30882b6cd4 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -316,6 +316,13 @@ enum {
* preempt_disable() - see "Memory allocation" in
* Documentation/core-api/real-time/differences.rst for more info.
*
+ * %GFP_ATOMIC_RT is similar to %GFP_ATOMIC, except that it can also be used
+ * in contexts where preemption and/or interrupts are disabled under
+ * PREEMPT_RT, but not in NMI or hardirq contexts. The allocation is more
+ * likely to fail under PREEMPT_RT because locks in the allocation path are
+ * acquired with spin_trylock(). The caller must therefore be ready to handle
+ * memory allocation failure gracefully.
+ *
* %GFP_KERNEL is typical for kernel-internal allocations. The caller requires
* %ZONE_NORMAL or a lower zone for direct access but can direct reclaim.
*
@@ -388,4 +395,10 @@ enum {
__GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
#define GFP_TRANSHUGE (GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
+#ifdef CONFIG_PREEMPT_RT
+# define GFP_ATOMIC_RT __GFP_HIGH
+#else
+# define GFP_ATOMIC_RT GFP_ATOMIC
+#endif
+
#endif /* __LINUX_GFP_TYPES_H */
--
2.54.0
* [PATCH 2/2] irqchip/gic-v3-its: Use GFP_ATOMIC_RT gfp flag in allocate_vpe_l1_table()
From: Waiman Long @ 2026-05-20 20:46 UTC (permalink / raw)
To: Marc Zyngier, Thomas Gleixner, Sebastian Andrzej Siewior,
Clark Williams, Steven Rostedt, Andrew Morton, David Hildenbrand,
Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko
Cc: linux-arm-kernel, linux-kernel, linux-mm, linux-rt-devel,
Waiman Long
Booting a PREEMPT_RT debug kernel on a 2-socket Grace arm64 system
produces the following bug report:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/72
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
:
CPU: 72 UID: 0 PID: 0 Comm: swapper/72 Tainted: G W 6.19.0-rc4-test+ #4 PREEMPT_{RT,(full)}
Tainted: [W]=WARN
Call trace:
:
rt_spin_lock+0xe4/0x408
rmqueue_bulk+0x48/0x1de8
__rmqueue_pcplist+0x410/0x650
rmqueue.constprop.0+0x6a8/0x2b50
get_page_from_freelist+0x3c0/0xe68
__alloc_frozen_pages_noprof+0x1dc/0x348
alloc_pages_mpol+0xe4/0x2f8
alloc_frozen_pages_noprof+0x124/0x190
allocate_slab+0x2f0/0x438
new_slab+0x4c/0x80
___slab_alloc+0x410/0x798
__slab_alloc.constprop.0+0x88/0x1e0
__kmalloc_cache_noprof+0x2dc/0x4b0
allocate_vpe_l1_table+0x114/0x788
its_cpu_init_lpis+0x344/0x790
its_cpu_init+0x60/0x220
gic_starting_cpu+0x64/0xe8
cpuhp_invoke_callback+0x438/0x6d8
__cpuhp_invoke_callback_range+0xd8/0x1f8
notify_cpu_starting+0x11c/0x178
secondary_start_kernel+0xc8/0x188
__secondary_switched+0xc0/0xc8
This happens because allocate_vpe_l1_table() calls kzalloc() to allocate
a cpumask_t when the first CPU of the second node of the 72-cpu Grace
system is brought up. The call is made from the CPUHP_AP_IRQ_GIC_STARTING
state, inside the starting section of the CPU hotplug bringup pipeline,
where interrupts are disabled. This is an atomic context where sleeping
is not allowed, and acquiring a sleeping rt_spin_lock within kzalloc()
may hang the system if the lock is contended.
A possible workaround is to use the new GFP_ATOMIC_RT gfp flag, with which
the memory allocation path only uses spin_trylock() to acquire spinlocks
and therefore never sleeps. As this memory allocation is only
needed for the first core of a new socket in early boot, the chance of
colliding with another memory allocation request is low. If the allocation
does fail, direct injection of virtual interrupts from the physical
Interrupt Translation Service (ITS) into a guest Virtual Machine (VM) will
be disabled.
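The failure handling described above can be sketched in user space as follows. This is a simplified model, not the driver code: `struct rdists_model`, `alloc_fn`, `allocate_table_model()`, `cpu_init_model()` and the two allocator stubs are hypothetical, and only the has_rvpeid/has_vlpis fields and the warning text are taken from the diff below. The allocator is passed in as a function pointer so that a failed trylock allocation can be simulated.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical model of the two feature flags cleared on failure. */
struct rdists_model {
	bool has_rvpeid;
	bool has_vlpis;
};

/* Allocator hook so that an allocation failure can be simulated. */
typedef void *(*alloc_fn)(size_t size);

static void *working_alloc(size_t size) { return calloc(1, size); }
static void *failing_alloc(size_t size) { (void)size; return NULL; }

/* Stands in for allocate_vpe_l1_table(): 0 on success, -ENOMEM on failure. */
static int allocate_table_model(alloc_fn alloc, void **mask)
{
	*mask = alloc(sizeof(unsigned long));
	return *mask ? 0 : -ENOMEM;
}

/* On failure, degrade by turning the feature off rather than crashing. */
static void cpu_init_model(struct rdists_model *r, alloc_fn alloc,
			   void **mask, int cpu)
{
	if (allocate_table_model(alloc, mask)) {
		r->has_rvpeid = false;
		r->has_vlpis = false;
		fprintf(stderr,
			"CPU%d: direct injection of virtual interrupt disabled\n",
			cpu);
	}
}
```

The design choice modelled here is that a failed allocation costs a feature (direct injection) and leaves a trace in the log, instead of being retried in a context that cannot wait.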
A longer-term solution is to defer the allocation to a later stage of the
hotplug pipeline where interrupts are not disabled.
With this change applied, booting a debug kernel on the same 2-socket
Grace system no longer produces the bug report, and the direct injection
disabled warning is not printed.
Signed-off-by: Waiman Long <longman@redhat.com>
---
drivers/irqchip/irq-gic-v3-its.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 291d7668cc8d..d78057fb40df 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2927,7 +2927,7 @@ static int allocate_vpe_l1_table(void)
if (val & GICR_VPROPBASER_4_1_VALID)
goto out;
- gic_data_rdist()->vpe_table_mask = kzalloc_obj(cpumask_t, GFP_ATOMIC);
+ gic_data_rdist()->vpe_table_mask = kzalloc_obj(cpumask_t, GFP_ATOMIC_RT);
if (!gic_data_rdist()->vpe_table_mask)
return -ENOMEM;
@@ -3271,6 +3271,8 @@ static void its_cpu_init_lpis(void)
*/
gic_rdists->has_rvpeid = false;
gic_rdists->has_vlpis = false;
+ pr_warn("GICv3: CPU%d: direct injection of virtual interrupt disabled\n",
+ smp_processor_id());
}
/* Make sure the GIC has seen the above */
--
2.54.0