virtualization.lists.linux-foundation.org archive mirror
* [PATCH 1/1] x86/vector: Fix vector leak during CPU offline
@ 2024-05-10 19:06 Dongli Zhang
  2024-05-10 19:48 ` Dave Hansen
  2024-05-13 12:44 ` Thomas Gleixner
  0 siblings, 2 replies; 8+ messages in thread
From: Dongli Zhang @ 2024-05-10 19:06 UTC
  To: x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, joe.jin, linux-kernel,
	virtualization

Without IRQD_MOVE_PCNTXT, an interrupt affinity change made via procfs does
not take effect immediately. Instead, the change is deferred until the next
time the interrupt triggers on the original CPU.

When the interrupt next triggers on the original CPU, the new affinity is
enforced within __irq_move_irq(). A vector is allocated from the new CPU,
but if the original CPU remains online, the old vector on it is not
immediately reclaimed. Instead, apicd->move_in_progress is set, and the
reclaim is delayed until the next trigger of the interrupt on the new CPU.

When the interrupt subsequently triggers on the new CPU,
irq_complete_move() adds a task to the old CPU's vector_cleanup list if
the old CPU remains online. The timer on the old CPU then iterates over
its vector_cleanup list, reclaiming vectors.

However, if the old CPU goes offline before the interrupt triggers again on
the new CPU, irq_complete_move() simply resets both apicd->move_in_progress
and apicd->prev_vector to 0. Consequently, the vector is never reclaimed
in vector_matrix, resulting in a CPU vector leak.
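
The sequence above can be illustrated with a small user-space model. This
is only a sketch and not kernel code: every toy_* name, the two-CPU bitmap
standing in for vector_matrix, and the collapsing of the timer-driven
cleanup into a direct free are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS    2
#define TOY_NR_VECTORS 8

/* Hypothetical stand-in for vector_matrix: one allocation bit per CPU/vector. */
struct toy_matrix {
	bool allocated[TOY_NR_CPUS][TOY_NR_VECTORS];
	bool online[TOY_NR_CPUS];
};

/* Hypothetical subset of the per-interrupt state described above. */
struct toy_apicd {
	unsigned int cpu, vector;
	unsigned int prev_cpu, prev_vector;
	bool move_in_progress;
};

/* Allocate the first free vector on @cpu; 0 means "none available". */
static unsigned int toy_alloc(struct toy_matrix *m, unsigned int cpu)
{
	for (unsigned int v = 1; v < TOY_NR_VECTORS; v++) {
		if (!m->allocated[cpu][v]) {
			m->allocated[cpu][v] = true;
			return v;
		}
	}
	return 0;
}

/* Count the vectors still marked allocated on @cpu. */
static unsigned int toy_count(const struct toy_matrix *m, unsigned int cpu)
{
	unsigned int n = 0;

	for (unsigned int v = 1; v < TOY_NR_VECTORS; v++)
		n += m->allocated[cpu][v];
	return n;
}

/* Interrupt fires on the old CPU: take a vector on the new CPU, keep the old one. */
static void toy_move(struct toy_matrix *m, struct toy_apicd *d, unsigned int new_cpu)
{
	d->prev_cpu = d->cpu;
	d->prev_vector = d->vector;
	d->cpu = new_cpu;
	d->vector = toy_alloc(m, new_cpu);
	d->move_in_progress = true;
}

/* Interrupt fires on the new CPU: models the behaviour before the fix. */
static void toy_complete_move_unpatched(struct toy_matrix *m, struct toy_apicd *d)
{
	if (!d->move_in_progress)
		return;
	d->move_in_progress = false;
	if (m->online[d->prev_cpu]) {
		/* The kernel defers this to a timer on the old CPU; freed directly here. */
		m->allocated[d->prev_cpu][d->prev_vector] = false;
	}
	/* Offline case: the bookkeeping is cleared but the matrix bit stays set. */
	d->prev_vector = 0;
}

/* Returns the number of vectors left allocated on the old CPU (CPU 0). */
static unsigned int toy_demo_leak(bool offline_before_completion)
{
	struct toy_matrix m = { .online = { true, true } };
	struct toy_apicd d = { .cpu = 0 };

	d.vector = toy_alloc(&m, 0);
	toy_move(&m, &d, 1);
	if (offline_before_completion)
		m.online[0] = false;
	toy_complete_move_unpatched(&m, &d);
	return toy_count(&m, 0);
}
```

In this model, toy_demo_leak(false) leaves no vector allocated on the old
CPU, while toy_demo_leak(true) leaves one behind, which corresponds to the
leak described above.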

To address this issue, the fix borrows from the comments and implementation
of apic_update_vector(): "If the target CPU is offline then the regular
release mechanism via the cleanup vector is not possible and the vector can
be immediately freed in the underlying matrix allocator."
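
As a rough illustration of the intended behaviour, the following user-space
sketch models the two branches of the cleanup scheduling with the fix in
place. It is not the kernel implementation: the toy_* names, the per-CPU
counters, and the explicit timer function are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU accounting; the kernel tracks this in vector_matrix. */
struct toy_cpu {
	bool online;
	unsigned int allocated;       /* vectors currently allocated on this CPU */
	unsigned int pending_cleanup; /* vectors queued for the cleanup timer */
};

/* Hypothetical record of a vector left behind by an affinity move. */
struct toy_prev {
	struct toy_cpu *prev_cpu;
	unsigned int prev_vector;
};

/* Models the decision with the fix: defer when online, free at once when offline. */
static void toy_schedule_cleanup(struct toy_prev *p)
{
	struct toy_cpu *cpu = p->prev_cpu;

	if (cpu->online) {
		/* Regular path: the timer on the old CPU reclaims the vector later. */
		cpu->pending_cleanup++;
	} else {
		/* Offline path: no timer can run there, so release the vector now. */
		cpu->allocated--;
		p->prev_vector = 0;
	}
}

/* Models the old CPU's timer walking its cleanup list. */
static void toy_cleanup_timer(struct toy_cpu *cpu)
{
	cpu->allocated -= cpu->pending_cleanup;
	cpu->pending_cleanup = 0;
}

/* Returns the number of vectors left allocated on the old CPU. */
static unsigned int toy_demo_fixed(bool offline_before_completion)
{
	struct toy_cpu old_cpu = { .online = true, .allocated = 1 };
	struct toy_prev p = { .prev_cpu = &old_cpu, .prev_vector = 1 };

	if (offline_before_completion)
		old_cpu.online = false;
	toy_schedule_cleanup(&p);
	if (old_cpu.online)
		toy_cleanup_timer(&old_cpu);
	return old_cpu.allocated;
}
```

Under these assumptions, both toy_demo_fixed(false) and toy_demo_fixed(true)
end with no vector allocated on the old CPU.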

Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 arch/x86/kernel/apic/vector.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 185738c72766..aad189a3bac9 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -1036,6 +1036,15 @@ static void __vector_schedule_cleanup(struct apic_chip_data *apicd)
 			add_timer_on(&cl->timer, cpu);
 		}
 	} else {
+		/*
+		 * This call borrows from the comments and implementation
+		 * of apic_update_vector(): "If the target CPU is offline
+		 * then the regular release mechanism via the cleanup
+		 * vector is not possible and the vector can be immediately
+		 * freed in the underlying matrix allocator.".
+		 */
+		irq_matrix_free(vector_matrix, apicd->prev_cpu,
+				apicd->prev_vector, apicd->is_managed);
 		apicd->prev_vector = 0;
 	}
 	raw_spin_unlock(&vector_lock);
-- 
2.39.3



end of thread, other threads:[~2024-05-22 21:45 UTC | newest]

Thread overview: 8+ messages
2024-05-10 19:06 [PATCH 1/1] x86/vector: Fix vector leak during CPU offline Dongli Zhang
2024-05-10 19:48 ` Dave Hansen
2024-05-13 12:44 ` Thomas Gleixner
2024-05-13 17:43   ` Dongli Zhang
2024-05-13 22:46     ` Thomas Gleixner
2024-05-15 19:51       ` Dongli Zhang
2024-05-21 12:00         ` Thomas Gleixner
2024-05-22 21:44           ` Dongli Zhang
