All of lore.kernel.org
 help / color / mirror / Atom feed
* Xen Security Advisory 441 v4 (CVE-2023-34324) - Possible deadlock in Linux kernel event handling
@ 2023-10-10 12:06 Xen.org security team
  0 siblings, 0 replies; only message in thread
From: Xen.org security team @ 2023-10-10 12:06 UTC (permalink / raw)
  To: xen-announce, xen-devel, xen-users, oss-security; +Cc: Xen.org security team

[-- Attachment #1: Type: text/plain, Size: 4287 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

            Xen Security Advisory CVE-2023-34324 / XSA-441
                               version 4

           Possible deadlock in Linux kernel event handling

UPDATES IN VERSION 4
====================

Public release.

Modified advisory again to state that Arm32 guests are NOT affected.

ISSUE DESCRIPTION
=================

Closing of an event channel in the Linux kernel can result in a deadlock.
This happens when the close is being performed in parallel to an unrelated
Xen console action and the handling of a Xen console interrupt in an
unprivileged guest.

The closing of an event channel is e.g. triggered by removal of a
paravirtual device on the other side. As this action will cause console
messages to be issued on the other side quite often, the chance of
triggering the deadlock is not neglectable.

Note that 32-bit Arm-guests are not affected, as the 32-bit Linux kernel
on Arm doesn't use queued-RW-locks, which are required to trigger the
issue (on Arm32 a waiting writer doesn't block further readers to get
the lock).

IMPACT
======

A (malicious) guest administrator could cause a denial of service (DoS)
in a backend domain (other than dom0) by disabling a paravirtualized
device.

A malicious backend could cause DoS in a guest running a Linux kernel by
disabling a paravirtualized device.

VULNERABLE SYSTEMS
==================

All unprivileged guests running a Linux kernel of version 5.10 and later,
or with the fixes for XSA-332, are vulnerable.

All guest types are vulnerable.

Only x86- and 64-bit Arm-guests are vulnerable.

Arm-guests running in 32-bit mode are not vulnerable.

Guests not using paravirtualized drivers are not vulnerable.

MITIGATION
==========

There is no known mitigation.

CREDITS
=======

This issue was discovered as a bug by Marek Marczykowski-Górecki of
Invisible Things Lab; the security impact was recognised by Jürgen
Groß of SUSE.

RESOLUTION
==========

Applying the attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa441-linux.patch     Linux

$ sha256sum xsa441*
937406d86dd6dd3e389fdae726a25e5f0e960f7004c314e370cb2369d6715c24  xsa441-linux.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) on the host and on VMs being
administered and used only by organisations which are members of the Xen
Project Security Issue Predisclosure List is permitted during the embargo,
even on public-facing systems with other untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

Deployment of patches or mitigations is NOT permitted on VMs being
administered or used by organisations which are not members of the Xen
Project Security Issue Predisclosure List. On those VMs deployment is
permitted only AFTER the embargo ends.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAmUlNOkMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZOmAH/3D7dRH11wIRyFZ/nj4pwkPfPXvCDtUmaRXfAaV4
Xe9ODMSevcEQpSFW4VY6eK7DP6kqYMM7myoy+np8Ttnin7+y+PYUJkxM+liqhLyT
fhGi74NNuQLMvGcSKp26aIHAJNtZqWFeRTlEFJHlY4S6ENRoupWd2T2qgnts00NX
R4NzZ8yQFcsmvy9gqgq6MYoa2VIrhQlpiDPX81pA/HViv0GiXab1QSYTyI9jQ2EX
WC19sELYSK2jMAjuejHlw28B+giy0KxcJv6zewn3Jwn8h3ft4AI1OIh4KfOtEad+
wptYB87EM76Lr3B8ipFEvN4sSU1yBnE4iVOgZpAs74mylN8=
=hOm2
-----END PGP SIGNATURE-----

[-- Attachment #2: xsa441-linux.patch --]
[-- Type: application/octet-stream, Size: 7455 bytes --]

From b9518bb105635144acdbf9160583ab98521af89e Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Mon, 28 Aug 2023 08:09:47 +0200
Subject: [PATCH v3] xen/events: replace evtchn_rwlock with RCU

In unprivileged Xen guests event handling can cause a deadlock with
Xen console handling. The evtchn_rwlock and the hvc_lock are taken in
opposite sequence in __hvc_poll() and in Xen console IRQ handling.
Normally this is no problem, as the evtchn_rwlock is taken as a reader
in both paths, but as soon as an event channel is being closed, the
lock will be taken as a writer, which will cause read_lock() to block:

CPU0                     CPU1                CPU2
(IRQ handling)           (__hvc_poll())      (closing event channel)

read_lock(evtchn_rwlock)
                         spin_lock(hvc_lock)
                                             write_lock(evtchn_rwlock)
                                                 [blocks]
spin_lock(hvc_lock)
    [blocks]
                        read_lock(evtchn_rwlock)
                            [blocks due to writer waiting,
                             and not in_interrupt()]

This issue can be avoided by replacing evtchn_rwlock with RCU in
xen_free_irq(). Note that RCU is used only to delay freeing of the
irq_info memory. There is no RCU based dereferencing or replacement of
pointers involved.

In order to avoid potential races between removing the irq_info
reference and handling of interrupts, set the irq_info pointer to NULL
only when freeing its memory. The IRQ itself must be freed at that
time, too, as otherwise the same IRQ number could be allocated again
before handling of the old instance would have been finished.

This is XSA-441 / CVE-2023-34324.

Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
 drivers/xen/events/events_base.c | 87 +++++++++++++++++---------------
 1 file changed, 46 insertions(+), 41 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 0bb86e6c4d0a..1b2136fe0fa5 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -33,6 +33,7 @@
 #include <linux/slab.h>
 #include <linux/irqnr.h>
 #include <linux/pci.h>
+#include <linux/rcupdate.h>
 #include <linux/spinlock.h>
 #include <linux/cpuhotplug.h>
 #include <linux/atomic.h>
@@ -96,6 +97,7 @@ enum xen_irq_type {
 struct irq_info {
 	struct list_head list;
 	struct list_head eoi_list;
+	struct rcu_work rwork;
 	short refcnt;
 	u8 spurious_cnt;
 	u8 is_accounted;
@@ -146,23 +148,13 @@ const struct evtchn_ops *evtchn_ops;
  */
 static DEFINE_MUTEX(irq_mapping_update_lock);
 
-/*
- * Lock protecting event handling loop against removing event channels.
- * Adding of event channels is no issue as the associated IRQ becomes active
- * only after everything is setup (before request_[threaded_]irq() the handler
- * can't be entered for an event, as the event channel will be unmasked only
- * then).
- */
-static DEFINE_RWLOCK(evtchn_rwlock);
-
 /*
  * Lock hierarchy:
  *
  * irq_mapping_update_lock
- *   evtchn_rwlock
- *     IRQ-desc lock
- *       percpu eoi_list_lock
- *         irq_info->lock
+ *   IRQ-desc lock
+ *     percpu eoi_list_lock
+ *       irq_info->lock
  */
 
 static LIST_HEAD(xen_irq_list_head);
@@ -306,6 +298,22 @@ static void channels_on_cpu_inc(struct irq_info *info)
 	info->is_accounted = 1;
 }
 
+static void delayed_free_irq(struct work_struct *work)
+{
+	struct irq_info *info = container_of(to_rcu_work(work), struct irq_info,
+					     rwork);
+	unsigned int irq = info->irq;
+
+	/* Remove the info pointer only now, with no potential users left. */
+	set_info_for_irq(irq, NULL);
+
+	kfree(info);
+
+	/* Legacy IRQ descriptors are managed by the arch. */
+	if (irq >= nr_legacy_irqs())
+		irq_free_desc(irq);
+}
+
 /* Constructors for packed IRQ information. */
 static int xen_irq_info_common_setup(struct irq_info *info,
 				     unsigned irq,
@@ -668,33 +676,36 @@ static void xen_irq_lateeoi_worker(struct work_struct *work)
 
 	eoi = container_of(to_delayed_work(work), struct lateeoi_work, delayed);
 
-	read_lock_irqsave(&evtchn_rwlock, flags);
+	rcu_read_lock();
 
 	while (true) {
-		spin_lock(&eoi->eoi_list_lock);
+		spin_lock_irqsave(&eoi->eoi_list_lock, flags);
 
 		info = list_first_entry_or_null(&eoi->eoi_list, struct irq_info,
 						eoi_list);
 
-		if (info == NULL || now < info->eoi_time) {
-			spin_unlock(&eoi->eoi_list_lock);
+		if (info == NULL)
+			break;
+
+		if (now < info->eoi_time) {
+			mod_delayed_work_on(info->eoi_cpu, system_wq,
+					    &eoi->delayed,
+					    info->eoi_time - now);
 			break;
 		}
 
 		list_del_init(&info->eoi_list);
 
-		spin_unlock(&eoi->eoi_list_lock);
+		spin_unlock_irqrestore(&eoi->eoi_list_lock, flags);
 
 		info->eoi_time = 0;
 
 		xen_irq_lateeoi_locked(info, false);
 	}
 
-	if (info)
-		mod_delayed_work_on(info->eoi_cpu, system_wq,
-				    &eoi->delayed, info->eoi_time - now);
+	spin_unlock_irqrestore(&eoi->eoi_list_lock, flags);
 
-	read_unlock_irqrestore(&evtchn_rwlock, flags);
+	rcu_read_unlock();
 }
 
 static void xen_cpu_init_eoi(unsigned int cpu)
@@ -709,16 +720,15 @@ static void xen_cpu_init_eoi(unsigned int cpu)
 void xen_irq_lateeoi(unsigned int irq, unsigned int eoi_flags)
 {
 	struct irq_info *info;
-	unsigned long flags;
 
-	read_lock_irqsave(&evtchn_rwlock, flags);
+	rcu_read_lock();
 
 	info = info_for_irq(irq);
 
 	if (info)
 		xen_irq_lateeoi_locked(info, eoi_flags & XEN_EOI_FLAG_SPURIOUS);
 
-	read_unlock_irqrestore(&evtchn_rwlock, flags);
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(xen_irq_lateeoi);
 
@@ -732,6 +742,7 @@ static void xen_irq_init(unsigned irq)
 
 	info->type = IRQT_UNBOUND;
 	info->refcnt = -1;
+	INIT_RCU_WORK(&info->rwork, delayed_free_irq);
 
 	set_info_for_irq(irq, info);
 	/*
@@ -789,31 +800,18 @@ static int __must_check xen_allocate_irq_gsi(unsigned gsi)
 static void xen_free_irq(unsigned irq)
 {
 	struct irq_info *info = info_for_irq(irq);
-	unsigned long flags;
 
 	if (WARN_ON(!info))
 		return;
 
-	write_lock_irqsave(&evtchn_rwlock, flags);
-
 	if (!list_empty(&info->eoi_list))
 		lateeoi_list_del(info);
 
 	list_del(&info->list);
 
-	set_info_for_irq(irq, NULL);
-
 	WARN_ON(info->refcnt > 0);
 
-	write_unlock_irqrestore(&evtchn_rwlock, flags);
-
-	kfree(info);
-
-	/* Legacy IRQ descriptors are managed by the arch. */
-	if (irq < nr_legacy_irqs())
-		return;
-
-	irq_free_desc(irq);
+	queue_rcu_work(system_wq, &info->rwork);
 }
 
 /* Not called for lateeoi events. */
@@ -1711,7 +1709,14 @@ int xen_evtchn_do_upcall(void)
 	int cpu = smp_processor_id();
 	struct evtchn_loop_ctrl ctrl = { 0 };
 
-	read_lock(&evtchn_rwlock);
+	/*
+	 * When closing an event channel the associated IRQ must not be freed
+	 * until all cpus have left the event handling loop. This is ensured
+	 * by taking the rcu_read_lock() while handling events, as freeing of
+	 * the IRQ is handled via queue_rcu_work() _after_ closing the event
+	 * channel.
+	 */
+	rcu_read_lock();
 
 	do {
 		vcpu_info->evtchn_upcall_pending = 0;
@@ -1724,7 +1729,7 @@ int xen_evtchn_do_upcall(void)
 
 	} while (vcpu_info->evtchn_upcall_pending);
 
-	read_unlock(&evtchn_rwlock);
+	rcu_read_unlock();
 
 	/*
 	 * Increment irq_epoch only now to defer EOIs only for
-- 
2.35.3


^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2023-10-10 12:06 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-10 12:06 Xen Security Advisory 441 v4 (CVE-2023-34324) - Possible deadlock in Linux kernel event handling Xen.org security team

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.