kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
@ 2025-04-02 20:36 David Hildenbrand
  2025-04-03  9:44 ` Thomas Huth
                   ` (5 more replies)
  0 siblings, 6 replies; 63+ messages in thread
From: David Hildenbrand @ 2025-04-02 20:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, virtualization, kvm, David Hildenbrand, Chandra Merla,
	Stable, Cornelia Huck, Thomas Huth, Halil Pasic, Eric Farman,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Michael S. Tsirkin,
	Wei Wang

If we finds a vq without a name in our input array in
virtio_ccw_find_vqs(), we treat it as "non-existing" and set the vq pointer
to NULL; we will not call virtio_ccw_setup_vq() to allocate/setup a vq.

Consequently, we create only a queue if it actually exists (name != NULL)
and assign an incremental queue index to each such existing queue.

However, in virtio_ccw_register_adapter_ind()->get_airq_indicator() we
will not ignore these "non-existing queues", but instead assign an airq
indicator to them.

Besides never releasing them in virtio_ccw_drop_indicators() (because
there is no virtqueue), the bigger issue seems to be that there will be a
disagreement between the device and the Linux guest about the airq
indicator to be used for notifying a queue, because the indicator bit
for adapter I/O interrupt is derived from the queue index.

The virtio spec states under "Setting Up Two-Stage Queue Indicators":

	... indicator contains the guest address of an area wherein the
	indicators for the devices are contained, starting at bit_nr, one
	bit per virtqueue of the device.

And further in "Notification via Adapter I/O Interrupts":

	For notifying the driver of virtqueue buffers, the device sets the
	bit in the guest-provided indicator area at the corresponding
	offset.

For example, QEMU uses in virtio_ccw_notify() the queue index (passed as
"vector") to select the relevant indicator bit. If a queue does not exist,
it does not have a corresponding indicator bit assigned, because it
effectively doesn't have a queue index.

Using a virtio-balloon-ccw device under QEMU with free-page-hinting
disabled ("free-page-hint=off") but free-page-reporting enabled
("free-page-reporting=on") will result in free page reporting
not working as expected: in the virtio_balloon driver, we'll be stuck
forever in virtballoon_free_page_report()->wait_event(), because the
waitqueue will not be woken up as the notification from the device is
lost: it would use the wrong indicator bit.

Free page reporting stops working and we get splats (when configured to
detect hung wqs) like:

 INFO: task kworker/1:3:463 blocked for more than 61 seconds.
       Not tainted 6.14.0 #4
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:kworker/1:3 [...]
 Workqueue: events page_reporting_process
 Call Trace:
  [<000002f404e6dfb2>] __schedule+0x402/0x1640
  [<000002f404e6f22e>] schedule+0x3e/0xe0
  [<000002f3846a88fa>] virtballoon_free_page_report+0xaa/0x110 [virtio_balloon]
  [<000002f40435c8a4>] page_reporting_process+0x2e4/0x740
  [<000002f403fd3ee2>] process_one_work+0x1c2/0x400
  [<000002f403fd4b96>] worker_thread+0x296/0x420
  [<000002f403fe10b4>] kthread+0x124/0x290
  [<000002f403f4e0dc>] __ret_from_fork+0x3c/0x60
  [<000002f404e77272>] ret_from_fork+0xa/0x38

There was recently a discussion [1] whether the "holes" should be
treated differently again, effectively assigning also non-existing
queues a queue index: that should also fix the issue, but requires other
workarounds to not break existing setups.

Let's fix it without affecting existing setups for now by properly ignoring
the non-existing queues, so the indicator bits will match the queue
indexes.

[1] https://lore.kernel.org/all/cover.1720611677.git.mst@redhat.com/

Fixes: a229989d975e ("virtio: don't allocate vqs when names[i] = NULL")
Reported-by: Chandra Merla <cmerla@redhat.com>
Cc: <Stable@vger.kernel.org>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Eric Farman <farman@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/s390/virtio/virtio_ccw.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 21fa7ac849e5c..4904b831c0a75 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -302,11 +302,17 @@ static struct airq_info *new_airq_info(int index)
 static unsigned long *get_airq_indicator(struct virtqueue *vqs[], int nvqs,
 					 u64 *first, void **airq_info)
 {
-	int i, j;
+	int i, j, queue_idx, highest_queue_idx = -1;
 	struct airq_info *info;
 	unsigned long *indicator_addr = NULL;
 	unsigned long bit, flags;
 
+	/* Array entries without an actual queue pointer must be ignored. */
+	for (i = 0; i < nvqs; i++) {
+		if (vqs[i])
+			highest_queue_idx++;
+	}
+
 	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
 		mutex_lock(&airq_areas_lock);
 		if (!airq_areas[i])
@@ -316,7 +322,7 @@ static unsigned long *get_airq_indicator(struct virtqueue *vqs[], int nvqs,
 		if (!info)
 			return NULL;
 		write_lock_irqsave(&info->lock, flags);
-		bit = airq_iv_alloc(info->aiv, nvqs);
+		bit = airq_iv_alloc(info->aiv, highest_queue_idx + 1);
 		if (bit == -1UL) {
 			/* Not enough vacancies. */
 			write_unlock_irqrestore(&info->lock, flags);
@@ -325,8 +331,10 @@ static unsigned long *get_airq_indicator(struct virtqueue *vqs[], int nvqs,
 		*first = bit;
 		*airq_info = info;
 		indicator_addr = info->aiv->vector;
-		for (j = 0; j < nvqs; j++) {
-			airq_iv_set_ptr(info->aiv, bit + j,
+		for (j = 0, queue_idx = 0; j < nvqs; j++) {
+			if (!vqs[j])
+				continue;
+			airq_iv_set_ptr(info->aiv, bit + queue_idx++,
 					(unsigned long)vqs[j]);
 		}
 		write_unlock_irqrestore(&info->lock, flags);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2025-04-11 13:34 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-02 20:36 [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues David Hildenbrand
2025-04-03  9:44 ` Thomas Huth
2025-04-03 12:45 ` Cornelia Huck
2025-04-03 12:57 ` Michael S. Tsirkin
2025-04-03 13:12 ` Christian Borntraeger
2025-04-03 14:18 ` Halil Pasic
2025-04-03 14:28   ` David Hildenbrand
2025-04-04  4:36     ` Halil Pasic
2025-04-04 10:00       ` David Hildenbrand
2025-04-04 10:55         ` David Hildenbrand
2025-04-04 13:36           ` Halil Pasic
2025-04-04 13:48             ` David Hildenbrand
2025-04-04 14:00               ` Halil Pasic
2025-04-04 14:17                 ` David Hildenbrand
2025-04-04 15:39                   ` Halil Pasic
2025-04-04 16:49                     ` David Hildenbrand
2025-04-04 17:36                       ` David Hildenbrand
2025-04-07  7:52                     ` Michael S. Tsirkin
2025-04-07  8:17                       ` David Hildenbrand
2025-04-07  8:34                         ` Michael S. Tsirkin
2025-04-07  8:44                           ` David Hildenbrand
2025-04-07  8:49                             ` Michael S. Tsirkin
2025-04-07  8:54                               ` David Hildenbrand
2025-04-07  8:58                                 ` Michael S. Tsirkin
2025-04-07  9:11                                   ` David Hildenbrand
2025-04-07  9:13                                     ` David Hildenbrand
2025-04-07 13:13                                       ` David Hildenbrand
2025-04-07 17:39                                         ` Daniel Verkamp
2025-04-07 18:47                                           ` David Hildenbrand
2025-04-07 21:09                                             ` Daniel Verkamp
2025-04-09 11:02                                               ` David Hildenbrand
2025-04-07 21:20                                             ` Michael S. Tsirkin
2025-04-09 10:46                                               ` David Hildenbrand
2025-04-09 10:56                                                 ` Michael S. Tsirkin
2025-04-09 11:12                                                   ` David Hildenbrand
2025-04-09 12:07                                                     ` Michael S. Tsirkin
2025-04-09 12:24                                                       ` David Hildenbrand
2025-04-09 16:08                                                         ` Michael S. Tsirkin
2025-04-07  9:37                                     ` Michael S. Tsirkin
2025-04-07 13:12                           ` Halil Pasic
2025-04-07 13:17                             ` David Hildenbrand
2025-04-07 13:28                               ` Cornelia Huck
2025-04-07 13:32                                 ` Michael S. Tsirkin
2025-04-07 17:26                                 ` Halil Pasic
2025-04-07  8:38                         ` David Hildenbrand
2025-04-07  8:44                           ` Michael S. Tsirkin
2025-04-07  8:50                             ` David Hildenbrand
2025-04-07  9:22                             ` David Hildenbrand
2025-04-07  8:41                     ` Michael S. Tsirkin
2025-04-06 18:42               ` Michael S. Tsirkin
2025-04-07  7:18                 ` David Hildenbrand
2025-04-07  8:54                   ` Michael S. Tsirkin
2025-04-07  9:08                     ` David Hildenbrand
2025-04-06 15:40           ` Michael S. Tsirkin
2025-04-03 14:35   ` Michael S. Tsirkin
2025-04-04  4:02     ` Halil Pasic
2025-04-04  5:33       ` Michael S. Tsirkin
2025-04-04 12:05         ` Halil Pasic
2025-04-10 18:44 ` David Hildenbrand
2025-04-11 11:11   ` Christian Borntraeger
2025-04-11 12:42     ` Heiko Carstens
2025-04-11 12:47       ` Christian Borntraeger
2025-04-11 13:34       ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).