Linux s390 Architecture development
From: sashiko-bot@kernel.org
To: "Anthony Krowiak" <akrowiak@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	"Heiko Carstens" <hca@linux.ibm.com>,
	"Alexander Gordeev" <agordeev@linux.ibm.com>,
	"Vasily Gorbik" <gor@linux.ibm.com>
Subject: Re: [PATCH v3 05/15] s390-vfio-ap: Callback to get/set vfio device mig state during guest migration
Date: Tue, 30 Jun 2026 11:11:41 +0000
Message-ID: <20260630111142.08F0A1F00A3A@smtp.kernel.org>
In-Reply-To: <20260630104132.1598851-6-akrowiak@linux.ibm.com>

Thank you for your contribution! Sashiko AI review found 5 potential issues to consider:
- [Critical] Race condition in `vfio_ap_release_mig_file`, which accesses `mig_data` without locking, resulting in a use-after-free and a memory leak.
- [High] Removing the migration file cleanup from `vfio_ap_reset_migration_state` leaves stale file pointers, causing resource leaks.
- [High] The device state is not updated to `VFIO_DEVICE_STATE_ERROR` when a state transition fails.
- [Medium] The `dev_err` call is not rate-limited, which allows unprivileged users to flood the kernel log.
- [Medium] Using a global mutex for device-specific state operations allows cross-VM interference.
--

commit d82cbc8a070306f90cbac1743dbb46ba5b58d080
Author: Anthony Krowiak <akrowiak@linux.ibm.com>

s390-vfio-ap: Callback to get/set vfio device mig state during guest migration

Implements two callbacks related to the current state of a vfio-ap device:
* The callback that sets a new migration state of a vfio-ap device during
  live migration of guests with pass-through access to AP devices. This
  callback is mandatory for VFIO_DEVICE_FEATURE_MIGRATION support.

  The function pointer for this callback is specified via the
  'migration_set_state' field of the 'vfio_migration_ops' structure
  which is stored with the VFIO device when the 'vfio_device'
  structure representing the mediated device is initialized.

* The callback that returns the current vfio device migration state during
  live migration of guests with pass-through access to AP devices.

  The function pointer for this callback is specified via the
  'migration_get_state' field of the 'vfio_migration_ops' structure
  which is stored with the VFIO device when the 'vfio_device'
  structure representing the mediated device is initialized.

> diff --git a/drivers/s390/crypto/vfio_ap_migration.c b/drivers/s390/crypto/vfio_ap_migration.c
> index c7c71980bdbb8..8274e7b65eea4 100644
> --- a/drivers/s390/crypto/vfio_ap_migration.c
> +++ b/drivers/s390/crypto/vfio_ap_migration.c
[ ... ]
> @@ -56,16 +57,140 @@ struct vfio_ap_config {
>  	struct vfio_ap_queue_info	qinfo[] __counted_by(num_queues);
>  };
>  
[ ... ]
> +static struct file *vfio_ap_set_state(struct vfio_device *vdev,
> +				      enum vfio_device_mig_state  new_state)
>  {
> -	return NULL;
> +	int ret;
> +	struct file *filp = NULL;
> +	struct ap_matrix_mdev *matrix_mdev;
> +	enum vfio_device_mig_state next_state;
> +	struct vfio_ap_migration_data *mig_data;
> +
> +	matrix_mdev = container_of(vdev, struct ap_matrix_mdev, vdev);
> +
> +	mutex_lock(&matrix_dev->mdevs_lock);

[Severity: Medium]
Does taking the global mdevs_lock here let a user who holds one vfio-ap
device fd contend the lock for every other mdev by repeatedly calling the
migration ioctl? Would it be better to use a per-device lock to prevent
potential cross-VM interference?
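
For illustration only, a userspace model of the per-device alternative might
look like the sketch below. `struct mdev_model`, `mig_lock`, and the helper
names are hypothetical, and pthread mutexes stand in for kernel mutexes.
Whether this is workable in the driver depends on what else mdevs_lock
protects, which this hunk does not show.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model of one mediated device with its own lock. */
struct mdev_model {
	pthread_mutex_t mig_lock;	/* protects mig_state of this device only */
	int mig_state;
};

static int mdev_model_init(struct mdev_model *m, int initial_state)
{
	assert(m != NULL);
	m->mig_state = initial_state;
	return pthread_mutex_init(&m->mig_lock, NULL);
}

static void mdev_model_set_state(struct mdev_model *m, int new_state)
{
	pthread_mutex_lock(&m->mig_lock);
	m->mig_state = new_state;
	pthread_mutex_unlock(&m->mig_lock);
}

static int mdev_model_get_state(struct mdev_model *m)
{
	int state;

	pthread_mutex_lock(&m->mig_lock);
	state = m->mig_state;
	pthread_mutex_unlock(&m->mig_lock);
	return state;
}
```

In this model, a caller that holds one device's `mig_lock` cannot delay
state queries or updates on a different device.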

> +	if (ap_is_se_guest()) {
> +		dev_err(matrix_mdev->vdev.dev,
> +			"Migration not allowed from or to a Secure Execution guest\n");

[Severity: Medium]
Could this unratelimited error message allow an unprivileged user program to 
flood the kernel log by calling the ioctl in a tight loop?
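
The kernel provides dev_err_ratelimited() for this case. As a rough
illustration of the interval/burst idea behind it, here is a userspace
sketch; `struct ratelimit_model` and both helpers are hypothetical and do
not reproduce the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical interval/burst limiter; time is passed in as plain ticks. */
struct ratelimit_model {
	long interval;		/* window length in ticks */
	int burst;		/* messages allowed per window */
	long window_start;	/* tick at which the current window began */
	int printed;		/* messages emitted in the current window */
	int suppressed;		/* messages dropped so far */
};

static bool ratelimit_model_allow(struct ratelimit_model *rl, long now)
{
	assert(rl->interval > 0 && rl->burst > 0);

	/* Start a fresh window once the old one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->suppressed++;
	return false;
}

/* Returns 1 if the message was emitted, 0 if it was suppressed. */
static int log_se_guest_error(struct ratelimit_model *rl, long now)
{
	if (!ratelimit_model_allow(rl, now))
		return 0;
	fprintf(stderr,
		"Migration not allowed from or to a Secure Execution guest\n");
	return 1;
}
```

With a limiter like this, a tight ioctl loop produces at most `burst`
messages per window instead of one message per call.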

> +		mutex_unlock(&matrix_dev->mdevs_lock);
> +		return ERR_PTR(-EPERM);
> +	}
> +
> +	mig_data = matrix_mdev->mig_data;
> +
[ ... ]
> +	while (mig_data->mig_state != new_state) {
> +		ret = vfio_mig_get_next_state(vdev, mig_data->mig_state,
> +					      new_state, &next_state);
> +		if (ret) {
> +			filp = ERR_PTR(ret);
> +			break;
> +		}
> +
> +		filp = vfio_ap_transition_to_state(matrix_mdev, next_state);
> +		if (IS_ERR(filp))
> +			break;

[Severity: High]
If a state transition fails here, should the device be placed into 
VFIO_DEVICE_STATE_ERROR? The current logic seems to break the loop and 
return the error, but mig_data->mig_state is left unmodified, so subsequent 
calls might report the old valid state instead of the required error state.
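
A minimal userspace sketch of the variant this comment asks about follows.
The enum, `struct mig_model`, the simplified arc rule, and the callback
type are all hypothetical; they do not reproduce vfio_mig_get_next_state()
or the driver's transition helper. Whether ERROR is the right outcome in
the driver depends on whether a failed arc can leave the device in a known
state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model states; not the kernel's enum vfio_device_mig_state. */
enum mig_state_model {
	MIG_ERROR,
	MIG_STOP,
	MIG_RUNNING,
	MIG_STOP_COPY,
	MIG_RESUMING,
};

struct mig_model {
	enum mig_state_model state;
};

typedef int (*transition_fn)(struct mig_model *m, enum mig_state_model next);

/* Simplified arc rule: every state other than STOP must first go to STOP. */
static int model_next_state(enum mig_state_model cur,
			    enum mig_state_model target,
			    enum mig_state_model *next)
{
	if (cur == MIG_ERROR || target == MIG_ERROR)
		return -EINVAL;
	*next = (cur == MIG_STOP) ? target : MIG_STOP;
	return 0;
}

static int model_set_state(struct mig_model *m, enum mig_state_model target,
			   transition_fn do_transition)
{
	enum mig_state_model next;
	int ret;

	assert(m != NULL && do_transition != NULL);

	while (m->state != target) {
		ret = model_next_state(m->state, target, &next);
		if (ret)
			return ret;	/* invalid request, state unchanged */

		ret = do_transition(m, next);
		if (ret) {
			/* The arc failed part-way: record the error state. */
			m->state = MIG_ERROR;
			return ret;
		}
		m->state = next;
	}
	return 0;
}

/* Hypothetical callbacks used to exercise the loop. */
static int transition_ok(struct mig_model *m, enum mig_state_model next)
{
	(void)m;
	(void)next;
	return 0;
}

static int transition_fail_on_stop_copy(struct mig_model *m,
					enum mig_state_model next)
{
	(void)m;
	return next == MIG_STOP_COPY ? -EIO : 0;
}
```

In this model, a device whose STOP to STOP_COPY arc fails reports ERROR
afterwards, and further transition requests are rejected until the state
is reset by other means.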

> +
> +		mig_data->mig_state = next_state;
> +
> +		if (WARN_ON(filp && new_state != next_state)) {
> +			fput(filp);
> +			filp = ERR_PTR(-EINVAL);
> +			break;
> +		}
> +	}
> +
> +	mutex_unlock(&matrix_dev->mdevs_lock);
> +
> +	return filp;
>  }
>  
>  static int vfio_ap_get_state(struct vfio_device *vdev,
>  			     enum vfio_device_mig_state  *current_state)
>  {
> -	return -EOPNOTSUPP;
> +	struct ap_matrix_mdev *matrix_mdev;
> +	struct vfio_ap_migration_data *mig_data;
> +
> +	mutex_lock(&matrix_dev->mdevs_lock);

[Severity: Medium]
Similar to the concern in vfio_ap_set_state(), could acquiring the global 
mdevs_lock just to read the device-specific state lead to cross-VM lock 
contention?

> +
> +	matrix_mdev = container_of(vdev, struct ap_matrix_mdev, vdev);
> +	mig_data =  matrix_mdev->mig_data;
> +
[ ... ]
> @@ -146,6 +271,5 @@ void vfio_ap_reset_migration_state(struct ap_matrix_mdev *matrix_mdev)
>  	if (!matrix_mdev->mig_data)
>  		return;
>  
> -	vfio_ap_release_mig_files(matrix_mdev);
>  	matrix_mdev->mig_data->mig_state = VFIO_DEVICE_STATE_RUNNING;
>  }

[Severity: High]
Is it safe to remove the call to vfio_ap_release_mig_files() here? If 
userspace starts a new migration, won't the new file pointers overwrite the 
old ones in mig_data? When the old file descriptors are closed later, the 
driver's release callback won't find a match in mig_data, which appears to 
result in a permanent memory leak of the associated allocations.
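
To make the concern concrete, here is a userspace sketch of a reset path
that detaches the old migration file before a new session can install its
own. All names are hypothetical. What vfio_ap_release_mig_files() actually
does is not visible in this hunk; the sketch assumes it disowns the file
and clears the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_STOP = 1, MODEL_RUNNING = 2 };

struct mig_file_model {
	bool disabled;	/* set once the driver has disowned the file */
};

struct mig_data_model {
	int state;
	struct mig_file_model *saving_file;
};

/* A new session must never overwrite a pointer that is still live. */
static void model_start_saving(struct mig_data_model *d,
			       struct mig_file_model *f)
{
	assert(d->saving_file == NULL);
	f->disabled = false;
	d->saving_file = f;
}

static void model_release_files(struct mig_data_model *d)
{
	if (d->saving_file) {
		d->saving_file->disabled = true;
		d->saving_file = NULL;
	}
}

static void model_reset(struct mig_data_model *d)
{
	if (!d)
		return;
	/* Detach the old file first, then return to the running state. */
	model_release_files(d);
	d->state = MODEL_RUNNING;
}
```

If `model_reset()` skipped `model_release_files()`, the second
`model_start_saving()` call would trip the assertion, which is the
overwrite described above.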

[Severity: Critical]
Are there also race conditions related to accessing mig_data without locks 
during release? Looking at vfio_ap_release_mig_file():

    struct ap_matrix_mdev *matrix_mdev = filp->private_data;

    if (!matrix_mdev || !matrix_mdev->mig_data)
        return -ENODEV;

    if (filp == matrix_mdev->mig_data->stop_copy_mig_file)

Could an unprivileged process cause a use-after-free by closing the VFIO 
device file descriptor while keeping the migration file descriptor open, 
and then closing the migration file descriptor? Since mig_data is accessed 
locklessly here, it seems it could be freed concurrently during device 
closure, leading to a dereference of freed memory.

Additionally, if vfio_ap_release_migration_data() is called concurrently, 
does it properly free internal allocations like resuming_mig_state.ap_config, 
or are those leaked when the parent mig_data is freed?
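
One possible shape for closing the race is sketched below as a userspace
model: device close and file release serialize on the same lock, and
device close clears the file's back-pointer before freeing the migration
data. The types, `model_lock`, and the function names are hypothetical,
and the -1 return stands in for an error such as -ENODEV.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct mig_data_model;

struct mig_file_model {
	struct mig_data_model *owner;	/* NULL once the device is closed */
};

struct mig_data_model {
	struct mig_file_model *stop_copy_file;
};

/* Stands in for the lock that serializes device close and file release. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;

static struct mig_data_model *model_open_device(void)
{
	return calloc(1, sizeof(struct mig_data_model));
}

static struct mig_file_model *model_open_file(struct mig_data_model *d)
{
	struct mig_file_model *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	pthread_mutex_lock(&model_lock);
	assert(d->stop_copy_file == NULL);
	f->owner = d;
	d->stop_copy_file = f;
	pthread_mutex_unlock(&model_lock);
	return f;
}

static void model_close_device(struct mig_data_model *d)
{
	pthread_mutex_lock(&model_lock);
	/* Break the back-pointer so a later release never touches d. */
	if (d->stop_copy_file)
		d->stop_copy_file->owner = NULL;
	free(d);
	pthread_mutex_unlock(&model_lock);
}

/* Returns 0 if detached from a live device, -1 if the device is gone. */
static int model_release_file(struct mig_file_model *f)
{
	int ret = -1;

	pthread_mutex_lock(&model_lock);
	if (f->owner) {
		f->owner->stop_copy_file = NULL;
		ret = 0;
	}
	pthread_mutex_unlock(&model_lock);
	free(f);
	return ret;
}
```

In this model, closing the device first and the migration file second
never dereferences the freed migration data.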

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260630104132.1598851-1-akrowiak@linux.ibm.com?part=5

