Linux s390 Architecture development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Anthony Krowiak" <akrowiak@linux.ibm.com>
Cc: "Alexander Gordeev" <agordeev@linux.ibm.com>,
	kvm@vger.kernel.org, linux-s390@vger.kernel.org,
	"Vasily Gorbik" <gor@linux.ibm.com>,
	"Heiko Carstens" <hca@linux.ibm.com>
Subject: Re: [PATCH v3 12/15] s390/vfio-ap: Transition device migration state from STOP to RUNNING and vice versa
Date: Tue, 30 Jun 2026 11:48:57 +0000	[thread overview]
Message-ID: <20260630114857.B26E71F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260630104132.1598851-13-akrowiak@linux.ibm.com>

Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:

New issues:
- [High] Transitioning the VFIO AP device from RUNNING to STOP state does not quiesce or reset the underlying AP queues, potentially allowing hardware to modify guest memory asynchronously.

Pre-existing issues:
- [Critical] Writing a chunk smaller than `sizeof(struct vfio_ap_config)` to the resuming file allocates an undersized buffer, triggering an out-of-bounds read during validation.
- [Critical] A partial initial write to the vfio-ap migration resuming stream file skips saving the configuration pointer, leading to a memory leak and a subsequent NULL pointer dereference.
- [Critical] Error paths after a successful initial write double-free the configuration pointer, or leave a dangling pointer leading to a Use-After-Free.
- [High] The function `vfio_ap_resuming_write` conditionally unlocks a global mutex using `mutex_is_locked()`, incorrectly dropping the lock when held by other threads and corrupting mutex state.
- [High] The driver allocates large, potentially fragmented user-controlled memory chunks using `kzalloc()`, leading to unreliable device migration.
--

commit a3f4b819ba0ee66ec445b971234644e51eeab7e9
Author: Anthony Krowiak <akrowiak@linux.ibm.com>

s390/vfio-ap: Transition device migration state from STOP to RUNNING and vice versa

This commit updates the device migration state transitions to handle the
transition from STOP to RUNNING and vice versa. It treats these state
changes as a normal operation state where no internal device state needs
to be migrated.

> diff --git a/drivers/s390/crypto/vfio_ap_migration.c b/drivers/s390/crypto/vfio_ap_migration.c
> index 1b526b743ad02..572db04c4ac15 100644
> --- a/drivers/s390/crypto/vfio_ap_migration.c
> +++ b/drivers/s390/crypto/vfio_ap_migration.c
> @@ -1172,12 +1172,17 @@ vfio_ap_transition_to_state(struct ap_matrix_mdev *matrix_mdev,
>  		return NULL;
>  	}
>  
> +	/*
> +	 * These states indicate migration has either not been initiated or
> +	 * has completed and the vfio-ap device is operating normally.Since the
> +	 * vfio-ap device does not virtualize a DMA device, there is no internal
> +	 * device state to incorporate into the vfio-ap device on the target.
> +	 */
>  	if ((cur_state == VFIO_DEVICE_STATE_STOP &&
>  	     new_state == VFIO_DEVICE_STATE_RUNNING) ||
>  	    (cur_state == VFIO_DEVICE_STATE_RUNNING &&
>  	     new_state == VFIO_DEVICE_STATE_STOP)) {
> -		/* TODO */
> -		return ERR_PTR(-EOPNOTSUPP);
> +		return NULL;

[Severity: High]
Does returning NULL here safely quiesce the underlying AP queues when
transitioning from RUNNING to STOP?

The VFIO migration specification requires that when a device transitions to
the STOP state, it must cease processing commands, generating interrupts, and
writing to guest memory. If the queues are not stopped or reset during a
RUNNING to STOP transition, the hardware may continue to modify memory after
the page is migrated, which could result in missed completion events.

>  	}
>  
>  	/* vfio_mig_get_next_state() does not use arcs other than the above */

[Severity: Critical]
This is a pre-existing issue, but can writing a small chunk to the resuming
file lead to an out-of-bounds read?

If userspace writes a chunk smaller than the configuration struct size,
allocate_ap_config() appears to allocate an undersized buffer:

drivers/s390/crypto/vfio_ap_migration.c:allocate_ap_config() {
    ...
    *ap_config = kzalloc(config_size, GFP_KERNEL_ACCOUNT);
    ...
}

If this data is copied and triggers early validation in
vfio_ap_resuming_write():

drivers/s390/crypto/vfio_ap_migration.c:vfio_ap_resuming_write() {
    ...
    if (*pos + len == cfg_sz) {
        ret = do_post_copy_validation(mdev_name, ap_config);
    ...
}

Could this result in do_post_copy_validation() reading past the allocation
since the buffer was sized based on the short length rather than the full
struct?

[Severity: Critical]
This is a pre-existing issue, but does a partial initial write skip saving
the configuration pointer?

In vfio_ap_resuming_write(), if the initial write does not equal the full
configuration size, it seems to jump to done without saving the newly
allocated ap_config:

drivers/s390/crypto/vfio_ap_migration.c:vfio_ap_resuming_write() {
    ...
    ret = len;
    *pos += len;
    goto done;

cleanup:
    kfree(ap_config);
    ...
}

If a subsequent write occurs, reallocate_ap_config() might dereference a
null pointer because the pointer was never saved during the first partial
write:

drivers/s390/crypto/vfio_ap_migration.c:reallocate_ap_config() {
    ...
    struct vfio_ap_config *ap_config = *mig_ap_config;
    ...
    cur_cfg_sz = ap_config->config_sz;
    ...
}

Could this sequence lead to a memory leak and a null pointer dereference?

[Severity: Critical]
This is a pre-existing issue, but can error paths in vfio_ap_resuming_write()
cause a double-free or use-after-free?

If an additional write triggers an error, the code jumps to cleanup and frees
ap_config:

drivers/s390/crypto/vfio_ap_migration.c:vfio_ap_resuming_write() {
    ...
    ret = reallocate_ap_config(&ap_config, len);
    if (ret < 0)
        goto cleanup;
    ...
cleanup:
    kfree(ap_config);
    ...
}

However, the persistently tracked pointer in resuming_mig_state is not
nullified. When the file is released, the pointer is freed again:

drivers/s390/crypto/vfio_ap_migration.c:vfio_ap_release_resuming_file() {
    ...
    kfree(mig_data->resuming_mig_state.ap_config);
    mig_data->resuming_mig_state.ap_config = NULL;
    ...
}

Does this leave a dangling pointer that gets double-freed when the file is
closed?

[Severity: High]
This isn't a bug introduced by this patch, but does vfio_ap_resuming_write()
incorrectly unlock the global mutex if it is held by another thread?

At the end of the write operation, the function conditionally unlocks the
mutex:

drivers/s390/crypto/vfio_ap_migration.c:vfio_ap_resuming_write() {
    ...
done:
    if (mutex_is_locked(&matrix_dev->mdevs_lock))
        mutex_unlock(&matrix_dev->mdevs_lock);
    ...
}

Since mutex_is_locked() checks if the lock is held by any thread, not just
the current thread, could this cause the current thread to erroneously unlock
it when it is safely held by a completely different thread?

[Severity: High]
This is a pre-existing issue, but is kzalloc() safe for large allocations
when getting the AP configuration?

Both allocate_ap_config() and vfio_ap_get_config() use kzalloc() for memory
that can exceed 1MB depending on the user-supplied queue length:

drivers/s390/crypto/vfio_ap_migration.c:allocate_ap_config() {
    ...
    *ap_config = kzalloc(config_size, GFP_KERNEL_ACCOUNT);
    ...
}

drivers/s390/crypto/vfio_ap_migration.c:vfio_ap_get_config() {
    ...
    ap_configuration = kzalloc(*ap_config_size, GFP_KERNEL_ACCOUNT);
    ...
}

Since this requires large contiguous physical chunks, could this lead to
sporadic ENOMEM failures due to memory fragmentation? Would it be more
reliable to use kvzalloc() here since contiguous physical pages are not
strictly required for staging buffers?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260630104132.1598851-1-akrowiak@linux.ibm.com?part=12

  reply	other threads:[~2026-06-30 11:48 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-30 10:41 [PATCH v3 00/15] s390/vfio-ap: Add live guest migration support Anthony Krowiak
2026-06-30 10:41 ` [PATCH v3 01/15] s390/vfio-ap: Provide function to get the number of queues assigned to mdev Anthony Krowiak
2026-06-30 10:41 ` [PATCH v3 02/15] s390/vfio-ap: Data structures for facilitating vfio device migration Anthony Krowiak
2026-06-30 10:55   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 03/15] s390/vfio-ap: Initialize/release vfio device migration data Anthony Krowiak
2026-06-30 11:04   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 04/15] s390/vfio-ap: Reset migration state in VFIO_DEVICE_RESET ioctl handler Anthony Krowiak
2026-06-30 11:10   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 05/15] s390-vfio-ap: Callback to get/set vfio device mig state during guest migration Anthony Krowiak
2026-06-30 11:11   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 06/15] s390/vfio-ap: Transition guest migration state from STOP to STOP_COPY Anthony Krowiak
2026-06-30 11:23   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 07/15] s390/vfio-ap: File ops called to save the vfio device migration state Anthony Krowiak
2026-06-30 11:26   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 08/15] s390/vfio-ap: Transition device migration state from STOP to RESUMING Anthony Krowiak
2026-06-30 11:28   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 09/15] s390/vfio-ap: Add method to set a new guest AP configuration Anthony Krowiak
2026-06-30 11:34   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 10/15] s390/vfio-ap: File ops called to resume the vfio device migration Anthony Krowiak
2026-06-30 11:37   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 11/15] s390/vfio-ap: Transition device migration state to STOP Anthony Krowiak
2026-06-30 11:46   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 12/15] s390/vfio-ap: Transition device migration state from STOP to RUNNING and vice versa Anthony Krowiak
2026-06-30 11:48   ` sashiko-bot [this message]
2026-06-30 10:41 ` [PATCH v3 13/15] s390/vfio-ap: Callback to get the size of data to be migrated during guest migration Anthony Krowiak
2026-06-30 11:49   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 14/15] s390/vfio-ap: Add 'migratable' feature to sysfs 'features' attribute Anthony Krowiak
2026-06-30 11:56   ` sashiko-bot
2026-06-30 10:41 ` [PATCH v3 15/15] s390/vfio-ap: Add live guest migration chapter to vfio-ap.rst Anthony Krowiak
2026-06-30 11:54   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260630114857.B26E71F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=agordeev@linux.ibm.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox