Intel-Wired-Lan Archive on lore.kernel.org
From: Jason Gunthorpe <jgg@nvidia.com>
To: Lingyu Liu <lingyu.liu@intel.com>
Cc: kevin.tian@intel.com, yi.l.liu@intel.com,
	intel-wired-lan@lists.osuosl.org, phani.r.burra@intel.com
Subject: Re: [Intel-wired-lan] [PATCH iwl-next V2 10/15] ice: save and restore TX queue head
Date: Wed, 21 Jun 2023 11:37:17 -0300	[thread overview]
Message-ID: <ZJMLHSq9rjGIVS4V@nvidia.com> (raw)
In-Reply-To: <20230621091112.44945-11-lingyu.liu@intel.com>

On Wed, Jun 21, 2023 at 09:11:07AM +0000, Lingyu Liu wrote:
> diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c
> index 2579bc0bd193..c2a83a97af05 100644
> --- a/drivers/net/ethernet/intel/ice/ice_migration.c
> +++ b/drivers/net/ethernet/intel/ice/ice_migration.c

> +static int
> +ice_migration_restore_tx_head(struct ice_vf *vf,
> +			      struct ice_migration_dev_state *devstate,
> +			      struct vfio_device *vdev)
> +{
> +	struct ice_tx_desc *tx_desc_dummy, *tx_desc;
> +	struct ice_vsi *vsi = ice_get_vf_vsi(vf);
> +	struct ice_pf *pf = vf->pf;
> +	u16 max_ring_len = 0;
> +	struct device *dev;
> +	int ret = 0;
> +	int i = 0;
> +
> +	dev = ice_pf_to_dev(vf->pf);
> +
> +	if (!vsi) {
> +		dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id);
> +		return -EINVAL;
> +	}
> +
> +	ice_for_each_txq(vsi, i) {
> +		if (!test_bit(i, vf->txq_ena))
> +			continue;
> +
> +		max_ring_len = max(vsi->tx_rings[i]->count, max_ring_len);
> +	}
> +
> +	if (max_ring_len == 0)
> +		return 0;
> +
> +	tx_desc = kcalloc(max_ring_len, sizeof(struct ice_tx_desc),
> +			  GFP_KERNEL);
> +	tx_desc_dummy = kcalloc(max_ring_len, sizeof(struct ice_tx_desc),
> +				GFP_KERNEL);
> +	if (!tx_desc || !tx_desc_dummy) {
> +		dev_err(dev, "VF %d failed to allocate memory for tx descriptors to restore tx head\n",
> +			vf->vf_id);
> +		ret = -ENOMEM;
> +		goto err;
> +	}
> +
> +	for (i = 0; i < max_ring_len; i++) {
> +		u32 td_cmd;
> +
> +		td_cmd = ICE_TXD_LAST_DESC_CMD | ICE_TX_DESC_CMD_DUMMY;
> +		tx_desc_dummy[i].cmd_type_offset_bsz =
> +					ice_build_ctob(td_cmd, 0, SZ_256, 0);
> +	}
> +
> +	/* For each tx queue, we restore the tx head following below steps:
> +	 * 1. backup original tx ring descriptor memory
> +	 * 2. overwrite the tx ring descriptor with dummy packets
> +	 * 3. kick doorbell register to trigger descriptor writeback,
> +	 *    then tx head will move from 0 to tail - 1 and tx head is restored
> +	 *    to the place we expect.
> +	 * 4. restore the tx ring with original tx ring descriptor memory in
> +	 *    order not to corrupt the ring context.
> +	 */
> +	ice_for_each_txq(vsi, i) {
> +		struct ice_tx_ring *tx_ring = vsi->tx_rings[i];
> +		u16 *tx_heads = devstate->tx_head;
> +		u32 tx_head;
> +		int j;
> +
> +		if (!test_bit(i, vf->txq_ena) || tx_heads[i] == 0)
> +			continue;
> +
> +		if (tx_heads[i] >= tx_ring->count) {
> +			dev_err(dev, "saved tx ring head exceeds tx ring count\n");
> +			ret = -EINVAL;
> +			goto err;
> +		}
> +		ret = vfio_dma_rw(vdev, tx_ring->dma, (void *)tx_desc,
> +				  tx_ring->count * sizeof(tx_desc[0]), false);
> +		if (ret) {
> +			dev_err(dev, "kvm read guest tx ring error: %d\n",
> +				ret);
> +			goto err;

You can't call VFIO functions from a netdev driver. All this code
needs to be moved into the variant driver.

This design seems pretty wild to me, it doesn't seem too robust
against a hostile VM - e.g. these DMAs can all fail under guest
control, and then what?

We also don't have any guarantees defined for the VFIO protocol about
what state the vIOMMU will be in prior to reaching RUNNING.

IDK, all of this looks like it is trying really hard to hackily force
HW that was never meant to support live migration to somehow do
something that looks like it.

You really need to present an explanation in the VFIO driver comments
about how this whole scheme actually works and is secure and
functional against a hostile guest.

Jason



Thread overview: 36+ messages
2023-06-21  9:10 [Intel-wired-lan] [PATCH iwl-next V2 00/15] Add E800 live migration driver Lingyu Liu
2023-06-21  9:10 ` [Intel-wired-lan] [PATCH iwl-next V2 01/15] ice: Fix missing legacy 32byte RXDID in the supported bitmap Lingyu Liu
2023-06-21  9:10 ` [Intel-wired-lan] [PATCH iwl-next V2 02/15] ice: add function to get rxq context Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 03/15] ice: check VF migration status before sending messages to VF Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 04/15] ice: add migration init field and helper functions Lingyu Liu
2023-06-21 13:35   ` Jason Gunthorpe
2023-06-27  7:50     ` Cao, Yahui
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 05/15] ice: save VF messages as device state Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 06/15] ice: save and restore " Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 07/15] ice: do not notify VF link state during migration Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 08/15] ice: change VSI id in virtual channel message after migration Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 09/15] ice: save and restore RX queue head Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 10/15] ice: save and restore TX " Lingyu Liu
2023-06-21 14:37   ` Jason Gunthorpe [this message]
2023-06-27  6:55     ` Tian, Kevin
2023-07-03  5:27       ` Cao, Yahui
2023-07-03 21:03         ` Jason Gunthorpe
2023-07-04  7:35           ` Tian, Kevin
2023-06-28  8:11     ` Liu, Yi L
2023-06-28 12:39       ` Jason Gunthorpe
2023-07-03 12:54         ` Liu, Yi L
2023-07-04  7:38           ` Tian, Kevin
2023-07-04 17:59             ` Peter Xu
2023-07-10 15:54               ` Jason Gunthorpe
2023-07-17 21:43                 ` Peter Xu
2023-07-18 15:38                   ` Jason Gunthorpe
2023-07-18 17:36                     ` Peter Xu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 11/15] ice: stop device before saving device states Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 12/15] ice: mask VF advanced capabilities if live migration is activated Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 13/15] vfio/ice: implement vfio_pci driver for E800 devices Lingyu Liu
2023-06-21 14:23   ` Jason Gunthorpe
2023-06-27  9:00     ` Liu, Lingyu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 14/15] vfio: Expose vfio_device_has_container() Lingyu Liu
2023-06-21  9:11 ` [Intel-wired-lan] [PATCH iwl-next V2 15/15] vfio/ice: support iommufd vfio compat mode Lingyu Liu
2023-06-21 14:40   ` Jason Gunthorpe
2023-06-27  8:09     ` Cao, Yahui
