* Re: [PATCH v8 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
       [not found] ` <20211220195646.44498-10-cristian.marussi@arm.com>
@ 2021-12-20 23:17   ` Michael S. Tsirkin
       [not found]   ` <20211221140027.41524-1-cristian.marussi@arm.com>
  1 sibling, 0 replies; 3+ messages in thread
From: Michael S. Tsirkin @ 2021-12-20 23:17 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: f.fainelli, vincent.guittot, Igor Skalkin, sudeep.holla,
	linux-kernel, virtualization, Peter Hilber, james.quinlan,
	Jonathan.Cameron, souvik.chakravarty, etienne.carriere,
	linux-arm-kernel

On Mon, Dec 20, 2021 at 07:56:44PM +0000, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
> 
> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Skalkin <igor.skalkin@opensynergy.com>
> Cc: Peter Hilber <peter.hilber@opensynergy.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v7 --> v8
> - removed ifdeffery
> - reviewed comments
> - simplified spinlocking around scmi_feed_vq_tx/rx
> - added deferred worker for TX replies to aid while polling mode is active
> V6 --> V7
> - added a few comments about virtio polling internals
> - fixed missing list_del on pending_cmds_list processing
> - shrank spinlocked areas in virtio_poll_done
> - added proper spinlocking to scmi_vio_complete_cb while scanning list
>   of pending cmds
> ---
>  drivers/firmware/arm_scmi/Kconfig  |  15 ++
>  drivers/firmware/arm_scmi/virtio.c | 291 ++++++++++++++++++++++++++---
>  2 files changed, 280 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
>  	  the ones implemented by kvmtool) and let the core Kernel VirtIO layer
>  	  take care of the needed conversions, say N.
>  
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +	bool "Enable atomic mode for SCMI VirtIO transport"
> +	depends on ARM_SCMI_TRANSPORT_VIRTIO
> +	help
> +	  Enable support of atomic operations for the SCMI VirtIO based transport.
> +
> +	  If you want the SCMI VirtIO based transport to operate in atomic
> +	  mode, avoiding any kind of sleeping behaviour for selected
> +	  transactions on the TX path, answer Y.
> +
> +	  Enabling atomic mode operations allows any SCMI driver using this
> +	  transport to optionally ask for atomic SCMI transactions and operate
> +	  in atomic context too, at the price of using a number of busy-waiting
> +	  primitives all over instead. If unsure say N.
> +
>  endif #ARM_SCMI_PROTOCOL
>  
>  config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..f589bbcc5db9 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -3,8 +3,8 @@
>   * Virtio Transport driver for Arm System Control and Management Interface
>   * (SCMI).
>   *
> - * Copyright (C) 2020-2021 OpenSynergy.
> - * Copyright (C) 2021 ARM Ltd.
> + * Copyright (C) 2020-2022 OpenSynergy.
> + * Copyright (C) 2021-2022 ARM Ltd.
>   */
>  
>  /**
> @@ -38,6 +38,9 @@
>   * @vqueue: Associated virtqueue
>   * @cinfo: SCMI Tx or Rx channel
>   * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @deferred_tx_work: Worker for TX deferred replies processing
> + * @deferred_tx_wq: Workqueue for TX deferred replies
> + * @pending_cmds_list: List of pre-fetched commands queued for later processing
>   * @is_rx: Whether channel is an Rx channel
>   * @ready: Whether transport user is ready to hear about channel
>   * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +52,9 @@ struct scmi_vio_channel {
>  	struct virtqueue *vqueue;
>  	struct scmi_chan_info *cinfo;
>  	struct list_head free_list;
> +	struct list_head pending_cmds_list;
> +	struct work_struct deferred_tx_work;
> +	struct workqueue_struct *deferred_tx_wq;
>  	bool is_rx;
>  	bool ready;
>  	unsigned int max_msg;
> @@ -65,12 +71,22 @@ struct scmi_vio_channel {
>   * @input: SDU used for (delayed) responses and notifications
>   * @list: List which scmi_vio_msg may be part of
>   * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + *	      transaction reply was configured for polling.
> + *	      Note that, since the virtqueue used index is an unsigned 16-bit value,
> + *	      we can use some out-of-range values to signify particular conditions.
> + * @poll_lock: Protect access to @poll_idx.
>   */
>  struct scmi_vio_msg {
>  	struct scmi_msg_payld *request;
>  	struct scmi_msg_payld *input;
>  	struct list_head list;
>  	unsigned int rx_len;
> +#define VIO_MSG_NOT_POLLED	0xeeeeeeeeUL
> +#define VIO_MSG_POLL_DONE	0xffffffffUL
> +	unsigned int poll_idx;
> +	/* lock to protect access to poll_idx. */
> +	spinlock_t poll_lock;
>  };
>  
>  /* Only one SCMI VirtIO device can possibly exist */
> @@ -81,40 +97,43 @@ static bool scmi_vio_have_vq_rx(struct virtio_device *vdev)
>  	return virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
>  }
>  
> +/* Expected to be called with vioch->lock acquired by the caller and IRQs off */
>  static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
>  			       struct scmi_vio_msg *msg,
>  			       struct device *dev)
>  {
>  	struct scatterlist sg_in;
>  	int rc;
> -	unsigned long flags;
>  
>  	sg_init_one(&sg_in, msg->input, VIRTIO_SCMI_MAX_PDU_SIZE);
>  
> -	spin_lock_irqsave(&vioch->lock, flags);
> -
>  	rc = virtqueue_add_inbuf(vioch->vqueue, &sg_in, 1, msg, GFP_ATOMIC);
>  	if (rc)
>  		dev_err(dev, "failed to add to RX virtqueue (%d)\n", rc);
>  	else
>  		virtqueue_kick(vioch->vqueue);
>  
> -	spin_unlock_irqrestore(&vioch->lock, flags);
> -
>  	return rc;
>  }
>  
> +/* Expected to be called with vioch->lock acquired by the caller and IRQs off */
> +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> +				       struct scmi_vio_msg *msg)
> +{
> +	spin_lock(&msg->poll_lock);
> +	msg->poll_idx = VIO_MSG_NOT_POLLED;
> +	spin_unlock(&msg->poll_lock);
> +
> +	list_add(&msg->list, &vioch->free_list);
> +}
> +
>  static void scmi_finalize_message(struct scmi_vio_channel *vioch,
>  				  struct scmi_vio_msg *msg)
>  {
> -	if (vioch->is_rx) {
> +	if (vioch->is_rx)
>  		scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> -	} else {
> -		/* Here IRQs are assumed to be already disabled by the caller */
> -		spin_lock(&vioch->lock);
> -		list_add(&msg->list, &vioch->free_list);
> -		spin_unlock(&vioch->lock);
> -	}
> +	else
> +		scmi_vio_feed_vq_tx(vioch, msg);
>  }
>  
>  static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> @@ -144,6 +163,7 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			virtqueue_disable_cb(vqueue);
>  			cb_enabled = false;
>  		}
> +
>  		msg = virtqueue_get_buf(vqueue, &length);
>  		if (!msg) {
>  			if (virtqueue_enable_cb(vqueue))
> @@ -157,7 +177,9 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  			scmi_rx_callback(vioch->cinfo,
>  					 msg_read_header(msg->input), msg);
>  
> +			spin_lock(&vioch->lock);
>  			scmi_finalize_message(vioch, msg);
> +			spin_unlock(&vioch->lock);
>  		}
>  
>  		/*
> @@ -176,6 +198,34 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>  	spin_unlock_irqrestore(&vioch->ready_lock, ready_flags);
>  }
>  
> +static void scmi_vio_deferred_tx_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch;
> +	struct scmi_vio_msg *msg, *tmp;
> +
> +	vioch = container_of(work, struct scmi_vio_channel, deferred_tx_work);
> +
> +	/* Process pre-fetched messages */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +
> +	/* Scan the list of possibly pre-fetched messages during polling. */
> +	list_for_each_entry_safe(msg, tmp, &vioch->pending_cmds_list, list) {
> +		list_del(&msg->list);
> +
> +		scmi_rx_callback(vioch->cinfo,
> +				 msg_read_header(msg->input), msg);
> +
> +		/* Free the processed message once done */
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	}
> +
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	/* Process possibly still pending messages */
> +	scmi_vio_complete_cb(vioch->vqueue);
> +}
> +
>  static const char *const scmi_vio_vqueue_names[] = { "tx", "rx" };
>  
>  static vq_callback_t *scmi_vio_complete_callbacks[] = {
> @@ -244,6 +294,19 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  
>  	vioch = &((struct scmi_vio_channel *)scmi_vdev->priv)[index];
>  
> +	/* Setup a deferred worker for polling. */
> +	if (tx && !vioch->deferred_tx_wq) {
> +		vioch->deferred_tx_wq =
> +			alloc_workqueue(dev_name(&scmi_vdev->dev),
> +					WQ_UNBOUND | WQ_FREEZABLE | WQ_SYSFS,
> +					0);
> +		if (!vioch->deferred_tx_wq)
> +			return -ENOMEM;
> +
> +		INIT_WORK(&vioch->deferred_tx_work,
> +			  scmi_vio_deferred_tx_worker);
> +	}
> +
>  	for (i = 0; i < vioch->max_msg; i++) {
>  		struct scmi_vio_msg *msg;
>  
> @@ -257,6 +320,7 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  						    GFP_KERNEL);
>  			if (!msg->request)
>  				return -ENOMEM;
> +			spin_lock_init(&msg->poll_lock);
>  		}
>  
>  		msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> @@ -264,13 +328,12 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>  		if (!msg->input)
>  			return -ENOMEM;
>  
> -		if (tx) {
> -			spin_lock_irqsave(&vioch->lock, flags);
> -			list_add_tail(&msg->list, &vioch->free_list);
> -			spin_unlock_irqrestore(&vioch->lock, flags);
> -		} else {
> +		spin_lock_irqsave(&vioch->lock, flags);
> +		if (tx)
> +			scmi_vio_feed_vq_tx(vioch, msg);
> +		else
>  			scmi_vio_feed_vq_rx(vioch, msg, cinfo->dev);
> -		}
> +		spin_unlock_irqrestore(&vioch->lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -296,6 +359,11 @@ static int virtio_chan_free(int id, void *p, void *data)
>  	vioch->ready = false;
>  	spin_unlock_irqrestore(&vioch->ready_lock, flags);
>  
> +	if (!vioch->is_rx && vioch->deferred_tx_wq) {
> +		destroy_workqueue(vioch->deferred_tx_wq);
> +		vioch->deferred_tx_wq = NULL;
> +	}
> +
>  	scmi_free_channel(cinfo, data, id);
>  
>  	spin_lock_irqsave(&vioch->lock, flags);
> @@ -324,7 +392,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  	}
>  
>  	msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> -	list_del(&msg->list);
> +	/* Re-init element so we can discern at any time whether it is still in-flight */
> +	list_del_init(&msg->list);
>  
>  	msg_tx_prepare(msg->request, xfer);
>  
> @@ -337,6 +406,19 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
>  		dev_err(vioch->cinfo->dev,
>  			"failed to add to TX virtqueue (%d)\n", rc);
>  	} else {
> +		/*
> +		 * If polling was requested for this transaction:
> +		 *  - retrieve last used index (will be used as polling reference)
> +		 *  - bind the polled message to the xfer via .priv
> +		 */
> +		if (xfer->hdr.poll_completion) {
> +			spin_lock(&msg->poll_lock);
> +			msg->poll_idx =
> +				virtqueue_enable_cb_prepare(vioch->vqueue);
> +			spin_unlock(&msg->poll_lock);
> +			/* Ensure initialized msg is visibly bound to xfer */
> +			smp_store_mb(xfer->priv, msg);
> +		}
>  		virtqueue_kick(vioch->vqueue);
>  	}
>  
> @@ -350,10 +432,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_response(msg->input, msg->rx_len, xfer);
> -		xfer->priv = NULL;
> -	}
>  }
>  
>  static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> @@ -361,10 +441,165 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
>  {
>  	struct scmi_vio_msg *msg = xfer->priv;
>  
> -	if (msg) {
> +	if (msg)
>  		msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> -		xfer->priv = NULL;
> +}
> +
> +/**
> + * virtio_mark_txdone  - Mark transmission done
> + *
> + * Free only successfully completed polling transfer messages.
> + *
> + * Note that, in the SCMI VirtIO transport, we never explicitly release timed-out
> + * messages by forcibly re-adding them to the free-list inside the TX code path;
> + * we instead let the IRQ/RX callbacks eventually clean up such messages once a
> + * late reply is finally received and discarded (if ever).
> + *
> + * This approach was deemed preferable since those pending timed-out buffers are
> + * still effectively owned by the SCMI platform VirtIO device even after timeout
> + * expiration: forcibly freeing and reusing them before they had been returned
> + * explicitly by the SCMI platform could lead to subtle bugs due to message
> + * corruption.
> + * An SCMI platform VirtIO device which never returns message buffers is
> + * broken anyway and will quickly lead to exhaustion of available messages.
> + *
> + * For this same reason, here, we take care to free only the successfully
> + * completed polled messages, since they won't be freed elsewhere; late replies
> + * to timed-out polled messages would anyway be freed by the RX callbacks instead.
> + *
> + * @cinfo: SCMI channel info
> + * @ret: Transmission return code
> + * @xfer: Transfer descriptor
> + */
> +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> +			       struct scmi_xfer *xfer)
> +{
> +	unsigned long flags;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +	struct scmi_vio_msg *msg = xfer->priv;
> +
> +	if (!msg)
> +		return;
> +
> +	/* Ensure msg is unbound from xfer before pushing onto the free list  */
> +	smp_store_mb(xfer->priv, NULL);
> +
> +	/* Is a successfully completed polled message still to be finalized ? */
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> +		scmi_vio_feed_vq_tx(vioch, msg);
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +}
> +
> +/**
> + * virtio_poll_done  - Provide polling support for VirtIO transport
> + *
> + * @cinfo: SCMI channel info
> + * @xfer: Reference to the transfer being polled for.
> + *
> + * VirtIO core provides a polling mechanism based only on last used indexes:
> + * this means that it is possible to poll the virtqueues waiting for something
> + * new to arrive from the host side but the only way to check if the freshly
> + * arrived buffer was what we were waiting for is to compare the newly arrived
> + * message descriptors with the one we are polling on.
> + *
> + * As a consequence, we may dequeue something different from the buffer we were
> + * poll-waiting for: if that is the case, such early-fetched buffers are then
> + * added to the @pending_cmds_list for later processing by a dedicated deferred
> + * worker.
> + *
> + * So, basically, once something new is spotted we proceed to dequeue all the
> + * freshly received used buffers until we find the one we were polling on, or
> + * until we have 'seemingly' emptied the virtqueue; if some buffers are still
> + * pending in the vqueue at the end of the polling loop (possible due to inherent
> + * races in the virtqueue handling mechanisms), we similarly kick the deferred
> + * worker and let it process those, to avoid looping indefinitely in the
> + * .poll_done helper.
> + *
> + * Note that we do NOT suppress notifications with VIRTQ_USED_F_NO_NOTIFY even
> + * when polling, since such a flag is per-virtqueue and we do not want to
> + * suppress notifications as a whole: so, if the message we are polling for is
> + * instead delivered via the usual IRQ callbacks on another core (with IRQs on),
> + * it will be handled as such by scmi_rx_callback() and the polling loop in the
> + * SCMI Core TX path will be transparently terminated anyway.
> + *
> + * Return: True once polling has successfully completed.
> + */
> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> +			     struct scmi_xfer *xfer)
> +{
> +	bool pending, ret = false;
> +	unsigned int length, any_prefetched = 0;
> +	unsigned long flags;
> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> +
> +	if (!msg)
> +		return true;
> +
> +	spin_lock_irqsave(&msg->poll_lock, flags);
> +	/* Already processed by another polling loop on another CPU ? */
> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> +		return true;
>  	}
> +
> +	/* Has cmdq index moved at all ? */
> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +	spin_unlock_irqrestore(&msg->poll_lock, flags);
> +	if (!pending)
> +		return false;
> +
> +	spin_lock_irqsave(&vioch->lock, flags);
> +	virtqueue_disable_cb(vioch->vqueue);
> +
> +	/*
> +	 * If something arrived we cannot be sure, without dequeueing, whether it
> +	 * was the reply to the xfer we are polling for or a reply to some other,
> +	 * possibly non-polling, pending xfer: process all new messages until
> +	 * the polled-for message is found OR the vqueue is empty.
> +	 */
> +	while ((next_msg = virtqueue_get_buf(vioch->vqueue, &length))) {
> +		next_msg->rx_len = length;
> +		/* Is this the message we were polling for ? */
> +		if (next_msg == msg) {
> +			ret = true;
> +			break;
> +		}
> +
> +		spin_lock(&next_msg->poll_lock);
> +		if (next_msg->poll_idx == VIO_MSG_NOT_POLLED) {
> +			any_prefetched++;
> +			list_add_tail(&next_msg->list,
> +				      &vioch->pending_cmds_list);
> +		} else {
> +			next_msg->poll_idx = VIO_MSG_POLL_DONE;
> +		}
> +		spin_unlock(&next_msg->poll_lock);
> +	}
> +
> +	/*
> +	 * When the polling loop has successfully terminated, anything else that
> +	 * was queued in the meantime will be served by the deferred worker, by
> +	 * the normal IRQ/callback path, or by other poll loops.
> +	 *
> +	 * If instead we are still looking for the polled reply, the polling
> +	 * index has to be updated to the current vqueue last used index.
> +	 */
> +	if (ret) {
> +		pending = !virtqueue_enable_cb(vioch->vqueue);
> +	} else {
> +		spin_lock(&msg->poll_lock);
> +		msg->poll_idx = virtqueue_enable_cb_prepare(vioch->vqueue);
> +		pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> +		spin_unlock(&msg->poll_lock);
> +	}
> +	spin_unlock_irqrestore(&vioch->lock, flags);
> +
> +	if (any_prefetched || pending)
> +		queue_work(vioch->deferred_tx_wq, &vioch->deferred_tx_work);

I don't see any attempt to make sure the queued work is no longer
running on e.g. device or driver removal.
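One possible way to address this, as a sketch only (the helper name below is
hypothetical and this is not what the posted patch does), would be to mark the
channel not ready first and then rely on destroy_workqueue() draining any
already-queued deferred work, assuming the paths that call queue_work() also
check vioch->ready under ready_lock so nothing new can be queued afterwards:

	/*
	 * Hypothetical teardown ordering: stop new deferred work from being
	 * queued, then wait for queued work to finish before the vqueue goes
	 * away. Field names are the ones introduced by the patch above.
	 */
	static void scmi_vio_channel_teardown(struct scmi_vio_channel *vioch)
	{
		unsigned long flags;

		/* Poll loops / callbacks are assumed to test this before queue_work() */
		spin_lock_irqsave(&vioch->ready_lock, flags);
		vioch->ready = false;
		spin_unlock_irqrestore(&vioch->ready_lock, flags);

		if (!vioch->is_rx && vioch->deferred_tx_wq) {
			/* Drains and waits for any queued deferred_tx_work */
			destroy_workqueue(vioch->deferred_tx_wq);
			vioch->deferred_tx_wq = NULL;
		}
	}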


> +
> +	return ret;
>  }
>  
>  static const struct scmi_transport_ops scmi_virtio_ops = {
> @@ -376,6 +611,8 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
>  	.send_message = virtio_send_message,
>  	.fetch_response = virtio_fetch_response,
>  	.fetch_notification = virtio_fetch_notification,
> +	.mark_txdone = virtio_mark_txdone,
> +	.poll_done = virtio_poll_done,
>  };
>  
>  static int scmi_vio_probe(struct virtio_device *vdev)
> @@ -418,6 +655,7 @@ static int scmi_vio_probe(struct virtio_device *vdev)
>  		spin_lock_init(&channels[i].lock);
>  		spin_lock_init(&channels[i].ready_lock);
>  		INIT_LIST_HEAD(&channels[i].free_list);
> +		INIT_LIST_HEAD(&channels[i].pending_cmds_list);
>  		channels[i].vqueue = vqs[i];
>  
>  		sz = virtqueue_get_vring_size(channels[i].vqueue);
> @@ -506,4 +744,5 @@ const struct scmi_desc scmi_virtio_desc = {
>  	.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
>  	.max_msg = 0, /* overridden by virtio_get_max_msg() */
>  	.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
> +	.atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
>  };
> -- 
> 2.17.1


* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
       [not found]         ` <2f1ea794-a0b9-2099-edc0-b2aeb3ca6b92@opensynergy.com>
@ 2022-01-20 20:39           ` Michael S. Tsirkin
       [not found]             ` <20220123200254.GF6113@e120937-lin>
  0 siblings, 1 reply; 3+ messages in thread
From: Michael S. Tsirkin @ 2022-01-20 20:39 UTC (permalink / raw)
  To: Peter Hilber
  Cc: f.fainelli, vincent.guittot, igor.skalkin, sudeep.holla,
	linux-kernel, virtualization, Cristian Marussi, james.quinlan,
	Jonathan.Cameron, souvik.chakravarty, etienne.carriere,
	linux-arm-kernel

On Thu, Jan 20, 2022 at 08:09:56PM +0100, Peter Hilber wrote:
> On 19.01.22 13:23, Cristian Marussi wrote:
> > On Tue, Jan 18, 2022 at 03:21:03PM +0100, Peter Hilber wrote:
> >> On 21.12.21 15:00, Cristian Marussi wrote:
> >>> Add support for .mark_txdone and .poll_done transport operations to SCMI
> >>> VirtIO transport as pre-requisites to enable atomic operations.
> >>>
> >>> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> >>> and atomic mode for selected SCMI transactions while leaving it default
> >>> disabled.
> >>>
> >>
> >> Hi Cristian,
> >>
> >> thanks for the update. I have some more remarks inline below.
> >>
> > 
> > Hi Peter,
> > 
> > thanks for your review, much appreciated, please see my replies online.
> > 
> >> My impression is that the virtio core does not expose helper functions suitable
> >> to busy-poll for used buffers. But changing this might not be difficult. Maybe
> >> more_used() from virtio_ring.c could be exposed via a wrapper?
> >>
> > 
> > While I definitely agree that the virtio core support for polling is far from
> > ideal, some support is provided, and my point was at first to try to implement
> > SCMI virtio polling leveraging what we have now in the core and see if it was
> > attainable (indeed, early in this series I tried to avoid having to support
> > polling at the SCMI transport layer at all in order to attain SCMI cmds
> > atomicity.. but that was an ill-fated attempt that led nowhere good...)
> > 
> > Btw, I was planning to post a new series next week (after the merge window) with
> > some fixes I did already; at this point I'll also include some fixes derived
> > from some of your remarks.
> > 
> >> Best regards,
> >>
> >> Peter
> >>
> [snip]
> >>> + *
> >>> + * Return: True once polling has successfully completed.
> >>> + */
> >>> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> >>> +			     struct scmi_xfer *xfer)
> >>> +{
> >>> +	bool pending, ret = false;
> >>> +	unsigned int length, any_prefetched = 0;
> >>> +	unsigned long flags;
> >>> +	struct scmi_vio_msg *next_msg, *msg = xfer->priv;
> >>> +	struct scmi_vio_channel *vioch = cinfo->transport_info;
> >>> +
> >>> +	if (!msg)
> >>> +		return true;
> >>> +
> >>> +	spin_lock_irqsave(&msg->poll_lock, flags);
> >>> +	/* Processed already by other polling loop on another CPU ? */
> >>> +	if (msg->poll_idx == VIO_MSG_POLL_DONE) {
> >>> +		spin_unlock_irqrestore(&msg->poll_lock, flags);
> >>> +		return true;
> >>> +	}
> >>> +
> >>> +	/* Has cmdq index moved at all ? */
> >>> +	pending = virtqueue_poll(vioch->vqueue, msg->poll_idx);
> >>
> >> In my understanding, the polling comparison could still be subject to the ABA
> >> problem when exactly 2**16 messages have been marked as used since
> >> msg->poll_idx was set (unlikely scenario, granted).
> >>
> >> I think this would be a lot simpler if the virtio core exported some
> >> concurrency-safe helper function for such polling (similar to more_used() from
> >> virtio_ring.c), as discussed at the top.
> > 
> > So this is indeed the main limitation of the current implementation: I
> > cannot distinguish whether there was an exact full wrap and I'm reading the same
> > last_idx as before while a whopping 2**16 messages have instead gone through...
> > 
> > The tricky part here, it seems to me, is that even if we introduced dedicated
> > helpers for polling in order to account for such wrapping (similar to more_used()),
> > those would be based, per the current VirtIO spec, on a single-bit wrap counter,
> > so how do you discern whether 2 whole wraps have happened (even more unlikely..)?
> > 
> > Maybe I'm missing something though...
> > 
> 
> In my understanding, there is no need to keep track of the old state. We
> actually only want to check whether the device has marked any buffers as `used'
> which we did not retrieve yet via virtqueue_get_buf_ctx().
> 
> This is what more_used() checks in my understanding. One would just need to
> translate the external `struct virtqueue' param to the virtio_ring.c internal
> representation `struct vring_virtqueue' and then call `more_used()'.
> 
> There would be no need to keep `poll_idx` then.
> 
> Best regards,
> 
> Peter

Not really, I don't think so.

There's no magic in more_used. No synchronization happens.
more_used is exactly like virtqueue_poll except
you get to maintain your own index.

As it is, it is quite possible to read the cached index,
then another thread makes 2^16 bufs available, then device
uses them all, and presto you get a false positive.

I guess we can play with memory barriers such that cache
read happens after the index read - but it seems that
will just lead to the same wrap around problem
in reverse. So IIUC it's quite a bit more involved than
just translating structures.

And yes, a more_used like API would remove the need to pass
the index around, but it will also obscure the fact that
there's internal state here and that it's inherently racy
wrt wrap arounds. Whereas I'm happy to see that virtqueue_poll
seems to have made it clear enough that people get it.


It's not hard to handle wrap around in the driver if you like though:
just have a 32 bit atomic counter and increment it each time you are
going to make 2^16 buffers available. That gets you to 2^48 with an
overhead of an atomic read and that should be enough short term. Make
sure the cache line where you put the counter is not needed elsewhere -
checking it in a tight loop with an atomic will force it to the local
CPU. And if you are doing that virtqueue_poll will be enough.
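A minimal sketch of that idea on the driver side (hypothetical names; the
cache-line placement advice above and locking details are left aside; only
virtqueue_poll()/virtqueue_enable_cb_prepare() from the virtio core are assumed):

	/* Driver-private wrap tracking; ~2^48 completions before this aliases. */
	struct vio_wrap_state {
		atomic_t epoch;		/* bumped every 2^16 buffers made available */
		u16 added;		/* buffers added since the last epoch bump */
	};

	/*
	 * Called on the submission path, under the same lock that already
	 * serializes virtqueue_add_*() for this virtqueue.
	 */
	static void vio_note_buffer_added(struct vio_wrap_state *ws)
	{
		if (++ws->added == 0)	/* u16 wrapped: another 2^16 buffers added */
			atomic_inc(&ws->epoch);
	}

	/* Poll reference captured together with virtqueue_enable_cb_prepare(). */
	struct vio_poll_ref {
		int epoch;
		unsigned int last_used_idx;
	};

	static bool vio_poll_moved(struct virtqueue *vq,
				   struct vio_wrap_state *ws,
				   struct vio_poll_ref *ref)
	{
		/*
		 * A changed epoch means at least 2^16 buffers went by since the
		 * snapshot, so the 16-bit used index alone cannot be trusted.
		 */
		if (atomic_read(&ws->epoch) != ref->epoch)
			return true;

		return virtqueue_poll(vq, ref->last_used_idx);
	}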



> 
> > I'll have a though about this, but in my opinion this seems something so
> > unlikely that we could live with it, for the moment at least...
> > [snip]


* Re: [PATCH v9 09/11] firmware: arm_scmi: Add atomic mode support to virtio transport
       [not found]             ` <20220123200254.GF6113@e120937-lin>
@ 2022-01-23 22:40               ` Michael S. Tsirkin
  0 siblings, 0 replies; 3+ messages in thread
From: Michael S. Tsirkin @ 2022-01-23 22:40 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: f.fainelli, vincent.guittot, igor.skalkin, sudeep.holla,
	linux-kernel, virtualization, Peter Hilber, james.quinlan,
	Jonathan.Cameron, souvik.chakravarty, etienne.carriere,
	linux-arm-kernel

On Sun, Jan 23, 2022 at 08:02:54PM +0000, Cristian Marussi wrote:
> I was thinking... keeping the current virtqueue_poll interface, since our
> possible issue arises from the used_index wrapping around exactly on top
> of the same polled index, and given that currently the API returns an
> unsigned "opaque" value really carrying just the 16-bit index (and possibly
> the wrap bit as bit15 for packed vq) that is supposed to be fed back as
> it is to the virtqueue_poll() function....
> 
> ...why don't we just keep an internal full-fledged per-virtqueue wrap-counter
> and return that as the upper 16 bits of the opaque value returned by
> virtqueue_enable_cb_prepare, and then check it back in virtqueue_poll when the
> opaque is fed back? (filtering it out from the internal helpers machinery)
> 
> As in the example below the scissors.
> 
> I mean, if the internal wrap count is at that point different from the one
> provided to virtqueue_poll() via the previously returned opaque poll_idx value,
> there is certainly something new to fetch without even looking at the indexes:
> at the same time, exposing an opaque index built as (wraps << 16 | idx)
> implicitly 'binds' each index to a specific wrap-iteration, so they can be
> distinguished (..ok, until the wrap-count upper 16 bits wrap too....but...)
> 
> I am not really extremely familiar with the internals of virtio so I
> could be missing something obvious...feel free to insult me :P
> 
> (..and I have not made any perf measurements or considerations at this
> point....nor considered the redundancy of the existing packed
> used_wrap_counter bit...)
> 
> Thanks,
> Cristian
> 
> ----
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 00f64f2f8b72..bda6af121cd7 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -117,6 +117,8 @@ struct vring_virtqueue {
>         /* Last used index we've seen. */
>         u16 last_used_idx;
>  
> +       u16 wraps;
> +
>         /* Hint for event idx: already triggered no need to disable. */
>         bool event_triggered;
>  
> @@ -806,6 +808,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
>         ret = vq->split.desc_state[i].data;
>         detach_buf_split(vq, i, ctx);
>         vq->last_used_idx++;
> +       if (unlikely(!vq->last_used_idx))
> +               vq->wraps++;

I wonder whether
               vq->wraps += !vq->last_used_idx;
is faster or slower. No branch but OTOH a dependency.


>         /* If we expect an interrupt for the next entry, tell host
>          * by writing event index and flush out the write before
>          * the read in the next get_buf call. */
> @@ -1508,6 +1512,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
>         if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
>                 vq->last_used_idx -= vq->packed.vring.num;
>                 vq->packed.used_wrap_counter ^= 1;
> +               vq->wraps++;
>         }
>  
>         /*
> @@ -1744,6 +1749,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->packed_ring = true;
> @@ -2092,13 +2098,17 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
>   */
>  unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
>  {
> +       unsigned last_used_idx;
>         struct vring_virtqueue *vq = to_vvq(_vq);
>  
>         if (vq->event_triggered)
>                 vq->event_triggered = false;
>  
> -       return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) :
> -                                virtqueue_enable_cb_prepare_split(_vq);
> +       last_used_idx = vq->packed_ring ?
> +                       virtqueue_enable_cb_prepare_packed(_vq) :
> +                       virtqueue_enable_cb_prepare_split(_vq);
> +
> +       return VRING_BUILD_OPAQUE(last_used_idx, vq->wraps);
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
>  
> @@ -2118,9 +2128,13 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
>         if (unlikely(vq->broken))
>                 return false;
>  
> +       if (unlikely(vq->wraps != VRING_GET_WRAPS(last_used_idx)))
> +               return true;
> +
>         virtio_mb(vq->weak_barriers);
> -       return vq->packed_ring ? virtqueue_poll_packed(_vq, last_used_idx) :
> -                                virtqueue_poll_split(_vq, last_used_idx);
> +       return vq->packed_ring ?
> +               virtqueue_poll_packed(_vq, VRING_GET_IDX(last_used_idx)) :
> +                       virtqueue_poll_split(_vq, VRING_GET_IDX(last_used_idx));
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_poll);
>  
> @@ -2245,6 +2259,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>         vq->weak_barriers = weak_barriers;
>         vq->broken = false;
>         vq->last_used_idx = 0;
> +       vq->wraps = 0;
>         vq->event_triggered = false;
>         vq->num_added = 0;
>         vq->use_dma_api = vring_use_dma_api(vdev);
> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> index 476d3e5c0fe7..e6b03017ebd7 100644
> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h
> @@ -77,6 +77,17 @@
>   */
>  #define VRING_PACKED_EVENT_F_WRAP_CTR  15
>  
> +#define VRING_IDX_MASK                                 GENMASK(15, 0)
> +#define VRING_GET_IDX(opaque)                          \
> +       ((u16)FIELD_GET(VRING_IDX_MASK, (opaque)))
> +
> +#define VRING_WRAPS_MASK                               GENMASK(31, 16)
> +#define VRING_GET_WRAPS(opaque)                                \
> +       ((u16)FIELD_GET(VRING_WRAPS_MASK, (opaque)))
> +
> +#define VRING_BUILD_OPAQUE(idx, wraps)                 \
> +       (FIELD_PREP(VRING_WRAPS_MASK, (wraps)) | ((idx) & VRING_IDX_MASK))
> +
>  /* We support indirect buffer descriptors */
>  #define VIRTIO_RING_F_INDIRECT_DESC    28

Yea I think this patch increases the time it takes to wrap around from
2^16 to 2^32 which seems good enough.
Need some comments to explain the logic.
Would be interesting to see perf data.
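
For illustration, a transport-side poll loop would consume the proposed opaque
encoding roughly like this (a sketch that assumes the diff above is applied as
posted; vq is the transport's struct virtqueue and the busy-wait body is purely
illustrative):

	unsigned int opaque;

	/* Snapshot: (wraps << 16 | last_used_idx), per the proposed encoding */
	opaque = virtqueue_enable_cb_prepare(vq);

	/* ... make the polled buffer available and kick the vq ... */

	while (!virtqueue_poll(vq, opaque)) {
		/*
		 * virtqueue_poll() now also reports progress as soon as the
		 * internal wrap counter differs from VRING_GET_WRAPS(opaque),
		 * so a full 2^16 wrap can no longer look like "nothing new".
		 */
		cpu_relax();
	}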

-- 
MST

