Re: [RFC PATCH 01/12] virtio/s390: use vring_create

virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
       [not found] ` <20190404231622.52531-2-pasic@linux.ibm.com>
@ 2019-04-08 11:01   ` Cornelia Huck
  2019-04-08 12:37     ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-08 11:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S . Tsirkin, Farhan Ali, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:11 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> establishes a new way of allocating virtqueues (as a part of the effort
> that taught DMA to virtio rings).
> 
> In the future we will want virtio-ccw to use the DMA API as well.
> 
> Let us switch from the legacy method of allocating virtqueues to
> vring_create_virtqueue() as the first step into that direction.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
>  1 file changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 74c328321889..edf4afe2d688 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c

> @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		err = info->num;
>  		goto out_err;
>  	}
> -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> -	if (info->queue == NULL) {
> -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> -		err = -ENOMEM;
> -		goto out_err;
> -	}
> +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> +				    vdev, true, true, ctx,

This second true means 'may_reduce_num'. Looking at the vring code, it
seems that this parameter is never checked; the code will try to
allocate a smaller queue if it can't get the requested size in any
case... this will probably be a problem for legacy virtio-pci, which
explicitly sets may_reduce_num to false. (I can try to come up with a
patch to fix that.)

> +				    virtio_ccw_kvm_notify, callback, name);
>  
> -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> -				 callback, name);
>  	if (!vq) {
>  		/* For now, we fail if we can't get the requested size. */
>  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		goto out_err;
>  	}
>  
> +

Extra blank line :)

>  	/* Register it with the host. */
> +	queue = virtqueue_get_desc_addr(vq);
>  	if (vcdev->revision == 0) {
> -		info->info_block->l.queue = (__u64)info->queue;
> +		info->info_block->l.queue = queue;
>  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
>  		info->info_block->l.index = i;
>  		info->info_block->l.num = info->num;

You always fill in the size requested by the host, but the actual size
may be smaller (see above). I don't think that is allowed for revision
0 (which implies !virtio-1). You probably need to call
vring_create_virtqueue with may_reduce_num=false for revision 0 (and
wait for the generic vring code to be fixed...)

>  		ccw->count = sizeof(info->info_block->l);
>  	} else {
> -		info->info_block->s.desc = (__u64)info->queue;
> +		info->info_block->s.desc = queue;
>  		info->info_block->s.index = i;
>  		info->info_block->s.num = info->num;

Here, you need to obtain the actual number via
virtqueue_get_vring_size().

>  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
  2019-04-08 11:01   ` [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue Cornelia Huck
@ 2019-04-08 12:37     ` Michael S. Tsirkin
  0 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2019-04-08 12:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Halil Pasic,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Mon, Apr 08, 2019 at 01:01:28PM +0200, Cornelia Huck wrote:
> On Fri,  5 Apr 2019 01:16:11 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > establishes a new way of allocating virtqueues (as a part of the effort
> > that taught DMA to virtio rings).
> > 
> > In the future we will want virtio-ccw to use the DMA API as well.
> > 
> > Let us switch from the legacy method of allocating virtqueues to
> > vring_create_virtqueue() as the first step into that direction.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
> >  1 file changed, 8 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 74c328321889..edf4afe2d688 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> 
> > @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		err = info->num;
> >  		goto out_err;
> >  	}
> > -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> > -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> > -	if (info->queue == NULL) {
> > -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> > -		err = -ENOMEM;
> > -		goto out_err;
> > -	}
> > +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> > +				    vdev, true, true, ctx,
> 
> This second true means 'may_reduce_num'. Looking at the vring code, it
> seems that this parameter is never checked; the code will try to
> allocate a smaller queue if it can't get the requested size in any
> case... this will probably be a problem for legacy virtio-pci, which
> explicitly sets may_reduce_num to false. (I can try to come up with a
> patch to fix that.)

Yes, pls do. Not too late for a bugfix to go into the current linux.

> > +				    virtio_ccw_kvm_notify, callback, name);
> >  
> > -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> > -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> > -				 callback, name);
> >  	if (!vq) {
> >  		/* For now, we fail if we can't get the requested size. */
> >  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> > @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		goto out_err;
> >  	}
> >  
> > +
> 
> Extra blank line :)
> 
> >  	/* Register it with the host. */
> > +	queue = virtqueue_get_desc_addr(vq);
> >  	if (vcdev->revision == 0) {
> > -		info->info_block->l.queue = (__u64)info->queue;
> > +		info->info_block->l.queue = queue;
> >  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
> >  		info->info_block->l.index = i;
> >  		info->info_block->l.num = info->num;
> 
> You always fill in the size requested by the host, but the actual size
> may be smaller (see above). I don't think that is allowed for revision
> 0 (which implies !virtio-1). You probably need to call
> vring_create_virtqueue with may_reduce_num=false for revision 0 (and
> wait for the generic vring code to be fixed...)
> 
> >  		ccw->count = sizeof(info->info_block->l);
> >  	} else {
> > -		info->info_block->s.desc = (__u64)info->queue;
> > +		info->info_block->s.desc = queue;
> >  		info->info_block->s.index = i;
> >  		info->info_block->s.num = info->num;
> 
> Here, you need to obtain the actual number via
> virtqueue_get_vring_size().
> 
> >  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
       [not found] ` <20190404231622.52531-3-pasic@linux.ibm.com>
@ 2019-04-09  9:57   ` Cornelia Huck
       [not found]     ` <20190409132927.5df3bc50@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09  9:57 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:12 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Currently we have a problem if a virtio-ccw device has
> VIRTIO_F_IOMMU_PLATFORM. 

Can you please describe what the actual problem is?

> In future we do want to support DMA API with
> virtio-ccw.
> 
> Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> with virtio-ccw.
> 
> Let us also switch from legacy avail/used accessors to the DMA aware
> ones (even if it isn't strictly necessary).

I think with this change we can remove the legacy accessors, if I
didn't mis-grep.

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index edf4afe2d688..5956c9e820bb 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -66,6 +66,7 @@ struct virtio_ccw_device {
>  	bool device_lost;
>  	unsigned int config_ready;
>  	void *airq_info;
> +	__u64 dma_mask;

u64?

>  };
>  
>  struct vq_info_block_legacy {
> @@ -536,8 +537,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		info->info_block->s.desc = queue;
>  		info->info_block->s.index = i;
>  		info->info_block->s.num = info->num;
> -		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
> -		info->info_block->s.used = (__u64)virtqueue_get_used(vq);
> +		info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
> +		info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
>  		ccw->count = sizeof(info->info_block->s);
>  	}
>  	ccw->cmd_code = CCW_CMD_SET_VQ;
> @@ -769,10 +770,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
>  static void ccw_transport_features(struct virtio_device *vdev)
>  {
>  	/*
> -	 * Packed ring isn't enabled on virtio_ccw for now,
> -	 * because virtio_ccw uses some legacy accessors,
> -	 * e.g. virtqueue_get_avail() and virtqueue_get_used()
> -	 * which aren't available in packed ring currently.
> +	 * There shouldn't be anything that precludes supporting paced.

s/paced/packed/

> +	 * TODO: Remove the limitation after having another look into this.
>  	 */
>  	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
>  }
> @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  		ret = -ENOMEM;
>  		goto out_free;
>  	}
> +	vcdev->vdev.dev.parent = &cdev->dev;

That one makes sense, pci and mmio are doing that as well.

> +	cdev->dev.dma_mask = &vcdev->dma_mask;

That one feels a bit weird. Will this change in one of the follow-on
patches? (Have not yet looked at the whole series.)

> +
> +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> +	if (ret)
> +		ret = dma_set_mask_and_coherent(&cdev->dev,
> +						DMA_BIT_MASK(32));
> +	if (ret) {
> +		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");

This does not look like you'd try to continue?

> +		goto out_free;
> +	}
> +
>  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
>  				   GFP_DMA | GFP_KERNEL);
>  	if (!vcdev->config_block) {

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
       [not found] ` <20190404231622.52531-4-pasic@linux.ibm.com>
@ 2019-04-09 10:16   ` Cornelia Huck
       [not found]     ` <20190409125416.73713f23@oc2783563651>
  2019-04-09 12:22   ` Christoph Hellwig
  1 sibling, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09 10:16 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:13 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On s390 protected virtualization guests also have to use bounce I/O
> buffers.  That requires some plumbing.
> 
> Let us make sure any device using DMA API accordingly is spared from the
> problems that hypervisor attempting I/O to a non-shared secure page would
> bring.

I have problems parsing this sentence :(

Do you mean that we want to exclude pages for I/O from encryption?

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/Kconfig                   |  4 ++++
>  arch/s390/include/asm/Kbuild        |  1 -
>  arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
>  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
>  arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 79 insertions(+), 1 deletion(-)
>  create mode 100644 arch/s390/include/asm/dma-mapping.h
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h

(...)

> @@ -126,6 +129,45 @@ void mark_rodata_ro(void)
>  	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
>  }
>  
> +int set_memory_encrypted(unsigned long addr, int numpages)
> +{
> +	/* also called for the swiotlb bounce buffers, make all pages shared */
> +	/* TODO: do ultravisor calls */
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> +
> +int set_memory_decrypted(unsigned long addr, int numpages)
> +{
> +	/* also called for the swiotlb bounce buffers, make all pages shared */
> +	/* TODO: do ultravisor calls */
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> +
> +/* are we a protected virtualization guest? */
> +bool sev_active(void)
> +{
> +	/*
> +	 * TODO: Do proper detection using ultravisor, for now let us fake we
> +	 *  have it so the code gets exercised.

That's the swiotlb stuff, right?

(The patches will obviously need some reordering before it is actually
getting merged.)

> +	 */
> +	return true;
> +}
> +EXPORT_SYMBOL_GPL(sev_active);
> +
> +/* protected virtualization */
> +static void pv_init(void)
> +{
> +	if (!sev_active())
> +		return;
> +
> +	/* make sure bounce buffers are shared */
> +	swiotlb_init(1);
> +	swiotlb_update_mem_attributes();
> +	swiotlb_force = SWIOTLB_FORCE;
> +}
> +
>  void __init mem_init(void)
>  {
>  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> @@ -134,6 +176,8 @@ void __init mem_init(void)
>  	set_max_mapnr(max_low_pfn);
>          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>  
> +	pv_init();
> +
>  	/* Setup guest page hinting */
>  	cmma_init();
>  

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
       [not found] ` <20190404231622.52531-5-pasic@linux.ibm.com>
@ 2019-04-09 10:44   ` Cornelia Huck
       [not found]     ` <20190409141114.7dcce94a@oc2783563651>
  2019-04-11 18:25   ` Sebastian Ott
  1 sibling, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09 10:44 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:14 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> To support protected virtualization cio will need to make sure the
> memory used for communication with the hypervisor is DMA memory.
> 
> Let us introduce a DMA pool to cio that will help us in allocating

missing 'and'

> deallocating those chunks of memory.
> 
> We use a gen_pool backed with DMA pages to avoid each allocation
> effectively wasting a page, as we typically allocate much less
> than PAGE_SIZE.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/Kconfig           |  1 +
>  arch/s390/include/asm/cio.h |  3 +++
>  drivers/s390/cio/css.c      | 57 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 61 insertions(+)
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 46c69283a67b..e8099ab47368 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -194,6 +194,7 @@ config S390
>  	select VIRT_TO_BUS
>  	select HAVE_NMI
>  	select SWIOTLB
> +	select GENERIC_ALLOCATOR
>  
>  
>  config SCHED_OMIT_FRAME_POINTER
> diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
> index 1727180e8ca1..4510e418614a 100644
> --- a/arch/s390/include/asm/cio.h
> +++ b/arch/s390/include/asm/cio.h
> @@ -328,6 +328,9 @@ static inline u8 pathmask_to_pos(u8 mask)
>  void channel_subsystem_reinit(void);
>  extern void css_schedule_reprobe(void);
>  
> +extern void *cio_dma_zalloc(size_t size);
> +extern void cio_dma_free(void *cpu_addr, size_t size);
> +
>  /* Function from drivers/s390/cio/chsc.c */
>  int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
>  int chsc_sstpi(void *page, void *result, size_t size);
> diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> index aea502922646..72629d99d8e4 100644
> --- a/drivers/s390/cio/css.c
> +++ b/drivers/s390/cio/css.c
> @@ -20,6 +20,8 @@
>  #include <linux/reboot.h>
>  #include <linux/suspend.h>
>  #include <linux/proc_fs.h>
> +#include <linux/genalloc.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/isc.h>
>  #include <asm/crw.h>
>  
> @@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
>  	NULL,
>  };
>  
> +static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
> +
>  static int __init setup_css(int nr)
>  {
>  	struct channel_subsystem *css;
> @@ -899,6 +903,9 @@ static int __init setup_css(int nr)
>  	dev_set_name(&css->device, "css%x", nr);
>  	css->device.groups = cssdev_attr_groups;
>  	css->device.release = channel_subsystem_release;
> +	/* some cio DMA memory needs to be 31 bit addressable */
> +	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
> +	css->device.dma_mask = &css_dev_dma_mask;

Question: Does this percolate down to the child devices eventually?
E.g., you have a ccw_device getting the mask from its parent
subchannel, which gets it from its parent css? If so, does that clash
with the the mask you used for the virtio_ccw_device in a previous
patch? Or are they two really different things?

>  
>  	mutex_init(&css->mutex);
>  	css->cssid = chsc_get_cssid(nr);
> @@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
>  	.notifier_call = css_power_event,
>  };
>  
> +#define POOL_INIT_PAGES 1
> +static struct gen_pool *cio_dma_pool;
> +/* Currently cio supports only a single css */
> +static struct device *cio_dma_css;

That global variable feels wrong, especially if you plan to support
MCSS-E in the future. (Do you? :) If yes, should the dma pool be global
or per-css? As css0 currently is the root device for the channel
subsystem stuff, you'd either need a new parent to hang this off from
or size this with the number of css images.)

For now, just grab channel_subsystems[0]->device directly?

> +static gfp_t  cio_dma_flags;
> +
> +static void __init cio_dma_pool_init(void)
> +{
> +	void *cpu_addr;
> +	dma_addr_t dma_addr;
> +	int i;
> +
> +	cio_dma_css = &channel_subsystems[0]->device;
> +	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
> +	cio_dma_pool = gen_pool_create(3, -1);
> +	/* No need to free up the resources: compiled in */
> +	for (i = 0; i < POOL_INIT_PAGES; ++i) {
> +		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
> +					      cio_dma_flags);
> +		if (!cpu_addr)
> +			return;
> +		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
> +				  dma_addr, PAGE_SIZE, -1);
> +	}
> +
> +}
> +
> +void *cio_dma_zalloc(size_t size)
> +{
> +	dma_addr_t dma_addr;
> +	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
> +
> +	if (!addr) {
> +		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
> +					PAGE_SIZE, &dma_addr, cio_dma_flags);
> +		if (!addr)
> +			return NULL;
> +		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
> +		addr = gen_pool_alloc(cio_dma_pool, size);
> +	}
> +	return (void *) addr;

At this point, you're always going via the css0 device. I'm wondering
whether you should pass in the cssid here and use css_by_id(cssid) to
make this future proof. But then, I'm not quite clear from which
context this will be called.

> +}
> +
> +void cio_dma_free(void *cpu_addr, size_t size)
> +{
> +	memset(cpu_addr, 0, size);
> +	gen_pool_free(cio_dma_pool, (unsigned long) cpu_addr, size);
> +}
> +
>  /*
>   * Now that the driver core is running, we can setup our channel subsystem.
>   * The struct subchannel's are created during probing.
> @@ -1063,6 +1119,7 @@ static int __init css_bus_init(void)
>  		unregister_reboot_notifier(&css_reboot_notifier);
>  		goto out_unregister;
>  	}
> +	cio_dma_pool_init();
>  	css_init_done = 1;
>  
>  	/* Enable default isc for I/O subchannels. */

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
       [not found] ` <20190404231622.52531-4-pasic@linux.ibm.com>
  2019-04-09 10:16   ` [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization Cornelia Huck
@ 2019-04-09 12:22   ` Christoph Hellwig
  1 sibling, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2019-04-09 12:22 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Cornelia Huck, virtualization, Martin Schwidefsky,
	Farhan Ali, Viktor Mihajlovski, Janosch Frank

> +++ b/arch/s390/include/asm/dma-mapping.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_S390_DMA_MAPPING_H
> +#define _ASM_S390_DMA_MAPPING_H
> +
> +#include <linux/dma-contiguous.h>
> +
> +static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
> +{
> +	return NULL;
> +}
> +
> +#endif /* _ASM_S390_DMA_MAPPING_H */
> +

Congratulations!  You ust create an entirely pointless duplicate of
include/asm-generic/dma-mapping.h.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
       [not found]     ` <20190409132927.5df3bc50@oc2783563651>
@ 2019-04-09 13:01       ` Cornelia Huck
       [not found]         ` <20190409152313.0296e8f1@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09 13:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 13:29:27 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 11:57:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:12 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > Currently we have a problem if a virtio-ccw device has
> > > VIRTIO_F_IOMMU_PLATFORM.   
> > 
> > Can you please describe what the actual problem is?
> >   
> 
> Without this patch:
> 
> WARNING: CPU: 2 PID: 26
> at [..]/kernel/dma/mapping.c:251
> dma_alloc_attrs+0x8e/0xd0 Modules linked in: CPU: 2 PID: 26 Comm:
> kworker/u6:1 Not tainted 5.1.0-rc3-00023-g1ec89ec #596 Hardware name:
> IBM 2964 NC9 712 (KVM/Linux) Workqueue: events_unbound async_run_entry_fn
> Krnl PSW : 0704c00180000000 000000000021b18e (dma_alloc_attrs+0x8e/0xd0)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000
> 0000000000001406 000003e00040f838 0000000000002dc0 0000000000000100
> 0000000000000001 0000000000001000 000000000236f028 000003e00040f838
> 0000000000001406 000000004b289828 0000000000000080 000003e00040f6f8
> 000003e00040f6a0 Krnl Code: 000000000021b17e: f0e00004ebaf
> srp	4(15,%r0),2991(%r14),0 000000000021b184: f0c0000407f4
> srp	4(13,%r0),2036,0 #000000000021b18a: a7f40001
> brc	15,21b18c >000000000021b18e: ec5520bc0055	risbg
> %r5,%r5,32,188,0 000000000021b194: b9020011		ltgr
> %r1,%r1 000000000021b198: a784ffd9		brc	8,21b14a
>            000000000021b19c: e31010000002	ltg	%r1,0(%r1)
>            000000000021b1a2: a7840012		brc	8,21b1c6
> Call Trace:
> ([<0000000000000004>] 0x4)
>  [<00000000007a7d54>] vring_alloc_queue+0x74/0x90 
>  [<00000000007a8390>] vring_create_virtqueue+0xf8/0x288 
>  [<0000000000919ec0>] virtio_ccw_find_vqs+0xf8/0x950 
>  [<000000000080772e>] init_vq+0x16e/0x318 
>  [<00000000008087c4>] virtblk_probe+0xf4/0xb58 
>  [<00000000007a62a6>] virtio_dev_probe+0x1a6/0x250 
>  [<00000000007ea498>] really_probe+0x1c8/0x290 
>  [<00000000007ea746>] driver_probe_device+0x86/0x160 
>  [<00000000007e7cba>] bus_for_each_drv+0x7a/0xc0 
>  [<00000000007ea23c>] __device_attach+0xfc/0x180 
>  [<00000000007e9116>] bus_probe_device+0xae/0xc8 
>  [<00000000007e5066>] device_add+0x3fe/0x698 
>  [<00000000007a5d92>] register_virtio_device+0xca/0x120 
>  [<00000000009195a2>] virtio_ccw_online+0x1b2/0x220 
>  [<000000000089853e>] ccw_device_set_online+0x1d6/0x4d8 
>  [<0000000000918cf6>] virtio_ccw_auto_online+0x26/0x58 
>  [<00000000001a61b6>] async_run_entry_fn+0x5e/0x158 
>  [<0000000000199322>] process_one_work+0x25a/0x668 
>  [<000000000019977a>] worker_thread+0x4a/0x428 
>  [<00000000001a1ae8>] kthread+0x150/0x170 
>  [<0000000000aeab3a>] kernel_thread_starter+0x6/0xc 
>  [<0000000000aeab34>] kernel_thread_starter+0x0/0xc 
> 
> [..]
> 
> virtio_ccw 0.0.0301: no vq
> ---[ end trace d35815958c12cad3 ]---
> virtio_ccw 0.0.0300: no vq
> virtio_blk: probe of virtio1 failed with error -12
> virtio_blk: probe of virtio3 failed with error -12
> 
> Means virtio devices broken.
> 
> Should I
> s/we have a problem if a virtio-ccw device/virtio-ccw devices do not work if the device/
> ?

Much better :)

(That is, this happens if we switch on the feature bit in the
hypervisor, right?)

> 
> > > In future we do want to support DMA API with
> > > virtio-ccw.
> > > 
> > > Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> > > with virtio-ccw.
> > > 
> > > Let us also switch from legacy avail/used accessors to the DMA aware
> > > ones (even if it isn't strictly necessary).  
> > 
> > I think with this change we can remove the legacy accessors, if I
> > didn't mis-grep.
> >   
> 
> That is possible, I can do that in v1.

Sounds good.

> 
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
> > >  1 file changed, 17 insertions(+), 6 deletions(-)

(...)

> > > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > >  		ret = -ENOMEM;
> > >  		goto out_free;
> > >  	}
> > > +	vcdev->vdev.dev.parent = &cdev->dev;  
> > 
> > That one makes sense, pci and mmio are doing that as well.
> >   
> > > +	cdev->dev.dma_mask = &vcdev->dma_mask;  
> > 
> > That one feels a bit weird. Will this change in one of the follow-on
> > patches? (Have not yet looked at the whole series.)  
> 
> I don't thinks so. Do you mean this should happen within the cio code?
> I think I started out with the idea to keep the scope as narrow as
> possible. Do you have any suggestions?

From what I see, you set the mask from the virtio-ccw side, then
propagate it up to the general ccw_device, and then the generic virtio
code will fetch it from the ccw_device. Don't you potentially need
something for other ccw_devices in that protected hipervisor case as
well (e.g for 3270)?

> 
> >   
> > > +
> > > +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> > > +	if (ret)
> > > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > > +						DMA_BIT_MASK(32));
> > > +	if (ret) {
> > > +		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");  
> > 
> > This does not look like you'd try to continue?
> >   
> 
> I remember now. First I did continue, then I changed this to fail hard
> so I can not ignore any such problems while smoke testing ('I don't always
> check the kernel messages'), but kept the old message. This basically
> should not fail anyway, otherwise we have a problem AFAIU.
> 
> By the way virtio-pci tries to continue indeed, and this is also where
> the wording comes from ;).
> 
> What would you prefer? Try to continue or fail right away?

If it does not have a chance of working properly in the general case,
I'd fail.

> 
> Regards,
> Halil
> 
> > > +		goto out_free;
> > > +	}
> > > +
> > >  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
> > >  				   GFP_DMA | GFP_KERNEL);
> > >  	if (!vcdev->config_block) {  
> >   
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
       [not found]         ` <20190409152313.0296e8f1@oc2783563651>
@ 2019-04-09 15:47           ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09 15:47 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 15:23:13 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 15:01:20 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Tue, 9 Apr 2019 13:29:27 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Tue, 9 Apr 2019 11:57:43 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri,  5 Apr 2019 01:16:12 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > > > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > > > >  		ret = -ENOMEM;
> > > > >  		goto out_free;
> > > > >  	}
> > > > > +	vcdev->vdev.dev.parent = &cdev->dev;    
> > > > 
> > > > That one makes sense, pci and mmio are doing that as well.
> > > >     
> > > > > +	cdev->dev.dma_mask = &vcdev->dma_mask;    
> > > > 
> > > > That one feels a bit weird. Will this change in one of the follow-on
> > > > patches? (Have not yet looked at the whole series.)    
> > > 
> > > I don't thinks so. Do you mean this should happen within the cio code?
> > > I think I started out with the idea to keep the scope as narrow as
> > > possible. Do you have any suggestions?  
> > 
> > From what I see, you set the mask from the virtio-ccw side, then
> > propagate it up to the general ccw_device, and then the generic virtio
> > code will fetch it from the ccw_device.  
> 
> Right! For some reason dma_mask is a pointer. And I need virtio core to
> use a sane value for virtio_ccw devices.
> 
> > Don't you potentially need
> > something for other ccw_devices in that protected hipervisor case as
> > well (e.g for 3270)?  
> 
> 
> Maybe, maybe not. The first stage is likely to be virito only. I would
> prefer sorting out stuff like 3270 as the need arises. Also see my
> response to patch 4 (Message-Id: <20190409141114.7dcce94a@oc2783563651>).

As long as the infrastructure is flexible enough to be extended later,
ok. I still need to read that mail.

> 
> >   
> > >   
> > > >     
> > > > > +
> > > > > +	ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > > DMA_BIT_MASK(64));
> > > > > +	if (ret)
> > > > > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > > +
> > > > > DMA_BIT_MASK(32));
> > > > > +	if (ret) {
> > > > > +		dev_warn(&cdev->dev, "Failed to enable 64-bit
> > > > > or 32-bit DMA.  Trying to continue, but this might not
> > > > > work.\n");    
> > > > 
> > > > This does not look like you'd try to continue?
> > > >     
> > > 
> > > I remember now. First I did continue, then I changed this to fail
> > > hard so I can not ignore any such problems while smoke testing ('I
> > > don't always check the kernel messages'), but kept the old message.
> > > This basically should not fail anyway, otherwise we have a problem
> > > AFAIU.
> > > 
> > > By the way virtio-pci tries to continue indeed, and this is also
> > > where the wording comes from ;).
> > > 
> > > What would you prefer? Try to continue or fail right away?  
> > 
> > If it does not have a chance of working properly in the general case,
> > I'd fail.
> >   
> 
> Agreed! I will make it so. Would dropping '  Trying to continue, but
> this might not work.' from the warning message work for you?

Sounds fine.

> 
> I could also drop the attempt to set a 32 bit mask if you agree. Do you?

Only if you also drop it from the message as well ;)

Not sure in what cases you'll fail to set a 64 bit mask, but succeed
with a 32 bit mask. If there's no sensible situation where that might
happen, I'd just go ahead and drop it.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
       [not found]     ` <20190409141114.7dcce94a@oc2783563651>
@ 2019-04-09 17:14       ` Cornelia Huck
       [not found]         ` <20190410173148.067555dc@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:14 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 14:11:14 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 12:44:58 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:14 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > @@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
> > >  	NULL,
> > >  };
> > >  
> > > +static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
> > > +
> > >  static int __init setup_css(int nr)
> > >  {
> > >  	struct channel_subsystem *css;
> > > @@ -899,6 +903,9 @@ static int __init setup_css(int nr)
> > >  	dev_set_name(&css->device, "css%x", nr);
> > >  	css->device.groups = cssdev_attr_groups;
> > >  	css->device.release = channel_subsystem_release;
> > > +	/* some cio DMA memory needs to be 31 bit addressable */
> > > +	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
> > > +	css->device.dma_mask = &css_dev_dma_mask;  
> > 
> > Question: Does this percolate down to the child devices eventually?
> > E.g., you have a ccw_device getting the mask from its parent
> > subchannel, which gets it from its parent css? If so, does that clash
> > with the the mask you used for the virtio_ccw_device in a previous
> > patch? Or are they two really different things?
> >   
> 
> AFAIK id does not percolate. I could not find anything showing in this
> direction in drivers core at least. AFAIU dma_mask is also about the
> addressing limitations of the device itself (in the good old pci world,
> not really applicable for ccw devices).
> 
> Regarding virtio_ccw, I need to set the DMA stuff on the parent device
> because that is how the stuff in virtio_ring.c works. There I we can
> get away with DMA_BIT_MASK(64) as that stuff is not used in places where
> 31 bit addresses are required. For the actual ccw stuff I used the
> cio DMA pool introduced here.

Ok, so those are two different users then.

> 
> FWIW the allocation mechanisms provided by  DMA API (as documented) is
> not very well suited with what we need for ccw. This is why I choose to
> do this gen_pool thing. The gist of it is as-is at the moment we get
> page granularity allocations from DMA API. Of course we could use
> dma_pool which is kind of perfect for uniformly sized objects. As you
> will see in the rest of the series, we have a comparatively small number
> of smallish objects of varying sizes. And some of these are short lived.
> 
> So the child devices like subchannels and ccw devices do not use DMA API
> directly, except for virtio_ccw.
> 
> Does all that make any sense to you?

I think I need to read more of the series, but that should be enough
info to get me started :)

> 
> 
> > >  
> > >  	mutex_init(&css->mutex);
> > >  	css->cssid = chsc_get_cssid(nr);
> > > @@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
> > >  	.notifier_call = css_power_event,
> > >  };
> > >  
> > > +#define POOL_INIT_PAGES 1
> > > +static struct gen_pool *cio_dma_pool;
> > > +/* Currently cio supports only a single css */
> > > +static struct device *cio_dma_css;  
> > 
> > That global variable feels wrong, especially if you plan to support
> > MCSS-E in the future. (Do you? :)   
> 
> Not that I'm aware of any plans to add support MCSS-E.
> 
> > If yes, should the dma pool be global
> > or per-css? As css0 currently is the root device for the channel
> > subsystem stuff, you'd either need a new parent to hang this off from
> > or size this with the number of css images.)
> > 
> > For now, just grab channel_subsystems[0]->device directly?
> >   
> 
> Patch 6 gets rid of this variable and adds an accessor instead:
> 
> +struct device *cio_get_dma_css_dev(void)
> +{
> +	return &channel_subsystems[0]->device;
> +}
> +
> 
> I can move that here if you like (for v1).

An accessor looks more sane than a global variable, yes.

> 
> Yes it is kind of unfortunate that some pieces of this code look
> like there could be more than one css, but actually MCSS-E is not
> supported. I would prefer sorting these problems out when they arise, if
> they ever arise.

As long as it's not too unreasonable, I think we should keep the
infrastructure for multiple css instances in place.

> 
> > > +static gfp_t  cio_dma_flags;
> > > +
> > > +static void __init cio_dma_pool_init(void)
> > > +{
> > > +	void *cpu_addr;
> > > +	dma_addr_t dma_addr;
> > > +	int i;
> > > +
> > > +	cio_dma_css = &channel_subsystems[0]->device;
> > > +	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
> > > +	cio_dma_pool = gen_pool_create(3, -1);
> > > +	/* No need to free up the resources: compiled in */
> > > +	for (i = 0; i < POOL_INIT_PAGES; ++i) {
> > > +		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
> > > +					      cio_dma_flags);
> > > +		if (!cpu_addr)
> > > +			return;
> > > +		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
> > > +				  dma_addr, PAGE_SIZE, -1);
> > > +	}
> > > +
> > > +}
> > > +
> > > +void *cio_dma_zalloc(size_t size)
> > > +{
> > > +	dma_addr_t dma_addr;
> > > +	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
> > > +
> > > +	if (!addr) {
> > > +		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
> > > +					PAGE_SIZE, &dma_addr, cio_dma_flags);
> > > +		if (!addr)
> > > +			return NULL;
> > > +		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
> > > +		addr = gen_pool_alloc(cio_dma_pool, size);
> > > +	}
> > > +	return (void *) addr;  
> > 
> > At this point, you're always going via the css0 device. I'm wondering
> > whether you should pass in the cssid here and use css_by_id(cssid) to
> > make this future proof. But then, I'm not quite clear from which
> > context this will be called.  
> 
> As said before I don't see MCSS-E coming. Regarding the client code,
> please check out the rest of the series. Especially patch 6. 
> 
> From my perspective it would be at this stage saner to make it more
> obvious that only one css is supported (at the moment), than to implement
> MCSS-E, or to make this (kind of) MCSS-E ready.

I disagree. I think there's value in keeping the interfaces clean
(within reasonable bounds, of course.) Even if there is no
implementation of MCSS-E other than QEMU... it is probably a good idea
to spend some brain cycles to make this conceptually clean.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
       [not found]     ` <20190409125416.73713f23@oc2783563651>
@ 2019-04-09 17:18       ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:18 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 12:54:16 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 12:16:47 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:13 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On s390 protected virtualization guests also have to use bounce I/O
> > > buffers.  That requires some plumbing.
> > > 
> > > Let us make sure any device using DMA API accordingly is spared from the  
>                     ^,                                   ^,
> Maybe this helps...
> 
> > > problems that hypervisor attempting I/O to a non-shared secure page would
> > > bring.  
> > 
> > I have problems parsing this sentence :(
> > 
> > Do you mean that we want to exclude pages for I/O from encryption?  
> 
> The intended meaning is:
> * Devices that do use DMA API (properly) to get get/map the memory
>   that is used to talk to hypervisor should be OK with PV (protected
>   virtualizaton). I.e. for such devices PV or not PV is basically
>   transparent.
> * But if a device does not use DMA API for the memory that is used to
>   talk to the hypervisor this patch won't help.
> 
> And yes the gist of it is: memory accessed by the hypervisor needs to
> be on pages excluded from protection (which in case of PV is technically
> not encryption).
> 
> Does that help?

Hm, let me sleep on this. The original sentence was a bit too
convoluted for me...

> 
> >   
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  arch/s390/Kconfig                   |  4 ++++
> > >  arch/s390/include/asm/Kbuild        |  1 -
> > >  arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
> > >  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
> > >  arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
> > >  5 files changed, 79 insertions(+), 1 deletion(-)
> > >  create mode 100644 arch/s390/include/asm/dma-mapping.h
> > >  create mode 100644 arch/s390/include/asm/mem_encrypt.h  
> > 
> > (...)
> >   
> > > @@ -126,6 +129,45 @@ void mark_rodata_ro(void)
> > >  	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
> > >  }
> > >  
> > > +int set_memory_encrypted(unsigned long addr, int numpages)
> > > +{
> > > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > > +	/* TODO: do ultravisor calls */
> > > +	return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> > > +
> > > +int set_memory_decrypted(unsigned long addr, int numpages)
> > > +{
> > > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > > +	/* TODO: do ultravisor calls */
> > > +	return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> > > +
> > > +/* are we a protected virtualization guest? */
> > > +bool sev_active(void)
> > > +{
> > > +	/*
> > > +	 * TODO: Do proper detection using ultravisor, for now let us fake we
> > > +	 *  have it so the code gets exercised.  
> > 
> > That's the swiotlb stuff, right?
> >   
> 
> You mean 'That' == code to get exercised == 'swiotlb stuff'?
> 
> If yes then the answer is kind of. The swiotlb (i.e. bounce buffers) is
> when we map (like we map the buffers pointed to by the descriptors in
> case of the virtio ring). The other part of it is the memory allocated
> as DMA coherent (i.e. the virtio ring (desc, avail used) itself).

Ok.

> 
> > (The patches will obviously need some reordering before it is actually
> > getting merged.)
> >   
> 
> What do you mean by reordering?
> 
> One reason why this is an early RFC is the missing dependency (i.e. the
> stuff described by most of the TODO comments). As pointed out in the
> cover letter. Another reason is that I wanted to avoid putting a lots of
> effort into fine-polishing before clarifying the getting some feedback
> on the basics from the community. ;)

Sure. I'm just reading top-down and unconditionally enabling this is
something that obviously needs to be changed in later iterations ;)

> 
> 
> > > +	 */
> > > +	return true;
> > > +}
> > > +EXPORT_SYMBOL_GPL(sev_active);
> > > +
> > > +/* protected virtualization */
> > > +static void pv_init(void)
> > > +{
> > > +	if (!sev_active())
> > > +		return;
> > > +
> > > +	/* make sure bounce buffers are shared */
> > > +	swiotlb_init(1);
> > > +	swiotlb_update_mem_attributes();
> > > +	swiotlb_force = SWIOTLB_FORCE;
> > > +}
> > > +
> > >  void __init mem_init(void)
> > >  {
> > >  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> > > @@ -134,6 +176,8 @@ void __init mem_init(void)
> > >  	set_max_mapnr(max_low_pfn);
> > >          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
> > >  
> > > +	pv_init();
> > > +
> > >  	/* Setup guest page hinting */
> > >  	cmma_init();
> > >    
> >   
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
       [not found] ` <20190404231622.52531-6-pasic@linux.ibm.com>
@ 2019-04-09 17:55   ` Cornelia Huck
       [not found]     ` <20190410021044.4da3e847@oc2783563651>
  2019-04-11 14:15   ` Sebastian Ott
  1 sibling, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:55 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:15 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Virtio-ccw relies on cio mechanisms for bootstrapping the ccw device.

Well, a ccw device is, by definition, using cio mechanisms ;)

Better say: "As virtio-ccw devices are channel devices, we need to use
the dma area for any communication with the hypervisor."
Or something like that.

> Thus we need to make sure any memory that is used for communication with
> the hypervisor is shared.

In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
Hypervisors implement protected virtualization, we probably need to
make sure that all common I/O layer control blocks are in the dma area
(including e.g. QDIO), not just what virtio-ccw devices use.

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/cio/ccwreq.c        |  8 +++----
>  drivers/s390/cio/device.c        | 46 +++++++++++++++++++++++++++++++---------
>  drivers/s390/cio/device_fsm.c    | 40 +++++++++++++++++-----------------
>  drivers/s390/cio/device_id.c     | 18 ++++++++--------
>  drivers/s390/cio/device_ops.c    |  4 ++--
>  drivers/s390/cio/device_pgid.c   | 20 ++++++++---------
>  drivers/s390/cio/device_status.c | 24 ++++++++++-----------
>  drivers/s390/cio/io_sch.h        | 19 ++++++++++++-----
>  8 files changed, 107 insertions(+), 72 deletions(-)
> 

(...)

(just looking at which fields are moved for now)

> diff --git a/drivers/s390/cio/io_sch.h b/drivers/s390/cio/io_sch.h
> index 90e4e3a7841b..fc3220fede0f 100644
> --- a/drivers/s390/cio/io_sch.h
> +++ b/drivers/s390/cio/io_sch.h
> @@ -9,15 +9,20 @@
>  #include "css.h"
>  #include "orb.h"
>  
> +
> +struct io_subchannel_dma_area {
> +	struct ccw1 sense_ccw;	/* static ccw for sense command */

The ccw makes sense.

> +};
> +
>  struct io_subchannel_private {
>  	union orb orb;		/* operation request block */
> -	struct ccw1 sense_ccw;	/* static ccw for sense command */
>  	struct ccw_device *cdev;/* pointer to the child ccw device */
>  	struct {
>  		unsigned int suspend:1;	/* allow suspend */
>  		unsigned int prefetch:1;/* deny prefetch */
>  		unsigned int inter:1;	/* suppress intermediate interrupts */
>  	} __packed options;
> +	struct io_subchannel_dma_area *dma_area;
>  } __aligned(8);
>  
>  #define to_io_private(n) ((struct io_subchannel_private *) \
> @@ -115,6 +120,13 @@ enum cdev_todo {
>  #define FAKE_CMD_IRB	1
>  #define FAKE_TM_IRB	2
>  
> +struct ccw_device_dma_area {
> +	struct senseid senseid;	/* SenseID info */
> +	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
> +	struct irb irb;		/* device status */
> +	struct pgid pgid[8];	/* path group IDs per chpid*/

Again, ccws, and blocks that will be written by the hypervisor, which
makes sense as well.

> +};
> +
>  struct ccw_device_private {
>  	struct ccw_device *cdev;
>  	struct subchannel *sch;
> @@ -156,11 +168,7 @@ struct ccw_device_private {
>  	} __attribute__((packed)) flags;
>  	unsigned long intparm;	/* user interruption parameter */
>  	struct qdio_irq *qdio_data;
> -	struct irb irb;		/* device status */
>  	int async_kill_io_rc;
> -	struct senseid senseid;	/* SenseID info */
> -	struct pgid pgid[8];	/* path group IDs per chpid*/
> -	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
>  	struct work_struct todo_work;
>  	enum cdev_todo todo;
>  	wait_queue_head_t wait_q;
> @@ -169,6 +177,7 @@ struct ccw_device_private {
>  	struct list_head cmb_list;	/* list of measured devices */
>  	u64 cmb_start_time;		/* clock value of cmb reset */
>  	void *cmb_wait;			/* deferred cmb enable/disable */
> +	struct ccw_device_dma_area *dma_area;
>  	enum interruption_class int_class;
>  };
>  

So, this leaves some things I'm not sure about, especially as I do not
know the architecture of this new feature.

- This applies only to asynchronously handled things, it seems? So
  things like control blocks modified by stsch/msch/etc does not need
  special treatment?
- What about channel measurements? Or are they not supported?
- What about CHSCs? Or would only asynchronous commands (which we
  currently don't implement in QEMU) need special treatment?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
       [not found]     ` <20190410021044.4da3e847@oc2783563651>
@ 2019-04-10  8:25       ` Cornelia Huck
       [not found]         ` <20190410150225.61b86cd9@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 02:10:44 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 19:55:48 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:15 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > Thus we need to make sure any memory that is used for communication with
> > > the hypervisor is shared.  
> > 
> > In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
> > Hypervisors implement protected virtualization, we probably need to
> > make sure that all common I/O layer control blocks are in the dma area
> > (including e.g. QDIO), not just what virtio-ccw devices use.
> >   
> 
> Hypervisor could theoretically be something different than QEMU/KVM. Yet,
> as stated before, this series is about getting virtio-ccw working
> (modulo the TODOs). 

Sure, just wanted to point it out. If this is "enable the common I/O
layer, except for QDIO" or so, that would sound fine to me :)

> 
> [..]
> 
> > >    
> > 
> > So, this leaves some things I'm not sure about, especially as I do not
> > know the architecture of this new feature.
> > 
> > - This applies only to asynchronously handled things, it seems? So
> >   things like control blocks modified by stsch/msch/etc does not need
> >   special treatment?  
> 
> I had a feeble attempt at explaining this in the cover letter:
> 
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> 
> Unfortunately the last 'hypervisor' was supposed to be 'ultravisor'.
> 
> I.e. the ultravisor will take care of exposing the control blocks
> to the hypervisor (and of changes as well).

Yeah, that "control blocks" or "communication blocks" leaves me a bit
fuzzy :)

So, what is a high-level summary of areas that need the treatment?
What I get from looking at the patches so far, it's:
- stuff that is written by the hypervisor's interrupt injection code:
  IRB, indicators, etc.
- buffers that are filled by a channel program: sense, sense id, etc.
- ccws themselves (because of translation?)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
       [not found] ` <20190404231622.52531-8-pasic@linux.ibm.com>
@ 2019-04-10  8:42   ` Cornelia Huck
       [not found]     ` <20190410164245.53f8b26d@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:42 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:17 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Before virtio-ccw could get away with not using DMA API for the pieces of
> memory it does ccw I/O with. With protected virtualization this has to
> change, since the hypervisor needs to read and sometimes also write these
> pieces of memory.
> 
> Let us make sure all ccw I/O is done through shared memory.
> 
> Note: The control blocks of I/O instructions do not need to be shared.
> These are marshalled by the ultravisor.

Ok, so direct parameters of I/O instructions are handled by the
ultravisor?

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++++----------------
>  1 file changed, 107 insertions(+), 70 deletions(-)
> 
(...)
> @@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
>  	return container_of(vdev, struct virtio_ccw_device, vdev);
>  }
>  
> +#define vc_dma_decl_struct(type, field) \
> +	dma_addr_t field ## _dma_addr;  \
> +	struct type *field
> +
> +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
> +				   dma_addr_t *dma_handle)
> +{
> +	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
> +				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
> +}
> +
> +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> +				 void *cpu_addr, dma_addr_t dma_handle)
> +{
> +	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
> +}
> +
> +#define vc_dma_alloc_struct(vdev, ptr) \
> +	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
> +#define vc_dma_free_struct(vdev, ptr) \
> +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))

Not sure I'm a fan of those wrappers... I think they actually hurt
readability of the code.

> +
>  static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
>  {
>  	unsigned long i, flags;
> @@ -322,12 +347,12 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  {
>  	int ret;
>  	unsigned long *indicatorp = NULL;
> -	struct virtio_thinint_area *thinint_area = NULL;
> +	vc_dma_decl_struct(virtio_thinint_area, thinint_area) = NULL;
> +	dma_addr_t indicatorp_dma_addr;
>  	struct airq_info *airq_info = vcdev->airq_info;
>  
>  	if (vcdev->is_thinint) {
> -		thinint_area = kzalloc(sizeof(*thinint_area),
> -				       GFP_DMA | GFP_KERNEL);
> +		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
>  		if (!thinint_area)
>  			return;
>  		thinint_area->summary_indicator =
> @@ -338,8 +363,9 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  		ccw->cda = (__u32)(unsigned long) thinint_area;
>  	} else {
>  		/* payload is the address of the indicators */
> -		indicatorp = kmalloc(sizeof(&vcdev->indicators),
> -				     GFP_DMA | GFP_KERNEL);
> +		indicatorp = __vc_dma_alloc(&vcdev->vdev,
> +					    sizeof(&vcdev->indicators),
> +					    &indicatorp_dma_addr);
>  		if (!indicatorp)
>  			return;
>  		*indicatorp = 0;
> @@ -359,8 +385,10 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  			 "Failed to deregister indicators (%d)\n", ret);
>  	else if (vcdev->is_thinint)
>  		virtio_ccw_drop_indicators(vcdev);
> -	kfree(indicatorp);
> -	kfree(thinint_area);
> +	if (indicatorp)
> +		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
> +			       indicatorp, indicatorp_dma_addr);
> +	vc_dma_free_struct(&vcdev->vdev, thinint_area);

Don't you need to check for !NULL here as well?

>  }
>  
>  static inline long __do_kvm_notify(struct subchannel_id schid,
(...)
> @@ -1280,7 +1318,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  
>  	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
>  
> -	vcdev->vdev.dev.parent = &cdev->dev;

Hm?

(You added a line like that in a previous patch; should it simply have
been a movement instead? Or am I misremembering?)

>  	vcdev->vdev.dev.release = virtio_ccw_release_dev;
>  	vcdev->vdev.config = &virtio_ccw_config_ops;
>  	vcdev->cdev = cdev;

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
       [not found] ` <20190404231622.52531-11-pasic@linux.ibm.com>
@ 2019-04-10  8:46   ` Cornelia Huck
       [not found]     ` <20190410171254.71206015@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:46 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:20 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> We can reduce the number of DMA allocations by pulling the bits
> of memory that belongs to virtio_ccw_device and needs to be DMA
> memory into a single area.

That makes a lot of sense (maybe start with introducing the dma area
from the beginning?), but...

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 48 ++++++++++++++--------------------------
>  1 file changed, 16 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index aa45a6a027ae..7268149f2ee8 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -49,12 +49,12 @@ struct vq_config_block {
>  struct vcdev_dma_area {
>  	unsigned long indicators;
>  	unsigned long indicators2;
> +	struct vq_config_block config_block;
> +	__u8 status; /* TODO check __aligned(8); */

...I think that needs attention.

>  };
>  
>  struct virtio_ccw_device {
>  	struct virtio_device vdev;
> -	__u8 *status;
> -	dma_addr_t status_dma_addr;
>  	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
>  	struct ccw_device *cdev;
>  	__u32 curr_io;

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
       [not found] <20190404231622.52531-1-pasic@linux.ibm.com>
                   ` (5 preceding siblings ...)
       [not found] ` <20190404231622.52531-11-pasic@linux.ibm.com>
@ 2019-04-10  9:20 ` Cornelia Huck
       [not found]   ` <20190410175750.0ed0a454@oc2783563651>
       [not found] ` <20190404231622.52531-5-pasic@linux.ibm.com>
  2019-04-12 13:47 ` [RFC PATCH 00/12] s390: virtio: support protected virtualization David Hildenbrand
  8 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10  9:20 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:10 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Enhanced virtualization protection technology may require the use of
> bounce buffers for I/O. While support for this was built into the virtio
> core,  virtio-ccw wasn't changed accordingly.
> 
> Thus what needs to be done to bring virtio-ccw up to speed with respect
> to this is:
> * use some 'new' common virtio stuff
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> * make sure the DMA API does what is necessary to talk through shared
>   memory if we are a protected virtualization guest.
> * make sure the common IO layer plays along as well (airqs, sense).

It would be good to have a summary somewhere in the code (or
Documentation/) as to what needs the dma treatment and what doesn't,
for later reference. We don't want people to accidentally break things
(especially if they cannot refer to architecture documentation - or
will at least some of that be published?)

> 
> The series is structured in incremental fashion: some of the changes are
> overridden by following patches. The main reason why is that this is how I
> developed. But I think it ain't bad for the didactic and we are a bit more
> flexible with regards to throwing out some of the stuff in the end.

FWIW, I think reshuffling the patches in the next iteration would ease
review.

> 
> Important notes:
> 
> * This is an early (WIP) RFC that does not add any function to the
>   kernel at his stage, as the ultravisor interactions are left out.
>   The purpose is getting some early feedback ASAP.

I would like some comments from people who have experience with the dma
api.

> 
> * In future these patches will depend on some code interacting with the
>   ultravisor (WIP by Vasily and Janosch).
> 
> * The s390 names are by no means final, and are not properly explained. Should
>   not hamper understanding too much. If it does please ask.
> 
> * The existing naming in the common infrastructure (kernel internal
>   interfaces) is pretty much based on the AMD SEV terminology. Thus the
>   names aren't always perfect. There might be merit to changing these
>   names to more abstract ones. I did not put much thought into that at
>   the current stage.

If we can find some generic names that work well for everyone,
converting seems like a good idea. But following SEV is not that bad,
either (you'll probably find more people who have heard about SEV than
folks familiar with s390 ;)

> 
> 
> Testing:
> 
> Please use iommu_platform=on for any virtio devices you are going
> to test this code with (so virtio actually uses the DMA API).
> 
> Looking forward to your review or any other type of input.

I have now read through the whole series and commented in some places.
But I'd really like to see comments from others as well.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
       [not found]         ` <20190410173148.067555dc@oc2783563651>
@ 2019-04-10 16:07           ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:07 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 17:31:48 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 19:14:53 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> [..]
> 
> > > > At this point, you're always going via the css0 device. I'm
> > > > wondering whether you should pass in the cssid here and use
> > > > css_by_id(cssid) to make this future proof. But then, I'm not
> > > > quite clear from which context this will be called.      
> > > 
> > > As said before I don't see MCSS-E coming. Regarding the client code,
> > > please check out the rest of the series. Especially patch 6. 
> > > 
> > > From my perspective it would be at this stage saner to make it more
> > > obvious that only one css is supported (at the moment), than to
> > > implement MCSS-E, or to make this (kind of) MCSS-E ready.    
> > 
> > I disagree. I think there's value in keeping the interfaces clean
> > (within reasonable bounds, of course.) Even if there is no
> > implementation of MCSS-E other than QEMU... it is probably a good idea
> > to spend some brain cycles to make this conceptually clean.  
> 
> AFAIU Linux currently does not support MCSS-E. I don't have the
> bandwidth to implement MCSS-E support in the kernel.
> 
> I fully agree for external interfaces should be MCSS-E conform, so
> should we ever decide to implement we don't have to overcome self-made
> obstacles.
> 
> Kernel internal stuff however, IMHO, should avoid carrying a ballast of
> an 20%-30% implemented MCSS-E support. I see no benefit.
> 
> But I don't insist. If you have a good idea how to make this more MCSS-E
> conform, you are welcome. In think something like this is best done on
> top of a series that provides a basic working solution. Especially if the
> conceptually clean thing is conceptually or code-wise more complex than
> the basic solution. If you have something in mind that is simpler and
> more lightweight than what I did here, I would be more than happy to make
> that happen.

I haven't asked for adding complete support for MCSS-E, just to keep it
in the back of our minds...

Back to the issue at hand. You're currently hard-wiring this to the
css0 device. My question is: Does it make sense to have a per-css area?
From my limited perspective, I think it does (more css images would
mean more devices, and just adding smaller areas per-css is probably
easier than adding one big area scaling with the number of css images).
In that case, what about tacking the area to the actual css object,
instead of having the global variable? Should be an easy change, and
feels cleaner.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
       [not found]         ` <20190410150225.61b86cd9@oc2783563651>
@ 2019-04-10 16:16           ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:16 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 15:02:25 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:25:57 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:

> > So, what is a high-level summary of areas that need the treatment?
> > What I get from looking at the patches so far, it's:
> > - stuff that is written by the hypervisor's interrupt injection code:
> >   IRB, indicators, etc.  
> 
> In interrupt context IRB behaves like a control block. We don't have
> to make it shared.
> 
> I moved IRB because of snse:
> 
> 
> @@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct
> irb *irb) /*
>          * We have ending status but no sense information. Do a basic
> sense. */
> -       sense_ccw = &to_io_private(sch)->sense_ccw;
> +       sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
>         sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
> -       sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
> +       sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);
> 
> as the irb.ecw is used as channel program data. And that needs to be
> shared.

Ah, I see. So it's more "guest points to some memory where the
hypervisor needs to write data"?

> 
> > - buffers that are filled by a channel program: sense, sense id, etc.
> > - ccws themselves (because of translation?)
> >   
> 
> Right. The idea is: smallish, basically fixed size a readily available
> control blocks (specified as a ) are copied back and forth by the

As a what? :)

> ultravisor (via SIE SD).

Ok. It's really hard to be sure from the outside what falls into which
category :(

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
       [not found]     ` <20190410164245.53f8b26d@oc2783563651>
@ 2019-04-10 16:21       ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:21 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 16:42:45 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:42:51 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:17 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > @@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
> > >  	return container_of(vdev, struct virtio_ccw_device, vdev);
> > >  }
> > >  
> > > +#define vc_dma_decl_struct(type, field) \
> > > +	dma_addr_t field ## _dma_addr;  \
> > > +	struct type *field
> > > +
> > > +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
> > > +				   dma_addr_t *dma_handle)
> > > +{
> > > +	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
> > > +				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
> > > +}
> > > +
> > > +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> > > +				 void *cpu_addr, dma_addr_t dma_handle)
> > > +{
> > > +	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
> > > +}
> > > +
> > > +#define vc_dma_alloc_struct(vdev, ptr) \
> > > +	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
> > > +#define vc_dma_free_struct(vdev, ptr) \
> > > +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))  
> > 
> > Not sure I'm a fan of those wrappers... I think they actually hurt
> > readability of the code.
> >   
> 
> By wrappers you mean just the macros or also the inline functions?

In particular, I dislike the macros.

> 
> If we agree to go with the cio DMA pool instead of using DMA API
> facilities for allocation (dma_alloc_coherent or maybe a per ccw-device
> dma_pool) I think I could just use cio_dma_zalloc() directly if you like.

If we go with the pool (I'm not familiar enough with the dma stuff to
be able to make a good judgment there), nice and obvious calls sound
good to me :)

> 
> I was quite insecure about how this gen_pool idea is going to be received
> here. That's why I decided to keep the dma_alloc_coherent() version in
> for the RFC.
> 
> If you prefer I can squash patches #7 #9 #10 and #11 together and
> pull #8 forward. Would you prefer that?

If that avoids multiple switches of the approach used, that sounds like
a good idea.

(Still would like to see some feedback from others.)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
       [not found]   ` <20190410175750.0ed0a454@oc2783563651>
@ 2019-04-10 16:24     ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:24 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 17:57:50 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 11:20:48 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:10 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > Enhanced virtualization protection technology may require the use of
> > > bounce buffers for I/O. While support for this was built into the
> > > virtio core,  virtio-ccw wasn't changed accordingly.
> > > 
> > > Thus what needs to be done to bring virtio-ccw up to speed with
> > > respect to this is:
> > > * use some 'new' common virtio stuff
> > > * make sure that virtio-ccw specific stuff uses shared memory when
> > >   talking to the hypervisor (except communication blocks like ORB,
> > > these are handled by the hypervisor)
> > > * make sure the DMA API does what is necessary to talk through shared
> > >   memory if we are a protected virtualization guest.
> > > * make sure the common IO layer plays along as well (airqs, sense).  
> > 
> > It would be good to have a summary somewhere in the code (or
> > Documentation/) as to what needs the dma treatment and what doesn't,
> > for later reference. We don't want people to accidentally break things
> > (especially if they cannot refer to architecture documentation - or
> > will at least some of that be published?)
> >   
> 
> I can put documentation on my TODO list. This cover letter was also
> supposed to provide a bird's-eye view on what needs to be done.

Some comments in the code to prevent further headscratching are also a
good idea.

> 
> > > 
> > > The series is structured in incremental fashion: some of the changes
> > > are overridden by following patches. The main reason why is that
> > > this is how I developed. But I think it ain't bad for the didactic
> > > and we are a bit more flexible with regards to throwing out some of
> > > the stuff in the end.  
> > 
> > FWIW, I think reshuffling the patches in the next iteration would ease
> > review.
> >   
> 
> Can you please tell me more about what is desired here? I mean, I made
> some tentative proposals on squashing some patches together. I don't
> remember any requests to reorder patches or split.

Just try to avoid to rewrite things multiple times -- if we agree on
the end result, we should be able to go there directly :)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
       [not found]     ` <20190410171254.71206015@oc2783563651>
@ 2019-04-10 16:36       ` Cornelia Huck
       [not found]         ` <20190410194849.511ecc46@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:36 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 17:12:54 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:46:49 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:20 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > index aa45a6a027ae..7268149f2ee8 100644
> > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > >  struct vcdev_dma_area {
> > >  	unsigned long indicators;
> > >  	unsigned long indicators2;
> > > +	struct vq_config_block config_block;
> > > +	__u8 status; /* TODO check __aligned(8); */  
> > 
> > ...I think that needs attention.  
> 
> Yes I wanted to discuss this with you. I could not find anything
> in the virtio spec that would put requirements on how this
> status field needs to be aligned. But I did not look to hard.
> 
> The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> of course we do have alignment requirements).

I think it needs to be doubleword aligned.

> 
> Apparently status used to be a normal field, and became a pointer with
> 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.

In the beginning, the code used a below-2G-area for all commands.
Rather than adding locking to avoid races there, that commit switches
to allocating the needed structures individually. The status field
needed to be below 2G, so it needed to be allocated separately.

> 
> So maybe dropping the TODO comment will do just fine. What do you think?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
       [not found]         ` <20190410194849.511ecc46@oc2783563651>
@ 2019-04-11  9:24           ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2019-04-11  9:24 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 19:48:49 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 18:36:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 10 Apr 2019 17:12:54 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Wed, 10 Apr 2019 10:46:49 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri,  5 Apr 2019 01:16:20 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:  
> >   
> > > > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > > > index aa45a6a027ae..7268149f2ee8 100644
> > > > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > > > >  struct vcdev_dma_area {
> > > > >  	unsigned long indicators;
> > > > >  	unsigned long indicators2;
> > > > > +	struct vq_config_block config_block;
> > > > > +	__u8 status; /* TODO check __aligned(8); */    
> > > > 
> > > > ...I think that needs attention.    
> > > 
> > > Yes I wanted to discuss this with you. I could not find anything
> > > in the virtio spec that would put requirements on how this
> > > status field needs to be aligned. But I did not look to hard.
> > > 
> > > The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> > > of course we do have alignment requirements).  
> > 
> > I think it needs to be doubleword aligned.
> >   
> 
> I've re-read the part of the PoP that describes the ccw formats. And
> it reinforced my position: for IDA and MIDA we need proper alignment,
> but if the CCW ain't an indirect one there is no alignment requirement.
> 
> QEMU also does not seem to check either.
> 
> Can you double-check and provide me with a reference that proves me
> wrong if I'm wrong.

Ah, it was the ccw itself, not the cda. Indeed, there do not seem to be
any requirements for direct addressing.

> 
> > > 
> > > Apparently status used to be a normal field, and became a pointer with
> > > 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> > > data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.  
> > 
> > In the beginning, the code used a below-2G-area for all commands.
> > Rather than adding locking to avoid races there, that commit switches
> > to allocating the needed structures individually. The status field
> > needed to be below 2G, so it needed to be allocated separately.
> >   
> 
> I get it now. The confusing part was that the field 'area' was about
> holding the address of the also previously dynamically allocated
> below 2G area that was used for talking to the hypervisor via CCW I/O.
> 
> > > 
> > > So maybe dropping the TODO comment will do just fine. What do you think?  
> >   
> 
> I still think we just need to drop the comment, as we don't have to
> align it.

Agreed.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
       [not found] ` <20190404231622.52531-6-pasic@linux.ibm.com>
  2019-04-09 17:55   ` [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio Cornelia Huck
@ 2019-04-11 14:15   ` Sebastian Ott
  1 sibling, 0 replies; 27+ messages in thread
From: Sebastian Ott @ 2019-04-11 14:15 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 5 Apr 2019, Halil Pasic wrote:
> @@ -1593,20 +1609,29 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
>  		return ERR_CAST(sch);
>  
>  	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
> -	if (!io_priv) {
> -		put_device(&sch->dev);
> -		return ERR_PTR(-ENOMEM);
> -	}
> +	if (!io_priv)
> +		goto err_priv;
> +	io_priv->dma_area = cio_dma_zalloc(sizeof(*io_priv->dma_area));

This is called very early - the dma pool is not yet initialized.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
       [not found] ` <20190404231622.52531-5-pasic@linux.ibm.com>
  2019-04-09 10:44   ` [RFC PATCH 04/12] s390/cio: introduce cio DMA pool Cornelia Huck
@ 2019-04-11 18:25   ` Sebastian Ott
       [not found]     ` <20190412132010.3c74cb63@oc2783563651>
  1 sibling, 1 reply; 27+ messages in thread
From: Sebastian Ott @ 2019-04-11 18:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 5 Apr 2019, Halil Pasic wrote:
> To support protected virtualization cio will need to make sure the
> memory used for communication with the hypervisor is DMA memory.
> 
> Let us introduce a DMA pool to cio that will help us in allocating
> deallocating those chunks of memory.
> 
> We use a gen_pool backed with DMA pages to avoid each allocation
> effectively wasting a page, as we typically allocate much less
> than PAGE_SIZE.

I don't think we should use this global DMA pool. I guess it's OK for
stuff like airq (where we don't have a struct device at hand) but for
CCW we should use the device we have. Yes, this way we waste some memory
but all dma memory a device uses should fit in a page - so the wastage
is not too much.

Sebastian

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
       [not found]     ` <20190412132010.3c74cb63@oc2783563651>
@ 2019-04-12 12:12       ` Sebastian Ott
       [not found]         ` <20190412173017.04b768bb@oc2783563651>
  0 siblings, 1 reply; 27+ messages in thread
From: Sebastian Ott @ 2019-04-12 12:12 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 12 Apr 2019, Halil Pasic wrote:
> On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > I don't think we should use this global DMA pool. I guess it's OK for
> > stuff like airq (where we don't have a struct device at hand) but for
> > CCW we should use the device we have. Yes, this way we waste some memory
> > but all dma memory a device uses should fit in a page - so the wastage
> > is not too much.
> > 
> 
> Is what you envision an own gen_pool on for each subchannel (e.g. a
> struct io_subchannel member)?

Either that or if that's too much overhead simply map a page and create
a struct containing the few dma areas for that device.

> I'm struggling with understanding the expected benefits of a
> per-subchannel pool/allocator. Can you please tell me what benefits do
> you expect (over the current approach)?

Logically DMA is a capability of a device and the whole DMA API is build
around devices. Working around that just feels wrong. For practical
matters: DMA debugging will complain about misuse of a specific device or
driver.

> I understand you idea is to keep the CIO global pool for stuff that can
> not be tied to a single device (i.e. ariq). So the per device stuff would
> also mean more code. Would you be OK with postponing this alleged
> enhancement (i.e. implement it as a patch on top of this series)?

I don't like it but it's just in-kernel usage which we can change at any
time. So if it helps you to do it that way, why not..

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
       [not found] <20190404231622.52531-1-pasic@linux.ibm.com>
                   ` (7 preceding siblings ...)
       [not found] ` <20190404231622.52531-5-pasic@linux.ibm.com>
@ 2019-04-12 13:47 ` David Hildenbrand
       [not found]   ` <20190416131005.6f3e05eb@oc2783563651>
  8 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2019-04-12 13:47 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: Eric Farman, Viktor Mihajlovski, Janosch Frank, Vasily Gorbik,
	Farhan Ali, virtualization, Claudio Imbrenda

On 05.04.19 01:16, Halil Pasic wrote:
> Enhanced virtualization protection technology may require the use of
> bounce buffers for I/O. While support for this was built into the virtio
> core,  virtio-ccw wasn't changed accordingly.

Can you elaborate some more about the general approach (Enhanced
virtualization protection technology, ultravisor, concept, issues, how
to squeeze it into QEMU/KVM/kernel) etc =

For my taste, this cover letter misses some important context :)

> 
> Thus what needs to be done to bring virtio-ccw up to speed with respect
> to this is:
> * use some 'new' common virtio stuff
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> * make sure the DMA API does what is necessary to talk through shared
>   memory if we are a protected virtualization guest.
> * make sure the common IO layer plays along as well (airqs, sense).
> 
> The series is structured in incremental fashion: some of the changes are
> overridden by following patches. The main reason why is that this is how I
> developed. But I think it ain't bad for the didactic and we are a bit more
> flexible with regards to throwing out some of the stuff in the end.
> 
> Important notes:
> 
> * This is an early (WIP) RFC that does not add any function to the
>   kernel at his stage, as the ultravisor interactions are left out.
>   The purpose is getting some early feedback ASAP.
> 
> * In future these patches will depend on some code interacting with the
>   ultravisor (WIP by Vasily and Janosch).
> 
> * The s390 names are by no means final, and are not properly explained. Should
>   not hamper understanding too much. If it does please ask.
> 
> * The existing naming in the common infrastructure (kernel internal
>   interfaces) is pretty much based on the AMD SEV terminology. Thus the
>   names aren't always perfect. There might be merit to changing these
>   names to more abstract ones. I did not put much thought into that at
>   the current stage.
> 
> 
> Testing:
> 
> Please use iommu_platform=on for any virtio devices you are going
> to test this code with (so virtio actually uses the DMA API).
> 
> Looking forward to your review or any other type of input.
> 
> Halil Pasic (12):
>   virtio/s390: use vring_create_virtqueue
>   virtio/s390: DMA support for virtio-ccw
>   s390/mm: force swiotlb for protected virtualization
>   s390/cio: introduce cio DMA pool
>   s390/cio: add protected virtualization support to cio
>   s390/airq: use DMA memory for adapter interrupts
>   virtio/s390: use DMA memory for ccw I/O
>   virtio/s390: add indirection to indicators access
>   virtio/s390: use DMA memory for notifiers
>   virtio/s390: consolidate DMA allocations
>   virtio/s390: use the cio DMA pool
>   virtio/s390:  make airq summary indicators DMA
> 
>  arch/s390/Kconfig                   |   5 +
>  arch/s390/include/asm/Kbuild        |   1 -
>  arch/s390/include/asm/airq.h        |   2 +
>  arch/s390/include/asm/cio.h         |   4 +
>  arch/s390/include/asm/dma-mapping.h |  13 ++
>  arch/s390/include/asm/mem_encrypt.h |  18 +++
>  arch/s390/mm/init.c                 |  44 +++++
>  drivers/s390/cio/airq.c             |  18 ++-
>  drivers/s390/cio/ccwreq.c           |   8 +-
>  drivers/s390/cio/css.c              |  63 ++++++++
>  drivers/s390/cio/device.c           |  46 ++++--
>  drivers/s390/cio/device_fsm.c       |  40 ++---
>  drivers/s390/cio/device_id.c        |  18 +--
>  drivers/s390/cio/device_ops.c       |   4 +-
>  drivers/s390/cio/device_pgid.c      |  20 +--
>  drivers/s390/cio/device_status.c    |  24 +--
>  drivers/s390/cio/io_sch.h           |  19 ++-
>  drivers/s390/virtio/virtio_ccw.c    | 310 ++++++++++++++++++++----------------
>  18 files changed, 444 insertions(+), 213 deletions(-)
>  create mode 100644 arch/s390/include/asm/dma-mapping.h
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
       [not found]   ` <20190416131005.6f3e05eb@oc2783563651>
@ 2019-04-16 11:50     ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2019-04-16 11:50 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Cornelia Huck, virtualization, Martin Schwidefsky,
	Farhan Ali, Viktor Mihajlovski, Janosch Frank

On 16.04.19 13:10, Halil Pasic wrote:
> On Fri, 12 Apr 2019 15:47:50 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 05.04.19 01:16, Halil Pasic wrote:
>>> Enhanced virtualization protection technology may require the use of
>>> bounce buffers for I/O. While support for this was built into the virtio
>>> core,  virtio-ccw wasn't changed accordingly.
>>
>> Can you elaborate some more about the general approach (Enhanced
>> virtualization protection technology, ultravisor, concept, issues, how
>> to squeeze it into QEMU/KVM/kernel) etc =
>>
>> For my taste, this cover letter misses some important context :)
>>
> 
> I'm aware. Unfortunately we don't have a decision yet about which parts
> of the protected virtualization architecture are going to be PoP
> material.

Oh, okay.

> 
> You can get some more context immediately by having a look at Martin's
> features branch and looking at the s390/protvirt  and s390/uv patches.

As I don't have time to dig through random branches to
discover/reverse-engineer the obvious, I won't be reviewing this patch
series. But as I am not an I/O expert, this might not be bad at all :)

Maybe other people can help.

> 
> I will try to provide more background information for v1. But having
> a remotely complete and reliable documentation will take some time. What
> I can offer at the moment is answers to specific questions.

Waiting for v1 then.

Cheers!

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
       [not found]         ` <20190412173017.04b768bb@oc2783563651>
@ 2019-04-16 12:50           ` Sebastian Ott
  0 siblings, 0 replies; 27+ messages in thread
From: Sebastian Ott @ 2019-04-16 12:50 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 12 Apr 2019, Halil Pasic wrote:
> On Fri, 12 Apr 2019 14:12:31 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > On Fri, 12 Apr 2019, Halil Pasic wrote:
> > > On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> > > Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > > I don't think we should use this global DMA pool. I guess it's OK for
> > > > stuff like airq (where we don't have a struct device at hand) but for
> > > > CCW we should use the device we have. Yes, this way we waste some memory
> > > > but all dma memory a device uses should fit in a page - so the wastage
> > > > is not too much.
> 
> Regarding the wastage. Let us do the math together in search for an
> upper (wastage) limit.
[...]
> Currently we need at least 224 bytes per device that is ~ 6%
> of a PAGE_SIZE.

Yes, we basically waste the whole page. I'm ok with that if the benefit is
to play nice with the kernel APIs.

> > For practical
> > matters: DMA debugging will complain about misuse of a specific device or
> > driver.
> > 
> 
> Do you mean CONFIG_DMA_API_DEBUG and CONFIG_DMA_API_DEBUG_SG? I've been
> running with those and did not see any complaints. Maybe we should
> clarify this one offline...

I didn't mean to imply that there are bugs already - just that when used
as intended the DMA_DEBUG_API can complain about stuff like "your device
is gone but you have still DMA memory set up for it" which will not work
if you don't use the correct device...

Sebastian

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-04-16 12:50 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20190404231622.52531-1-pasic@linux.ibm.com>
     [not found] ` <20190404231622.52531-2-pasic@linux.ibm.com>
2019-04-08 11:01   ` [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue Cornelia Huck
2019-04-08 12:37     ` Michael S. Tsirkin
     [not found] ` <20190404231622.52531-3-pasic@linux.ibm.com>
2019-04-09  9:57   ` [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw Cornelia Huck
     [not found]     ` <20190409132927.5df3bc50@oc2783563651>
2019-04-09 13:01       ` Cornelia Huck
     [not found]         ` <20190409152313.0296e8f1@oc2783563651>
2019-04-09 15:47           ` Cornelia Huck
     [not found] ` <20190404231622.52531-4-pasic@linux.ibm.com>
2019-04-09 10:16   ` [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization Cornelia Huck
     [not found]     ` <20190409125416.73713f23@oc2783563651>
2019-04-09 17:18       ` Cornelia Huck
2019-04-09 12:22   ` Christoph Hellwig
     [not found] ` <20190404231622.52531-6-pasic@linux.ibm.com>
2019-04-09 17:55   ` [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio Cornelia Huck
     [not found]     ` <20190410021044.4da3e847@oc2783563651>
2019-04-10  8:25       ` Cornelia Huck
     [not found]         ` <20190410150225.61b86cd9@oc2783563651>
2019-04-10 16:16           ` Cornelia Huck
2019-04-11 14:15   ` Sebastian Ott
     [not found] ` <20190404231622.52531-8-pasic@linux.ibm.com>
2019-04-10  8:42   ` [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O Cornelia Huck
     [not found]     ` <20190410164245.53f8b26d@oc2783563651>
2019-04-10 16:21       ` Cornelia Huck
     [not found] ` <20190404231622.52531-11-pasic@linux.ibm.com>
2019-04-10  8:46   ` [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations Cornelia Huck
     [not found]     ` <20190410171254.71206015@oc2783563651>
2019-04-10 16:36       ` Cornelia Huck
     [not found]         ` <20190410194849.511ecc46@oc2783563651>
2019-04-11  9:24           ` Cornelia Huck
2019-04-10  9:20 ` [RFC PATCH 00/12] s390: virtio: support protected virtualization Cornelia Huck
     [not found]   ` <20190410175750.0ed0a454@oc2783563651>
2019-04-10 16:24     ` Cornelia Huck
     [not found] ` <20190404231622.52531-5-pasic@linux.ibm.com>
2019-04-09 10:44   ` [RFC PATCH 04/12] s390/cio: introduce cio DMA pool Cornelia Huck
     [not found]     ` <20190409141114.7dcce94a@oc2783563651>
2019-04-09 17:14       ` Cornelia Huck
     [not found]         ` <20190410173148.067555dc@oc2783563651>
2019-04-10 16:07           ` Cornelia Huck
2019-04-11 18:25   ` Sebastian Ott
     [not found]     ` <20190412132010.3c74cb63@oc2783563651>
2019-04-12 12:12       ` Sebastian Ott
     [not found]         ` <20190412173017.04b768bb@oc2783563651>
2019-04-16 12:50           ` Sebastian Ott
2019-04-12 13:47 ` [RFC PATCH 00/12] s390: virtio: support protected virtualization David Hildenbrand
     [not found]   ` <20190416131005.6f3e05eb@oc2783563651>
2019-04-16 11:50     ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).