* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-07-30 13:26 UTC (permalink / raw)
To: Christoph Hellwig
Cc: robh, srikar, benh, Will Deacon, linux-kernel, linuxram,
virtualization, paulus, marc.zyngier, mpe, joe, robin.murphy,
david, linuxppc-dev, elfring, haren, Anshuman Khandual
In-Reply-To: <20180730111802.GA9830@infradead.org>
On Mon, Jul 30, 2018 at 04:18:02AM -0700, Christoph Hellwig wrote:
> On Mon, Jul 30, 2018 at 01:28:03PM +0300, Michael S. Tsirkin wrote:
> > Let me reply to the "crappy" part first:
> > So virtio devices can run on another CPU or on a PCI bus. Configuration
> > can happen over mupltiple transports. There is a discovery protocol to
> > figure out where it is. It has some warts but any real system has warts.
> >
> > So IMHO virtio running on another CPU isn't "legacy virtual crappy
> > virtio". virtio devices that actually sit on a PCI bus aren't "sane"
> > simply because the DMA is more convoluted on some architectures.
>
> All of what you said would be true if virtio didn't claim to be
> a PCI device.
There's nothing virtio claims to be. It's a PV device that uses PCI for
its configuration. Configuration is enumerated on the virtual PCI bus.
That part of the interface is emulated PCI. Data path is through a
PV device enumerated on the virtio bus.
> Once it claims to be a PCI device and we also see
> real hardware written to the interface I stand to all what I said
> above.
Real hardware would reuse parts of the interface but by necessity it
needs to behave slightly differently on some platforms. However for
some platforms (such as x86) a PV virtio driver will by luck work with a
PCI device backend without changes. As these platforms and drivers are
widely deployed, some people will deploy hardware like that. Should be
a non issue as by definition it's transparent to guests.
> > With this out of my system:
> > I agree these approaches are hacky. I think it is generally better to
> > have virtio feature negotiation tell you whether device runs on a CPU or
> > not rather than rely on platform specific ways for this. To this end
> > there was a recent proposal to rename VIRTIO_F_IO_BARRIER to
> > VIRTIO_F_REAL_DEVICE. It got stuck since "real" sounds vague to people,
> > e.g. what if it's a VF - is that real or not? But I can see something
> > like e.g. VIRTIO_F_PLATFORM_DMA gaining support.
> >
> > We would then rename virtio_has_iommu_quirk to virtio_has_dma_quirk
> > and test VIRTIO_F_PLATFORM_DMA in addition to the IOMMU thing.
>
> I don't really care about the exact naming, and indeed a device that
> sets the flag doesn't have to be a 'real' device - it just has to act
> like one. I explained all the issues that this means (at least relating
> to DMA) in one of the previous threads.
I believe you refer to this:
https://lkml.org/lkml/2018/6/7/15
that was a very helpful list outlining the problems we need to solve,
thanks a lot for that!
> The important bit is that we can specify exact behavior for both
> devices that sets the "I'm real!" flag and that ones that don't exactly
> in the spec.
I would very much like that, yes.
> And that very much excludes arch-specific (or
> Xen-specific) overrides.
We already committed to a xen specific hack but generally I prefer
devices that describe how they work instead of platforms magically
guessing, yes.
However the question people raise is that DMA API is already full of
arch-specific tricks the likes of which are outlined in your post linked
above. How is this one much worse?
--
MST
^ permalink raw reply
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-07-30 11:18 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: robh, srikar, benh, Will Deacon, linux-kernel, linuxram,
virtualization, Christoph Hellwig, paulus, marc.zyngier, mpe, joe,
robin.murphy, david, linuxppc-dev, elfring, haren,
Anshuman Khandual
In-Reply-To: <20180730125100-mutt-send-email-mst@kernel.org>
On Mon, Jul 30, 2018 at 01:28:03PM +0300, Michael S. Tsirkin wrote:
> Let me reply to the "crappy" part first:
> So virtio devices can run on another CPU or on a PCI bus. Configuration
> can happen over mupltiple transports. There is a discovery protocol to
> figure out where it is. It has some warts but any real system has warts.
>
> So IMHO virtio running on another CPU isn't "legacy virtual crappy
> virtio". virtio devices that actually sit on a PCI bus aren't "sane"
> simply because the DMA is more convoluted on some architectures.
All of what you said would be true if virtio didn't claim to be
a PCI device. Once it claims to be a PCI device and we also see
real hardware written to the interface I stand to all what I said
above.
> With this out of my system:
> I agree these approaches are hacky. I think it is generally better to
> have virtio feature negotiation tell you whether device runs on a CPU or
> not rather than rely on platform specific ways for this. To this end
> there was a recent proposal to rename VIRTIO_F_IO_BARRIER to
> VIRTIO_F_REAL_DEVICE. It got stuck since "real" sounds vague to people,
> e.g. what if it's a VF - is that real or not? But I can see something
> like e.g. VIRTIO_F_PLATFORM_DMA gaining support.
>
> We would then rename virtio_has_iommu_quirk to virtio_has_dma_quirk
> and test VIRTIO_F_PLATFORM_DMA in addition to the IOMMU thing.
I don't really care about the exact naming, and indeed a device that
sets the flag doesn't have to be a 'real' device - it just has to act
like one. I explained all the issues that this means (at least relating
to DMA) in one of the previous threads.
The important bit is that we can specify exact behavior for both
devices that sets the "I'm real!" flag and that ones that don't exactly
in the spec. And that very much excludes arch-specific (or
Xen-specific) overrides.
^ permalink raw reply
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-07-30 10:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: robh, srikar, benh, Will Deacon, linux-kernel, linuxram,
virtualization, paulus, marc.zyngier, mpe, joe, robin.murphy,
david, linuxppc-dev, elfring, haren, Anshuman Khandual
In-Reply-To: <20180730093414.GD26245@infradead.org>
On Mon, Jul 30, 2018 at 02:34:14AM -0700, Christoph Hellwig wrote:
> We really need to distinguish between legacy virtual crappy
> virtio (and that includes v1) that totally ignores the bus it pretends
> to be on, and sane virtio (to be defined) that sit on a real (or
> properly emulated including iommu and details for dma mapping) bus.
Let me reply to the "crappy" part first:
So virtio devices can run on another CPU or on a PCI bus. Configuration
can happen over mupltiple transports. There is a discovery protocol to
figure out where it is. It has some warts but any real system has warts.
So IMHO virtio running on another CPU isn't "legacy virtual crappy
virtio". virtio devices that actually sit on a PCI bus aren't "sane"
simply because the DMA is more convoluted on some architectures.
Performance impact of the optimizations possible when you know
your "device" is in fact just another CPU has been measured,
it is real, so we aren't interested in adding all that overhead back
just so we can use DMA API. The "correct then fast" mantra doesn't
apply to something that is as widely deployed as virtio.
And I can accept an argument that maybe the DMA API isn't designed to
support such virtual DMA. Whether it should I don't know.
With this out of my system:
I agree these approaches are hacky. I think it is generally better to
have virtio feature negotiation tell you whether device runs on a CPU or
not rather than rely on platform specific ways for this. To this end
there was a recent proposal to rename VIRTIO_F_IO_BARRIER to
VIRTIO_F_REAL_DEVICE. It got stuck since "real" sounds vague to people,
e.g. what if it's a VF - is that real or not? But I can see something
like e.g. VIRTIO_F_PLATFORM_DMA gaining support.
We would then rename virtio_has_iommu_quirk to virtio_has_dma_quirk
and test VIRTIO_F_PLATFORM_DMA in addition to the IOMMU thing.
--
MST
^ permalink raw reply
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-07-30 9:34 UTC (permalink / raw)
To: Will Deacon
Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
hch, paulus, marc.zyngier, mpe, joe, robin.murphy, david,
linuxppc-dev, elfring, haren, Anshuman Khandual
In-Reply-To: <20180727095804.GA25592@arm.com>
On Fri, Jul 27, 2018 at 10:58:05AM +0100, Will Deacon wrote:
>
> I just wanted to say that this patch series provides a means for us to
> force the coherent DMA ops for legacy virtio devices on arm64, which in turn
> means that we can enable the SMMU with legacy devices in our fastmodel
> emulation platform (which is slowly being upgraded to virtio 1.0) without
> hanging during boot. Patch below.
Yikes, this is a nightmare. That is exactly where I do not want things
to end up. We really need to distinguish between legacy virtual crappy
virtio (and that includes v1) that totally ignores the bus it pretends
to be on, and sane virtio (to be defined) that sit on a real (or
properly emulated including iommu and details for dma mapping) bus.
Having a mumble jumble of arch specific undocumented magic as in
the powerpc patch replied to or this arm patch is a complete no-go.
Nacked-by: Christoph Hellwig <hch@lst.de>
for both.
^ permalink raw reply
* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Christoph Hellwig @ 2018-07-30 9:30 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization, hch,
paulus, mpe, joe, david, linuxppc-dev, elfring, haren,
Anshuman Khandual
In-Reply-To: <20180729001344-mutt-send-email-mst@kernel.org>
> > +
> > + if (xen_domain())
> > + goto skip_override;
> > +
> > + if (virtio_has_iommu_quirk(dev))
> > + set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
> > +
> > + skip_override:
> > +
>
> I prefer normal if scoping as opposed to goto spaghetti pls.
> Better yet move vring_use_dma_api here and use it.
> Less of a chance something will break.
I agree about avoid pointless gotos here, but we can do things
perfectly well without either gotos or a confusing helper here
if we structure it right. E.g.:
// suitably detailed comment here
if (!xen_domain() &&
!virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
and while we're at it - modifying dma ops for the parent looks very
dangerous. I don't think we can do that, as it could break iommu
setup interactions. IFF we set a specific dma map ops it has to be
on the virtio device itself, of which we have full control.
^ permalink raw reply
* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Christoph Hellwig @ 2018-07-30 9:25 UTC (permalink / raw)
To: Anshuman Khandual
Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
hch, paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180720035941.6844-3-khandual@linux.vnet.ibm.com>
> +const struct dma_map_ops virtio_direct_dma_ops;
This belongs into a header if it is non-static. If you only
use it in this file anyway please mark it static and avoid a forward
declaration.
> +
> int virtio_finalize_features(struct virtio_device *dev)
> {
> int ret = dev->config->finalize_features(dev);
> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
> if (ret)
> return ret;
>
> + if (virtio_has_iommu_quirk(dev))
> + set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
This needs a big fat comment explaining what is going on here.
Also not new, but I find the existance of virtio_has_iommu_quirk and its
name horribly confusing. It might be better to open code it here once
only a single caller is left.
^ permalink raw reply
* Re: [RFC 1/4] virtio: Define virtio_direct_dma_ops structure
From: Christoph Hellwig @ 2018-07-30 9:24 UTC (permalink / raw)
To: Anshuman Khandual
Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
hch, paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180720035941.6844-2-khandual@linux.vnet.ibm.com>
> +/*
> + * Virtio direct mapping DMA API operations structure
> + *
> + * This defines DMA API structure for all virtio devices which would not
> + * either bring in their own DMA OPS from architecture or they would not
> + * like to use architecture specific IOMMU based DMA OPS because QEMU
> + * expects GPA instead of an IOVA in absence of VIRTIO_F_IOMMU_PLATFORM.
> + */
> +dma_addr_t virtio_direct_map_page(struct device *dev, struct page *page,
> + unsigned long offset, size_t size,
> + enum dma_data_direction dir,
> + unsigned long attrs)
All these functions should probably be marked static.
> +void virtio_direct_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> + size_t size, enum dma_data_direction dir,
> + unsigned long attrs)
> +{
> +}
No need to implement no-op callbacks in struct dma_map_ops.
> +
> +int virtio_direct_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
> +{
> + return 0;
> +}
Including this one.
> +void *virtio_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> + gfp_t gfp, unsigned long attrs)
> +{
> + void *queue = alloc_pages_exact(PAGE_ALIGN(size), gfp);
> +
> + if (queue) {
> + phys_addr_t phys_addr = virt_to_phys(queue);
> + *dma_handle = (dma_addr_t)phys_addr;
> +
> + if (WARN_ON_ONCE(*dma_handle != phys_addr)) {
> + free_pages_exact(queue, PAGE_ALIGN(size));
> + return NULL;
> + }
> + }
> + return queue;
queue is a very odd name in a generic memory allocator.
> +void virtio_direct_free(struct device *dev, size_t size, void *vaddr,
> + dma_addr_t dma_addr, unsigned long attrs)
> +{
> + free_pages_exact(vaddr, PAGE_ALIGN(size));
> +}
> +
> +const struct dma_map_ops virtio_direct_dma_ops = {
> + .alloc = virtio_direct_alloc,
> + .free = virtio_direct_free,
> + .map_page = virtio_direct_map_page,
> + .unmap_page = virtio_direct_unmap_page,
> + .mapping_error = virtio_direct_mapping_error,
> +};
This is missing a dma_map_sg implementation. In general this is
mandatory for dma_ops. So either you implement it or explain in
a common why you think you can skip it.
> +EXPORT_SYMBOL(virtio_direct_dma_ops);
EXPORT_SYMBOL_GPL like all virtio symbols, please.
^ permalink raw reply
* Re: [PATCH v2 2/2] virtio_balloon: replace oom notifier with shrinker
From: Michal Hocko @ 2018-07-30 9:00 UTC (permalink / raw)
To: Wei Wang; +Cc: virtio-dev, mst, linux-kernel, virtualization, linux-mm, akpm
In-Reply-To: <1532683495-31974-3-git-send-email-wei.w.wang@intel.com>
On Fri 27-07-18 17:24:55, Wei Wang wrote:
> The OOM notifier is getting deprecated to use for the reasons mentioned
> here by Michal Hocko: https://lkml.org/lkml/2018/7/12/314
>
> This patch replaces the virtio-balloon oom notifier with a shrinker
> to release balloon pages on memory pressure.
It would be great to document the replacement. This is not a small
change...
> In addition, the bug in the replaced virtballoon_oom_notify that only
> VIRTIO_BALLOON_ARRAY_PFNS_MAX (i.e 256) balloon pages can be freed
> though the user has specified more than that number is fixed in the
> shrinker_scan function.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
> drivers/virtio/virtio_balloon.c | 115 +++++++++++++++++++++++-----------------
> 1 file changed, 65 insertions(+), 50 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 9356a1a..6b2229b 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -27,7 +27,6 @@
> #include <linux/slab.h>
> #include <linux/module.h>
> #include <linux/balloon_compaction.h>
> -#include <linux/oom.h>
> #include <linux/wait.h>
> #include <linux/mm.h>
> #include <linux/mount.h>
> @@ -40,12 +39,12 @@
> */
> #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
> #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define OOM_VBALLOON_DEFAULT_PAGES 256
> +#define DEFAULT_BALLOON_PAGES_TO_SHRINK 256
> #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
>
> -static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> -module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> -MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
> +static unsigned long balloon_pages_to_shrink = DEFAULT_BALLOON_PAGES_TO_SHRINK;
> +module_param(balloon_pages_to_shrink, ulong, 0600);
> +MODULE_PARM_DESC(balloon_pages_to_shrink, "pages to free on memory presure");
>
> #ifdef CONFIG_BALLOON_COMPACTION
> static struct vfsmount *balloon_mnt;
> @@ -86,8 +85,8 @@ struct virtio_balloon {
> /* Memory statistics */
> struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>
> - /* To register callback in oom notifier call chain */
> - struct notifier_block nb;
> + /* To register a shrinker to shrink memory upon memory pressure */
> + struct shrinker shrinker;
> };
>
> static struct virtio_device_id id_table[] = {
> @@ -365,38 +364,6 @@ static void update_balloon_size(struct virtio_balloon *vb)
> &actual);
> }
>
> -/*
> - * virtballoon_oom_notify - release pages when system is under severe
> - * memory pressure (called from out_of_memory())
> - * @self : notifier block struct
> - * @dummy: not used
> - * @parm : returned - number of freed pages
> - *
> - * The balancing of memory by use of the virtio balloon should not cause
> - * the termination of processes while there are pages in the balloon.
> - * If virtio balloon manages to release some memory, it will make the
> - * system return and retry the allocation that forced the OOM killer
> - * to run.
> - */
> -static int virtballoon_oom_notify(struct notifier_block *self,
> - unsigned long dummy, void *parm)
> -{
> - struct virtio_balloon *vb;
> - unsigned long *freed;
> - unsigned num_freed_pages;
> -
> - vb = container_of(self, struct virtio_balloon, nb);
> - if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> - return NOTIFY_OK;
> -
> - freed = parm;
> - num_freed_pages = leak_balloon(vb, oom_pages);
> - update_balloon_size(vb);
> - *freed += num_freed_pages;
> -
> - return NOTIFY_OK;
> -}
> -
> static void update_balloon_stats_func(struct work_struct *work)
> {
> struct virtio_balloon *vb;
> @@ -548,6 +515,54 @@ static struct file_system_type balloon_fs = {
>
> #endif /* CONFIG_BALLOON_COMPACTION */
>
> +static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
> + struct shrink_control *sc)
> +{
> + unsigned long pages_to_free = balloon_pages_to_shrink,
> + pages_freed = 0;
> + struct virtio_balloon *vb = container_of(shrinker,
> + struct virtio_balloon, shrinker);
> +
> + /*
> + * One invocation of leak_balloon can deflate at most
> + * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> + * multiple times to deflate pages till reaching
> + * balloon_pages_to_shrink pages.
> + */
> + while (vb->num_pages && pages_to_free) {
> + pages_to_free = balloon_pages_to_shrink - pages_freed;
> + pages_freed += leak_balloon(vb, pages_to_free);
> + }
> + update_balloon_size(vb);
> +
> + return pages_freed / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +}
> +
> +static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> + struct shrink_control *sc)
> +{
> + struct virtio_balloon *vb = container_of(shrinker,
> + struct virtio_balloon, shrinker);
> +
> + return min_t(unsigned long, vb->num_pages, balloon_pages_to_shrink) /
> + VIRTIO_BALLOON_PAGES_PER_PAGE;
> +}
> +
> +static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
> +{
> + unregister_shrinker(&vb->shrinker);
> +}
> +
> +static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> +{
> + vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> + vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> + vb->shrinker.batch = 0;
> + vb->shrinker.seeks = DEFAULT_SEEKS;
> +
> + return register_shrinker(&vb->shrinker);
> +}
> +
> static int virtballoon_probe(struct virtio_device *vdev)
> {
> struct virtio_balloon *vb;
> @@ -580,17 +595,10 @@ static int virtballoon_probe(struct virtio_device *vdev)
> if (err)
> goto out_free_vb;
>
> - vb->nb.notifier_call = virtballoon_oom_notify;
> - vb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY;
> - err = register_oom_notifier(&vb->nb);
> - if (err < 0)
> - goto out_del_vqs;
> -
> #ifdef CONFIG_BALLOON_COMPACTION
> balloon_mnt = kern_mount(&balloon_fs);
> if (IS_ERR(balloon_mnt)) {
> err = PTR_ERR(balloon_mnt);
> - unregister_oom_notifier(&vb->nb);
> goto out_del_vqs;
> }
>
> @@ -599,13 +607,20 @@ static int virtballoon_probe(struct virtio_device *vdev)
> if (IS_ERR(vb->vb_dev_info.inode)) {
> err = PTR_ERR(vb->vb_dev_info.inode);
> kern_unmount(balloon_mnt);
> - unregister_oom_notifier(&vb->nb);
> vb->vb_dev_info.inode = NULL;
> goto out_del_vqs;
> }
> vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
> #endif
> -
> + /*
> + * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
> + * shrinker needs to be registered to relieve memory pressure.
> + */
> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> + err = virtio_balloon_register_shrinker(vb);
> + if (err)
> + goto out_del_vqs;
> + }
> virtio_device_ready(vdev);
>
> if (towards_target(vb))
> @@ -637,8 +652,8 @@ static void virtballoon_remove(struct virtio_device *vdev)
> {
> struct virtio_balloon *vb = vdev->priv;
>
> - unregister_oom_notifier(&vb->nb);
> -
> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> + virtio_balloon_unregister_shrinker(vb);
> spin_lock_irq(&vb->stop_update_lock);
> vb->stop_update = true;
> spin_unlock_irq(&vb->stop_update_lock);
> --
> 2.7.4
--
Michal Hocko
SUSE Labs
^ permalink raw reply
* Re: [PATCH 2/2] tools/virtio: add kmalloc_array stub
From: Jason Wang @ 2018-07-30 7:28 UTC (permalink / raw)
To: Michael S. Tsirkin, linux-kernel; +Cc: virtualization, khandual
In-Reply-To: <20180725134057.113423-2-mst@redhat.com>
On 2018年07月25日 21:45, Michael S. Tsirkin wrote:
> Fixes: 6da2ec56059 ("treewide: kmalloc() -> kmalloc_array()")
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> tools/virtio/linux/kernel.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
> index fca8381bbe04..fb22bccfbc8a 100644
> --- a/tools/virtio/linux/kernel.h
> +++ b/tools/virtio/linux/kernel.h
> @@ -52,6 +52,11 @@ static inline void *kmalloc(size_t s, gfp_t gfp)
> return __kmalloc_fake;
> return malloc(s);
> }
> +static inline void *kmalloc_array(unsigned n, size_t s, gfp_t gfp)
> +{
> + return kmalloc(n * s, gfp);
> +}
> +
> static inline void *kzalloc(size_t s, gfp_t gfp)
> {
> void *p = kmalloc(s, gfp);
Reviewed-by: Jason Wang <jasowang@redhat.com>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [PATCH 1/2] tools/virtio: add dma barrier stubs
From: Jason Wang @ 2018-07-30 7:28 UTC (permalink / raw)
To: Michael S. Tsirkin, linux-kernel; +Cc: virtualization, khandual
In-Reply-To: <20180725134057.113423-1-mst@redhat.com>
On 2018年07月25日 21:45, Michael S. Tsirkin wrote:
> Fixes: 55e49dc43a8 ("virtio_ring: switch to dma_XX barriers for rpmsg")
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> tools/virtio/asm/barrier.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/virtio/asm/barrier.h b/tools/virtio/asm/barrier.h
> index 0ac3caf90877..d0351f83aebe 100644
> --- a/tools/virtio/asm/barrier.h
> +++ b/tools/virtio/asm/barrier.h
> @@ -13,8 +13,8 @@
> } while (0);
> /* Weak barriers should be used. If not - it's a bug */
> # define mb() abort()
> -# define rmb() abort()
> -# define wmb() abort()
> +# define dma_rmb() abort()
> +# define dma_wmb() abort()
> #else
> #error Please fill in barrier macros
> #endif
Reviewed-by: Jason Wang <jasowang@redhat.com>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Anshuman Khandual @ 2018-07-30 4:15 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization, hch,
paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180729001344-mutt-send-email-mst@kernel.org>
On 07/29/2018 02:46 AM, Michael S. Tsirkin wrote:
> On Sat, Jul 28, 2018 at 02:26:24PM +0530, Anshuman Khandual wrote:
>> On 07/20/2018 09:29 AM, Anshuman Khandual wrote:
>>> Now that virtio core always needs all virtio devices to have DMA OPS, we
>>> need to make sure that the structure it points is the right one. In the
>>> absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel.
>>> In such case, virtio device must use default virtio_direct_dma_ops DMA OPS
>>> structure which transforms scatter gather buffer addresses as GPA. This
>>> DMA OPS override must happen as early as possible during virtio device
>>> initializatin sequence before virtio core starts using given device's DMA
>>> OPS callbacks for I/O transactions. This change detects device's IOMMU flag
>>> and does the override in case the flag is cleared.
>>>
>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>> ---
>>> drivers/virtio/virtio.c | 5 +++++
>>> 1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>>> index 7907ad3..6b13987 100644
>>> --- a/drivers/virtio/virtio.c
>>> +++ b/drivers/virtio/virtio.c
>>> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
>>> }
>>> EXPORT_SYMBOL_GPL(virtio_add_status);
>>>
>>> +const struct dma_map_ops virtio_direct_dma_ops;
>>> +
>>> int virtio_finalize_features(struct virtio_device *dev)
>>> {
>>> int ret = dev->config->finalize_features(dev);
>>> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>>> if (ret)
>>> return ret;
>>
>>
>> The previous patch removed the code block for XEN guests which forced
>> the use of DMA API all the time irrespective of VIRTIO_F_IOMMU_PLATFORM
>> flag on the device. Here is what I have removed with patch 2/4 which
>> breaks the existing semantics on XEN guests.
>>
>> -static bool vring_use_dma_api(struct virtio_device *vdev)
>> -{
>> - if (!virtio_has_iommu_quirk(vdev))
>> - return true;
>> -
>> - /* Otherwise, we are left to guess. */
>> - /*
>> - * In theory, it's possible to have a buggy QEMU-supposed
>> - * emulated Q35 IOMMU and Xen enabled at the same time. On
>> - * such a configuration, virtio has never worked and will
>> - * not work without an even larger kludge. Instead, enable
>> - * the DMA API if we're a Xen guest, which at least allows
>> - * all of the sensible Xen configurations to work correctly.
>> - */
>> - if (xen_domain())
>> - return true;
>> -
>> - return false;
>> -}
>>
>> XEN guests would not like override with virtio_direct_dma_ops in any
>> case irrespective of the flag VIRTIO_F_IOMMU_PLATFORM. So the existing
>> semantics can be preserved with something like this. It just assumes
>> that dev->dma_ops is non-NULL and a valid one set by the architecture.
>> If required we can add those tests here before skipping the override.
>>
>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>> index 7907ad3..6b13987 100644
>> --- a/drivers/virtio/virtio.c
>> +++ b/drivers/virtio/virtio.c
>> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
>> }
>> EXPORT_SYMBOL_GPL(virtio_add_status);
>>
>> +const struct dma_map_ops virtio_direct_dma_ops;
>> +
>> int virtio_finalize_features(struct virtio_device *dev)
>> {
>> int ret = dev->config->finalize_features(dev);
>> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>> if (ret)
>> return ret;
>> +
>> + if (xen_domain())
>> + goto skip_override;
>> +
>> + if (virtio_has_iommu_quirk(dev))
>> + set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
>> +
>> + skip_override:
>> +
>
> I prefer normal if scoping as opposed to goto spaghetti pls.
> Better yet move vring_use_dma_api here and use it.
> Less of a chance something will break.
Sure, will move vring_use_dma_api() function in here.
^ permalink raw reply
* Re: [PATCH net-next v6 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll()
From: Jason Wang @ 2018-07-30 3:16 UTC (permalink / raw)
To: Tonghao Zhang, makita.toshiaki
Cc: Linux Kernel Network Developers, toshiaki.makita1, virtualization,
mst
In-Reply-To: <CAMDZJNVVJs35kuvktTxn+mmDz7db+1K-kfuOoMUn9Z=WoayUVw@mail.gmail.com>
On 2018年07月24日 11:28, Tonghao Zhang wrote:
> On Tue, Jul 24, 2018 at 10:53 AM Toshiaki Makita
> <makita.toshiaki@lab.ntt.co.jp> wrote:
>> On 2018/07/24 2:31, Tonghao Zhang wrote:
>>> On Mon, Jul 23, 2018 at 10:20 PM Toshiaki Makita
>>> <toshiaki.makita1@gmail.com> wrote:
>>>> On 18/07/23 (月) 21:43, Tonghao Zhang wrote:
>>>>> On Mon, Jul 23, 2018 at 5:58 PM Toshiaki Makita
>>>>> <makita.toshiaki@lab.ntt.co.jp> wrote:
>>>>>> On 2018/07/22 3:04,xiangxia.m.yue@gmail.com wrote:
>>>>>>> From: Tonghao Zhang<xiangxia.m.yue@gmail.com>
>>>>>>>
>>>>>>> Factor out generic busy polling logic and will be
>>>>>>> used for in tx path in the next patch. And with the patch,
>>>>>>> qemu can set differently the busyloop_timeout for rx queue.
>>>>>>>
>>>>>>> Signed-off-by: Tonghao Zhang<xiangxia.m.yue@gmail.com>
>>>>>>> ---
>>>>>> ...
>>>>>>> +static void vhost_net_busy_poll_vq_check(struct vhost_net *net,
>>>>>>> + struct vhost_virtqueue *rvq,
>>>>>>> + struct vhost_virtqueue *tvq,
>>>>>>> + bool rx)
>>>>>>> +{
>>>>>>> + struct socket *sock = rvq->private_data;
>>>>>>> +
>>>>>>> + if (rx) {
>>>>>>> + if (!vhost_vq_avail_empty(&net->dev, tvq)) {
>>>>>>> + vhost_poll_queue(&tvq->poll);
>>>>>>> + } else if (unlikely(vhost_enable_notify(&net->dev, tvq))) {
>>>>>>> + vhost_disable_notify(&net->dev, tvq);
>>>>>>> + vhost_poll_queue(&tvq->poll);
>>>>>>> + }
>>>>>>> + } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>>>> + !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>>> + vhost_poll_queue(&rvq->poll);
>>>>>> Now we wait for vq_avail for rx as well, I think you cannot skip
>>>>>> vhost_enable_notify() on tx. Probably you might want to do:
>>>>> I think vhost_enable_notify is needed.
>>>>>
>>>>>> } else if (sock && sk_has_rx_data(sock->sk)) {
>>>>>> if (!vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>> vhost_poll_queue(&rvq->poll);
>>>>>> } else if (unlikely(vhost_enable_notify(&net->dev, rvq))) {
>>>>>> vhost_disable_notify(&net->dev, rvq);
>>>>>> vhost_poll_queue(&rvq->poll);
>>>>>> }
>>>>>> }
>>>>> As Jason review as before, we only want rx kick when packet is pending at
>>>>> socket but we're out of available buffers. So we just enable notify,
>>>>> but not poll it ?
>>>>>
>>>>> } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>> !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>> vhost_poll_queue(&rvq->poll);
>>>>> else {
>>>>> vhost_enable_notify(&net->dev, rvq);
>>>>> }
>>>> When vhost_enable_notify() returns true the avail becomes non-empty
>>>> while we are enabling notify. We may delay the rx process if we don't
>>>> check the return value of vhost_enable_notify().
>>> I got it thanks.
>>>>>> Also it's better to care vhost_net_disable_vq()/vhost_net_enable_vq() on tx?
>>>>> I cant find why it is better, if necessary, we can do it.
>>>> The reason is pretty simple... we are busypolling the socket so we don't
>>>> need rx wakeups during it?
>>> OK, but one question, how about rx? do we use the
>>> vhost_net_disable_vq/vhost_net_ensable_vq on rx ?
>> If we are busypolling the sock tx buf? I'm not sure if polling it
>> improves the performance.
> Not the sock tx buff, when we are busypolling in handle_rx, we will
> check the tx vring via vhost_vq_avail_empty.
> So, should we the disable tvq, e.g. vhost_net_disable_vq(net, tvq)?> --
This could be done on top since tx wakeups only happnes when we run out
of sndbuf.
Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [PATCH net-next v6 1/4] net: vhost: lock the vqs one by one
From: Jason Wang @ 2018-07-30 2:54 UTC (permalink / raw)
To: Tonghao Zhang, mst; +Cc: Linux Kernel Network Developers, virtualization
In-Reply-To: <CAMDZJNX41vtdNNEAxHYwC+WcrJFkON70hVumVE9rbFDBC5QUOQ@mail.gmail.com>
On 2018年07月25日 20:05, Tonghao Zhang wrote:
> On Sun, Jul 22, 2018 at 11:26 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Sat, Jul 21, 2018 at 11:03:59AM -0700, xiangxia.m.yue@gmail.com wrote:
>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>
>>> This patch changes the way that lock all vqs
>>> at the same, to lock them one by one. It will
>>> be used for next patch to avoid the deadlock.
>>>
>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>> drivers/vhost/vhost.c | 24 +++++++-----------------
>>> 1 file changed, 7 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>> index a502f1a..a1c06e7 100644
>>> --- a/drivers/vhost/vhost.c
>>> +++ b/drivers/vhost/vhost.c
>>> @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
>>> {
>>> int i;
>>>
>>> - for (i = 0; i < d->nvqs; ++i)
>>> + for (i = 0; i < d->nvqs; ++i) {
>>> + mutex_lock(&d->vqs[i]->mutex);
>>> __vhost_vq_meta_reset(d->vqs[i]);
>>> + mutex_unlock(&d->vqs[i]->mutex);
>>> + }
>>> }
>>>
>>> static void vhost_vq_reset(struct vhost_dev *dev,
>>> @@ -890,20 +893,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
>>> #define vhost_get_used(vq, x, ptr) \
>>> vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
>>>
>>> -static void vhost_dev_lock_vqs(struct vhost_dev *d)
>>> -{
>>> - int i = 0;
>>> - for (i = 0; i < d->nvqs; ++i)
>>> - mutex_lock_nested(&d->vqs[i]->mutex, i);
>>> -}
>>> -
>>> -static void vhost_dev_unlock_vqs(struct vhost_dev *d)
>>> -{
>>> - int i = 0;
>>> - for (i = 0; i < d->nvqs; ++i)
>>> - mutex_unlock(&d->vqs[i]->mutex);
>>> -}
>>> -
>>> static int vhost_new_umem_range(struct vhost_umem *umem,
>>> u64 start, u64 size, u64 end,
>>> u64 userspace_addr, int perm)
>>> @@ -953,7 +942,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>>> if (msg->iova <= vq_msg->iova &&
>>> msg->iova + msg->size - 1 > vq_msg->iova &&
>>> vq_msg->type == VHOST_IOTLB_MISS) {
>>> + mutex_lock(&node->vq->mutex);
>>> vhost_poll_queue(&node->vq->poll);
>>> + mutex_unlock(&node->vq->mutex);
>>> +
>>> list_del(&node->node);
>>> kfree(node);
>>> }
>>> @@ -985,7 +977,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>>> int ret = 0;
>>>
>>> mutex_lock(&dev->mutex);
>>> - vhost_dev_lock_vqs(dev);
>>> switch (msg->type) {
>>> case VHOST_IOTLB_UPDATE:
>>> if (!dev->iotlb) {
>>> @@ -1019,7 +1010,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>>> break;
>>> }
>>>
>>> - vhost_dev_unlock_vqs(dev);
>>> mutex_unlock(&dev->mutex);
>>>
>>> return ret;
>> I do prefer the finer-grained locking but I remember we
>> discussed something like this in the past and Jason saw issues
>> with such a locking.
> This change is suggested by Jason. Should I send new version because
> the patch 3 is changed.
>
>> Jason?
Actually, the code was a little bit tricky here. Since it assumes
handle_tx() and handle_rx() run on a single thread. Though the lock
ordering is different, it was still safe.
Maybe we can add some comments to explain this.
Thanks
>>
>>> --
>>> 1.8.3.1
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Michael S. Tsirkin @ 2018-07-28 21:16 UTC (permalink / raw)
To: Anshuman Khandual
Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization, hch,
paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <c443ad8c-fb81-302d-edb2-5521831d38da@linux.vnet.ibm.com>
On Sat, Jul 28, 2018 at 02:26:24PM +0530, Anshuman Khandual wrote:
> On 07/20/2018 09:29 AM, Anshuman Khandual wrote:
> > Now that virtio core always needs all virtio devices to have DMA OPS, we
> > need to make sure that the structure it points is the right one. In the
> > absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel.
> > In such case, virtio device must use default virtio_direct_dma_ops DMA OPS
> > structure which transforms scatter gather buffer addresses as GPA. This
> > DMA OPS override must happen as early as possible during virtio device
> > initializatin sequence before virtio core starts using given device's DMA
> > OPS callbacks for I/O transactions. This change detects device's IOMMU flag
> > and does the override in case the flag is cleared.
> >
> > Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> > ---
> > drivers/virtio/virtio.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > index 7907ad3..6b13987 100644
> > --- a/drivers/virtio/virtio.c
> > +++ b/drivers/virtio/virtio.c
> > @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
> > }
> > EXPORT_SYMBOL_GPL(virtio_add_status);
> >
> > +const struct dma_map_ops virtio_direct_dma_ops;
> > +
> > int virtio_finalize_features(struct virtio_device *dev)
> > {
> > int ret = dev->config->finalize_features(dev);
> > @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
> > if (ret)
> > return ret;
>
>
> The previous patch removed the code block for XEN guests which forced
> the use of DMA API all the time irrespective of VIRTIO_F_IOMMU_PLATFORM
> flag on the device. Here is what I have removed with patch 2/4 which
> breaks the existing semantics on XEN guests.
>
> -static bool vring_use_dma_api(struct virtio_device *vdev)
> -{
> - if (!virtio_has_iommu_quirk(vdev))
> - return true;
> -
> - /* Otherwise, we are left to guess. */
> - /*
> - * In theory, it's possible to have a buggy QEMU-supposed
> - * emulated Q35 IOMMU and Xen enabled at the same time. On
> - * such a configuration, virtio has never worked and will
> - * not work without an even larger kludge. Instead, enable
> - * the DMA API if we're a Xen guest, which at least allows
> - * all of the sensible Xen configurations to work correctly.
> - */
> - if (xen_domain())
> - return true;
> -
> - return false;
> -}
>
> XEN guests would not like override with virtio_direct_dma_ops in any
> case irrespective of the flag VIRTIO_F_IOMMU_PLATFORM. So the existing
> semantics can be preserved with something like this. It just assumes
> that dev->dma_ops is non-NULL and a valid one set by the architecture.
> If required we can add those tests here before skipping the override.
>
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 7907ad3..6b13987 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
> }
> EXPORT_SYMBOL_GPL(virtio_add_status);
>
> +const struct dma_map_ops virtio_direct_dma_ops;
> +
> int virtio_finalize_features(struct virtio_device *dev)
> {
> int ret = dev->config->finalize_features(dev);
> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
> if (ret)
> return ret;
> +
> + if (xen_domain())
> + goto skip_override;
> +
> + if (virtio_has_iommu_quirk(dev))
> + set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
> +
> + skip_override:
> +
I prefer normal if scoping as opposed to goto spaghetti pls.
Better yet move vring_use_dma_api here and use it.
Less of a chance something will break.
> if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1))
> return 0
>
> Will incorporate these changes in the next version.
^ permalink raw reply
* Call for Workshops Proposals - WorldCIST'19, La Toja Island, Spain
From: Maria Lemos @ 2018-07-28 13:12 UTC (permalink / raw)
To: virtualization
[-- Attachment #1.1: Type: text/plain, Size: 4413 bytes --]
----------------- CALL FOR WORKSHOPS PROPOSALS --------------------
WorldCIST'19 - 7th World Conference on Information Systems and Technologies
16th-19th of April 2019, La Toja Island, Galicia, Spain
http://www.worldcist.org/ <http://www.worldcist.org/>
-----------------------------------------------------------------------------------
The Information Systems and Technologies research and industrial community is invited to submit proposals for the organization of Workshops at WorldCist'19 - 7th World Conference on Information Systems and Technologies, to be held at La Toja Island, Galicia, Spain, 16 - 19 April 2019. WorldCist is a global forum for researchers and practitioners to present and discuss the most recent innovations, trends, results, experiences and concerns in the several perspectives of Information Systems and Technologies.
###############
WORKSHOP FORMAT
###############
Workshops should focus on a specific scientific subject on the scope of WorldCist'19 but not directly included on the main conference areas. Each workshop will be coordinated by an Organizing Committee composed of, at least, two researchers in the field, preferably from different institutions and different countries. The organizers should create an international Program Committee for the Workshop, with recognized researchers within the specific Workshop scientific area. Each workshop should have at least ten submissions and five accepted papers in order to be conducted at WorldCist'19.
The selection of Workshops will be performed by WorldCist'19 Conference/Workshop Chairs. Workshops full and short papers will be published in the conference main proceedings in specific Workshop chapters published by Springer in a book of the AISC series. Proceedings will be submitted for indexation by ISI Thomson, SCOPUS, DBLP, EI-Compendex among several other scientific databases. Extended versions of best selected papers will be published in journals indexed by ISI/SCI, SCOPUS and DBLP. Detailed and up-to-date information may be found at WorldCist'19 website: http://www.worldcist.org/ <http://www.worldcist.org/>
#####################
WORKSHOP ORGANIZATION
#####################
The Organizing Committee of each Workshop will be responsible for:
- Producing and distributing the Workshop Call for Papers (CFP);
- Coordinating the review and selection process for the papers submitted to the Workshop, as Workshop chairs (on the paper submission system to be installed);
- Delivering the final versions of the papers accepted for the Workshop in accordance with the guidelines and deadlines defined by WorldCist'19 organizers;
- Coordinating and chairing the Workshop sessions at the conference.
WorldCist'19 organizers reserve the right to cancel any Workshop if deadlines are missed or if the number of registered attendees is too low to support the costs associated with the Workshop.
################
PROPOSAL CONTENT
################
Workshop proposals should contain the following information:
- Workshop title;
- Brief description of the specific scientific scope of the Workshop;
- List of topics of interest (max 15 topics);
- Reasons the Workshop should be held within WorldCist’19;
- Name, postal address, phone and email of all the members of the Workshop Organizing Committee;
- Preliminary proposal for the Workshop Program Committee (Names and affiliations).
Proposals should be submitted at https://easychair.org/conferences/?conf=worldcist-workshops2019 <https://easychair.org/conferences/?conf=worldcist-workshops2019> in PDF (in English), by September 10, 2018.
###############
IMPORTANT DATES
###############
- Deadline for Workshop proposals: September 10, 2018
- Notification of Workshop acceptance: September 20, 2018
- Workshop Final Information and Program Committee: October 10, 2018
- Deadline for paper submission: November 30, 2018
- Notification of paper acceptance: January 6, 2019
- Deadline for final versions and conference registration: January 20, 2019
- Conference dates: April 16-19, 2019
#####
CHAIR
#####
Luis Paulo Reis, AISTI, IEEE & University of Porto, Portugal
WorldCIST'19 Website: http://www.worldcist.org/ <http://www.worldcist.org/>
---
This email has been checked for viruses by AVG.
https://www.avg.com
[-- Attachment #1.2: Type: text/html, Size: 6036 bytes --]
[-- Attachment #2: Type: text/plain, Size: 183 bytes --]
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Anshuman Khandual @ 2018-07-28 8:56 UTC (permalink / raw)
To: virtualization, linux-kernel
Cc: robh, srikar, mst, benh, linuxram, hch, paulus, mpe, joe,
linuxppc-dev, elfring, haren, david
In-Reply-To: <20180720035941.6844-3-khandual@linux.vnet.ibm.com>
On 07/20/2018 09:29 AM, Anshuman Khandual wrote:
> Now that virtio core always needs all virtio devices to have DMA OPS, we
> need to make sure that the structure it points is the right one. In the
> absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel.
> In such case, virtio device must use default virtio_direct_dma_ops DMA OPS
> structure which transforms scatter gather buffer addresses as GPA. This
> DMA OPS override must happen as early as possible during virtio device
> initializatin sequence before virtio core starts using given device's DMA
> OPS callbacks for I/O transactions. This change detects device's IOMMU flag
> and does the override in case the flag is cleared.
>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
> drivers/virtio/virtio.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 7907ad3..6b13987 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
> }
> EXPORT_SYMBOL_GPL(virtio_add_status);
>
> +const struct dma_map_ops virtio_direct_dma_ops;
> +
> int virtio_finalize_features(struct virtio_device *dev)
> {
> int ret = dev->config->finalize_features(dev);
> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
> if (ret)
> return ret;
The previous patch removed the code block for XEN guests which forced
the use of DMA API all the time irrespective of VIRTIO_F_IOMMU_PLATFORM
flag on the device. Here is what I have removed with patch 2/4 which
breaks the existing semantics on XEN guests.
-static bool vring_use_dma_api(struct virtio_device *vdev)
-{
- if (!virtio_has_iommu_quirk(vdev))
- return true;
-
- /* Otherwise, we are left to guess. */
- /*
- * In theory, it's possible to have a buggy QEMU-supposed
- * emulated Q35 IOMMU and Xen enabled at the same time. On
- * such a configuration, virtio has never worked and will
- * not work without an even larger kludge. Instead, enable
- * the DMA API if we're a Xen guest, which at least allows
- * all of the sensible Xen configurations to work correctly.
- */
- if (xen_domain())
- return true;
-
- return false;
-}
XEN guests would not like override with virtio_direct_dma_ops in any
case irrespective of the flag VIRTIO_F_IOMMU_PLATFORM. So the existing
semantics can be preserved with something like this. It just assumes
that dev->dma_ops is non-NULL and a valid one set by the architecture.
If required we can add those tests here before skipping the override.
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 7907ad3..6b13987 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
}
EXPORT_SYMBOL_GPL(virtio_add_status);
+const struct dma_map_ops virtio_direct_dma_ops;
+
int virtio_finalize_features(struct virtio_device *dev)
{
int ret = dev->config->finalize_features(dev);
@@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
if (ret)
return ret;
+
+ if (xen_domain())
+ goto skip_override;
+
+ if (virtio_has_iommu_quirk(dev))
+ set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
+
+ skip_override:
+
if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1))
return 0
Will incorporate these changes in the next version.
^ permalink raw reply related
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Anshuman Khandual @ 2018-07-28 8:37 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: robh, srikar, linuxram, linux-kernel, virtualization, hch, paulus,
joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180727143008-mutt-send-email-mst@kernel.org>
On 07/27/2018 05:01 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 25, 2018 at 08:56:23AM +0530, Anshuman Khandual wrote:
>> Results with and without the patches are similar.
>
> Thanks! And another thing to try is virtio-net with
> a fast NIC backend (40G and up). Unfortunately
> at this point loopback tests stress the host
> scheduler too much.
>
Sure. Will look around for a 40G NIC system. BTW I have been testing
virtio-net with a TAP device as back end.
ip tuntap add dev tap0 mode tap user $(whoami)
ip link set tap0 master virbr0
ip link set dev virbr0 up
ip link set dev tap0 up
which is exported into the guest as follows
-device virtio-net,netdev=network0,mac=52:55:00:d1:55:01 \
-netdev tap,id=network0,ifname=tap0,script=no,downscript=no \
But I have not run any network benchmarks on it though.
^ permalink raw reply
* RE: [PATCH v2 0/2] virtio-balloon: some improvements
From: Wang, Wei W @ 2018-07-28 2:00 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
mhocko@kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org,
virtualization@lists.linux-foundation.org
In-Reply-To: <20180727170605-mutt-send-email-mst@kernel.org>
On Friday, July 27, 2018 10:06 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 27, 2018 at 05:24:53PM +0800, Wei Wang wrote:
> > This series is split from the "Virtio-balloon: support free page
> > reporting" series to make some improvements.
> >
> > v1->v2 ChangeLog:
> > - register the shrinker when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is
> negotiated.
> >
> > Wei Wang (2):
> > virtio-balloon: remove BUG() in init_vqs
> > virtio_balloon: replace oom notifier with shrinker
>
> Thanks!
> Given it's very late in the release cycle, I'll merge this for the next Linux
> release.
No problem. Thanks!
Best,
Wei
^ permalink raw reply
* [PATCH] drm: qxl: Fix NULL pointer dereference at qxl_alloc_client_monitors_config
From: Anton Vasilyev @ 2018-07-27 15:30 UTC (permalink / raw)
To: Dave Airlie
Cc: ldv-project, David Airlie, linux-kernel, dri-devel,
virtualization, Anton Vasilyev
If qxl_alloc_client_monitors_config() fails to allocate
client_monitors_config then NULL pointer dereference occurs
in function qxl_display_copy_rom_client_monitors_config() after
qxl_alloc_client_monitors_config() call.
The patch adds return error from qxl_alloc_client_monitors_config()
and additional status for qxl_display_copy_rom_client_monitors_config
return value.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
---
Note: Is it correct that qxl_display_read_client_monitors_config() does not
return error in case of fail?
---
drivers/gpu/drm/qxl/qxl_display.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index 768207fbbae3..a59b2eca5f5b 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -37,7 +37,8 @@ static bool qxl_head_enabled(struct qxl_head *head)
return head->width && head->height;
}
-static void qxl_alloc_client_monitors_config(struct qxl_device *qdev, unsigned count)
+static int qxl_alloc_client_monitors_config(struct qxl_device *qdev,
+ unsigned int count)
{
if (qdev->client_monitors_config &&
count > qdev->client_monitors_config->count) {
@@ -49,15 +50,17 @@ static void qxl_alloc_client_monitors_config(struct qxl_device *qdev, unsigned c
sizeof(struct qxl_monitors_config) +
sizeof(struct qxl_head) * count, GFP_KERNEL);
if (!qdev->client_monitors_config)
- return;
+ return -ENOMEM;
}
qdev->client_monitors_config->count = count;
+ return 0;
}
enum {
MONITORS_CONFIG_MODIFIED,
MONITORS_CONFIG_UNCHANGED,
MONITORS_CONFIG_BAD_CRC,
+ MONITORS_CONFIG_ERROR,
};
static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
@@ -87,7 +90,10 @@ static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
&& (num_monitors != qdev->client_monitors_config->count)) {
status = MONITORS_CONFIG_MODIFIED;
}
- qxl_alloc_client_monitors_config(qdev, num_monitors);
+ if (qxl_alloc_client_monitors_config(qdev, num_monitors)) {
+ status = MONITORS_CONFIG_ERROR;
+ return status;
+ }
/* we copy max from the client but it isn't used */
qdev->client_monitors_config->max_allowed =
qdev->monitors_config->max_allowed;
@@ -161,6 +167,10 @@ void qxl_display_read_client_monitors_config(struct qxl_device *qdev)
break;
udelay(5);
}
+ if (status == MONITORS_CONFIG_ERROR) {
+ DRM_DEBUG_KMS("ignoring client monitors config: error");
+ return;
+ }
if (status == MONITORS_CONFIG_BAD_CRC) {
DRM_DEBUG_KMS("ignoring client monitors config: bad crc");
return;
--
2.18.0
^ permalink raw reply related
* Re: [PATCH v2 0/2] virtio-balloon: some improvements
From: Michael S. Tsirkin @ 2018-07-27 14:06 UTC (permalink / raw)
To: Wei Wang; +Cc: virtio-dev, linux-kernel, mhocko, linux-mm, akpm, virtualization
In-Reply-To: <1532683495-31974-1-git-send-email-wei.w.wang@intel.com>
On Fri, Jul 27, 2018 at 05:24:53PM +0800, Wei Wang wrote:
> This series is split from the "Virtio-balloon: support free page
> reporting" series to make some improvements.
>
> v1->v2 ChangeLog:
> - register the shrinker when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is negotiated.
>
> Wei Wang (2):
> virtio-balloon: remove BUG() in init_vqs
> virtio_balloon: replace oom notifier with shrinker
Thanks!
Given it's very late in the release cycle, I'll merge this for
the next Linux release.
> drivers/virtio/virtio_balloon.c | 125 +++++++++++++++++++++++-----------------
> 1 file changed, 72 insertions(+), 53 deletions(-)
>
> --
> 2.7.4
^ permalink raw reply
* Re: net-next boot error
From: Michael S. Tsirkin @ 2018-07-27 14:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: amritha.nambiar, Marc Zyngier, netdev, Tetsuo Handa,
syzkaller-bugs, LKML, David Miller, Peter Zijlstra, Jason Baron,
Josh Poimboeuf, syzbot, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, virtualization, Ingo Molnar, Dmitry Vyukov
In-Reply-To: <20180726101748.7bbb4805@gandalf.local.home>
On Thu, Jul 26, 2018 at 10:17:48AM -0400, Steven Rostedt wrote:
>
> [ Added Thomas Gleixner ]
>
>
> On Thu, 26 Jul 2018 11:34:39 +0200
> Dmitry Vyukov <dvyukov@google.com> wrote:
>
> > On Thu, Jul 26, 2018 at 11:29 AM, syzbot
> > <syzbot+604f8271211546f5b3c7@syzkaller.appspotmail.com> wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: dc66fe43b7eb rds: send: Fix dead code in rds_sendmsg
> > > git tree: net-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=127874c8400000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=f34ce142a9f5f0e8
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=604f8271211546f5b3c7
> > > compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+604f8271211546f5b3c7@syzkaller.appspotmail.com
> > >
> > > possible deadlock in static_key_slow_incsd 0:0:1:0: [sda] Attached SCSI disk
> > > MACsec IEEE 802.1AE
> > > tun: Universal TUN/TAP device driver, 1.6
> > >
> > > ============================================
> > > WARNING: possible recursive locking detected
> >
> > +Tetsuo, perhaps this boot lockdep problem then disables lockdep for
> > actual testing. I think lockdep should respect panic_on_warn.
> >
> >
> > > 4.18.0-rc6+ #141 Not tainted
> > > --------------------------------------------
> > > swapper/0/1 is trying to acquire lock:
> > > (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at:
> > > static_key_slow_inc+0x12/0x30 kernel/jump_label.c:124
> > >
> > > but task is already holding lock:
> > > (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: get_online_cpus
> > > include/linux/cpu.h:126 [inline]
> > > (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0xe1a/0x1520
> > > drivers/net/virtio_net.c:2777
>
> Here init_vqs() does:
>
> get_online_cpus();
> virtnet_set_affinity(vi);
> put_online_cpus();
>
> Which disables cpu hotplug and calls virtnet_set_affinity()
>
> Note, get_online_cpus() is no longer recursive.
>
> > >
> > > other info that might help us debug this:
> > > Possible unsafe locking scenario:
> > >
> > > CPU0
> > > ----
> > > lock(cpu_hotplug_lock.rw_sem);
> > > lock(cpu_hotplug_lock.rw_sem);
> > >
> > > *** DEADLOCK ***
> > >
> > > May be due to missing lock nesting notation
> > >
> > > 3 locks held by swapper/0/1:
> > > #0: (____ptrval____) (&dev->mutex){....}, at: device_lock
> > > include/linux/device.h:1134 [inline]
> > > #0: (____ptrval____) (&dev->mutex){....}, at: __driver_attach+0x15f/0x2f0
> > > drivers/base/dd.c:820
> > > #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: get_online_cpus
> > > include/linux/cpu.h:126 [inline]
> > > #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at:
> > > init_vqs+0xe1a/0x1520 drivers/net/virtio_net.c:2777
> > > #2: (____ptrval____) (xps_map_mutex){+.+.}, at:
> > > __netif_set_xps_queue+0x243/0x23f0 net/core/dev.c:2278
> > >
> > > stack backtrace:
> > > CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc6+ #141
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > Call Trace:
> > > __dump_stack lib/dump_stack.c:77 [inline]
> > > dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
> > > print_deadlock_bug kernel/locking/lockdep.c:1765 [inline]
> > > check_deadlock kernel/locking/lockdep.c:1809 [inline]
> > > validate_chain kernel/locking/lockdep.c:2405 [inline]
> > > __lock_acquire.cold.65+0x1fb/0x486 kernel/locking/lockdep.c:3435
> > > lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
> > > percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:36 [inline]
> > > percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
> > > cpus_read_lock+0x43/0xa0 kernel/cpu.c:289
> > > static_key_slow_inc+0x12/0x30 kernel/jump_label.c:124
> > > __netif_set_xps_queue+0xaac/0x23f0 net/core/dev.c:2320
>
>
> __netif_set_xps_queue() calls static_key_slow_inc() which will also do
> a get_online_cpus() which will trigger this bug.
>
> There's a static_key_slow_inc_cpuslocked() version that should be used
> when get_online_cpus() is already taken, but I see
> __netif_set_xps_queue() is called from several places, and I doubt it
> is always called with get_online_cpus() held. Thus just using the
> cpuslocked() version is probably not sufficient of a fix.
>
> I don't know the code enough to offer other suggestions.
>
> -- Steve
OK so the guess is it's due to combination of
commit 04157469b7b848f4a9978b63b1ea2ce62ad3a0a3
Author: Amritha Nambiar <amritha.nambiar@intel.com>
Date: Fri Jun 29 21:26:46 2018 -0700
net: Use static_key for XPS maps
which uses static_key_slow_inc and
commit 8af2c06ff4b144064b51b7f688194474123d9c9c
Author: Amritha Nambiar <amritha.nambiar@intel.com>
Date: Fri Jun 29 21:27:07 2018 -0700
net-sysfs: Add interface for Rx queue(s) map per Tx queue
which makes it all user-triggerable.
>
> > > netif_set_xps_queue+0x26/0x30 net/core/dev.c:2455
> > > virtnet_set_affinity+0x2ba/0x4b0 drivers/net/virtio_net.c:1944
> > > init_vqs+0xe22/0x1520 drivers/net/virtio_net.c:2778
> > > virtnet_probe+0x1092/0x2260 drivers/net/virtio_net.c:3016
> > > virtio_dev_probe+0x592/0x942 drivers/virtio/virtio.c:245
> > > really_probe drivers/base/dd.c:446 [inline]
> > > driver_probe_device+0x6ad/0x970 drivers/base/dd.c:588
> > > __driver_attach+0x28b/0x2f0 drivers/base/dd.c:822
> > > bus_for_each_dev+0x15d/0x1f0 drivers/base/bus.c:311
> > > driver_attach+0x3d/0x50 drivers/base/dd.c:841
> > > bus_add_driver+0x4b2/0x600 drivers/base/bus.c:667
> > > driver_register+0x1c8/0x320 drivers/base/driver.c:170
> > > register_virtio_driver+0x79/0xd0 drivers/virtio/virtio.c:296
> > > virtio_net_driver_init+0x8d/0xc9 drivers/net/virtio_net.c:3209
> > > do_one_initcall+0x127/0x913 init/main.c:884
> > > do_initcall_level init/main.c:952 [inline]
> > > do_initcalls init/main.c:960 [inline]
> > > do_basic_setup init/main.c:978 [inline]
> > > kernel_init_freeable+0x49b/0x58e init/main.c:1135
> > > kernel_init+0x11/0x1b3 init/main.c:1061
> > > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> > > vcan: Virtual CAN interface driver
> > > vxcan: Virtual CAN Tunnel driver
> > > slcan: serial line CAN interface driver
> > > slcan: 10 dynamic interface channels.
> > > CAN device driver interface
> > > enic: Cisco VIC Ethernet NIC Driver, ver 2.3.0.53
> > > e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
> > > e100: Copyright(c) 1999-2006 Intel Corporation
> > > e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
> > > e1000: Copyright (c) 1999-2006 Intel Corporation.
> > > e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> > > e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> > > sky2: driver version 1.30
> > > PPP generic driver version 2.4.2
> > > PPP BSD Compression module registered
> > > PPP Deflate Compression module registered
> > > PPP MPPE Compression module registered
> > > NET: Registered protocol family 24
> > > PPTP driver version 0.8.5
> > > mac80211_hwsim: initializing netlink
> > > ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
> > > ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
> > > usbcore: registered new interface driver asix
> > > usbcore: registered new interface driver ax88179_178a
> > > usbcore: registered new interface driver cdc_ether
> > > usbcore: registered new interface driver net1080
> > > usbcore: registered new interface driver cdc_subset
> > > usbcore: registered new interface driver zaurus
> > > usbcore: registered new interface driver cdc_ncm
> > > aoe: AoE v85 initialised.
> > > ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > > ehci-pci: EHCI PCI platform driver
> > > ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > > ohci-pci: OHCI PCI platform driver
> > > uhci_hcd: USB Universal Host Controller Interface driver
> > > usbcore: registered new interface driver usblp
> > > usbcore: registered new interface driver usb-storage
> > > i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
> > > i8042: Warning: Keylock active
> > > serio: i8042 KBD port at 0x60,0x64 irq 1
> > > serio: i8042 AUX port at 0x60,0x64 irq 12
> > > mousedev: PS/2 mouse device common for all mice
> > > rtc_cmos 00:00: RTC can wake from S4
> > > rtc_cmos 00:00: registered as rtc0
> > > rtc_cmos 00:00: alarms up to one day, 114 bytes nvram
> > > i2c /dev entries driver
> > > piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or
> > > use force_addr=0xaddr
> > > i2c-parport-light: adapter type unspecified
> > > usbcore: registered new interface driver RobotFuzz Open Source InterFace,
> > > OSIF
> > > usbcore: registered new interface driver i2c-tiny-usb
> > > device-mapper: ioctl: 4.39.0-ioctl (2018-04-03) initialised:
> > > dm-devel@redhat.com
> > > device-mapper: raid: Loading target version 1.13.2
> > > usbcore: registered new interface driver btusb
> > > usnic_verbs: Cisco VIC (USNIC) Verbs Driver v1.0.3 (December 19, 2013)
> > > usnic_verbs:usnic_uiom_init:585:
> > > IOMMU required but not present or enabled. USNIC QPs will not function w/o
> > > enabling IOMMU
> > > usnic_verbs:usnic_ib_init:649:
> > > Unable to initalize umem with err -1
> > > iscsi: registered transport (iser)
> > > OPA Virtual Network Driver - v1.0
> > > hidraw: raw HID events driver (C) Jiri Kosina
> > > usbcore: registered new interface driver usbhid
> > > usbhid: USB HID core driver
> > > NET: Registered protocol family 40
> > > ashmem: initialized
> > > NET: Registered protocol family 26
> > > Mirror/redirect action on
> > > Simple TC action Loaded
> > > netem: version 1.3
> > > u32 classifier
> > > Actions configured
> > > nf_conntrack_irc: failed to register helpers
> > > nf_conntrack_sane: failed to register helpers
> > > nf_conntrack_sip: failed to register helpers
> > > xt_time: kernel timezone is -0000
> > > IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
> > > IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
> > > IPVS: ipvs loaded.
> > > IPVS: [rr] scheduler registered.
> > > IPVS: [wrr] scheduler registered.
> > > IPVS: [lc] scheduler registered.
> > > IPVS: [wlc] scheduler registered.
> > > IPVS: [fo] scheduler registered.
> > > IPVS: [ovf] scheduler registered.
> > > IPVS: [lblc] scheduler registered.
> > > IPVS: [lblcr] scheduler registered.
> > > IPVS: [dh] scheduler registered.
> > > IPVS: [sh] scheduler registered.
> > > IPVS: [mh] scheduler registered.
> > > IPVS: [sed] scheduler registered.
> > > IPVS: [nq] scheduler registered.
> > > IPVS: ftp: loaded support on port[0] = 21
> > > IPVS: [sip] pe registered.
> > > ipip: IPv4 and MPLS over IPv4 tunneling driver
> > > gre: GRE over IPv4 demultiplexor driver
> > > ip_gre: GRE over IPv4 tunneling driver
> > > IPv4 over IPsec tunneling driver
> > > ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
> > > Initializing XFRM netlink socket
> > > NET: Registered protocol family 10
> > > Segment Routing with IPv6
> > > mip6: Mobile IPv6
> > > sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> > > ip6_gre: GRE over IPv6 tunneling driver
> > > bpfilter: Loaded bpfilter_umh pid 2080
> > > NET: Registered protocol family 15
> > > Bridge firewalling registered
> > > can: controller area network core (rev 20170425 abi 9)
> > > NET: Registered protocol family 29
> > > can: raw protocol (rev 20170425)
> > > can: broadcast manager protocol (rev 20170425 t)
> > > can: netlink gateway (rev 20170425) max_hops=1
> > > Bluetooth: RFCOMM TTY layer initialized
> > > Bluetooth: RFCOMM socket layer initialized
> > > Bluetooth: RFCOMM ver 1.11
> > > Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> > > Bluetooth: BNEP filters: protocol multicast
> > > Bluetooth: BNEP socket layer initialized
> > > Bluetooth: HIDP (Human Interface Emulation) ver 1.2
> > > Bluetooth: HIDP socket layer initialized
> > > RPC: Registered rdma transport module.
> > > RPC: Registered rdma backchannel transport module.
> > > NET: Registered protocol family 41
> > > lec:lane_module_init: lec.c: initialized
> > > mpoa:atm_mpoa_init: mpc.c: initialized
> > > l2tp_core: L2TP core driver, V2.0
> > > l2tp_ppp: PPPoL2TP kernel driver, V2.0
> > > 8021q: 802.1Q VLAN Support v1.8
> > > input: AT Translated Set 2 keyboard as
> > > /devices/platform/i8042/serio0/input/input2
> > > DCCP: Activated CCID 2 (TCP-like)
> > > DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
> > > sctp: Hash tables configured (bind 64/64)
> > > tipc: Activated (version 2.0.0)
> > > NET: Registered protocol family 30
> > > tipc: Started in single node mode
> > > NET: Registered protocol family 43
> > > 9pnet: Installing 9P2000 support
> > > NET: Registered protocol family 36
> > > Key type dns_resolver registered
> > > Key type ceph registered
> > > libceph: loaded (mon/osd proto 15/24)
> > > openvswitch: Open vSwitch switching datapath
> > > mpls_gso: MPLS GSO support
> > > start plist test
> > > end plist test
> > > AVX2 version of gcm_enc/dec engaged.
> > > AES CTR mode by8 optimization enabled
> > > sched_clock: Marking stable (4559438359, 0)->(6126385605, -1566947246)
> > > registered taskstats version 1
> > > Loading compiled-in X.509 certificates
> > > zswap: default zpool zbud not available
> > > zswap: pool creation failed
> > > Btrfs loaded, crc32c=crc32c-intel
> > > Key type big_key registered
> > > Key type encrypted registered
> > > Magic number: 10:317:168
> > > console [netcon0] enabled
> > > netconsole: network logging started
> > > gtp: GTP module loaded (pdp ctx size 104 bytes)
> > > rdma_rxe: loaded
> > > cfg80211: Loading compiled-in X.509 certificates for regulatory database
> > > cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> > > platform regulatory.0: Direct firmware load for regulatory.db failed with
> > > error -2
> > > cfg80211: failed to load regulatory.db
> > > ALSA device list:
> > > #0: Dummy 1
> > > #1: Loopback 1
> > > #2: Virtual MIDI Card 1
> > > input: ImExPS/2 Generic Explorer Mouse as
> > > /devices/platform/i8042/serio1/input/input4
> > > md: Waiting for all devices to be available before autodetect
> > > md: If you don't use raid, use raid=noautodetect
> > > md: Autodetecting RAID arrays.
> > > md: autorun ...
> > > md: ... autorun DONE.
> > > EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
> > > VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
> > > devtmpfs: mounted
> > > Freeing unused kernel memory: 3900K
> > > Kernel memory protection disabled.
> > > SELinux: Disabled at runtime.
> > > SELinux: Unregistering netfilter hooks
> > > audit: type=1404 audit(1532588961.277:2): enforcing=0 old_enforcing=0
> > > auid=4294967295 ses=4294967295 enabled=0 old-enabled=1 lsm=selinux res=1
> > > stty (2166) used greatest stack depth: 19664 bytes left
> > > EXT4-fs (sda1): re-mounted. Opts: (null)
> > > logsave (3615) used greatest stack depth: 17632 bytes left
> > > random: dd: uninitialized urandom read (512 bytes read)
> > > ==================================================================
> > > BUG: KASAN: slab-out-of-bounds in virtnet_receive
> > > drivers/net/virtio_net.c:1356 [inline]
> >
> > +virtio maintainers for this one
> > Probably something very recent.
> >
> > > BUG: KASAN: slab-out-of-bounds in virtnet_poll+0x111a/0x1226
> > > drivers/net/virtio_net.c:1421
> > > Read of size 8 at addr ffff8801cee08ff0 by task ip/3969
> > >
> > > CPU: 0 PID: 3969 Comm: ip Not tainted 4.18.0-rc6+ #141
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > Call Trace:
> > > <IRQ>
> > > __dump_stack lib/dump_stack.c:77 [inline]
> > > dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
> > > print_address_description+0x6c/0x20b mm/kasan/report.c:256
> > > kasan_report_error mm/kasan/report.c:354 [inline]
> > > kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
> > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> > > virtnet_receive drivers/net/virtio_net.c:1356 [inline]
> > > virtnet_poll+0x111a/0x1226 drivers/net/virtio_net.c:1421
> > > napi_poll net/core/dev.c:6214 [inline]
> > > net_rx_action+0x7a5/0x1920 net/core/dev.c:6280
> > > __do_softirq+0x2e8/0xb17 kernel/softirq.c:292
> > > do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1046
> > > </IRQ>
> > > do_softirq.part.18+0x155/0x1a0 kernel/softirq.c:336
> > > do_softirq arch/x86/include/asm/preempt.h:23 [inline]
> > > __local_bh_enable_ip+0x1ec/0x230 kernel/softirq.c:189
> > > local_bh_enable include/linux/bottom_half.h:32 [inline]
> > > virtnet_napi_enable+0x8c/0xb0 drivers/net/virtio_net.c:1264
> > > virtnet_open+0x16d/0x4d0 drivers/net/virtio_net.c:1464
> > > __dev_open+0x26d/0x410 net/core/dev.c:1392
> > > __dev_change_flags+0x739/0x9c0 net/core/dev.c:7434
> > > dev_change_flags+0x89/0x150 net/core/dev.c:7503
> > > do_setlink+0xb16/0x3dd0 net/core/rtnetlink.c:2416
> > > rtnl_newlink+0x138d/0x1d60 net/core/rtnetlink.c:3029
> > > rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4705
> > > netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2447
> > > rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4723
> > > netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
> > > netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1336
> > > netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1901
> > > sock_sendmsg_nosec net/socket.c:641 [inline]
> > > sock_sendmsg+0xd5/0x120 net/socket.c:651
> > > ___sys_sendmsg+0x7fd/0x930 net/socket.c:2125
> > > __sys_sendmsg+0x11d/0x290 net/socket.c:2163
> > > __do_sys_sendmsg net/socket.c:2172 [inline]
> > > __se_sys_sendmsg net/socket.c:2170 [inline]
> > > __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2170
> > > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> > > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > RIP: 0033:0x7f318d594320
> > > Code: 02 48 83 c8 ff eb 8d 48 8b 05 14 7b 2a 00 f7 da 64 89 10 48 83 c8 ff
> > > eb c9 90 83 3d d5 d2 2a 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff
> > > 73 31 c3 48 83 ec 08 e8 5e ba 00 00 48 89 04 24
> > > RSP: 002b:00007ffd985d8f38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> > > RAX: ffffffffffffffda RBX: 00007ffd985dd030 RCX: 00007f318d594320
> > > RDX: 0000000000000000 RSI: 00007ffd985d8f70 RDI: 0000000000000003
> > > RBP: 00007ffd985d8f70 R08: 0000000000000000 R09: 000000000000000f
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000005b5973aa
> > > R13: 0000000000000000 R14: 00000000006395c0 R15: 00007ffd985dd808
> > >
> > > Allocated by task 1:
> > > save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> > > set_track mm/kasan/kasan.c:460 [inline]
> > > kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
> > > __do_kmalloc mm/slab.c:3718 [inline]
> > > __kmalloc+0x14e/0x760 mm/slab.c:3727
> > > kmalloc_array include/linux/slab.h:635 [inline]
> > > kcalloc include/linux/slab.h:646 [inline]
> > > virtnet_alloc_queues drivers/net/virtio_net.c:2731 [inline]
> > > init_vqs+0x127/0x1520 drivers/net/virtio_net.c:2769
> > > virtnet_probe+0x1092/0x2260 drivers/net/virtio_net.c:3016
> > > virtio_dev_probe+0x592/0x942 drivers/virtio/virtio.c:245
> > > really_probe drivers/base/dd.c:446 [inline]
> > > driver_probe_device+0x6ad/0x970 drivers/base/dd.c:588
> > > __driver_attach+0x28b/0x2f0 drivers/base/dd.c:822
> > > bus_for_each_dev+0x15d/0x1f0 drivers/base/bus.c:311
> > > driver_attach+0x3d/0x50 drivers/base/dd.c:841
> > > bus_add_driver+0x4b2/0x600 drivers/base/bus.c:667
> > > driver_register+0x1c8/0x320 drivers/base/driver.c:170
> > > register_virtio_driver+0x79/0xd0 drivers/virtio/virtio.c:296
> > > virtio_net_driver_init+0x8d/0xc9 drivers/net/virtio_net.c:3209
> > > do_one_initcall+0x127/0x913 init/main.c:884
> > > do_initcall_level init/main.c:952 [inline]
> > > do_initcalls init/main.c:960 [inline]
> > > do_basic_setup init/main.c:978 [inline]
> > > kernel_init_freeable+0x49b/0x58e init/main.c:1135
> > > kernel_init+0x11/0x1b3 init/main.c:1061
> > > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> > >
> > > Freed by task 0:
> > > (stack is not available)
> > >
> > > The buggy address belongs to the object at ffff8801cee08500
> > > which belongs to the cache kmalloc-4096 of size 4096
> > > The buggy address is located 2800 bytes inside of
> > > 4096-byte region [ffff8801cee08500, ffff8801cee09500)
> > > The buggy address belongs to the page:
> > > page:ffffea00073b8200 count:1 mapcount:0 mapping:ffff8801dac00dc0 index:0x0
> > > compound_mapcount: 0
> > > flags: 0x2fffc0000008100(slab|head)
> > > raw: 02fffc0000008100 ffffea00073b7d88 ffffea00073b8288 ffff8801dac00dc0
> > > raw: 0000000000000000 ffff8801cee08500 0000000100000001 0000000000000000
> > > page dumped because: kasan: bad access detected
> > >
> > > Memory state around the buggy address:
> > > ffff8801cee08e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > ffff8801cee08f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > >>
> > >> ffff8801cee08f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > >
> > > ^
> > > ffff8801cee09000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > ffff8801cee09080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > ==================================================================
> > >
> > >
> > > ---
> > > This bug is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > >
> > > syzbot will keep track of this bug report. See:
> > > https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> > > syzbot.
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups
> > > "syzkaller-bugs" group.
> > > To unsubscribe from this group and stop receiving emails from it, send an
> > > email to syzkaller-bugs+unsubscribe@googlegroups.com.
> > > To view this discussion on the web visit
> > > https://groups.google.com/d/msgid/syzkaller-bugs/000000000000352dc20571e3a0d8%40google.com.
> > > For more options, visit https://groups.google.com/d/optout.
^ permalink raw reply
* [PATCH] drm: qxl: Fix error handling at qxl_device_init
From: Anton Vasilyev @ 2018-07-27 11:54 UTC (permalink / raw)
To: Dave Airlie
Cc: ldv-project, David Airlie, linux-kernel, dri-devel,
virtualization, Anton Vasilyev
If qxl_device_init fails on creating resources and does not report it,
then qxl module will catch null pointer exception on remove, or on
probe's error path.
The patch adds error path with resources release into qxl_device_init.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
---
drivers/gpu/drm/qxl/qxl_kms.c | 80 ++++++++++++++++++++++++++++++++---
1 file changed, 73 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/qxl/qxl_kms.c b/drivers/gpu/drm/qxl/qxl_kms.c
index 771250aed78d..e25c589d5f50 100644
--- a/drivers/gpu/drm/qxl/qxl_kms.c
+++ b/drivers/gpu/drm/qxl/qxl_kms.c
@@ -102,8 +102,10 @@ int qxl_device_init(struct qxl_device *qdev,
int r, sb;
r = drm_dev_init(&qdev->ddev, drv, &pdev->dev);
- if (r)
- return r;
+ if (r) {
+ pr_err("Unable to init drm dev");
+ goto error;
+ }
qdev->ddev.pdev = pdev;
pci_set_drvdata(pdev, &qdev->ddev);
@@ -121,6 +123,11 @@ int qxl_device_init(struct qxl_device *qdev,
qdev->io_base = pci_resource_start(pdev, 3);
qdev->vram_mapping = io_mapping_create_wc(qdev->vram_base, pci_resource_len(pdev, 0));
+ if (!qdev->vram_mapping) {
+ pr_err("Unable to create vram_mapping");
+ r = -ENOMEM;
+ goto error;
+ }
if (pci_resource_len(pdev, 4) > 0) {
/* 64bit surface bar present */
@@ -139,6 +146,11 @@ int qxl_device_init(struct qxl_device *qdev,
qdev->surface_mapping =
io_mapping_create_wc(qdev->surfaceram_base,
qdev->surfaceram_size);
+ if (!qdev->surface_mapping) {
+ pr_err("Unable to create surface_mapping");
+ r = -ENOMEM;
+ goto vram_mapping_free;
+ }
}
DRM_DEBUG_KMS("qxl: vram %llx-%llx(%dM %dk), surface %llx-%llx(%dM %dk, %s)\n",
@@ -155,20 +167,29 @@ int qxl_device_init(struct qxl_device *qdev,
qdev->rom = ioremap(qdev->rom_base, qdev->rom_size);
if (!qdev->rom) {
pr_err("Unable to ioremap ROM\n");
- return -ENOMEM;
+ r = -ENOMEM;
+ goto surface_mapping_free;
}
- qxl_check_device(qdev);
+ if (!qxl_check_device(qdev)) {
+ r = -ENODEV;
+ goto surface_mapping_free;
+ }
r = qxl_bo_init(qdev);
if (r) {
DRM_ERROR("bo init failed %d\n", r);
- return r;
+ goto rom_unmap;
}
qdev->ram_header = ioremap(qdev->vram_base +
qdev->rom->ram_header_offset,
sizeof(*qdev->ram_header));
+ if (!qdev->ram_header) {
+ DRM_ERROR("Unable to ioremap RAM header\n");
+ r = -ENOMEM;
+ goto bo_fini;
+ }
qdev->command_ring = qxl_ring_create(&(qdev->ram_header->cmd_ring_hdr),
sizeof(struct qxl_command),
@@ -176,6 +197,11 @@ int qxl_device_init(struct qxl_device *qdev,
qdev->io_base + QXL_IO_NOTIFY_CMD,
false,
&qdev->display_event);
+ if (!qdev->command_ring) {
+ DRM_ERROR("Unable to create command ring\n");
+ r = -ENOMEM;
+ goto ram_header_unmap;
+ }
qdev->cursor_ring = qxl_ring_create(
&(qdev->ram_header->cursor_ring_hdr),
@@ -185,12 +211,23 @@ int qxl_device_init(struct qxl_device *qdev,
false,
&qdev->cursor_event);
+ if (!qdev->cursor_ring) {
+ DRM_ERROR("Unable to create cursor ring\n");
+ r = -ENOMEM;
+ goto command_ring_free;
+ }
+
qdev->release_ring = qxl_ring_create(
&(qdev->ram_header->release_ring_hdr),
sizeof(uint64_t),
QXL_RELEASE_RING_SIZE, 0, true,
NULL);
+ if (!qdev->release_ring) {
+ DRM_ERROR("Unable to create release ring\n");
+ r = -ENOMEM;
+ goto cursor_ring_free;
+ }
/* TODO - slot initialization should happen on reset. where is our
* reset handler? */
qdev->n_mem_slots = qdev->rom->slots_end;
@@ -203,6 +240,12 @@ int qxl_device_init(struct qxl_device *qdev,
kmalloc_array(qdev->n_mem_slots, sizeof(struct qxl_memslot),
GFP_KERNEL);
+ if (!qdev->mem_slots) {
+ DRM_ERROR("Unable to alloc mem slots\n");
+ r = -ENOMEM;
+ goto release_ring_free;
+ }
+
idr_init(&qdev->release_idr);
spin_lock_init(&qdev->release_idr_lock);
spin_lock_init(&qdev->release_lock);
@@ -218,8 +261,10 @@ int qxl_device_init(struct qxl_device *qdev,
/* must initialize irq before first async io - slot creation */
r = qxl_irq_init(qdev);
- if (r)
- return r;
+ if (r) {
+ DRM_ERROR("Unable to init qxl irq\n");
+ goto mem_slots_free;
+ }
/*
* Note that virtual is surface0. We rely on the single ioremap done
@@ -243,6 +288,27 @@ int qxl_device_init(struct qxl_device *qdev,
INIT_WORK(&qdev->gc_work, qxl_gc_work);
return 0;
+
+mem_slots_free:
+ kfree(qdev->mem_slots);
+release_ring_free:
+ qxl_ring_free(qdev->release_ring);
+cursor_ring_free:
+ qxl_ring_free(qdev->cursor_ring);
+command_ring_free:
+ qxl_ring_free(qdev->command_ring);
+ram_header_unmap:
+ iounmap(qdev->ram_header);
+bo_fini:
+ qxl_bo_fini(qdev);
+rom_unmap:
+ iounmap(qdev->rom);
+surface_mapping_free:
+ io_mapping_free(qdev->surface_mapping);
+vram_mapping_free:
+ io_mapping_free(qdev->vram_mapping);
+error:
+ return r;
}
void qxl_device_fini(struct qxl_device *qdev)
--
2.18.0
^ permalink raw reply related
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-07-27 11:31 UTC (permalink / raw)
To: Anshuman Khandual
Cc: robh, srikar, linuxram, linux-kernel, virtualization, hch, paulus,
joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <4062dd48-2b5b-e454-e860-c6bfe321ebdc@linux.vnet.ibm.com>
On Wed, Jul 25, 2018 at 08:56:23AM +0530, Anshuman Khandual wrote:
> Results with and without the patches are similar.
Thanks! And another thing to try is virtio-net with
a fast NIC backend (40G and up). Unfortunately
at this point loopback tests stress the host
scheduler too much.
--
MST
^ permalink raw reply
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Anshuman Khandual @ 2018-07-27 10:58 UTC (permalink / raw)
To: Will Deacon
Cc: robh, srikar, mst, linuxppc-dev, linuxram, linux-kernel,
virtualization, hch, paulus, marc.zyngier, joe, robin.murphy,
elfring, haren, david
In-Reply-To: <20180727095804.GA25592@arm.com>
On 07/27/2018 03:28 PM, Will Deacon wrote:
> Hi Anshuman,
>
> On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote:
>> This patch series is the follow up on the discussions we had before about
>> the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation
>> for virito devices (https://patchwork.kernel.org/patch/10417371/). There
>> were suggestions about doing away with two different paths of transactions
>> with the host/QEMU, first being the direct GPA and the other being the DMA
>> API based translations.
>>
>> First patch attempts to create a direct GPA mapping based DMA operations
>> structure called 'virtio_direct_dma_ops' with exact same implementation
>> of the direct GPA path which virtio core currently has but just wrapped in
>> a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of
>> the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the
>> existing semantics. The second patch does exactly that inside the function
>> virtio_finalize_features(). The third patch removes the default direct GPA
>> path from virtio core forcing it to use DMA API callbacks for all devices.
>> Now with that change, every device must have a DMA operations structure
>> associated with it. The fourth patch adds an additional hook which gives
>> the platform an opportunity to do yet another override if required. This
>> platform hook can be used on POWER Ultravisor based protected guests to
>> load up SWIOTLB DMA callbacks to do the required (as discussed previously
>> in the above mentioned thread how host is allowed to access only parts of
>> the guest GPA range) bounce buffering into the shared memory for all I/O
>> scatter gather buffers to be consumed on the host side.
>>
>> Please go through these patches and review whether this approach broadly
>> makes sense. I will appreciate suggestions, inputs, comments regarding
>> the patches or the approach in general. Thank you.
> I just wanted to say that this patch series provides a means for us to
> force the coherent DMA ops for legacy virtio devices on arm64, which in turn
> means that we can enable the SMMU with legacy devices in our fastmodel
> emulation platform (which is slowly being upgraded to virtio 1.0) without
> hanging during boot. Patch below.
>
> So:
>
> Acked-by: Will Deacon <will.deacon@arm.com>
> Tested-by: Will Deacon <will.deacon@arm.com>
Thanks Will.
^ permalink raw reply
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Will Deacon @ 2018-07-27 9:58 UTC (permalink / raw)
To: Anshuman Khandual
Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
hch, paulus, marc.zyngier, mpe, joe, robin.murphy, linuxppc-dev,
elfring, haren, david
In-Reply-To: <20180720035941.6844-1-khandual@linux.vnet.ibm.com>
Hi Anshuman,
On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote:
> This patch series is the follow up on the discussions we had before about
> the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation
> for virito devices (https://patchwork.kernel.org/patch/10417371/). There
> were suggestions about doing away with two different paths of transactions
> with the host/QEMU, first being the direct GPA and the other being the DMA
> API based translations.
>
> First patch attempts to create a direct GPA mapping based DMA operations
> structure called 'virtio_direct_dma_ops' with exact same implementation
> of the direct GPA path which virtio core currently has but just wrapped in
> a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of
> the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the
> existing semantics. The second patch does exactly that inside the function
> virtio_finalize_features(). The third patch removes the default direct GPA
> path from virtio core forcing it to use DMA API callbacks for all devices.
> Now with that change, every device must have a DMA operations structure
> associated with it. The fourth patch adds an additional hook which gives
> the platform an opportunity to do yet another override if required. This
> platform hook can be used on POWER Ultravisor based protected guests to
> load up SWIOTLB DMA callbacks to do the required (as discussed previously
> in the above mentioned thread how host is allowed to access only parts of
> the guest GPA range) bounce buffering into the shared memory for all I/O
> scatter gather buffers to be consumed on the host side.
>
> Please go through these patches and review whether this approach broadly
> makes sense. I will appreciate suggestions, inputs, comments regarding
> the patches or the approach in general. Thank you.
I just wanted to say that this patch series provides a means for us to
force the coherent DMA ops for legacy virtio devices on arm64, which in turn
means that we can enable the SMMU with legacy devices in our fastmodel
emulation platform (which is slowly being upgraded to virtio 1.0) without
hanging during boot. Patch below.
So:
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Thanks!
Will
--->8
From 4ef39e9de2c87c97bf046816ca762832f92e39b5 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Fri, 27 Jul 2018 10:49:25 +0100
Subject: [PATCH] arm64: dma: Override DMA ops for legacy virtio devices
Virtio devices are always cache-coherent, so force use of the coherent
DMA ops for legacy virtio devices where the dma-coherent is known to
be omitted by QEMU for the MMIO transport.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/dma-mapping.h | 6 ++++++
arch/arm64/mm/dma-mapping.c | 19 +++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index b7847eb8a7bb..30aa8fb62dc3 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -44,6 +44,12 @@ void arch_teardown_dma_ops(struct device *dev);
#define arch_teardown_dma_ops arch_teardown_dma_ops
#endif
+#ifdef CONFIG_VIRTIO
+struct virtio_device;
+void platform_override_dma_ops(struct virtio_device *vdev);
+#define platform_override_dma_ops platform_override_dma_ops
+#endif
+
/* do not use this function in a driver */
static inline bool is_device_dma_coherent(struct device *dev)
{
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 61e93f0b5482..f9ca61b1b34d 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -891,3 +891,22 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
}
#endif
}
+
+#ifdef CONFIG_VIRTIO
+#include <linux/virtio_config.h>
+
+void platform_override_dma_ops(struct virtio_device *vdev)
+{
+ struct device *dev = vdev->dev.parent;
+ const struct dma_map_ops *dma_ops = &arm64_swiotlb_dma_ops;
+
+ if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
+ return;
+
+ dev->archdata.dma_coherent = true;
+ if (iommu_get_domain_for_dev(dev))
+ dma_ops = &iommu_dma_ops;
+
+ set_dma_ops(dev, dma_ops);
+}
+#endif /* CONFIG_VIRTIO */
--
2.1.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox