Linux virtualization list
 help / color / mirror / Atom feed
* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Will Deacon @ 2018-07-27  9:58 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
	hch, paulus, marc.zyngier, mpe, joe, robin.murphy, linuxppc-dev,
	elfring, haren, david
In-Reply-To: <20180720035941.6844-1-khandual@linux.vnet.ibm.com>

Hi Anshuman,

On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote:
> This patch series is the follow up on the discussions we had before about
> the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation
> for virito devices (https://patchwork.kernel.org/patch/10417371/). There
> were suggestions about doing away with two different paths of transactions
> with the host/QEMU, first being the direct GPA and the other being the DMA
> API based translations.
> 
> First patch attempts to create a direct GPA mapping based DMA operations
> structure called 'virtio_direct_dma_ops' with exact same implementation
> of the direct GPA path which virtio core currently has but just wrapped in
> a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of
> the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the
> existing semantics. The second patch does exactly that inside the function
> virtio_finalize_features(). The third patch removes the default direct GPA
> path from virtio core forcing it to use DMA API callbacks for all devices.
> Now with that change, every device must have a DMA operations structure
> associated with it. The fourth patch adds an additional hook which gives
> the platform an opportunity to do yet another override if required. This
> platform hook can be used on POWER Ultravisor based protected guests to
> load up SWIOTLB DMA callbacks to do the required (as discussed previously
> in the above mentioned thread how host is allowed to access only parts of
> the guest GPA range) bounce buffering into the shared memory for all I/O
> scatter gather buffers to be consumed on the host side.
> 
> Please go through these patches and review whether this approach broadly
> makes sense. I will appreciate suggestions, inputs, comments regarding
> the patches or the approach in general. Thank you.

I just wanted to say that this patch series provides a means for us to
force the coherent DMA ops for legacy virtio devices on arm64, which in turn
means that we can enable the SMMU with legacy devices in our fastmodel
emulation platform (which is slowly being upgraded to virtio 1.0) without
hanging during boot. Patch below.

So:

Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>

Thanks!

Will

--->8

From 4ef39e9de2c87c97bf046816ca762832f92e39b5 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Fri, 27 Jul 2018 10:49:25 +0100
Subject: [PATCH] arm64: dma: Override DMA ops for legacy virtio devices

Virtio devices are always cache-coherent, so force use of the coherent
DMA ops for legacy virtio devices where the dma-coherent is known to
be omitted by QEMU for the MMIO transport.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/dma-mapping.h |  6 ++++++
 arch/arm64/mm/dma-mapping.c          | 19 +++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index b7847eb8a7bb..30aa8fb62dc3 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -44,6 +44,12 @@ void arch_teardown_dma_ops(struct device *dev);
 #define arch_teardown_dma_ops	arch_teardown_dma_ops
 #endif
 
+#ifdef CONFIG_VIRTIO
+struct virtio_device;
+void platform_override_dma_ops(struct virtio_device *vdev);
+#define platform_override_dma_ops	platform_override_dma_ops
+#endif
+
 /* do not use this function in a driver */
 static inline bool is_device_dma_coherent(struct device *dev)
 {
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 61e93f0b5482..f9ca61b1b34d 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -891,3 +891,22 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 	}
 #endif
 }
+
+#ifdef CONFIG_VIRTIO
+#include <linux/virtio_config.h>
+
+void platform_override_dma_ops(struct virtio_device *vdev)
+{
+	struct device *dev = vdev->dev.parent;
+	const struct dma_map_ops *dma_ops = &arm64_swiotlb_dma_ops;
+
+	if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
+		return;
+
+	dev->archdata.dma_coherent = true;
+	if (iommu_get_domain_for_dev(dev))
+		dma_ops = &iommu_dma_ops;
+
+	set_dma_ops(dev, dma_ops);
+}
+#endif	/* CONFIG_VIRTIO */
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 2/2] virtio_balloon: replace oom notifier with shrinker
From: Wei Wang @ 2018-07-27  9:24 UTC (permalink / raw)
  To: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko,
	akpm
In-Reply-To: <1532683495-31974-1-git-send-email-wei.w.wang@intel.com>

The OOM notifier is getting deprecated to use for the reasons mentioned
here by Michal Hocko: https://lkml.org/lkml/2018/7/12/314

This patch replaces the virtio-balloon oom notifier with a shrinker
to release balloon pages on memory pressure.

In addition, the bug in the replaced virtballoon_oom_notify that only
VIRTIO_BALLOON_ARRAY_PFNS_MAX (i.e 256) balloon pages can be freed
though the user has specified more than that number is fixed in the
shrinker_scan function.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 drivers/virtio/virtio_balloon.c | 115 +++++++++++++++++++++++-----------------
 1 file changed, 65 insertions(+), 50 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 9356a1a..6b2229b 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -27,7 +27,6 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/balloon_compaction.h>
-#include <linux/oom.h>
 #include <linux/wait.h>
 #include <linux/mm.h>
 #include <linux/mount.h>
@@ -40,12 +39,12 @@
  */
 #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
 #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
-#define OOM_VBALLOON_DEFAULT_PAGES 256
+#define DEFAULT_BALLOON_PAGES_TO_SHRINK 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
-static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
-module_param(oom_pages, int, S_IRUSR | S_IWUSR);
-MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
+static unsigned long balloon_pages_to_shrink = DEFAULT_BALLOON_PAGES_TO_SHRINK;
+module_param(balloon_pages_to_shrink, ulong, 0600);
+MODULE_PARM_DESC(balloon_pages_to_shrink, "pages to free on memory presure");
 
 #ifdef CONFIG_BALLOON_COMPACTION
 static struct vfsmount *balloon_mnt;
@@ -86,8 +85,8 @@ struct virtio_balloon {
 	/* Memory statistics */
 	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
 
-	/* To register callback in oom notifier call chain */
-	struct notifier_block nb;
+	/* To register a shrinker to shrink memory upon memory pressure */
+	struct shrinker shrinker;
 };
 
 static struct virtio_device_id id_table[] = {
@@ -365,38 +364,6 @@ static void update_balloon_size(struct virtio_balloon *vb)
 		      &actual);
 }
 
-/*
- * virtballoon_oom_notify - release pages when system is under severe
- *			    memory pressure (called from out_of_memory())
- * @self : notifier block struct
- * @dummy: not used
- * @parm : returned - number of freed pages
- *
- * The balancing of memory by use of the virtio balloon should not cause
- * the termination of processes while there are pages in the balloon.
- * If virtio balloon manages to release some memory, it will make the
- * system return and retry the allocation that forced the OOM killer
- * to run.
- */
-static int virtballoon_oom_notify(struct notifier_block *self,
-				  unsigned long dummy, void *parm)
-{
-	struct virtio_balloon *vb;
-	unsigned long *freed;
-	unsigned num_freed_pages;
-
-	vb = container_of(self, struct virtio_balloon, nb);
-	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
-		return NOTIFY_OK;
-
-	freed = parm;
-	num_freed_pages = leak_balloon(vb, oom_pages);
-	update_balloon_size(vb);
-	*freed += num_freed_pages;
-
-	return NOTIFY_OK;
-}
-
 static void update_balloon_stats_func(struct work_struct *work)
 {
 	struct virtio_balloon *vb;
@@ -548,6 +515,54 @@ static struct file_system_type balloon_fs = {
 
 #endif /* CONFIG_BALLOON_COMPACTION */
 
+static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
+						  struct shrink_control *sc)
+{
+	unsigned long pages_to_free = balloon_pages_to_shrink,
+		      pages_freed = 0;
+	struct virtio_balloon *vb = container_of(shrinker,
+					struct virtio_balloon, shrinker);
+
+	/*
+	 * One invocation of leak_balloon can deflate at most
+	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
+	 * multiple times to deflate pages till reaching
+	 * balloon_pages_to_shrink pages.
+	 */
+	while (vb->num_pages && pages_to_free) {
+		pages_to_free = balloon_pages_to_shrink - pages_freed;
+		pages_freed += leak_balloon(vb, pages_to_free);
+	}
+	update_balloon_size(vb);
+
+	return pages_freed / VIRTIO_BALLOON_PAGES_PER_PAGE;
+}
+
+static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
+						   struct shrink_control *sc)
+{
+	struct virtio_balloon *vb = container_of(shrinker,
+					struct virtio_balloon, shrinker);
+
+	return min_t(unsigned long, vb->num_pages, balloon_pages_to_shrink) /
+	       VIRTIO_BALLOON_PAGES_PER_PAGE;
+}
+
+static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
+{
+	unregister_shrinker(&vb->shrinker);
+}
+
+static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
+{
+	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
+	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
+	vb->shrinker.batch = 0;
+	vb->shrinker.seeks = DEFAULT_SEEKS;
+
+	return register_shrinker(&vb->shrinker);
+}
+
 static int virtballoon_probe(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb;
@@ -580,17 +595,10 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	if (err)
 		goto out_free_vb;
 
-	vb->nb.notifier_call = virtballoon_oom_notify;
-	vb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY;
-	err = register_oom_notifier(&vb->nb);
-	if (err < 0)
-		goto out_del_vqs;
-
 #ifdef CONFIG_BALLOON_COMPACTION
 	balloon_mnt = kern_mount(&balloon_fs);
 	if (IS_ERR(balloon_mnt)) {
 		err = PTR_ERR(balloon_mnt);
-		unregister_oom_notifier(&vb->nb);
 		goto out_del_vqs;
 	}
 
@@ -599,13 +607,20 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	if (IS_ERR(vb->vb_dev_info.inode)) {
 		err = PTR_ERR(vb->vb_dev_info.inode);
 		kern_unmount(balloon_mnt);
-		unregister_oom_notifier(&vb->nb);
 		vb->vb_dev_info.inode = NULL;
 		goto out_del_vqs;
 	}
 	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
 #endif
-
+	/*
+	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
+	 * shrinker needs to be registered to relieve memory pressure.
+	 */
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
+		err = virtio_balloon_register_shrinker(vb);
+		if (err)
+			goto out_del_vqs;
+	}
 	virtio_device_ready(vdev);
 
 	if (towards_target(vb))
@@ -637,8 +652,8 @@ static void virtballoon_remove(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb = vdev->priv;
 
-	unregister_oom_notifier(&vb->nb);
-
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
+		virtio_balloon_unregister_shrinker(vb);
 	spin_lock_irq(&vb->stop_update_lock);
 	vb->stop_update = true;
 	spin_unlock_irq(&vb->stop_update_lock);
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 1/2] virtio-balloon: remove BUG() in init_vqs
From: Wei Wang @ 2018-07-27  9:24 UTC (permalink / raw)
  To: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko,
	akpm
In-Reply-To: <1532683495-31974-1-git-send-email-wei.w.wang@intel.com>

It's a bit overkill to use BUG when failing to add an entry to the
stats_vq in init_vqs. So remove it and just return the error to the
caller to bail out nicely.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 6b237e3..9356a1a 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -455,9 +455,13 @@ static int init_vqs(struct virtio_balloon *vb)
 		num_stats = update_balloon_stats(vb);
 
 		sg_init_one(&sg, vb->stats, sizeof(vb->stats[0]) * num_stats);
-		if (virtqueue_add_outbuf(vb->stats_vq, &sg, 1, vb, GFP_KERNEL)
-		    < 0)
-			BUG();
+		err = virtqueue_add_outbuf(vb->stats_vq, &sg, 1, vb,
+					   GFP_KERNEL);
+		if (err) {
+			dev_warn(&vb->vdev->dev, "%s: add stat_vq failed\n",
+				 __func__);
+			return err;
+		}
 		virtqueue_kick(vb->stats_vq);
 	}
 	return 0;
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 0/2] virtio-balloon: some improvements
From: Wei Wang @ 2018-07-27  9:24 UTC (permalink / raw)
  To: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko,
	akpm

This series is split from the "Virtio-balloon: support free page
reporting" series to make some improvements.

v1->v2 ChangeLog:
- register the shrinker when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is negotiated.

Wei Wang (2):
  virtio-balloon: remove BUG() in init_vqs
  virtio_balloon: replace oom notifier with shrinker

 drivers/virtio/virtio_balloon.c | 125 +++++++++++++++++++++++-----------------
 1 file changed, 72 insertions(+), 53 deletions(-)

-- 
2.7.4

^ permalink raw reply

* IEEE Record # 41985: 2018 3rd International Conference on Contemporary Computing and Informatics (IC3I).
From: Dr. S K Niranjan Aradhya @ 2018-07-27  6:03 UTC (permalink / raw)
  To: virtualization


[-- Attachment #1.1: Type: text/plain, Size: 1576 bytes --]

*<< Apologies for cross-postings >><<< Please circulate among your friends,
peers and researchers >>>*

IEEE Conference Record No.: #41985;

2018 3rd International Conference on Contemporary Computing and Informatics
(IC3I).

Conference Date : 10 - 12 October 2018
Submission Deadline: 30 July 2018

*Submission Link:http://cmsweb.com.sg/ic3i18/index.php/ic3i18/ic3i18/login
<http://cmsweb.com.sg/ic3i18/index.php/ic3i18/ic3i18/login>*

IEEE ISBN : 978-1-5386-6894-8
IEEE Part No. : CFP18AWQ-ART

Selected, accepted and extended paper will be published in Scopus Indexed
International Journal of Forensic Software Engineering published by
InderScience

All accepted and presented papers will be submitted to the IEEE for
possible publication in IEEE Xplore Digital Library. Previous edition
indexed in: SCOPUS, ISI Web of Science, Engineering Index, Google, etc.

If you like to join the TPC or propose a special session or symposiums
please write to: secretariat@ic3i.org

General Chair(s)
IC3I  2018 Conference

----------------------
Disclaimer: We have clearly mentioned the subject lines and your email
address won't be misleading in any form. We have found your mail address
through our own efforts on the web search and not through any illegal way.
If you wish to remove your information from our mailing list or no longer
receive future announcements, please email with REMOVE in subject. Your
request to opt-out will be effective within a reasonable amount of time.
 ic3i-cfp.pdf
<https://drive.google.com/file/d/1wjyVxnuBxgZoHxrqNxxdDzPumPVHu4ma/view?usp=drive_web>

[-- Attachment #1.2: Type: text/html, Size: 2640 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: net-next boot error
From: Steven Rostedt @ 2018-07-26 14:17 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Michael S. Tsirkin, Marc Zyngier, netdev, Tetsuo Handa,
	syzkaller-bugs, LKML, David Miller, Peter Zijlstra, Jason Baron,
	Josh Poimboeuf, syzbot, Paolo Bonzini, Thomas Gleixner,
	Borislav Petkov, virtualization, Ingo Molnar
In-Reply-To: <CACT4Y+ZBf_X=wKQLnAtVz_P-y-6L7+azmpRqT5XFaH5ySx3UiQ@mail.gmail.com>


[ Added Thomas Gleixner ]


On Thu, 26 Jul 2018 11:34:39 +0200
Dmitry Vyukov <dvyukov@google.com> wrote:

> On Thu, Jul 26, 2018 at 11:29 AM, syzbot
> <syzbot+604f8271211546f5b3c7@syzkaller.appspotmail.com> wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    dc66fe43b7eb rds: send: Fix dead code in rds_sendmsg
> > git tree:       net-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=127874c8400000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=f34ce142a9f5f0e8
> > dashboard link: https://syzkaller.appspot.com/bug?extid=604f8271211546f5b3c7
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+604f8271211546f5b3c7@syzkaller.appspotmail.com
> >
> > possible deadlock in static_key_slow_incsd 0:0:1:0: [sda] Attached SCSI disk
> > MACsec IEEE 802.1AE
> > tun: Universal TUN/TAP device driver, 1.6
> >
> > ============================================
> > WARNING: possible recursive locking detected  
> 
> +Tetsuo, perhaps this boot lockdep problem then disables lockdep for
> actual testing. I think lockdep should respect panic_on_warn.
> 
> 
> > 4.18.0-rc6+ #141 Not tainted
> > --------------------------------------------
> > swapper/0/1 is trying to acquire lock:
> > (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at:
> > static_key_slow_inc+0x12/0x30 kernel/jump_label.c:124
> >
> > but task is already holding lock:
> > (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: get_online_cpus
> > include/linux/cpu.h:126 [inline]
> > (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0xe1a/0x1520
> > drivers/net/virtio_net.c:2777

Here init_vqs() does:

	get_online_cpus();
	virtnet_set_affinity(vi);
	put_online_cpus();

Which disables cpu hotplug and calls virtnet_set_affinity()

Note, get_online_cpus() is no longer recursive.

> >
> > other info that might help us debug this:
> >  Possible unsafe locking scenario:
> >
> >        CPU0
> >        ----
> >   lock(cpu_hotplug_lock.rw_sem);
> >   lock(cpu_hotplug_lock.rw_sem);
> >
> >  *** DEADLOCK ***
> >
> >  May be due to missing lock nesting notation
> >
> > 3 locks held by swapper/0/1:
> >  #0: (____ptrval____) (&dev->mutex){....}, at: device_lock
> > include/linux/device.h:1134 [inline]
> >  #0: (____ptrval____) (&dev->mutex){....}, at: __driver_attach+0x15f/0x2f0
> > drivers/base/dd.c:820
> >  #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: get_online_cpus
> > include/linux/cpu.h:126 [inline]
> >  #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at:
> > init_vqs+0xe1a/0x1520 drivers/net/virtio_net.c:2777
> >  #2: (____ptrval____) (xps_map_mutex){+.+.}, at:
> > __netif_set_xps_queue+0x243/0x23f0 net/core/dev.c:2278
> >
> > stack backtrace:
> > CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc6+ #141
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
> >  print_deadlock_bug kernel/locking/lockdep.c:1765 [inline]
> >  check_deadlock kernel/locking/lockdep.c:1809 [inline]
> >  validate_chain kernel/locking/lockdep.c:2405 [inline]
> >  __lock_acquire.cold.65+0x1fb/0x486 kernel/locking/lockdep.c:3435
> >  lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
> >  percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:36 [inline]
> >  percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
> >  cpus_read_lock+0x43/0xa0 kernel/cpu.c:289
> >  static_key_slow_inc+0x12/0x30 kernel/jump_label.c:124
> >  __netif_set_xps_queue+0xaac/0x23f0 net/core/dev.c:2320


__netif_set_xps_queue() calls static_key_slow_inc() which will also do
a get_online_cpus() which will trigger this bug.

There's a static_key_slow_inc_cpuslocked() version that should be used
when get_online_cpus() is already taken, but I see
__netif_set_xps_queue() is called from several places, and I doubt it
is always called with get_online_cpus() held. Thus just using the
cpuslocked() version is probably not sufficient of a fix.

I don't know the code enough to offer other suggestions.

-- Steve


> >  netif_set_xps_queue+0x26/0x30 net/core/dev.c:2455
> >  virtnet_set_affinity+0x2ba/0x4b0 drivers/net/virtio_net.c:1944
> >  init_vqs+0xe22/0x1520 drivers/net/virtio_net.c:2778
> >  virtnet_probe+0x1092/0x2260 drivers/net/virtio_net.c:3016
> >  virtio_dev_probe+0x592/0x942 drivers/virtio/virtio.c:245
> >  really_probe drivers/base/dd.c:446 [inline]
> >  driver_probe_device+0x6ad/0x970 drivers/base/dd.c:588
> >  __driver_attach+0x28b/0x2f0 drivers/base/dd.c:822
> >  bus_for_each_dev+0x15d/0x1f0 drivers/base/bus.c:311
> >  driver_attach+0x3d/0x50 drivers/base/dd.c:841
> >  bus_add_driver+0x4b2/0x600 drivers/base/bus.c:667
> >  driver_register+0x1c8/0x320 drivers/base/driver.c:170
> >  register_virtio_driver+0x79/0xd0 drivers/virtio/virtio.c:296
> >  virtio_net_driver_init+0x8d/0xc9 drivers/net/virtio_net.c:3209
> >  do_one_initcall+0x127/0x913 init/main.c:884
> >  do_initcall_level init/main.c:952 [inline]
> >  do_initcalls init/main.c:960 [inline]
> >  do_basic_setup init/main.c:978 [inline]
> >  kernel_init_freeable+0x49b/0x58e init/main.c:1135
> >  kernel_init+0x11/0x1b3 init/main.c:1061
> >  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> > vcan: Virtual CAN interface driver
> > vxcan: Virtual CAN Tunnel driver
> > slcan: serial line CAN interface driver
> > slcan: 10 dynamic interface channels.
> > CAN device driver interface
> > enic: Cisco VIC Ethernet NIC Driver, ver 2.3.0.53
> > e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
> > e100: Copyright(c) 1999-2006 Intel Corporation
> > e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
> > e1000: Copyright (c) 1999-2006 Intel Corporation.
> > e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> > e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> > sky2: driver version 1.30
> > PPP generic driver version 2.4.2
> > PPP BSD Compression module registered
> > PPP Deflate Compression module registered
> > PPP MPPE Compression module registered
> > NET: Registered protocol family 24
> > PPTP driver version 0.8.5
> > mac80211_hwsim: initializing netlink
> > ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
> > ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
> > usbcore: registered new interface driver asix
> > usbcore: registered new interface driver ax88179_178a
> > usbcore: registered new interface driver cdc_ether
> > usbcore: registered new interface driver net1080
> > usbcore: registered new interface driver cdc_subset
> > usbcore: registered new interface driver zaurus
> > usbcore: registered new interface driver cdc_ncm
> > aoe: AoE v85 initialised.
> > ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > ehci-pci: EHCI PCI platform driver
> > ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > ohci-pci: OHCI PCI platform driver
> > uhci_hcd: USB Universal Host Controller Interface driver
> > usbcore: registered new interface driver usblp
> > usbcore: registered new interface driver usb-storage
> > i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
> > i8042: Warning: Keylock active
> > serio: i8042 KBD port at 0x60,0x64 irq 1
> > serio: i8042 AUX port at 0x60,0x64 irq 12
> > mousedev: PS/2 mouse device common for all mice
> > rtc_cmos 00:00: RTC can wake from S4
> > rtc_cmos 00:00: registered as rtc0
> > rtc_cmos 00:00: alarms up to one day, 114 bytes nvram
> > i2c /dev entries driver
> > piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or
> > use force_addr=0xaddr
> > i2c-parport-light: adapter type unspecified
> > usbcore: registered new interface driver RobotFuzz Open Source InterFace,
> > OSIF
> > usbcore: registered new interface driver i2c-tiny-usb
> > device-mapper: ioctl: 4.39.0-ioctl (2018-04-03) initialised:
> > dm-devel@redhat.com
> > device-mapper: raid: Loading target version 1.13.2
> > usbcore: registered new interface driver btusb
> > usnic_verbs: Cisco VIC (USNIC) Verbs Driver v1.0.3 (December 19, 2013)
> > usnic_verbs:usnic_uiom_init:585:
> > IOMMU required but not present or enabled.  USNIC QPs will not function w/o
> > enabling IOMMU
> > usnic_verbs:usnic_ib_init:649:
> > Unable to initalize umem with err -1
> > iscsi: registered transport (iser)
> > OPA Virtual Network Driver - v1.0
> > hidraw: raw HID events driver (C) Jiri Kosina
> > usbcore: registered new interface driver usbhid
> > usbhid: USB HID core driver
> > NET: Registered protocol family 40
> > ashmem: initialized
> > NET: Registered protocol family 26
> > Mirror/redirect action on
> > Simple TC action Loaded
> > netem: version 1.3
> > u32 classifier
> >     Actions configured
> > nf_conntrack_irc: failed to register helpers
> > nf_conntrack_sane: failed to register helpers
> > nf_conntrack_sip: failed to register helpers
> > xt_time: kernel timezone is -0000
> > IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
> > IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
> > IPVS: ipvs loaded.
> > IPVS: [rr] scheduler registered.
> > IPVS: [wrr] scheduler registered.
> > IPVS: [lc] scheduler registered.
> > IPVS: [wlc] scheduler registered.
> > IPVS: [fo] scheduler registered.
> > IPVS: [ovf] scheduler registered.
> > IPVS: [lblc] scheduler registered.
> > IPVS: [lblcr] scheduler registered.
> > IPVS: [dh] scheduler registered.
> > IPVS: [sh] scheduler registered.
> > IPVS: [mh] scheduler registered.
> > IPVS: [sed] scheduler registered.
> > IPVS: [nq] scheduler registered.
> > IPVS: ftp: loaded support on port[0] = 21
> > IPVS: [sip] pe registered.
> > ipip: IPv4 and MPLS over IPv4 tunneling driver
> > gre: GRE over IPv4 demultiplexor driver
> > ip_gre: GRE over IPv4 tunneling driver
> > IPv4 over IPsec tunneling driver
> > ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
> > Initializing XFRM netlink socket
> > NET: Registered protocol family 10
> > Segment Routing with IPv6
> > mip6: Mobile IPv6
> > sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> > ip6_gre: GRE over IPv6 tunneling driver
> > bpfilter: Loaded bpfilter_umh pid 2080
> > NET: Registered protocol family 15
> > Bridge firewalling registered
> > can: controller area network core (rev 20170425 abi 9)
> > NET: Registered protocol family 29
> > can: raw protocol (rev 20170425)
> > can: broadcast manager protocol (rev 20170425 t)
> > can: netlink gateway (rev 20170425) max_hops=1
> > Bluetooth: RFCOMM TTY layer initialized
> > Bluetooth: RFCOMM socket layer initialized
> > Bluetooth: RFCOMM ver 1.11
> > Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> > Bluetooth: BNEP filters: protocol multicast
> > Bluetooth: BNEP socket layer initialized
> > Bluetooth: HIDP (Human Interface Emulation) ver 1.2
> > Bluetooth: HIDP socket layer initialized
> > RPC: Registered rdma transport module.
> > RPC: Registered rdma backchannel transport module.
> > NET: Registered protocol family 41
> > lec:lane_module_init: lec.c: initialized
> > mpoa:atm_mpoa_init: mpc.c: initialized
> > l2tp_core: L2TP core driver, V2.0
> > l2tp_ppp: PPPoL2TP kernel driver, V2.0
> > 8021q: 802.1Q VLAN Support v1.8
> > input: AT Translated Set 2 keyboard as
> > /devices/platform/i8042/serio0/input/input2
> > DCCP: Activated CCID 2 (TCP-like)
> > DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
> > sctp: Hash tables configured (bind 64/64)
> > tipc: Activated (version 2.0.0)
> > NET: Registered protocol family 30
> > tipc: Started in single node mode
> > NET: Registered protocol family 43
> > 9pnet: Installing 9P2000 support
> > NET: Registered protocol family 36
> > Key type dns_resolver registered
> > Key type ceph registered
> > libceph: loaded (mon/osd proto 15/24)
> > openvswitch: Open vSwitch switching datapath
> > mpls_gso: MPLS GSO support
> > start plist test
> > end plist test
> > AVX2 version of gcm_enc/dec engaged.
> > AES CTR mode by8 optimization enabled
> > sched_clock: Marking stable (4559438359, 0)->(6126385605, -1566947246)
> > registered taskstats version 1
> > Loading compiled-in X.509 certificates
> > zswap: default zpool zbud not available
> > zswap: pool creation failed
> > Btrfs loaded, crc32c=crc32c-intel
> > Key type big_key registered
> > Key type encrypted registered
> >   Magic number: 10:317:168
> > console [netcon0] enabled
> > netconsole: network logging started
> > gtp: GTP module loaded (pdp ctx size 104 bytes)
> > rdma_rxe: loaded
> > cfg80211: Loading compiled-in X.509 certificates for regulatory database
> > cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> > platform regulatory.0: Direct firmware load for regulatory.db failed with
> > error -2
> > cfg80211: failed to load regulatory.db
> > ALSA device list:
> >   #0: Dummy 1
> >   #1: Loopback 1
> >   #2: Virtual MIDI Card 1
> > input: ImExPS/2 Generic Explorer Mouse as
> > /devices/platform/i8042/serio1/input/input4
> > md: Waiting for all devices to be available before autodetect
> > md: If you don't use raid, use raid=noautodetect
> > md: Autodetecting RAID arrays.
> > md: autorun ...
> > md: ... autorun DONE.
> > EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
> > VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
> > devtmpfs: mounted
> > Freeing unused kernel memory: 3900K
> > Kernel memory protection disabled.
> > SELinux:  Disabled at runtime.
> > SELinux:  Unregistering netfilter hooks
> > audit: type=1404 audit(1532588961.277:2): enforcing=0 old_enforcing=0
> > auid=4294967295 ses=4294967295 enabled=0 old-enabled=1 lsm=selinux res=1
> > stty (2166) used greatest stack depth: 19664 bytes left
> > EXT4-fs (sda1): re-mounted. Opts: (null)
> > logsave (3615) used greatest stack depth: 17632 bytes left
> > random: dd: uninitialized urandom read (512 bytes read)
> > ==================================================================
> > BUG: KASAN: slab-out-of-bounds in virtnet_receive
> > drivers/net/virtio_net.c:1356 [inline]  
> 
> +virtio maintainers for this one
> Probably something very recent.
> 
> > BUG: KASAN: slab-out-of-bounds in virtnet_poll+0x111a/0x1226
> > drivers/net/virtio_net.c:1421
> > Read of size 8 at addr ffff8801cee08ff0 by task ip/3969
> >
> > CPU: 0 PID: 3969 Comm: ip Not tainted 4.18.0-rc6+ #141
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> >  <IRQ>
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
> >  print_address_description+0x6c/0x20b mm/kasan/report.c:256
> >  kasan_report_error mm/kasan/report.c:354 [inline]
> >  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >  virtnet_receive drivers/net/virtio_net.c:1356 [inline]
> >  virtnet_poll+0x111a/0x1226 drivers/net/virtio_net.c:1421
> >  napi_poll net/core/dev.c:6214 [inline]
> >  net_rx_action+0x7a5/0x1920 net/core/dev.c:6280
> >  __do_softirq+0x2e8/0xb17 kernel/softirq.c:292
> >  do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1046
> >  </IRQ>
> >  do_softirq.part.18+0x155/0x1a0 kernel/softirq.c:336
> >  do_softirq arch/x86/include/asm/preempt.h:23 [inline]
> >  __local_bh_enable_ip+0x1ec/0x230 kernel/softirq.c:189
> >  local_bh_enable include/linux/bottom_half.h:32 [inline]
> >  virtnet_napi_enable+0x8c/0xb0 drivers/net/virtio_net.c:1264
> >  virtnet_open+0x16d/0x4d0 drivers/net/virtio_net.c:1464
> >  __dev_open+0x26d/0x410 net/core/dev.c:1392
> >  __dev_change_flags+0x739/0x9c0 net/core/dev.c:7434
> >  dev_change_flags+0x89/0x150 net/core/dev.c:7503
> >  do_setlink+0xb16/0x3dd0 net/core/rtnetlink.c:2416
> >  rtnl_newlink+0x138d/0x1d60 net/core/rtnetlink.c:3029
> >  rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4705
> >  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2447
> >  rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4723
> >  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
> >  netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1336
> >  netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1901
> >  sock_sendmsg_nosec net/socket.c:641 [inline]
> >  sock_sendmsg+0xd5/0x120 net/socket.c:651
> >  ___sys_sendmsg+0x7fd/0x930 net/socket.c:2125
> >  __sys_sendmsg+0x11d/0x290 net/socket.c:2163
> >  __do_sys_sendmsg net/socket.c:2172 [inline]
> >  __se_sys_sendmsg net/socket.c:2170 [inline]
> >  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2170
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x7f318d594320
> > Code: 02 48 83 c8 ff eb 8d 48 8b 05 14 7b 2a 00 f7 da 64 89 10 48 83 c8 ff
> > eb c9 90 83 3d d5 d2 2a 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff
> > 73 31 c3 48 83 ec 08 e8 5e ba 00 00 48 89 04 24
> > RSP: 002b:00007ffd985d8f38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> > RAX: ffffffffffffffda RBX: 00007ffd985dd030 RCX: 00007f318d594320
> > RDX: 0000000000000000 RSI: 00007ffd985d8f70 RDI: 0000000000000003
> > RBP: 00007ffd985d8f70 R08: 0000000000000000 R09: 000000000000000f
> > R10: 0000000000000000 R11: 0000000000000246 R12: 000000005b5973aa
> > R13: 0000000000000000 R14: 00000000006395c0 R15: 00007ffd985dd808
> >
> > Allocated by task 1:
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >  set_track mm/kasan/kasan.c:460 [inline]
> >  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
> >  __do_kmalloc mm/slab.c:3718 [inline]
> >  __kmalloc+0x14e/0x760 mm/slab.c:3727
> >  kmalloc_array include/linux/slab.h:635 [inline]
> >  kcalloc include/linux/slab.h:646 [inline]
> >  virtnet_alloc_queues drivers/net/virtio_net.c:2731 [inline]
> >  init_vqs+0x127/0x1520 drivers/net/virtio_net.c:2769
> >  virtnet_probe+0x1092/0x2260 drivers/net/virtio_net.c:3016
> >  virtio_dev_probe+0x592/0x942 drivers/virtio/virtio.c:245
> >  really_probe drivers/base/dd.c:446 [inline]
> >  driver_probe_device+0x6ad/0x970 drivers/base/dd.c:588
> >  __driver_attach+0x28b/0x2f0 drivers/base/dd.c:822
> >  bus_for_each_dev+0x15d/0x1f0 drivers/base/bus.c:311
> >  driver_attach+0x3d/0x50 drivers/base/dd.c:841
> >  bus_add_driver+0x4b2/0x600 drivers/base/bus.c:667
> >  driver_register+0x1c8/0x320 drivers/base/driver.c:170
> >  register_virtio_driver+0x79/0xd0 drivers/virtio/virtio.c:296
> >  virtio_net_driver_init+0x8d/0xc9 drivers/net/virtio_net.c:3209
> >  do_one_initcall+0x127/0x913 init/main.c:884
> >  do_initcall_level init/main.c:952 [inline]
> >  do_initcalls init/main.c:960 [inline]
> >  do_basic_setup init/main.c:978 [inline]
> >  kernel_init_freeable+0x49b/0x58e init/main.c:1135
> >  kernel_init+0x11/0x1b3 init/main.c:1061
> >  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> >
> > Freed by task 0:
> > (stack is not available)
> >
> > The buggy address belongs to the object at ffff8801cee08500
> >  which belongs to the cache kmalloc-4096 of size 4096
> > The buggy address is located 2800 bytes inside of
> >  4096-byte region [ffff8801cee08500, ffff8801cee09500)
> > The buggy address belongs to the page:
> > page:ffffea00073b8200 count:1 mapcount:0 mapping:ffff8801dac00dc0 index:0x0
> > compound_mapcount: 0
> > flags: 0x2fffc0000008100(slab|head)
> > raw: 02fffc0000008100 ffffea00073b7d88 ffffea00073b8288 ffff8801dac00dc0
> > raw: 0000000000000000 ffff8801cee08500 0000000100000001 0000000000000000
> > page dumped because: kasan: bad access detected
> >
> > Memory state around the buggy address:
> >  ffff8801cee08e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >  ffff8801cee08f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc  
> >>
> >> ffff8801cee08f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc  
> >
> >                                                              ^
> >  ffff8801cee09000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >  ffff8801cee09080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > ==================================================================
> >
> >
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> >
> > syzbot will keep track of this bug report. See:
> > https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> > syzbot.
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "syzkaller-bugs" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to syzkaller-bugs+unsubscribe@googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/syzkaller-bugs/000000000000352dc20571e3a0d8%40google.com.
> > For more options, visit https://groups.google.com/d/optout.  

^ permalink raw reply

* Re: net-next boot error
From: Dmitry Vyukov via Virtualization @ 2018-07-26  9:34 UTC (permalink / raw)
  To: syzbot
  Cc: Michael S. Tsirkin, Marc Zyngier, netdev, Tetsuo Handa,
	syzkaller-bugs, LKML, Steven Rostedt, David Miller,
	Peter Zijlstra, Jason Baron, Josh Poimboeuf, Paolo Bonzini,
	Borislav Petkov, virtualization, Ingo Molnar
In-Reply-To: <000000000000352dc20571e3a0d8@google.com>

On Thu, Jul 26, 2018 at 11:29 AM, syzbot
<syzbot+604f8271211546f5b3c7@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    dc66fe43b7eb rds: send: Fix dead code in rds_sendmsg
> git tree:       net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=127874c8400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f34ce142a9f5f0e8
> dashboard link: https://syzkaller.appspot.com/bug?extid=604f8271211546f5b3c7
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+604f8271211546f5b3c7@syzkaller.appspotmail.com
>
> possible deadlock in static_key_slow_incsd 0:0:1:0: [sda] Attached SCSI disk
> MACsec IEEE 802.1AE
> tun: Universal TUN/TAP device driver, 1.6
>
> ============================================
> WARNING: possible recursive locking detected

+Tetsuo, perhaps this boot lockdep problem then disables lockdep for
actual testing. I think lockdep should respect panic_on_warn.


> 4.18.0-rc6+ #141 Not tainted
> --------------------------------------------
> swapper/0/1 is trying to acquire lock:
> (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at:
> static_key_slow_inc+0x12/0x30 kernel/jump_label.c:124
>
> but task is already holding lock:
> (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: get_online_cpus
> include/linux/cpu.h:126 [inline]
> (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0xe1a/0x1520
> drivers/net/virtio_net.c:2777
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(cpu_hotplug_lock.rw_sem);
>   lock(cpu_hotplug_lock.rw_sem);
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
> 3 locks held by swapper/0/1:
>  #0: (____ptrval____) (&dev->mutex){....}, at: device_lock
> include/linux/device.h:1134 [inline]
>  #0: (____ptrval____) (&dev->mutex){....}, at: __driver_attach+0x15f/0x2f0
> drivers/base/dd.c:820
>  #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: get_online_cpus
> include/linux/cpu.h:126 [inline]
>  #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at:
> init_vqs+0xe1a/0x1520 drivers/net/virtio_net.c:2777
>  #2: (____ptrval____) (xps_map_mutex){+.+.}, at:
> __netif_set_xps_queue+0x243/0x23f0 net/core/dev.c:2278
>
> stack backtrace:
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc6+ #141
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
>  print_deadlock_bug kernel/locking/lockdep.c:1765 [inline]
>  check_deadlock kernel/locking/lockdep.c:1809 [inline]
>  validate_chain kernel/locking/lockdep.c:2405 [inline]
>  __lock_acquire.cold.65+0x1fb/0x486 kernel/locking/lockdep.c:3435
>  lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
>  percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:36 [inline]
>  percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
>  cpus_read_lock+0x43/0xa0 kernel/cpu.c:289
>  static_key_slow_inc+0x12/0x30 kernel/jump_label.c:124
>  __netif_set_xps_queue+0xaac/0x23f0 net/core/dev.c:2320
>  netif_set_xps_queue+0x26/0x30 net/core/dev.c:2455
>  virtnet_set_affinity+0x2ba/0x4b0 drivers/net/virtio_net.c:1944
>  init_vqs+0xe22/0x1520 drivers/net/virtio_net.c:2778
>  virtnet_probe+0x1092/0x2260 drivers/net/virtio_net.c:3016
>  virtio_dev_probe+0x592/0x942 drivers/virtio/virtio.c:245
>  really_probe drivers/base/dd.c:446 [inline]
>  driver_probe_device+0x6ad/0x970 drivers/base/dd.c:588
>  __driver_attach+0x28b/0x2f0 drivers/base/dd.c:822
>  bus_for_each_dev+0x15d/0x1f0 drivers/base/bus.c:311
>  driver_attach+0x3d/0x50 drivers/base/dd.c:841
>  bus_add_driver+0x4b2/0x600 drivers/base/bus.c:667
>  driver_register+0x1c8/0x320 drivers/base/driver.c:170
>  register_virtio_driver+0x79/0xd0 drivers/virtio/virtio.c:296
>  virtio_net_driver_init+0x8d/0xc9 drivers/net/virtio_net.c:3209
>  do_one_initcall+0x127/0x913 init/main.c:884
>  do_initcall_level init/main.c:952 [inline]
>  do_initcalls init/main.c:960 [inline]
>  do_basic_setup init/main.c:978 [inline]
>  kernel_init_freeable+0x49b/0x58e init/main.c:1135
>  kernel_init+0x11/0x1b3 init/main.c:1061
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> vcan: Virtual CAN interface driver
> vxcan: Virtual CAN Tunnel driver
> slcan: serial line CAN interface driver
> slcan: 10 dynamic interface channels.
> CAN device driver interface
> enic: Cisco VIC Ethernet NIC Driver, ver 2.3.0.53
> e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
> e100: Copyright(c) 1999-2006 Intel Corporation
> e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
> e1000: Copyright (c) 1999-2006 Intel Corporation.
> e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> sky2: driver version 1.30
> PPP generic driver version 2.4.2
> PPP BSD Compression module registered
> PPP Deflate Compression module registered
> PPP MPPE Compression module registered
> NET: Registered protocol family 24
> PPTP driver version 0.8.5
> mac80211_hwsim: initializing netlink
> ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
> ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
> usbcore: registered new interface driver asix
> usbcore: registered new interface driver ax88179_178a
> usbcore: registered new interface driver cdc_ether
> usbcore: registered new interface driver net1080
> usbcore: registered new interface driver cdc_subset
> usbcore: registered new interface driver zaurus
> usbcore: registered new interface driver cdc_ncm
> aoe: AoE v85 initialised.
> ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> ehci-pci: EHCI PCI platform driver
> ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> ohci-pci: OHCI PCI platform driver
> uhci_hcd: USB Universal Host Controller Interface driver
> usbcore: registered new interface driver usblp
> usbcore: registered new interface driver usb-storage
> i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
> i8042: Warning: Keylock active
> serio: i8042 KBD port at 0x60,0x64 irq 1
> serio: i8042 AUX port at 0x60,0x64 irq 12
> mousedev: PS/2 mouse device common for all mice
> rtc_cmos 00:00: RTC can wake from S4
> rtc_cmos 00:00: registered as rtc0
> rtc_cmos 00:00: alarms up to one day, 114 bytes nvram
> i2c /dev entries driver
> piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or
> use force_addr=0xaddr
> i2c-parport-light: adapter type unspecified
> usbcore: registered new interface driver RobotFuzz Open Source InterFace,
> OSIF
> usbcore: registered new interface driver i2c-tiny-usb
> device-mapper: ioctl: 4.39.0-ioctl (2018-04-03) initialised:
> dm-devel@redhat.com
> device-mapper: raid: Loading target version 1.13.2
> usbcore: registered new interface driver btusb
> usnic_verbs: Cisco VIC (USNIC) Verbs Driver v1.0.3 (December 19, 2013)
> usnic_verbs:usnic_uiom_init:585:
> IOMMU required but not present or enabled.  USNIC QPs will not function w/o
> enabling IOMMU
> usnic_verbs:usnic_ib_init:649:
> Unable to initalize umem with err -1
> iscsi: registered transport (iser)
> OPA Virtual Network Driver - v1.0
> hidraw: raw HID events driver (C) Jiri Kosina
> usbcore: registered new interface driver usbhid
> usbhid: USB HID core driver
> NET: Registered protocol family 40
> ashmem: initialized
> NET: Registered protocol family 26
> Mirror/redirect action on
> Simple TC action Loaded
> netem: version 1.3
> u32 classifier
>     Actions configured
> nf_conntrack_irc: failed to register helpers
> nf_conntrack_sane: failed to register helpers
> nf_conntrack_sip: failed to register helpers
> xt_time: kernel timezone is -0000
> IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
> IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
> IPVS: ipvs loaded.
> IPVS: [rr] scheduler registered.
> IPVS: [wrr] scheduler registered.
> IPVS: [lc] scheduler registered.
> IPVS: [wlc] scheduler registered.
> IPVS: [fo] scheduler registered.
> IPVS: [ovf] scheduler registered.
> IPVS: [lblc] scheduler registered.
> IPVS: [lblcr] scheduler registered.
> IPVS: [dh] scheduler registered.
> IPVS: [sh] scheduler registered.
> IPVS: [mh] scheduler registered.
> IPVS: [sed] scheduler registered.
> IPVS: [nq] scheduler registered.
> IPVS: ftp: loaded support on port[0] = 21
> IPVS: [sip] pe registered.
> ipip: IPv4 and MPLS over IPv4 tunneling driver
> gre: GRE over IPv4 demultiplexor driver
> ip_gre: GRE over IPv4 tunneling driver
> IPv4 over IPsec tunneling driver
> ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
> Initializing XFRM netlink socket
> NET: Registered protocol family 10
> Segment Routing with IPv6
> mip6: Mobile IPv6
> sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> ip6_gre: GRE over IPv6 tunneling driver
> bpfilter: Loaded bpfilter_umh pid 2080
> NET: Registered protocol family 15
> Bridge firewalling registered
> can: controller area network core (rev 20170425 abi 9)
> NET: Registered protocol family 29
> can: raw protocol (rev 20170425)
> can: broadcast manager protocol (rev 20170425 t)
> can: netlink gateway (rev 20170425) max_hops=1
> Bluetooth: RFCOMM TTY layer initialized
> Bluetooth: RFCOMM socket layer initialized
> Bluetooth: RFCOMM ver 1.11
> Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> Bluetooth: BNEP filters: protocol multicast
> Bluetooth: BNEP socket layer initialized
> Bluetooth: HIDP (Human Interface Emulation) ver 1.2
> Bluetooth: HIDP socket layer initialized
> RPC: Registered rdma transport module.
> RPC: Registered rdma backchannel transport module.
> NET: Registered protocol family 41
> lec:lane_module_init: lec.c: initialized
> mpoa:atm_mpoa_init: mpc.c: initialized
> l2tp_core: L2TP core driver, V2.0
> l2tp_ppp: PPPoL2TP kernel driver, V2.0
> 8021q: 802.1Q VLAN Support v1.8
> input: AT Translated Set 2 keyboard as
> /devices/platform/i8042/serio0/input/input2
> DCCP: Activated CCID 2 (TCP-like)
> DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
> sctp: Hash tables configured (bind 64/64)
> tipc: Activated (version 2.0.0)
> NET: Registered protocol family 30
> tipc: Started in single node mode
> NET: Registered protocol family 43
> 9pnet: Installing 9P2000 support
> NET: Registered protocol family 36
> Key type dns_resolver registered
> Key type ceph registered
> libceph: loaded (mon/osd proto 15/24)
> openvswitch: Open vSwitch switching datapath
> mpls_gso: MPLS GSO support
> start plist test
> end plist test
> AVX2 version of gcm_enc/dec engaged.
> AES CTR mode by8 optimization enabled
> sched_clock: Marking stable (4559438359, 0)->(6126385605, -1566947246)
> registered taskstats version 1
> Loading compiled-in X.509 certificates
> zswap: default zpool zbud not available
> zswap: pool creation failed
> Btrfs loaded, crc32c=crc32c-intel
> Key type big_key registered
> Key type encrypted registered
>   Magic number: 10:317:168
> console [netcon0] enabled
> netconsole: network logging started
> gtp: GTP module loaded (pdp ctx size 104 bytes)
> rdma_rxe: loaded
> cfg80211: Loading compiled-in X.509 certificates for regulatory database
> cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> platform regulatory.0: Direct firmware load for regulatory.db failed with
> error -2
> cfg80211: failed to load regulatory.db
> ALSA device list:
>   #0: Dummy 1
>   #1: Loopback 1
>   #2: Virtual MIDI Card 1
> input: ImExPS/2 Generic Explorer Mouse as
> /devices/platform/i8042/serio1/input/input4
> md: Waiting for all devices to be available before autodetect
> md: If you don't use raid, use raid=noautodetect
> md: Autodetecting RAID arrays.
> md: autorun ...
> md: ... autorun DONE.
> EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
> VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
> devtmpfs: mounted
> Freeing unused kernel memory: 3900K
> Kernel memory protection disabled.
> SELinux:  Disabled at runtime.
> SELinux:  Unregistering netfilter hooks
> audit: type=1404 audit(1532588961.277:2): enforcing=0 old_enforcing=0
> auid=4294967295 ses=4294967295 enabled=0 old-enabled=1 lsm=selinux res=1
> stty (2166) used greatest stack depth: 19664 bytes left
> EXT4-fs (sda1): re-mounted. Opts: (null)
> logsave (3615) used greatest stack depth: 17632 bytes left
> random: dd: uninitialized urandom read (512 bytes read)
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in virtnet_receive
> drivers/net/virtio_net.c:1356 [inline]

+virtio maintainers for this one
Probably something very recent.

> BUG: KASAN: slab-out-of-bounds in virtnet_poll+0x111a/0x1226
> drivers/net/virtio_net.c:1421
> Read of size 8 at addr ffff8801cee08ff0 by task ip/3969
>
> CPU: 0 PID: 3969 Comm: ip Not tainted 4.18.0-rc6+ #141
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
>  print_address_description+0x6c/0x20b mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>  virtnet_receive drivers/net/virtio_net.c:1356 [inline]
>  virtnet_poll+0x111a/0x1226 drivers/net/virtio_net.c:1421
>  napi_poll net/core/dev.c:6214 [inline]
>  net_rx_action+0x7a5/0x1920 net/core/dev.c:6280
>  __do_softirq+0x2e8/0xb17 kernel/softirq.c:292
>  do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1046
>  </IRQ>
>  do_softirq.part.18+0x155/0x1a0 kernel/softirq.c:336
>  do_softirq arch/x86/include/asm/preempt.h:23 [inline]
>  __local_bh_enable_ip+0x1ec/0x230 kernel/softirq.c:189
>  local_bh_enable include/linux/bottom_half.h:32 [inline]
>  virtnet_napi_enable+0x8c/0xb0 drivers/net/virtio_net.c:1264
>  virtnet_open+0x16d/0x4d0 drivers/net/virtio_net.c:1464
>  __dev_open+0x26d/0x410 net/core/dev.c:1392
>  __dev_change_flags+0x739/0x9c0 net/core/dev.c:7434
>  dev_change_flags+0x89/0x150 net/core/dev.c:7503
>  do_setlink+0xb16/0x3dd0 net/core/rtnetlink.c:2416
>  rtnl_newlink+0x138d/0x1d60 net/core/rtnetlink.c:3029
>  rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4705
>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2447
>  rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4723
>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>  netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1336
>  netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1901
>  sock_sendmsg_nosec net/socket.c:641 [inline]
>  sock_sendmsg+0xd5/0x120 net/socket.c:651
>  ___sys_sendmsg+0x7fd/0x930 net/socket.c:2125
>  __sys_sendmsg+0x11d/0x290 net/socket.c:2163
>  __do_sys_sendmsg net/socket.c:2172 [inline]
>  __se_sys_sendmsg net/socket.c:2170 [inline]
>  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2170
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x7f318d594320
> Code: 02 48 83 c8 ff eb 8d 48 8b 05 14 7b 2a 00 f7 da 64 89 10 48 83 c8 ff
> eb c9 90 83 3d d5 d2 2a 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff
> 73 31 c3 48 83 ec 08 e8 5e ba 00 00 48 89 04 24
> RSP: 002b:00007ffd985d8f38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007ffd985dd030 RCX: 00007f318d594320
> RDX: 0000000000000000 RSI: 00007ffd985d8f70 RDI: 0000000000000003
> RBP: 00007ffd985d8f70 R08: 0000000000000000 R09: 000000000000000f
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000005b5973aa
> R13: 0000000000000000 R14: 00000000006395c0 R15: 00007ffd985dd808
>
> Allocated by task 1:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
>  __do_kmalloc mm/slab.c:3718 [inline]
>  __kmalloc+0x14e/0x760 mm/slab.c:3727
>  kmalloc_array include/linux/slab.h:635 [inline]
>  kcalloc include/linux/slab.h:646 [inline]
>  virtnet_alloc_queues drivers/net/virtio_net.c:2731 [inline]
>  init_vqs+0x127/0x1520 drivers/net/virtio_net.c:2769
>  virtnet_probe+0x1092/0x2260 drivers/net/virtio_net.c:3016
>  virtio_dev_probe+0x592/0x942 drivers/virtio/virtio.c:245
>  really_probe drivers/base/dd.c:446 [inline]
>  driver_probe_device+0x6ad/0x970 drivers/base/dd.c:588
>  __driver_attach+0x28b/0x2f0 drivers/base/dd.c:822
>  bus_for_each_dev+0x15d/0x1f0 drivers/base/bus.c:311
>  driver_attach+0x3d/0x50 drivers/base/dd.c:841
>  bus_add_driver+0x4b2/0x600 drivers/base/bus.c:667
>  driver_register+0x1c8/0x320 drivers/base/driver.c:170
>  register_virtio_driver+0x79/0xd0 drivers/virtio/virtio.c:296
>  virtio_net_driver_init+0x8d/0xc9 drivers/net/virtio_net.c:3209
>  do_one_initcall+0x127/0x913 init/main.c:884
>  do_initcall_level init/main.c:952 [inline]
>  do_initcalls init/main.c:960 [inline]
>  do_basic_setup init/main.c:978 [inline]
>  kernel_init_freeable+0x49b/0x58e init/main.c:1135
>  kernel_init+0x11/0x1b3 init/main.c:1061
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
>
> Freed by task 0:
> (stack is not available)
>
> The buggy address belongs to the object at ffff8801cee08500
>  which belongs to the cache kmalloc-4096 of size 4096
> The buggy address is located 2800 bytes inside of
>  4096-byte region [ffff8801cee08500, ffff8801cee09500)
> The buggy address belongs to the page:
> page:ffffea00073b8200 count:1 mapcount:0 mapping:ffff8801dac00dc0 index:0x0
> compound_mapcount: 0
> flags: 0x2fffc0000008100(slab|head)
> raw: 02fffc0000008100 ffffea00073b7d88 ffffea00073b8288 ffff8801dac00dc0
> raw: 0000000000000000 ffff8801cee08500 0000000100000001 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff8801cee08e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff8801cee08f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>
>> ffff8801cee08f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>
>                                                              ^
>  ffff8801cee09000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff8801cee09080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/000000000000352dc20571e3a0d8%40google.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply

* IEEE Record # 44854: iCATccT 2018, Alva's Institute Of Engineering & Technology (AIET)-CFP
From: Dr. S K Niranjan Aradhya @ 2018-07-26  4:41 UTC (permalink / raw)
  To: virtualization


[-- Attachment #1.1: Type: text/plain, Size: 1653 bytes --]

<< Apologies for cross-postings >>
<<< Please circulate among your friends, peers and researchers >>>

IEEE Conference Record No.: # 44854;

4th International Conference on Applied and Theoretical Computing and
Communication Technology (iCATccT - 2018)
 Alva's Institute Of Engineering & Technology (AIET)

Conference Date : 6-8 Sept 2018
Submission Deadline: 10 August 2018

Submission Link: http://itekcmsonline.com/icatcct18/index.php/icatcct18/
icatcct18/login

Review is underway for submitted papers.

IEEE ISBN : 978-1-5386-7706-3
IEEE Part No. : CFP18D66-ART
Selected, accepted and extended paper will be published in Scopus Indexed
International Journal of Forensic Software Engineering published by
InderScience
All accepted and presented papers will be submitted to the IEEE for
possible publication in IEEE Xplore Digital Library. Previous edition
indexed in: SCOPUS, ISI Web of Science, Engineering Index, Google, etc.

If you like to join the TPC or propose a special session or symposiums
please write to: secretariat@icatcct.org

General Chair(s)
iCATccT 2018 Conference

----------------------
Disclaimer: We have clearly mentioned the subject lines and your email
address won't be misleading in any form. We have found your mail address
through our own efforts on the web search and not through any illegal way.
If you wish to remove your information from our mailing list or no longer
receive future announcements, please email with REMOVE in subject. Your
request to opt-out will be effective within a reasonable amount of time.
 icatcct-cfp.pdf
<https://drive.google.com/file/d/1OWXPZVS1IRZlNoWTjfVyxl-yIL2CsByg/view?usp=drive_web>

[-- Attachment #1.2: Type: text/html, Size: 3241 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v6 0/4] net: vhost: improve performance when enable busyloop
From: David Miller @ 2018-07-25 20:01 UTC (permalink / raw)
  To: xiangxia.m.yue; +Cc: netdev, virtualization, mst
In-Reply-To: <1532196242-2998-1-git-send-email-xiangxia.m.yue@gmail.com>

From: xiangxia.m.yue@gmail.com
Date: Sat, 21 Jul 2018 11:03:58 -0700

> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> This patches improve the guest receive performance.
> On the handle_tx side, we poll the sock receive queue
> at the same time. handle_rx do that in the same way.
> 
> For more performance report, see patch 4.
> 
> v5->v6:
> rebase the codes.

It looks like there is still some dangling discussions about this
patch set.

Please repost this series when those discussions have completed.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next 0/6] virtio_net: Add ethtool stat items
From: David Miller @ 2018-07-25 19:59 UTC (permalink / raw)
  To: mst; +Cc: netdev, toshiaki.makita1, virtualization
In-Reply-To: <20180725123908-mutt-send-email-mst@kernel.org>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Wed, 25 Jul 2018 12:40:12 +0300

> On Mon, Jul 23, 2018 at 11:36:03PM +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>> 
>> Add some ethtool stat items useful for performance analysis.
>> 
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> 
> Series:
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>

Series applied.

> Patch 1 seems appropriate for stable, even though it's minor.

Ok, I'll put patch #1 also into 'net' and queue it up for -stable.

Thanks.

^ permalink raw reply

* [PATCH 2/2] tools/virtio: add kmalloc_array stub
From: Michael S. Tsirkin @ 2018-07-25 13:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: virtualization, khandual
In-Reply-To: <20180725134057.113423-1-mst@redhat.com>

Fixes: 6da2ec56059 ("treewide: kmalloc() -> kmalloc_array()")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 tools/virtio/linux/kernel.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
index fca8381bbe04..fb22bccfbc8a 100644
--- a/tools/virtio/linux/kernel.h
+++ b/tools/virtio/linux/kernel.h
@@ -52,6 +52,11 @@ static inline void *kmalloc(size_t s, gfp_t gfp)
 		return __kmalloc_fake;
 	return malloc(s);
 }
+static inline void *kmalloc_array(unsigned n, size_t s, gfp_t gfp)
+{
+	return kmalloc(n * s, gfp);
+}
+
 static inline void *kzalloc(size_t s, gfp_t gfp)
 {
 	void *p = kmalloc(s, gfp);
-- 
MST

^ permalink raw reply related

* [PATCH 1/2] tools/virtio: add dma barrier stubs
From: Michael S. Tsirkin @ 2018-07-25 13:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: virtualization, khandual

Fixes: 55e49dc43a8 ("virtio_ring: switch to dma_XX barriers for rpmsg")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 tools/virtio/asm/barrier.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/virtio/asm/barrier.h b/tools/virtio/asm/barrier.h
index 0ac3caf90877..d0351f83aebe 100644
--- a/tools/virtio/asm/barrier.h
+++ b/tools/virtio/asm/barrier.h
@@ -13,8 +13,8 @@
 } while (0);
 /* Weak barriers should be used. If not - it's a bug */
 # define mb() abort()
-# define rmb() abort()
-# define wmb() abort()
+# define dma_rmb() abort()
+# define dma_wmb() abort()
 #else
 #error Please fill in barrier macros
 #endif
-- 
MST

^ permalink raw reply related

* Re: [RFC 4/4] virtio: Add platform specific DMA API translation for virito devices
From: Michael S. Tsirkin @ 2018-07-25 13:31 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization, hch,
	paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <3dd36d8e-3bc8-91ba-cf6d-939495439d2c@linux.vnet.ibm.com>

On Mon, Jul 23, 2018 at 07:46:09AM +0530, Anshuman Khandual wrote:
> There is a redundant definition of virtio_has_iommu_quirk in the tools
> directory (tools/virtio/linux/virtio_config.h) which does not seem to
> be used any where. I guess that can be removed without problem.

It's there just to make tools/virtio build.
Try
	make -C tools/virtio/
In fact I see there's a missing definition for
dma_rmb/dma_wmb there, I'll post a patch.

-- 
MST

^ permalink raw reply

* Re: [PATCH net-next v6 1/4] net: vhost: lock the vqs one by one
From: Tonghao Zhang @ 2018-07-25 12:05 UTC (permalink / raw)
  To: mst; +Cc: Linux Kernel Network Developers, virtualization
In-Reply-To: <20180722182448-mutt-send-email-mst@kernel.org>

On Sun, Jul 22, 2018 at 11:26 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sat, Jul 21, 2018 at 11:03:59AM -0700, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > This patch changes the way that lock all vqs
> > at the same, to lock them one by one. It will
> > be used for next patch to avoid the deadlock.
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > Acked-by: Jason Wang <jasowang@redhat.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/vhost/vhost.c | 24 +++++++-----------------
> >  1 file changed, 7 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index a502f1a..a1c06e7 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
> >  {
> >       int i;
> >
> > -     for (i = 0; i < d->nvqs; ++i)
> > +     for (i = 0; i < d->nvqs; ++i) {
> > +             mutex_lock(&d->vqs[i]->mutex);
> >               __vhost_vq_meta_reset(d->vqs[i]);
> > +             mutex_unlock(&d->vqs[i]->mutex);
> > +     }
> >  }
> >
> >  static void vhost_vq_reset(struct vhost_dev *dev,
> > @@ -890,20 +893,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
> >  #define vhost_get_used(vq, x, ptr) \
> >       vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
> >
> > -static void vhost_dev_lock_vqs(struct vhost_dev *d)
> > -{
> > -     int i = 0;
> > -     for (i = 0; i < d->nvqs; ++i)
> > -             mutex_lock_nested(&d->vqs[i]->mutex, i);
> > -}
> > -
> > -static void vhost_dev_unlock_vqs(struct vhost_dev *d)
> > -{
> > -     int i = 0;
> > -     for (i = 0; i < d->nvqs; ++i)
> > -             mutex_unlock(&d->vqs[i]->mutex);
> > -}
> > -
> >  static int vhost_new_umem_range(struct vhost_umem *umem,
> >                               u64 start, u64 size, u64 end,
> >                               u64 userspace_addr, int perm)
> > @@ -953,7 +942,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
> >               if (msg->iova <= vq_msg->iova &&
> >                   msg->iova + msg->size - 1 > vq_msg->iova &&
> >                   vq_msg->type == VHOST_IOTLB_MISS) {
> > +                     mutex_lock(&node->vq->mutex);
> >                       vhost_poll_queue(&node->vq->poll);
> > +                     mutex_unlock(&node->vq->mutex);
> > +
> >                       list_del(&node->node);
> >                       kfree(node);
> >               }
> > @@ -985,7 +977,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
> >       int ret = 0;
> >
> >       mutex_lock(&dev->mutex);
> > -     vhost_dev_lock_vqs(dev);
> >       switch (msg->type) {
> >       case VHOST_IOTLB_UPDATE:
> >               if (!dev->iotlb) {
> > @@ -1019,7 +1010,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
> >               break;
> >       }
> >
> > -     vhost_dev_unlock_vqs(dev);
> >       mutex_unlock(&dev->mutex);
> >
> >       return ret;
>
> I do prefer the finer-grained locking but I remember we
> discussed something like this in the past and Jason saw issues
> with such a locking.
This change is suggested by Jason. Should I send new version because
the patch 3 is changed.

> Jason?
>
> > --
> > 1.8.3.1

^ permalink raw reply

* Re: [PATCH net-next 0/6] virtio_net: Add ethtool stat items
From: Michael S. Tsirkin @ 2018-07-25  9:40 UTC (permalink / raw)
  To: Toshiaki Makita; +Cc: netdev, virtualization, David S. Miller
In-Reply-To: <20180723143609.2242-1-toshiaki.makita1@gmail.com>

On Mon, Jul 23, 2018 at 11:36:03PM +0900, Toshiaki Makita wrote:
> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> 
> Add some ethtool stat items useful for performance analysis.
> 
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>

Series:

Acked-by: Michael S. Tsirkin <mst@redhat.com>

Patch 1 seems appropriate for stable, even though it's minor.

> Toshiaki Makita (6):
>   virtio_net: Fix incosistent received bytes counter
>   virtio_net: Use temporary storage for accounting rx stats
>   virtio_net: Make drop counter per-queue
>   virtio_net: Factor out the logic to determine xdp sq
>   virtio_net: Add XDP related stats
>   virtio_net: Add kick stats
> 
>  drivers/net/virtio_net.c | 221 +++++++++++++++++++++++++++++++++--------------
>  1 file changed, 158 insertions(+), 63 deletions(-)
> 
> -- 
> 2.14.3

^ permalink raw reply

* Re: [PATCH RFC V4 3/3] KVM: X86: Adding skeleton for Memory ROE
From: David Hildenbrand @ 2018-07-25  9:36 UTC (permalink / raw)
  To: Ahmed Abd El Mawgood, kvm, Kernel Hardening, virtualization,
	linux-doc, x86, xen-devel
  Cc: Ard Biesheuvel, Kees Cook, nathan Corbet, rkrcmar, David Vrabel,
	Boris Lukashev, Ingo Molnar, nigel.edwards, hpa, Paolo Bonzini,
	Thomas Gleixner, Rik van Riel
In-Reply-To: <20180720233130.14129-4-ahmedsoliman0x666@gmail.com>


>  		if (kvm_x86_ops->slot_disable_log_dirty)
>  			kvm_x86_ops->slot_disable_log_dirty(kvm, new);
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 4ee7bc548a83..82c5780e11d9 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -297,6 +297,9 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
>  struct kvm_memory_slot {
>  	gfn_t base_gfn;
>  	unsigned long npages;
> +#ifdef CONFIG_KVM_MROE
> +	unsigned long *mroe_bitmap;
> +#endif

Yet another problematic bitmap when it comes to splitting/resizing
memory slots atomically :(


-- 

Thanks,

David / dhildenb

^ permalink raw reply

* Re: [RFC 4/4] virtio: Add platform specific DMA API translation for virito devices
From: Anshuman Khandual @ 2018-07-25  4:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: robh, srikar, linuxram, linux-kernel, virtualization, hch, paulus,
	joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <3dd36d8e-3bc8-91ba-cf6d-939495439d2c@linux.vnet.ibm.com>

On 07/23/2018 07:46 AM, Anshuman Khandual wrote:
> On 07/20/2018 06:45 PM, Michael S. Tsirkin wrote:
>> On Fri, Jul 20, 2018 at 09:29:41AM +0530, Anshuman Khandual wrote:
>>> Subject: Re: [RFC 4/4] virtio: Add platform specific DMA API translation for
>>> virito devices
>>
>> s/virito/virtio/
> 
> Oops, will fix it. Thanks for pointing out.
> 
>>
>>> This adds a hook which a platform can define in order to allow it to
>>> override virtio device's DMA OPS irrespective of whether it has the
>>> flag VIRTIO_F_IOMMU_PLATFORM set or not. We want to use this to do
>>> bounce-buffering of data on the new secure pSeries platform, currently
>>> under development, where a KVM host cannot access all of the memory
>>> space of a secure KVM guest.  The host can only access the pages which
>>> the guest has explicitly requested to be shared with the host, thus
>>> the virtio implementation in the guest has to copy data to and from
>>> shared pages.
>>>
>>> With this hook, the platform code in the secure guest can force the
>>> use of swiotlb for virtio buffers, with a back-end for swiotlb which
>>> will use a pool of pre-allocated shared pages.  Thus all data being
>>> sent or received by virtio devices will be copied through pages which
>>> the host has access to.
>>>
>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>> ---
>>>  arch/powerpc/include/asm/dma-mapping.h | 6 ++++++
>>>  arch/powerpc/platforms/pseries/iommu.c | 6 ++++++
>>>  drivers/virtio/virtio.c                | 7 +++++++
>>>  3 files changed, 19 insertions(+)
>>>
>>> diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
>>> index 8fa3945..bc5a9d3 100644
>>> --- a/arch/powerpc/include/asm/dma-mapping.h
>>> +++ b/arch/powerpc/include/asm/dma-mapping.h
>>> @@ -116,3 +116,9 @@ extern u64 __dma_get_required_mask(struct device *dev);
>>>  
>>>  #endif /* __KERNEL__ */
>>>  #endif	/* _ASM_DMA_MAPPING_H */
>>> +
>>> +#define platform_override_dma_ops platform_override_dma_ops
>>> +
>>> +struct virtio_device;
>>> +
>>> +extern void platform_override_dma_ops(struct virtio_device *vdev);
>>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>>> index 06f0296..5773bc7 100644
>>> --- a/arch/powerpc/platforms/pseries/iommu.c
>>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>>> @@ -38,6 +38,7 @@
>>>  #include <linux/of.h>
>>>  #include <linux/iommu.h>
>>>  #include <linux/rculist.h>
>>> +#include <linux/virtio.h>
>>>  #include <asm/io.h>
>>>  #include <asm/prom.h>
>>>  #include <asm/rtas.h>
>>> @@ -1396,3 +1397,8 @@ static int __init disable_multitce(char *str)
>>>  __setup("multitce=", disable_multitce);
>>>  
>>>  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
>>> +
>>> +void platform_override_dma_ops(struct virtio_device *vdev)
>>> +{
>>> +	/* Override vdev->parent.dma_ops if required */
>>> +}
>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>>> index 6b13987..432c332 100644
>>> --- a/drivers/virtio/virtio.c
>>> +++ b/drivers/virtio/virtio.c
>>> @@ -168,6 +168,12 @@ EXPORT_SYMBOL_GPL(virtio_add_status);
>>>  
>>>  const struct dma_map_ops virtio_direct_dma_ops;
>>>  
>>> +#ifndef platform_override_dma_ops
>>> +static inline void platform_override_dma_ops(struct virtio_device *vdev)
>>> +{
>>> +}
>>> +#endif
>>> +
>>>  int virtio_finalize_features(struct virtio_device *dev)
>>>  {
>>>  	int ret = dev->config->finalize_features(dev);
>>> @@ -179,6 +185,7 @@ int virtio_finalize_features(struct virtio_device *dev)
>>>  	if (virtio_has_iommu_quirk(dev))
>>>  		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
>>>  
>>> +	platform_override_dma_ops(dev);
>>
>> Is there a single place where virtio_has_iommu_quirk is called now?
> 
> Not other than this one. But in the proposed implementation of
> platform_override_dma_ops on powerpc, we will again check on
> virtio_has_iommu_quirk before overriding it with SWIOTLB.
> 
> void platform_override_dma_ops(struct virtio_device *vdev)
> {
>         if (is_ultravisor_platform() && virtio_has_iommu_quirk(vdev))
>                 set_dma_ops(vdev->dev.parent, &swiotlb_dma_ops);
> }
> 
>> If so, we could put this into virtio_has_iommu_quirk then.
> 
> Did you mean platform_override_dma_ops instead ? If so, yes that
> is possible. Default implementation of platform_override_dma_ops
> should just check on VIRTIO_F_IOMMU_PLATFORM feature and override
> with virtio_direct_dma_ops but arch implementation can check on
> what ever else they would like and override appropriately.
> 
> Default platform_override_dma_ops will be like this
> 
> #ifndef platform_override_dma_ops
> static inline void platform_override_dma_ops(struct virtio_device *vdev)
> {
> 	if(!virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
> 		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
> }
> #endif
> 
> Proposed powerpc implementation will be like this instead
> 
> void platform_override_dma_ops(struct virtio_device *vdev)
> {
> 	if (virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
> 		return;
> 
>         if (is_ultravisor_platform())
>                 set_dma_ops(vdev->dev.parent, &swiotlb_dma_ops);
> 	else
> 		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
> 	
> }
> 
> There is a redundant definition of virtio_has_iommu_quirk in the tools
> directory (tools/virtio/linux/virtio_config.h) which does not seem to
> be used any where. I guess that can be removed without problem.

Does this sound okay ? It will merge patch 3 and 4 into a single one.
On the other hand it also passes the responsibility of dealing with
VIRTIO_F_IOMMU_PLATFORM flag to the architecture callback which might
be bit problematic. Keeping VIRTIO_F_IOMMU_PLATFORM handling in virtio
core at least makes sure that the device has a working DMA ops to fall
back on if the arch hook fails to take care of it somehow.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Anshuman Khandual @ 2018-07-25  3:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: robh, srikar, linuxram, linux-kernel, virtualization, hch, paulus,
	joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180723120511-mutt-send-email-mst@kernel.org>

On 07/23/2018 02:38 PM, Michael S. Tsirkin wrote:
> On Mon, Jul 23, 2018 at 11:58:23AM +0530, Anshuman Khandual wrote:
>> On 07/20/2018 06:46 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote:
>>>> This patch series is the follow up on the discussions we had before about
>>>> the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation
>>>> for virito devices (https://patchwork.kernel.org/patch/10417371/). There
>>>> were suggestions about doing away with two different paths of transactions
>>>> with the host/QEMU, first being the direct GPA and the other being the DMA
>>>> API based translations.
>>>>
>>>> First patch attempts to create a direct GPA mapping based DMA operations
>>>> structure called 'virtio_direct_dma_ops' with exact same implementation
>>>> of the direct GPA path which virtio core currently has but just wrapped in
>>>> a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of
>>>> the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the
>>>> existing semantics. The second patch does exactly that inside the function
>>>> virtio_finalize_features(). The third patch removes the default direct GPA
>>>> path from virtio core forcing it to use DMA API callbacks for all devices.
>>>> Now with that change, every device must have a DMA operations structure
>>>> associated with it. The fourth patch adds an additional hook which gives
>>>> the platform an opportunity to do yet another override if required. This
>>>> platform hook can be used on POWER Ultravisor based protected guests to
>>>> load up SWIOTLB DMA callbacks to do the required (as discussed previously
>>>> in the above mentioned thread how host is allowed to access only parts of
>>>> the guest GPA range) bounce buffering into the shared memory for all I/O
>>>> scatter gather buffers to be consumed on the host side.
>>>>
>>>> Please go through these patches and review whether this approach broadly
>>>> makes sense. I will appreciate suggestions, inputs, comments regarding
>>>> the patches or the approach in general. Thank you.
>>> I like how patches 1-3 look. Could you test performance
>>> with/without to see whether the extra indirection through
>>> use of DMA ops causes a measurable slow-down?
>>
>> I ran this simple DD command 10 times where /dev/vda is a virtio block
>> device of 10GB size.
>>
>> dd if=/dev/zero of=/dev/vda bs=8M count=1024 oflag=direct
>>
>> With and without patches bandwidth which has a bit wide range does not
>> look that different from each other.
>>
>> Without patches
>> ===============
>>
>> ---------- 1 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.95557 s, 4.4 GB/s
>> ---------- 2 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.05176 s, 4.2 GB/s
>> ---------- 3 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.88314 s, 4.6 GB/s
>> ---------- 4 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.84899 s, 4.6 GB/s
>> ---------- 5 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.37184 s, 1.6 GB/s
>> ---------- 6 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.9205 s, 4.5 GB/s
>> ---------- 7 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.85166 s, 1.3 GB/s
>> ---------- 8 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.74049 s, 4.9 GB/s
>> ---------- 9 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.31699 s, 1.4 GB/s
>> ---------- 10 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.47057 s, 3.5 GB/s
>>
>>
>> With patches
>> ============
>>
>> ---------- 1 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.25993 s, 3.8 GB/s
>> ---------- 2 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.82438 s, 4.7 GB/s
>> ---------- 3 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.93856 s, 4.4 GB/s
>> ---------- 4 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.83405 s, 4.7 GB/s
>> ---------- 5 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 7.50199 s, 1.1 GB/s
>> ---------- 6 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.28742 s, 3.8 GB/s
>> ---------- 7 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.74958 s, 1.5 GB/s
>> ---------- 8 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.99149 s, 4.3 GB/s
>> ---------- 9 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.67647 s, 1.5 GB/s
>> ---------- 10 ---------
>> 1024+0 records in
>> 1024+0 records out
>> 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.93957 s, 2.9 GB/s
>>
>> Does this look okay ?
> 
> You want to test IOPS with lots of small writes and using
> raw ramdisk on host.

Hello Michael,

I have conducted the following experiments and here are the results.

TEST SETUP
==========

A virtio block disk is mounted on the guest as follows.


    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' ioeventfd='off'/>
      <source file='/mnt/disk2.img'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>

In the host back end its an QEMU raw image on tmpfs file system.

disk:

-rw-r--r-- 1 libvirt-qemu kvm  5.0G Jul 24 06:26 disk2.img

mount:

size=21G on /mnt type tmpfs (rw,relatime,size=22020096k)

TEST CONFIG
===========

FIO (https://linux.die.net/man/1/fio) is being run with and without
the patches.

Read test config:

[Sequential]
direct=1
ioengine=libaio
runtime=5m
time_based
filename=/dev/vda
bs=4k
numjobs=16
rw=read
unlink=1
iodepth=256


Write test config:

[Sequential]
direct=1
ioengine=libaio
runtime=5m
time_based
filename=/dev/vda
bs=4k
numjobs=16
rw=write
unlink=1
iodepth=256

The virtio block device comes up as /dev/vda on the guest with

/sys/block/vda/queue/nr_requests=128

TEST RESULTS
============

Without the patches
-------------------

Read test:

Run status group 0 (all jobs):
   READ: bw=550MiB/s (577MB/s), 33.2MiB/s-35.6MiB/s (34.9MB/s-37.4MB/s), io=161GiB (173GB), run=300001-300009msec

Disk stats (read/write):
  vda: ios=42249926/0, merge=0/0, ticks=1499920/0, in_queue=35672384, util=100.00%


Write test:

Run status group 0 (all jobs):
  WRITE: bw=514MiB/s (539MB/s), 31.5MiB/s-34.6MiB/s (33.0MB/s-36.2MB/s), io=151GiB (162GB), run=300001-300009msec

Disk stats (read/write):
  vda: ios=29/39459261, merge=0/0, ticks=0/1570580, in_queue=35745992, util=100.00%

With the patches
----------------

Read test:

Run status group 0 (all jobs):
   READ: bw=572MiB/s (600MB/s), 35.0MiB/s-37.2MiB/s (36.7MB/s-38.0MB/s), io=168GiB (180GB), run=300001-300006msec

Disk stats (read/write):
  vda: ios=43917611/0, merge=0/0, ticks=1934268/0, in_queue=35531688, util=100.00%
  
Write test:

Run status group 0 (all jobs):
  WRITE: bw=546MiB/s (572MB/s), 33.7MiB/s-35.0MiB/s (35.3MB/s-36.7MB/s), io=160GiB (172GB), run=300001-300007msec

Disk stats (read/write):
  vda: ios=14/41893878, merge=0/0, ticks=8/2107816, in_queue=35535716, util=100.00%

Results with and without the patches are similar.

^ permalink raw reply

* Re: [PATCH net-next 0/6] virtio_net: Add ethtool stat items
From: David Miller @ 2018-07-25  2:06 UTC (permalink / raw)
  To: toshiaki.makita1; +Cc: netdev, virtualization, mst
In-Reply-To: <20180723143609.2242-1-toshiaki.makita1@gmail.com>

From: Toshiaki Makita <toshiaki.makita1@gmail.com>
Date: Mon, 23 Jul 2018 23:36:03 +0900

> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> 
> Add some ethtool stat items useful for performance analysis.
> 
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>

Michael and Jason, any objections to these new stats?

^ permalink raw reply

* Re: [PATCH v36 0/5] Virtio-balloon: support free page reporting
From: Wei Wang @ 2018-07-24  8:12 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Michael S. Tsirkin
  Cc: yang.zhang.wz, virtio-dev, riel, quan.xu0, kvm, nilal,
	liliang.opensource, linux-kernel, mhocko, linux-mm, pbonzini,
	akpm, virtualization, torvalds
In-Reply-To: <20180723143604.GB2457@work-vm>

On 07/23/2018 10:36 PM, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
>> On Fri, Jul 20, 2018 at 04:33:00PM +0800, Wei Wang wrote:
>>> This patch series is separated from the previous "Virtio-balloon
>>> Enhancement" series. The new feature, VIRTIO_BALLOON_F_FREE_PAGE_HINT,
>>> implemented by this series enables the virtio-balloon driver to report
>>> hints of guest free pages to the host. It can be used to accelerate live
>>> migration of VMs. Here is an introduction of this usage:
>>>
>>> Live migration needs to transfer the VM's memory from the source machine
>>> to the destination round by round. For the 1st round, all the VM's memory
>>> is transferred. From the 2nd round, only the pieces of memory that were
>>> written by the guest (after the 1st round) are transferred. One method
>>> that is popularly used by the hypervisor to track which part of memory is
>>> written is to write-protect all the guest memory.
>>>
>>> This feature enables the optimization by skipping the transfer of guest
>>> free pages during VM live migration. It is not concerned that the memory
>>> pages are used after they are given to the hypervisor as a hint of the
>>> free pages, because they will be tracked by the hypervisor and transferred
>>> in the subsequent round if they are used and written.
>>>
>>> * Tests
>>> - Test Environment
>>>      Host: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
>>>      Guest: 8G RAM, 4 vCPU
>>>      Migration setup: migrate_set_speed 100G, migrate_set_downtime 2 second
>>>
>>> - Test Results
>>>      - Idle Guest Live Migration Time (results are averaged over 10 runs):
>>>          - Optimization v.s. Legacy = 409ms vs 1757ms --> ~77% reduction
>>> 	(setting page poisoning zero and enabling ksm don't affect the
>>>           comparison result)
>>>      - Guest with Linux Compilation Workload (make bzImage -j4):
>>>          - Live Migration Time (average)
>>>            Optimization v.s. Legacy = 1407ms v.s. 2528ms --> ~44% reduction
>>>          - Linux Compilation Time
>>>            Optimization v.s. Legacy = 5min4s v.s. 5min12s
>>>            --> no obvious difference
>> I'd like to see dgilbert's take on whether this kind of gain
>> justifies adding a PV interfaces, and what kind of guest workload
>> is appropriate.
>>
>> Cc'd.
> Well, 44% is great ... although the measurement is a bit weird.
>
> a) A 2 second downtime is very large; 300-500ms is more normal

No problem, I will set downtime to 400ms for the tests.

> b) I'm not sure what the 'average' is  - is that just between a bunch of
> repeated migrations?

Yes, just repeatedly ("source<---->destination" migration) do the tests 
and get an averaged result.


> c) What load was running in the guest during the live migration?

The first one above just uses a guest without running any specific 
workload (named idle guests).
The second one uses a guest with the Linux compilation workload running.

>
> An interesting measurement to add would be to do the same test but
> with a VM with a lot more RAM but the same load;  you'd hope the gain
> would be even better.
> It would be interesting, especially because the users who are interested
> are people creating VMs allocated with lots of extra memory (for the
> worst case) but most of the time migrating when it's fairly idle.

OK. I will add tests of a guest with larger memory.

Best,
Wei

^ permalink raw reply

* Re: [PATCH net-next v6 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll()
From: Toshiaki Makita @ 2018-07-24  3:41 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Linux Kernel Network Developers, toshiaki.makita1, virtualization,
	mst
In-Reply-To: <CAMDZJNVVJs35kuvktTxn+mmDz7db+1K-kfuOoMUn9Z=WoayUVw@mail.gmail.com>

On 2018/07/24 12:28, Tonghao Zhang wrote:
> On Tue, Jul 24, 2018 at 10:53 AM Toshiaki Makita
> <makita.toshiaki@lab.ntt.co.jp> wrote:
>>
>> On 2018/07/24 2:31, Tonghao Zhang wrote:
>>> On Mon, Jul 23, 2018 at 10:20 PM Toshiaki Makita
>>> <toshiaki.makita1@gmail.com> wrote:
>>>>
>>>> On 18/07/23 (月) 21:43, Tonghao Zhang wrote:
>>>>> On Mon, Jul 23, 2018 at 5:58 PM Toshiaki Makita
>>>>> <makita.toshiaki@lab.ntt.co.jp> wrote:
>>>>>>
>>>>>> On 2018/07/22 3:04, xiangxia.m.yue@gmail.com wrote:
>>>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>>>
>>>>>>> Factor out generic busy polling logic and will be
>>>>>>> used for in tx path in the next patch. And with the patch,
>>>>>>> qemu can set differently the busyloop_timeout for rx queue.
>>>>>>>
>>>>>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>>> ---
>>>>>> ...
>>>>>>> +static void vhost_net_busy_poll_vq_check(struct vhost_net *net,
>>>>>>> +                                      struct vhost_virtqueue *rvq,
>>>>>>> +                                      struct vhost_virtqueue *tvq,
>>>>>>> +                                      bool rx)
>>>>>>> +{
>>>>>>> +     struct socket *sock = rvq->private_data;
>>>>>>> +
>>>>>>> +     if (rx) {
>>>>>>> +             if (!vhost_vq_avail_empty(&net->dev, tvq)) {
>>>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>>>> +             } else if (unlikely(vhost_enable_notify(&net->dev, tvq))) {
>>>>>>> +                     vhost_disable_notify(&net->dev, tvq);
>>>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>>>> +             }
>>>>>>> +     } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>>>> +                 !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>>> +             vhost_poll_queue(&rvq->poll);
>>>>>>
>>>>>> Now we wait for vq_avail for rx as well, I think you cannot skip
>>>>>> vhost_enable_notify() on tx. Probably you might want to do:
>>>>> I think vhost_enable_notify is needed.
>>>>>
>>>>>> } else if (sock && sk_has_rx_data(sock->sk)) {
>>>>>>          if (!vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>>                  vhost_poll_queue(&rvq->poll);
>>>>>>          } else if (unlikely(vhost_enable_notify(&net->dev, rvq))) {
>>>>>>                  vhost_disable_notify(&net->dev, rvq);
>>>>>>                  vhost_poll_queue(&rvq->poll);
>>>>>>          }
>>>>>> }
>>>>> As Jason review as before, we only want rx kick when packet is pending at
>>>>> socket but we're out of available buffers. So we just enable notify,
>>>>> but not poll it ?
>>>>>
>>>>>          } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>>                      !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>                  vhost_poll_queue(&rvq->poll);
>>>>>          else {
>>>>>                  vhost_enable_notify(&net->dev, rvq);
>>>>>          }
>>>>
>>>> When vhost_enable_notify() returns true the avail becomes non-empty
>>>> while we are enabling notify. We may delay the rx process if we don't
>>>> check the return value of vhost_enable_notify().
>>> I got it thanks.
>>>>>> Also it's better to care vhost_net_disable_vq()/vhost_net_enable_vq() on tx?
>>>>> I cant find why it is better, if necessary, we can do it.
>>>>
>>>> The reason is pretty simple... we are busypolling the socket so we don't
>>>> need rx wakeups during it?
>>> OK, but one question, how about rx? do we use the
>>> vhost_net_disable_vq/vhost_net_ensable_vq on rx ?
>>
>> If we are busypolling the sock tx buf? I'm not sure if polling it
>> improves the performance.
> Not the sock tx buff, when we are busypolling in handle_rx, we will
> check the tx vring via  vhost_vq_avail_empty.
> So, should we the disable tvq, e.g. vhost_net_disable_vq(net, tvq)?> --

When you want to stop vq kicks from the guest you should call
vhost_disable_notify() and when you want to stop vq wakeups from the
socket you should call vhost_net_disable_vq().

You are polling vq_avail so you want to stop vq kicks thus
vhost_disable_notify() is needed and it is already called.

-- 
Toshiaki Makita

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v6 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll()
From: Tonghao Zhang @ 2018-07-24  3:28 UTC (permalink / raw)
  To: makita.toshiaki
  Cc: Linux Kernel Network Developers, toshiaki.makita1, virtualization,
	mst
In-Reply-To: <14d01d2d-0eb8-172b-1c53-7dadc5fffbac@lab.ntt.co.jp>

On Tue, Jul 24, 2018 at 10:53 AM Toshiaki Makita
<makita.toshiaki@lab.ntt.co.jp> wrote:
>
> On 2018/07/24 2:31, Tonghao Zhang wrote:
> > On Mon, Jul 23, 2018 at 10:20 PM Toshiaki Makita
> > <toshiaki.makita1@gmail.com> wrote:
> >>
> >> On 18/07/23 (月) 21:43, Tonghao Zhang wrote:
> >>> On Mon, Jul 23, 2018 at 5:58 PM Toshiaki Makita
> >>> <makita.toshiaki@lab.ntt.co.jp> wrote:
> >>>>
> >>>> On 2018/07/22 3:04, xiangxia.m.yue@gmail.com wrote:
> >>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>>>>
> >>>>> Factor out generic busy polling logic and will be
> >>>>> used for in tx path in the next patch. And with the patch,
> >>>>> qemu can set differently the busyloop_timeout for rx queue.
> >>>>>
> >>>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>>>> ---
> >>>> ...
> >>>>> +static void vhost_net_busy_poll_vq_check(struct vhost_net *net,
> >>>>> +                                      struct vhost_virtqueue *rvq,
> >>>>> +                                      struct vhost_virtqueue *tvq,
> >>>>> +                                      bool rx)
> >>>>> +{
> >>>>> +     struct socket *sock = rvq->private_data;
> >>>>> +
> >>>>> +     if (rx) {
> >>>>> +             if (!vhost_vq_avail_empty(&net->dev, tvq)) {
> >>>>> +                     vhost_poll_queue(&tvq->poll);
> >>>>> +             } else if (unlikely(vhost_enable_notify(&net->dev, tvq))) {
> >>>>> +                     vhost_disable_notify(&net->dev, tvq);
> >>>>> +                     vhost_poll_queue(&tvq->poll);
> >>>>> +             }
> >>>>> +     } else if ((sock && sk_has_rx_data(sock->sk)) &&
> >>>>> +                 !vhost_vq_avail_empty(&net->dev, rvq)) {
> >>>>> +             vhost_poll_queue(&rvq->poll);
> >>>>
> >>>> Now we wait for vq_avail for rx as well, I think you cannot skip
> >>>> vhost_enable_notify() on tx. Probably you might want to do:
> >>> I think vhost_enable_notify is needed.
> >>>
> >>>> } else if (sock && sk_has_rx_data(sock->sk)) {
> >>>>          if (!vhost_vq_avail_empty(&net->dev, rvq)) {
> >>>>                  vhost_poll_queue(&rvq->poll);
> >>>>          } else if (unlikely(vhost_enable_notify(&net->dev, rvq))) {
> >>>>                  vhost_disable_notify(&net->dev, rvq);
> >>>>                  vhost_poll_queue(&rvq->poll);
> >>>>          }
> >>>> }
> >>> As Jason review as before, we only want rx kick when packet is pending at
> >>> socket but we're out of available buffers. So we just enable notify,
> >>> but not poll it ?
> >>>
> >>>          } else if ((sock && sk_has_rx_data(sock->sk)) &&
> >>>                      !vhost_vq_avail_empty(&net->dev, rvq)) {
> >>>                  vhost_poll_queue(&rvq->poll);
> >>>          else {
> >>>                  vhost_enable_notify(&net->dev, rvq);
> >>>          }
> >>
> >> When vhost_enable_notify() returns true the avail becomes non-empty
> >> while we are enabling notify. We may delay the rx process if we don't
> >> check the return value of vhost_enable_notify().
> > I got it thanks.
> >>>> Also it's better to care vhost_net_disable_vq()/vhost_net_enable_vq() on tx?
> >>> I cant find why it is better, if necessary, we can do it.
> >>
> >> The reason is pretty simple... we are busypolling the socket so we don't
> >> need rx wakeups during it?
> > OK, but one question, how about rx? do we use the
> > vhost_net_disable_vq/vhost_net_ensable_vq on rx ?
>
> If we are busypolling the sock tx buf? I'm not sure if polling it
> improves the performance.
Not the sock tx buff, when we are busypolling in handle_rx, we will
check the tx vring via  vhost_vq_avail_empty.
So, should we the disable tvq, e.g. vhost_net_disable_vq(net, tvq)?> --
> Toshiaki Makita
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v6 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll()
From: Toshiaki Makita @ 2018-07-24  2:53 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Linux Kernel Network Developers, toshiaki.makita1, virtualization,
	mst
In-Reply-To: <CAMDZJNXWs+yqAcZ-7gW6RQjen0-mzfJ-Ar-O0_wttse2A-3-HQ@mail.gmail.com>

On 2018/07/24 2:31, Tonghao Zhang wrote:
> On Mon, Jul 23, 2018 at 10:20 PM Toshiaki Makita
> <toshiaki.makita1@gmail.com> wrote:
>>
>> On 18/07/23 (月) 21:43, Tonghao Zhang wrote:
>>> On Mon, Jul 23, 2018 at 5:58 PM Toshiaki Makita
>>> <makita.toshiaki@lab.ntt.co.jp> wrote:
>>>>
>>>> On 2018/07/22 3:04, xiangxia.m.yue@gmail.com wrote:
>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>
>>>>> Factor out generic busy polling logic and will be
>>>>> used for in tx path in the next patch. And with the patch,
>>>>> qemu can set differently the busyloop_timeout for rx queue.
>>>>>
>>>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>> ---
>>>> ...
>>>>> +static void vhost_net_busy_poll_vq_check(struct vhost_net *net,
>>>>> +                                      struct vhost_virtqueue *rvq,
>>>>> +                                      struct vhost_virtqueue *tvq,
>>>>> +                                      bool rx)
>>>>> +{
>>>>> +     struct socket *sock = rvq->private_data;
>>>>> +
>>>>> +     if (rx) {
>>>>> +             if (!vhost_vq_avail_empty(&net->dev, tvq)) {
>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>> +             } else if (unlikely(vhost_enable_notify(&net->dev, tvq))) {
>>>>> +                     vhost_disable_notify(&net->dev, tvq);
>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>> +             }
>>>>> +     } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>> +                 !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>> +             vhost_poll_queue(&rvq->poll);
>>>>
>>>> Now we wait for vq_avail for rx as well, I think you cannot skip
>>>> vhost_enable_notify() on tx. Probably you might want to do:
>>> I think vhost_enable_notify is needed.
>>>
>>>> } else if (sock && sk_has_rx_data(sock->sk)) {
>>>>          if (!vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>                  vhost_poll_queue(&rvq->poll);
>>>>          } else if (unlikely(vhost_enable_notify(&net->dev, rvq))) {
>>>>                  vhost_disable_notify(&net->dev, rvq);
>>>>                  vhost_poll_queue(&rvq->poll);
>>>>          }
>>>> }
>>> As Jason review as before, we only want rx kick when packet is pending at
>>> socket but we're out of available buffers. So we just enable notify,
>>> but not poll it ?
>>>
>>>          } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>                      !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>                  vhost_poll_queue(&rvq->poll);
>>>          else {
>>>                  vhost_enable_notify(&net->dev, rvq);
>>>          }
>>
>> When vhost_enable_notify() returns true the avail becomes non-empty
>> while we are enabling notify. We may delay the rx process if we don't
>> check the return value of vhost_enable_notify().
> I got it thanks.
>>>> Also it's better to care vhost_net_disable_vq()/vhost_net_enable_vq() on tx?
>>> I cant find why it is better, if necessary, we can do it.
>>
>> The reason is pretty simple... we are busypolling the socket so we don't
>> need rx wakeups during it?
> OK, but one question, how about rx? do we use the
> vhost_net_disable_vq/vhost_net_ensable_vq on rx ?

If we are busypolling the sock tx buf? I'm not sure if polling it
improves the performance.

-- 
Toshiaki Makita

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH v36 2/5] virtio_balloon: replace oom notifier with shrinker
From: Wei Wang @ 2018-07-24  1:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: yang.zhang.wz, virtio-dev, riel, quan.xu0, kvm, nilal,
	liliang.opensource, linux-kernel, mhocko, linux-mm, pbonzini,
	akpm, virtualization, torvalds
In-Reply-To: <20180723170826-mutt-send-email-mst@kernel.org>

On 07/23/2018 10:13 PM, Michael S. Tsirkin wrote:
>>>    	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
>>>    #endif
>>> +	err = virtio_balloon_register_shrinker(vb);
>>> +	if (err)
>>> +		goto out_del_vqs;
>>> So we can get scans before device is ready. Leak will fail
>>> then. Why not register later after device is ready?
>> Probably no.
>>
>> - it would be better not to set device ready when register_shrinker failed.
> That's very rare so I won't be too worried.

Just a little confused with the point here. "very rare" means it still 
could happen (even it's a corner case), and if that happens, we got 
something wrong functionally. So it will be a bug if we change like 
that, right?

Still couldn't understand the reason of changing shrinker_register after 
device_ready (the original oom notifier was registered before setting 
device ready too)?
(I think the driver won't get shrinker_scan called if device isn't ready 
because of the reasons below)

>> - When the device isn't ready, ballooning won't happen, that is,
>> vb->num_pages will be 0, which results in shrinker_count=0 and shrinker_scan
>> won't be called.

Best,
Wei

^ permalink raw reply

* Call for Papers - ICITS'19 - Quito, Ecuador
From: Maria @ 2018-07-23 18:25 UTC (permalink / raw)
  To: virtualization


[-- Attachment #1.1: Type: text/plain, Size: 5971 bytes --]

***** Proceedings by Springer. Indexed in Scopus, ISI, etc.

------------

ICITS'19 - The 2019 International Conference on Information Technology & Systems

6 - 8 February 2019, Quito, Ecuador

http://www.icits.me/ <http://www.icits.me/>

------------------------------   ------------------------------   ------------------------------   ------------------------


ICITS'19 - The 2019 International Conference on Information Technology & Systems, to be held at Quito, Ecuador, 6 - 8 February 2019, is an international forum for researchers and practitioners to present and discuss the most recent innovations, trends, results, experiences and concerns in the several perspectives of Information Technology & Systems.

We are pleased to invite you to submit your papers to ICITS'19. They can be written in English, Spanish or Portuguese. All submissions will be reviewed on the basis of relevance, originality, importance and clarity.



Topics

Submitted papers should be related with one or more of the main themes proposed for the Conference:

A) Information and Knowledge Management (IKM);

B) Organizational Models and Information Systems (OMIS);

C) Software and Systems Modeling (SSM);

D) Software Systems, Architectures, Applications and Tools (SSAAT);

E) Multimedia Systems and Applications (MSA);

F) Computer Networks, Mobility and Pervasive Systems (CNMPS);

G) Intelligent and Decision Support Systems (IDSS);

H) Big Data Analytics and Applications (BDAA);

I) Human-Computer Interaction (HCI);

J) Ethics, Computers and Security (ECS)

K) Health Informatics (HIS);

L) Information Technologies in Education (ITE);

M) Cybersecurity and Cyber-defense.



Submission and Decision

Submitted papers written in English (until 10-page limit) must comply with the format of Advances in Intelligent Systems and Computing series (see Instructions for Authors at Springer Website <http://www.springer.com/series/11156> or download a DOC example <http://www.icits.me/springerformat.doc>), must not have been published before, not be under review for any other conference or publication and not include any information leading to the authors’ identification. Therefore, the authors’ names, affiliations and bibliographic references should not be included in the version for evaluation by the Scientific Committee. This information should only be included in the camera-ready version, saved in Word or Latex format and also in PDF format. These files must be accompanied by the Consent to Publish form <http://www.icits.me/copyright.pdf> filled out, in a ZIP file, and uploaded at the conference management system.

Submitted papers written in Spanish or Portuguese (until 15-page limit) must comply with the format of RISTI <http://www.risti.xyz/> - Revista Ibérica de Sistemas e Tecnologias de Informação (download instructions/template for authors in Spanish <http://www.risti.xyz/formato-es.doc> or Portuguese <http://www.risti.xyz/formato-pt.doc>), must not have been published before, not be under review for any other conference or publication and not include any information leading to the authors’ identification. Therefore, the authors’ names, affiliations and bibliographic references should not be included in the version for evaluation by the Scientific Committee. This information should only be included in the camera-ready version, saved in Word. These file must be uploaded at the conference management system in a ZIP file.

All papers will be subjected to a “double-blind review” by at least two members of the Scientific Committee.

Based on Scientific Committee evaluation, a paper can be rejected or accepted by the Conference Chairs. In the later case, it can be accepted as paper or poster.

The authors of papers accepted as posters must build and print a poster to be exhibited during the Conference. This poster must follow an A1 or A2 vertical format. The Conference can includes Work Sessions where these posters are presented and orally discussed, with a 7 minute limit per poster.

The authors of accepted papers will have 15 minutes to present their work in a Conference Work Session; approximately 5 minutes of discussion will follow each presentation.



Publication and Indexing

To ensure that an accepted paper is published, at least one of the authors must be fully registered by the 9th of October 2018, and the paper must comply with the suggested layout and page-limit. Additionally, all recommended changes must be addressed by the authors before they submit the camera-ready version.

No more than one paper per registration will be published. An extra fee must be paid for publication of additional papers, with a maximum of one additional paper per registration. One registration permits only the participation of one author in the conference.

Papers written in English and accepted and registered will be published in Proceedings by Springer, in a book of the Advances in Intelligent Systems and Computing <http://www.springer.com/series/11156>series, will  be submitted for indexation by ISI, EI-Compendex, SCOPUS and DBLP, among others, and will be available in the SpringerLink Digital Library <http://link.springer.com/>.

Papers written in Spanish or Portuguese and accepted and registered will be published in a Special Issue of RISTI <http://www.risti.xyz/index.php?option=com_content&view=article&id=3&Itemid=104&lang=es> and will be submitted for indexation by SCOPUS, among others.



Important Dates

Paper Submission: September 16, 2018

Notification of Acceptance: October 28, 2018

Payment of Registration, to ensure the inclusion of an accepted paper in the conference proceedings: November 9, 2018.

Camera-ready Submission: November 9, 2018



Website of ICITS'19: http://www.icits.me/ <http://www.icits.me/>






---
This email has been checked for viruses by AVG.
https://www.avg.com

[-- Attachment #1.2: Type: text/html, Size: 10642 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox