From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Konrad Rzeszutek Wilk <konrad@darnok.org>
Cc: xen-devel <xen-devel@lists.xensource.com>,
Ian Campbell <Ian.Campbell@citrix.com>,
Carsten Schiers <carsten@schiers.de>,
"zhenzhong.duan@oracle.com" <zhenzhong.duan@oracle.com>,
linux@eikelenboom.it, "lersek@redhat.com" <lersek@redhat.com>
Subject: Re: Load increase after memory upgrade (part2)
Date: Wed, 14 Dec 2011 17:07:00 -0500
Message-ID: <20111214220700.GA9926@phenom.dumpdata.com>
In-Reply-To: <20111214202351.GA25896@andromeda.dapyr.net>
[-- Attachment #1: Type: text/plain, Size: 2328 bytes --]
On Wed, Dec 14, 2011 at 04:23:51PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Dec 05, 2011 at 10:26:21PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Sun, Dec 04, 2011 at 01:09:28PM +0100, Carsten Schiers wrote:
> > > Here with two cards enabled and creating a bit of "work" by watching TV with one of them:
> > >
> > > [ 23.842720] Starting SWIOTLB debug thread.
> > > [ 23.842750] swiotlb_start_thread: Go!
> > > [ 23.842838] xen_swiotlb_start_thread: Go!
> > > [ 28.841451] 0 [budget_av 0000:00:01.0] bounce: from:435596(slow:0)to:0 map:658 unmap:0 sync:435596
> > > [ 28.841592] SWIOTLB is 4% full
> > > [ 33.840147] 0 [budget_av 0000:00:01.0] bounce: from:127652(slow:0)to:0 map:0 unmap:0 sync:127652
> > > [ 33.840283] SWIOTLB is 4% full
> > > [ 33.844222] 0 budget_av 0000:00:01.0 alloc coherent: 8, free: 0
> > > [ 38.840227] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
> >
> > Whoa. Yes. You are definitely using the bounce buffer :-)
> >
> > Now it is time to look at why the driver is not using those coherent ones - it
> > looks to allocate just eight of them but does not use them. Unless it is
> > using them _and_ bouncing them (which would be odd).
> >
> > And BTW, you can lower your 'swiotlb=XX' value. The 4% is how much you
> > are using of the default size.
>
> So I am able to see this with an atl1c ethernet driver on my SandyBridge i3
> box. It looks as if the card is truly 32-bit, so on a box with 8GB it
> bounces the data. If I boot the Xen hypervisor with 'mem=4GB' I get no
> bounces (no surprise there).
>
> In other words - I see the same behavior you are seeing. Now off to:
> >
> > I should find out _why_ the old Xen kernels do not use the bounce buffer
> > so much...
>
> which will require some fiddling around.
And I am not seeing any difference - swiotlb usage is the same when
booting a classic (old-style XenoLinux) 2.6.32 as when using a brand new pvops (3.2).
Obviously if I limit the amount of physical memory (so 'mem=4GB' on the Xen hypervisor
line), the bounce usage disappears. Hmm, I wonder if there is a nice way to
tell the hypervisor - hey, please stuff dom0 under 4GB.
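To make the 4GB threshold concrete: a device limited to 32-bit DMA can only
reach bus addresses up to 0xffffffff, so any buffer that ends above that has
to go through the bounce buffer. What follows is a simplified userspace model
of that check, not the kernel's code; dma_bit_mask() and needs_bounce() are
hypothetical names made up for illustration. Under Xen the address that
matters is the machine address, not the guest's pseudo-physical one, which
fits with the bounces disappearing once the hypervisor is limited to 4GB.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel API): mask covering the low n address bits. */
static inline uint64_t dma_bit_mask(unsigned int n)
{
	return (n >= 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/*
 * Simplified model of the bounce decision: the buffer must be bounced
 * when its last byte lies beyond what the device can address.
 * bus_addr is assumed to be the address the device sees (the machine
 * address when running under Xen).
 */
static inline bool needs_bounce(uint64_t bus_addr, size_t size, uint64_t dma_mask)
{
	if (size == 0)
		return false;
	return bus_addr + (size - 1) > dma_mask;
}
```

With a 32-bit mask, a page at 0x100000000 needs bouncing while the last page
below 4GB (0xfffff000) does not; with a 64-bit mask neither does.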
Here is the patch I used against classic XenLinux. Any chance you could run
it with your classic guests and see what numbers you get?
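When reading the numbers: the "SWIOTLB is N% full" line comes from a very
crude count of slab entries whose free-list value is zero, i.e. slabs
currently handed out. A minimal userspace sketch of that arithmetic follows.
The 2 KiB slab size (IO_TLB_SHIFT of 11, as in mainline Linux) and the
function names are assumptions for illustration, not taken from the classic
tree.

```c
#include <stddef.h>

/* Assumption: 2 KiB slabs, i.e. IO_TLB_SHIFT == 11 as in mainline Linux. */
#define SKETCH_IO_TLB_SHIFT 11

/*
 * Percentage of slabs in use. A zero entry in the free list marks an
 * allocated slab, mirroring the loop in the debug thread of the patch.
 */
static unsigned long swiotlb_percent_full(const unsigned int *list, size_t nslabs)
{
	unsigned long filled = 0;
	size_t i;

	if (nslabs == 0)
		return 0;
	for (i = 0; i < nslabs; i++)
		if (list[i] == 0)
			filled++;
	return (filled * 100) / nslabs;
}

/* Rough number of bytes behind a given percentage of nslabs slabs. */
static unsigned long long swiotlb_bytes_in_use(unsigned long percent,
					       unsigned long long nslabs)
{
	return (nslabs * percent / 100) << SKETCH_IO_TLB_SHIFT;
}
```

Assuming a 64 MiB bounce buffer (32768 slabs of 2 KiB), 4% full works out to
about 2.6 MiB actually in use, so there is plenty of room to shrink it.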
[-- Attachment #2: swiotlb-against-old-type.patch --]
[-- Type: text/plain, Size: 7793 bytes --]
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index ab0bb23..17faefd 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -469,3 +469,10 @@ config XEN_SYS_HYPERVISOR
hypervisor environment. When running native or in another
virtual environment, /sys/hypervisor will still be present,
but will have no xen contents.
+
+config SWIOTLB_DEBUG
+ tristate "swiotlb debug facility."
+ default m
+ help
+ Do not enable it.
+
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 28fb50a..df84614 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -42,3 +42,4 @@ obj-$(CONFIG_XEN_GRANT_DEV) += gntdev/
obj-$(CONFIG_XEN_NETDEV_ACCEL_SFC_UTIL) += sfc_netutil/
obj-$(CONFIG_XEN_NETDEV_ACCEL_SFC_FRONTEND) += sfc_netfront/
obj-$(CONFIG_XEN_NETDEV_ACCEL_SFC_BACKEND) += sfc_netback/
+obj-$(CONFIG_SWIOTLB_DEBUG) += dump_swiotlb.o
diff --git a/drivers/xen/dump_swiotlb.c b/drivers/xen/dump_swiotlb.c
new file mode 100644
index 0000000..7168eed
--- /dev/null
+++ b/drivers/xen/dump_swiotlb.c
@@ -0,0 +1,72 @@
+/*
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License v2.0 as published by
+ * the Free Software Foundation
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/init.h>
+#include <linux/stat.h>
+#include <linux/err.h>
+#include <linux/ctype.h>
+#include <linux/slab.h>
+#include <linux/limits.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/blkdev.h>
+#include <linux/device.h>
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/fcntl.h>
+#include <linux/slab.h>
+#include <linux/kmod.h>
+#include <linux/major.h>
+#include <linux/highmem.h>
+#include <linux/blkdev.h>
+#include <linux/module.h>
+#include <linux/blkpg.h>
+#include <linux/buffer_head.h>
+#include <linux/mpage.h>
+#include <linux/mount.h>
+#include <linux/uio.h>
+#include <linux/namei.h>
+#include <asm/uaccess.h>
+
+#include <linux/pagemap.h>
+#include <linux/pagevec.h>
+
+#include <linux/swiotlb.h>
+#define DUMP_SWIOTLB_FUN "0.1"
+
+MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad@darnok.org>");
+MODULE_DESCRIPTION("dump swiotlb");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DUMP_SWIOTLB_FUN);
+/*
+extern int xen_swiotlb_start_thread(void);
+extern void xen_swiotlb_stop_thread(void);*/
+static int __init dump_swiotlb_init(void)
+{
+ printk(KERN_INFO "Starting SWIOTLB debug thread.\n");
+ swiotlb_start_thread();
+ //xen_swiotlb_start_thread();
+ return 0;
+}
+
+static void __exit dump_swiotlb_exit(void)
+{
+ swiotlb_stop_thread();
+ //xen_swiotlb_stop_thread();
+}
+
+module_init(dump_swiotlb_init);
+module_exit(dump_swiotlb_exit);
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 73b1f1c..81f5a1e 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -7,6 +7,9 @@ struct device;
struct dma_attrs;
struct scatterlist;
+
+extern int swiotlb_start_thread(void);
+extern void swiotlb_stop_thread(void);
/*
* Maximum allowable number of contiguous slabs to map,
* must be a power of 2. What is the appropriate value ?
diff --git a/lib/swiotlb-xen.c b/lib/swiotlb-xen.c
index 152696c..d1df462 100644
--- a/lib/swiotlb-xen.c
+++ b/lib/swiotlb-xen.c
@@ -118,6 +118,78 @@ setup_io_tlb_npages(char *str)
}
__setup("swiotlb=", setup_io_tlb_npages);
/* make io_tlb_overflow tunable too? */
+
+#include <linux/percpu.h>
+struct swiotlb_debug {
+ unsigned long bounce_to;
+ unsigned long bounce_from;
+ unsigned long bounce_slow;
+ unsigned long map;
+ unsigned long unmap;
+ unsigned long sync;
+ char dev_name[64];
+};
+
+;
+static DEFINE_PER_CPU(struct swiotlb_debug, tlb_debug);
+#include <linux/kthread.h>
+static int swiotlb_debug_thread(void *arg)
+{
+ int cpu;
+ int size = io_tlb_nslabs;
+ do {
+ int i;
+ unsigned long filled = 0;
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule_timeout_interruptible(HZ*5);
+
+ for_each_online_cpu(cpu) {
+ struct swiotlb_debug *d = &per_cpu(tlb_debug, cpu);
+ /* Can't really happen. */
+ if (!d)
+ continue;
+ if (d->dev_name[0] == 0)
+ continue;
+
+ printk(KERN_INFO "%d [%s] bounce: from:%ld(slow:%ld)to:%ld map:%ld unmap:%ld sync:%ld\n",
+ cpu,
+ d->dev_name,
+ d->bounce_from,
+ d->bounce_slow,
+ d->bounce_to,
+ d->map, d->unmap, d->sync);
+ memset(d, 0, sizeof(struct swiotlb_debug));
+ }
+ /* Very crude calculation. */
+ for (i = 0; i < size; i++) {
+ if (io_tlb_list[i] == 0)
+ filled++;
+ }
+ printk(KERN_INFO "SWIOTLB is %ld%% full\n", (filled * 100) / size);
+
+ } while (!kthread_should_stop());
+ return 0;
+}
+static struct task_struct *debug_thread = NULL;
+
+
+int swiotlb_start_thread(void) {
+ if (debug_thread)
+ return -EINVAL;
+ printk(KERN_INFO "%s: Go!\n",__func__);
+ debug_thread = kthread_run(swiotlb_debug_thread, NULL, "swiotlb_debug");
+ return 0; /* was missing: non-void function fell off the end */
+}
+EXPORT_SYMBOL_GPL(swiotlb_start_thread);
+void swiotlb_stop_thread(void) {
+
+ printk(KERN_INFO "%s: Stop!\n",__func__);
+ if (debug_thread)
+ kthread_stop(debug_thread);
+ debug_thread = NULL;
+}
+EXPORT_SYMBOL_GPL(swiotlb_stop_thread);
+
/* Note that this doesn't work with highmem page */
static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
@@ -270,6 +342,11 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size,
enum dma_data_direction dir)
{
unsigned long pfn = PFN_DOWN(phys);
+ struct swiotlb_debug *d;
+
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
if (PageHighMem(pfn_to_page(pfn))) {
/* The buffer does not have a mapping. Map it in and copy */
@@ -297,12 +374,18 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size,
dma_addr += sz;
offset = 0;
}
+ d->bounce_slow++;
} else {
- if (dir == DMA_TO_DEVICE)
+ if (dir == DMA_TO_DEVICE) {
memcpy(dma_addr, phys_to_virt(phys), size);
- else if (__copy_to_user_inatomic(phys_to_virt(phys),
+ d->bounce_to++;
+ }
+ else {
+ if (__copy_to_user_inatomic(phys_to_virt(phys),
dma_addr, size))
/* inaccessible */;
+ d->bounce_from++;
+ }
}
}
@@ -406,6 +489,16 @@ found:
if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE);
+ {
+ struct swiotlb_debug *d;
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
+ d->map++;
+ snprintf(d->dev_name, sizeof(d->dev_name), "%s %s",
+ dev_driver_string(hwdev), dev_name(hwdev));
+ }
+
return dma_addr;
}
@@ -453,6 +546,17 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir)
io_tlb_list[i] = ++count;
}
spin_unlock_irqrestore(&io_tlb_lock, flags);
+
+ {
+ struct swiotlb_debug *d;
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
+ d->unmap++;
+ snprintf(d->dev_name, sizeof(d->dev_name), "%s %s",
+ dev_driver_string(hwdev), dev_name(hwdev));
+ }
+
}
static void
@@ -462,6 +566,14 @@ sync_single(struct device *hwdev, char *dma_addr, size_t size,
int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT;
phys_addr_t phys = io_tlb_orig_addr[index];
+ struct swiotlb_debug *d;
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
+ d->sync++;
+ snprintf(d->dev_name, sizeof(d->dev_name), "%s %s",
+ dev_driver_string(hwdev), dev_name(hwdev));
+
phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1));
switch (target) {