From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Konrad Rzeszutek Wilk <konrad@darnok.org>
Cc: xen-devel <xen-devel@lists.xensource.com>,
Ian Campbell <Ian.Campbell@citrix.com>,
Carsten Schiers <carsten@schiers.de>,
"zhenzhong.duan@oracle.com" <zhenzhong.duan@oracle.com>,
linux@eikelenboom.it, "lersek@redhat.com" <lersek@redhat.com>
Subject: Re: Load increase after memory upgrade (part2)
Date: Wed, 14 Dec 2011 17:07:00 -0500
Message-ID: <20111214220700.GA9926@phenom.dumpdata.com>
In-Reply-To: <20111214202351.GA25896@andromeda.dapyr.net>
[-- Attachment #1: Type: text/plain, Size: 2328 bytes --]
On Wed, Dec 14, 2011 at 04:23:51PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Dec 05, 2011 at 10:26:21PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Sun, Dec 04, 2011 at 01:09:28PM +0100, Carsten Schiers wrote:
> > > Here with two cards enabled and creating a bit of "work" by watching TV with one of them:
> > >
> > > [ 23.842720] Starting SWIOTLB debug thread.
> > > [ 23.842750] swiotlb_start_thread: Go!
> > > [ 23.842838] xen_swiotlb_start_thread: Go!
> > > [ 28.841451] 0 [budget_av 0000:00:01.0] bounce: from:435596(slow:0)to:0 map:658 unmap:0 sync:435596
> > > [ 28.841592] SWIOTLB is 4% full
> > > [ 33.840147] 0 [budget_av 0000:00:01.0] bounce: from:127652(slow:0)to:0 map:0 unmap:0 sync:127652
> > > [ 33.840283] SWIOTLB is 4% full
> > > [ 33.844222] 0 budget_av 0000:00:01.0 alloc coherent: 8, free: 0
> > > [ 38.840227] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
> >
> > Whoa. Yes. You are definitely using the bounce buffer :-)
> >
> > Now it is time to look at why the driver is not using those coherent ones - it
> > appears to allocate just eight of them but does not use them. Unless it is
> > using them _and_ bouncing them (which would be odd).
> >
> > And BTW, you can lower your 'swiotlb=XX' value. The 4% is how much you
> > are using of the default size.
>
> So I am able to see this with an atl1c ethernet driver on my SandyBridge i3
> box. It looks as if the card is truly 32-bit so on a box with 8GB it
> bounces the data. If I boot the Xen hypervisor with 'mem=4GB' I get no
> bounces (no surprise there).
>
> In other words - I see the same behavior you are seeing. Now off to:
> >
> > I should find out _why_ the old Xen kernels do not use the bounce buffer
> > so much...
>
> which will require some fiddling around.
And I am not seeing any difference - the swiotlb is used at the same rate when
booting a classic (old-style XenoLinux) 2.6.32 kernel as when using a brand new
pvops (3.2) one. Obviously if I limit the amount of physical memory (so 'mem=4GB'
on the Xen hypervisor line), the bounce usage disappears. Hmm, I wonder if there
is a nice way to tell the hypervisor - hey, please stuff dom0 under 4GB.
Here is the patch I used against classic XenLinux. Any chance you could run
it with your classic guests and see what numbers you get?
[-- Attachment #2: swiotlb-against-old-type.patch --]
[-- Type: text/plain, Size: 7793 bytes --]
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index ab0bb23..17faefd 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -469,3 +469,10 @@ config XEN_SYS_HYPERVISOR
hypervisor environment. When running native or in another
virtual environment, /sys/hypervisor will still be present,
but will have no xen contents.
+
+config SWIOTLB_DEBUG
+ tristate "swiotlb debug facility."
+ default m
+ help
+ Do not enable it.
+
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 28fb50a..df84614 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -42,3 +42,4 @@ obj-$(CONFIG_XEN_GRANT_DEV) += gntdev/
obj-$(CONFIG_XEN_NETDEV_ACCEL_SFC_UTIL) += sfc_netutil/
obj-$(CONFIG_XEN_NETDEV_ACCEL_SFC_FRONTEND) += sfc_netfront/
obj-$(CONFIG_XEN_NETDEV_ACCEL_SFC_BACKEND) += sfc_netback/
+obj-$(CONFIG_SWIOTLB_DEBUG) += dump_swiotlb.o
diff --git a/drivers/xen/dump_swiotlb.c b/drivers/xen/dump_swiotlb.c
new file mode 100644
index 0000000..7168eed
--- /dev/null
+++ b/drivers/xen/dump_swiotlb.c
@@ -0,0 +1,72 @@
+/*
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License v2.0 as published by
+ * the Free Software Foundation
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/init.h>
+#include <linux/stat.h>
+#include <linux/err.h>
+#include <linux/ctype.h>
+#include <linux/slab.h>
+#include <linux/limits.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/blkdev.h>
+#include <linux/device.h>
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/fcntl.h>
+#include <linux/slab.h>
+#include <linux/kmod.h>
+#include <linux/major.h>
+#include <linux/highmem.h>
+#include <linux/blkdev.h>
+#include <linux/module.h>
+#include <linux/blkpg.h>
+#include <linux/buffer_head.h>
+#include <linux/mpage.h>
+#include <linux/mount.h>
+#include <linux/uio.h>
+#include <linux/namei.h>
+#include <asm/uaccess.h>
+
+#include <linux/pagemap.h>
+#include <linux/pagevec.h>
+
+#include <linux/swiotlb.h>
+#define DUMP_SWIOTLB_FUN "0.1"
+
+MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad@darnok.org>");
+MODULE_DESCRIPTION("dump swiotlb");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DUMP_SWIOTLB_FUN);
+/*
+extern int xen_swiotlb_start_thread(void);
+extern void xen_swiotlb_stop_thread(void);*/
+static int __init dump_swiotlb_init(void)
+{
+ printk(KERN_INFO "Starting SWIOTLB debug thread.\n");
+ swiotlb_start_thread();
+ //xen_swiotlb_start_thread();
+ return 0;
+}
+
+static void __exit dump_swiotlb_exit(void)
+{
+ swiotlb_stop_thread();
+ //xen_swiotlb_stop_thread();
+}
+
+module_init(dump_swiotlb_init);
+module_exit(dump_swiotlb_exit);
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 73b1f1c..81f5a1e 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -7,6 +7,9 @@ struct device;
struct dma_attrs;
struct scatterlist;
+
+extern int swiotlb_start_thread(void);
+extern void swiotlb_stop_thread(void);
/*
* Maximum allowable number of contiguous slabs to map,
* must be a power of 2. What is the appropriate value ?
diff --git a/lib/swiotlb-xen.c b/lib/swiotlb-xen.c
index 152696c..d1df462 100644
--- a/lib/swiotlb-xen.c
+++ b/lib/swiotlb-xen.c
@@ -118,6 +118,78 @@ setup_io_tlb_npages(char *str)
}
__setup("swiotlb=", setup_io_tlb_npages);
/* make io_tlb_overflow tunable too? */
+
+#include <linux/percpu.h>
+struct swiotlb_debug {
+ unsigned long bounce_to;
+ unsigned long bounce_from;
+ unsigned long bounce_slow;
+ unsigned long map;
+ unsigned long unmap;
+ unsigned long sync;
+ char dev_name[64];
+};
+
+;
+static DEFINE_PER_CPU(struct swiotlb_debug, tlb_debug);
+#include <linux/kthread.h>
+static int swiotlb_debug_thread(void *arg)
+{
+ int cpu;
+ int size = io_tlb_nslabs;
+ do {
+ int i;
+ unsigned long filled = 0;
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule_timeout_interruptible(HZ*5);
+
+ for_each_online_cpu(cpu) {
+ struct swiotlb_debug *d = &per_cpu(tlb_debug, cpu);
+ /* Can't really happen. */
+ if (!d)
+ continue;
+ if (d->dev_name[0] == 0)
+ continue;
+
+ printk(KERN_INFO "%d [%s] bounce: from:%ld(slow:%ld)to:%ld map:%ld unmap:%ld sync:%ld\n",
+ cpu,
+ d->dev_name ? d->dev_name : "?",
+ d->bounce_from,
+ d->bounce_slow,
+ d->bounce_to,
+ d->map, d->unmap, d->sync);
+ memset(d, 0, sizeof(struct swiotlb_debug));
+ }
+ /* Very crude calculation. */
+ for (i = 0; i < size; i++) {
+ if (io_tlb_list[i] == 0)
+ filled++;
+ }
+ printk(KERN_INFO "SWIOTLB is %ld%% full\n", (filled * 100) / size);
+
+ } while (!kthread_should_stop());
+ return 0;
+}
+static struct task_struct *debug_thread = NULL;
+
+
+int swiotlb_start_thread(void) {
+ if (debug_thread)
+ return -EINVAL;
+ printk(KERN_INFO "%s: Go!\n", __func__);
+ debug_thread = kthread_run(swiotlb_debug_thread, NULL, "swiotlb_debug");
+ return 0;
+}
+EXPORT_SYMBOL_GPL(swiotlb_start_thread);
+void swiotlb_stop_thread(void) {
+
+ printk(KERN_INFO "%s: Stop!\n",__func__);
+ if (debug_thread)
+ kthread_stop(debug_thread);
+ debug_thread = NULL;
+}
+EXPORT_SYMBOL_GPL(swiotlb_stop_thread);
+
/* Note that this doesn't work with highmem page */
static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
@@ -270,6 +342,11 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size,
enum dma_data_direction dir)
{
unsigned long pfn = PFN_DOWN(phys);
+ struct swiotlb_debug *d;
+
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
if (PageHighMem(pfn_to_page(pfn))) {
/* The buffer does not have a mapping. Map it in and copy */
@@ -297,12 +374,18 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size,
dma_addr += sz;
offset = 0;
}
+ d->bounce_slow++;
} else {
- if (dir == DMA_TO_DEVICE)
+ if (dir == DMA_TO_DEVICE) {
memcpy(dma_addr, phys_to_virt(phys), size);
- else if (__copy_to_user_inatomic(phys_to_virt(phys),
+ d->bounce_to++;
+ }
+ else {
+ if (__copy_to_user_inatomic(phys_to_virt(phys),
dma_addr, size))
/* inaccessible */;
+ d->bounce_from++;
+ }
}
}
@@ -406,6 +489,16 @@ found:
if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE);
+ {
+ struct swiotlb_debug *d;
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
+ d->map++;
+ snprintf(d->dev_name, sizeof(d->dev_name), "%s %s",
+ dev_driver_string(hwdev), dev_name(hwdev));
+ }
+
return dma_addr;
}
@@ -453,6 +546,17 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir)
io_tlb_list[i] = ++count;
}
spin_unlock_irqrestore(&io_tlb_lock, flags);
+
+ {
+ struct swiotlb_debug *d;
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
+ d->unmap++;
+ snprintf(d->dev_name, sizeof(d->dev_name), "%s %s",
+ dev_driver_string(hwdev), dev_name(hwdev));
+ }
+
}
static void
@@ -462,6 +566,14 @@ sync_single(struct device *hwdev, char *dma_addr, size_t size,
int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT;
phys_addr_t phys = io_tlb_orig_addr[index];
+ struct swiotlb_debug *d;
+ preempt_disable();
+ d = &__get_cpu_var(tlb_debug);
+ preempt_enable();
+ d->sync++;
+ snprintf(d->dev_name, sizeof(d->dev_name), "%s %s",
+ dev_driver_string(hwdev), dev_name(hwdev));
+
phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1));
switch (target) {