* Remove constructor from buffer_head
@ 2007-05-04 3:08 Christoph Lameter
2007-05-04 3:21 ` Andrew Morton
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Christoph Lameter @ 2007-05-04 3:08 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel
Performance tests show a slight improvements in netperf (not a
strong case for a performance improvement but removing the
constructor has definitely no negative impact so why keep
this around?).
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
Before:
87380 16384 16384 10.01 6026.04
87380 16384 16384 10.01 5992.17
87380 16384 16384 10.01 6071.23
After:
87380 16384 16384 10.01 6090.20
87380 16384 16384 10.01 6078.3
87380 16384 16384 10.00 6013.52
Signed-off-by: Christoph Lameter <clameter@sgi.com>
---
fs/buffer.c | 22 ++++------------------
1 file changed, 4 insertions(+), 18 deletions(-)
Index: slub/fs/buffer.c
===================================================================
--- slub.orig/fs/buffer.c 2007-05-03 19:17:09.000000000 -0700
+++ slub/fs/buffer.c 2007-05-03 19:57:30.000000000 -0700
@@ -2907,9 +2907,10 @@ static void recalc_bh_state(void)
struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
{
- struct buffer_head *ret = kmem_cache_alloc(bh_cachep,
+ struct buffer_head *ret = kmem_cache_zalloc(bh_cachep,
set_migrateflags(gfp_flags, __GFP_RECLAIMABLE));
if (ret) {
+ INIT_LIST_HEAD(&ret->b_assoc_buffers);
get_cpu_var(bh_accounting).nr++;
recalc_bh_state();
put_cpu_var(bh_accounting);
@@ -2928,17 +2929,6 @@ void free_buffer_head(struct buffer_head
}
EXPORT_SYMBOL(free_buffer_head);
-static void
-init_buffer_head(void *data, struct kmem_cache *cachep, unsigned long flags)
-{
- if (flags & SLAB_CTOR_CONSTRUCTOR) {
- struct buffer_head * bh = (struct buffer_head *)data;
-
- memset(bh, 0, sizeof(*bh));
- INIT_LIST_HEAD(&bh->b_assoc_buffers);
- }
-}
-
static void buffer_exit_cpu(int cpu)
{
int i;
@@ -2965,12 +2955,8 @@ void __init buffer_init(void)
{
int nrpages;
- bh_cachep = kmem_cache_create("buffer_head",
- sizeof(struct buffer_head), 0,
- (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
- SLAB_MEM_SPREAD),
- init_buffer_head,
- NULL);
+ bh_cachep = KMEM_CACHE(buffer_head,
+ SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
/*
* Limit the bh occupancy to 10% of ZONE_NORMAL
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: Remove constructor from buffer_head 2007-05-04 3:08 Remove constructor from buffer_head Christoph Lameter @ 2007-05-04 3:21 ` Andrew Morton 2007-05-04 3:34 ` Christoph Lameter 2007-05-04 4:34 ` Christoph Lameter 2007-05-04 6:35 ` William Lee Irwin III 2007-05-04 20:42 ` Andrew Morton 2 siblings, 2 replies; 17+ messages in thread From: Andrew Morton @ 2007-05-04 3:21 UTC (permalink / raw) To: Christoph Lameter; +Cc: linux-kernel On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > Performance tests show a slight improvements in netperf (not a > strong case for a performance improvement but removing the > constructor has definitely no negative impact so why keep > this around?). > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > > Before: > 87380 16384 16384 10.01 6026.04 > 87380 16384 16384 10.01 5992.17 > 87380 16384 16384 10.01 6071.23 > > After: > 87380 16384 16384 10.01 6090.20 > 87380 16384 16384 10.01 6078.3 > 87380 16384 16384 10.00 6013.52 How could a filesystem change affect networking performance? The change looks nice, but I'd microbenchmark it with a write-to-ext2-on-ramdisk or something like that. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 3:21 ` Andrew Morton @ 2007-05-04 3:34 ` Christoph Lameter 2007-05-04 4:37 ` Andrew Morton 2007-05-04 4:34 ` Christoph Lameter 1 sibling, 1 reply; 17+ messages in thread From: Christoph Lameter @ 2007-05-04 3:34 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel On Thu, 3 May 2007, Andrew Morton wrote: > On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > > > Performance tests show a slight improvements in netperf (not a > > strong case for a performance improvement but removing the > > constructor has definitely no negative impact so why keep > > this around?). > > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET > > Recv Send Send > > Socket Socket Message Elapsed > > Size Size Size Time Throughput > > bytes bytes bytes secs. 10^6bits/sec > > > > Before: > > 87380 16384 16384 10.01 6026.04 > > 87380 16384 16384 10.01 5992.17 > > 87380 16384 16384 10.01 6071.23 > > > > After: > > 87380 16384 16384 10.01 6090.20 > > 87380 16384 16384 10.01 6078.3 > > 87380 16384 16384 10.00 6013.52 > > How could a filesystem change affect networking performance? > > The change looks nice, but I'd microbenchmark it with a write-to-ext2-on-ramdisk > or something like that. Hmmmm.. I was told in another thread that this is the most frequently used slab for this benchmark ...... Just accepted that as true. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 3:34 ` Christoph Lameter @ 2007-05-04 4:37 ` Andrew Morton 0 siblings, 0 replies; 17+ messages in thread From: Andrew Morton @ 2007-05-04 4:37 UTC (permalink / raw) To: Christoph Lameter; +Cc: linux-kernel On Thu, 3 May 2007 20:34:48 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > On Thu, 3 May 2007, Andrew Morton wrote: > > > On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > > > > > Performance tests show a slight improvements in netperf (not a > > > strong case for a performance improvement but removing the > > > constructor has definitely no negative impact so why keep > > > this around?). > > > > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET > > > Recv Send Send > > > Socket Socket Message Elapsed > > > Size Size Size Time Throughput > > > bytes bytes bytes secs. 10^6bits/sec > > > > > > Before: > > > 87380 16384 16384 10.01 6026.04 > > > 87380 16384 16384 10.01 5992.17 > > > 87380 16384 16384 10.01 6071.23 > > > > > > After: > > > 87380 16384 16384 10.01 6090.20 > > > 87380 16384 16384 10.01 6078.3 > > > 87380 16384 16384 10.00 6013.52 > > > > How could a filesystem change affect networking performance? > > > > The change looks nice, but I'd microbenchmark it with a write-to-ext2-on-ramdisk > > or something like that. > > Hmmmm.. I was told in another thread that this is the most frequently used > slab for this benchmark That would be hair-raising ;) I suspect confusion with sk_buff. buffer_heads do get used quite a bit though. A good microbenchmark would be to sit in a tight loop extending and truncating an ext2 file ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 3:21 ` Andrew Morton 2007-05-04 3:34 ` Christoph Lameter @ 2007-05-04 4:34 ` Christoph Lameter 1 sibling, 0 replies; 17+ messages in thread From: Christoph Lameter @ 2007-05-04 4:34 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel On Thu, 3 May 2007, Andrew Morton wrote: > The change looks nice, but I'd microbenchmark it with a write-to-ext2-on-ramdisk > or something like that. Hmmm... How does one benchmark buffer head performance? Guess just by copying files? Not sure if the following will cut it. Two tests. First copying 8M of small files into a 16M ramdisk: for i in 1 2 3 4 5 6 7 8 9; do mke2fs /dev/ram0 >/dev/null mount /dev/ram0 /media >/dev/null time cp -a /etc /media umount /dev/ram0 done; No constructor real 0m0.104s user 0m0.016s sys 0m0.056s real 0m0.090s user 0m0.008s sys 0m0.056s real 0m0.089s user 0m0.016s sys 0m0.048s real 0m0.097s user 0m0.004s sys 0m0.064s real 0m0.091s user 0m0.008s sys 0m0.052s real 0m0.091s user 0m0.004s sys 0m0.060s real 0m0.098s user 0m0.008s sys 0m0.060s real 0m0.091s user 0m0.000s sys 0m0.064s real 0m0.090s user 0m0.012s sys 0m0.052s W/constructor real 0m0.099s user 0m0.004s sys 0m0.100s real 0m0.098s user 0m0.008s sys 0m0.096s real 0m0.091s user 0m0.016s sys 0m0.080s real 0m0.091s user 0m0.012s sys 0m0.084s real 0m0.090s user 0m0.012s sys 0m0.080s real 0m0.090s user 0m0.020s sys 0m0.076s real 0m1.269s user 0m0.012s sys 0m0.084s real 0m0.095s user 0m0.016s sys 0m0.084s real 0m0.096s user 0m0.020s sys 0m0.084s The no constructor numbers are generally lower. Lowest is no constructor with 0.089. Second. Copy vmlinux (52M) to 128M ramdisk: for i in 1 2 3 4 5 6 7 8 9; do mke2fs /dev/ram0 >/dev/null mount /dev/ram0 /media >/dev/null time cp slub/vmlinux /media umount /dev/ram0 done; No constructor: real 0m2.095s user 0m0.000s sys 0m0.168s real 0m0.187s user 0m0.008s sys 0m0.124s real 0m0.186s user 0m0.008s sys 0m0.120s real 0m0.195s user 0m0.008s sys 0m0.128s real 0m0.177s user 0m0.004s sys 0m0.120s real 0m0.182s user 0m0.004s sys 0m0.120s real 0m0.186s user 0m0.008s sys 0m0.120s real 0m0.190s user 0m0.004s sys 0m0.128s real 0m0.174s user 0m0.004s sys 0m0.116s Constructor real 0m0.183s user 0m0.004s sys 0m0.188s real 0m0.183s user 0m0.004s sys 0m0.192s real 0m0.177s user 0m0.012s sys 0m0.176s real 0m0.186s user 0m0.004s sys 0m0.192s real 0m0.187s user 0m0.008s sys 0m0.188s real 0m0.184s user 0m0.004s sys 0m0.192s real 0m0.177s user 0m0.012s sys 0m0.176s real 0m0.183s user 0m0.004s sys 0m0.192s real 0m0.182s user 0m0.004s sys 0m0.188s Same here. Low is 0.174 no constructor. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 3:08 Remove constructor from buffer_head Christoph Lameter 2007-05-04 3:21 ` Andrew Morton @ 2007-05-04 6:35 ` William Lee Irwin III 2007-05-04 16:05 ` Christoph Lameter 2007-05-04 20:42 ` Andrew Morton 2 siblings, 1 reply; 17+ messages in thread From: William Lee Irwin III @ 2007-05-04 6:35 UTC (permalink / raw) To: Christoph Lameter; +Cc: akpm, linux-kernel On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote: > Performance tests show a slight improvements in netperf (not a > strong case for a performance improvement but removing the > constructor has definitely no negative impact so why keep > this around?). Cache effects are not so easily visible. Cache profile results from more realistic workloads (e.g. major macrobenchmarks) are more appropriate for gauging this. -- wli ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 6:35 ` William Lee Irwin III @ 2007-05-04 16:05 ` Christoph Lameter 0 siblings, 0 replies; 17+ messages in thread From: Christoph Lameter @ 2007-05-04 16:05 UTC (permalink / raw) To: William Lee Irwin III; +Cc: akpm, linux-kernel On Thu, 3 May 2007, William Lee Irwin III wrote: > On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote: > > Performance tests show a slight improvements in netperf (not a > > strong case for a performance improvement but removing the > > constructor has definitely no negative impact so why keep > > this around?). > > Cache effects are not so easily visible. Cache profile results from > more realistic workloads (e.g. major macrobenchmarks) are more > appropriate for gauging this. Yeah I really out to stick a performance counter in this but that would require some effort. Defer for now I guess. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 3:08 Remove constructor from buffer_head Christoph Lameter 2007-05-04 3:21 ` Andrew Morton 2007-05-04 6:35 ` William Lee Irwin III @ 2007-05-04 20:42 ` Andrew Morton 2007-05-04 21:33 ` Andrew Morton ` (2 more replies) 2 siblings, 3 replies; 17+ messages in thread From: Andrew Morton @ 2007-05-04 20:42 UTC (permalink / raw) To: Christoph Lameter; +Cc: linux-kernel On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > Performance tests show a slight improvements in netperf (not a > strong case for a performance improvement but removing the > constructor has definitely no negative impact so why keep > this around?). > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > > Before: > 87380 16384 16384 10.01 6026.04 > 87380 16384 16384 10.01 5992.17 > 87380 16384 16384 10.01 6071.23 > > After: > 87380 16384 16384 10.01 6090.20 > 87380 16384 16384 10.01 6078.3 > 87380 16384 16384 10.00 6013.52 > > > Signed-off-by: Christoph Lameter <clameter@sgi.com> > > --- > fs/buffer.c | 22 ++++------------------ So I benchmarked this by repeatedly extending (via write()) and truncating a 10MB file, on ext2. Using create-delete.c from http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz Machine is a fast 2x4 core Woodcrest. CONFIG_SLAB=y The command used was time create-delete -s $((16 * 1024 * 1024)) -n 300 foo which will allocate and free 300*4096 buffer_heads. With patch: akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.56s system 99% cpu 4.565 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.60s system 99% cpu 4.612 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.60s system 99% cpu 4.602 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.56s system 99% cpu 4.567 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.59s system 95% cpu 4.824 total Without patch: akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.419 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.421 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.427 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.417 total akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.435 total So the patch took the average system time from 4.42 seconds up to 4.582 seconds. Nice slowdown! It could just be the usual inter-kernel-build noise, dunno. I'd investigate further, but someone has gone and broken oprofile. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 20:42 ` Andrew Morton @ 2007-05-04 21:33 ` Andrew Morton 2007-05-04 23:22 ` Andi Kleen 2007-05-04 21:42 ` Remove constructor from buffer_head Christoph Lameter 2007-05-04 21:52 ` Chuck Ebbert 2 siblings, 1 reply; 17+ messages in thread From: Andrew Morton @ 2007-05-04 21:33 UTC (permalink / raw) To: Andi Kleen; +Cc: Christoph Lameter, linux-kernel On Fri, 4 May 2007 13:42:12 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > I'd investigate further, but someone has gone and broken oprofile. Damn, we went and merged that bustage? 2.6.20: akpm2:/home/akpm> opcontrol --start-daemon /usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory 2.6.21: akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50 opreport error: No sample file found: try running opcontrol --dump or specify a session containing sample files This is an FC6 machine. `yum update oprofile' says Could not find update match for oprofile No Packages marked for Update/Obsoletion akpm2:/home/akpm> rpm -q oprofile oprofile-0.9.2-3.fc6 I'm quite stunned that we did this. Now what? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 21:33 ` Andrew Morton @ 2007-05-04 23:22 ` Andi Kleen 2007-05-04 23:45 ` Andrew Morton 0 siblings, 1 reply; 17+ messages in thread From: Andi Kleen @ 2007-05-04 23:22 UTC (permalink / raw) To: Andrew Morton; +Cc: Christoph Lameter, linux-kernel On Friday 04 May 2007 23:33:47 Andrew Morton wrote: > On Fri, 4 May 2007 13:42:12 -0700 > > 2.6.20: > > akpm2:/home/akpm> opcontrol --start-daemon > /usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory > /usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory > /usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory > /usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory > /usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory > /usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory This isn't a problem anymore since the nmi watchdog is off by default now. > 2.6.21: > > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50 > opreport error: No sample file found: try running opcontrol --dump > or specify a session containing sample files For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21 Did you try opcontrol --dump? -Andi ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 23:22 ` Andi Kleen @ 2007-05-04 23:45 ` Andrew Morton 2007-05-05 9:31 ` Andi Kleen 0 siblings, 1 reply; 17+ messages in thread From: Andrew Morton @ 2007-05-04 23:45 UTC (permalink / raw) To: Andi Kleen; +Cc: Christoph Lameter, linux-kernel, Chuck Ebbert On Sat, 5 May 2007 01:22:05 +0200 Andi Kleen <ak@suse.de> wrote: > > 2.6.21: > > > > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50 > > opreport error: No sample file found: try running opcontrol --dump > > or specify a session containing sample files > > For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21 > > Did you try opcontrol --dump? Yes, tried various things. There's just nothing turning up in /var/lib/oprofile. Chuck appears to be claiming that 2.6.21 oprofile is known to be broken, but I never heard anything about that. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 23:45 ` Andrew Morton @ 2007-05-05 9:31 ` Andi Kleen 2007-05-13 20:38 ` oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head) Benjamin LaHaise 0 siblings, 1 reply; 17+ messages in thread From: Andi Kleen @ 2007-05-05 9:31 UTC (permalink / raw) To: Andrew Morton; +Cc: Andi Kleen, Christoph Lameter, linux-kernel, Chuck Ebbert On Fri, May 04, 2007 at 04:45:29PM -0700, Andrew Morton wrote: > On Sat, 5 May 2007 01:22:05 +0200 > Andi Kleen <ak@suse.de> wrote: > > > > 2.6.21: > > > > > > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50 > > > opreport error: No sample file found: try running opcontrol --dump > > > or specify a session containing sample files > > > > For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21 > > > > Did you try opcontrol --dump? > > Yes, tried various things. There's just nothing turning up in /var/lib/oprofile. > > Chuck appears to be claiming that 2.6.21 oprofile is known to be broken, > but I never heard anything about that. Hmm, after a opcontrol --reset i see the same issue now. Don't know what's wrong, but it must be something different from the .20 perfctr allocation problem. It looks like the daemon doesn't get any data from the kernel -Andi ^ permalink raw reply [flat|nested] 17+ messages in thread
* oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head) 2007-05-05 9:31 ` Andi Kleen @ 2007-05-13 20:38 ` Benjamin LaHaise 2007-05-15 3:12 ` Andrew Morton 0 siblings, 1 reply; 17+ messages in thread From: Benjamin LaHaise @ 2007-05-13 20:38 UTC (permalink / raw) To: Andi Kleen; +Cc: Andrew Morton, Christoph Lameter, linux-kernel, Chuck Ebbert On Sat, May 05, 2007 at 11:31:20AM +0200, Andi Kleen wrote: > Hmm, after a opcontrol --reset i see the same issue now. Don't know what's > wrong, but it must be something different from the .20 perfctr allocation > problem. > > It looks like the daemon doesn't get any data from the kernel I finally had time to track this down. The breakage is caused by "[PATCH] x86-64: Let oprofile reserve MSR on all CPUs". Oprofile is already calling the reserve functions on each CPU in the system when it sets up the MSRs. This results in oprofile getting a reservation failure on CPUs above 0. The following makes oprofile adapt to the API change for now -- oprofile still needs to be modified to perform the reservations earlier during its initialization, but that's a little bit more involved than the immediate bug fix. This only affects systems with more than 1 CPU. This patch has been through limited testing (Athlon 64 X2 and Core 2, but not on the P4) on x86 and x86-64 (Core 2 only). -ben Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> diff --git a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c index 84c3497..21fc74d 100644 --- a/arch/i386/kernel/nmi.c +++ b/arch/i386/kernel/nmi.c @@ -148,7 +148,7 @@ int avail_to_resrv_perfctr_nmi(unsigned int msr) return 1; } -static int __reserve_perfctr_nmi(int cpu, unsigned int msr) +int __reserve_perfctr_nmi(int cpu, unsigned int msr) { unsigned int counter; if (cpu < 0) @@ -162,7 +162,7 @@ static int __reserve_perfctr_nmi(int cpu, unsigned int msr) return 0; } -static void __release_perfctr_nmi(int cpu, unsigned int msr) +void __release_perfctr_nmi(int cpu, unsigned int msr) { unsigned int counter; if (cpu < 0) @@ -212,7 +212,7 @@ int __reserve_evntsel_nmi(int cpu, unsigned int msr) return 0; } -static void __release_evntsel_nmi(int cpu, unsigned int msr) +void __release_evntsel_nmi(int cpu, unsigned int msr) { unsigned int counter; if (cpu < 0) @@ -1188,5 +1188,9 @@ EXPORT_SYMBOL(reserve_perfctr_nmi); EXPORT_SYMBOL(release_perfctr_nmi); EXPORT_SYMBOL(reserve_evntsel_nmi); EXPORT_SYMBOL(release_evntsel_nmi); +EXPORT_SYMBOL(__reserve_perfctr_nmi); +EXPORT_SYMBOL(__release_perfctr_nmi); +EXPORT_SYMBOL(__reserve_evntsel_nmi); +EXPORT_SYMBOL(__release_evntsel_nmi); EXPORT_SYMBOL(disable_timer_nmi_watchdog); EXPORT_SYMBOL(enable_timer_nmi_watchdog); diff --git a/arch/i386/oprofile/op_model_athlon.c b/arch/i386/oprofile/op_model_athlon.c index 3057a19..738a579 100644 --- a/arch/i386/oprofile/op_model_athlon.c +++ b/arch/i386/oprofile/op_model_athlon.c @@ -45,14 +45,14 @@ static void athlon_fill_in_addresses(struct op_msrs * const msrs) int i; for (i=0; i < NUM_COUNTERS; i++) { - if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i)) + if (__reserve_perfctr_nmi(-1, MSR_K7_PERFCTR0 + i)) msrs->counters[i].addr = MSR_K7_PERFCTR0 + i; else msrs->counters[i].addr = 0; } for (i=0; i < NUM_CONTROLS; i++) { - if (reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i)) + if (__reserve_evntsel_nmi(-1, MSR_K7_EVNTSEL0 + i)) msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i; else msrs->controls[i].addr = 0; @@ -160,11 +160,11 @@ static void athlon_shutdown(struct op_msrs const * const msrs) for (i = 0 ; i < NUM_COUNTERS ; ++i) { if (CTR_IS_RESERVED(msrs,i)) - release_perfctr_nmi(MSR_K7_PERFCTR0 + i); + __release_perfctr_nmi(-1, MSR_K7_PERFCTR0 + i); } for (i = 0 ; i < NUM_CONTROLS ; ++i) { if (CTRL_IS_RESERVED(msrs,i)) - release_evntsel_nmi(MSR_K7_EVNTSEL0 + i); + __release_evntsel_nmi(-1, MSR_K7_EVNTSEL0 + i); } } diff --git a/arch/i386/oprofile/op_model_p4.c b/arch/i386/oprofile/op_model_p4.c index 4792592..ce096dc 100644 --- a/arch/i386/oprofile/op_model_p4.c +++ b/arch/i386/oprofile/op_model_p4.c @@ -413,7 +413,7 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs) for (i = 0; i < num_counters; ++i) { addr = p4_counters[VIRT_CTR(stag, i)].counter_address; cccraddr = p4_counters[VIRT_CTR(stag, i)].cccr_address; - if (reserve_perfctr_nmi(addr)){ + if (__reserve_perfctr_nmi(-1, addr)){ msrs->counters[i].addr = addr; msrs->controls[i].addr = cccraddr; } @@ -422,7 +422,7 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs) /* 43 ESCR registers in three or four discontiguous group */ for (addr = MSR_P4_BSU_ESCR0 + stag; addr < MSR_P4_IQ_ESCR0; ++i, addr += addr_increment()) { - if (reserve_evntsel_nmi(addr)) + if (__reserve_evntsel_nmi(-1, addr)) msrs->controls[i].addr = addr; } @@ -431,32 +431,32 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs) if (boot_cpu_data.x86_model >= 0x3) { for (addr = MSR_P4_BSU_ESCR0 + stag; addr <= MSR_P4_BSU_ESCR1; ++i, addr += addr_increment()) { - if (reserve_evntsel_nmi(addr)) + if (__reserve_evntsel_nmi(-1, addr)) msrs->controls[i].addr = addr; } } else { for (addr = MSR_P4_IQ_ESCR0 + stag; addr <= MSR_P4_IQ_ESCR1; ++i, addr += addr_increment()) { - if (reserve_evntsel_nmi(addr)) + if (__reserve_evntsel_nmi(-1, addr)) msrs->controls[i].addr = addr; } } for (addr = MSR_P4_RAT_ESCR0 + stag; addr <= MSR_P4_SSU_ESCR0; ++i, addr += addr_increment()) { - if (reserve_evntsel_nmi(addr)) + if (__reserve_evntsel_nmi(-1, addr)) msrs->controls[i].addr = addr; } for (addr = MSR_P4_MS_ESCR0 + stag; addr <= MSR_P4_TC_ESCR1; ++i, addr += addr_increment()) { - if (reserve_evntsel_nmi(addr)) + if (__reserve_evntsel_nmi(-1, addr)) msrs->controls[i].addr = addr; } for (addr = MSR_P4_IX_ESCR0 + stag; addr <= MSR_P4_CRU_ESCR3; ++i, addr += addr_increment()) { - if (reserve_evntsel_nmi(addr)) + if (__reserve_evntsel_nmi(-1, addr)) msrs->controls[i].addr = addr; } @@ -464,21 +464,21 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs) if (num_counters == NUM_COUNTERS_NON_HT) { /* standard non-HT CPUs handle both remaining ESCRs*/ - if (reserve_evntsel_nmi(MSR_P4_CRU_ESCR5)) + if (__reserve_evntsel_nmi(-1, MSR_P4_CRU_ESCR5)) msrs->controls[i++].addr = MSR_P4_CRU_ESCR5; - if (reserve_evntsel_nmi(MSR_P4_CRU_ESCR4)) + if (__reserve_evntsel_nmi(-1, MSR_P4_CRU_ESCR4)) msrs->controls[i++].addr = MSR_P4_CRU_ESCR4; } else if (stag == 0) { /* HT CPUs give the first remainder to the even thread, as the 32nd control register */ - if (reserve_evntsel_nmi(MSR_P4_CRU_ESCR4)) + if (__reserve_evntsel_nmi(-1, MSR_P4_CRU_ESCR4)) msrs->controls[i++].addr = MSR_P4_CRU_ESCR4; } else { /* and two copies of the second to the odd thread, for the 22st and 23nd control registers */ - if (reserve_evntsel_nmi(MSR_P4_CRU_ESCR5)) { + if (__reserve_evntsel_nmi(-1, MSR_P4_CRU_ESCR5)) { msrs->controls[i++].addr = MSR_P4_CRU_ESCR5; msrs->controls[i++].addr = MSR_P4_CRU_ESCR5; } @@ -684,7 +684,7 @@ static void p4_shutdown(struct op_msrs const * const msrs) for (i = 0 ; i < num_counters ; ++i) { if (CTR_IS_RESERVED(msrs,i)) - release_perfctr_nmi(msrs->counters[i].addr); + __release_perfctr_nmi(-1, msrs->counters[i].addr); } /* some of the control registers are specially reserved in * conjunction with the counter registers (hence the starting offset). @@ -692,7 +692,7 @@ static void p4_shutdown(struct op_msrs const * const msrs) */ for (i = num_counters ; i < num_controls ; ++i) { if (CTRL_IS_RESERVED(msrs,i)) - release_evntsel_nmi(msrs->controls[i].addr); + __release_evntsel_nmi(-1, msrs->controls[i].addr); } } diff --git a/arch/i386/oprofile/op_model_ppro.c b/arch/i386/oprofile/op_model_ppro.c index c554f52..10d2c5d 100644 --- a/arch/i386/oprofile/op_model_ppro.c +++ b/arch/i386/oprofile/op_model_ppro.c @@ -47,14 +47,14 @@ static void ppro_fill_in_addresses(struct op_msrs * const msrs) int i; for (i=0; i < NUM_COUNTERS; i++) { - if (reserve_perfctr_nmi(MSR_P6_PERFCTR0 + i)) + if (__reserve_perfctr_nmi(-1, MSR_P6_PERFCTR0 + i)) msrs->counters[i].addr = MSR_P6_PERFCTR0 + i; else msrs->counters[i].addr = 0; } for (i=0; i < NUM_CONTROLS; i++) { - if (reserve_evntsel_nmi(MSR_P6_EVNTSEL0 + i)) + if (__reserve_evntsel_nmi(-1, MSR_P6_EVNTSEL0 + i)) msrs->controls[i].addr = MSR_P6_EVNTSEL0 + i; else msrs->controls[i].addr = 0; @@ -171,11 +171,11 @@ static void ppro_shutdown(struct op_msrs const * const msrs) for (i = 0 ; i < NUM_COUNTERS ; ++i) { if (CTR_IS_RESERVED(msrs,i)) - release_perfctr_nmi(MSR_P6_PERFCTR0 + i); + __release_perfctr_nmi(-1, MSR_P6_PERFCTR0 + i); } for (i = 0 ; i < NUM_CONTROLS ; ++i) { if (CTRL_IS_RESERVED(msrs,i)) - release_evntsel_nmi(MSR_P6_EVNTSEL0 + i); + __release_evntsel_nmi(-1, MSR_P6_EVNTSEL0 + i); } } diff --git a/arch/x86_64/kernel/nmi.c b/arch/x86_64/kernel/nmi.c index dfab9f1..5011a3b 100644 --- a/arch/x86_64/kernel/nmi.c +++ b/arch/x86_64/kernel/nmi.c @@ -135,7 +135,7 @@ int avail_to_resrv_perfctr_nmi(unsigned int msr) return 1; } -static int __reserve_perfctr_nmi(int cpu, unsigned int msr) +int __reserve_perfctr_nmi(int cpu, unsigned int msr) { unsigned int counter; if (cpu < 0) @@ -149,7 +149,7 @@ static int __reserve_perfctr_nmi(int cpu, unsigned int msr) return 0; } -static void __release_perfctr_nmi(int cpu, unsigned int msr) +void __release_perfctr_nmi(int cpu, unsigned int msr) { unsigned int counter; if (cpu < 0) @@ -198,7 +198,7 @@ int __reserve_evntsel_nmi(int cpu, unsigned int msr) return 0; } -static void __release_evntsel_nmi(int cpu, unsigned int msr) +void __release_evntsel_nmi(int cpu, unsigned int msr) { unsigned int counter; if (cpu < 0) @@ -1073,6 +1073,10 @@ EXPORT_SYMBOL(reserve_perfctr_nmi); EXPORT_SYMBOL(release_perfctr_nmi); EXPORT_SYMBOL(reserve_evntsel_nmi); EXPORT_SYMBOL(release_evntsel_nmi); +EXPORT_SYMBOL(__reserve_perfctr_nmi); +EXPORT_SYMBOL(__release_perfctr_nmi); +EXPORT_SYMBOL(__reserve_evntsel_nmi); +EXPORT_SYMBOL(__release_evntsel_nmi); EXPORT_SYMBOL(disable_timer_nmi_watchdog); EXPORT_SYMBOL(enable_timer_nmi_watchdog); EXPORT_SYMBOL(touch_nmi_watchdog); diff --git a/include/asm-i386/nmi.h b/include/asm-i386/nmi.h index b04333e..062db4d 100644 --- a/include/asm-i386/nmi.h +++ b/include/asm-i386/nmi.h @@ -25,6 +25,11 @@ extern void release_perfctr_nmi(unsigned int); extern int reserve_evntsel_nmi(unsigned int); extern void release_evntsel_nmi(unsigned int); +extern int __reserve_perfctr_nmi(int cpu, unsigned int msr); +extern void __release_perfctr_nmi(int cpu, unsigned int msr); +extern int __reserve_evntsel_nmi(int cpu, unsigned int msr); +extern void __release_evntsel_nmi(int cpu, unsigned int msr); + extern void setup_apic_nmi_watchdog (void *); extern void stop_apic_nmi_watchdog (void *); extern void disable_timer_nmi_watchdog(void); diff --git a/include/asm-x86_64/nmi.h b/include/asm-x86_64/nmi.h index 72375e7..5d6a0b3 100644 --- a/include/asm-x86_64/nmi.h +++ b/include/asm-x86_64/nmi.h @@ -53,6 +53,11 @@ extern void release_perfctr_nmi(unsigned int); extern int reserve_evntsel_nmi(unsigned int); extern void release_evntsel_nmi(unsigned int); +extern int __reserve_perfctr_nmi(int cpu, unsigned int msr); +extern void __release_perfctr_nmi(int cpu, unsigned int msr); +extern int __reserve_evntsel_nmi(int cpu, unsigned int msr); +extern void __release_evntsel_nmi(int cpu, unsigned int msr); + extern void setup_apic_nmi_watchdog (void *); extern void stop_apic_nmi_watchdog (void *); extern void disable_timer_nmi_watchdog(void); ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head) 2007-05-13 20:38 ` oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head) Benjamin LaHaise @ 2007-05-15 3:12 ` Andrew Morton 0 siblings, 0 replies; 17+ messages in thread From: Andrew Morton @ 2007-05-15 3:12 UTC (permalink / raw) To: Benjamin LaHaise Cc: Andi Kleen, Christoph Lameter, linux-kernel, Chuck Ebbert On Sun, 13 May 2007 16:38:16 -0400 Benjamin LaHaise <bcrl@kvack.org> wrote: > On Sat, May 05, 2007 at 11:31:20AM +0200, Andi Kleen wrote: > > Hmm, after a opcontrol --reset i see the same issue now. Don't know what's > > wrong, but it must be something different from the .20 perfctr allocation > > problem. > > > > It looks like the daemon doesn't get any data from the kernel > > I finally had time to track this down. The breakage is caused by "[PATCH] > x86-64: Let oprofile reserve MSR on all CPUs". Oprofile is already calling > the reserve functions on each CPU in the system when it sets up the MSRs. > This results in oprofile getting a reservation failure on CPUs above 0. The > following makes oprofile adapt to the API change for now -- oprofile > still needs to be modified to perform the reservations earlier during its > initialization, but that's a little bit more involved than the immediate > bug fix. This only affects systems with more than 1 CPU. This patch has > been through limited testing (Athlon 64 X2 and Core 2, but not on the P4) on > x86 and x86-64 (Core 2 only). Unfortunately we've left this a bit too late - your patch is patching code which isn't there any more in mainline and we also need a 2.6.21.x fix. So perhaps we could merge your "immediate bugfix" into -stable and implement the "more involved" fix for 2.6.22. Andi, any preferences? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 20:42 ` Andrew Morton 2007-05-04 21:33 ` Andrew Morton @ 2007-05-04 21:42 ` Christoph Lameter 2007-05-04 21:47 ` Andrew Morton 2007-05-04 21:52 ` Chuck Ebbert 2 siblings, 1 reply; 17+ messages in thread From: Christoph Lameter @ 2007-05-04 21:42 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel On Fri, 4 May 2007, Andrew Morton wrote: > So the patch took the average system time from 4.42 seconds up to 4.582 > seconds. Nice slowdown! All of that from a memset and a list head init on a cacheline we already use? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 21:42 ` Remove constructor from buffer_head Christoph Lameter @ 2007-05-04 21:47 ` Andrew Morton 0 siblings, 0 replies; 17+ messages in thread From: Andrew Morton @ 2007-05-04 21:47 UTC (permalink / raw) To: Christoph Lameter; +Cc: linux-kernel On Fri, 4 May 2007 14:42:02 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > On Fri, 4 May 2007, Andrew Morton wrote: > > > So the patch took the average system time from 4.42 seconds up to 4.582 > > seconds. Nice slowdown! > > All of that from a memset and a list head init on a cacheline we already > use? Seems unlikely, especially when you consider all the other stuff which a write() has to do. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Remove constructor from buffer_head 2007-05-04 20:42 ` Andrew Morton 2007-05-04 21:33 ` Andrew Morton 2007-05-04 21:42 ` Remove constructor from buffer_head Christoph Lameter @ 2007-05-04 21:52 ` Chuck Ebbert 2 siblings, 0 replies; 17+ messages in thread From: Chuck Ebbert @ 2007-05-04 21:52 UTC (permalink / raw) To: Andrew Morton; +Cc: Christoph Lameter, linux-kernel Andrew Morton wrote: > > I'd investigate further, but someone has gone and broken oprofile. > Did you just notice that? Apparently it's been broken since 2.6.21-final. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2007-05-15 3:13 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-05-04 3:08 Remove constructor from buffer_head Christoph Lameter 2007-05-04 3:21 ` Andrew Morton 2007-05-04 3:34 ` Christoph Lameter 2007-05-04 4:37 ` Andrew Morton 2007-05-04 4:34 ` Christoph Lameter 2007-05-04 6:35 ` William Lee Irwin III 2007-05-04 16:05 ` Christoph Lameter 2007-05-04 20:42 ` Andrew Morton 2007-05-04 21:33 ` Andrew Morton 2007-05-04 23:22 ` Andi Kleen 2007-05-04 23:45 ` Andrew Morton 2007-05-05 9:31 ` Andi Kleen 2007-05-13 20:38 ` oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head) Benjamin LaHaise 2007-05-15 3:12 ` Andrew Morton 2007-05-04 21:42 ` Remove constructor from buffer_head Christoph Lameter 2007-05-04 21:47 ` Andrew Morton 2007-05-04 21:52 ` Chuck Ebbert
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.