* Re: [patch 10/10] Scheduler profiling - Use immediate values
       [not found] ` <8DkTC-5Vy-11@gated-at.bofh.it>
@ 2007-07-05 15:23   ` Bodo Eggert
  2007-07-05 15:46     ` Mathieu Desnoyers
  0 siblings, 1 reply; 32+ messages in thread
From: Bodo Eggert @ 2007-07-05 15:23 UTC (permalink / raw)
  To: Andi Kleen, Alexey Dobriyan, Mathieu Desnoyers, akpm, linux-kernel

Andi Kleen <andi@firstfloor.org> wrote:
> Alexey Dobriyan <adobriyan@gmail.com> writes:
>> On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:

>> > Use immediate values with lower d-cache hit in optimized version as a
>> > condition for scheduler profiling call.
>>
>> I think it's better to put profile.c under CONFIG_PROFILING as
>> _expected_, so CONFIG_PROFILING=n users won't get any overhead,
>> immediate or not. That's what I'm going to do after test-booting a
>> bunch of kernels.
>
> No, it's better to handle this efficiently at runtime e.g. for
> distribution kernels. Mathieu's patch is good

IMO you should combine them. For distributions, it may be good to include
profiling support unconditionally, but how many of the vanilla kernel
users are going to use profiling at all?
-- 
A man inserted an advertisement in the classified: "Wife Wanted." The
next day he received a hundred letters. They all said the same thing:
"You can have mine."

Friß, Spammer: oztx1@nAQd3C.7eggert.dyndns.org wGhu@InUYZ.7eggert.dyndns.org

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-05 15:23 ` [patch 10/10] Scheduler profiling - Use immediate values Bodo Eggert
@ 2007-07-05 15:46   ` Mathieu Desnoyers
  2007-07-06 21:08     ` Adrian Bunk
  0 siblings, 1 reply; 32+ messages in thread
From: Mathieu Desnoyers @ 2007-07-05 15:46 UTC (permalink / raw)
  To: Bodo Eggert; +Cc: Andi Kleen, Alexey Dobriyan, akpm, linux-kernel

* Bodo Eggert (7eggert@gmx.de) wrote:
> Andi Kleen <andi@firstfloor.org> wrote:
> > Alexey Dobriyan <adobriyan@gmail.com> writes:
> >> On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
>
> >> > Use immediate values with lower d-cache hit in optimized version as a
> >> > condition for scheduler profiling call.
> >>
> >> I think it's better to put profile.c under CONFIG_PROFILING as
> >> _expected_, so CONFIG_PROFILING=n users won't get any overhead,
> >> immediate or not. That's what I'm going to do after test-booting a
> >> bunch of kernels.
> >
> > No, it's better to handle this efficiently at runtime e.g. for
> > distribution kernels. Mathieu's patch is good
>
> IMO you should combine them. For distributions, it may be good to include
> profiling support unconditionally, but how many of the vanilla kernel
> users are going to use profiling at all?

For CONFIG_PROFILING, I think of it more like a chicken-and-egg problem:
as long as it is not easy to enable when needed in distribution kernels,
few profiling applications will use it. So if you ban it from distro
kernels with a CONFIG option under the pretext that no profiling
application uses it, you run straight into a conceptual deadlock. :)

Another similar example would be CONFIG_TIMER_STATS, used by powertop,
which users will use to tune their laptops (I got 45 minutes more
battery time on mine thanks to this wonderful tool). Compiling in, but
dynamically turning on/off, makes sense for a lot of kernel
"profiling/stats extraction" mechanisms like those.

But I suspect they will be adopted by distros only when their presence,
when disabled, is unnoticeable, or when a major portion of their users
yell loudly enough that they want these features, leaving the more
specialized minority of distro users without the features they need to
fine-tune their applications or their kernel.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-05 15:46 ` Mathieu Desnoyers
@ 2007-07-06 21:08   ` Adrian Bunk
  0 siblings, 0 replies; 32+ messages in thread
From: Adrian Bunk @ 2007-07-06 21:08 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Bodo Eggert, Andi Kleen, Alexey Dobriyan, akpm, linux-kernel

On Thu, Jul 05, 2007 at 11:46:44AM -0400, Mathieu Desnoyers wrote:
> * Bodo Eggert (7eggert@gmx.de) wrote:
> > Andi Kleen <andi@firstfloor.org> wrote:
> > > Alexey Dobriyan <adobriyan@gmail.com> writes:
> > >> On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
> >
> > >> > Use immediate values with lower d-cache hit in optimized version as a
> > >> > condition for scheduler profiling call.
> > >>
> > >> I think it's better to put profile.c under CONFIG_PROFILING as
> > >> _expected_, so CONFIG_PROFILING=n users won't get any overhead,
> > >> immediate or not. That's what I'm going to do after test-booting a
> > >> bunch of kernels.
> > >
> > > No, it's better to handle this efficiently at runtime e.g. for
> > > distribution kernels. Mathieu's patch is good
> >
> > IMO you should combine them. For distributions, it may be good to include
> > profiling support unconditionally, but how many of the vanilla kernel
> > users are going to use profiling at all?
>
> For CONFIG_PROFILING, I think of it more like a chicken-and-egg problem:
> as long as it is not easy to enable when needed in distribution kernels,
> few profiling applications will use it. So if you ban it from distro
> kernels with a CONFIG option under the pretext that no profiling
> application uses it, you run straight into a conceptual deadlock. :)
>
> Another similar example would be CONFIG_TIMER_STATS, used by powertop,
> which users will use to tune their laptops (I got 45 minutes more
> battery time on mine thanks to this wonderful tool). Compiling in, but
> dynamically turning on/off, makes sense for a lot of kernel
> "profiling/stats extraction" mechanisms like those.
>
> But I suspect they will be adopted by distros only when their presence,
> when disabled, is unnoticeable, or when a major portion of their users
> yell loudly enough that they want these features, leaving the more
> specialized minority of distro users without the features they need to
> fine-tune their applications or their kernel.

There's a surprisingly simple solution solving the problems you
describe:

For userspace libraries, the common approach is to get them stripped
and have versions with all debugging symbols in some -dbg package.
So if you want to debug an application using such a library, you simply
install this -dbg package.

Just let distributions do the same for the kernel - add a -dbg flavour
with many debugging options enabled. It might perhaps run 5% or 20%
slower than the regular kernel, but you don't need profiling or
powertop during normal operation - these are _debug_ tools.

This way, there's no runtime penalty and therefore no trickery required
for getting the overhead of _debug code_ lower.

> Mathieu

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out
 of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
                                  Pearl S. Buck - Dragon Seed

^ permalink raw reply	[flat|nested] 32+ messages in thread
* [patch 00/10] Immediate Values
@ 2007-07-03 16:40 Mathieu Desnoyers
  2007-07-03 16:40 ` [patch 10/10] Scheduler profiling - Use immediate values Mathieu Desnoyers
  0 siblings, 1 reply; 32+ messages in thread
From: Mathieu Desnoyers @ 2007-07-03 16:40 UTC (permalink / raw)
  To: akpm, linux-kernel

Hi,

This is the update to the Immediate Values patch. It provides a value
with a load immediate instruction instead of requiring a data load
which could hurt the data cache. It aims at providing a very efficient
manner to branch over compiled-in but mostly inactive code.

This release takes care of the modifications suggested in the previous
round of review (thanks to all reviewers!).

It applies on 2.6.22-rc6-mm1, and depends on the text edit lock patch.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 32+ messages in thread
* [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-03 16:40 [patch 00/10] Immediate Values Mathieu Desnoyers
@ 2007-07-03 16:40 ` Mathieu Desnoyers
  2007-07-03 18:11   ` Alexey Dobriyan
  2007-07-04 20:35   ` Alexey Dobriyan
  0 siblings, 2 replies; 32+ messages in thread
From: Mathieu Desnoyers @ 2007-07-03 16:40 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers

[-- Attachment #1: profiling-use-immediate-values.patch --]
[-- Type: text/plain, Size: 6593 bytes --]

Use immediate values with lower d-cache hit in optimized version as a
condition for scheduler profiling call.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 drivers/kvm/svm.c       |    2 +-
 drivers/kvm/vmx.c       |    2 +-
 include/linux/profile.h |    9 +++------
 kernel/profile.c        |   37 +++++++++++++++++++++++++------------
 kernel/sched.c          |    3 ++-
 5 files changed, 32 insertions(+), 21 deletions(-)

Index: linux-2.6-lttng/kernel/profile.c
===================================================================
--- linux-2.6-lttng.orig/kernel/profile.c	2007-06-29 12:21:08.000000000 -0400
+++ linux-2.6-lttng/kernel/profile.c	2007-06-29 14:26:42.000000000 -0400
@@ -23,6 +23,7 @@
 #include <linux/profile.h>
 #include <linux/highmem.h>
 #include <linux/mutex.h>
+#include <linux/immediate.h>
 #include <asm/sections.h>
 #include <asm/semaphore.h>
 #include <asm/irq_regs.h>
@@ -42,9 +43,6 @@
 static atomic_t *prof_buffer;
 static unsigned long prof_len, prof_shift;
 
-int prof_on __read_mostly;
-EXPORT_SYMBOL_GPL(prof_on);
-
 static cpumask_t prof_cpu_mask = CPU_MASK_ALL;
 #ifdef CONFIG_SMP
 static DEFINE_PER_CPU(struct profile_hit *[2], cpu_profile_hits);
@@ -52,6 +50,12 @@
 static DEFINE_MUTEX(profile_flip_mutex);
 #endif /* CONFIG_SMP */
 
+/* Immediate values */
+immediate_t __read_mostly sleep_profiling, sched_profiling, kvm_profiling,
+	cpu_profiling;
+EXPORT_SYMBOL_GPL(kvm_profiling);
+EXPORT_SYMBOL_GPL(cpu_profiling);
+
 static int __init profile_setup(char * str)
 {
 	static char __initdata schedstr[] = "schedule";
@@ -60,7 +64,7 @@
 	int par;
 
 	if (!strncmp(str, sleepstr, strlen(sleepstr))) {
-		prof_on = SLEEP_PROFILING;
+		immediate_arm(&sleep_profiling);
 		if (str[strlen(sleepstr)] == ',')
 			str += strlen(sleepstr) + 1;
 		if (get_option(&str, &par))
@@ -69,7 +73,7 @@
 			"kernel sleep profiling enabled (shift: %ld)\n",
 			prof_shift);
 	} else if (!strncmp(str, schedstr, strlen(schedstr))) {
-		prof_on = SCHED_PROFILING;
+		immediate_arm(&sched_profiling);
 		if (str[strlen(schedstr)] == ',')
 			str += strlen(schedstr) + 1;
 		if (get_option(&str, &par))
@@ -78,7 +82,7 @@
 			"kernel schedule profiling enabled (shift: %ld)\n",
 			prof_shift);
 	} else if (!strncmp(str, kvmstr, strlen(kvmstr))) {
-		prof_on = KVM_PROFILING;
+		immediate_arm(&kvm_profiling);
 		if (str[strlen(kvmstr)] == ',')
 			str += strlen(kvmstr) + 1;
 		if (get_option(&str, &par))
@@ -88,7 +92,7 @@
 			prof_shift);
 	} else if (get_option(&str, &par)) {
 		prof_shift = par;
-		prof_on = CPU_PROFILING;
+		immediate_arm(&cpu_profiling);
 		printk(KERN_INFO "kernel profiling enabled (shift: %ld)\n",
 			prof_shift);
 	}
@@ -99,7 +103,10 @@
 
 void __init profile_init(void)
 {
-	if (!prof_on)
+	if (!immediate_query(&sleep_profiling) &&
+		!immediate_query(&sched_profiling) &&
+		!immediate_query(&kvm_profiling) &&
+		!immediate_query(&cpu_profiling))
 		return;
 
 	/* only text is profiled */
@@ -288,7 +295,7 @@
 	int i, j, cpu;
 	struct profile_hit *hits;
 
-	if (prof_on != type || !prof_buffer)
+	if (!prof_buffer)
 		return;
 	pc = min((pc - (unsigned long)_stext) >> prof_shift, prof_len - 1);
 	i = primary = (pc & (NR_PROFILE_GRP - 1)) << PROFILE_GRPSHIFT;
@@ -398,7 +405,7 @@
 {
 	unsigned long pc;
 
-	if (prof_on != type || !prof_buffer)
+	if (!prof_buffer)
 		return;
 	pc = ((unsigned long)__pc - (unsigned long)_stext) >> prof_shift;
 	atomic_add(nr_hits, &prof_buffer[min(pc, prof_len - 1)]);
@@ -555,7 +562,10 @@
 	}
 	return 0;
 out_cleanup:
-	prof_on = 0;
+	immediate_disarm(&sleep_profiling);
+	immediate_disarm(&sched_profiling);
+	immediate_disarm(&kvm_profiling);
+	immediate_disarm(&cpu_profiling);
 	smp_mb();
 	on_each_cpu(profile_nop, NULL, 0, 1);
 	for_each_online_cpu(cpu) {
@@ -582,7 +592,10 @@
 {
 	struct proc_dir_entry *entry;
 
-	if (!prof_on)
+	if (!immediate_query(&sleep_profiling) &&
+		!immediate_query(&sched_profiling) &&
+		!immediate_query(&kvm_profiling) &&
+		!immediate_query(&cpu_profiling))
 		return 0;
 	if (create_hash_tables())
 		return -1;

Index: linux-2.6-lttng/include/linux/profile.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/profile.h	2007-06-29 10:21:44.000000000 -0400
+++ linux-2.6-lttng/include/linux/profile.h	2007-06-29 14:26:42.000000000 -0400
@@ -10,7 +10,8 @@
 
 #include <asm/errno.h>
 
-extern int prof_on __read_mostly;
+extern immediate_t __read_mostly sleep_profiling, sched_profiling, kvm_profiling,
+	cpu_profiling;
 
 #define CPU_PROFILING	1
 #define SCHED_PROFILING	2
@@ -35,11 +36,7 @@
  */
 static inline void profile_hit(int type, void *ip)
 {
-	/*
-	 * Speedup for the common (no profiling enabled) case:
-	 */
-	if (unlikely(prof_on == type))
-		profile_hits(type, ip, 1);
+	profile_hits(type, ip, 1);
 }
 
 #ifdef CONFIG_PROC_FS

Index: linux-2.6-lttng/kernel/sched.c
===================================================================
--- linux-2.6-lttng.orig/kernel/sched.c	2007-06-29 14:16:23.000000000 -0400
+++ linux-2.6-lttng/kernel/sched.c	2007-06-29 14:27:26.000000000 -0400
@@ -3241,7 +3241,8 @@
 	if (unlikely(in_atomic_preempt_off()) && unlikely(!prev->exit_state))
 		__schedule_bug(prev);
 
-	profile_hit(SCHED_PROFILING, __builtin_return_address(0));
+	if (unlikely(immediate(sched_profiling)))
+		profile_hit(SCHED_PROFILING, __builtin_return_address(0));
 
 	schedstat_inc(this_rq(), sched_cnt);
 }

Index: linux-2.6-lttng/drivers/kvm/svm.c
===================================================================
--- linux-2.6-lttng.orig/drivers/kvm/svm.c	2007-06-29 12:27:40.000000000 -0400
+++ linux-2.6-lttng/drivers/kvm/svm.c	2007-06-29 14:26:42.000000000 -0400
@@ -1654,7 +1654,7 @@
 	/*
 	 * Profile KVM exit RIPs:
 	 */
-	if (unlikely(prof_on == KVM_PROFILING))
+	if (unlikely(immediate(kvm_profiling)))
 		profile_hit(KVM_PROFILING,
 			(void *)(unsigned long)vcpu->svm->vmcb->save.rip);

Index: linux-2.6-lttng/drivers/kvm/vmx.c
===================================================================
--- linux-2.6-lttng.orig/drivers/kvm/vmx.c	2007-06-29 12:27:40.000000000 -0400
+++ linux-2.6-lttng/drivers/kvm/vmx.c	2007-06-29 14:26:42.000000000 -0400
@@ -2156,7 +2156,7 @@
 	/*
 	 * Profile KVM exit RIPs:
 	 */
-	if (unlikely(prof_on == KVM_PROFILING))
+	if (unlikely(immediate(kvm_profiling)))
 		profile_hit(KVM_PROFILING, (void *)vmcs_readl(GUEST_RIP));
 
 	vcpu->launched = 1;

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-03 16:40 ` [patch 10/10] Scheduler profiling - Use immediate values Mathieu Desnoyers
@ 2007-07-03 18:11   ` Alexey Dobriyan
  2007-07-03 18:57     ` Mathieu Desnoyers
  2007-07-04 20:35   ` Alexey Dobriyan
  1 sibling, 1 reply; 32+ messages in thread
From: Alexey Dobriyan @ 2007-07-03 18:11 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel

On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
> Use immediate values with lower d-cache hit in optimized version as a
> condition for scheduler profiling call.

How much difference in performance do you see?

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-03 18:11 ` Alexey Dobriyan
@ 2007-07-03 18:57   ` Mathieu Desnoyers
  2007-07-04 14:23     ` Adrian Bunk
  ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Mathieu Desnoyers @ 2007-07-03 18:57 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: akpm, linux-kernel

* Alexey Dobriyan (adobriyan@gmail.com) wrote:
> On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
> > Use immediate values with lower d-cache hit in optimized version as a
> > condition for scheduler profiling call.
>
> How much difference in performance do you see?

Hi Alexey,

Please have a look at Documentation/immediate.txt for that information.
Also note that the main advantage of the load immediate is to free a
cache line. Therefore, I guess the best way to quantify the improvement
it brings at one single site is not in terms of cycles, but in terms of
the number of cache lines used by the scheduler code. Since memory
bandwidth seems to be an increasing bottleneck (CPU frequency increases
faster than the available memory bandwidth), it makes sense to free as
many cache lines as we can.

Measuring the overall impact on the system of this single modification
shows that the difference brought by one site lies within the standard
deviation of the normal samples. It will become significant when the
number of immediate values used instead of global variables in hot
kernel paths (weighted by the frequency at which the data is accessed)
starts to be significant compared to the L1 data cache size. We could
characterize this in memory to L1 cache transfers per second.

On a 3GHz P4:

memory read: ~48 cycles

So we can definitely say that 48*HZ (an approximation of the frequency
at which the scheduler is called) won't make much difference, but as it
grows, it will.

On a 1000HZ system, it results in:

48000 cycles/second, or 16µs/second, or 0.000016% speedup.

However, if we place this in code called much more often, such as
do_page_fault, we get, with a hypothetical scenario of approximately
100000 page faults per second:

4800000 cycles/s, 1.6ms/second or 0.0016% speedup.

So as the number of immediate values used increases, the overall memory
bandwidth required by the kernel will go down.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 32+ messages in thread
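[Editor's note] The back-of-the-envelope estimates above can be checked mechanically. The 3 GHz clock and ~48-cycle read cost are taken from the mail; the second value each call returns is seconds of wall time saved per second of runtime:

```python
# Assumptions taken from the mail: 3 GHz P4, ~48 cycles per memory read.
CPU_HZ = 3_000_000_000
READ_CYCLES = 48

def saved_per_second(calls_per_sec):
    """Cycles and wall-clock seconds saved per second of runtime by
    eliminating one such read from a path hit calls_per_sec times."""
    cycles = READ_CYCLES * calls_per_sec
    return cycles, cycles / CPU_HZ

# Scheduler called at ~1000 Hz: 48000 cycles, i.e. 16 microseconds/second.
print(saved_per_second(1_000))

# Hypothetical 100000 page faults/second: 4.8M cycles, i.e. 1.6 ms/second.
print(saved_per_second(100_000))
```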
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-03 18:57 ` Mathieu Desnoyers
@ 2007-07-04 14:23   ` Adrian Bunk
  2007-07-04 20:31   ` Alexey Dobriyan
  2007-07-05 20:21   ` Andrew Morton
  2 siblings, 0 replies; 32+ messages in thread
From: Adrian Bunk @ 2007-07-04 14:23 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Alexey Dobriyan, akpm, linux-kernel

On Tue, Jul 03, 2007 at 02:57:48PM -0400, Mathieu Desnoyers wrote:
> * Alexey Dobriyan (adobriyan@gmail.com) wrote:
> > On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
> > > Use immediate values with lower d-cache hit in optimized version as a
> > > condition for scheduler profiling call.
> >
> > How much difference in performance do you see?
>
> Hi Alexey,
>
> Please have a look at Documentation/immediate.txt for that information.
> Also note that the main advantage of the load immediate is to free a
> cache line. Therefore, I guess the best way to quantify the improvement
> it brings at one single site is not in terms of cycles, but in terms of
> the number of cache lines used by the scheduler code. Since memory
> bandwidth seems to be an increasing bottleneck (CPU frequency increases
> faster than the available memory bandwidth), it makes sense to free as
> many cache lines as we can.
>
> Measuring the overall impact on the system of this single modification
> shows that the difference brought by one site lies within the standard
> deviation of the normal samples. It will become significant when the
> number of immediate values used instead of global variables in hot
> kernel paths (weighted by the frequency at which the data is accessed)
> starts to be significant compared to the L1 data cache size. We could
> characterize this in memory to L1 cache transfers per second.
>
> On a 3GHz P4:
>
> memory read: ~48 cycles
>
> So we can definitely say that 48*HZ (an approximation of the frequency
> at which the scheduler is called) won't make much difference, but as it
> grows, it will.
>
> On a 1000HZ system, it results in:
>
> 48000 cycles/second, or 16µs/second, or 0.000016% speedup.
>
> However, if we place this in code called much more often, such as
> do_page_fault, we get, with a hypothetical scenario of approximately
> 100000 page faults per second:
>
> 4800000 cycles/s, 1.6ms/second or 0.0016% speedup.
>
> So as the number of immediate values used increases, the overall memory
> bandwidth required by the kernel will go down.

Might make a nice scientific paper, but even according to your own
optimistic numbers it's not realistic that you will ever achieve any
visible improvement even if you'd find 100 places in hotpaths you could
mark this way.

And a better direction for hotpaths seems to be Andi's __cold/COLD in
-mm without adding your own framework for doing such things.

> Mathieu

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out
 of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
                                  Pearl S. Buck - Dragon Seed

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-03 18:57 ` Mathieu Desnoyers
  2007-07-04 14:23   ` Adrian Bunk
@ 2007-07-04 20:31   ` Alexey Dobriyan
  2007-07-05 20:21   ` Andrew Morton
  2 siblings, 0 replies; 32+ messages in thread
From: Alexey Dobriyan @ 2007-07-04 20:31 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel

On Tue, Jul 03, 2007 at 02:57:48PM -0400, Mathieu Desnoyers wrote:
> * Alexey Dobriyan (adobriyan@gmail.com) wrote:
> > On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
> > > Use immediate values with lower d-cache hit in optimized version as a
> > > condition for scheduler profiling call.
> >
> > How much difference in performance do you see?
>
> Hi Alexey,
>
> Please have a look at Documentation/immediate.txt for that information.
> Also note that the main advantage of the load immediate is to free a
> cache line. Therefore, I guess the best way to quantify the improvement
> it brings at one single site is not in terms of cycles, but in terms of
> the number of cache lines used by the scheduler code. Since memory
> bandwidth seems to be an increasing bottleneck (CPU frequency increases
> faster than the available memory bandwidth), it makes sense to free as
> many cache lines as we can.
>
> Measuring the overall impact on the system of this single modification
> shows that the difference brought by one site lies within the standard
> deviation of the normal samples. It will become significant when the
> number of immediate values used instead of global variables in hot
> kernel paths (weighted by the frequency at which the data is accessed)
> starts to be significant compared to the L1 data cache size.

L1 cache is 8K here. Just how many such variables should exist?
On hot paths!

> We could characterize this in memory to L1 cache transfers per second.
>
> On a 3GHz P4:
>
> memory read: ~48 cycles
>
> So we can definitely say that 48*HZ (an approximation of the frequency
> at which the scheduler is called) won't make much difference, but as it
> grows, it will.
>
> On a 1000HZ system, it results in:
>
> 48000 cycles/second, or 16µs/second, or 0.000016% speedup.
>
> However, if we place this in code called much more often, such as
> do_page_fault, we get, with a hypothetical scenario of approximately
> 100000 page faults per second:
>
> 4800000 cycles/s, 1.6ms/second or 0.0016% speedup.
>
> So as the number of immediate values used increases, the overall memory
> bandwidth required by the kernel will go down.

Adding so much infrastructure for something that you can't even measure
is totally unjustified. There are already too many places where
unlikely() and __read_mostly are used just because they can be used, so
adding yet another such very specific, let's call it annotation, seems
wrong to me.

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-03 18:57 ` Mathieu Desnoyers
  2007-07-04 14:23   ` Adrian Bunk
  2007-07-04 20:31   ` Alexey Dobriyan
@ 2007-07-05 20:21   ` Andrew Morton
  2007-07-05 20:29     ` Andrew Morton
  2007-07-06 11:44     ` Andi Kleen
  2 siblings, 2 replies; 32+ messages in thread
From: Andrew Morton @ 2007-07-05 20:21 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Alexey Dobriyan, linux-kernel

On Tue, 3 Jul 2007 14:57:48 -0400
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> Measuring the overall impact on the system of this single modification
> shows that the difference brought by one site lies within the standard
> deviation of the normal samples. It will become significant when the
> number of immediate values used instead of global variables in hot
> kernel paths (weighted by the frequency at which the data is accessed)
> starts to be significant compared to the L1 data cache size. We could
> characterize this in memory to L1 cache transfers per second.
>
> On a 3GHz P4:
>
> memory read: ~48 cycles
>
> So we can definitely say that 48*HZ (an approximation of the frequency
> at which the scheduler is called) won't make much difference, but as it
> grows, it will.
>
> On a 1000HZ system, it results in:
>
> 48000 cycles/second, or 16µs/second, or 0.000016% speedup.
>
> However, if we place this in code called much more often, such as
> do_page_fault, we get, with a hypothetical scenario of approximately
> 100000 page faults per second:
>
> 4800000 cycles/s, 1.6ms/second or 0.0016% speedup.
>
> So as the number of immediate values used increases, the overall memory
> bandwidth required by the kernel will go down.

Is that 48 cycles measured when the target of the read is in L1 cache, as
it would be in any situation which we actually care about? I guess so...

Boy, this is a tiny optimisation and boy, you added a pile of tricky new
code to obtain it.

Frankly, I'm thinking that life would be simpler if we just added static
markers and stopped trying to add lots of tricksy
maintenance-load-increasing things like this.

Ho hum. Need more convincing, please.

Also: a while back (maybe as much as a year) we had an extensive
discussion regarding whether we want static markers at all in the kernel.
The eventual outcome was, I believe, "yes". But our reasons for making
that decision appear to have been lost. So if I were to send the markers
patches to Linus and he were to ask me "why are you sending these", I'd be
forced to answer "I don't know". This is not a good situation.

Please prepare and maintain a short document which describes the
justification for making all these changes to the kernel. The changelog
for the main markers patch would be an appropriate place for this. The
target audience would be kernel developers and it should capture the pro-
and con- arguments which were raised during that discussion.

Basically: tell us why we should merge _any_ of this stuff, because I for
one have forgotten.

Thanks.

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-05 20:21 ` Andrew Morton
@ 2007-07-05 20:29   ` Andrew Morton
  2007-07-05 20:41     ` Mathieu Desnoyers
  2007-07-06 11:44   ` Andi Kleen
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2007-07-05 20:29 UTC (permalink / raw)
  To: Mathieu Desnoyers, Alexey Dobriyan, linux-kernel

On Thu, 5 Jul 2007 13:21:20 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> Please prepare and maintain a short document which describes the
> justification for making all these changes to the kernel.

oh, you did. It's there in the add-kconfig-stuff patch.

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-05 20:29 ` Andrew Morton
@ 2007-07-05 20:41   ` Mathieu Desnoyers
  0 siblings, 0 replies; 32+ messages in thread
From: Mathieu Desnoyers @ 2007-07-05 20:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alexey Dobriyan, linux-kernel

* Andrew Morton (akpm@linux-foundation.org) wrote:
> On Thu, 5 Jul 2007 13:21:20 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > Please prepare and maintain a short document which describes the
> > justification for making all these changes to the kernel.
>
> oh, you did. It's there in the add-kconfig-stuff patch.

Yes, if you feel it should be put in a different patch header, I'll
move it.

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-05 20:21 ` Andrew Morton
  2007-07-05 20:29   ` Andrew Morton
@ 2007-07-06 11:44   ` Andi Kleen
  2007-07-06 17:50     ` Li, Tong N
  2007-07-06 22:14     ` Chuck Ebbert
  1 sibling, 2 replies; 32+ messages in thread
From: Andi Kleen @ 2007-07-06 11:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Mathieu Desnoyers, Alexey Dobriyan, linux-kernel

Andrew Morton <akpm@linux-foundation.org> writes:

> Is that 48 cycles measured when the target of the read is in L1 cache, as
> it would be in any situation which we actually care about? I guess so...

The normal situation is: a big database or other bloated software runs,
clears all the d-caches, then enters the kernel. The kernel then has a
cache miss on all its data. But icache access is faster because the CPU
prefetches.

We've had cases like this - e.g. the additional dcache line accesses
that were added by the new time code in vgettimeofday() were visible in
macro benchmarks.

Also cache misses in this situation tend to be much more than 48 cycles
(even a K8 with an integrated memory controller and the fastest DIMMs is
slower than that). Mathieu probably measured an L2 miss, not a load from
RAM.
Load from RAM can be hundreds of ns in the worst case.

I think the optimization is a good idea, although I dislike that it is
complicated for the dynamic markers. If it was just static it would be
much simpler.

-Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread
* RE: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-06 11:44 ` Andi Kleen
@ 2007-07-06 17:50   ` Li, Tong N
  2007-07-06 20:03     ` Andi Kleen
  2007-07-06 22:14   ` Chuck Ebbert
  1 sibling, 1 reply; 32+ messages in thread
From: Li, Tong N @ 2007-07-06 17:50 UTC (permalink / raw)
  To: Andi Kleen, Andrew Morton
  Cc: Mathieu Desnoyers, Alexey Dobriyan, linux-kernel

> Also cache misses in this situation tend to be much more than 48 cycles
> (even a K8 with an integrated memory controller and the fastest DIMMs is
> slower than that). Mathieu probably measured an L2 miss, not a load from
> RAM.
> Load from RAM can be hundreds of ns in the worst case.

The 48 cycles sounds to me like a memory load in an unloaded system, but
it is quite low. I wonder how it was measured...

tong

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-06 17:50 ` Li, Tong N
@ 2007-07-06 20:03   ` Andi Kleen
  2007-07-06 20:57     ` Li, Tong N
  0 siblings, 1 reply; 32+ messages in thread
From: Andi Kleen @ 2007-07-06 20:03 UTC (permalink / raw)
  To: Li, Tong N
  Cc: Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan,
	linux-kernel

On Fri, Jul 06, 2007 at 10:50:30AM -0700, Li, Tong N wrote:
> > Also cache misses in this situation tend to be much more than 48 cycles
> > (even a K8 with an integrated memory controller and the fastest DIMMs is
> > slower than that). Mathieu probably measured an L2 miss, not a load
                                                   ^^^^^^^
I meant L2 cache hit of course

> > from RAM.
> > Load from RAM can be hundreds of ns in the worst case.
>
> The 48 cycles sounds to me like a memory load in an unloaded system, but
> it is quite low. I wonder how it was measured...

I found that memory latency is difficult to measure in modern x86
CPUs because they have very clever prefetchers that can often
outwit benchmarks.

Another trap on P4 is that RDTSC is actually quite slow and synchronizes
the CPU; that can add large measurement errors.

-Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread
* RE: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 20:03 ` Andi Kleen @ 2007-07-06 20:57 ` Li, Tong N 2007-07-06 21:03 ` Mathieu Desnoyers 0 siblings, 1 reply; 32+ messages in thread From: Li, Tong N @ 2007-07-06 20:57 UTC (permalink / raw) To: Andi Kleen Cc: Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel > I found that memory latency is difficult to measure in modern x86 > CPUs because they have very clever prefetchers that can often > outwit benchmarks. A pointer-chasing program that accesses a random sequence of addresses usually can produce a good estimate on memory latency. Also, prefetching can be turned off in BIOS or by modifying the MSRs. > Another trap on P4 is that RDTSC is actually quite slow and synchronizes > the CPU; that can add large measurement errors. > > -Andi The cost can be amortized if the portion of memory accesses is long enough. tong ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 20:57 ` Li, Tong N @ 2007-07-06 21:03 ` Mathieu Desnoyers 0 siblings, 0 replies; 32+ messages in thread From: Mathieu Desnoyers @ 2007-07-06 21:03 UTC (permalink / raw) To: Li, Tong N; +Cc: Andi Kleen, Andrew Morton, Alexey Dobriyan, linux-kernel * Li, Tong N (tong.n.li@intel.com) wrote: > > I found that memory latency is difficult to measure in modern x86 > > CPUs because they have very clever prefetchers that can often > > outwit benchmarks. > > A pointer-chasing program that accesses a random sequence of addresses > usually can produce a good estimate on memory latency. Also, prefetching > can be turned off in BIOS or by modifying the MSRs. > > > Another trap on P4 is that RDTSC is actually quite slow and > synchronizes > > the CPU; that can add large measurement errors. > > > > -Andi > > The cost can be amortized if the portion of memory accesses is long > enough. > > tong > That's what I am currently doing.. the results are coming in a few moments... :) -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 11:44 ` Andi Kleen 2007-07-06 17:50 ` Li, Tong N @ 2007-07-06 22:14 ` Chuck Ebbert 2007-07-06 23:28 ` Adrian Bunk 1 sibling, 1 reply; 32+ messages in thread From: Chuck Ebbert @ 2007-07-06 22:14 UTC (permalink / raw) To: Andi Kleen Cc: Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel On 07/06/2007 07:44 AM, Andi Kleen wrote: > I think the optimization is a good idea, although i dislike it > that it is complicated for the dynamic markers. If it was just > static it would be much simpler. > Another thing to consider is that there might be hundreds of these probes/tracepoints active in an instrumented kernel. The overhead adds up fast, so the gain may be worth all the pain. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 22:14 ` Chuck Ebbert @ 2007-07-06 23:28 ` Adrian Bunk 2007-07-06 23:38 ` Dave Jones 2007-07-06 23:43 ` Mathieu Desnoyers 0 siblings, 2 replies; 32+ messages in thread From: Adrian Bunk @ 2007-07-06 23:28 UTC (permalink / raw) To: Chuck Ebbert Cc: Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel On Fri, Jul 06, 2007 at 06:14:10PM -0400, Chuck Ebbert wrote: > On 07/06/2007 07:44 AM, Andi Kleen wrote: > > I think the optimization is a good idea, although i dislike it > > that it is complicated for the dynamic markers. If it was just > > static it would be much simpler. > > Another thing to consider is that there might be hundreds of these > probes/tracepoints active in an instrumented kernel. The overhead > adds up fast, so the gain may be worth all the pain. Only if you want to squeeze the last bit of performance out of _debugging_ functionality. You avoid all the pain if you simply don't use debugging functionality on production systems. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 23:28 ` Adrian Bunk @ 2007-07-06 23:38 ` Dave Jones 2007-07-07 0:10 ` Adrian Bunk 2007-07-06 23:43 ` Mathieu Desnoyers 1 sibling, 1 reply; 32+ messages in thread From: Dave Jones @ 2007-07-06 23:38 UTC (permalink / raw) To: Adrian Bunk Cc: Chuck Ebbert, Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel On Sat, Jul 07, 2007 at 01:28:43AM +0200, Adrian Bunk wrote: > Only if you want to squeeze the last bit of performance out of > _debugging_ functionality. > > You avoid all the pain if you simply don't use debugging functionality > on production systems. I think you're mixing up profiling and debugging. The former is extremely valuable under production systems. Dave -- http://www.codemonkey.org.uk ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 23:38 ` Dave Jones @ 2007-07-07 0:10 ` Adrian Bunk 2007-07-07 15:45 ` Frank Ch. Eigler 0 siblings, 1 reply; 32+ messages in thread From: Adrian Bunk @ 2007-07-07 0:10 UTC (permalink / raw) To: Dave Jones, Chuck Ebbert, Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel On Fri, Jul 06, 2007 at 07:38:27PM -0400, Dave Jones wrote: > On Sat, Jul 07, 2007 at 01:28:43AM +0200, Adrian Bunk wrote: > > > Only if you want to squeeze the last bit of performance out of > > _debugging_ functionality. > > > > You avoid all the pain if you simply don't use debugging functionality > > on production systems. > > I think you're mixing up profiling and debugging. The former is > extremely valuable under production systems. profiling = debugging of performance problems My words were perhaps a bit sloppy, but profiling isn't part of normal operation and if people use a separate kernel for such purposes we don't need infrastructure for reducing performance penalties of enabled debug options. > Dave cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 0:10 ` Adrian Bunk @ 2007-07-07 15:45 ` Frank Ch. Eigler 2007-07-07 17:01 ` Adrian Bunk 0 siblings, 1 reply; 32+ messages in thread From: Frank Ch. Eigler @ 2007-07-07 15:45 UTC (permalink / raw) To: Adrian Bunk Cc: Dave Jones, Chuck Ebbert, Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel Adrian Bunk <bunk@stusta.de> writes: > [...] > profiling = debugging of performance problems Indeed. > My words were perhaps a bit sloppy, but profiling isn't part of > normal operation and if people use a separate kernel for such > purposes we don't need infrastructure for reducing performance > penalties of enabled debug options. Things are not so simple. One might not know that one has a performance problem until one tries some analysis tools. Rebooting into different kernels just to investigate does not work generally: the erroneous phenomenon may have been short lived; the debug kernel, being "only" for debugging, may not be well tested => sufficiently trustworthy. Your question asking for an actual performance impact of dormant hooks is OTOH entirely legitimate. It clearly depends on the placement of those hooks and thus their encounter rate, more so than their underlying technology (markers with whatever optimizations). If the cost is small enough, you will likely find that people will be willing to pay a small fraction of average performance, in order to eke out large gains when finding occasional e.g. algorithmic bugs. - FChE ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 15:45 ` Frank Ch. Eigler @ 2007-07-07 17:01 ` Adrian Bunk 2007-07-07 17:20 ` Willy Tarreau 2007-07-07 17:55 ` Frank Ch. Eigler 0 siblings, 2 replies; 32+ messages in thread From: Adrian Bunk @ 2007-07-07 17:01 UTC (permalink / raw) To: Frank Ch. Eigler Cc: Dave Jones, Chuck Ebbert, Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel On Sat, Jul 07, 2007 at 11:45:20AM -0400, Frank Ch. Eigler wrote: > Adrian Bunk <bunk@stusta.de> writes: > > > [...] > > profiling = debugging of performance problems > > Indeed. > > > My words were perhaps a bit sloppy, but profiling isn't part of > > normal operation and if people use a separate kernel for such > > purposes we don't need infrastructure for reducing performance > > penalties of enabled debug options. > > Things are not so simple. One might not know that one has a > performance problem until one tries some analysis tools. Rebooting > into different kernels just to investigate does not work generally: > the erroneous phenomenon may have been short lived; the debug kernel, > being "only" for debugging, may not be well tested => sufficiently > trustworthy. I'm not getting this: You'll only start looking into an analysis tool if you have a performance problem, IOW if you are not satisfied with the performance. And the debug code will not have been tested on this machine no matter whether it's enabled through a compile option or at runtime. > Your question asking for an actual performance impact of dormant hooks > is OTOH entirely legitimate. It clearly depends on the placement of > those hooks and thus their encounter rate, more so than their > underlying technology (markers with whatever optimizations). If the > cost is small enough, you will likely find that people will be willing > to pay a small fraction of average performance, in order to eke out > large gains when finding occasional e.g. algorithmic bugs. 
If you might be able to get a big part of tracing and other debug code enabled with a performance penalty of a few percent of _kernel_ performance, then you might get much debugging aid without any effective impact on application performance. You always have to decide between some debug code and some small bit of performance. There's a reason why options to disable things like BUG() or printk() are in the kernel config menus hidden behind CONFIG_EMBEDDED although they obviously have some performance impact. > - FChE cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 17:01 ` Adrian Bunk @ 2007-07-07 17:20 ` Willy Tarreau 2007-07-07 17:59 ` Adrian Bunk 2007-07-07 17:55 ` Frank Ch. Eigler 1 sibling, 1 reply; 32+ messages in thread From: Willy Tarreau @ 2007-07-07 17:20 UTC (permalink / raw) To: Adrian Bunk Cc: Frank Ch. Eigler, Dave Jones, Chuck Ebbert, Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel On Sat, Jul 07, 2007 at 07:01:57PM +0200, Adrian Bunk wrote: > On Sat, Jul 07, 2007 at 11:45:20AM -0400, Frank Ch. Eigler wrote: > > Adrian Bunk <bunk@stusta.de> writes: > > > > > [...] > > > profiling = debugging of performance problems > > > > Indeed. > > > > > My words were perhaps a bit sloppy, but profiling isn't part of > > > normal operation and if people use a separate kernel for such > > > purposes we don't need infrastructure for reducing performance > > > penalties of enabled debug options. > > > > Things are not so simple. One might not know that one has a > > performance problem until one tries some analysis tools. Rebooting > > into different kernels just to investigate does not work generally: > > the erroneous phenomenon may have been short lived; the debug kernel, > > being "only" for debugging, may not be well tested => sufficiently > > trustworthy. > > I'm not getting this: > > You'll only start looking into an analysis tool if you have a > performance problem, IOW if you are not satisfied with the performance. > > And the debug code will not have been tested on this machine no matter > whether it's enabled through a compile option or at runtime. At least all the rest of the code will be untouched and you will not have to reboot the machine. If you reboot to another kernel, nothing ensures that you will have the same code sequences (in fact, gcc will reorder some parts of code such as loops just because of an additional 'if'). So you know that the non-debug code you run will remain untouched. 
*This* is important, because the debug code is not there to debug itself, but to debug the rest. > > Your question asking for an actual performance impact of dormant hooks > > is OTOH entirely legitimate. It clearly depends on the placement of > > those hooks and thus their encounter rate, more so than their > > underlying technology (markers with whatever optimizations). If the > > cost is small enough, you will likely find that people will be willing > > to pay a small fraction of average performance, in order to eke out > > large gains when finding occasional e.g. algorithmic bugs. > > If you might be able to get a big part of tracing and other debug code > enabled with a performance penalty of a few percent of _kernel_ > performance, then you might get much debugging aid without any effective > impact on application performance. It largely depends on the application. Applications which require a lot of system calls will be more sensitive to kernel debugging. Common sense also implies that such applications will be the ones for which kernel debugging will be relevant. > You always have to decide between some debug code and some small bit of > performance. There's a reason why options to disable things like BUG() > or printk() are in the kernel config menus hidden behind CONFIG_EMBEDDED > although they obviously have some performance impact. It is not for the CPU performance that they can be disabled, but for the code size, which is a real problem on embedded systems. While you often have mem/cpu_mhz ratios around 1GB/1GHz on servers and desktops, you more often have ratios like 16MB/500MHz which is 1:32 of the former. That's why you optimize for size at the expense of speed on such systems. > > > - FChE > > cu > Adrian Regards, Willy ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 17:20 ` Willy Tarreau @ 2007-07-07 17:59 ` Adrian Bunk 0 siblings, 0 replies; 32+ messages in thread From: Adrian Bunk @ 2007-07-07 17:59 UTC (permalink / raw) To: Willy Tarreau Cc: Frank Ch. Eigler, Dave Jones, Chuck Ebbert, Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel On Sat, Jul 07, 2007 at 07:20:12PM +0200, Willy Tarreau wrote: > On Sat, Jul 07, 2007 at 07:01:57PM +0200, Adrian Bunk wrote: > > On Sat, Jul 07, 2007 at 11:45:20AM -0400, Frank Ch. Eigler wrote: >... > > You always have to decide between some debug code and some small bit of > > performance. There's a reason why options to disable things like BUG() > > or printk() are in the kernel config menus hidden behind CONFIG_EMBEDDED > > although they obviously have some performance impact. > > It is not for the CPU performance they can be disabled, but for the code > size which is a real problem on embedded system. While you often have > mem/cpu_mhz ratios around 1GB/1GHz on servers and desktops, you more often > have ratios like 16MB/500MHz which is 1:32 of the former. That's why you > optimize for size at the expense of speed on such systems. The latter is not true for my two examples. CONFIG_PRINTK=n, CONFIG_BUG=n will obviously make the kernel both smaller and faster. [1] > Regards, > Willy cu Adrian [1] faster due to less code to execute and positive cache effects due to the smaller code [2] [2] whether the "faster" is big enough that it is in any way measurable is a different question -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 17:01 ` Adrian Bunk 2007-07-07 17:20 ` Willy Tarreau @ 2007-07-07 17:55 ` Frank Ch. Eigler 1 sibling, 0 replies; 32+ messages in thread From: Frank Ch. Eigler @ 2007-07-07 17:55 UTC (permalink / raw) To: Adrian Bunk Cc: Dave Jones, Chuck Ebbert, Andi Kleen, Andrew Morton, Mathieu Desnoyers, Alexey Dobriyan, linux-kernel Hi, Adrian - On Sat, Jul 07, 2007 at 07:01:57PM +0200, Adrian Bunk wrote: > [...] > > Things are not so simple. One might not know that one has a > > performance problem until one tries some analysis tools. Rebooting > > into different kernels just to investigate does not work generally [...] > > I'm not getting this: > > You'll only start looking into an analysis tool if you have a > performance problem, IOW if you are not satisfied with the > performance. There may be people whose jobs entail continually suspecting performance problems. Or one may run instrumentation code on a long-term basis specifically to locate performance spikes. > And the debug code will not have been tested on this machine no matter > whether it's enabled through a compile option or at runtime. There is a big difference in favour of the former. The additional instrumentation code may be small enough to inspect carefully. The rest of the kernel would be unaffected. > [...] If you might be able to get a big part of tracing and other > debug code enabled with a performance penalty of a few percent of > _kernel_ performance, then you might get much debugging aid without > any effective impact on application performance. Agreed. - FChE ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 23:28 ` Adrian Bunk 2007-07-06 23:38 ` Dave Jones @ 2007-07-06 23:43 ` Mathieu Desnoyers 2007-07-07 2:25 ` Adrian Bunk 1 sibling, 1 reply; 32+ messages in thread From: Mathieu Desnoyers @ 2007-07-06 23:43 UTC (permalink / raw) To: Adrian Bunk Cc: Chuck Ebbert, Andi Kleen, Andrew Morton, Alexey Dobriyan, linux-kernel, mbligh * Adrian Bunk (bunk@stusta.de) wrote: > On Fri, Jul 06, 2007 at 06:14:10PM -0400, Chuck Ebbert wrote: > > On 07/06/2007 07:44 AM, Andi Kleen wrote: > > > I think the optimization is a good idea, although i dislike it > > > that it is complicated for the dynamic markers. If it was just > > > static it would be much simpler. > > > > Another thing to consider is that there might be hundreds of these > > probes/tracepoints active in an instrumented kernel. The overhead > > adds up fast, so the gain may be worth all the pain. > > Only if you want to squeeze the last bit of performance out of > _debugging_ functionality. > > You avoid all the pain if you simply don't use debugging functionality > on production systems. > Adrian, Please have a look at my markers posts, especially: http://www.ussg.iu.edu/hypermail/linux/kernel/0707.0/0669.html And also look into the OLS 2007 proceedings for Martin Bligh's paper on Debugging Google sized clusters. It basically makes the case for adding functionality to debug _user space_ problems on production systems that can be turned on dynamically. Mathieu > cu > Adrian > > -- > > "Is there not promise of rain?" Ling Tan asked suddenly out > of the darkness. There had been need of rain for many days. > "Only a promise," Lao Er said. > Pearl S. Buck - Dragon Seed > -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-06 23:43 ` Mathieu Desnoyers @ 2007-07-07 2:25 ` Adrian Bunk 2007-07-07 2:35 ` Mathieu Desnoyers 0 siblings, 1 reply; 32+ messages in thread From: Adrian Bunk @ 2007-07-07 2:25 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Chuck Ebbert, Andi Kleen, Andrew Morton, Alexey Dobriyan, linux-kernel, mbligh On Fri, Jul 06, 2007 at 07:43:15PM -0400, Mathieu Desnoyers wrote: > * Adrian Bunk (bunk@stusta.de) wrote: > > On Fri, Jul 06, 2007 at 06:14:10PM -0400, Chuck Ebbert wrote: > > > On 07/06/2007 07:44 AM, Andi Kleen wrote: > > > > I think the optimization is a good idea, although i dislike it > > > > that it is complicated for the dynamic markers. If it was just > > > > static it would be much simpler. > > > > > > Another thing to consider is that there might be hundreds of these > > > probes/tracepoints active in an instrumented kernel. The overhead > > > adds up fast, so the gain may be worth all the pain. > > > > Only if you want to squeeze the last bit of performance out of > > _debugging_ functionality. > > > > You avoid all the pain if you simply don't use debugging functionality > > on production systems. > > > > Adrian, > > Please have a look at my markers posts, especially: > > http://www.ussg.iu.edu/hypermail/linux/kernel/0707.0/0669.html > > And also look into OLS 2007 proceedings for Martin Bligh's paper on > Debugging Google sized clusters. It basically makes the case for adding > functionnality to debug _user space_ problems on production systems that > can be turned on dynamically. Using a different kernel for tracing still fulfills all the requirements listed in section 5 of your paper... > Mathieu cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 2:25 ` Adrian Bunk @ 2007-07-07 2:35 ` Mathieu Desnoyers 2007-07-07 4:03 ` Adrian Bunk 0 siblings, 1 reply; 32+ messages in thread From: Mathieu Desnoyers @ 2007-07-07 2:35 UTC (permalink / raw) To: Adrian Bunk Cc: Chuck Ebbert, Andi Kleen, Andrew Morton, Alexey Dobriyan, linux-kernel, mbligh * Adrian Bunk (bunk@stusta.de) wrote: > On Fri, Jul 06, 2007 at 07:43:15PM -0400, Mathieu Desnoyers wrote: > > * Adrian Bunk (bunk@stusta.de) wrote: > > > On Fri, Jul 06, 2007 at 06:14:10PM -0400, Chuck Ebbert wrote: > > > > On 07/06/2007 07:44 AM, Andi Kleen wrote: > > > > > I think the optimization is a good idea, although i dislike it > > > > > that it is complicated for the dynamic markers. If it was just > > > > > static it would be much simpler. > > > > > > > > Another thing to consider is that there might be hundreds of these > > > > probes/tracepoints active in an instrumented kernel. The overhead > > > > adds up fast, so the gain may be worth all the pain. > > > > > > Only if you want to squeeze the last bit of performance out of > > > _debugging_ functionality. > > > > > > You avoid all the pain if you simply don't use debugging functionality > > > on production systems. > > > > > > > Adrian, > > > > Please have a look at my markers posts, especially: > > > > http://www.ussg.iu.edu/hypermail/linux/kernel/0707.0/0669.html > > > > And also look into OLS 2007 proceedings for Martin Bligh's paper on > > Debugging Google sized clusters. It basically makes the case for adding > > functionnality to debug _user space_ problems on production systems that > > can be turned on dynamically. > > Using a different kernel for tracing still fulfills all the requirements > listed in section 5 of your paper... > Not exactly. I assume you understand that rebooting 1000 live production servers to find the source of a rare bug or the cause of a performance issue is out of question. 
Moreover, strategies like enabling flight recorder traces on a few nodes on demand to detect performance problems can only be deployed in a production environment if they are part of the standard production kernel. Also, managing two different kernels is often out of the question. Not only is it a maintenance burden, but just switching to the "debug" kernel can impact the system's behavior so badly that it could make the problem disappear. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 2:35 ` Mathieu Desnoyers @ 2007-07-07 4:03 ` Adrian Bunk 2007-07-07 5:02 ` Willy Tarreau 0 siblings, 1 reply; 32+ messages in thread From: Adrian Bunk @ 2007-07-07 4:03 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Chuck Ebbert, Andi Kleen, Andrew Morton, Alexey Dobriyan, linux-kernel, mbligh On Fri, Jul 06, 2007 at 10:35:11PM -0400, Mathieu Desnoyers wrote: > * Adrian Bunk (bunk@stusta.de) wrote: > > On Fri, Jul 06, 2007 at 07:43:15PM -0400, Mathieu Desnoyers wrote: > > > * Adrian Bunk (bunk@stusta.de) wrote: > > > > On Fri, Jul 06, 2007 at 06:14:10PM -0400, Chuck Ebbert wrote: > > > > > On 07/06/2007 07:44 AM, Andi Kleen wrote: > > > > > > I think the optimization is a good idea, although i dislike it > > > > > > that it is complicated for the dynamic markers. If it was just > > > > > > static it would be much simpler. > > > > > > > > > > Another thing to consider is that there might be hundreds of these > > > > > probes/tracepoints active in an instrumented kernel. The overhead > > > > > adds up fast, so the gain may be worth all the pain. > > > > > > > > Only if you want to squeeze the last bit of performance out of > > > > _debugging_ functionality. > > > > > > > > You avoid all the pain if you simply don't use debugging functionality > > > > on production systems. > > > > > > > > > > Adrian, > > > > > > Please have a look at my markers posts, especially: > > > > > > http://www.ussg.iu.edu/hypermail/linux/kernel/0707.0/0669.html > > > > > > And also look into OLS 2007 proceedings for Martin Bligh's paper on > > > Debugging Google sized clusters. It basically makes the case for adding > > > functionnality to debug _user space_ problems on production systems that > > > can be turned on dynamically. > > > > Using a different kernel for tracing still fulfills all the requirements > > listed in section 5 of your paper... > > > > Not exactly. 
I assume you understand that rebooting 1000 live production > servers to find the source of a rare bug or the cause of a performance > issue is out of question. > > Moreover, strategies like enabling flight recorder traces on a few nodes > on demand to detect performance problems can only be deployed in > production environment if they are part of the standard production > kernel. > > Also, managing two different kernels is often out of question. Not only > is it a maintainance burden, but just switching to the "debug" kernel > can impact the system's behavior so badly that it could make the problem > disappear. As can turning tracing on at runtime. And you can always define requirements in a way that your solution is the only one... Let's go to a different point: Your paper says "When not running, must have zero effective impact." How big is the measured impact of your markers when not used, without any immediate voodoo? You have sent many numbers about micro-benchmarks and theoretical numbers, but if you have sent the interesting numbers comparing 1. MARKERS=n 2. MARKERS=y, IMMEDIATE=n 3. MARKERS=y, IMMEDIATE=y in actual benchmark testing I must have missed it. Does 3. have a measurable and effective advantage over 2., or are you optimizing for some 0.01% or 1% performance difference without any effective impact and therefore not required for the goals outlined in your paper? > Mathieu cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values 2007-07-07 4:03 ` Adrian Bunk @ 2007-07-07 5:02 ` Willy Tarreau 0 siblings, 0 replies; 32+ messages in thread From: Willy Tarreau @ 2007-07-07 5:02 UTC (permalink / raw) To: Adrian Bunk Cc: Mathieu Desnoyers, Chuck Ebbert, Andi Kleen, Andrew Morton, Alexey Dobriyan, linux-kernel, mbligh On Sat, Jul 07, 2007 at 06:03:07AM +0200, Adrian Bunk wrote: > On Fri, Jul 06, 2007 at 10:35:11PM -0400, Mathieu Desnoyers wrote: > > * Adrian Bunk (bunk@stusta.de) wrote: > > > On Fri, Jul 06, 2007 at 07:43:15PM -0400, Mathieu Desnoyers wrote: > > > > * Adrian Bunk (bunk@stusta.de) wrote: > > > > > On Fri, Jul 06, 2007 at 06:14:10PM -0400, Chuck Ebbert wrote: > > > > > > On 07/06/2007 07:44 AM, Andi Kleen wrote: > > > > > > > I think the optimization is a good idea, although i dislike it > > > > > > > that it is complicated for the dynamic markers. If it was just > > > > > > > static it would be much simpler. > > > > > > > > > > > > Another thing to consider is that there might be hundreds of these > > > > > > probes/tracepoints active in an instrumented kernel. The overhead > > > > > > adds up fast, so the gain may be worth all the pain. > > > > > > > > > > Only if you want to squeeze the last bit of performance out of > > > > > _debugging_ functionality. > > > > > > > > > > You avoid all the pain if you simply don't use debugging functionality > > > > > on production systems. > > > > > > > > > > > > > Adrian, > > > > > > > > Please have a look at my markers posts, especially: > > > > > > > > http://www.ussg.iu.edu/hypermail/linux/kernel/0707.0/0669.html > > > > > > > > And also look into OLS 2007 proceedings for Martin Bligh's paper on > > > > Debugging Google sized clusters. It basically makes the case for adding > > > > functionnality to debug _user space_ problems on production systems that > > > > can be turned on dynamically. 
> > > > > > Using a different kernel for tracing still fulfills all the requirements > > > listed in section 5 of your paper... > > > > > > > Not exactly. I assume you understand that rebooting 1000 live production > > servers to find the source of a rare bug or the cause of a performance > > issue is out of question. > > > > Moreover, strategies like enabling flight recorder traces on a few nodes > > on demand to detect performance problems can only be deployed in > > production environment if they are part of the standard production > > kernel. > > > > Also, managing two different kernels is often out of question. Not only > > is it a maintainance burden, but just switching to the "debug" kernel > > can impact the system's behavior so badly that it could make the problem > > disappear. > > As can turning tracing on at runtime. > > And you can always define requirements in a way that your solution is > the only one... On large production environments, you always lose a certain percentage of machines at each reboot. Most often, it's the CR2032 lithium battery which is dead and which causes all or parts of the CMOS settings to vanish, hanging the system at boot. Then you play with the ON/OFF switch and a small percentage of the power supplies refuse to restart and some disks refuse to spin up. Fortunately this does not happen with all machines, but if you have such problems with 1% of your machines, you lose 10 machines when you reboot 1000 of them. Those problems require a lot of man power, which explains why such systems are rarely updated. Causing that much trouble just to enable debugging is clearly unacceptable, and your debug kernel will simply never be used. Not to mention the fact that people will never trust it because it's almost never used ! Regards, Willy ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-03 16:40 ` [patch 10/10] Scheduler profiling - Use immediate values Mathieu Desnoyers
  2007-07-03 18:11   ` Alexey Dobriyan
@ 2007-07-04 20:35   ` Alexey Dobriyan
  2007-07-04 22:41     ` Andi Kleen
  1 sibling, 1 reply; 32+ messages in thread
From: Alexey Dobriyan @ 2007-07-04 20:35 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel

On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
> Use immediate values with lower d-cache hit in optimized version as a
> condition for scheduler profiling call.

I think it's better to put profile.c under CONFIG_PROFILING as _expected_,
so CONFIG_PROFILING=n users won't get any overhead, immediate or not.
That's what I'm going to do after test-booting a bunch of kernels.

Thus, enabling the CONFIG_PROFILING option will buy you some overhead,
again, as _expected_.
* Re: [patch 10/10] Scheduler profiling - Use immediate values
  2007-07-04 20:35   ` Alexey Dobriyan
@ 2007-07-04 22:41     ` Andi Kleen
  0 siblings, 0 replies; 32+ messages in thread
From: Andi Kleen @ 2007-07-04 22:41 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: Mathieu Desnoyers, akpm, linux-kernel

Alexey Dobriyan <adobriyan@gmail.com> writes:
> On Tue, Jul 03, 2007 at 12:40:56PM -0400, Mathieu Desnoyers wrote:
> > Use immediate values with lower d-cache hit in optimized version as a
> > condition for scheduler profiling call.
>
> I think it's better to put profile.c under CONFIG_PROFILING as _expected_,
> so CONFIG_PROFILING=n users won't get any overhead, immediate or not.
> That's what I'm going to do after test-booting a bunch of kernels.

No, it's better to handle this efficiently at runtime, e.g. for
distribution kernels. Mathieu's patch is good.

-Andi
Thread overview: 32+ messages
-- links below jump to the message on this page --
[not found] <8CTJM-4Z1-27@gated-at.bofh.it>
[not found] ` <8CU38-5E1-23@gated-at.bofh.it>
[not found] ` <8DjNU-4bL-47@gated-at.bofh.it>
[not found] ` <8DkTC-5Vy-11@gated-at.bofh.it>
2007-07-05 15:23 ` [patch 10/10] Scheduler profiling - Use immediate values Bodo Eggert
2007-07-05 15:46 ` Mathieu Desnoyers
2007-07-06 21:08 ` Adrian Bunk
2007-07-03 16:40 [patch 00/10] Immediate Values Mathieu Desnoyers
2007-07-03 16:40 ` [patch 10/10] Scheduler profiling - Use immediate values Mathieu Desnoyers
2007-07-03 18:11 ` Alexey Dobriyan
2007-07-03 18:57 ` Mathieu Desnoyers
2007-07-04 14:23 ` Adrian Bunk
2007-07-04 20:31 ` Alexey Dobriyan
2007-07-05 20:21 ` Andrew Morton
2007-07-05 20:29 ` Andrew Morton
2007-07-05 20:41 ` Mathieu Desnoyers
2007-07-06 11:44 ` Andi Kleen
2007-07-06 17:50 ` Li, Tong N
2007-07-06 20:03 ` Andi Kleen
2007-07-06 20:57 ` Li, Tong N
2007-07-06 21:03 ` Mathieu Desnoyers
2007-07-06 22:14 ` Chuck Ebbert
2007-07-06 23:28 ` Adrian Bunk
2007-07-06 23:38 ` Dave Jones
2007-07-07 0:10 ` Adrian Bunk
2007-07-07 15:45 ` Frank Ch. Eigler
2007-07-07 17:01 ` Adrian Bunk
2007-07-07 17:20 ` Willy Tarreau
2007-07-07 17:59 ` Adrian Bunk
2007-07-07 17:55 ` Frank Ch. Eigler
2007-07-06 23:43 ` Mathieu Desnoyers
2007-07-07 2:25 ` Adrian Bunk
2007-07-07 2:35 ` Mathieu Desnoyers
2007-07-07 4:03 ` Adrian Bunk
2007-07-07 5:02 ` Willy Tarreau
2007-07-04 20:35 ` Alexey Dobriyan
2007-07-04 22:41 ` Andi Kleen