* A quick view of the performance benchmark for semaphore-like and mutex
@ 2012-04-17  9:36 Chen, Dennis (SRDC SW)
  2012-04-17 10:09 ` Peter Zijlstra
  2012-04-17 10:12 ` Peter Zijlstra
  0 siblings, 2 replies; 4+ messages in thread
From: Chen, Dennis (SRDC SW) @ 2012-04-17  9:36 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org
  Cc: Ingo Molnar, paulmck@linux.vnet.ibm.com, peterz@infradead.org,
	Paul Mackerras, Arnaldo Carvalho de Melo

As a quick and rough test, I applied the change below to the mutex code. It disables optimistic spinning, which makes the mutex behave almost the same as a semaphore:

--- /home/dennis/Linux/linux-3.3.2-sem/kernel/mutex.c   2012-04-17 14:59:49.823177615 +0800
+++ ./mutex.c   2012-04-17 17:00:12.963059284 +0800
@@ -140,6 +140,7 @@ __mutex_lock_common(struct mutex *lock,
        preempt_disable();
        mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
 
+#if 0
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
        /*
         * Optimistic spinning.
@@ -195,6 +196,7 @@ __mutex_lock_common(struct mutex *lock,
                arch_mutex_cpu_relax();
        }
 #endif
+#endif
        spin_lock_mutex(&lock->wait_lock, flags);
 
        debug_mutex_lock_common(lock, &waiter);


#perf record -a perf bench locking mutex -p 8 -t 3000

The benchmark results BEFORE (stock mutex, optimistic spinning enabled)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
round 1:
Total duration     39868 s   536095 us

real: 15.89   s
user: 0.00   
sys:  0.31  

Events: 64K cycles
 20.18%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath                                                      
  8.41%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  8.00%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                               
  5.29%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  2.88%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.56%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath                                                     
  2.31%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner                                                         
  2.29%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  1.68%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.33%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.33%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                               
  1.06%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.04%           perf  [kernel.kallsyms]                  [k] do_lookup                        
  ...
-------------------------------------------------------------------------------------
round 2:
Total duration     39748 s   176410 us

real: 15.92   s
user: 0.00   
sys:  0.32

Events: 63K cycles
 19.68%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath                                                      
  8.53%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  7.74%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                               
  5.09%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  3.06%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.54%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath                                                     
  2.31%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner                                                         
  2.30%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  1.76%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.46%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                               
  1.31%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.10%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.08%           perf  [kernel.kallsyms]                  [k] do_lookup  
  ...
-------------------------------------------------------------------------------------
round 3:
Total duration     40047 s   394364 us

real: 15.59   s
user: 0.00   
sys:  0.30   

Events: 58K cycles
 19.18%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath                                                      
  8.68%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  7.80%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                               
  5.24%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  3.22%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.57%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath                                                     
  2.38%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  2.13%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner                                                         
  1.79%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.54%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                               
  1.34%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.12%           perf  [kernel.kallsyms]                  [k] do_lookup                                                                  
  1.04%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.02%           perf  [kernel.kallsyms]                  [k] system_call                                                                 
  1.02%           perf  [kernel.kallsyms]                  [k] get_page_from_freelist
  ...

The benchmark results AFTER (optimistic spinning removed from the mutex)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
round 1:
Total duration     66319 s   868892 us

 real: 23.16   s
 user: 0.00   
 sys:  0.29  

Events: 81K cycles
  6.30%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  3.13%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                                
  3.09%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  3.07%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.66%        swapper  [kernel.kallsyms]                  [k] intel_idle                                                                  
  2.21%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.80%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                                
  1.58%           perf  [kernel.kallsyms]                  [k] system_call                                                                 
  1.56%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.53%           perf  [kernel.kallsyms]                  [k] do_lookup                                                                   
  1.47%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.43%           perf  [kernel.kallsyms]                  [k] get_page_from_freelist                                                      
  1.28%           perf  libc-2.13.so                       [.] 0xa99f6                                                                     
  1.19%        swapper  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  1.15%           perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock                                                    
  1.12%           perf  [kernel.kallsyms]                  [k] kfree        
  ...   
-------------------------------------------------------------------------------------
round 2:
Total duration     67448 s   392232 us

 real: 23.21   s
 user: 0.00   
 sys:  0.29

Events: 82K cycles
  6.23%             perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                            
  3.23%             perf  [kernel.kallsyms]                  [k] mutex_unlock                                                              
  3.10%             perf  [kernel.kallsyms]                  [k] mutex_lock                                                                
  3.10%             perf  [kernel.kallsyms]                  [k] link_path_walk                                                            
  2.59%          swapper  [kernel.kallsyms]                  [k] intel_idle                                                                
  2.18%             perf  [kernel.kallsyms]                  [k] __d_lookup                                                                
  1.88%             perf  [kernel.kallsyms]                  [k] clear_page_c                                                              
  1.60%             perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                       
  1.50%             perf  [kernel.kallsyms]                  [k] system_call                                                               
  1.48%             perf  [kernel.kallsyms]                  [k] dput                                                                      
  1.44%             perf  [kernel.kallsyms]                  [k] do_lookup                                                                 
  1.33%             perf  [kernel.kallsyms]                  [k] get_page_from_freelist                                                    
  1.29%             perf  libc-2.13.so                       [.] 0x82715                                                                   
  1.19%          swapper  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                    
  1.11%             perf  [kernel.kallsyms]                  [k] kfree                                                                     
  1.10%             perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock                                                  
  1.01%             perf  [kernel.kallsyms]                  [k] __alloc_pages_nodemask
  ...
-------------------------------------------------------------------------------------
round 3:
Total duration     66468 s   532417 us

 real: 23.35   s
 user: 0.00   
 sys:  0.28

Events: 87K cycles
  6.30%             perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                            
  3.09%             perf  [kernel.kallsyms]                  [k] mutex_unlock                                                              
  2.98%             perf  [kernel.kallsyms]                  [k] link_path_walk                                                            
  2.98%             perf  [kernel.kallsyms]                  [k] mutex_lock                                                                
  2.70%          swapper  [kernel.kallsyms]                  [k] intel_idle                                                                
  2.25%             perf  [kernel.kallsyms]                  [k] __d_lookup                                                                
  1.92%             perf  [kernel.kallsyms]                  [k] clear_page_c                                                              
  1.56%             perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                       
  1.47%             perf  [kernel.kallsyms]                  [k] dput                                                                      
  1.47%             perf  [kernel.kallsyms]                  [k] system_call                                                               
  1.42%             perf  [kernel.kallsyms]                  [k] do_lookup                                                                 
  1.35%             perf  [kernel.kallsyms]                  [k] get_page_from_freelist                                                    
  1.32%             perf  libc-2.13.so                       [.] 0x12902e                                                                  
  1.32%          swapper  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                    
  1.10%             perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock                                                  
  1.02%             perf  [kernel.kallsyms]                  [k] kfree                                                                     
  1.00%             perf  [kernel.kallsyms]                  [k] __alloc_pages_nodemask

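Interesting!! The semaphore-like version is more than 7 s slower than the mutex in real (wall-clock) time: roughly 23.2 s versus 15.8 s. The number of cycles events reported by perf also differs (81K to 87K versus 58K to 64K).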
