From: Ingo Molnar <mingo@kernel.org>
To: "Chen, Dennis (SRDC SW)" <Dennis1.Chen@amd.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"paulmck@linux.vnet.ibm.com" <paulmck@linux.vnet.ibm.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	Paul Mackerras <paulus@samba.org>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Subject: Re: [PATCH 0/2] tools perf: Add a new benchmark tool for semaphore/mutex
Date: Mon, 16 Apr 2012 11:24:37 +0200
Message-ID: <20120416092437.GB27526@gmail.com>
In-Reply-To: <491D6B4EAD0A714894D8AD22F4BDE043B158BE@SCYBEXDAG03.amd.com>


* Chen, Dennis (SRDC SW) <Dennis1.Chen@amd.com> wrote:

> <PATCH PREFACE>
> -------------------
> This patch series adds a new performance benchmark tool for semaphores and mutexes.
> The new tool forks the number of tasks specified on the command line and distributes
> them evenly across the CPUs in the system. The command to launch the tool looks like:
> '# perf bench locking mutex -p 8 -t 400 -c'
> 
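
The placement described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the preface only says the tasks are spread equally over the CPUs, so the round-robin rule (task i goes to CPU i mod nr_cpus) and the names assign_cpus and tasks_per_cpu are assumptions made for the example.

```python
from collections import Counter


def assign_cpus(nr_tasks, nr_cpus):
    """Return the CPU index chosen for each task (round-robin, an assumption)."""
    if nr_cpus <= 0:
        raise ValueError("nr_cpus must be positive")
    return [task % nr_cpus for task in range(nr_tasks)]


def tasks_per_cpu(nr_tasks, nr_cpus):
    """Count how many tasks land on each CPU."""
    return Counter(assign_cpus(nr_tasks, nr_cpus))


if __name__ == "__main__":
    # Mirrors '-p 8 -t 400' from the command line quoted above.
    counts = tasks_per_cpu(400, 8)
    for cpu in sorted(counts):
        print(f"CPU {cpu}: {counts[cpu]} tasks")
```

With 400 tasks and 8 CPUs every CPU ends up with 50 tasks. On Linux, a forked child could then pin itself to its CPU with os.sched_setaffinity(0, {cpu}); that call is left out here so the sketch runs anywhere.
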
> The above command creates 400 tasks on an 8-CPU system, so each CPU gets 50 tasks.
> After a task is created, it reads all the files and directories in '/sys/module'.
> sysfs is RAM-based and its read operations for both directories and files are very
> sensitive to mutex locking; '/sys/module' also has almost no dependencies on external devices.
> 
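
The per-task workload described above, reading everything under '/sys/module', might look roughly like the sketch below. It is only an illustration of the idea, not the benchmark's implementation: the function name read_tree, the 4096-byte read size, and the choice to skip unreadable files are all assumptions.

```python
import os
import time


def read_tree(root, chunk=4096):
    """Walk root and read up to `chunk` bytes from every file found.

    Returns (files_read, bytes_read). Files that cannot be opened or
    read (for example write-only sysfs attributes) are skipped.
    """
    files_read = 0
    bytes_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    data = f.read(chunk)
            except OSError:
                continue
            files_read += 1
            bytes_read += len(data)
    return files_read, bytes_read


if __name__ == "__main__":
    start = time.perf_counter()
    files, size = read_tree("/sys/module")
    elapsed = time.perf_counter() - start
    print(f"read {files} files, {size} bytes in {elapsed:.6f} s")
```

Running many copies of such a loop concurrently is what would generate the lock traffic the preface is after; a single run mostly shows how many entries the tree contains.
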
> We can use this tool with the 'perf record' command to find the hot spots in the code, or
> with 'perf top -g' to get live info. For example, below is a test case run on an Intel i7-2600 box
> (the -c option collects CPU cycles; I don't use it in this test case):
> 
> # perf record -a perf bench locking mutex -p 8 -t 4000
> # Running locking/mutex benchmark... 
>  ...
>  [13894 ]/6  duration        23 s   609392 us
>  [13996 ]/4  duration        23 s   599418 us
>  [14056 ]/0  duration        23 s   595710 us
>  [13715 ]/3  duration        23 s   621719 us
>  [13390 ]/6  duration        23 s   644020 us
>  [13696 ]/0  duration        23 s   623101 us
>  [14334 ]/6  duration        23 s   580262 us
>  [14343 ]/7  duration        23 s   578702 us
>  [14283 ]/3  duration        23 s   583007 us
>  -----------------------------------
>  Total duration     79353 s   943945 us
> 
>  real: 23.84   s
>  user: 0.00   
>  sys:  0.45   
> 
> # perf report
> ===================================================================================
> ...
> # perf version : 3.3.2
> # arch : x86_64
> # nrcpus online : 8
> # nrcpus avail : 8
> # cpudesc : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> # total memory : 3966460 kB
> # cmdline : /usr/bin/perf record -a perf bench locking mutex -p 8 -t 4000
> 
> # Events: 131K cycles
> #
> # Overhead          Command                      Shared Object                                 Symbol
> # ........  ...............  .................................  .....................................
> #
>     22.12%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath
>      8.27%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock
>      6.16%           perf  [kernel.kallsyms]                  [k] mutex_unlock
>      5.22%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner
>      4.94%           perf  [kernel.kallsyms]                  [k] sysfs_refresh_inode
>      4.82%           perf  [kernel.kallsyms]                  [k] mutex_lock
>      2.67%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath
>      2.61%           perf  [kernel.kallsyms]                  [k] link_path_walk
>      2.42%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave
>      1.61%           perf  [kernel.kallsyms]                  [k] __d_lookup
>      1.18%           perf  [kernel.kallsyms]                  [k] clear_page_c
>      1.16%           perf  [kernel.kallsyms]                  [k] dput
>      0.97%           perf  [kernel.kallsyms]                  [k] do_lookup
>      0.93%        swapper  [kernel.kallsyms]                  [k] intel_idle
>      0.87%           perf  [kernel.kallsyms]                  [k] get_page_from_freelist
>      0.85%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user
>      0.81%           perf  [kernel.kallsyms]                  [k] system_call
>      0.78%           perf  libc-2.13.so                       [.] 0x84ef0         
>      0.71%           perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock
>      0.68%           perf  [kernel.kallsyms]                  [k] sysfs_dentry_revalidate
>      0.62%           perf  [kernel.kallsyms]                  [k] try_to_wake_up
>      0.62%           perf  [kernel.kallsyms]                  [k] kfree
>      0.60%           perf  [kernel.kallsyms]                  [k] kmem_cache_alloc   
> ............................................................................................
> 
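
To put a number on the mutex share of the profile above, the rows can be summed by symbol name. The sketch below is a rough aid, not part of perf: parse_rows and overhead_matching are hypothetical helpers, the sample holds only the top seven rows copied from the report, and matching on the substring "mutex" is an assumed proxy for "time spent in mutex code".

```python
# Top rows of the 'perf report' output quoted above (whitespace condensed).
REPORT = """\
    22.12%  perf  [kernel.kallsyms]  [k] __mutex_lock_slowpath
     8.27%  perf  [kernel.kallsyms]  [k] _raw_spin_lock
     6.16%  perf  [kernel.kallsyms]  [k] mutex_unlock
     5.22%  perf  [kernel.kallsyms]  [k] mutex_spin_on_owner
     4.94%  perf  [kernel.kallsyms]  [k] sysfs_refresh_inode
     4.82%  perf  [kernel.kallsyms]  [k] mutex_lock
     2.67%  perf  [kernel.kallsyms]  [k] __mutex_unlock_slowpath
"""


def parse_rows(text):
    """Return (overhead_percent, symbol) for each data row of the report."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row looks like: overhead, command, object, [k]/[.], symbol.
        if len(fields) < 5 or not fields[0].endswith("%"):
            continue
        rows.append((float(fields[0].rstrip("%")), fields[-1]))
    return rows


def overhead_matching(text, needle):
    """Sum the overhead of all symbols whose name contains `needle`."""
    return sum(pct for pct, sym in parse_rows(text) if needle in sym)


if __name__ == "__main__":
    print(f"mutex symbols: {overhead_matching(REPORT, 'mutex'):.2f}%")
```

The five mutex symbols add up to roughly 41% of the sampled cycles. Some of the _raw_spin_lock time may also come from the mutex slow path, but a flat profile like this one cannot attribute it; call graphs (perf record -g) would be needed for that.
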

Nice! Would be nice to lift some of this information over into 
the changelogs, to address my complaints in the previous mail.

> We can see that 4000 tasks running on 8 CPUs simultaneously create very heavy
> contention for the mutex lock, so lots of tasks enter the slow path of the mutex lock...
> I am very curious how things would go if we switched the mutex to a semaphore in this case.
> My next plan

Seems like an unfinished sentence.

Thanks,

	Ingo
