All of lore.kernel.org
 help / color / mirror / Atom feed
From: Waiman Long <waiman.long@hp.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Jeff Layton <jlayton@redhat.com>,
	Miklos Szeredi <mszeredi@suse.cz>, Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andi Kleen <andi@firstfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>
Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount
Date: Tue, 03 Sep 2013 11:14:54 -0400	[thread overview]
Message-ID: <5225FCEE.7030901@hp.com> (raw)
In-Reply-To: <20130903060130.GD16261@gmail.com>

On 09/03/2013 02:01 AM, Ingo Molnar wrote:
> * Waiman Long<waiman.long@hp.com>  wrote:
>
>> Yes, that patch worked. It eliminated the lglock as a bottleneck in 
>> the AIM7 workload. The lg_global_lock did not show up in the perf 
>> profile, whereas the lg_local_lock was only 0.07%. 
> Just curious: what's the worst bottleneck now in the optimized kernel? :-)
>
> Thanks,
>
> 	Ingo
With the following patches on v3.11:
1. Linus's version of lockref patch
2. Al's lglock patch
3. My preliminary patch to convert prepend_path under RCU

The perf profile of the kernel portion of the short workload in a 
80-core system became like this:

     29.87%     reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
                   |--50.00%-- tty_ldisc_deref
                   |--49.01%-- tty_ldisc_try
                    --0.99%-- [...]

      7.55%   swapper  [kernel.kallsyms]  [k] intel_idle
      1.03%     reaim  [kernel.kallsyms]  [k] copy_user_generic_string
      0.91%     reaim  [kernel.kallsyms]  [k] _raw_spin_lock
                   |--15.88%-- __rcu_process_callbacks
                   |--6.55%-- load_balance
                   |--6.02%-- sem_lock
                   |--4.77%-- enqueue_to_backlog
                   |--4.21%-- task_rq_lock
                   |--3.97%-- process_backlog
                   |--3.35%-- unix_dgram_sendmsg
                   |--3.28%-- kmem_cache_free
                   |--3.16%-- tcp_v4_rcv
                   |--2.77%-- unix_stream_sendmsg
                   |--2.36%-- rcu_accelerate_cbs
                   |--2.02%-- do_wp_page
                   |--2.02%-- unix_create1
                   |--1.83%-- unix_peer_get
                   |--1.67%-- udp_lib_get_port
                   |--1.66%-- unix_stream_recvmsg
                   |--1.63%-- handle_pte_fault
                   |--1.63%-- udp_queue_rcv_skb
                   |--1.54%-- unix_release_sock
                   |--1.48%-- try_to_wake_up
                   |--1.37%-- do_anonymous_page
                   |--1.37%-- new_inode_pseudo
                   |--1.33%-- __d_lookup
                   |--1.20%-- free_one_page
                   |--1.11%-- __do_fault
                   |--1.06%-- scheduler_tick
                   |--0.90%-- __drain_alien_cache
                   |--0.81%-- inet_csk_get_port
                   |--0.76%-- sock_alloc
                   |--0.76%-- shmem_lock
                   |--0.75%-- __d_instantiate
                   |--0.70%-- __inet_hash_connect
                   |--0.69%-- __inet_hash_nolisten
                   |--0.68%-- ip_local_deliver_finish
                   |--0.64%-- inet_hash
                   |--0.64%-- kfree
                   |--0.60%-- d_path
                   |--0.58%-- __close_fd
                   |--0.51%-- evict
                    --11.76%-- [...]

      0.51%     reaim  [ip_tables]        [k] ipt_do_table
      0.46%     reaim  [kernel.kallsyms]  [k] __alloc_skb
      0.38%     reaim  [kernel.kallsyms]  [k] kfree
      0.36%     reaim  [kernel.kallsyms]  [k] kmem_cache_free
      0.34%     reaim  [kernel.kallsyms]  [k] system_call_after_swapg
      0.32%     reaim  [kernel.kallsyms]  [k] fsnotify
      0.32%     reaim  [kernel.kallsyms]  [k] ip_finish_output
      0.27%     reaim  [kernel.kallsyms]  [k] system_call

Other than the global tty_ldisc_lock, there is no other major
bottleneck. I am not that worry about the tty_ldisc_lock bottleneck
as real world applications probably won't have that many calls to
set the tty driver.

Regards,
Longman


  parent reply	other threads:[~2013-09-03 15:15 UTC|newest]

Thread overview: 154+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-06  3:12 [PATCH v7 0/4] Lockless update of reference count protected by spinlock Waiman Long
2013-08-06  3:12 ` [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount Waiman Long
2013-08-29  1:40   ` Linus Torvalds
2013-08-29  4:44     ` Benjamin Herrenschmidt
2013-08-29  7:00       ` Ingo Molnar
2013-08-29 16:43         ` Linus Torvalds
2013-08-29 19:25           ` Linus Torvalds
2013-08-29 23:42             ` Linus Torvalds
2013-08-30  0:26               ` Benjamin Herrenschmidt
2013-08-30  0:49                 ` Linus Torvalds
2013-08-30  2:06                   ` Michael Neuling
2013-08-30  2:30                     ` Benjamin Herrenschmidt
2013-08-30  2:35                       ` Linus Torvalds
2013-08-30  2:45                         ` Benjamin Herrenschmidt
2013-08-30  2:31                     ` Linus Torvalds
2013-08-30  2:43                       ` Benjamin Herrenschmidt
2013-08-30  7:16                   ` Ingo Molnar
2013-08-30 15:28                     ` Linus Torvalds
2013-08-30  3:12               ` Waiman Long
2013-08-30  3:54                 ` Linus Torvalds
2013-08-30  7:55                   ` Sedat Dilek
2013-08-30  8:10                     ` Sedat Dilek
2013-08-30  9:27                     ` Sedat Dilek
2013-08-30  9:48                       ` Ingo Molnar
2013-08-30  9:56                         ` Sedat Dilek
2013-08-30  9:58                           ` Sedat Dilek
2013-08-30 10:29                             ` Sedat Dilek
2013-08-30 10:36                               ` Peter Zijlstra
2013-08-30 10:44                                 ` Sedat Dilek
2013-08-30 10:46                                   ` Sedat Dilek
2013-08-30 10:52                                   ` Peter Zijlstra
2013-08-30 10:57                                     ` Sedat Dilek
2013-08-30 14:05                                       ` Sedat Dilek
2013-08-30 11:19                                 ` Sedat Dilek
2013-08-30 10:38                               ` Sedat Dilek
2013-08-30 15:34                       ` Linus Torvalds
2013-08-30 15:38                         ` Sedat Dilek
2013-08-30 16:12                           ` Steven Rostedt
2013-08-30 16:16                             ` Sedat Dilek
2013-08-30 18:42                             ` Linus Torvalds
2013-08-30 16:32                           ` Linus Torvalds
2013-08-30 16:37                             ` Sedat Dilek
2013-08-30 16:52                               ` Linus Torvalds
2013-08-30 17:11                                 ` Sedat Dilek
2013-08-30 17:26                                   ` Linus Torvalds
2013-09-01 10:01                                 ` Sedat Dilek
2013-09-01 10:33                                   ` Sedat Dilek
2013-09-01 15:32                                   ` Linus Torvalds
2013-09-01 15:45                                     ` Sedat Dilek
2013-09-01 15:55                                       ` Linus Torvalds
2013-09-02 10:30                                         ` Sedat Dilek
2013-09-02 16:09                                           ` David Ahern
2013-09-02 16:09                                             ` David Ahern
2013-09-01 20:59                                     ` Linus Torvalds
2013-09-01 21:23                                       ` Al Viro
2013-09-01 22:16                                         ` Linus Torvalds
2013-09-01 22:35                                           ` Al Viro
2013-09-01 22:44                                             ` Al Viro
2013-09-01 22:58                                               ` Linus Torvalds
2013-09-01 22:48                                           ` Linus Torvalds
2013-09-01 23:30                                             ` Al Viro
2013-09-02  0:12                                               ` Linus Torvalds
2013-09-02  0:50                                                 ` Linus Torvalds
2013-09-02  0:50                                                   ` Linus Torvalds
2013-09-02  7:05                                                   ` Ingo Molnar
2013-09-02 16:44                                                     ` Linus Torvalds
2013-09-03 10:15                                                       ` Ingo Molnar
2013-09-03 15:41                                                         ` Linus Torvalds
2013-09-03 18:34                                                           ` Linus Torvalds
2013-09-03 19:19                                                             ` Ingo Molnar
2013-09-03 21:05                                                               ` Linus Torvalds
2013-09-03 21:13                                                                 ` Linus Torvalds
2013-09-03 21:34                                                                   ` Linus Torvalds
2013-09-03 21:39                                                                     ` Linus Torvalds
2013-09-03 14:08                                                       ` Pavel Machek
2013-09-03 22:37                                     ` Sedat Dilek
2013-09-03 22:55                                       ` Dave Jones
2013-09-03 23:05                                         ` Sedat Dilek
2013-09-03 23:15                                           ` Dave Jones
2013-09-03 23:20                                             ` Sedat Dilek
2013-09-03 23:45                                       ` Sedat Dilek
2013-08-30 18:33                   ` Waiman Long
2013-08-30 18:53                     ` Linus Torvalds
2013-08-30 19:20                       ` Waiman Long
2013-08-30 19:33                         ` Linus Torvalds
2013-08-30 20:15                           ` Waiman Long
2013-08-30 20:43                             ` Linus Torvalds
2013-08-30 20:54                               ` Al Viro
2013-08-30 21:03                                 ` Linus Torvalds
2013-08-30 21:44                                   ` Al Viro
2013-08-30 22:30                                     ` Linus Torvalds
2013-08-31 21:23                                       ` Al Viro
2013-08-31 22:49                                         ` Linus Torvalds
2013-08-31 23:27                                           ` Al Viro
2013-09-01  0:13                                             ` Al Viro
2013-09-01 17:48                                               ` Al Viro
2013-09-09  8:30                                               ` Peter Zijlstra
2013-08-30 21:10                                 ` Waiman Long
2013-08-30 21:22                                   ` Linus Torvalds
2013-08-30 21:30                                   ` Al Viro
2013-08-30 21:42                                     ` Waiman Long
2013-08-30 19:40                         ` Al Viro
2013-08-30 19:52                           ` Waiman Long
2013-08-30 20:26                             ` Al Viro
2013-08-30 20:35                               ` Waiman Long
2013-08-30 20:48                                 ` Al Viro
2013-08-31  2:02                                   ` Waiman Long
2013-08-31  2:35                                     ` Al Viro
2013-08-31  2:42                                       ` Al Viro
2013-09-02 19:25                                         ` Waiman Long
2013-09-03  6:01                                           ` Ingo Molnar
2013-09-03  7:24                                             ` Sedat Dilek
2013-09-03 15:38                                               ` Linus Torvalds
2013-09-03 15:14                                             ` Waiman Long [this message]
2013-09-03 15:34                                               ` Linus Torvalds
2013-09-03 19:09                                                 ` Linus Torvalds
2013-09-03 21:01                                                   ` Waiman Long
2013-09-04 14:52                                                   ` Waiman Long
2013-09-04 15:14                                                     ` Linus Torvalds
2013-09-04 19:25                                                       ` Waiman Long
2013-09-04 21:34                                                         ` Linus Torvalds
2013-09-05  2:35                                                           ` Waiman Long
2013-09-05 13:31                                                     ` Ingo Molnar
2013-09-05 17:33                                                       ` Waiman Long
2013-09-05 17:40                                                         ` Ingo Molnar
2013-09-03 22:41                                               ` Sedat Dilek
2013-09-03 23:11                                                 ` Sedat Dilek
2013-09-03 23:11                                                   ` Sedat Dilek
2013-09-08 21:45               ` Linus Torvalds
2013-09-09  0:03                 ` Al Viro
2013-09-09  0:25                   ` Linus Torvalds
2013-09-09  0:35                     ` Al Viro
2013-09-09  0:38                       ` Linus Torvalds
2013-09-09  0:57                         ` Al Viro
2013-09-09  2:09                     ` Ramkumar Ramachandra
2013-09-09  0:30                   ` Al Viro
2013-09-09  3:32                   ` Linus Torvalds
2013-09-09  4:06                     ` Ramkumar Ramachandra
2013-09-09  5:44                     ` Al Viro
2013-08-30 17:17           ` Peter Zijlstra
2013-08-30 17:28             ` Linus Torvalds
2013-08-30 17:33               ` Linus Torvalds
2013-08-29 15:20     ` Waiman Long
2013-08-06  3:12 ` [PATCH v7 2/4] spinlock: Enable x86 architecture to do lockless refcount update Waiman Long
2013-08-06  3:12 ` [PATCH v7 3/4] dcache: replace d_lock/d_count by d_lockcnt Waiman Long
2013-08-06  3:12 ` [PATCH v7 4/4] dcache: Enable lockless update of dentry's refcount Waiman Long
2013-08-13 18:03 ` [PATCH v7 0/4] Lockless update of reference count protected by spinlock Waiman Long
  -- strict thread matches above, loose matches on Subject: below --
2013-08-31  3:06 [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount George Spelvin
2013-08-31 17:16 ` Linus Torvalds
2013-09-01  8:50   ` George Spelvin
2013-09-01 11:10     ` Theodore Ts'o
2013-09-01 15:49       ` Linus Torvalds
2013-09-01 18:11         ` Steven Rostedt
2013-09-01 20:03           ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5225FCEE.7030901@hp.com \
    --to=waiman.long@hp.com \
    --cc=andi@firstfloor.org \
    --cc=aswin@hp.com \
    --cc=benh@kernel.crashing.org \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mszeredi@suse.cz \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=scott.norton@hp.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.