From: Waiman Long <waiman.long-VXdhtT5mjnY@public.gmane.org>
To: Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>
Cc: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>,
	Alexander Viro
	<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Miklos Szeredi <mszeredi-AlSwsSmVLrQ@public.gmane.org>,
	Ian Kent <raven-PKsaG3nR2I+sTnJN9+BGXg@public.gmane.org>,
	Sage Weil <sage-4GqslpFJ+cxBDgjK7y7TUQ@public.gmane.org>,
	Steve French <sfrench-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>,
	Trond Myklebust
	<Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>,
	Eric Paris <eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	autofs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"Chandramouleeswaran, Aswin" <aswin-VXdhtT5mjnY@public.gmane.org>,
	"Norton, Scott J" <scott.norton-VXdhtT5mjnY@public.gmane.org>
Subject: Re: [PATCH 0/3 v3] dcache: make it more scalable on large system
Date: Wed, 29 May 2013 16:23:02 -0400
Message-ID: <51A663A6.90904@hp.com>
In-Reply-To: <20130529161358.GJ6123-1g7Xle2YJi4/4alezvVtWx2eb7JE58TQ@public.gmane.org>

On 05/29/2013 12:13 PM, Andi Kleen wrote:
>> The d_path() is called by perf_event_mmap_event() which translates
>> VMA to its file path for memory segments backed by files. As perf is
>> not just for sampling data within the kernel, it can also be used
>> for checking access pattern in the user space. As a result, it needs
>> to map VMAs back to the backing files to access their symbols
>> information. If d_path() is not the right function to call for this
>> purpose, what other alternatives do we have?
> In principle it should be only called for new file mappings
> getting mapped.  Do you really have that many new file mappings all
> the time? Or is this related to program startup?

The AIM7 benchmark that I used runs a large number of relatively short
jobs. I think each time a new job is spawned, the file mappings have to
be set up again. It is probably not a big problem for long-running
processes.

>> My patch set consists of 2 different changes. The first one is to
>> avoid taking the d_lock when updating the reference count in
>> the dentries. This particular change also benefits some other
>> workloads that are filesystem intensive. One particular example is
>> the short workload in the AIM7 benchmark. One of the job types in the
>> short workload is "misc_rtns_1" which calls security functions like
>> getpwnam(), getpwuid(), getgrgid() a couple of times. These
>> functions open the /etc/passwd or /etc/group files, read their
>> content and close the files. It is the intensive open/read/close
>> sequence from multiple threads that is causing 80%+ contention in
>> the d_lock on a system with a large number of cores. MIT's
>> MOSBench paper also outlined dentry reference counting as a
> The paper was before Nick Piggin's RCU (and our) work on this.
> Modern kernels do not have dcache problems with mosbench, unless
> you run weird security modules like SMACK that effectively
> disable dcache RCU.

I have tried, but have not yet been able to run MOSBench myself. Thanks
for letting me know that the dcache problem wrt MOSBench was fixed.
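
To illustrate the first change quoted above (updating the reference
count without taking d_lock), here is a minimal user-space sketch. It is
not the code from the patch: struct obj, ref_get() and ref_put() are
hypothetical names, a pthread mutex stands in for d_lock, and the
thresholds for when the lock can be skipped are assumptions made for
illustration. The idea is that the count is changed with a
compare-and-swap as long as it cannot reach zero, and the lock is taken
only for the remaining transitions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry: a lock plus a reference count. */
struct obj {
    pthread_mutex_t lock;  /* plays the role of d_lock */
    atomic_int count;      /* plays the role of d_count */
};

/* Take a reference without the lock if the object is already in use. */
static bool ref_get_lockless(struct obj *o)
{
    int old = atomic_load(&o->count);

    /* Assumption: a non-zero count may be raised without the lock. */
    while (old > 0) {
        /* On failure the CAS reloads 'old', so the check is repeated. */
        if (atomic_compare_exchange_weak(&o->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference without the lock if the count stays above zero. */
static bool ref_put_lockless(struct obj *o)
{
    int old = atomic_load(&o->count);

    while (old > 1) {
        if (atomic_compare_exchange_weak(&o->count, &old, old - 1))
            return true;
    }
    return false;
}

/* Fast path first; fall back to the lock for the 0 -> 1 transition. */
static void ref_get(struct obj *o)
{
    if (ref_get_lockless(o))
        return;
    pthread_mutex_lock(&o->lock);
    atomic_fetch_add(&o->count, 1);
    pthread_mutex_unlock(&o->lock);
}

/* Fast path first; the final put (1 -> 0) is done under the lock. */
static void ref_put(struct obj *o)
{
    if (ref_put_lockless(o))
        return;
    pthread_mutex_lock(&o->lock);
    assert(atomic_load(&o->count) > 0);
    atomic_fetch_sub(&o->count, 1);
    /* A real implementation would decide here whether to free. */
    pthread_mutex_unlock(&o->lock);
}
```

In a workload like the open/read/close loop described above, most
get/put pairs would hit the lockless path, so the lock is contended only
around the first and last reference.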

> BTW lock elision may fix these problems anyways, in a much
> simpler way.

I certainly hope so. However, there will still be a lot of
computers out there running pre-Haswell Intel chips. For them, locking
is still a problem that needs to be solved.

Regards,
Longman

