public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>,
	Kay Sievers <kay.sievers@vrfy.org>,
	linux-kernel@vger.kernel.org,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	linux-fsdevel@vger.kernel.org,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Benjamin LaHaise <bcrl@lhnet.ca>, Serge Hallyn <serue@us.ibm.com>,
	"Eric W. Biederman" <ebiederm@aristanetworks.com>
Subject: Re: [PATCH 3/7] sysfs: Keep an nlink count on sysfs directories.
Date: Mon, 11 Jan 2010 17:02:51 -0800	[thread overview]
Message-ID: <m1y6k43zk4.fsf@fess.ebiederm.org> (raw)
In-Reply-To: <4B4BC683.7060508@kernel.org> (Tejun Heo's message of "Tue\, 12 Jan 2010 09\:46\:59 +0900")

Tejun Heo <tj@kernel.org> writes:

> Hello,
>
> On 01/12/2010 05:21 AM, Eric W. Biederman wrote:
>> On large directories sysfs_count_nlinks can be a significant
>> bottleneck, so keep a count in sysfs_dirent.
>
> I was about to suggest changing s_flags to ushort too.  Hmmm... adding
> a new field to sysfs_dirent somewhat worries me but this doesn't add
> to the size of the structure.  How significant bottlenect are we
> talking about?

It was seen in measurements of sysfs before my last round of changes,
which cause us to refresh the inode, and call sysfs_count_nlink more
often.

I am surprised no one has complained about 2.6.33-rcN yet and reported
a performance regression.

Ultimately not having a cached nlink count transforms what should
be constant time operations to operations that run in time O(N).

>> If we exceed the maximum number of directory entries we can store
>> return nlink of 1.  An nlink of 1 matches what reiserfs does in this
>> case, and it let's find and similar utlities know that we have a the
>> directory nlink can not be used for optimization purposes.
>
> Hmmm... what's the limit on reiserfs?  Is it 64k too?

The resierfs limit is a bit short of a 32bit number.  Ext[234]
all have a 16bit nlink field, and they fail the operation
when you attempt to increment nlink past their limit.

In this case the comparison with reiserfs is to show that at some
point throwing up our hands and not counting and just returning nlink
1 is something userspace can occassionally expect to see.  It is
common enough that find has handled this idiom for years.

Since we can handle this without increasing the size of the sysfs_dirent
I figure we should have a good quality of implementation for the common
case and return something userspace can deal with for the extreme cases.

Eric

  parent reply	other threads:[~2010-01-12  1:03 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-11 20:16 [PATCH 0/7] General sysfs enhancements Eric W. Biederman
2010-01-11 20:21 ` [PATCH 1/7] sysfs: Serialize updates to the vfs inode Eric W. Biederman
2010-01-12  5:41   ` Serge E. Hallyn
2010-01-11 20:21 ` [PATCH 2/7] sysfs: Pack sysfs_dirent more tightly Eric W. Biederman
2010-01-12  0:41   ` Tejun Heo
2010-01-11 20:21 ` [PATCH 3/7] sysfs: Keep an nlink count on sysfs directories Eric W. Biederman
2010-01-12  0:46   ` Tejun Heo
2010-01-12  0:53     ` Benjamin LaHaise
2010-01-12  1:06       ` Eric W. Biederman
2010-01-12  1:12         ` Benjamin LaHaise
2010-01-12  1:23           ` Eric W. Biederman
2010-01-12  6:22         ` Tejun Heo
2010-01-12 15:49       ` Valdis.Kletnieks
2010-01-12  1:02     ` Eric W. Biederman [this message]
2010-01-12  5:56   ` Serge E. Hallyn
2010-01-12  8:30     ` Eric W. Biederman
2010-01-12 12:41   ` Cornelia Huck
2010-01-12 15:34     ` Eric W. Biederman
2010-01-11 20:21 ` [PATCH 4/7] sysfs: Implement sysfs_rename_link Eric W. Biederman
2010-01-12  6:24   ` Tejun Heo
2010-01-12 17:30   ` Serge E. Hallyn
2010-01-11 20:21 ` [PATCH 5/7] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
2010-01-12  6:25   ` Tejun Heo
2010-01-12 17:34   ` Serge E. Hallyn
2010-01-11 20:21 ` [PATCH 6/7] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
2010-01-12 17:43   ` Serge E. Hallyn
2010-01-11 20:21 ` [PATCH 7/7] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
2010-01-12 17:43   ` Serge E. Hallyn
2010-01-15 21:37 ` [PATCH 0/7] General sysfs enhancements Greg KH
2010-02-13  3:20   ` [PATCH 0/6] " Eric W. Biederman
2010-02-13  3:22     ` [PATCH 1/6] sysfs: Serialize updates to the vfs inode Eric W. Biederman
2010-02-13  3:22     ` [PATCH 2/6] sysfs: Pack sysfs_dirent more tightly Eric W. Biederman
2010-02-13  3:22     ` [PATCH 3/6] sysfs: Implement sysfs_rename_link Eric W. Biederman
2010-02-13  3:22     ` [PATCH 4/6] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
2010-02-13  3:22     ` [PATCH 5/6] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
2010-02-13  3:22     ` [PATCH 6/6] sysfs: Kill unused sysfs_sb variable Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m1y6k43zk4.fsf@fess.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=bcrl@lhnet.ca \
    --cc=cornelia.huck@de.ibm.com \
    --cc=ebiederm@aristanetworks.com \
    --cc=eric.dumazet@gmail.com \
    --cc=gregkh@suse.de \
    --cc=kay.sievers@vrfy.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=serue@us.ibm.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox