From: Daniel Phillips <phillips@google.com>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org, ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] Ocfs2 performance bugs of doom
Date: Sun, 05 Mar 2006 17:28:21 -0800
Message-ID: <440B9035.1070404@google.com>
In-Reply-To: <20060303233617.51718c8e.akpm@osdl.org>

Andrew Morton wrote:
> Daniel Phillips <phillips@google.com> wrote:
>>   	assert_spin_locked(&dlm->spinlock);
>> +	bucket = dlm->lockres_hash + full_name_hash(name, len) % DLM_HASH_BUCKETS;
>>
>> -	hash = full_name_hash(name, len);
>
> err, you might want to calculate that hash outside the spinlock.

Yah.
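
A minimal userspace sketch of what "outside the spinlock" means here:
hash the name first, then take the lock only for the chain walk.  None
of this is the ocfs2 code.  The table layout, the toy atomic_flag lock
and the FNV-1a stand-in hash are all assumptions made for illustration;
in the patch above the hash is still computed after
assert_spin_locked(), i.e. with dlm->spinlock held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 1024u	/* hypothetical size, not DLM_HASH_BUCKETS */

struct entry {
	struct entry *next;
	unsigned int hash;
	size_t len;
	const char *name;
};

struct table {
	atomic_flag lock;	/* toy spinlock standing in for dlm->spinlock */
	struct entry *buckets[NBUCKETS];
};

/* Stand-in hash (32-bit FNV-1a); NOT the kernel's full_name_hash. */
static unsigned int standin_hash(const char *name, size_t len)
{
	unsigned int h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= (unsigned char)name[i];
		h *= 16777619u;
	}
	return h;
}

static void table_lock(struct table *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void table_unlock(struct table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

void table_init(struct table *t)
{
	memset(t->buckets, 0, sizeof(t->buckets));
	atomic_flag_clear(&t->lock);
}

void table_insert(struct table *t, struct entry *e)
{
	assert(e->name != NULL);
	e->hash = standin_hash(e->name, e->len);	/* no lock held yet */

	table_lock(t);
	e->next = t->buckets[e->hash % NBUCKETS];
	t->buckets[e->hash % NBUCKETS] = e;
	table_unlock(t);
}

struct entry *table_lookup(struct table *t, const char *name, size_t len)
{
	unsigned int hash = standin_hash(name, len);	/* no lock held yet */
	struct entry *e;

	table_lock(t);
	for (e = t->buckets[hash % NBUCKETS]; e; e = e->next)
		if (e->hash == hash && e->len == len &&
		    memcmp(e->name, name, len) == 0)
			break;
	table_unlock(t);
	return e;
}
```

The only point of the sketch is the ordering: the hash depends on
nothing but the name, so it does not need to sit inside the critical
section.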

> Maybe have a lock per bucket, too.

So the lock memory is as much as the hash table? ;-)

> A 1MB hashtable is verging on comical.  How many data are there in total?

Even with the 256K entry hash table, __dlm_lookup_lockres is still the
top systime gobbler:

-------------
real 31.01
user 25.29
sys 3.09
-------------

CPU: P4 / Xeon, speed 2793.37 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 240000
samples  %        image name               app name                 symbol name
-------------------------------------------------------------------------------
17071831 71.2700  libbz2.so.1.0.2          libbz2.so.1.0.2          (no symbols)
   17071831 100.000  libbz2.so.1.0.2          libbz2.so.1.0.2          (no symbols) [self]
-------------------------------------------------------------------------------
2638066  11.0132  vmlinux                  vmlinux                  __dlm_lookup_lockres
   2638066  100.000  vmlinux                  vmlinux                  __dlm_lookup_lockres [self]
-------------------------------------------------------------------------------
332683    1.3889  oprofiled                oprofiled                (no symbols)
   332683   100.000  oprofiled                oprofiled                (no symbols) [self]
-------------------------------------------------------------------------------
254736    1.0634  vmlinux                  vmlinux                  ocfs2_local_alloc_count_bits
   254736   100.000  vmlinux                  vmlinux                  ocfs2_local_alloc_count_bits [self]
-------------------------------------------------------------------------------
176794    0.7381  tar                      tar                      (no symbols)
   176794   100.000  tar                      tar                      (no symbols) [self]
-------------------------------------------------------------------------------

Note, this is uniprocessor, single node, on a local disk.  Something is
pretty badly broken all right.  Tomorrow I will take a look at the hash
distribution and see what's up.
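
One way to look at a hash distribution, sketched in userspace C under
stated assumptions: count how many names land in each bucket, then
report the longest chain and the average number of entries a successful
lookup has to examine.  The hash below is a stand-in (32-bit FNV-1a),
not full_name_hash, and measure_chains() and struct chain_stats are
hypothetical names, so this shows only the shape of the measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in hash (32-bit FNV-1a); swap in the function under test. */
static unsigned int standin_hash(const char *name, size_t len)
{
	unsigned int h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= (unsigned char)name[i];
		h *= 16777619u;
	}
	return h;
}

struct chain_stats {
	size_t used_buckets;	/* buckets holding at least one name */
	size_t longest_chain;	/* worst-case chain length */
	double avg_probes;	/* mean entries examined per successful lookup */
};

/* Returns 0 on success, -1 if the counter array cannot be allocated. */
int measure_chains(const char *const *names, size_t n, size_t nbuckets,
		   struct chain_stats *out)
{
	size_t *count;
	size_t i;
	double probes = 0.0;

	assert(nbuckets > 0);
	count = calloc(nbuckets, sizeof(*count));
	if (!count)
		return -1;

	for (i = 0; i < n; i++)
		count[standin_hash(names[i], strlen(names[i])) % nbuckets]++;

	memset(out, 0, sizeof(*out));
	for (i = 0; i < nbuckets; i++) {
		size_t c = count[i];

		if (c == 0)
			continue;
		out->used_buckets++;
		if (c > out->longest_chain)
			out->longest_chain = c;
		/* Finding each of c chained entries costs 1 + 2 + ... + c. */
		probes += (double)c * (double)(c + 1) / 2.0;
	}
	out->avg_probes = n ? probes / (double)n : 0.0;

	free(count);
	return 0;
}
```

With an even spread and about one entry per bucket, avg_probes stays
close to 1; a value far above that would point at the hash function or
the modulus rather than at the table size.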

I guess there are about 250k symbols in the table before purging
finally kicks in, which happens the 5th or 6th time I untar a kernel tree.
So, 20,000 names times 5-6 untars times the three locks per inode Mark
mentioned.  I'll actually measure that tomorrow instead of inferring
it.
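
The inference above can be written out as arithmetic.  The sketch below
uses only the figures quoted in this message (20,000 names, 5 or 6
untars, three locks per inode, a 256K entry table); the function names
are hypothetical and the inputs are rough estimates, not measurements.

```c
#include <assert.h>

#define NAMES_PER_TREE	20000UL			/* rough figure from the text */
#define LOCKS_PER_INODE	3UL			/* per Mark's earlier remark */
#define HASH_BUCKETS	(256UL * 1024UL)	/* the 256K entry table */

/* Lock resources accumulated after a given number of untars. */
unsigned long estimated_lockres(unsigned long untars)
{
	assert(untars > 0);
	return NAMES_PER_TREE * untars * LOCKS_PER_INODE;
}

/* Mean chain length if entries were spread evenly over the buckets. */
double average_chain_length(unsigned long entries)
{
	return (double)entries / (double)HASH_BUCKETS;
}
```

The product comes to 300,000 for five untars and 360,000 for six, a
little above the 250k guess but the same order of magnitude.  Spread
evenly over 262,144 buckets that would be roughly 1.1 to 1.4 entries
per chain.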

I think this table is per-ocfs2-mount, and really really, a meg is
nothing if it makes CPU cycles go away.  That's 0.05% of the memory
on this box, which is a small box where clusters are concerned.  But
there is also some gratuitous CPU suck still happening in there that
needs investigating.  I would not be surprised at all to learn that
full_name_hash is a terrible hash function.
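
The memory figures can be cross-checked with a few lines of C.  This is
a sketch under assumptions: it takes each bucket head to be one 4-byte
pointer (which is what makes 256K entries come to 1 MB) and, for the
lock-per-bucket remark, a hypothetical 4-byte lock per bucket.  The
function names are made up for the example.

```c
#include <assert.h>

#define HASH_BUCKETS (256UL * 1024UL)	/* the 256K entry table */

/* Table size for a given per-bucket footprint in bytes. */
unsigned long table_bytes(unsigned long bytes_per_bucket)
{
	return HASH_BUCKETS * bytes_per_bucket;
}

/* Total RAM in MiB implied if tbl_bytes is `percent` percent of it. */
double implied_ram_mib(unsigned long tbl_bytes, double percent)
{
	assert(percent > 0.0);
	return ((double)tbl_bytes / (1024.0 * 1024.0)) / (percent / 100.0);
}
```

Under those assumptions table_bytes(4) is 1,048,576 bytes, a table at
0.05% of memory implies a box with about 2000 MiB of RAM, and adding a
4-byte lock to every bucket gives table_bytes(8), twice the size, which
is the point of the quip about lock memory.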

Regards,

Daniel
