From: Mike Waychison <mikew@google.com>
To: Mike Waychison <mikew@google.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v1 0/8] Deferred dput() and iput() -- reducing lock contention
Date: Wed, 28 Jan 2009 18:09:26 -0800
Message-ID: <49810FD6.2080501@google.com>
In-Reply-To: <20090117081210.GL8071@disturbed>

Dave Chinner wrote:
> On Fri, Jan 16, 2009 at 06:29:36PM -0800, Mike Waychison wrote:
>> We've noticed that at times it can become very easy to have a system begin to
>> livelock on dcache_lock/inode_lock (specifically in atomic_dec_and_lock()) when
>> a lot of dentries are getting finalized at the same time (massive delete and
>> large fdtable destructions are two paths I've seen cause problems).
>>
>> This patchset is an attempt to try and reduce the locking overheads associated
>> with final dput() and final iput().  This is done by batching dentries and
>> inodes into per-process queues and processing them in 'parallel' to consolidate
>> some of the locking.
> 
> Hmmmm. This deferring of dput/iput will have the same class of
> effects on filesystems as the recent reverted changes to make
> generic_delete_inode() an asynchronous process. That is, it
> temporally separates the transaction for namespace deletion (i.e.
> unlink) from the transaction that completes the inode deletion that
> occurs, typically, during ->clear_inode. See the recent thread
> titled:
> 
> [PATCH] async: Don't call async_synchronize_full_special() while holding sb_lock
> 
> For more details.
> 
> I suspect that change is likely to cause worse problems than the
> async changes in that it doesn't have a cap on the number of
> deferred operations.....

*sigh*.  So I did some testing with XFS today and you're right, 
separating the namespace operation from the actual file delete really 
slows things down.

I'll follow up with a v2 patchset with more precise details, but 
basically, I create (on one fs) 100K empty files per cpu (8-core 
machine).  I then time how long it takes to have one process per cpu 
delete its own file hierarchy of 100K files.
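
For anyone wanting to try something similar before v2 lands, here is a
rough userspace sketch of the test described above.  It is not the
harness I used: the function names, the flat one-directory-per-worker
layout and the use of fork() are all assumptions made for illustration.
Only the deletion phase is timed.

```python
import os
import time


def populate(root, nprocs, files_per_proc):
    """Create one directory per worker, each holding empty files."""
    dirs = []
    for i in range(nprocs):
        d = os.path.join(root, f"worker{i}")
        os.makedirs(d)
        for j in range(files_per_proc):
            # Empty file: create it and close it straight away.
            open(os.path.join(d, f"f{j}"), "w").close()
        dirs.append(d)
    return dirs


def delete_tree(d):
    """Unlink every file in d, then remove the directory itself."""
    for name in os.listdir(d):
        os.unlink(os.path.join(d, name))
    os.rmdir(d)


def timed_parallel_delete(dirs):
    """Fork one deleting process per directory; return wall-clock seconds."""
    start = time.monotonic()
    pids = []
    for d in dirs:
        pid = os.fork()
        if pid == 0:
            # Child: delete its own directory, then exit without
            # returning into the parent's code path.
            status = 1
            try:
                delete_tree(d)
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    for pid in pids:
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            raise RuntimeError(f"deleter {pid} failed")
    return time.monotonic() - start


def run_benchmark(root, nprocs=8, files_per_proc=100_000):
    """Populate root, then time the parallel delete (assumed defaults)."""
    return timed_parallel_delete(populate(root, nprocs, files_per_proc))
```

Point run_benchmark() at an empty scratch directory on the filesystem
under test.  The defaults mirror the numbers above (8 workers, 100K
files each), so expect the populate step alone to take a while.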

XFS performs horribly at this test (taking approx 34 minutes to 
complete).  With my changes, it takes approx 57 minutes.  For 
comparison, ext2 takes ~4.5 seconds, ext4 ~30 seconds and ext4 without a 
journal ~4.1 seconds.

Seems to me like there is a design issue causing XFS to suck at mass 
deletes, and I can understand why you wouldn't want to separate the 
namespace+inode drop.

What say you to letting the fs itself specify whether its inodes should 
be handled serially (as they are today) or in deferred batches?  Patch 4 
of this series introduced an I_SYNCIPUT flag to deal with umount issues; 
perhaps it would make sense to have inodes from XFS tagged with this flag?
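
To make the idea concrete, here is a toy userspace model of the policy
split.  It is not kernel code and not what the patches do internally:
only the I_SYNCIPUT name comes from this series, while the flag value,
the Inode and Finalizer classes and their methods are invented for
illustration.

```python
I_SYNCIPUT = 0x1  # name from patch 4; this numeric value is made up


class Inode:
    def __init__(self, ino, flags=0):
        self.ino = ino
        self.flags = flags


class Finalizer:
    """Hypothetical stand-in for the final-iput() path."""

    def __init__(self):
        self.pending = []    # inodes queued for a later batch
        self.finalized = []  # inode numbers, in teardown order

    def final_iput(self, inode):
        if inode.flags & I_SYNCIPUT:
            # The fs asked for serial handling: tear down right away.
            self.finalized.append(inode.ino)
        else:
            # Otherwise defer and let a later drain() batch the work.
            self.pending.append(inode)

    def drain(self):
        """Finalize everything queued so far; return the batch size."""
        batch, self.pending = self.pending, []
        for inode in batch:
            self.finalized.append(inode.ino)
        return len(batch)
```

In this model an inode tagged with I_SYNCIPUT is finalized inside
final_iput() itself, so its teardown stays adjacent to the namespace
operation, while untagged inodes wait for the next drain().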

Mike Waychison
